Measures of Variability/Dispersion in Educational Statistics: Measures of Dispersion

3.3: Measures of Variability/Dispersion

The measures of central tendency (mean, median and mode) give us an idea of how the observations concentrate about the central part of a distribution. But these three averages do not show how widely the observations are scattered around that centre, so they do not fully describe the distribution.

Ex: The marks obtained by two students in a test are as follows:

A = 41, 46, 54, 40, 50, 51 ∴ Average = 282/6 = 47

B = 30, 33, 32, 48, 69, 70 ∴ Average = 282/6 = 47

Here the average marks of A and B are the same, but we cannot say that their performance is the same, because the average does not show the performance in each subject. Student ‘A’ scored 40 or above in every subject, but ‘B’ did not. The marks of ‘A’ are therefore more consistent, and in that sense the performance of ‘A’ is better than that of ‘B’.
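The comparison can be sketched in a few lines of Python. This is only an illustration using the marks given in the example; the variable names are our own.

```python
from statistics import mean

# Marks of the two students from the example
marks_a = [41, 46, 54, 40, 50, 51]
marks_b = [30, 33, 32, 48, 69, 70]

for name, marks in (("A", marks_a), ("B", marks_b)):
    spread = max(marks) - min(marks)  # highest mark minus lowest mark
    print(name, "mean =", mean(marks), "spread =", spread)
```

Both means come out as 47, but the marks of A spread over 14 points while those of B spread over 40 points.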

Measures of Variability

There are 4 measures of variability, namely:

1. Range (R)

2. Quartile Deviation (Q.D)

3. Mean/Average Deviation (M.D)

4. Standard Deviation (S.D)

1. RANGE (R)

It is defined as the difference between the highest and lowest scores.

Range = Highest score – Lowest score

R= H - L

It is one of the least reliable measures of variability, for it is affected by fluctuations in the extreme scores.

Its only merit is that it can be easily calculated and readily understood.

Co-efficient of Range

It is defined as the ratio of the difference between the highest and lowest score to the sum of the highest and lowest score.

i.e. Co-efficient of Range = (H – L) / (H + L)

For non-negative scores, the co-efficient of range is a number between 0 and 1. Scores are more consistent if the co-efficient of range is very near to 0 and less consistent if it is near to 1.
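As a rough sketch, the co-efficient of range can be computed as below. The function name is our own, and the data are the marks of students A and B from the opening example.

```python
def coefficient_of_range(scores):
    """Return (H - L) / (H + L) for a list of non-negative scores."""
    high, low = max(scores), min(scores)
    if high + low == 0:
        raise ValueError("H + L is zero; the coefficient is undefined")
    return (high - low) / (high + low)

marks_a = [41, 46, 54, 40, 50, 51]
marks_b = [30, 33, 32, 48, 69, 70]
coef_a = coefficient_of_range(marks_a)  # (54 - 40) / (54 + 40)
coef_b = coefficient_of_range(marks_b)  # (70 - 30) / (70 + 30)
print(round(coef_a, 3), round(coef_b, 3))
```

The value for A (about 0.149) is nearer to 0 than the value for B (0.4), which agrees with the marks of A being more consistent.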

Merits: It is simple to understand and easy to calculate.

Limitations

1. It helps us to make only a rough comparison of two or more groups for variability.

2. It takes account of only the two extreme scores of a series and is unreliable when N is small or when there are large gaps (i.e., Zero f’s) in the frequency distribution.

3. It is affected greatly by fluctuations in sampling. Its value is never stable. In a class where the height of students normally ranges from 150 cm to 180 cm (a range of 30 cm), if a student whose height is 90 cm is admitted, the range would shoot up from 30 cm to 90 cm.

4. The range does not take into account the composition of a series or the distribution of the items within the extremes. The range of a symmetrical and an asymmetrical distribution can be identical.
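Limitations 3 and 4 can be seen in a small sketch. Only the heights 150, 180 and 90 cm come from the text; the other heights and the two short series are invented for illustration.

```python
def value_range(scores):
    """Range R = H - L."""
    return max(scores) - min(scores)

# Hypothetical class heights in cm; only 150, 180 and 90 appear in the text
heights = [150, 158, 163, 170, 175, 180]
with_outlier = heights + [90]
print(value_range(heights), value_range(with_outlier))

# Two invented series with the same extremes but different composition
symmetric = [1, 5, 5, 5, 9]
asymmetric = [1, 1, 1, 2, 9]
print(value_range(symmetric), value_range(asymmetric))
```

One extreme height moves the range from 30 to 90, while the two invented series share a range of 8 although their scores are distributed very differently.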

Use of Range

1. When a knowledge of extreme scores is all that is wanted;

2. When the data are too scant or too scattered to justify the computation of a more precise measure of variability.

3. Quality control.

4. In studying the fluctuation in prices.

5. Weather forecast.

6. Day-to-day activities like sales in a shop, earnings of a family in a week, etc.

QUARTILE DEVIATION (Q.D) or Q

(Semi inter quartile Range)

Range tells us only the difference between the highest and lowest scores in the distribution. The inter-quartile range measures approximately how far from the median we must go on either side to include one half (50%) of the scores in the given set of data. To compute it we divide the ordered data into four equal parts, each of which contains 25% of the items in the distribution. The quartiles are thus the highest values in each of the first three of these 4 parts.

-------------*-------------*----------------*---------------

Q1 Q2 Q3

First Quartile is Q₁, where Q₁ = (N + 1)/4 th item

Second Quartile (median) is Q₂, where Q₂ = 2(N + 1)/4 th item

Third Quartile is Q₃, where Q₃ = 3(N + 1)/4 th item

Inter-quartile range is the difference between Q₃ and Q₁, i.e., (Q₃ – Q₁).

One half of the inter-quartile range is a measure called Quartile Deviation.

Quartile Deviation (Q.D) = (Q₃ – Q₁)/2

1. Find the Q.D for the following ungrouped data

25, 29, 36, 42, 48, 56, 62, 65, 67, 70, 72

Sl. No.	X
01	25
02	29
03	36 Q1
04	42
05	48
06	56
07	62
08	65
09	67 Q3
10	70
11	72

Q1 = (N + 1)/4 th item = (11 + 1)/4 = 12/4 = 3rd item, i.e., Q1 = 36

Q2 = 2(N + 1)/4 th item = 2 × 3 = 6th item, i.e., Q2 = 56

Q3 = 3(N + 1)/4 th item = 3 × 3 = 9th item, i.e., Q3 = 67

Q.D = (Q3 – Q1)/2 = (67 – 36)/2 = 31/2 = 15.5

Q.D = 15.5
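A minimal sketch of this calculation follows. The function name is our own, and the linear interpolation used when the position is not a whole number is an assumption, since the notes only show whole-number positions.

```python
def quartile(sorted_scores, k):
    """k-th quartile via the k(N + 1)/4-th item rule (1-based position)."""
    n = len(sorted_scores)
    pos = k * (n + 1) / 4
    lower = int(pos)          # whole part of the position
    frac = pos - lower        # fractional part, 0 for whole positions
    if lower < 1:
        return sorted_scores[0]
    if lower >= n:
        return sorted_scores[-1]
    low_val = sorted_scores[lower - 1]
    high_val = sorted_scores[lower]
    # Assumption: interpolate linearly between neighbouring items
    return low_val + frac * (high_val - low_val)

scores = sorted([25, 29, 36, 42, 48, 56, 62, 65, 67, 70, 72])
q1 = quartile(scores, 1)   # 3rd item
q3 = quartile(scores, 3)   # 9th item
qd = (q3 - q1) / 2
print(q1, q3, qd)
```

For these 11 scores the positions are whole numbers, so no interpolation takes place and the sketch gives Q1 = 36, Q3 = 67 and Q.D = 15.5.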

2. Find the Q.D for the following ungrouped frequency distribution

X	f	F (cumulative f)
10	4	4
20	7	11 Q1
30	15	26
40	8	34 Q3
50	7	41
80	2	43

N = 43

Q1 = (N + 1)/4 th item = (43 + 1)/4 = 44/4 = 11th item

i.e., Q1 = 20

Q3 = 3(N + 1)/4 th item = 3 × 11 = 33rd item

i.e., Q3 = 40

∴ Q.D = (Q3 – Q1)/2 = (40 – 20)/2

= 20/2 = 10

Q.D = 10
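The look-up in the cumulative frequency column can be sketched as follows. The helper name is our own. For a position that is not a whole number, the sketch takes the first value whose cumulative frequency reaches it, which is an assumption.

```python
from itertools import accumulate

def item_at(values, freqs, position):
    """Value of the position-th item (1-based), read from cumulative frequencies."""
    for value, cum in zip(values, accumulate(freqs)):
        if cum >= position:
            return value
    raise ValueError("position exceeds the total frequency")

values = [10, 20, 30, 40, 50, 80]
freqs = [4, 7, 15, 8, 7, 2]
n = sum(freqs)                                   # total frequency N
q1 = item_at(values, freqs, (n + 1) / 4)         # 11th item
q3 = item_at(values, freqs, 3 * (n + 1) / 4)     # 33rd item
qd = (q3 - q1) / 2
print(n, q1, q3, qd)
```

The 11th item falls at X = 20 and the 33rd item at X = 40, so the sketch reproduces Q.D = 10.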

3. Find the Q.D for the following grouped data

C - I	f	F
70 – 79	14	150
60 – 69	16	136
50 – 59	40	120
40 – 49	10	80
30 – 39	0	70
20 – 29	20	70
10 – 19	40	50
0 – 9	10	10

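Q1 lies at the (N + 1)/4 th item = (150 + 1)/4 = 151/4 = 37.75, which falls in the class 10 – 19.

Q3 lies at the 3(N + 1)/4 th item = 3 × 37.75 = 113.25, which falls in the class 50 – 59.

Within the quartile class, L is the exact lower limit of the class, F is the cumulative frequency below the class, fm is the frequency of the class and i is the size of the class interval.

Q1 = L + ((N/4 – F)/fm) × i

= 9.5 + ((37.5 – 10)/40) × 10

= 9.5 + 27.5/4

= 9.5 + 6.875

Q1 = 16.375

Q3 = L + ((3N/4 – F)/fm) × i

= 49.5 + ((112.5 – 80)/40) × 10

= 49.5 + 32.5/4

= 49.5 + 8.125

Q3 = 57.625

∴ Q.D = (Q3 – Q1)/2 = (57.625 – 16.375)/2 = 41.25/2 = 20.625

Q.D = 20.625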
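The interpolation for grouped data can be sketched as below. The function name is our own. Taking the exact lower limit as the stated lower limit minus 0.5 is an assumption that fits integer scores, matching the 9.5 and 49.5 used in the working. The classes are listed from lowest to highest.

```python
def grouped_quartile(classes, freqs, k, width):
    """k-th quartile by L + ((k*N/4 - F) / fm) * i for grouped data."""
    n = sum(freqs)
    target = k * n / 4
    cum_below = 0                      # F: cumulative frequency below the class
    for (lower_limit, _upper_limit), f in zip(classes, freqs):
        if f > 0 and cum_below + f >= target:
            # Assumption: exact lower limit is 0.5 below the stated limit
            boundary = lower_limit - 0.5
            return boundary + (target - cum_below) / f * width
        cum_below += f
    raise ValueError("quartile class not found")

classes = [(0, 9), (10, 19), (20, 29), (30, 39),
           (40, 49), (50, 59), (60, 69), (70, 79)]
freqs = [10, 40, 20, 0, 10, 40, 16, 14]
q1 = grouped_quartile(classes, freqs, 1, width=10)
q3 = grouped_quartile(classes, freqs, 3, width=10)
qd = (q3 - q1) / 2
print(q1, q3, qd)
```

With N = 150 the sketch gives Q1 = 16.375, Q3 = 57.625 and Q.D = 20.625, the same figures as the hand calculation.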

Merits of Quartile Deviation

1. It is a more representative and trustworthy measure of variability than the overall range;

2. It is a good index of score density at the middle of the distribution;

3. Quartiles are useful in indicating the skewness of a distribution;

Q3 – Q2 > Q2 – Q1 → Indicates +ve Skewness.

Q3 – Q2 < Q2 – Q1 → Indicates –ve Skewness.

Q3 – Q2 = Q2 – Q1 → Indicates Zero Skewness;

4. Like the median, Q.D is applicable to open-end distributions.
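The skewness rule in merit 3 can be sketched as a small check. The function name is our own. The quartiles are those of the first worked problem, where the 3rd, 6th and 9th items are 36, 56 and 67.

```python
def quartile_skew_direction(q1, q2, q3):
    """Compare the upper and lower quartile distances from the median."""
    upper = q3 - q2
    lower = q2 - q1
    if upper > lower:
        return "positive"
    if upper < lower:
        return "negative"
    return "zero"

# Q1 = 36, Q2 = 56, Q3 = 67 from the first worked problem
direction = quartile_skew_direction(36, 56, 67)
print(direction)
```

Here Q3 – Q2 = 11 is smaller than Q2 – Q1 = 20, so the rule indicates negative skewness for that set of scores.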

Limitations of Q.D

1. It is not capable of further algebraic treatment;

2. It is possible for two distributions to have equal Q.D but quite dissimilar variability of the lower and upper 25% of scores;

3. It is affected to a considerable extent by fluctuations in sampling. A change in the value of a single item may, in certain cases, affect its value considerably.

Use of Quartile Deviation

1. When the median is the measure of central tendency.

2. When the distribution is incomplete at either end.

3. When there are scattered or extreme scores which would disproportionately influence the S.D.

4. When the concentration around the median, i.e., the middle 50% of scores, is of primary interest.

STANDARD DEVIATION (S.D)

Standard deviation is the “Square root of the mean of the squares of individual deviations from the mean in a series.”

------James Drever.

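Variance (σ²) = ∑fd²/N , S.D (σ) = √(∑fd²/N)

Short cut method:

S.D = √(∑d²/N – (∑d/N)²) or S.D = i × √(∑fd²/N – (∑fd/N)²)

In the short cut method, d is the deviation from an assumed mean; in the second formula it is expressed in units of the class interval i.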
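A small numerical check of the first short cut formula is sketched below. The function names and the assumed mean of 8 are our own choices. Any assumed mean should give the same S.D as deviations taken from the true mean.

```python
from math import sqrt, isclose

def sd_direct(scores):
    """S.D from deviations about the true mean."""
    n = len(scores)
    m = sum(scores) / n
    return sqrt(sum((x - m) ** 2 for x in scores) / n)

def sd_short_cut(scores, assumed_mean):
    """S.D from deviations about an assumed mean, with the correction term."""
    n = len(scores)
    d = [x - assumed_mean for x in scores]
    return sqrt(sum(v * v for v in d) / n - (sum(d) / n) ** 2)

scores = [6, 8, 10, 12, 14]
direct = sd_direct(scores)
short = sd_short_cut(scores, assumed_mean=8)   # 8 is an arbitrary guess
print(round(direct, 2), round(short, 2))
```

With the assumed mean 8, ∑d² = 60 and ∑d = 10, so the short cut gives √(12 – 4) = √8, the same as the direct calculation.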

Steps to find S.D (Long method)

1. Find the mean using the formula, M = ∑fx/N

2. Calculate the value of ‘d’, i.e., d = X – M for all the values of X.

3. Square the value of ‘d’, i.e., d².

4. Find ∑fd².

5. Calculate σ² = ∑fd²/N; this is the variance of the distribution.

6. Take the positive square root of the variance to get the standard deviation of the distribution.

i.e., S.D = √(∑fd²/N)

Problems:

1. Find the S.D for the following ungrouped data:

6, 8, 10, 12, 14

Scores (X)	6	8	10	12	14
Deviations (d=X-M)	-4	-2	0	2	4
d²	16	4	0	4	16

∑d² = 40

Mean = 50/5 = 10. M = 10

∴ S.D = √(∑d²/N) = √(40/5) = √8 = 2.83
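The result can be cross-checked with Python's standard library. The formula used in these notes divides by N, which corresponds to `statistics.pstdev`; `statistics.stdev` divides by N – 1 instead and is shown only for contrast.

```python
from statistics import pstdev, stdev

scores = [6, 8, 10, 12, 14]
population_sd = pstdev(scores)  # sum of squared deviations divided by N
sample_sd = stdev(scores)       # divided by N - 1 instead
print(round(population_sd, 2), round(sample_sd, 2))
```

The first value agrees with the 2.83 found by hand. The second is larger (about 3.16) because of the smaller divisor, so it is worth checking which version a given tool reports.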

2. Find the S.D for the following distribution

X	f	fX	d=X-M	d²	fd²
5	1	5	-9.7	94.09	94.09
10	2	20	-4.7	22.09	44.18
12	3	36	-2.7	7.29	21.87
14	12	168	-0.7	0.49	5.88
15	4	60	0.3	0.09	0.36
17	5	85	2.3	5.29	26.45
22	3	66	7.3	53.29	159.87

N = 30 ∑fX = 440 ∑fd² = 352.7

M = ∑fX/N = 440/30 = 14.7

S.D = √(∑fd²/N) = √(352.7/30) = √11.757 = 3.43

S.D = 3.43
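The long method for a frequency distribution can be sketched as below, using the X and f columns of the table. The sketch keeps the unrounded mean 440/30 rather than 14.7, so its ∑fd² differs slightly from the tabulated 352.7.

```python
from math import sqrt

values = [5, 10, 12, 14, 15, 17, 22]
freqs = [1, 2, 3, 12, 4, 5, 3]

n = sum(freqs)                                          # N
mean = sum(f * x for x, f in zip(values, freqs)) / n    # M = sum(fX) / N
sum_fd2 = sum(f * (x - mean) ** 2 for x, f in zip(values, freqs))
variance = sum_fd2 / n
sd = sqrt(variance)
print(n, round(mean, 2), round(sum_fd2, 2), round(sd, 2))
```

With the unrounded mean, ∑fd² is about 352.67 and the S.D is about 3.4286, which is 3.43 to two decimal places.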

Finding S.D by short-cut method

Procedure:

1. Assume a value for the mean (the assumed mean, A.M).

2. Lay off the deviations (d) of the class intervals from the A.M, in units of class intervals.

3. Find fd by multiplying the frequency by the deviation for each C.I. Add the products to get ∑fd.

4. Find fd² by multiplying d by fd. Add the products to get ∑fd².

5. Use the formula, S.D = i × √(∑fd²/N – (∑fd/N)²)

1. Find S.D from the following grouped data

C. I	x	f	d	fd	fd²
10 – 14	12	3	-3	-9	27
15 - 19	17	5	-2	-10	20
20 – 24	22	9	-1	-9 (-28)	9
25 – 29	27	18	0	0	0
30 – 34	32	11	1	11	11
35 – 39	37	5	2	10	20
40 – 44	42	6	3	18	54
45 – 49	47	2	4	8	32
50- 54	52	1	5	5 (52)	25

N = 60 ∑fd = –28 + 52 = 24 ∑fd² = 198

S.D = i × √(∑fd²/N – (∑fd/N)²)

= 5 × √(198/60 – (24/60)²)

= 5 × √(3.3 – (0.4)²)

= 5 × √(3.3 – 0.16)

= 5 × √3.14

= 5 × 1.772

= 8.86

S.D = 8.86
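The short-cut procedure for this table can be sketched as follows. The mid-points, frequencies, class interval of 5 and assumed mean of 27 are read from the table; the variable names are our own.

```python
from math import sqrt

midpoints = [12, 17, 22, 27, 32, 37, 42, 47, 52]
freqs = [3, 5, 9, 18, 11, 5, 6, 2, 1]
width = 5            # size of the class interval, i
assumed_mean = 27    # mid-point of the class 25 - 29

n = sum(freqs)
# Deviations from the assumed mean in units of class intervals
d = [(x - assumed_mean) // width for x in midpoints]
sum_fd = sum(f * v for f, v in zip(freqs, d))
sum_fd2 = sum(f * v * v for f, v in zip(freqs, d))
sd = width * sqrt(sum_fd2 / n - (sum_fd / n) ** 2)
print(n, sum_fd, sum_fd2, round(sd, 2))
```

The sketch reproduces N = 60, ∑fd = 24 and ∑fd² = 198. With the square root carried to more decimal places, the S.D comes to about 8.86.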

Merits of S.D:

1. S.D is rigidly defined and its value is always definite.

2. It is based on all the observations of the data.

3. It is amenable to algebraic treatment and possesses many mathematical properties. This is why it is used in many advanced studies.

4. It is less affected by fluctuations in sampling than most other measures of variability.

Limitations of S.D

1. It is difficult to understand and interpret S.D.

2. It gives more weight to extreme items and less to those which are near the mean, because the squares of the deviations, which are big in size, would be proportionately greater than the squares of those which are comparatively small.

Use of S.D

1. When a measure, having the greatest stability and reliability, is sought;

2. When extreme deviations should exercise a proportionately greater effect upon variability;

3. When the coefficient of correlation and other statistics are subsequently to be computed;

4. When the interpretations related to the normal probability curve are desired.


Tuesday, 13 July 2021
