3.3: Measures of Variability/Dispersion
The measures of central tendency (mean, median and mode) give us an idea of
how the observations concentrate about the central part of a distribution.
But these three averages alone do not fully describe the distribution: they
say nothing about how widely the observations are scattered around the centre.
Ex: The marks obtained by two students in a test are as follows:
A = 41, 46, 54, 40, 50, 51; Average = 282/6 = 47
B = 30, 33, 32, 48, 69, 70; Average = 282/6 = 47
Here the average marks of A and B are the same, but we cannot say that their
performance is the same, because the average does not reflect the marks
obtained in each subject. Student ‘A’ got 40 or above in every subject, but
‘B’ did not. The marks of ‘A’ are therefore more consistent, and in that sense
the performance of student ‘A’ is better than that of ‘B’.
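The comparison above can be checked with a few lines of Python. This is only a minimal sketch; the names marks_a and marks_b are illustrative and not part of the original notes. It shows that the two students share the same mean while their lowest and highest marks differ widely.

```python
from statistics import mean

# Marks of the two students from the example above
marks_a = [41, 46, 54, 40, 50, 51]
marks_b = [30, 33, 32, 48, 69, 70]

for name, marks in (("A", marks_a), ("B", marks_b)):
    # Same average, but very different lowest and highest marks
    print(name, "mean =", mean(marks),
          "lowest =", min(marks), "highest =", max(marks))
```

The identical means hide the fact that the marks of B are spread much further apart than those of A.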
Measures of Variability
There are 4 measures of variability, namely:
1. Range (R)
2. Quartile Deviation (Q.D)
3. Mean/Average Deviation (M.D)
4. Standard Deviation (S.D)
1. RANGE (R)
It is defined as the difference between the highest and lowest scores.
Range = Highest score - Lowest score
R = H - L
It is one of the least reliable
measures of variability, for it is affected by fluctuations in the extreme
scores.
Its only merit is that it can be easily
calculated and readily understood.
Co-efficient of Range
It is defined as the ratio of the difference between the highest and lowest
scores to the sum of the highest and lowest scores.
i.e. Co-efficient of Range = (H - L) / (H + L)
For non-negative scores, the co-efficient of range is a number between 0 and 1.
Scores are more consistent if the co-efficient of range is very near to 0 and
less consistent if it is near to 1.
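The two formulas above, R = H - L and the co-efficient of range, can be sketched in Python as follows. The function names are hypothetical, and the sketch assumes non-negative scores with H + L greater than zero.

```python
def value_range(scores):
    """Range R = H - L (highest score minus lowest score)."""
    return max(scores) - min(scores)


def coefficient_of_range(scores):
    """(H - L) / (H + L); assumes non-negative scores and H + L > 0."""
    high, low = max(scores), min(scores)
    return (high - low) / (high + low)


# Marks of students A and B from the opening example
marks_a = [41, 46, 54, 40, 50, 51]
marks_b = [30, 33, 32, 48, 69, 70]

print(value_range(marks_a), round(coefficient_of_range(marks_a), 3))
print(value_range(marks_b), round(coefficient_of_range(marks_b), 3))
```

For these marks the co-efficient of A is nearer to 0 than that of B, which agrees with the earlier observation that A is the more consistent student.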
Merits: It is simple to
understand and easy to calculate.
Limitations
1. It helps us to
make only a rough comparison of two or more groups for variability.
2. It takes account
of only the two extreme scores of a series and is unreliable when N is small or
when there are large gaps (i.e., Zero f’s) in the frequency distribution.
3. It is affected greatly by fluctuations in sampling. Its value is never
stable. In a class where the height of students normally ranges from 150 cm to
180 cm, the range is 30 cm; if a student whose height is 90 cm is admitted, the
range would shoot up from 30 cm to 90 cm.
4. The range does not
take into account the composition of a series or the distribution of the items
within the extremes. The range of a symmetrical and an asymmetrical
distribution can be identical.
Use of Range
1. When a knowledge of extreme scores is all that is wanted;
2. When the data are too scant or too scattered to justify the computation of
a more precise measure of variability.
3. Quality control.
4. In studying the fluctuation in prices.
5. Weather forecast.
6. Day to day activities like sales in a shop, earnings of a family in a week
etc.
QUARTILE DEVIATION (Q.D) or Q
(Semi-interquartile Range)
Range tells us only the difference between the highest and lowest scores
within the distribution. The inter-quartile range measures approximately how
far from the median we must go on either side to include one half (50%) of the
scores of the given set of data. To compute this range we arrange the data in
order and divide them into four equal parts, each of which contains 25% of the
items in the distribution. The quartiles Q1, Q2 and Q3 are the highest values
in the first, second and third of these 4 parts.
First Quartile is Q1, where Q1 = value of the (N + 1)/4 th item
Second Quartile (median) is Q2, where Q2 = value of the 2(N + 1)/4 th item
Third Quartile is Q3, where Q3 = value of the 3(N + 1)/4 th item
Inter quartile range is the difference between Q3 and Q1, i.e., (Q3 - Q1).
One half of the inter quartile range is a measure called Quartile Deviation.
Quartile Deviation (Q.D) = (Q3 - Q1) / 2
1. Find the Q.D for the following ungrouped data
25, 29, 36, 42, 48, 56, 62, 65, 67, 70, 72
Sl. No. | X
01      | 25
02      | 29
03      | 36
04      | 42
05      | 48
06      | 56
07      | 62
08      | 65
09      | 67
10      | 70
11      | 72
Q1 = (N + 1)/4 th item = (11 + 1)/4 = 12/4 = 3rd item, i.e., Q1 = 36
Q2 = 2(N + 1)/4 th item = 2 × 3 = 6th item, i.e., Q2 = 56
Q3 = 3(N + 1)/4 th item = 3 × 3 = 9th item, i.e., Q3 = 67
Q.D = (Q3 - Q1)/2 = (67 - 36)/2 = 31/2 = 15.5
Q.D = 15.5
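The position rule used in this problem can be sketched in Python. The helper names below are hypothetical. Two things are assumptions beyond the notes: fractional positions are handled by linear interpolation between neighbouring items (the notes only show whole-number positions), and the data are assumed to contain at least three scores.

```python
def item_at(sorted_scores, position):
    """Value at a 1-based position in already sorted data.

    Fractional positions are linearly interpolated between the two
    neighbouring items (an assumption, not taken from the notes).
    """
    lower = int(position)           # whole part of the position
    fraction = position - lower     # fractional part, 0 for whole positions
    value = sorted_scores[lower - 1]
    if fraction and lower < len(sorted_scores):
        value += fraction * (sorted_scores[lower] - value)
    return value


def quartile_deviation(scores):
    """Return (Q1, Q3, Q.D) using the (N + 1)/4 position rule."""
    data = sorted(scores)
    n = len(data)                   # assumed to be at least 3
    q1 = item_at(data, (n + 1) / 4)
    q3 = item_at(data, 3 * (n + 1) / 4)
    return q1, q3, (q3 - q1) / 2


print(quartile_deviation([25, 29, 36, 42, 48, 56, 62, 65, 67, 70, 72]))
# -> (36, 67, 15.5)
```

Library routines for quartiles use several different position rules, so their results need not agree with the (N + 1)/4 rule on every data set.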
2. Find the Q.D for the following ungrouped frequency distribution
(f = frequency, F = cumulative frequency)
X  | f  | F
10 | 4  | 4
20 | 7  | 11 (Q1)
30 | 15 | 26
40 | 8  | 34 (Q3)
50 | 7  | 41
80 | 2  | 43
N = 43
Q1 = (N + 1)/4 th item = (43 + 1)/4 = 44/4 = 11th item, i.e., Q1 = 20
Q3 = 3(N + 1)/4 th item = 3 × 11 = 33rd item, i.e., Q3 = 40
Q.D = (Q3 - Q1)/2 = (40 - 20)/2 = 20/2 = 10
Q.D = 10
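For a frequency table like this one, the quartile is the first value of X whose cumulative frequency F reaches the required item number. A minimal Python sketch of that lookup follows. The function name is hypothetical, and the values are assumed to be listed in ascending order.

```python
from itertools import accumulate


def quartile_from_frequencies(values, freqs, k):
    """k-th quartile (k = 1, 2 or 3) of a discrete frequency table.

    Returns the first value whose cumulative frequency F reaches the
    k(N + 1)/4 th item; values must be in ascending order.
    """
    cumulative = list(accumulate(freqs))   # the F column
    n = cumulative[-1]                     # N = total frequency
    position = k * (n + 1) / 4
    for value, cf in zip(values, cumulative):
        if cf >= position:
            return value
    return values[-1]


values = [10, 20, 30, 40, 50, 80]
freqs = [4, 7, 15, 8, 7, 2]
q1 = quartile_from_frequencies(values, freqs, 1)
q3 = quartile_from_frequencies(values, freqs, 3)
print(q1, q3, (q3 - q1) / 2)   # -> 20 40 10.0
```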
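3. Find the Q.D for the following grouped data
C - I   | f  | F
70 - 79 | 14 | 150
60 - 69 | 16 | 136
50 - 59 | 40 | 120
40 - 49 | 10 | 80
30 - 39 | 0  | 70
20 - 29 | 20 | 70
10 - 19 | 40 | 50
0 - 9   | 10 | 10
N = 150
Locating the quartile classes:
(N + 1)/4 = (150 + 1)/4 = 151/4 = 37.75, so Q1 lies in the class 10 - 19
(F rises from 10 to 50).
3(N + 1)/4 = 3 × 37.75 = 113.25, so Q3 lies in the class 50 - 59
(F rises from 80 to 120).
For grouped data the quartiles are found by interpolation within these classes:
Q1 = L + ((N/4 - F) / fm) × i
Q3 = L + ((3N/4 - F) / fm) × i
where L is the exact lower limit of the quartile class, F the cumulative
frequency below that class, fm the frequency of that class and i the size of
the class interval.
Q1 = 9.5 + ((37.5 - 10) / 40) × 10 = 9.5 + 27.5/4 = 9.5 + 6.875 = 16.375
Q3 = 49.5 + ((112.5 - 80) / 40) × 10 = 49.5 + 32.5/4 = 49.5 + 8.125 = 57.625
Q.D = (Q3 - Q1)/2 = (57.625 - 16.375)/2 = 41.25/2 = 20.625
Q.D = 20.625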
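The interpolation used for grouped data can be sketched in Python as below. The function name is hypothetical. The sketch assumes the classes are listed in ascending order, that every class has the same width i, and that the exact lower limits (such as 9.5 for the class 10 - 19) are supplied.

```python
def grouped_quartile(lower_limits, freqs, width, k):
    """k-th quartile of grouped data by L + ((kN/4 - F) / fm) * i.

    lower_limits: exact lower limit of each class, ascending order
    freqs:        frequency of each class
    width:        class-interval size i (assumed equal for all classes)
    """
    n = sum(freqs)
    target = k * n / 4
    below = 0                       # F: cumulative frequency below the class
    for lower, f in zip(lower_limits, freqs):
        if f > 0 and below + f >= target:
            return lower + (target - below) / f * width
        below += f
    raise ValueError("quartile class not found")


lower_limits = [-0.5, 9.5, 19.5, 29.5, 39.5, 49.5, 59.5, 69.5]
freqs = [10, 40, 20, 0, 10, 40, 16, 14]
q1 = grouped_quartile(lower_limits, freqs, 10, 1)
q3 = grouped_quartile(lower_limits, freqs, 10, 3)
print(q1, q3, (q3 - q1) / 2)   # -> 16.375 57.625 20.625
```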
Merits of Quartile Deviation
1. It is a more representative and trustworthy measure of variability than the
overall range;
2. It is a good index of score density at the middle of the distribution;
3. Quartiles are useful in indicating the skewness of a distribution;
Q3 - Q2 > Q2 - Q1 → Indicates +ve Skewness.
Q3 - Q2 < Q2 - Q1 → Indicates -ve Skewness.
Q3 - Q2 = Q2 - Q1 → Indicates Zero Skewness;
4. Like the median, Q.D is applicable to open-end distributions.
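The quartile comparison listed under merit 3 can be written as a small Python check. This is an illustrative sketch with a hypothetical function name, applied to the quartiles of the first worked problem (Q1 = 36, Q2 = 56, Q3 = 67).

```python
def quartile_skew(q1, q2, q3):
    """Classify skewness by comparing Q3 - Q2 with Q2 - Q1."""
    upper, lower = q3 - q2, q2 - q1
    if upper > lower:
        return "positive"
    if upper < lower:
        return "negative"
    return "zero"


# Q3 - Q2 = 11 is smaller than Q2 - Q1 = 20
print(quartile_skew(36, 56, 67))   # -> negative
```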
Limitations of Q.D
1. It is not amenable to further algebraic treatment;
2. It is possible for two distributions to have equal Q.D but quite dissimilar
variability of the lower and upper 25% of scores;
3. It is affected to a considerable extent by fluctuations in sampling. A
change in the value of a single item may, in certain cases, affect its value
considerably.
Use of Quartile Deviation
1. When the median is the measure of central tendency.
2. When the distribution is incomplete at either end.
3. When there are scattered or extreme scores which would disproportionately
influence the S.D.
4. When the concentration around the median, i.e. the middle 50% of scores, is
of primary interest.
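STANDARD DEVIATION (S.D)
Standard deviation is the “Square root of the mean of the squares of
individual deviations from the mean in a series.” (James Drever)
Variance (σ²) = ∑fd²/N, S.D (σ) = √(∑fd²/N)
where d = X - M is the deviation of a score from the mean M, f is its
frequency and N = ∑f.
Short cut method:
S.D = √(∑d²/N - (∑d/N)²) or
S.D = i √(∑fd²/N - (∑fd/N)²)
where d is measured from an assumed mean (in units of the class interval i in
the second formula).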
Steps to find S.D (Long method)
1. Find the mean using the formula, M = ∑fX/N
2. Calculate the value of ‘d’, i.e. d = X - M, for all the values of X.
3. Square the value of ‘d’, i.e., d².
4. Multiply each d² by its frequency f and find ∑fd².
5. Calculate σ² = ∑fd²/N; this is the variance of the distribution.
6. Take the positive square root of the variance to get the standard deviation
of the distribution, i.e., S.D = √(∑fd²/N)
Problems:
1. Find the S.D for the following ungrouped data:
6, 8, 10, 12, 14
Scores (X)             | 6  | 8  | 10 | 12 | 14
Deviations (d = X - M) | -4 | -2 | 0  | 2  | 4
d²                     | 16 | 4  | 0  | 4  | 16
Mean = 50/5 = 10, i.e., M = 10
∑d² = 40
S.D = √(∑d²/N) = √(40/5) = √8 = 2.83
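The long method for ungrouped data translates directly into Python. The sketch below uses a hypothetical function name and divides by N, as the notes do, rather than by N - 1.

```python
from math import sqrt


def standard_deviation(scores):
    """S.D by the long method: square root of the mean squared deviation."""
    n = len(scores)
    mean = sum(scores) / n                            # M
    squared_deviations = [(x - mean) ** 2 for x in scores]   # d squared
    variance = sum(squared_deviations) / n            # divides by N
    return sqrt(variance)


print(round(standard_deviation([6, 8, 10, 12, 14]), 2))   # -> 2.83
```

Python's standard library function statistics.pstdev computes the same quantity, so it can serve as a cross-check.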
2. Find the S.D for the following distribution
X  | f  | fX  | d = X - M | d²    | fd²
5  | 1  | 5   | -9.7      | 94.09 | 94.09
10 | 2  | 20  | -4.7      | 22.09 | 44.18
12 | 3  | 36  | -2.7      | 7.29  | 21.87
14 | 12 | 168 | -0.7      | 0.49  | 5.88
15 | 4  | 60  | 0.3       | 0.09  | 0.36
17 | 5  | 85  | 2.3       | 5.29  | 26.45
22 | 3  | 66  | 7.3       | 53.29 | 159.87
N = 30, ∑fX = 440, ∑fd² = 352.7
M = ∑fX/N = 440/30 = 14.7
S.D = √(∑fd²/N) = √(352.7/30) = √11.757 = 3.43
S.D = 3.43
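The same long method can be applied to a frequency distribution by weighting each squared deviation with its frequency. The following Python sketch uses a hypothetical function name. Unlike the hand calculation, it keeps the unrounded mean (440/30 = 14.666...), so its result of about 3.43 may differ in the last digit from a value worked out with M rounded to 14.7.

```python
from math import sqrt


def sd_from_frequencies(values, freqs):
    """Return (M, S.D) for a frequency distribution, dividing by N."""
    n = sum(freqs)                                            # N
    mean = sum(f * x for x, f in zip(values, freqs)) / n      # M
    sum_fd2 = sum(f * (x - mean) ** 2 for x, f in zip(values, freqs))
    return mean, sqrt(sum_fd2 / n)


values = [5, 10, 12, 14, 15, 17, 22]
freqs = [1, 2, 3, 12, 4, 5, 3]
mean, sd = sd_from_frequencies(values, freqs)
print(round(mean, 1), round(sd, 2))
```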
Finding S.D by short-cut method
Procedure:
1. Assume a value for the mean (the assumed mean, A.M).
2. Lay off the deviations (d) from the A.M in units of class intervals.
3. Find fd by multiplying the frequencies with the deviations for each C.I.
Add the products to get ∑fd.
4. Find fd² by multiplying d with fd. Add the products to get ∑fd².
5. Use the formula, S.D = i × √(∑fd²/N - (∑fd/N)²)
1. Find S.D from the following grouped data
C.I     | x  | f  | d  | fd  | fd²
10 - 14 | 12 | 3  | -3 | -9  | 27
15 - 19 | 17 | 5  | -2 | -10 | 20
20 - 24 | 22 | 9  | -1 | -9  | 9
25 - 29 | 27 | 18 | 0  | 0   | 0
30 - 34 | 32 | 11 | 1  | 11  | 11
35 - 39 | 37 | 5  | 2  | 10  | 20
40 - 44 | 42 | 6  | 3  | 18  | 54
45 - 49 | 47 | 2  | 4  | 8   | 32
50 - 54 | 52 | 1  | 5  | 5   | 25
Assumed mean = 27 (mid-point of the class 25 - 29), i = 5
Sum of the negative fd = -28, sum of the positive fd = 52
N = 60, ∑fd = 24, ∑fd² = 198
S.D = i √(∑fd²/N - (∑fd/N)²)
= 5 √(198/60 - (24/60)²)
= 5 √(3.3 - (0.4)²)
= 5 √(3.3 - 0.16)
= 5 √3.14
= 5 × 1.772
= 8.86
S.D = 8.86
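The short-cut procedure can also be sketched in Python. The function and parameter names are hypothetical, and the sketch assumes equal class widths with the assumed mean placed at the mid-point 27. Carried to full precision the result is about 8.86; rounding √3.14 to 1.77 by hand gives 8.85.

```python
from math import sqrt


def sd_step_deviation(midpoints, freqs, assumed_mean, width):
    """S.D = i * sqrt(sum(fd^2)/N - (sum(fd)/N)^2), d in class-interval units."""
    n = sum(freqs)
    # Deviation of each mid-point from the assumed mean, in units of i
    d = [(x - assumed_mean) / width for x in midpoints]
    sum_fd = sum(f * di for f, di in zip(freqs, d))
    sum_fd2 = sum(f * di ** 2 for f, di in zip(freqs, d))
    return width * sqrt(sum_fd2 / n - (sum_fd / n) ** 2)


midpoints = [12, 17, 22, 27, 32, 37, 42, 47, 52]
freqs = [3, 5, 9, 18, 11, 5, 6, 2, 1]
sd = sd_step_deviation(midpoints, freqs, assumed_mean=27, width=5)
print(round(sd, 2))
```

Choosing a different assumed mean changes ∑fd and ∑fd² but not the final S.D, because the correction term (∑fd/N)² compensates for the shift.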
Merits of S.D:
1. S.D is rigidly defined and its value is always definite.
2. It is based on all the observations of the data.
3. It is amenable to algebraic treatment and possesses many mathematical
properties. This is why it is used in many advanced studies.
4. It is less affected by fluctuations in sampling than most other measures of
variability.
Limitations of S.D
1. It is difficult to understand and interpret S.D.
2. It gives more weight to extreme items and less to those which are near the
mean, because the squares of the deviations, which are big in size, would be
proportionately greater than the squares of those which are comparatively small.
Use of S.D
1. When a measure having the greatest stability and reliability is sought;
2. When extreme deviations should exercise a proportionately greater effect
upon variability;
3. When the coefficient of correlation and other statistics are subsequently
to be computed;
4. When interpretations related to the normal probability curve are desired.