In the previous section, we saw some solved examples on variance and standard deviation. In this section, we will see how to calculate them for a continuous frequency distribution.
Calculating variance and standard deviation for continuous frequency distribution
• We know that, when the number of observations is large, we arrange them into various groups. The resulting table is called a continuous frequency distribution.
•
We know the method to calculate the mean in such cases. Also, in section 15.2, we have seen the method to calculate the following items:
♦ Mean deviation about mean for a continuous frequency distribution.
♦ Mean deviation about median for a continuous frequency distribution.
•
So now we will see the method to calculate the following items:
♦ Variance for a continuous frequency distribution.
♦ Standard deviation for a continuous frequency distribution.
•
The method is similar to what we saw in the solved examples 15.9 and 15.10 of the previous section. There we
dealt with discrete values. But here we will be dealing with groups of
values. These groups are called class intervals.
•
So for each class interval, we choose a value to represent that class
interval. Usually we choose the midpoint as the representative value.
• For example, if the class interval is 30-35, then the representative value will be the midpoint, which is:
$\frac{30 + 35}{2}~=~\frac{65}{2}~=~32.5$
•
Once the representative value is fixed, the procedure is the same as before.
•
The following solved example will demonstrate the method:
Solved Example 15.12
Find the variance and standard deviation for the following data:
Table 15.24 |
Solution:
1. We are asked to find the variance.
•
So our first aim is to find the mean $(\bar{x})$. For that, we want $f_i x_i$.
• For calculating $f_i x_i$, we want $x_i$.
• $x_i$ values are the midpoint values. They are calculated in the third column of the table 15.25 below:
Table 15.25 |
•
$f_i x_i$ is calculated in the column IV. So we can write:
$\bar{x}~=~\frac{\sum{f_i x_i}}{\sum{f_i}}~=~\frac{3100}{50}~=~62$
2. Now we can calculate the variance. For that, we can use columns II, V and VI.
• We have:
$\sigma^2~=~\frac{\sum{f_i (x_i - \bar{x})^2}}{\sum{f_i}}~=~\frac{10050}{50}~=~201$
3. Finally, we can calculate the standard deviation.
$\sigma~=~\sqrt{\sigma^2}~=~\sqrt{201}~\approx~14.18$
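• The three steps above can also be checked with a short program. The following is a minimal Python sketch, not part of the original solution. Since table 15.24 is not reproduced here, the class intervals and frequencies in `classes` are assumed illustrative values, chosen so that their totals agree with those quoted above ($\sum{f_i} = 50$ and $\sum{f_i x_i} = 3100$). The function name `grouped_variance` is also our own choice.

```python
from math import sqrt

# Assumed illustrative data: (lower limit, upper limit, frequency).
# These values are hypothetical; they are picked only so that the totals
# match the ones quoted in the solution (sum f = 50, sum f*x = 3100).
classes = [
    (30, 40, 3),
    (40, 50, 7),
    (50, 60, 12),
    (60, 70, 15),
    (70, 80, 8),
    (80, 90, 3),
    (90, 100, 2),
]


def grouped_variance(classes):
    """Return (mean, variance, standard deviation) using class midpoints."""
    # Representative value of each class interval is its midpoint
    midpoints = [(lo + hi) / 2 for lo, hi, _ in classes]
    freqs = [f for _, _, f in classes]
    n = sum(freqs)

    # Step 1: mean = (sum of f_i * x_i) / (sum of f_i)
    mean = sum(f * x for f, x in zip(freqs, midpoints)) / n

    # Step 2: variance = (sum of f_i * (x_i - mean)^2) / (sum of f_i)
    variance = sum(f * (x - mean) ** 2 for f, x in zip(freqs, midpoints)) / n

    # Step 3: standard deviation is the positive square root of the variance
    return mean, variance, sqrt(variance)


mean, variance, sd = grouped_variance(classes)
print(mean, variance, round(sd, 2))  # prints: 62.0 201.0 14.18
```

• With these assumed values, the program gives the same mean, variance and standard deviation as the hand calculation.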
Shortcut method for calculating variance and standard deviation
• In a previous section, we used a shortcut method (step-deviation method) for finding $\bar{x}$.
•
If the $x_i$ values are large, the calculation of $\bar{x}$ and of the variance becomes tedious and lengthy.
•
So we need a method to reduce the size of the numbers involved. It can be written in 5 steps:
1. In the step-deviation method, we first reduce $x_i$ into $d_i$.
•
$d_i = x_i - a$
♦ Where 'a' is the assumed mean.
2. Next we reduce $d_i$ to $u_i$.
•
$u_i = \frac{d_i}{h}$
♦ Where 'h' is the width of class.
3. We can combine the above two results as follows:
$u_i = \frac{x_i - a}{h}$
•
From this, we get:
$x_i = a + h u_i$
4. Once we calculate the $u_i$ values, we find $\bar{u}$.
•
Using $\bar{u}$, we find $\bar{x}$ using the formula:
$\bar{x} = a + h \bar{u}$
5. Let us use the results from (3) and (4) to rewrite the formula for variance. It can be done as follows:
$\begin{array}{rcl}
{\sigma^2} & {~=~} & {\frac{1}{\sum{f_i}}\left[\sum{f_i \left[x_i - \bar{x} \right]^2} \right]} \\
{} & {~=~} & {\frac{1}{\sum{f_i}}\left[\sum{f_i \left[(a + h u_i) - (a + h \bar{u}) \right]^2} \right]~\color{green}{\text{- - - I}}} \\
{} & {~=~} & {\frac{1}{\sum{f_i}}\left[\sum{f_i \left[a + h u_i - a - h \bar{u} \right]^2} \right]} \\
{} & {~=~} & {\frac{1}{\sum{f_i}}\left[\sum{f_i \left[h u_i - h \bar{u} \right]^2} \right]} \\
{} & {~=~} & {\frac{1}{\sum{f_i}}\left[\sum{f_i \left[h( u_i - \bar{u}) \right]^2} \right]} \\
{} & {~=~} & {\frac{1}{\sum{f_i}}\left[\sum{f_i h^2 ( u_i - \bar{u})^2} \right]} \\
{} & {~=~} & {\frac{h^2}{\sum{f_i}}\left[\sum{f_i (u_i - \bar{u})^2} \right]} \\
\end{array}$
◼ Remarks:
•
Line marked as I:
In this line, we replace $x_i$ and $\bar{x}$ using the results in (3) and (4).
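•
The relation $\bar{x} = a + h \bar{u}$ used in (4) is quoted without proof in this section. As a short check (our own working, using only the result in (3)), it can be obtained as follows:
$\begin{array}{rcl}
{\bar{x}} & {~=~} & {\frac{\sum{f_i x_i}}{\sum{f_i}}} \\
{} & {~=~} & {\frac{\sum{f_i (a + h u_i)}}{\sum{f_i}}} \\
{} & {~=~} & {\frac{a \sum{f_i} + h \sum{f_i u_i}}{\sum{f_i}}} \\
{} & {~=~} & {a + h \frac{\sum{f_i u_i}}{\sum{f_i}}} \\
{} & {~=~} & {a + h \bar{u}} \\
\end{array}$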
•
So now we have three formulas for calculating variance:
(i) $\sigma^2~=~\frac{\sum{f_i (x_i - \bar{x})^2}}{\sum{f_i}}$
(ii) $\sigma^2~=~\left(\frac{1}{\sum{f_i}}\right)^2 \left[\left(\sum{f_i}\right) \left(\sum{f_i x^2 _i} \right) - \left(\sum{f_i x_i} \right)^2 \right]$
(iii) $\sigma^2~=~\frac{h^2}{\sum{f_i}}\left[\sum{f_i (u_i - \bar{u})^2} \right]$
Note:
Formula (i) is the basic formula. Formula (iii) is used for the shortcut method. Both (i) and (iii) follow the same pattern.
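A quick way to see that formulas (i), (ii) and (iii) describe the same quantity is to evaluate all three on one data set. The following Python sketch does this. The midpoints `x` and frequencies `f` are assumed illustrative values (they are not taken from any table in this section), and `a = 65`, `h = 10` are one possible choice of assumed mean and class width for those values.

```python
from math import isclose

# Assumed illustrative midpoints and frequencies (hypothetical values)
x = [35, 45, 55, 65, 75, 85, 95]
f = [3, 7, 12, 15, 8, 3, 2]
a, h = 65, 10  # assumed mean and class width for this hypothetical data

n = sum(f)
sum_fx = sum(fi * xi for fi, xi in zip(f, x))
sum_fx2 = sum(fi * xi ** 2 for fi, xi in zip(f, x))
mean = sum_fx / n

# Formula (i): basic definition
var_i = sum(fi * (xi - mean) ** 2 for fi, xi in zip(f, x)) / n

# Formula (ii): uses only the sums of f*x and f*x^2
var_ii = (n * sum_fx2 - sum_fx ** 2) / n ** 2

# Formula (iii): shortcut (step-deviation) form
u = [(xi - a) / h for xi in x]
u_bar = sum(fi * ui for fi, ui in zip(f, u)) / n
var_iii = h ** 2 * sum(fi * (ui - u_bar) ** 2 for fi, ui in zip(f, u)) / n

# Compare with a tolerance, since (iii) involves non-integer values of u_i - u_bar
print(isclose(var_i, var_ii), isclose(var_i, var_iii))
print(round(var_i, 2), round(var_ii, 2), round(var_iii, 2))
```

The comparison uses `isclose` rather than `==` because the shortcut form works with decimal values, so the last digits of the floating-point results may differ slightly.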
Let us apply the shortcut method for solved example 15.12 above. It will be our next solved example.
Solved Example 15.13
Find the variance and standard deviation for the data in table 15.24 above.
Solution:
1. We are asked to find the variance.
•
In the shortcut method, our first aim is to find $\bar{u}$. For that, we want $f_i u_i$.
• For calculating $f_i u_i$, we want $u_i$, which are obtained from the $x_i$ values.
• $x_i$ values are the midpoint values. They are calculated in the third column of the table 15.26 below:
Table 15.26 |
•
Now we reduce the sizes of all $x_i$ values. For that, we consider a middle value as the assumed mean.
For our present case, we consider 65 as the assumed mean (a). Then we
subtract 'a' from all $x_i$ values. The values obtained after
subtraction are denoted as $d_i$. This is calculated in column IV.
•
Next, we reduce the size of $d_i$. This is achieved by dividing by a common factor (h). For our present case, 10 is the common factor. It is the class width. The values obtained after division are denoted as $u_i$. This is calculated in column V.
• Finally, $f_i u_i$ is calculated in the column VI. So we can write:
$\bar{u}~=~\frac{\sum{f_i u_i}}{\sum{f_i}}~=~\frac{-15}{50}~=~-0.3$
2. Now we can calculate ($u_i - \bar{u}$). It is done in column VII.
•
Using ($u_i - \bar{u}$), we can calculate $f_i(u_i - \bar{u})^2$. It is done in column VIII.
3. Now we can substitute the values.
• We have:
$\sigma^2~=~\frac{h^2}{\sum{f_i}}\left[\sum{f_i (u_i - \bar{u})^2} \right]~=~\frac{10^2}{50}\left[100.5 \right]~=~201$
4. Finally, we can calculate the standard deviation.
$\sigma~=~\sqrt{\sigma^2}~=~\sqrt{201}~\approx~14.18$
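• The column-by-column working of the shortcut method can be sketched in Python as shown below. This is only an illustration: table 15.26 is not reproduced here, so the `midpoints` and `freqs` lists are assumed values, chosen to agree with the totals quoted in the solution ($\sum{f_i} = 50$ and $\sum{f_i u_i} = -15$). The values `a = 65` and `h = 10` are the ones used in the solution. All variable names are our own.

```python
from math import sqrt

# Assumed illustrative midpoints and frequencies (hypothetical values,
# picked to agree with the totals quoted in the solution)
midpoints = [35, 45, 55, 65, 75, 85, 95]
freqs = [3, 7, 12, 15, 8, 3, 2]
a = 65  # assumed mean, as chosen in the solution
h = 10  # class width, as stated in the solution

n = sum(freqs)
d = [x - a for x in midpoints]               # column IV: d_i = x_i - a
u = [di / h for di in d]                     # column V: u_i = d_i / h
fu = [f * ui for f, ui in zip(freqs, u)]     # column VI: f_i * u_i
u_bar = sum(fu) / n                          # mean of the u_i values

dev = [ui - u_bar for ui in u]               # column VII: u_i - u_bar
f_dev2 = [f * dv ** 2 for f, dv in zip(freqs, dev)]  # column VIII

variance = h ** 2 * sum(f_dev2) / n          # formula (iii)
sd = sqrt(variance)
mean = a + h * u_bar                         # recover x_bar from u_bar

# Print the working table row by row, then the final results
for row in zip(midpoints, freqs, d, u, fu, dev, f_dev2):
    print(*(round(v, 2) for v in row))
print(round(u_bar, 2), round(mean, 2), round(variance, 2), round(sd, 2))
```

• With these assumed values, the last line printed is `-0.3 62.0 201.0 14.18`, which agrees with the results of solved examples 15.12 and 15.13.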
In the next section, we will see Analysis of frequency distributions.