In the previous section, we completed a discussion on the coefficient of variation. In this section, we will see some miscellaneous examples.
Solved example 15.17
The variance of 20 observations is 5. If each observation is multiplied by 2, find the new variance of the resulting observations.
Solution:
1. Analysis of the original data set:
(i) Let the observations in the original data set be denoted as $x_i$.
•
Then we can write: ${\sigma^2}_x~=~\frac{\sum(x_i - \bar x)^2}{20}~=~5$
(ii) In the above equation, we know that: $\bar x ~=~\frac{\sum x_i}{20}$
(iii) Expanding the equation in (i), we get:
$\begin{array}{ll}
{}&{5}
& {~=~}& {\frac{\sum(x^2_i - 2 x_i \bar x + {\bar x}^2)}{20}}
&{} \\
{\Rightarrow}&{5}
& {~=~}& {\frac{\sum{x^2_i} ~-~ 2 \bar x \sum{x_i} ~+~ \sum{{\bar x}^2}}{20}}
&{} \\
{\Rightarrow}&{100}
& {~=~}& {\sum{x^2_i} ~-~ 2 \bar x \sum{x_i} ~+~ \sum{{\bar x}^2}}
&{} \\
\end{array}$
2. Analysis of the modified data set:
(i) Let the observations in the modified data set be denoted as $y_i$.
•
Then we can write: ${\sigma^2}_y~=~\frac{\sum(y_i - \bar y)^2}{20}~=~K$
♦ We are asked to find K.
(ii) In the above equation, we know that: $\bar y ~=~\frac{\sum y_i}{20}$
(iii) Expanding the equation in (i), we get:
$\begin{array}{ll}
{}&{K}
& {~=~}& {\frac{\sum(y^2_i - 2 y_i \bar y + {\bar y}^2)}{20}}
&{} \\
{\Rightarrow}&{K}
& {~=~}& {\frac{\sum{y^2_i} ~-~ 2 \bar y \sum{y_i} ~+~ \sum{{\bar y}^2}}{20}}
&{} \\
{\Rightarrow}&{20K}
& {~=~}& {\sum{y^2_i} ~-~ 2 \bar y \sum{y_i} ~+~ \sum{{\bar y}^2}}
&{} \\
\end{array}$
3. Writing y in terms of x:
(i) We are given that each observation in the original data set is multiplied by 2 to get the modified data set.
(ii) So we can write: $y_i~=~ 2 x_i$
•
Substituting this in 2(iii), we get:
$\begin{array}{ll}
{}&{20K}
& {~=~}& {\sum{y^2_i} ~-~ 2 \bar y \sum{y_i} ~+~ \sum{{\bar y}^2}}
&{} \\
{\Rightarrow}&{20K}
& {~=~}& {\sum{(2 x_i)^2} ~-~ 2 \bar y \sum{2 x_i} ~+~ \sum{{\bar y}^2}}
&{} \\
{\Rightarrow}&{20K}
& {~=~}& {\sum{(2 x_i)^2} ~-~ 2 (2 \bar x) \sum{2 x_i} ~+~ \sum{(2 \bar x)^2}~~ \color {green} {\text{- - - (I)}}}
&{} \\
{\Rightarrow}&{20K}
& {~=~}& {4 \sum{x^2_i} ~-~ 8 \bar x \sum{x_i} ~+~ 4 \sum{{\bar x}^2}}
&{} \\
{\Rightarrow}&{20K}
& {~=~}& {4 \left[ \sum{x^2_i} ~-~ 2 \bar x \sum{x_i} ~+~ \sum{{\bar x}^2}\right]~~ \color {green} {\text{- - - (II)}}}
&{} \\
{\Rightarrow}&{20K}
& {~=~}& {4 \left[100 \right]}
&{} \\
{\Rightarrow}&{20K}
& {~=~}& {400}
&{} \\
{\Rightarrow}&{K}
& {~=~}& {20}
&{} \\
\end{array}$
◼ Remarks:
•
Line marked as I:
In this line, we use a fact that we learned in our earlier classes. It can be written in 3 steps:
(i) Average of a set of observations is 'A'
(ii) The set is modified by multiplying each observation by 'm'.
(iii) Then the average of the modified set is 'mA'.
•
So in our present case, we get: $\bar y = 2 \bar x$
•
Line marked as II:
Based on the result in 1(iii), the portion inside the square brackets is '100'.
4. Now, based on 2(i), we can write:
${\sigma^2}_y~=~K~=~20$
•
Thus we get the new variance of the modified observations.
•
Note that $20 = 2^2 \times 5$
•
So we can write:
If the observations are multiplied by a constant 'm', then the variance of the modified observations will be $m^2$ times the original variance.
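The result above can be checked numerically. The following is only a minimal sketch: the data set `sample` is hypothetical (it is not the data of the example), and the helper name `scaled_variance_ratio` is our own. It uses `statistics.pvariance`, which divides by n, just as the formula in this section does.

```python
from fractions import Fraction
from statistics import pvariance


def scaled_variance_ratio(data, m):
    """Return Var(m*x) / Var(x), using the population variance (divide by n)."""
    original = pvariance(data)
    scaled = pvariance([m * x for x in data])
    return scaled / original


# Hypothetical observations, chosen only for illustration.
# Fractions keep the arithmetic exact.
sample = [Fraction(v) for v in (3, 7, 8, 12, 15)]

print(scaled_variance_ratio(sample, 2))  # 4, that is 2^2
print(scaled_variance_ratio(sample, 3))  # 9, that is 3^2
```

Whatever data set we start with, the ratio depends only on 'm'. That is why the variance 5 in the example becomes 20.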
Solved example 15.18
The mean of 5 observations is 4.4 and their variance is 8.24. If three of the observations are 1, 2 and 6, find the other two observations.
Solution:
1. We can use the usual table that is used for calculation of variance. It is shown below:
Table 15.29
•
In column I, we fill up the observations ($x_i$). The two unknown observations are written as $x_4$ and $x_5$.
•
In column II, we fill up the frequencies. Each of the five observations occurs only once. So all values in column II are '1'.
•
In column III, we write the product $f_i x_i$. Since all $f_i$ values are '1', column III will be the same as column I.
•
In column IV, we write the deviations from the mean. We are already given the mean. It is 4.4.
•
In column V, we write the squares of the deviations. Here the $f_i$ values have no effect, because they are all '1'.
•
After filling all the columns, we can start the calculations.
2. We know that column III is used to find the mean.
•
Sum of all the values in column III is:
$1 + 2 + 6 + x_4 + x_5 = 9 + x_4 + x_5$
•
So we can write:
Mean = $\frac{9 + x_4 + x_5}{5}~=~4.4$
•
From this, we get:
$x_4 + x_5 = 13$
3. We know that column V is used to find the variance.
Sum of all values in column V can be calculated as follows:
$\begin{array}{ll}
{}&{\text{Sum}}
& {~=~}& {11.56 + 5.76 + 2.56 + (x_4 - 4.4)^2 + (x_5 - 4.4)^2} &{} \\
{}&{}
& {~=~}& {19.88 + (x_4 - 4.4)^2 + (x_5 - 4.4)^2} &{} \\
{}&{}
& {~=~}& {19.88 ~+~{x_4}^2 - 8.8 x_4 + 19.36~+~{x_5}^2 - 8.8 x_5 + 19.36} &{} \\
{}&{}
& {~=~}& {{x_4}^2 +{x_5}^2 - 8.8 (x_4 + x_5) + 19.36 + 19.36 + 19.88} &{} \\
{}&{}
& {~=~}& {{x_4}^2 +{x_5}^2 - 8.8 (x_4 + x_5) + 58.6} &{} \\
{}&{}
& {~=~}& {{x_4}^2 +{x_5}^2 - 8.8 (13) + 58.6~\color{green}{\text{- - - I}}} &{} \\
{}&{}
& {~=~}& {{x_4}^2 +{x_5}^2 - 114.4 + 58.6} &{} \\
{}&{}
& {~=~}& {{x_4}^2 +{x_5}^2 - 55.8} &{} \\
\end{array}$
◼ Remarks:
•
Line marked as I:
In this line, we use the result from (2) to replace the sum $(x_4 + x_5)$
4. Based on the above sum of the squares, we can write:
$\begin{array}{ll}
{}&{\text{Variance}}
& {~=~}& {\frac{{x_4}^2 +{x_5}^2 - 55.8}{5}}
&{} \\
{\Rightarrow}&{8.24}
& {~=~}& {\frac{{x_4}^2 +{x_5}^2 - 55.8}{5}}
&{} \\
{\Rightarrow}&{41.2}
& {~=~}& {{x_4}^2 +{x_5}^2 - 55.8}
&{} \\
{\Rightarrow}&{97}
& {~=~}& {{x_4}^2 +{x_5}^2}
&{} \\
\end{array}$
5. Now we have two useful results:
♦ From (2) we have: $x_4 + x_5 = 13$
♦ From (4) we have: ${x_4}^2 +{x_5}^2 = 97$
•
Using these two results, we can calculate $2 x_4 x_5$. This is shown below:
$\begin{array}{ll}
{}&{(x_4 + x_5)^2}
& {~=~}& {13^2}
&{} \\
{\Rightarrow}&{{x_4}^2 +2 x_4 x_5 + {x_5}^2 }
& {~=~}& {169}
&{} \\
{\Rightarrow}&{97 +2 x_4 x_5}
& {~=~}& {169}
&{} \\
{\Rightarrow}&{2 x_4 x_5}
& {~=~}& {169 - 97}
&{} \\
{\Rightarrow}&{2 x_4 x_5}
& {~=~}& {72}
&{} \\
\end{array}$
6. Consider the two results:
♦ From (4) we have: ${x_4}^2 +{x_5}^2 = 97$
♦ From (5) we have: $2 x_4 x_5 = 72$
•
Using these two results, we can calculate $ x_4 - x_5$. This is shown below:
$\begin{array}{ll}
{}&{(x_4 - x_5)^2}
& {~=~}& {{x_4}^2 -2 x_4 x_5 + {x_5}^2 }
&{} \\
{\Rightarrow}&{(x_4 - x_5)^2}
& {~=~}& {{x_4}^2 + {x_5}^2 -2 x_4 x_5}
&{} \\
{\Rightarrow}&{(x_4 - x_5)^2}
& {~=~}& {97 - 72}
&{} \\
{\Rightarrow}&{(x_4 - x_5)^2}
& {~=~}& {25}
&{} \\
{\Rightarrow}&{x_4 - x_5}
& {~=~}& {\pm 5}
&{} \\
\end{array}$
7. Consider the three equations:
(i) From (2) we have: $x_4 + x_5 = 13$
(ii) From (6) we have: $x_4 - x_5 = 5$
(iii) From (6) we have: $x_4 - x_5 = -5$
•
Solving (i) and (ii), we get:
$x_4 = 9,~x_5 = 4$
•
Solving (i) and (iii), we get:
$x_4 = 4,~x_5 = 9$
8. So the unknown observations are: 4 and 9
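The same answer can be cross-checked with a short program. This is only a sketch and it takes a slightly different route from the table used above: it assumes the identity $\sum x_i^2 = n(\sigma^2 + {\bar x}^2)$, which is equivalent to the variance formula. The function name `find_two_missing` is our own.

```python
import math


def find_two_missing(known, n, mean, variance):
    """Recover two unknown observations from the mean and the variance.

    Assumes the population variance (divide by n) and exactly two unknowns.
    """
    s = n * mean - sum(known)                 # x4 + x5
    sum_sq = n * (variance + mean ** 2)       # sum of squares of all n values
    q = sum_sq - sum(x * x for x in known)    # x4^2 + x5^2
    product = (s * s - q) / 2                 # x4 * x5, from (x4 + x5)^2
    diff = math.sqrt(s * s - 4 * product)     # |x4 - x5|
    return (s - diff) / 2, (s + diff) / 2


small, large = find_two_missing([1, 2, 6], n=5, mean=4.4, variance=8.24)
print(round(small, 6), round(large, 6))
```

The values are rounded before printing because 4.4 and 8.24 cannot be stored exactly as floating point numbers.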
Solved example 15.19
Show that adding a constant number 'a' to each observation, or subtracting it from each observation, does not affect the variance.
Solution:
◼ Let the original n observations be denoted as $x_i$
•
Then the original mean will be given by: $\bar x ~=~\frac{\sum x_i}{n}$
•
Also, the original variance will be given by: ${\sigma^2}_x~=~\frac{\sum(x_i - \bar x)^2}{n}$
Part (i): Adding 'a' to each of the original observations
1. Let the modified n observations be denoted as $y_i$
•
Then the modified observations can be written as:
$y_1 = (x_1 + a),~y_2 = (x_2 + a),~y_3 = (x_3 + a),~\ldots,~y_n = (x_n + a)$
2. The modified mean can be calculated as follows:
$\begin{array}{ll}
{}&{\bar y}
& {~=~}& {\frac{\sum y_i}{n}} &{} \\
{}&{}
& {~=~}& {\frac{\sum (x_i + a)}{n}} &{} \\
{}&{}
& {~=~}& {\frac{\left(\sum x_i \right) ~+~ \left( \sum a \right)}{n}} &{} \\
{}&{}
& {~=~}& {\frac{\left(\sum x_i \right) ~+~ na}{n}} &{} \\
{}&{}
& {~=~}& {\frac{\left(\sum x_i \right)}{n}~+~\frac{na}{n}} &{} \\
{}&{}
& {~=~}& {\bar x ~+~ a} &{} \\
\end{array}$
3. Now the modified variance can be calculated as follows:
$\begin{array}{ll}
{}&{{\sigma^2}_y}
& {~=~}& {\frac{\sum(y_i - \bar y)^2}{n}} &{} \\
{}&{}
& {~=~}& {\frac{\sum \left[(x_i + a) - (\bar x + a) \right]^2}{n}} &{} \\
{}&{}
& {~=~}& {\frac{\sum \left[x_i + a - \bar x - a \right]^2}{n}} &{} \\
{}&{}
& {~=~}& {\frac{\sum \left[x_i - \bar x \right]^2}{n}} &{} \\
{}&{}
& {~=~}& {{\sigma^2}_x} &{} \\
\end{array}$
◼ That means, there is no change in the variance.
Part (ii): Subtracting 'a' from each of the original observations
1. Let the modified n observations be denoted as $y_i$
•
Then the modified observations can be written as:
$y_1 = (x_1 - a),~y_2 = (x_2 - a),~y_3 = (x_3 - a),~\ldots,~y_n = (x_n - a)$
2. The modified mean can be calculated as follows:
$\begin{array}{ll}
{}&{\bar y}
& {~=~}& {\frac{\sum y_i}{n}} &{} \\
{}&{}
& {~=~}& {\frac{\sum (x_i - a)}{n}} &{} \\
{}&{}
& {~=~}& {\frac{\left(\sum x_i \right) ~-~ \left( \sum a \right)}{n}} &{} \\
{}&{}
& {~=~}& {\frac{\left(\sum x_i \right) ~-~ na}{n}} &{} \\
{}&{}
& {~=~}& {\frac{\left(\sum x_i \right)}{n}~-~\frac{na}{n}} &{} \\
{}&{}
& {~=~}& {\bar x ~-~ a} &{} \\
\end{array}$
3. Now the modified variance can be calculated as follows:
$\begin{array}{ll}
{}&{{\sigma^2}_y}
& {~=~}& {\frac{\sum(y_i - \bar y)^2}{n}} &{} \\
{}&{}
& {~=~}& {\frac{\sum \left[(x_i - a) - (\bar x - a) \right]^2}{n}} &{} \\
{}&{}
& {~=~}& {\frac{\sum \left[x_i - a - \bar x + a \right]^2}{n}} &{} \\
{}&{}
& {~=~}& {\frac{\sum \left[x_i - \bar x \right]^2}{n}} &{} \\
{}&{}
& {~=~}& {{\sigma^2}_x} &{} \\
\end{array}$
◼ That means, there is no change in the variance.
•
Based on the results in parts (i) and (ii), we can write:
Adding a constant number 'a' to each observation, or subtracting it from each observation, does not affect the variance.
•
From step (2) of part (i), we get a useful result. It can be written in 3 steps:
(i) Average of a set of observations is 'A'
(ii) The set is modified by adding a constant 'a' to each observation.
(iii) Then the average of the modified set is (A+a).
•
From step (2) of part (ii), we get a useful result. It can be written in 3 steps:
(i) Average of a set of observations is 'A'
(ii) The set is modified by subtracting a constant 'a' from each observation.
(iii) Then the average of the modified set is (A-a).
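Both results of this example (the mean shifts by 'a', the variance does not change) can be checked on a small data set. This is only an illustrative sketch: the list `sample` is hypothetical, and the helper name `shift` is our own. Subtracting 'a' is treated as adding '-a'.

```python
from fractions import Fraction
from statistics import mean, pvariance


def shift(data, a):
    """Return a new list in which 'a' is added to every observation."""
    return [x + a for x in data]


# Hypothetical observations; Fractions keep the comparison exact.
sample = [Fraction(v) for v in (2, 4, 4, 5, 10)]

for a in (3, -3):  # a = 3 is addition, a = -3 is subtraction of 3
    shifted = shift(sample, a)
    print(a, mean(shifted) == mean(sample) + a,
          pvariance(shifted) == pvariance(sample))
```

Each line of output shows the constant used, followed by two checks: the first for the mean and the second for the variance.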
Solved example 15.20
The mean and standard deviation of 100 observations were calculated as 40 and 5.1 respectively, by a student who mistakenly took 50 instead of 40 for one observation. What are the correct mean and standard deviation?
Solution:
Part (i): Finding the correct mean.
1. Let us write the mean for this problem: $\bar x ~=~ \frac{\sum_{i=1}^{i=100}{x_i}}{100}$
2. Ninety-nine observations are correct. One observation is wrong. So we will split the above equation. We get:
$\begin{array}{ll}
{}&{{\bar x}_{\text{wrong}}}
& {~=~}& {\frac{\sum_{i=1}^{i=99}{x_i}~+~{x_{\text{wrong}}}}{100}}
&{} \\
{\Rightarrow}&{{\bar x}_{\text{wrong}}}
& {~=~}& {\frac{\sum_{i=1}^{i=99}{x_i}}{100}~+~\frac{x_{\text{wrong}}}{100}}
&{} \\
{\Rightarrow}&{40}
& {~=~}& {\frac{\sum_{i=1}^{i=99}{x_i}}{100}~+~\frac{50}{100}~~ \color {green} {\text{- - - (I)}}}
&{} \\
{\Rightarrow}&{40 - 0.50}
& {~=~}& {\frac{\sum_{i=1}^{i=99}{x_i}}{100}}
&{} \\
{\Rightarrow}&{39.5}
& {~=~}& {\frac{\sum_{i=1}^{i=99}{x_i}}{100}}
&{} \\
{\Rightarrow}&{\sum_{i=1}^{i=99}{x_i}}
& {~=~}& {3950}
&{} \\
\end{array}$
◼ Remarks:
•
Line marked as I:
♦ We are given that, the wrong mean is 40.
♦ We are given that, the wrong observation is 50.
3. Now we can find the correct mean:
$\begin{array}{ll}
{}&{{\bar x}_{\text{correct}}}
& {~=~}& {\frac{\sum_{i=1}^{i=99}{x_i}~+~x_{\text{correct}}}{100}}
&{} \\
{}&{}
& {~=~}& {\frac{\sum_{i=1}^{i=99}{x_i}}{100}~+~\frac{x_{\text{correct}}}{100}}
&{} \\
{}&{}
& {~=~}& {\frac{3950}{100}~+~\frac{40}{100}~~ \color {green} {\text{- - - (I)}}}
&{} \\
{}&{}
& {~=~}& {39.50 + 0.40}
&{} \\
{}&{}
& {~=~}& {39.9}
&{} \\
\end{array}$
◼ Remarks:
•
Line marked as I:
♦ We use the result $\sum_{i=1}^{i=99}{x_i}~=~3950$ from step (2).
♦ We are given that, the correct observation is 40.
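The steps of part (i) amount to removing the wrong observation from the total and putting the correct one in its place. A minimal sketch is given below. The function name `corrected_mean` and its parameter names are our own.

```python
def corrected_mean(n, wrong_mean, wrong_value, correct_value):
    """Correct a mean after one observation was recorded wrongly."""
    wrong_total = n * wrong_mean                      # total as the student had it
    correct_total = wrong_total - wrong_value + correct_value
    return correct_total / n


print(corrected_mean(100, 40, 50, 40))  # 39.9
```

Here the total changes from 4000 to 3990, which is the same as 3950 + 40 in step (3).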
Part (ii): Finding the correct variance and standard deviation.
1. We are given that:
Wrong standard deviation is 5.1
•
So the wrong variance = $(5.1)^2$
2. Let us write the variance for this problem: $\sigma^2~=~ \frac{\sum_{i=1}^{i=100}{(x_i - \bar x)^2}}{100}$
3. Ninety-nine observations are correct. One observation is wrong. So we will split the above equation. We get:
$\begin{array}{ll}
{}&{{\sigma^2}_{\text{wrong}}}
&
{~=~}& {\frac{\sum_{i=1}^{i=99}{(x_i - {\bar
x}_{\text{wrong}})^2}~+~(x_{\text{wrong}} - {\bar
x}_{\text{wrong}})^2}{100}}
&{} \\
{\Rightarrow}&{{\sigma^2}_{\text{wrong}}}
& {~=~}& {\frac{\sum_{i=1}^{i=99}{(x_i - {\bar
x}_{\text{wrong}})^2}}{100}~+~\frac{(x_{\text{wrong}} - {\bar
x}_{\text{wrong}})^2}{100}}
&{} \\
{\Rightarrow}&{5.1^2}
& {~=~}& {\frac{\sum_{i=1}^{i=99}{(x_i - 40)^2}}{100}~+~\frac{(50 - 40)^2}{100}~~ \color {green} {\text{- - - (I)}}}
&{} \\
{\Rightarrow}&{5.1^2}
& {~=~}& {\frac{\sum_{i=1}^{i=99}{(x_i - 40)^2}}{100}~+~\frac{(10)^2}{100}}
&{} \\
{\Rightarrow}&{5.1^2}
& {~=~}& {\frac{\sum_{i=1}^{i=99}{(x_i - 40)^2}}{100}~+~1}
&{} \\
{\Rightarrow}&{5.1^2~-~1^2}
& {~=~}& {\frac{\sum_{i=1}^{i=99}{\left({x_i}^2 - 80 x_i + 40^2 \right)}}{100}}
&{} \\
{\Rightarrow}&{(5.1+1)(5.1 - 1)}
& {~=~}& {\frac{\sum_{i=1}^{i=99}{\left({x_i}^2 \right)} - 80 \sum_{i=1}^{i=99}{(x_i)} + 99 \times 40^2 }{100}}
&{} \\
{\Rightarrow}&{(6.1)(4.1)}
&
{~=~}& {\frac{\sum_{i=1}^{i=99}{\left({x_i}^2 \right)} - 80 (3950) +
99 \times 40^2 }{100}~~ \color {green} {\text{- - - (II)}}}
&{} \\
{\Rightarrow}&{25.01}
& {~=~}&
{\frac{\sum_{i=1}^{i=99}{\left({x_i}^2 \right)} - 316000 + 158400 }{100}}
&{} \\
{\Rightarrow}&{25.01}
& {~=~}&
{\frac{\sum_{i=1}^{i=99}{\left({x_i}^2 \right)} -157600 }{100}}
&{} \\
{\Rightarrow}&{2501~+~157600}
& {~=~}&
{\sum_{i=1}^{i=99}{\left({x_i}^2 \right)}}
&{} \\
{\Rightarrow}&{160101}
& {~=~}&
{\sum_{i=1}^{i=99}{\left({x_i}^2 \right)}}
&{} \\
\end{array}$
◼ Remarks:
•
Line marked as I:
♦ In (1) of part (ii), we wrote the wrong variance.
♦ We are given that, the wrong mean is 40.
♦ We are given that, the wrong observation is 50.
•
Line marked as II:
♦ We use the result $\sum_{i=1}^{i=99}{x_i}~=~3950$ from step (2) of part (i).
4. Now we can find the correct variance.
$\begin{array}{ll}
{}&{{\sigma^2}_{\text{correct}}}
& {~=~}& {\frac{\sum_{i=1}^{i=99}{(x_i - {\bar
x}_{\text{correct}})^2}~+~(x_{\text{correct}} - {\bar
x}_{\text{correct}})^2}{100}}
&{} \\
{}&{}
& {~=~}& {\frac{\sum_{i=1}^{i=99}{(x_i - {\bar
x}_{\text{correct}})^2}}{100}~+~\frac{(x_{\text{correct}} - {\bar
x}_{\text{correct}})^2}{100}}
&{} \\
{}&{}
& {~=~}& {\frac{\sum_{i=1}^{i=99}{(x_i - 39.9)^2}}{100}~+~\frac{(40 - 39.9)^2}{100}~~ \color {green} {\text{- - - (I)}}}
&{} \\
{}&{}
&
{~=~}& {\frac{\sum_{i=1}^{i=99}{\left({x_i}^2 - 2 \times 39.9
\times x_i + 39.9^2 \right)}}{100}~+~\frac{(0.10)^2}{100}}
&{} \\
{}&{}
&
{~=~}& {\frac{\sum_{i=1}^{i=99}{\left({x_i}^2 \right)} - 2 \times
39.9 \times \sum_{i=1}^{i=99}{(x_i)} + 99 \times 39.9^2
}{100}~+~\frac{(0.10)^2}{100}}
&{} \\
{}&{}
& {~=~}& {\frac{160101- 2 \times 39.9 \times 3950 + 99 \times
39.9^2 }{100}~+~\frac{(0.10)^2}{100}~~ \color {green} {\text{- - - (II)}}}
&{} \\
{}&{}
& {~=~}& {\frac{160101- 315210
+ 157608.99 }{100}~+~\frac{(0.10)^2}{100}}
&{} \\
{}&{}
& {~=~}& {\frac{2499.99}{100}~+~\frac{(0.10)^2}{100}}
&{} \\
{}&{}
& {~=~}& {\frac{2499.99~+~0.01}{100}}
&{} \\
{}&{}
& {~=~}& {\frac{2500}{100}}
&{} \\
{}&{}
& {~=~}& {25}
&{} \\
\end{array}$
◼ Remarks:
•
Line marked as I:
♦ In part (i) we calculated the correct mean as 39.9.
♦ We are given that, the correct observation is 40.
•
Line marked as II:
♦ We use the result $\sum_{i=1}^{i=99}{(x_i)^2}~=~160101$ from step (3) of part (ii).
♦ We use the result $\sum_{i=1}^{i=99}{x_i}~=~3950$ from step (2) of part (i).
5. Thus we get the correct standard deviation:
${\sigma}_{\text{correct}}~=~ \sqrt{{\sigma^2}_{\text{correct}}} ~=~\sqrt{25}~=~5$
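The whole correction can also be checked with a short program. This is only a sketch and it does not follow the steps above line by line: it assumes the identity $\sigma^2 = \frac{\sum x_i^2}{n} - {\bar x}^2$, which is equivalent to the variance formula used in this example. The function name `corrected_mean_and_sd` is our own.

```python
import math
from fractions import Fraction


def corrected_mean_and_sd(n, wrong_mean, wrong_sd, wrong_value, correct_value):
    """Correct the mean and standard deviation after one wrong observation."""
    # Sum and sum of squares of all n values, as the student had them.
    wrong_sum = n * wrong_mean
    wrong_sum_sq = n * (wrong_sd ** 2 + wrong_mean ** 2)
    # Replace the wrong observation with the correct one.
    total = wrong_sum - wrong_value + correct_value
    total_sq = wrong_sum_sq - wrong_value ** 2 + correct_value ** 2
    mean = total / n
    variance = total_sq / n - mean ** 2
    return mean, math.sqrt(variance)


# Fractions keep 5.1 exact, so the variance comes out as exactly 25.
correct_mean, correct_sd = corrected_mean_and_sd(
    100, Fraction(40), Fraction("5.1"), 50, 40
)
print(float(correct_mean), correct_sd)  # 39.9 5.0
```

Inside the function, the corrected sum of squares is 161701. This agrees with 160101 + 40² from the steps above.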
Link to a few more miscellaneous examples is given below:
In the next chapter, we will see probability.