Mean of Ungrouped Data
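The mean (or average) of observations, as we know, is the sum of the values of all the observations divided by the total number of observations.
From Class IX, recall that if $x_1, x_2, \ldots, x_n$ are observations with respective frequencies $f_1, f_2, \ldots, f_n$, then this means that observation $x_1$ occurs $f_1$ times, $x_2$ occurs $f_2$ times, and so on.
Now, the sum of the values of all the observations = $f_1 x_1 + f_2 x_2 + \cdots + f_n x_n$, and the number of observations = $f_1 + f_2 + \cdots + f_n$.
So, the mean $\bar{x}$ of the data is given by
$$\bar{x} = \frac{f_1 x_1 + f_2 x_2 + \cdots + f_n x_n}{f_1 + f_2 + \cdots + f_n}$$
Recall that we can write this in short form by using the Greek letter $\sum$ (capital sigma), which means summation. That is,
$$\bar{x} = \frac{\sum_{i=1}^{n} f_i x_i}{\sum_{i=1}^{n} f_i}$$
which, more briefly, is written as $\bar{x} = \frac{\sum f_i x_i}{\sum f_i}$, if it is understood that $i$ varies from 1 to $n$.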
Let us apply this formula to find the mean in the following example.
Example 1 : The marks obtained by 30 students of Class X of a certain school in a Mathematics paper consisting of 100 marks are presented in the table below. Find the mean of the marks obtained by the students.
| Marks obtained(xi) | 10 | 20 | 36 | 40 | 50 | 56 | 60 | 70 | 72 | 80 | 88 | 92 | 95 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Number of students(fi) | 1 | 1 | 3 | 4 | 3 | 2 | 4 | 4 | 1 | 1 | 2 | 3 | 1 |
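Solution: Recall that to find the mean marks, we require the product of each $x_i$ with the corresponding frequency $f_i$. So, let us put them in a column as shown in the table below.
| Marks obtained ($x_i$) | Number of students ($f_i$) | $f_i x_i$ |
|---|---|---|
| 10 | 1 | 10 |
| 20 | 1 | 20 |
| 36 | 3 | 108 |
| 40 | 4 | 160 |
| 50 | 3 | 150 |
| 56 | 2 | 112 |
| 60 | 4 | 240 |
| 70 | 4 | 280 |
| 72 | 1 | 72 |
| 80 | 1 | 80 |
| 88 | 2 | 176 |
| 92 | 3 | 276 |
| 95 | 1 | 95 |
| Total | $\sum f_i = 30$ | $\sum f_i x_i = 1779$ |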
Now, $\bar{x} = \frac{\sum f_i x_i}{\sum f_i} = \frac{1779}{30} = 59.3$
Therefore, the mean marks obtained is 59.3.
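The calculation in Example 1 can be checked with a short Python sketch. The function name `frequency_mean` is only an illustrative choice, and exact fractions are used so that no rounding creeps in.

```python
from fractions import Fraction

def frequency_mean(values, freqs):
    """Return sum(f_i * x_i) / sum(f_i) as an exact fraction (integer inputs)."""
    if len(values) != len(freqs):
        raise ValueError("values and freqs must have the same length")
    total = sum(f * x for x, f in zip(values, freqs))  # sum of f_i * x_i
    count = sum(freqs)                                 # sum of f_i
    return Fraction(total, count)

# Data of Example 1: marks x_i and number of students f_i
marks = [10, 20, 36, 40, 50, 56, 60, 70, 72, 80, 88, 92, 95]
students = [1, 1, 3, 4, 3, 2, 4, 4, 1, 1, 2, 3, 1]

mean_marks = frequency_mean(marks, students)
print(float(mean_marks))  # 59.3
```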
In most of our real life situations, data is usually so large that to make a meaningful study it needs to be condensed as grouped data. So, we need to convert given ungrouped data into grouped data and devise some method to find its mean.
Let us convert the ungrouped data of Example 1 into grouped data by forming class-intervals of width, say 15.
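Remember that, while allocating frequencies to each class-interval, students falling in any upper class-limit would be considered in the next class, e.g., the 4 students who have obtained 40 marks would be considered in the class-interval 40-55 and not in 25-40.
With this convention in mind, let us form a grouped frequency distribution table.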
| Class interval | 10-25 | 25-40 | 40-55 | 55-70 | 70-85 | 85-100 |
|---|---|---|---|---|---|---|
| Number of students(fi) | 2 | 3 | 7 | 6 | 6 | 6 |
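The grouping convention above (a value equal to an upper class-limit goes to the next class) can be sketched in Python as follows. The helper `group_into_classes` is a hypothetical name, and it assumes equal-width classes starting at 10.

```python
def group_into_classes(values, freqs, lower, width, n_classes):
    """Count observations in the classes [lower + k*width, lower + (k+1)*width).

    A value equal to an upper class-limit falls in the next class,
    because integer division sends it to the higher index k.
    """
    counts = [0] * n_classes
    for x, f in zip(values, freqs):
        k = (x - lower) // width
        if not 0 <= k < n_classes:
            raise ValueError(f"{x} lies outside the class intervals")
        counts[k] += f
    return counts

# Ungrouped data of Example 1
marks = [10, 20, 36, 40, 50, 56, 60, 70, 72, 80, 88, 92, 95]
students = [1, 1, 3, 4, 3, 2, 4, 4, 1, 1, 2, 3, 1]

grouped = group_into_classes(marks, students, lower=10, width=15, n_classes=6)
print(grouped)  # [2, 3, 7, 6, 6, 6]
```

Note that under this convention a mark of exactly 100 would lie outside the last class 85-100; the data here has no such value.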
Now, for each class-interval, we require a point which would serve as the representative of the whole class.
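It is assumed that the frequency of each class interval is centred around its mid-point.
So the mid-point (or class mark) of each class can be chosen to represent the observations falling in the class. Recall that we find the mid-point of a class (or its class mark) by finding the average of its upper and lower limits. That is,
$$\text{Class mark} = \frac{\text{Upper class limit} + \text{Lower class limit}}{2}$$
With reference to the table above, for the class 10-25, the class mark is $\frac{10 + 25}{2}$, i.e., 17.5.
Similarly, we can find the class marks of the remaining class intervals. We put them in the table below. These class marks serve as our $x_i$'s. In general, for the $i$th class interval, we have the frequency $f_i$ corresponding to the class mark $x_i$. We can now proceed to compute the mean in the same manner as in Example 1.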
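| Class interval | Number of students ($f_i$) | Class mark ($x_i$) | $f_i x_i$ |
|---|---|---|---|
| 10-25 | 2 | 17.5 | 35.0 |
| 25-40 | 3 | 32.5 | 97.5 |
| 40-55 | 7 | 47.5 | 332.5 |
| 55-70 | 6 | 62.5 | 375.0 |
| 70-85 | 6 | 77.5 | 465.0 |
| 85-100 | 6 | 92.5 | 555.0 |
| Total | $\sum f_i = 30$ | | $\sum f_i x_i = 1860.0$ |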
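The sum of the values in the last column gives us $\sum f_i x_i$. So, the mean $\bar{x}$ of the given data is given by
$$\bar{x} = \frac{\sum f_i x_i}{\sum f_i} = \frac{1860.0}{30} = 62$$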
This new method of finding the mean is known as the Direct Method.
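As a rough illustration, the Direct Method for the grouped marks can be written in a few lines of Python. The name `direct_method_mean` is an assumption made for this sketch.

```python
def direct_method_mean(intervals, freqs):
    """Direct Method: use class marks as x_i and return sum(f_i*x_i)/sum(f_i)."""
    class_marks = [(lower + upper) / 2 for lower, upper in intervals]
    return sum(f * x for f, x in zip(freqs, class_marks)) / sum(freqs)

# Grouped data formed from Example 1 (class width 15)
intervals = [(10, 25), (25, 40), (40, 55), (55, 70), (70, 85), (85, 100)]
freqs = [2, 3, 7, 6, 6, 6]

print(direct_method_mean(intervals, freqs))  # 62.0
```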
We observe that the table in Example 1 and the grouped table above use the same data and employ the same formula for the calculation of the mean, but the results obtained are different.
Can you think why this is so, and which one is more accurate? The difference in the two values is because of the mid-point assumption in the grouped table, 59.3 being the exact mean, while 62 is an approximate mean.
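Sometimes when the numerical values of $x_i$ and $f_i$ are large, finding the product of $x_i$ and $f_i$ becomes tedious and time consuming. So, for such situations, let us think of a method of reducing these calculations.
We can do nothing with the $f_i$'s, but we can change each $x_i$ to a smaller number so that our calculations become easy. How do we do this? What about subtracting a fixed number from each of these $x_i$'s? Let us try this method.
The first step is to choose one among the $x_i$'s as the assumed mean, and denote it by 'a'. Also, to further reduce our calculation work, we may take 'a' to be that $x_i$ which lies in the centre of $x_1, x_2, \ldots, x_n$. So, we can choose a = 47.5 or a = 62.5. Let us choose a = 47.5.
The next step is to find the difference $d_i$ between a and each of the $x_i$'s, that is, the deviation of 'a' from each of the $x_i$'s,
i.e., $d_i = x_i - a = x_i - 47.5$
The third step is to find the product of $d_i$ with the corresponding $f_i$, and take the sum of all the $f_i d_i$'s. The calculations are shown in the table below.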
| Class interval | Number of students ($f_i$) | Class mark ($x_i$) | $d_i = x_i - 47.5$ | $f_i d_i$ |
|---|---|---|---|---|
| 10-25 | 2 | 17.5 | -30 | -60 |
| 25-40 | 3 | 32.5 | -15 | -45 |
| 40-55 | 7 | 47.5 | 0 | 0 |
| 55-70 | 6 | 62.5 | 15 | 90 |
| 70-85 | 6 | 77.5 | 30 | 180 |
| 85-100 | 6 | 92.5 | 45 | 270 |
| Total | $\sum f_i = 30$ | | | $\sum f_i d_i = 435$ |
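So, from the table above, the mean of the deviations, $\bar{d} = \frac{\sum f_i d_i}{\sum f_i}$.
Now, let us find the relation between $\bar{d}$ and $\bar{x}$. Since in obtaining $d_i$ we subtracted 'a' from each $x_i$, in order to get the mean $\bar{x}$ we need to add 'a' to $\bar{d}$. This can be explained mathematically as:
$$\bar{d} = \frac{\sum f_i (x_i - a)}{\sum f_i} = \frac{\sum f_i x_i}{\sum f_i} - a\,\frac{\sum f_i}{\sum f_i} = \bar{x} - a$$
So, $\bar{x} = a + \bar{d}$, i.e., $\bar{x} = a + \frac{\sum f_i d_i}{\sum f_i}$
Substituting the values of a, $\sum f_i d_i$ and $\sum f_i$, we get $\bar{x} = 47.5 + \frac{435}{30} = 47.5 + 14.5 = 62$.
Therefore, the mean of the marks obtained by the students is 62.
The method discussed above is called the Assumed Mean Method.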
Activity 1 : From the Direct Method table, find the mean by taking each of the $x_i$ (i.e., 17.5, 32.5, and so on) as 'a'.
What do you observe? You will find that the mean determined in each case is the same, i.e., 62. (Why?)
So, we can say that the value of the mean obtained does not depend on the choice of 'a'.
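Activity 1 can also be tried numerically. The sketch below is one possible way to do so (the function name `assumed_mean_method` is chosen here for illustration): it repeats the assumed mean calculation with every class mark taken as 'a' in turn.

```python
def assumed_mean_method(class_marks, freqs, a):
    """Return a + sum(f_i * d_i) / sum(f_i), where d_i = x_i - a."""
    deviations = [x - a for x in class_marks]
    mean_deviation = sum(f * d for f, d in zip(freqs, deviations)) / sum(freqs)
    return a + mean_deviation

class_marks = [17.5, 32.5, 47.5, 62.5, 77.5, 92.5]
freqs = [2, 3, 7, 6, 6, 6]

# Try every class mark as the assumed mean 'a'
results = {a: assumed_mean_method(class_marks, freqs, a) for a in class_marks}
for a, mean in results.items():
    print(f"a = {a}: mean = {mean}")  # the mean is 62.0 for every choice of a
```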
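Observe that in the above given table, the values in Column 4 are all multiples of 15.
So, if we divide the values in the entire Column 4 by 15, we would get smaller numbers to multiply with $f_i$. (Here, 15 is the class size of each class interval.)
So, let $u_i = \frac{x_i - a}{h}$, where a is the assumed mean and h is the class size.
Now, we calculate $u_i$ in this way and continue as before (i.e., find $f_i u_i$ and then $\sum f_i u_i$).
Taking h = 15, let us form the following table: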
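| Class interval | $f_i$ | $x_i$ | $d_i = x_i - a$ | $u_i = \frac{x_i - a}{h}$ | $f_i u_i$ |
|---|---|---|---|---|---|
| 10-25 | 2 | 17.5 | -30 | -2 | -4 |
| 25-40 | 3 | 32.5 | -15 | -1 | -3 |
| 40-55 | 7 | 47.5 | 0 | 0 | 0 |
| 55-70 | 6 | 62.5 | 15 | 1 | 6 |
| 70-85 | 6 | 77.5 | 30 | 2 | 12 |
| 85-100 | 6 | 92.5 | 45 | 3 | 18 |
| Total | $\sum f_i = 30$ | | | | $\sum f_i u_i = 29$ |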
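Let $\bar{u} = \frac{\sum f_i u_i}{\sum f_i}$. Here, again, let us find the relation between $\bar{u}$ and $\bar{x}$.
We have, $u_i = \frac{x_i - a}{h}$
Therefore,
$$\begin{aligned}
\bar{u} &= \frac{\sum f_i \frac{(x_i - a)}{h}}{\sum f_i} = \frac{1}{h}\left[\frac{\sum f_i x_i - a \sum f_i}{\sum f_i}\right] \\
&= \frac{1}{h}\left[\frac{\sum f_i x_i}{\sum f_i} - a\,\frac{\sum f_i}{\sum f_i}\right] \\
&= \frac{1}{h}\left[\bar{x} - a\right]
\end{aligned}$$
So, $h\bar{u} = \bar{x} - a$
i.e., $\bar{x} = a + h\bar{u}$
So, $\bar{x} = a + h\left(\frac{\sum f_i u_i}{\sum f_i}\right)$
Now, substituting the values of a, h, $\sum f_i u_i$ and $\sum f_i$ from the table above, we get
$$\bar{x} = 47.5 + 15 \times \left(\frac{29}{30}\right) = 47.5 + 14.5 = 62$$
So, the mean marks obtained by a student is 62.
The method discussed above is called the Step-deviation Method.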
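The step-deviation calculation may be sketched in Python as below. The function name `step_deviation_mean` is illustrative only, and exact fractions are used so that dividing by h and multiplying back introduces no rounding error.

```python
from fractions import Fraction

def step_deviation_mean(class_marks, freqs, a, h):
    """Return a + h * (sum(f_i * u_i) / sum(f_i)), where u_i = (x_i - a) / h."""
    a, h = Fraction(a), Fraction(h)
    if h == 0:
        raise ValueError("h must be non-zero")
    u = [(Fraction(x) - a) / h for x in class_marks]           # step deviations u_i
    u_bar = sum(f * ui for f, ui in zip(freqs, u)) / sum(freqs)  # mean of the u_i
    return a + h * u_bar

class_marks = [17.5, 32.5, 47.5, 62.5, 77.5, 92.5]
freqs = [2, 3, 7, 6, 6, 6]

mean_marks = step_deviation_mean(class_marks, freqs, a=47.5, h=15)
print(float(mean_marks))  # 62.0
```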
We note that :
(1) The step-deviation method will be convenient to apply if all the $d_i$'s have a common factor.
(2) The mean obtained by all the three methods is the same.
(3) The assumed mean method and step-deviation method are just simplified forms of the direct method.
(4) The formula $\bar{x} = a + h\bar{u}$ still holds if a and h are not as given above, but are any non-zero numbers such that $u_i = \frac{x_i - a}{h}$.
Let us apply these methods in another example.
Example 2 : The table below gives the percentage distribution of female teachers in the primary schools of rural areas of various states and union territories (U.T.) of India. Find the mean percentage of female teachers by all the three methods discussed in this section.
| Percentage of female teachers | 15-25 | 25-35 | 35-45 | 45-55 | 55-65 | 65-75 | 75-85 |
|---|---|---|---|---|---|---|---|
| Number of States/U.T. | 6 | 11 | 7 | 4 | 4 | 2 | 1 |
Solution:
Let us find the class marks, $x_i$, of each class, and put them in a column, as in the table below.
| Percentage of female teachers | Number of States/U.T. ($f_i$) | $x_i$ |
|---|---|---|
| 15 - 25 | 6 | 20 |
| 25 - 35 | 11 | 30 |
| 35 - 45 | 7 | 40 |
| 45 - 55 | 4 | 50 |
| 55 - 65 | 4 | 60 |
| 65 - 75 | 2 | 70 |
| 75 - 85 | 1 | 80 |
Here we take a = 50, h = 10, then $d_i = x_i - 50$ and $u_i = \frac{x_i - 50}{10}$.
We now find $d_i$ and $u_i$ and put them in the table below.
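| Percentage of female teachers | Number of States/U.T. ($f_i$) | $x_i$ | $d_i = x_i - 50$ | $u_i = \frac{x_i - 50}{10}$ | $f_i x_i$ | $f_i d_i$ | $f_i u_i$ |
|---|---|---|---|---|---|---|---|
| 15 - 25 | 6 | 20 | -30 | -3 | 120 | -180 | -18 |
| 25 - 35 | 11 | 30 | -20 | -2 | 330 | -220 | -22 |
| 35 - 45 | 7 | 40 | -10 | -1 | 280 | -70 | -7 |
| 45 - 55 | 4 | 50 | 0 | 0 | 200 | 0 | 0 |
| 55 - 65 | 4 | 60 | 10 | 1 | 240 | 40 | 4 |
| 65 - 75 | 2 | 70 | 20 | 2 | 140 | 40 | 4 |
| 75 - 85 | 1 | 80 | 30 | 3 | 80 | 30 | 3 |
| Total | 35 | | | | 1390 | -360 | -36 |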
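From the table above, we obtain $\sum f_i = 35$, $\sum f_i x_i = 1390$, $\sum f_i d_i = -360$, $\sum f_i u_i = -36$.
Using the direct method, $\bar{x} = \frac{\sum f_i x_i}{\sum f_i} = \frac{1390}{35} = 39.71$
Using the assumed mean method, $\bar{x} = a + \frac{\sum f_i d_i}{\sum f_i} = 50 + \frac{(-360)}{35} = 39.71$
Using the step-deviation method, $\bar{x} = a + \left(\frac{\sum f_i u_i}{\sum f_i}\right) \times h = 50 + \left(\frac{-36}{35}\right) \times 10 = 39.71$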
Therefore, the mean percentage of female teachers in the primary schools of rural areas is 39.71.
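The three methods can be compared side by side for Example 2 with a short Python sketch. The variable names are chosen for illustration, and exact fractions make it possible to check that the three results agree exactly, not only to two decimal places.

```python
from fractions import Fraction

# Data of Example 2
intervals = [(15, 25), (25, 35), (35, 45), (45, 55), (55, 65), (65, 75), (75, 85)]
freqs = [6, 11, 7, 4, 4, 2, 1]
a, h = 50, 10  # assumed mean and class size used in the solution

x = [Fraction(lower + upper, 2) for lower, upper in intervals]  # class marks
n = sum(freqs)                                                  # sum of f_i

direct = sum(f * xi for f, xi in zip(freqs, x)) / n
assumed = a + sum(f * (xi - a) for f, xi in zip(freqs, x)) / n
step = a + h * (sum(f * (xi - a) / h for f, xi in zip(freqs, x)) / n)

print(direct == assumed == step)  # True
print(round(float(direct), 2))    # 39.71
```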
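Remark : The result obtained by all the three methods is the same.
So the choice of method to be used depends on the numerical values of $x_i$ and $f_i$.
If $x_i$ and $f_i$ are sufficiently small, then the direct method is an appropriate choice. If $x_i$ and $f_i$ are numerically large numbers, then we can go for the assumed mean method or step-deviation method.
If the class sizes are unequal, and $x_i$ are large numerically, we can still apply the step-deviation method by taking h to be a suitable divisor of all the $d_i$'s.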
Example 3 : The distribution below shows the number of wickets taken by bowlers in one-day cricket matches. Find the mean number of wickets by choosing a suitable method. What does the mean signify?
| Number of wickets | 20-60 | 60-100 | 100-150 | 150-250 | 250-350 | 350-450 |
|---|---|---|---|---|---|---|
| Number of bowlers | 7 | 5 | 16 | 12 | 2 | 3 |
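Solution: Here, the class size varies, and the $x_i$'s are large. Let us still apply the step-deviation method with a = 200 and h = 20. Then, we obtain the data as in the table below.
| Number of wickets taken | Number of bowlers ($f_i$) | $x_i$ | $d_i = x_i - 200$ | $u_i = \frac{d_i}{20}$ | $u_i f_i$ |
|---|---|---|---|---|---|
| 20 - 60 | 7 | 40 | -160 | -8 | -56 |
| 60 - 100 | 5 | 80 | -120 | -6 | -30 |
| 100 - 150 | 16 | 125 | -75 | -3.75 | -60 |
| 150 - 250 | 12 | 200 | 0 | 0 | 0 |
| 250 - 350 | 2 | 300 | 100 | 5 | 10 |
| 350 - 450 | 3 | 400 | 200 | 10 | 30 |
| Total | 45 | | | | -106 |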
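So, $\bar{u} = \frac{\sum f_i u_i}{\sum f_i} = \frac{-106}{45}$
Therefore, $\bar{x} = a + h\bar{u} = 200 + 20\left(\frac{-106}{45}\right) = 200 - 47.11 = 152.89$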
This tells us that, on an average, the number of wickets taken by these 45 bowlers in one-day cricket is 152.89.
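The working of Example 3, where the class sizes are unequal, can be reproduced row by row with the following Python sketch. The variable names are illustrative, and a = 200 and h = 20 are the values used in the solution above.

```python
from fractions import Fraction

# Data of Example 3 (unequal class sizes)
intervals = [(20, 60), (60, 100), (100, 150), (150, 250), (250, 350), (350, 450)]
bowlers = [7, 5, 16, 12, 2, 3]
a, h = 200, 20

fu_total = Fraction(0)
for (lower, upper), f in zip(intervals, bowlers):
    x = Fraction(lower + upper, 2)  # class mark x_i
    u = (x - a) / h                 # step deviation u_i (need not be an integer)
    fu_total += f * u
    print(f"{lower}-{upper}: x = {float(x)}, u = {float(u)}, f*u = {float(f * u)}")

mean_wickets = a + h * fu_total / sum(bowlers)
print(fu_total)                       # -106
print(round(float(mean_wickets), 2))  # 152.89
```

Here h = 20 does not divide every $d_i$ (the third class gives $u_i = -3.75$), yet the formula still yields the same mean as the direct method would.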
Now, let us see how well you can apply the concepts discussed in this section!
Activity 2 :
Divide the students of your class into three groups and ask each group to do one of the following activities.
1. Collect the marks obtained by all the students of your class in Mathematics in the latest examination conducted by your school. Form a grouped frequency distribution of the data obtained.
2. Collect the daily maximum temperatures recorded for a period of 30 days in your city. Present this data as a grouped frequency table.
3. Measure the heights of all the students of your class (in cm) and form a grouped frequency distribution table of this data.
After all the groups have collected the data and formed grouped frequency distribution tables, the groups should find the mean in each case by the method which they find appropriate.