Mean of UnGrouped Data

The mean (or average) of observations, as we know, is the sum of the values of all the observations divided by the number of observations.

From Class IX, recall that if x1, x2, . . ., xn are observations with respective frequencies f1, f2, . . ., fn, then this means observation x1 occurs f1 times, x2 occurs f2 times, and so on.

Now, the sum of the values of all the observations = f1x1 + f2x2 + . . . + fnxn, and the number of observations = f1 + f2 + . . . + fn.

So, the mean x̄ of the data is given by

\bar{x} = \frac{f_1 x_1 + f_2 x_2 + \cdots + f_n x_n}{f_1 + f_2 + \cdots + f_n}

Recall that we can write this in short form by using the Greek letter Σ (capital sigma), which means summation. That is,

\bar{x} = \frac{\sum_{i=1}^{n} f_i x_i}{\sum_{i=1}^{n} f_i}

which, more briefly, is written as \bar{x} = \frac{\sum f_i x_i}{\sum f_i}, if it is understood that i varies from 1 to n.

Let us apply this formula to find the mean in the following example.

1. The marks obtained by 30 students of Class X of a certain school in a Mathematics paper consisting of 100 marks are presented in table below. Find the mean of the marks obtained by the students.

Marks obtained (xi)     | 10 | 20 | 36 | 40 | 50 | 56 | 60 | 70 | 72 | 80 | 88 | 92 | 95
Number of students (fi) | 1 | 1 | 3 | 4 | 3 | 2 | 4 | 4 | 1 | 1 | 2 | 3 | 1

Solution: Recall that to find the mean marks, we require the product of each xi with the corresponding frequency fi. So, let us put them in a column as shown in the table below.

Marks obtained (xi) | Number of students (fi) | fi xi
10 | 1 | 10
20 | 1 | 20
36 | 3 | 108
40 | 4 | 160
50 | 3 | 150
56 | 2 | 112
60 | 4 | 240
70 | 4 | 280
72 | 1 | 72
80 | 1 | 80
88 | 2 | 176
92 | 3 | 276
95 | 1 | 95
Total | Σfi = 30 | Σfi xi = 1779

Now, \bar{x} = \frac{\sum f_i x_i}{\sum f_i} = \frac{1779}{30} = 59.3

Therefore, the mean marks obtained is 59.3.
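The calculation in Example 1 can be checked with a few lines of Python. This is only a minimal sketch: the function name `mean_from_frequencies` and the variable names are illustrative choices, not part of the text.

```python
def mean_from_frequencies(values, freqs):
    """Return sum(f_i * x_i) / sum(f_i) for paired observations and frequencies."""
    if len(values) != len(freqs):
        raise ValueError("values and freqs must have the same length")
    total_f = sum(freqs)
    if total_f == 0:
        raise ValueError("the total frequency must be non-zero")
    return sum(f * x for x, f in zip(values, freqs)) / total_f


# Data of Example 1: marks obtained (x_i) and number of students (f_i).
marks = [10, 20, 36, 40, 50, 56, 60, 70, 72, 80, 88, 92, 95]
students = [1, 1, 3, 4, 3, 2, 4, 4, 1, 1, 2, 3, 1]

mean_marks = mean_from_frequencies(marks, students)
print(f"Mean marks = {mean_marks:.1f}")  # 1779 / 30
```

Pairing each xi with its fi through `zip` mirrors the fi xi column of the table above.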

In most of our real life situations, data is usually so large that to make a meaningful study it needs to be condensed as grouped data. So, we need to convert given ungrouped data into grouped data and devise some method to find its mean.

Let us convert the ungrouped data of Example 1 into grouped data by forming class-intervals of width, say 15.

Remember that, while allocating frequencies to each class-interval, students falling in any upper class-limit would be considered in the next class, e.g., 4 students who have obtained 40 marks would be considered in the class-interval 40-55 and not in 25-40.

With this convention in our mind, let us form a grouped frequency distribution table.

Class interval          | 10-25 | 25-40 | 40-55 | 55-70 | 70-85 | 85-100
Number of students (fi) | 2 | 3 | 7 | 6 | 6 | 6
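The grouping convention (a value equal to an upper class limit goes into the next class) can be sketched in Python as follows. The helper `group_into_classes` and its parameters are hypothetical names chosen for illustration, and the sketch assumes equal class widths starting at 10.

```python
def group_into_classes(values, freqs, start, width, n_classes):
    """Count observations per class [lower, upper).

    A value equal to an upper limit is placed in the next class,
    which is the convention described in the text.
    """
    counts = [0] * n_classes
    for x, f in zip(values, freqs):
        k = (x - start) // width  # index of the class that contains x
        if not 0 <= k < n_classes:
            raise ValueError(f"{x} lies outside the class intervals")
        counts[k] += f
    return [((start + k * width, start + (k + 1) * width), counts[k])
            for k in range(n_classes)]


# Ungrouped data of Example 1.
marks = [10, 20, 36, 40, 50, 56, 60, 70, 72, 80, 88, 92, 95]
students = [1, 1, 3, 4, 3, 2, 4, 4, 1, 1, 2, 3, 1]

grouped = group_into_classes(marks, students, start=10, width=15, n_classes=6)
for (lower, upper), count in grouped:
    print(f"{lower}-{upper}: {count}")
```

Under this half-open convention a mark of exactly 100 would fall outside the last class and be rejected by the sketch; the data of Example 1 contains no such mark.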

Now, for each class-interval, we require a point which would serve as the representative of the whole class.

It is assumed that the frequency of each class interval is centred around its mid-point.

So the mid-point (or class mark) of each class can be chosen to represent the observations falling in the class. Recall that we find the mid-point of a class (or its class mark) by finding the average of its upper and lower limits. That is

Class mark = \frac{\text{Upper class limit} + \text{Lower class limit}}{2}

With reference to the table above, for the class 10-25, the class mark is \frac{10 + 25}{2} = 17.5

Similarly, we can find the class marks of the remaining class intervals. We put them in the table below. These class marks serve as our xi. Now, in general, for the ith class interval, we have the frequency fi corresponding to the class mark xi. We can now proceed to compute the mean in the same manner as in Example 1.

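Class interval | Number of students (fi) | Class mark (xi) | fi xi
10-25 | 2 | 17.5 | 35.0
25-40 | 3 | 32.5 | 97.5
40-55 | 7 | 47.5 | 332.5
55-70 | 6 | 62.5 | 375.0
70-85 | 6 | 77.5 | 465.0
85-100 | 6 | 92.5 | 555.0
Total | Σfi = 30 | | Σfi xi = 1860.0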

The sum of the values in the last column gives us Σfi xi. So, the mean \bar{x} of the given data is given by

\bar{x} = \frac{\sum f_i x_i}{\sum f_i} = \frac{1860}{30} = 62

This new method of finding the mean is known as the Direct Method.
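As a rough illustration, the Direct Method for the grouped marks can be written in Python. The variable names are assumptions made for this sketch; the class intervals and frequencies are those of the grouped table above.

```python
# Grouped data: class intervals (lower, upper) and frequencies f_i.
intervals = [(10, 25), (25, 40), (40, 55), (55, 70), (70, 85), (85, 100)]
freqs = [2, 3, 7, 6, 6, 6]

# Class mark x_i = (upper class limit + lower class limit) / 2.
class_marks = [(lower + upper) / 2 for lower, upper in intervals]

# Direct Method: mean = sum(f_i * x_i) / sum(f_i).
sum_f = sum(freqs)
sum_fx = sum(f * x for x, f in zip(class_marks, freqs))
direct_mean = sum_fx / sum_f

print("Class marks:", class_marks)
print(f"Sum of f_i x_i = {sum_fx}, sum of f_i = {sum_f}")
print(f"Mean by the Direct Method = {direct_mean:.1f}")
```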

We observe that the ungrouped table of Example 1 and the grouped table above use the same data and employ the same formula for the calculation of the mean, but the results obtained are different.

Can you think why this is so, and which one is more accurate? The difference in the two values is because of the mid-point assumption in the grouped table, 59.3 being the exact mean, while 62 is an approximate mean.

Sometimes when the numerical values of xi and fi are large, finding the product of xi and fi becomes tedious and time consuming. So, for such situations, let us think of a method of reducing these calculations.

We can do nothing with the fi's, but we can change each xi to a smaller number so that our calculations become easy. How do we do this? What about subtracting a fixed number from each of these xi's?

The first step is to choose one among the xi's as the assumed mean, and denote it by ‘a’. Also, to further reduce our calculation work, we may take ‘a’ to be that xi which lies in the centre of x1, x2, . . ., xn. So, we can choose a = 47.5 or a = 62.5. Let us choose a = 47.5.

The next step is to find the difference di between a and each of the xi's, that is, the deviation of ‘a’ from each of the xi's.

i.e., di = xi - a = xi - 47.5

The third step is to find the product of di with the corresponding fi, and take the sum of all the fi di's. The calculations are shown in the table below.

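Class interval | Number of students (fi) | Class mark (xi) | di = xi - 47.5 | fi di
10-25 | 2 | 17.5 | -30 | -60
25-40 | 3 | 32.5 | -15 | -45
40-55 | 7 | 47.5 | 0 | 0
55-70 | 6 | 62.5 | 15 | 90
70-85 | 6 | 77.5 | 30 | 180
85-100 | 6 | 92.5 | 45 | 270
Total | Σfi = 30 | | | Σfi di = 435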

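So, from the table above, the mean of the deviations is \bar{d} = \frac{\sum f_i d_i}{\sum f_i} = \frac{435}{30} = 14.5.

Since di = xi - a for every i, we have \bar{d} = \bar{x} - a, and therefore \bar{x} = a + \bar{d} = 47.5 + 14.5 = 62. This way of finding the mean is called the assumed mean method.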

Activity 1 : From the Direct Method table, find the mean by taking each of the xi (i.e., 17.5, 32.5, and so on) as ‘a’.

What do you observe? You will find that the mean determined in each case is the same, i.e., 62. (Why?)

So, we can say that the value of the mean obtained does not depend on the choice of ‘a’.
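Activity 1 can also be tried numerically. The sketch below takes each class mark in turn as the assumed mean ‘a’ and applies mean = a + Σfi di / Σfi; the function name `assumed_mean` is a hypothetical label used only here.

```python
def assumed_mean(class_marks, freqs, a):
    """Mean via deviations d_i = x_i - a: returns a + sum(f_i * d_i) / sum(f_i)."""
    total_f = sum(freqs)
    sum_fd = sum(f * (x - a) for x, f in zip(class_marks, freqs))
    return a + sum_fd / total_f


class_marks = [17.5, 32.5, 47.5, 62.5, 77.5, 92.5]
freqs = [2, 3, 7, 6, 6, 6]

# Try every class mark as the assumed mean 'a'.
means_by_choice = {a: assumed_mean(class_marks, freqs, a) for a in class_marks}
for a, mean in means_by_choice.items():
    print(f"a = {a:5.1f}  ->  mean = {mean:.1f}")
```

Every choice gives the same value because Σfi (xi - a) = Σfi xi - a Σfi, so adding a back after dividing by Σfi always recovers Σfi xi / Σfi.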

Observe that in the above given table, the values in Column 4 are all multiples of 15.

So, if we divide the values in the entire Column 4 by 15, we would get smaller numbers to multiply with fi. (Here, 15 is the class size of each class interval.)

So, let u_i = \frac{x_i - a}{h}, where a is the assumed mean and h is the class size.

Now, we calculate ui in this way and continue as before (i.e., find fiui and then Σfiui).

Taking h = 15, let us form the following table:

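Class interval | fi | xi | di = xi - a | ui = (xi - a)/h | fi ui
10-25 | 2 | 17.5 | -30 | -2 | -4
25-40 | 3 | 32.5 | -15 | -1 | -3
40-55 | 7 | 47.5 | 0 | 0 | 0
55-70 | 6 | 62.5 | 15 | 1 | 6
70-85 | 6 | 77.5 | 30 | 2 | 12
85-100 | 6 | 92.5 | 45 | 3 | 18
Total | Σfi = 30 | | | | Σfi ui = 29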

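Let \bar{u} = \frac{\sum f_i u_i}{\sum f_i}. Here, again, let us find the relation between \bar{u} and \bar{x}.

We have, u_i = \frac{x_i - a}{h}

Therefore, \bar{u} = \frac{\sum f_i \frac{(x_i - a)}{h}}{\sum f_i} = \frac{1}{h}\left[\frac{\sum f_i x_i - a \sum f_i}{\sum f_i}\right]

= \frac{1}{h}\left[\frac{\sum f_i x_i}{\sum f_i} - a\frac{\sum f_i}{\sum f_i}\right]

= \frac{1}{h}[\bar{x} - a]

So, h\bar{u} = \bar{x} - a

i.e., \bar{x} = a + h\bar{u}

So, \bar{x} = a + h\left(\frac{\sum f_i u_i}{\sum f_i}\right)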

Now, substituting the values of a, h, Σfiui and Σfi from Table 14.5, we get

\bar{x} = 47.5 + 15 × \left(\frac{29}{30}\right)

= 47.5 + 14.5 = 62

So, the mean marks obtained by a student is 62.

The method discussed above is called the step-deviation method.
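A minimal Python sketch of the step-deviation calculation, assuming a = 47.5 and h = 15 as in the text (the variable names are illustrative):

```python
class_marks = [17.5, 32.5, 47.5, 62.5, 77.5, 92.5]
freqs = [2, 3, 7, 6, 6, 6]
a, h = 47.5, 15  # assumed mean and class size

# Step deviations u_i = (x_i - a) / h.
u_values = [(x - a) / h for x in class_marks]

sum_f = sum(freqs)
sum_fu = sum(f * u for u, f in zip(u_values, freqs))

# Mean = a + h * (sum(f_i * u_i) / sum(f_i)).
mean_marks = a + h * (sum_fu / sum_f)

print("u_i values:", u_values)
print(f"Sum of f_i u_i = {sum_fu}")
print(f"Mean by the step-deviation method = {mean_marks:.2f}")
```

Dividing by h keeps the numbers multiplied with fi small, which is the whole point of the method when the work is done by hand.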

We note that :

(1) the step-deviation method will be convenient to apply if all the d_i’s have a common factor.

(2) The mean obtained by all the three methods is the same.

(3) The assumed mean method and step-deviation method are just simplified forms of the direct method.

(4) The formula \bar{x} = a + h\bar{u} still holds if a and h are not as given above, but are any non-zero numbers such that u_i = \frac{x_i - a}{h}.
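Note (4) can be checked with exact arithmetic. The sketch below uses Python's `fractions.Fraction` so that no rounding enters; the helper `mean_via_shift_and_scale` and the particular (a, h) pairs are arbitrary choices made for illustration.

```python
from fractions import Fraction


def mean_via_shift_and_scale(values, freqs, a, h):
    """Return a + h * u_bar, where u_i = (x_i - a) / h and h is non-zero."""
    if h == 0:
        raise ValueError("h must be non-zero")
    total_f = sum(freqs)
    u_bar = sum(f * (x - a) / h for x, f in zip(values, freqs)) / total_f
    return a + h * u_bar


# Class marks of the grouped marks data, stored exactly as fractions.
class_marks = [Fraction(s) for s in ("17.5", "32.5", "47.5", "62.5", "77.5", "92.5")]
freqs = [2, 3, 7, 6, 6, 6]

# Arbitrary choices of a and h; only the first pair comes from the text.
choices = [
    (Fraction("47.5"), Fraction(15)),
    (Fraction(50), Fraction(7)),
    (Fraction(-3), Fraction("0.25")),
]

results = [mean_via_shift_and_scale(class_marks, freqs, a, h) for a, h in choices]
for (a, h), mean in zip(choices, results):
    print(f"a = {a}, h = {h}  ->  mean = {mean}")
```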

Let us apply these methods in another example.

2. The table below gives the percentage distribution of female teachers in the primary schools of rural areas of various states and union territories (U.T.) of India. Find the mean percentage of female teachers by all the three methods discussed in this section.

Percentage of female teachers | 15-25 | 25-35 | 35-45 | 45-55 | 55-65 | 65-75 | 75-85
Number of States/U.T.         | 6 | 11 | 7 | 4 | 4 | 2 | 1

Solution:

Let us find the class marks, xi of each class, and put them in a column:

Percentage of female teachers | Number of States/U.T. (fi) | xi
15 - 25 | 6 | 20
25 - 35 | 11 | 30
35 - 45 | 7 | 40
45 - 55 | 4 | 50
55 - 65 | 4 | 60
65 - 75 | 2 | 70
75 - 85 | 1 | 80

Here we take a = 50, h = 10, then d_i = x_i - 50 and u_i = \frac{x_i - 50}{10}.

We now find di and ui and put them in the table below.

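Percentage of female teachers | Number of States/U.T. (fi) | xi | di = xi - 50 | ui = (xi - 50)/10 | fi xi | fi di | fi ui
15 - 25 | 6 | 20 | -30 | -3 | 120 | -180 | -18
25 - 35 | 11 | 30 | -20 | -2 | 330 | -220 | -22
35 - 45 | 7 | 40 | -10 | -1 | 280 | -70 | -7
45 - 55 | 4 | 50 | 0 | 0 | 200 | 0 | 0
55 - 65 | 4 | 60 | 10 | 1 | 240 | 40 | 4
65 - 75 | 2 | 70 | 20 | 2 | 140 | 40 | 4
75 - 85 | 1 | 80 | 30 | 3 | 80 | 30 | 3
Total | 35 | | | | 1390 | -360 | -36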

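From the table above, we obtain Σfi = 35, Σfi xi = 1390,

Σfi di = -360, Σfi ui = -36.

Using the direct method, \bar{x} = \frac{\sum f_i x_i}{\sum f_i} = \frac{1390}{35} = 39.71. (Up to two decimal places)

Using the assumed mean method, \bar{x} = a + \frac{\sum f_i d_i}{\sum f_i} = 50 + \frac{-360}{35} = 39.71.

Using the step-deviation method, \bar{x} = a + h\left(\frac{\sum f_i u_i}{\sum f_i}\right) = 50 + 10 × \left(\frac{-36}{35}\right) = 39.71.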

Therefore, the mean percentage of female teachers in the primary schools of rural areas is 39.71.

Remark : The result obtained by all the three methods is the same.
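The column totals and the three means of Example 2 can be reproduced with the short sketch below. It assumes a = 50 and h = 10 as in the solution; the variable names are illustrative only.

```python
# Example 2: class marks x_i and number of States/U.T. f_i.
class_marks = [20, 30, 40, 50, 60, 70, 80]
freqs = [6, 11, 7, 4, 4, 2, 1]
a, h = 50, 10  # assumed mean and class size

sum_f = sum(freqs)
sum_fx = sum(f * x for x, f in zip(class_marks, freqs))
sum_fd = sum(f * (x - a) for x, f in zip(class_marks, freqs))
sum_fu = sum(f * (x - a) / h for x, f in zip(class_marks, freqs))

mean_direct = sum_fx / sum_f            # direct method
mean_assumed = a + sum_fd / sum_f       # assumed mean method
mean_step = a + h * (sum_fu / sum_f)    # step-deviation method

print(f"Totals: f = {sum_f}, f x = {sum_fx}, f d = {sum_fd}, f u = {sum_fu}")
for name, value in [("direct", mean_direct),
                    ("assumed mean", mean_assumed),
                    ("step-deviation", mean_step)]:
    print(f"{name:>15}: {value:.2f}")
```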

So the choice of method to be used depends on the numerical values of xi and fi.

If xi and fi are sufficiently small, then the direct method is an appropriate choice. If xi and fi are numerically large numbers, then we can go for the assumed mean method or the step-deviation method.

If the class sizes are unequal, and xi are large numerically, we can still apply the step-deviation method by taking h to be a suitable divisor of all the di's.

3. The distribution below shows the number of wickets taken by bowlers in one-day cricket matches. Find the mean number of wickets by choosing a suitable method. What does the mean signify?

Number of wickets | 20-60 | 60-100 | 100-150 | 150-250 | 250-350 | 350-450
Number of bowlers | 7 | 5 | 16 | 12 | 2 | 3

Here, the class size varies, and the xi's are large. Let us still apply the step-deviation method with a = 200 and h = 20. Then, we obtain the data as in the table below.

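Number of wickets taken | Number of bowlers (fi) | xi | di = xi - 200 | ui = di/20 | ui fi
20 - 60 | 7 | 40 | -160 | -8 | -56
60 - 100 | 5 | 80 | -120 | -6 | -30
100 - 150 | 16 | 125 | -75 | -3.75 | -60
150 - 250 | 12 | 200 | 0 | 0 | 0
250 - 350 | 2 | 300 | 100 | 5 | 10
350 - 450 | 3 | 400 | 200 | 10 | 30
Total | 45 | | | | -106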

So, \bar{u} = \frac{-106}{45}

Therefore, \bar{x} = 200 + 20\left(\frac{-106}{45}\right) = 200 - 47.11 = 152.89.

This tells us that, on an average, the number of wickets taken by these 45 bowlers in one-day cricket is 152.89.
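Example 3 shows that the step-deviation method does not need equal class sizes. A small sketch, assuming a = 200 and h = 20 as in the solution (the variable names are chosen only for this illustration):

```python
# Example 3: class intervals of unequal width and number of bowlers f_i.
intervals = [(20, 60), (60, 100), (100, 150), (150, 250), (250, 350), (350, 450)]
bowlers = [7, 5, 16, 12, 2, 3]
a, h = 200, 20  # assumed mean and a convenient divisor of the d_i

# Class marks x_i and step deviations u_i = (x_i - a) / h.
class_marks = [(lower + upper) / 2 for lower, upper in intervals]
u_values = [(x - a) / h for x in class_marks]

sum_f = sum(bowlers)
sum_fu = sum(f * u for u, f in zip(u_values, bowlers))
mean_wickets = a + h * (sum_fu / sum_f)

for (lower, upper), f, x, u in zip(intervals, bowlers, class_marks, u_values):
    print(f"{lower:>3}-{upper:<3}  f = {f:>2}  x = {x:>5}  u = {u:>6}  f*u = {f * u:>6}")
print(f"Sum of f_i = {sum_f}, sum of f_i u_i = {sum_fu}")
print(f"Mean number of wickets = {mean_wickets:.2f}")
```

Here h = 20 is not a class size; it is simply a number that makes most of the ui small, and one of them (-3.75) is not even a whole number.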

Now, let us see how well you can apply the concepts discussed in this section!

Activity 2 :

Divide the students of your class into three groups and ask each group to do one of the following activities.

1. Collect the marks obtained by all the students of your class in Mathematics in the latest examination conducted by your school. Form a grouped frequency distribution of the data obtained.

2. Collect the daily maximum temperatures recorded for a period of 30 days in your city. Present this data as a grouped frequency table.

3. Measure the heights of all the students of your class (in cm) and form a grouped frequency distribution table of this data.

After all the groups have collected the data and formed grouped frequency distribution tables, the groups should find the mean in each case by the method which they find appropriate.