
Measures of Central Tendency & Variability

Dhon G. Dungca, M.Eng’g.

MEASURES OF CENTRAL TENDENCY

• A measure of central tendency is the value around which the whole set of data tends to cluster. It is represented by a single number that summarizes and describes the whole set.

• The most commonly used measures of central tendency are:
– The arithmetic mean
– The median
– The mode

FOR UNGROUPED DATA

• Ungrouped Data
– Refers to data not organized into a frequency distribution.

• ARITHMETIC MEAN
– May be defined as the arithmetic average: the sum of the observed values divided by the number of observations.
– It can be either a population mean (denoted by $\mu$) or a sample mean (denoted by $\bar{x}$).

ARITHMETIC MEAN

Population Mean:

$$\mu = \frac{\sum_{i=1}^{N} x_i}{N}$$

where: N = total # of items in the population
       x_i = the ith observed value

Sample Mean:

$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$$

where: n = total # of items in the sample
       x_i = the ith observed value

WEIGHTED MEAN

Takes into consideration the proper weights assigned to the observed values according to their relative importance.

$$\bar{x} = \frac{\sum_{i=1}^{n} W_i x_i}{\sum_{i=1}^{n} W_i}$$

where: W_i = weight of each item
       x_i = value of each item

EXAMPLE 1

What is the mean age of a group of 9 children whose ages are: 7, 8, 8.5, 9, 10, 10.5, 11, 12, and 13?

Answer:

$$\bar{x} = \frac{7 + 8 + 8.5 + 9 + 10 + 10.5 + 11 + 12 + 13}{9} = \frac{89}{9} \approx 9.89 \text{ years old}$$

EXAMPLE 2

A market vendor sold 3 dozen eggs at P 33.50 per dozen, 5 dozen at P 32.00 per dozen, and 2 dozen at P 35.00 per dozen. Find the weighted mean price per dozen of the eggs the vendor sold.
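Example 2 can be checked with a short Python sketch of the weighted mean formula. The function name `weighted_mean` and the variable names are illustrative choices, not part of the original slides; the prices act as the values and the dozens sold act as the weights.

```python
def weighted_mean(values, weights):
    """Return sum(W_i * x_i) / sum(W_i) for paired values and weights."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(w * x for x, w in zip(values, weights)) / total_weight


# Data from Example 2: price per dozen (x_i) and dozens sold (W_i).
prices_per_dozen = [33.50, 32.00, 35.00]
dozens_sold = [3, 5, 2]

mean_price = weighted_mean(prices_per_dozen, dozens_sold)
print(round(mean_price, 2))  # 33.05
```

Under these assumptions the weighted mean price is P 33.05 per dozen, that is (100.50 + 160.00 + 70.00) / 10.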

MEDIAN (Me)

Is the midpoint of the distribution when the observations are arranged in order. Half of the values in a distribution fall below the median, and the other half fall above it. For distributions having an even number of observations, the median is the average of the two middle values.

EXAMPLE 3

Find the median of the following set of observations: 8, 4, 1, 3 and 7

Answer:

1, 3, 4, 7, 8

Me = 4

EXAMPLE 4

Find the median of the following set of observations: 12, 9, 6, 10, 7 and 14

Answer:

6, 7, 9, 10, 12, 14

Median = (9+10) / 2

Me = 9.5
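The procedure used in Examples 3 and 4 (sort the observations, then take the middle value or the average of the two middle values) can be sketched in Python as follows. The function name `median_of` is a hypothetical helper introduced only for illustration.

```python
def median_of(observations):
    """Median of ungrouped data: sort, then pick or average the middle."""
    ordered = sorted(observations)
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty data set is undefined")
    mid = n // 2
    if n % 2 == 1:
        # Odd count: the single middle value.
        return ordered[mid]
    # Even count: average of the two middle values.
    return (ordered[mid - 1] + ordered[mid]) / 2


example_3 = median_of([8, 4, 1, 3, 7])
example_4 = median_of([12, 9, 6, 10, 7, 14])
print(example_3, example_4)  # 4 9.5
```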

MODE (Mo)

Is the value that appears with the highest (greatest) frequency. That is, the value that appears most often.

Example 5

Determine the mode of the following distribution:

3, 8, 10, 5, 3, 5, 2, 5, 7

Answer: Mo = 5 (uni-modal)

EXAMPLES

Example 6

Determine the mode of the following distribution:

20, 15, 10, 9, 7, 20, 10, 10, 20

Answer: Mo = 20 and 10

Example 7

Determine the mode of the following distribution:

7, 5, 10, 22, 20, 6, 11, and 21

Answer: There is no mode.
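Examples 5 to 7 can be reproduced with a small Python sketch that counts how often each value occurs. The helper name `modes_of` is an assumption for illustration. It follows the convention of Example 7: when every value appears only once, the data set is treated as having no mode.

```python
from collections import Counter


def modes_of(observations):
    """Return the sorted list of modes; an empty list means no mode."""
    counts = Counter(observations)
    if not counts:
        return []
    highest = max(counts.values())
    if highest == 1:
        # Every value appears exactly once, so no value stands out.
        return []
    return sorted(value for value, c in counts.items() if c == highest)


example_5 = modes_of([3, 8, 10, 5, 3, 5, 2, 5, 7])
example_6 = modes_of([20, 15, 10, 9, 7, 20, 10, 10, 20])
example_7 = modes_of([7, 5, 10, 22, 20, 6, 11, 21])
print(example_5, example_6, example_7)  # [5] [10, 20] []
```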

MEAN: GROUPED DATA

Two methods of computing the mean:

LONG Method:

$$\bar{x} = \frac{\sum_{i=1}^{k} x_i f_i}{n}$$

where: n = total # of items
       k = number of classes
       x_i = class midpoints
       f_i = corresponding frequencies

SHORT Method (Use of CODING):

$$\bar{x} = x_o + \left( \frac{\sum_{i=1}^{k} u_i f_i}{n} \right) C$$

where: x_o = assumed mean or coded mean
       u_i = coded value of each class
       C = class size

MEAN: GROUPED DATA

• Coding technique:
– Choose the assumed mean $x_o$ (the midpoint of the class with the highest frequency).
– Under a new column u, write the code zero opposite $x_o$. Assign consecutive positive integers to the classes higher in value than the class with the assumed mean, and consecutive negative integers to the classes lower in value.
– Multiply the coded values by their corresponding frequencies and compute the algebraic sum.
– Substitute into the coded formula and compute the mean.

EXAMPLE 8

• The following is the distribution of the wages of 50 workers of Hazel’s Garments Manufacturing Co. taken during a particular week last June.
• Determine the mean using both methods.

Weekly Wages    # of workers
870-899         4
900-929         6
930-959         10
960-989         13
990-1019        8
1020-1049       7
1050-1079       2
Total           50
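As a check on Example 8, the sketch below computes the grouped mean with both the long method and the coding (short) method in Python. It assumes the class midpoint is the average of the stated class limits and that the assumed mean is the midpoint of the class with the highest frequency; all variable names are illustrative.

```python
# Each class: (lower limit, upper limit, frequency), from Example 8.
wage_classes = [
    (870, 899, 4),
    (900, 929, 6),
    (930, 959, 10),
    (960, 989, 13),
    (990, 1019, 8),
    (1020, 1049, 7),
    (1050, 1079, 2),
]

midpoints = [(lo + hi) / 2 for lo, hi, _ in wage_classes]
freqs = [f for _, _, f in wage_classes]
n = sum(freqs)

# LONG method: sum of (midpoint * frequency) divided by n.
mean_long = sum(x * f for x, f in zip(midpoints, freqs)) / n

# SHORT method: code the classes relative to the assumed mean.
class_size = wage_classes[1][0] - wage_classes[0][0]  # 900 - 870 = 30
modal_index = freqs.index(max(freqs))                 # class 960-989
assumed_mean = midpoints[modal_index]                 # 974.5
codes = [i - modal_index for i in range(len(freqs))]  # -3, ..., 0, ..., 3
coded_sum = sum(u * f for u, f in zip(codes, freqs))  # algebraic sum = -6
mean_short = assumed_mean + (coded_sum / n) * class_size

print(round(mean_long, 2), round(mean_short, 2))  # 970.9 970.9
```

Both methods agree on a mean weekly wage of 970.90, which is the point of the coding technique: it gives the same answer with smaller numbers to multiply.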

MEDIAN: GROUPED DATA (Me)

The median of a frequency distribution (grouped data) can be found by the following formula:

$$Me = L_{Me} + \left( \frac{\frac{n}{2} - F_{Me}}{f_{Me}} \right) C$$

where: L_{Me} = lower boundary of the median class
       n = total # of observations
       F_{Me} = cumulative frequency immediately preceding the median class
       f_{Me} = frequency of the median class
       C = class size

The median class is the class which contains the (n/2)th value.

MODE: GROUPED DATA (Mo)

The mode of a frequency distribution (grouped data) can be found by the following formula:

$$Mo = L_{Mo} + \left( \frac{d_1}{d_1 + d_2} \right) C$$

where: L_{Mo} = lower boundary of the modal class
       d_1 = difference between the frequency of the modal class and the frequency of the class next lower in value
       d_2 = difference between the frequency of the modal class and the frequency of the class next higher in value
       C = class size

The modal class is the class with the highest frequency.

EXAMPLE 9

Find the mean, median and mode of the following distribution of length of service in years of 30 employees of KITS INC.

Length of service    # of employees
1 - 5                3
6 - 10               6
11 - 15              11
16 - 20              7
21 - 25              3
Total                30
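One way to work through Example 9 is the Python sketch below, which applies the grouped-data formulas for the mean, median, and mode. It assumes that, because the class limits are whole years, each lower class boundary lies 0.5 below the stated lower limit (for example 10.5 for the class 11 - 15); the variable names are illustrative.

```python
# Each class: (lower limit, upper limit, frequency), from Example 9.
service_classes = [(1, 5, 3), (6, 10, 6), (11, 15, 11), (16, 20, 7), (21, 25, 3)]

freqs = [f for _, _, f in service_classes]
n = sum(freqs)
class_size = service_classes[1][0] - service_classes[0][0]  # 6 - 1 = 5

# Mean (long method): class midpoints weighted by frequency.
midpoints = [(lo + hi) / 2 for lo, hi, _ in service_classes]
mean_value = sum(x * f for x, f in zip(midpoints, freqs)) / n

# Median: locate the class containing the (n/2)th value.
half = n / 2
cumulative = 0  # cumulative frequency before the current class (F_Me)
for lo, _, f in service_classes:
    if cumulative + f >= half:
        lower_boundary = lo - 0.5  # assumed boundary for integer limits
        median_value = lower_boundary + (half - cumulative) / f * class_size
        break
    cumulative += f

# Mode: use the class with the highest frequency and its two neighbours.
m = freqs.index(max(freqs))
d1 = freqs[m] - (freqs[m - 1] if m > 0 else 0)
d2 = freqs[m] - (freqs[m + 1] if m < len(freqs) - 1 else 0)
mode_value = (service_classes[m][0] - 0.5) + d1 / (d1 + d2) * class_size

print(round(mean_value, 2), round(median_value, 2), round(mode_value, 2))
# 13.17 13.23 13.28
```

Under the stated boundary assumption, the median and modal class are both 11 - 15, with F_Me = 9, f_Me = 11, d1 = 5, and d2 = 4.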

MEASURES OF VARIABILITY

Indicate the extent to which values in a distribution are spread around the central tendency.

Example: Suppose 2 applicants, A & B, were given an examination to test and compare their typing speed (assume all other factors are equal). Each was given 9 trials, and the completion times in minutes were as follows:

A: 14, 16, 18, 20, 22, 24, 26, 28, 30
B: 18, 18, 20, 22, 24, 24, 24, 24, 24

Both A & B have the same mean of 22 minutes, which suggests that there is no difference in their performance if the only basis of comparison is the mean.

However, B has an advantage over A: her completion times deviated from the mean by at most 4 minutes, while those of A deviated by as much as 8 minutes above and below the mean.
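The comparison of A and B can be verified with a few lines of Python. This is only a sketch using the listed completion times; the helper name `summarize` is hypothetical.

```python
from statistics import mean

times_a = [14, 16, 18, 20, 22, 24, 26, 28, 30]
times_b = [18, 18, 20, 22, 24, 24, 24, 24, 24]


def summarize(times):
    """Return (mean, largest absolute deviation from the mean)."""
    center = mean(times)
    largest_deviation = max(abs(t - center) for t in times)
    return center, largest_deviation


mean_a, spread_a = summarize(times_a)
mean_b, spread_b = summarize(times_b)
print(mean_a, spread_a)
print(mean_b, spread_b)
```

Both applicants have a mean of 22, but the largest deviation is 8 for A and 4 for B, which is why the mean alone cannot distinguish them.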

VARIANCE: UNGROUPED DATA

Is the average of the squared deviations from the distribution’s mean.

$$\text{Sample Variance } (S^2) = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$$

If all values are identical, the variance is zero (0). The greater the dispersion of values, the greater the variance.

STANDARD DEVIATION: UNGROUPED DATA

Is the positive square root of the variance. It measures the spread or dispersion of the values around the mean of the distribution. It is the most widely used measure of spread because taking the square root undoes the squaring in the variance and expresses deviations in the original units of the data. It is also the most important measure of dispersion since it enables us to determine with a great deal of accuracy where the values of the distribution are located in relation to the mean.

$$\text{Sample SD } (S) = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}$$

$$\text{Population SD } (\sigma) = \sqrt{\frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}}$$

EXAMPLE 10

Solve for the variance and the standard deviation of the following set of data:

6, 2, 4, 7, 8, and 10
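A minimal Python sketch of Example 10 follows. It treats the six values as a sample, so it uses the sample formulas with n - 1 in the denominator; the variable names are illustrative.

```python
import math

data = [6, 2, 4, 7, 8, 10]
n = len(data)

# Sample mean.
x_bar = sum(data) / n

# Sum of squared deviations from the mean.
sum_sq_dev = sum((x - x_bar) ** 2 for x in data)

# Sample variance divides by n - 1; the SD is its positive square root.
sample_variance = sum_sq_dev / (n - 1)
sample_sd = math.sqrt(sample_variance)

print(round(sample_variance, 2), round(sample_sd, 2))  # 8.17 2.86
```

If the data were instead treated as a whole population, the divisor would be N = 6 rather than n - 1 = 5, giving a smaller variance.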

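FOR GROUPED DATA

$$\text{Sample Variance } (S^2) = \frac{n \sum_{i=1}^{k} x_i^2 f_i - \left( \sum_{i=1}^{k} x_i f_i \right)^2}{n (n - 1)}$$

$$\text{Sample SD } (S) = \sqrt{\frac{n \sum_{i=1}^{k} x_i^2 f_i - \left( \sum_{i=1}^{k} x_i f_i \right)^2}{n (n - 1)}}$$

where: n = total # of items
       k = number of classes
       x_i = class midpoints
       f_i = corresponding frequencies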

EXAMPLE 11

Find the values of the variance and standard deviation of the following:

Monthly Savings    # of families
550-569            7
570-589            20
590-609            33
610-629            25
630-649            11
650-669            4
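Example 11 can be worked with the grouped-data formulas in a short Python sketch. It assumes the class midpoint is the average of the stated class limits; the variable names are illustrative.

```python
import math

# Each class: (lower limit, upper limit, frequency), from Example 11.
savings_classes = [
    (550, 569, 7),
    (570, 589, 20),
    (590, 609, 33),
    (610, 629, 25),
    (630, 649, 11),
    (650, 669, 4),
]

midpoints = [(lo + hi) / 2 for lo, hi, _ in savings_classes]
freqs = [f for _, _, f in savings_classes]
n = sum(freqs)

sum_xf = sum(x * f for x, f in zip(midpoints, freqs))       # sum of x_i f_i
sum_x2f = sum(x * x * f for x, f in zip(midpoints, freqs))  # sum of x_i^2 f_i

# Grouped sample variance and standard deviation.
grouped_variance = (n * sum_x2f - sum_xf ** 2) / (n * (n - 1))
grouped_sd = math.sqrt(grouped_variance)

print(round(grouped_variance, 2), round(grouped_sd, 2))  # 592.93 24.35
```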
