
Measures of Central Tendency

Levin and Fox, Elementary Statistics in Social Research

Chapter 3

Measures of central tendency:

Measures of central tendency are numbers that describe what is average or typical in a distribution.

We will focus on three measures of central tendency:

–  The Mode
–  The Median
–  The Mean (average)

Our choice of an appropriate measure of central tendency depends on three factors: (a) the level of measurement, (b) the shape of the distribution, (c) the purpose of the research.

The Mode

The mode is the most frequent, most typical or most common value or category in a distribution.

Example: There are more Protestants in the US than people of any other religion.

The mode is always a category or score, not a frequency.

The mode is not necessarily the category with the majority (that is, more than 50%) of cases. It is simply the category in which the largest number of cases falls.

The Mode

–  Most frequent or most common value or category.
–  A category or score (not a frequency).
–  Not necessarily the majority.
–  Used to describe nominal variables!

Let’s Practice!

Look at the figure below and identify the mode.

[Figure: pie chart of survey responses; one slice is labeled 4%.]

A Review of Mode

The pie chart shows answers of 1998 GSS respondents to the question, “Would you say your own health, in general, is excellent, good, fair, or poor?”

Note that the highest percentage (49%) of respondents is associated with the answer “good.”

The answer “good” is the mode. Remember: The mode is used to describe nominal variables!

Another Mode Example:

Our question is the following: “What is the most common foreign language spoken in the United States today, as determined by the mode?”

To answer this question, let’s look at a list of the ten most commonly spoken foreign languages in the United States and the number of people who speak each foreign language:

Ten Most Common Foreign Languages Spoken in the United States, 1990.

Language      Number of Speakers
Spanish       17,339,000
French         1,702,000
German         1,547,000
Italian        1,309,000
Chinese        1,249,000
Tagalog          843,000
Polish           723,000
Korean           626,000
Vietnamese       507,000
Portuguese       430,000

Source: U.S. Bureau of the Census, Statistical Abstract of the United States, 2000, Table 51.

Is the mode 17,339,000?

NO!

Recall: The mode is the category or score, not the frequency!!

Thus, the mode is Spanish.
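The distinction between the category and its frequency can be sketched in a few lines of Python. This is only an illustration: the dictionary `speakers` and the variable names are choices made here, not part of the original slides.

```python
# Frequencies from the language table: category -> number of speakers
speakers = {
    "Spanish": 17_339_000,
    "French": 1_702_000,
    "German": 1_547_000,
    "Italian": 1_309_000,
    "Chinese": 1_249_000,
    "Tagalog": 843_000,
    "Polish": 723_000,
    "Korean": 626_000,
    "Vietnamese": 507_000,
    "Portuguese": 430_000,
}

# The mode is the category (the key) with the largest frequency,
# not the frequency (the value) itself.
mode_category = max(speakers, key=speakers.get)
mode_frequency = speakers[mode_category]

print(mode_category)   # Spanish
print(mode_frequency)  # 17339000
```

The answer to "What is the mode?" is `mode_category`; `mode_frequency` only tells us how many cases fall in that category.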


The Mode

Some additional points to consider about modes:

Some distributions have two modes, where two response categories have the highest frequencies. Such distributions are said to be bimodal.

NOTE: When two scores or categories have the highest frequencies that are quite close, but not identical, in frequency, the distribution is still “essentially” bimodal. In these instances report both the “true” mode and the highest frequency categories.

[Figure: Example of a Bimodal Frequency Distribution]
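A distribution with two modes can be checked mechanically by collecting every category tied for the highest frequency. The sketch below assumes a made-up frequency distribution; the function name `modes` and the data are hypothetical and do not come from the slides.

```python
def modes(frequencies):
    """Return every category that shares the highest frequency."""
    highest = max(frequencies.values())
    return [category for category, f in frequencies.items() if f == highest]

# Hypothetical frequency distribution, invented for illustration only
hypothetical = {"A": 12, "B": 30, "C": 18, "D": 30, "E": 9}

print(modes(hypothetical))  # ['B', 'D']
```

This sketch only catches exact ties. The "essentially" bimodal case, where the two highest frequencies are close but not identical, still calls for the researcher's judgment.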

The Median

The median is the score that divides the distribution into two equal parts so that half of the cases are above it and half are below it.

The median can be calculated for both ordinal and interval levels of measurement, but not for nominal data.

It must be emphasized that the median is the exact middle of a distribution.

So, now let’s look at ways we can find the median in sorted data:

The Mode and Median

The Mode:
–  Most frequent or most common value or category.
–  A category or score (not a frequency).
–  Not necessarily the majority.
–  Used to describe nominal variables!

The Median:
–  Divides the distribution into two equal parts (the exact middle: 50% of cases above and 50% below).
–  Can be calculated for both ordinal and interval levels of measurement, but not for nominal data.
–  The data must be sorted before the median is calculated.

In some cases, we can find the median by simple inspection.

Let’s look at the responses (A) to the question: “Think about the economy, how would you rate economic conditions in the country today?”

First, we sort the responses (B) in order from lowest to highest (or highest to lowest).

Since we have an odd number of cases, let’s find the middle case.

A (unsorted)
Response     Respondent
Poor         Jim
Good         Sue
Only Fair    Bob
Poor         Jorge
Excellent    Karen
Total (N)    5

B (sorted)
Response     Respondent
Poor         Jim
Poor         Jorge
Only Fair    Bob
Good         Sue
Excellent    Karen
Total (N)    5

Calculating the median:

Respondent   Response
Jim          Poor
Jorge        Poor
Bob          Only Fair
Sue          Good
Karen        Excellent

We can find the median through visual inspection and through calculation.

We can also find the middle case when N is odd by adding 1 to N and dividing by 2: (N + 1) ÷ 2.

Since N is 5, you calculate (5 + 1) ÷ 2 = 3. The middle case is thus the third case (Bob), and the median response is “Only Fair.”
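The (N + 1) ÷ 2 rule for ordinal responses can be sketched in Python as follows. The list `ORDER`, which encodes the ranking of the response categories, and the other names are assumptions made for this illustration.

```python
# Rank order of the response categories, lowest to highest
ORDER = ["Poor", "Only Fair", "Good", "Excellent"]

# Responses as first recorded (table A)
responses = {
    "Jim": "Poor",
    "Sue": "Good",
    "Bob": "Only Fair",
    "Jorge": "Poor",
    "Karen": "Excellent",
}

# Sort respondents by the rank of their response (table B).
# Python's sort is stable, so Jim stays ahead of Jorge.
ranked = sorted(responses.items(), key=lambda item: ORDER.index(item[1]))

n = len(ranked)
middle_position = (n + 1) // 2  # (5 + 1) / 2 = 3; valid because N is odd
name, median_response = ranked[middle_position - 1]  # positions count from 1

print(middle_position, name, median_response)  # 3 Bob Only Fair
```

Because the data are ordinal, the code only locates the middle case; it never averages the category labels.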


Another example: The following is a list of the number of hate crimes reported in the nine largest U.S. states for 1997.

State             Number
California        1831
Florida           93
Virginia          105
New Jersey        694
New York          853
Ohio              265
Pennsylvania      168
Texas             333
North Carolina    42

TOTAL N = 9


Finding the Median Number of Hate Crimes

1.  Order the cases from lowest to highest.
2.  In this situation, we need the 5th case: (9 + 1) ÷ 2 = 5, which is 265 (interval data).

Remember: (N + 1) ÷ 2.

State             Number
North Carolina    42
Florida           93
Virginia          105
Pennsylvania      168
Ohio              265
Texas             333
New Jersey        694
New York          853
California        1831

N = 9

Finding the Median Number of Hate Crimes out of Eight States

1.  Order the cases from lowest to highest.
2.  The median is always that point above which 50% of cases fall and below which 50% of cases fall.
3.  For an even number of cases, there will be two middle cases.
4.  In this instance, the median falls halfway between both cases (216.5).
5.  However, the circumstances being explained should determine if you use the two middle cases or the point halfway between both cases for your explanation.

State             Number
North Carolina    42
Florida           93
Virginia          105
Pennsylvania      168
Ohio              265
Texas             333
New Jersey        694
New York          853

The position of the median is (8 + 1) ÷ 2 = 4.5, so the median falls halfway between the 4th case (Pennsylvania, 168) and the 5th case (Ohio, 265): (168 + 265) ÷ 2 = 216.5.
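Both hate crime examples, nine states (odd N) and eight states (even N), can be reproduced with one small Python function. This is a sketch of the (N + 1) ÷ 2 position rule described above; the function name `median` and the variable names are choices made here.

```python
def median(scores):
    """Median of numeric scores using the (N + 1) / 2 position rule."""
    ordered = sorted(scores)
    n = len(ordered)
    position = (n + 1) / 2                # 1-based position of the median
    lower = ordered[int(position) - 1]    # case at, or just below, that position
    if n % 2 == 1:
        return lower                      # odd N: a single middle case
    upper = ordered[int(position)]        # even N: the next case up
    return (lower + upper) / 2            # halfway between the two middle cases

nine_states = [1831, 93, 105, 694, 853, 265, 168, 333, 42]
eight_states = [42, 93, 105, 168, 265, 333, 694, 853]  # California left out

print(median(nine_states))   # 265
print(median(eight_states))  # 216.5
```

Python's standard library offers `statistics.median`, which returns the same values for these two lists.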

The median in frequency distributions:

So now, let’s find the median in frequency distributions. Often the data are arranged in frequency distributions.

The procedure is a bit more involved:
–  We have to find the category associated with the observation located in the middle of the distribution.
–  To do this, we construct a cumulative percentage distribution.

So, let’s take a look at a frequency distribution…

Table: Political Views of GSS Respondents, 1988

Political Views           Frequency (f)   Cf      Percentage   C%
Extremely Liberal         32              32      2.4          2.4
Liberal                   175             207     12.9         15.3
Slightly Liberal          189             396     13.9         29.2
Moderate                  502             898     37.0         66.2
Slightly Conservative     211             1109    15.6         81.8
Conservative              203             1312    15.0         96.8
Extremely Conservative    44              1356    3.2          100.0
Total                     1356                    100.0

Cumulative Percentage Distribution:

We construct a cumulative percentage distribution to help locate the middle of the distribution.

The observation located in the middle of the distribution is the one that has the cumulative percentage value equal to 50%.

–  Notice that 29.2% of the observations are accumulated below the category of “moderate” and that 66.2% are accumulated up to and including the category “moderate.” The 50% point therefore lies inside this category.

The median is the value of the category associated with this observation.

This middle observation falls within the category “moderate,” so the median for this distribution is “moderate.”
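The search for the category that contains the 50% point can be sketched in Python. The frequencies are those of the 1988 GSS table; the function name `median_category` and the data layout are assumptions made for this illustration.

```python
# (category, frequency) pairs in rank order, from the 1988 GSS table
political_views = [
    ("Extremely Liberal", 32),
    ("Liberal", 175),
    ("Slightly Liberal", 189),
    ("Moderate", 502),
    ("Slightly Conservative", 211),
    ("Conservative", 203),
    ("Extremely Conservative", 44),
]

def median_category(distribution):
    """Return the first category whose cumulative percentage reaches 50%."""
    total = sum(f for _, f in distribution)
    cumulative = 0
    for category, f in distribution:
        cumulative += f                         # cumulative frequency (Cf)
        cumulative_pct = 100 * cumulative / total  # cumulative percentage (C%)
        if cumulative_pct >= 50:
            return category, round(cumulative_pct, 1)

print(median_category(political_views))  # ('Moderate', 66.2)
```

The loop rebuilds the Cf and C% columns one row at a time and stops at the first row where C% is at least 50.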


Table: Political Views of GSS Respondents, 1988 (repeated with the median category marked)

Moderate: f = 502, Cf = 898, Percentage = 37.0, C% = 66.2. The cumulative percentage passes from 29.2 to 66.2 within this category.

The Mean

The mean is what most people call the average. To find the mean of any distribution, simply add up all the scores and divide by the total number of scores.

Here is the formula for calculating the mean:

\[ \bar{X} = \frac{\sum X}{N} \]

where
\(\bar{X}\) = mean (read as X bar)
\(\sum\) = sum (expressed as the Greek letter sigma)
\(X\) = raw score in a set of scores
\(N\) = total number of scores in a set

Finding the Mean

Communicable Diseases -> Tuberculosis (as of 22 March 2007)

Country                                    2005
Bangladesh                                 37
Bhutan                                     44
Democratic People's Republic of Korea      103
India                                      58
Indonesia                                  47
Maldives                                   76
Myanmar                                    119
Nepal                                      64
Sri Lanka                                  71
Thailand                                   61
Timor-Leste                                71

n (cases) = 11; sum = 751

© World Health Organization, 2008. All rights reserved

Finding the Mean: To find the mean for the instances of tuberculosis found in 2006 by the WHO in this region,

–  Add up the values for all of the countries in the region, and
–  Divide the sum by the total number of cases (here, the 11 countries).

Thus, the mean rate is (751 ÷ 11) = 68.273.

\[ \bar{X} = \frac{\sum X}{N} = \frac{751}{11} \approx 68.273 \]
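The two steps, summing the scores and dividing by N, can be sketched in Python with the values from the tuberculosis table. The dictionary name `tb` and the variable names are choices made here.

```python
# Values from the WHO tuberculosis table
tb = {
    "Bangladesh": 37,
    "Bhutan": 44,
    "Democratic People's Republic of Korea": 103,
    "India": 58,
    "Indonesia": 47,
    "Maldives": 76,
    "Myanmar": 119,
    "Nepal": 64,
    "Sri Lanka": 71,
    "Thailand": 61,
    "Timor-Leste": 71,
}

total = sum(tb.values())  # sum of X
n = len(tb)               # N, the number of cases
mean = total / n          # X bar

print(total, n, round(mean, 3))  # 751 11 68.273
```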

Using a formula to calculate the mean:

The Usefulness of Formulas: The mean introduces the usefulness of a formula, which may be defined as a shorthand way to explain what operations we need to follow to obtain a certain result.

Again, the formula that defines the mean is:

\[ \bar{X} = \frac{\sum X}{N} \]

where
\(\bar{X}\) = mean (read as X bar)
\(\sum\) = sum (expressed as the Greek letter sigma)
\(X\) = raw score in a set of scores
\(N\) = total number of scores in a set

Deviation:

The deviation indicates the distance and direction of any raw score from the mean.

To find the deviation of a particular score, we simply subtract the mean from the score:

\[ \text{Deviation} = X - \bar{X} \]

where
\(X\) = any raw score in the distribution
\(\bar{X}\) = mean of the distribution
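As an illustration, the sketch below computes deviations for the tuberculosis values used in the mean example. A negative deviation marks a score below the mean and a positive one a score above it. The variable names are choices made here.

```python
# Values from the tuberculosis table, in the order listed there
scores = [37, 44, 103, 58, 47, 76, 119, 64, 71, 61, 71]

mean = sum(scores) / len(scores)
deviations = [x - mean for x in scores]  # Deviation = X - X bar

print(round(deviations[0], 3))      # -31.273 (Bangladesh, below the mean)
print(round(deviations[6], 3))      # 50.727 (Myanmar, above the mean)
print(abs(sum(deviations)) < 1e-9)  # True
```

The last line checks a general property of the mean: the deviations above and below it cancel out, so they sum to zero up to floating-point rounding.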

The Weighted Mean

When groups differ in size, you can’t just sum their means and divide by the number of groups. Instead, you must weight each group mean by its size:

\[ \bar{X}_w = \frac{\sum N_{\text{group}} \, \bar{X}_{\text{group}}}{N_{\text{total}}} \]

where
\(\bar{X}_{\text{group}}\) = mean of a particular group
\(N_{\text{group}}\) = number in a particular group
\(N_{\text{total}}\) = number in all groups combined
\(\bar{X}_w\) = weighted mean
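A minimal Python sketch of the weighted mean follows. The three groups are hypothetical numbers invented for the illustration, and `weighted_mean` is a name chosen here.

```python
def weighted_mean(groups):
    """groups: list of (n_group, mean_group) pairs."""
    n_total = sum(n for n, _ in groups)
    return sum(n * m for n, m in groups) / n_total

# Hypothetical groups, invented for illustration: (size, group mean)
hypothetical_groups = [(10, 80.0), (20, 70.0), (70, 60.0)]

weighted = weighted_mean(hypothetical_groups)
unweighted = sum(m for _, m in hypothetical_groups) / len(hypothetical_groups)

print(weighted)    # 64.0
print(unweighted)  # 70.0
```

The unweighted figure treats the group of 10 as if it counted as much as the group of 70, which is why it overstates the combined mean in this made-up case.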

So what does this tell us?

The mode is the peak of the curve.

In a skewed distribution, the mean is found closest to the tail, where the relatively few extreme cases will be found.

The median is found between the mode and the mean, or is aligned with them in a normal distribution.
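The pull of extreme cases on the mean can be seen in the nine-state hate crime figures from the median example, where one very large value (California, 1831) forms a long upper tail. The sketch below uses Python's standard `statistics` module; the variable names are choices made here.

```python
import statistics

# Hate crimes reported in the nine largest U.S. states, 1997
hate_crimes = [42, 93, 105, 168, 265, 333, 694, 853, 1831]

mean = statistics.mean(hate_crimes)
median = statistics.median(hate_crimes)

print(round(mean, 1), median)  # 487.1 265
```

The mean (about 487) sits well above the median (265) because it is drawn toward the tail, while the median stays at the middle case.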


Did you know?

The shape or form of a distribution can influence the researcher’s choice of a measure of central tendency.

Why is that? Well, let’s see…

