Measures of Central Tendency

Post on 09-Sep-2020


Page 1

Measures of Central Tendency

Levin and Fox, Elementary Statistics in Social Research

Chapter 3

Page 2

Measures of central tendency:

Measures of central tendency are numbers that describe what is average or typical in a distribution.

We will focus on three measures of central tendency:

- The Mode
- The Median
- The Mean (average)

Our choice of an appropriate measure of central tendency depends on three factors: (a) the level of measurement, (b) the shape of the distribution, and (c) the purpose of the research.

Page 3

The Mode

The mode is the most frequent, most typical, or most common value or category in a distribution.

Example: There are more Protestants in the US than people of any other religion.

The mode is always a category or score, not a frequency.

The mode is not necessarily the category with the majority (that is, 50% or more) of cases. It is simply the category in which the largest number of cases falls.

Page 4

The Mode

- Most frequent or most common value or category.
- A category or score (not a frequency).
- Not necessarily the majority.
- Used to describe nominal variables!

Page 5

Let’s Practice!

Look at the figure below and identify the mode.

[Figure not reproduced in this transcript.]

Page 6

A Review of Mode

The pie chart shows answers of 1998 GSS respondents to the question, “Would you say your own health, in general, is excellent, good, fair, or poor?”

Note that the highest percentage (49%) of respondents is associated with the answer “good.”

The answer “good” is the mode.

Remember: The mode is used to describe nominal variables!

Page 7

A Review of Mode

Another Mode Example: Our question is the following: “What is the most common foreign language spoken in the United States today, as determined by the mode?”

To answer this question, let’s look at a list of the ten most commonly spoken foreign languages in the United States and the number of people who speak each foreign language:

Page 8

Ten Most Common Foreign Languages Spoken in the United States, 1990.

Language      Number of Speakers
Spanish       17,339,000
French         1,702,000
German         1,547,000
Italian        1,309,000
Chinese        1,249,000
Tagalog          843,000
Polish           723,000
Korean           626,000
Vietnamese       507,000
Portuguese       430,000

Source: U.S. Bureau of the Census, Statistical Abstract of the United States, 2000, Table 51.

Page 9

A Review of Mode

Is the mode 17,339,000?

NO!

Recall: The mode is the category or score, not the frequency!

Thus, the mode is Spanish.
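The distinction between the modal category and its frequency can be sketched in a few lines of Python. This is only an illustration: the function name `mode_category` is an assumption made here, and the counts are copied from the 1990 language table.

```python
# Speaker counts from the 1990 language table (U.S. Bureau of the Census).
speakers = {
    "Spanish": 17_339_000,
    "French": 1_702_000,
    "German": 1_547_000,
    "Italian": 1_309_000,
    "Chinese": 1_249_000,
    "Tagalog": 843_000,
    "Polish": 723_000,
    "Korean": 626_000,
    "Vietnamese": 507_000,
    "Portuguese": 430_000,
}


def mode_category(freq_table):
    """Return the category with the highest frequency (hypothetical helper).

    The result is the category itself, never the frequency.
    """
    return max(freq_table, key=freq_table.get)


mode = mode_category(speakers)
print(mode)            # the mode is the category
print(speakers[mode])  # this number is the frequency, not the mode
```

Looking up `speakers[mode]` afterwards gives the frequency, which is useful to report but is not itself the mode.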

Page 10

The Mode

Some additional points to consider about modes:

Some distributions have two modes, where two response categories have the highest frequencies. Such distributions are said to be bimodal.

NOTE: When the two highest-frequency scores or categories are quite close, but not identical, in frequency, the distribution is still “essentially” bimodal. In these instances, report both the “true” mode and the highest frequency categories.
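A minimal sketch of how a strictly bimodal distribution could be detected, assuming raw scores rather than a frequency table. The helper `modes` and the sample scores are invented for illustration. The "essentially bimodal" case, where frequencies are close but not identical, still calls for the researcher's judgment and is not handled here.

```python
from collections import Counter


def modes(values):
    """Return every value tied for the highest frequency (hypothetical helper)."""
    counts = Counter(values)
    top = max(counts.values())
    return sorted(v for v, c in counts.items() if c == top)


# Hypothetical scores, invented for illustration only.
scores = [2, 3, 3, 3, 4, 5, 6, 6, 6, 7]
result = modes(scores)
print(result)                      # two values tie for the highest frequency
print("bimodal:", len(result) == 2)
```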

Page 11

Example of a Bimodal Frequency Distribution

[Figure not reproduced in this transcript.]

Page 12

The Median

The median is the score that divides the distribution into two equal parts so that half of the cases are above it and half are below it.

The median can be calculated for both ordinal and interval levels of measurement, but not for nominal data.

It must be emphasized that the median is the exact middle of a distribution.

So, now let’s look at ways we can find the median in sorted data:

Page 13

The Mode and Median

The Mode:

- Most frequent or most common value or category.
- A category or score (not a frequency).
- Not necessarily the majority.
- Used to describe nominal variables!

The Median:

- Divides the distribution into two equal parts (the exact middle: 50% above and 50% below).
- Can be calculated for both ordinal and interval levels of measurement, but not for nominal data.
- The data must be sorted before the median is calculated.

Page 14

In some cases, we can find the median by simple inspection.

Let’s look at the responses (A) to the question: “Think about the economy, how would you rate economic conditions in the country today?”

First, we sort the responses (B) in order from lowest to highest (or highest to lowest).

Since we have an odd number of cases, let’s find the middle case.

A (as collected)

Response     Respondent
Poor         Jim
Good         Sue
Only Fair    Bob
Poor         Jorge
Excellent    Karen
Total (N)    5

B (sorted)

Response     Respondent
Poor         Jim
Poor         Jorge
Only Fair    Bob
Good         Sue
Excellent    Karen
Total (N)    5

Page 15

Calculating the median:

Jim      Poor
Jorge    Poor
Bob      Only Fair
Sue      Good
Karen    Excellent

We can find the median through visual inspection and through calculation.

We can also find the middle case when N is odd by adding 1 to N and dividing by 2: (N + 1) ÷ 2.

Since N is 5, you calculate (5 + 1) ÷ 2 = 3. The middle case is, thus, the third case (Bob), and the median response is “Only Fair.”
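The (N + 1) ÷ 2 rule for ordinal data can be sketched as follows. The names `ORDER` and `ordinal_median` are assumptions made for this illustration, and the sketch handles only an odd number of cases, as in the five-respondent example.

```python
# Rank order of the response categories, lowest to highest.
ORDER = ["Poor", "Only Fair", "Good", "Excellent"]

responses = {
    "Jim": "Poor",
    "Sue": "Good",
    "Bob": "Only Fair",
    "Jorge": "Poor",
    "Karen": "Excellent",
}


def ordinal_median(values, order):
    """Median of an odd number of ordinal values, found by position."""
    ranked = sorted(values, key=order.index)  # sort by category rank
    n = len(ranked)
    if n % 2 == 0:
        raise ValueError("this sketch handles only an odd number of cases")
    middle_case = (n + 1) // 2       # 1-based position, e.g. (5 + 1) / 2 = 3
    return ranked[middle_case - 1]   # convert to a 0-based index


median_response = ordinal_median(responses.values(), ORDER)
print(median_response)
```

The categories are ranked by their position in `ORDER`, not alphabetically, because ordinal data carry an order that plain string sorting would not respect.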

Page 16

Calculating the median:

Another example: The following is a list of the number of hate crimes reported in the nine largest U.S. states for 1997.

State            Number
California       1831
Florida          93
Virginia         105
New Jersey       694
New York         853
Ohio             265
Pennsylvania     168
Texas            333
North Carolina   42

TOTAL N = 9

Page 17

Calculating the median:

Finding the Median Number of Hate Crimes

1. Order the cases from lowest to highest.
2. In this situation, we need the 5th case: (9 + 1) ÷ 2 = 5, which is 265 (interval data).

Remember: (N + 1) ÷ 2.

State            Number
North Carolina   42
Florida          93
Virginia         105
Pennsylvania     168
Ohio             265
Texas            333
New Jersey       694
New York         853
California       1831

N = 9

Page 18

Finding the Median Number of Hate Crimes out of Eight States

1. Order the cases from lowest to highest.
2. The median is always that point above which 50% of cases fall and below which 50% of cases fall.
3. For an even number of cases, there will be two middle cases.
4. In this instance, the median falls halfway between the two middle cases, 168 and 265: (168 + 265) ÷ 2 = 216.5.
5. However, the circumstances being explained should determine if you use the two middle cases or the point halfway between both cases for your explanation.

State            Number
North Carolina   42
Florida          93
Virginia         105
Pennsylvania     168
Ohio             265
Texas            333
New Jersey       694
New York         853

Page 19

Finding the Median Number of Hate Crimes out of Eight States

1. In this instance, the median falls halfway between both cases (216.5).

(8 + 1) ÷ 2 = 4.5

State            Number
North Carolina   42
Florida          93
Virginia         105
Pennsylvania     168
                         <- position 4.5 (median = 216.5)
Ohio             265
Texas            333
New Jersey       694
New York         853
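Both the odd and the even case can be covered by one short function. The following is a sketch under the assumption that the scores are interval-level numbers. The name `median` is simply chosen here, and the data are the 1997 hate-crime counts from the tables in this section.

```python
def median(values):
    """Median by position: the case at (N + 1) / 2 in the sorted data.

    For an even N the position falls between two cases, so the point
    halfway between them is returned.
    """
    ordered = sorted(values)
    n = len(ordered)
    position = (n + 1) / 2               # 1-based position of the median
    lower = ordered[int(position) - 1]   # case at (or just below) the position
    if n % 2 == 1:
        return lower
    upper = ordered[int(position)]       # case just above the position
    return (lower + upper) / 2


# Hate crimes reported in the nine largest U.S. states, 1997.
nine_states = [1831, 93, 105, 694, 853, 265, 168, 333, 42]
# The eight-state example drops California (1831).
eight_states = [x for x in nine_states if x != 1831]

print(median(nine_states))   # 5th case of 9
print(median(eight_states))  # halfway between the 4th and 5th cases of 8
```

Python's standard library offers `statistics.median`, which returns the same values for these two lists. The hand-written version is shown only to mirror the (N + 1) ÷ 2 procedure.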

Page 20

The median in frequency distributions:

So now, let’s find the median in frequency distributions. Often the data are arranged in frequency distributions.

The procedure is a bit more involved:

- We have to find the category associated with the observation located in the middle of the distribution.
- To do this, we construct a cumulative percentage distribution.

So, let’s take a look at a frequency distribution…

Page 21

Table: Political Views of GSS Respondents, 1988

Political Views          Frequency (f)   Cf      Percentage   C%
Extremely Liberal        32              32      2.4          2.4
Liberal                  175             207     12.9         15.3
Slightly Liberal         189             396     13.9         29.2
Moderate                 502             898     37.0         66.2
Slightly Conservative    211             1109    15.6         81.8
Conservative             203             1312    15.0         96.8
Extremely Conservative   44              1356    3.2          100.0
Total                    1356                    100.0

Page 22

Cumulative Percentage Distribution:

We construct a cumulative percentage distribution to help locate the middle of the distribution.

The observation located in the middle of the distribution is the one that has the cumulative percentage value equal to 50%.

- Notice that 29.2% of the observations are accumulated below the category of “moderate” and that 66.2% are accumulated up to and including the category “moderate.”

The median is the value of the category associated with this observation.

Because 50% lies between 29.2% and 66.2%, this middle observation falls within the category “moderate,” so the median for this distribution is “moderate.”

Page 23

Table: Political Views of GSS Respondents, 1988

(The table from Page 21 is repeated on this slide with the “Moderate” row marked: the cumulative percentage passes from 29.2 to 66.2 in this category, so the 50% point falls within it.)
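The cumulative-percentage procedure can be sketched in Python as follows. The function names are assumptions made for this illustration, and the frequencies are those of the political views table. The categories must already be listed in their rank order.

```python
from itertools import accumulate

# Frequencies from the political views table, in rank order.
political_views = [
    ("Extremely Liberal", 32),
    ("Liberal", 175),
    ("Slightly Liberal", 189),
    ("Moderate", 502),
    ("Slightly Conservative", 211),
    ("Conservative", 203),
    ("Extremely Conservative", 44),
]


def cumulative_percentages(freq_table):
    """Return the C% column: cumulative frequency as a percentage of N."""
    total = sum(f for _, f in freq_table)
    running = accumulate(f for _, f in freq_table)
    return [round(100 * cf / total, 1) for cf in running]


def median_category(freq_table):
    """Return the first category whose cumulative percentage reaches 50%."""
    total = sum(f for _, f in freq_table)
    running = accumulate(f for _, f in freq_table)
    for (category, _), cf in zip(freq_table, running):
        if 100 * cf / total >= 50:
            return category


c_pct = cumulative_percentages(political_views)
print(c_pct)
print(median_category(political_views))
```

The comparison uses the unrounded cumulative percentage, so rounding for display cannot move the median into a neighboring category.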

Page 24

The Mean

The mean is what most people call the average. To find the mean of any distribution, simply add up all the scores and divide by the total number of scores.

Here is the formula for calculating the mean:

$$\bar{X} = \frac{\sum X}{N}$$

where

$\bar{X}$ = mean (read as X bar)
$\sum$ = sum (expressed as the Greek letter sigma)
$X$ = raw score in a set of scores
$N$ = total number of scores in a set

Page 25

Finding the Mean

Communicable Diseases -> Tuberculosis (as of 22 March 2007)

Country                                   2005
Bangladesh                                37
Bhutan                                    44
Democratic People's Republic of Korea     103
India                                     58
Indonesia                                 47
Maldives                                  76
Myanmar                                   119
Nepal                                     64
Sri Lanka                                 71
Thailand                                  61
Timor-Leste                               71

n (cases) = 11                            Sum = 751

© World Health Organization, 2008. All rights reserved

Page 26

Finding the Mean

To find the mean for the instances of tuberculosis found in 2006 by the WHO in this region,

- Add up the values for all of the countries in the region (751), and
- Divide the sum by the total number of cases (N = 11).

Thus, the mean rate is (751 ÷ 11) = 68.273.

$$\bar{X} = \frac{\sum X}{N}$$
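The same arithmetic in Python, as a minimal sketch. The dictionary name `tb_values` and the helper `mean` are assumptions made here, and the figures are copied from the WHO tuberculosis table.

```python
# Values from the WHO tuberculosis table for the region.
tb_values = {
    "Bangladesh": 37,
    "Bhutan": 44,
    "Democratic People's Republic of Korea": 103,
    "India": 58,
    "Indonesia": 47,
    "Maldives": 76,
    "Myanmar": 119,
    "Nepal": 64,
    "Sri Lanka": 71,
    "Thailand": 61,
    "Timor-Leste": 71,
}


def mean(scores):
    """Sum of the scores divided by the number of scores (sum X / N)."""
    scores = list(scores)
    return sum(scores) / len(scores)


total = sum(tb_values.values())   # numerator: sum of X
n = len(tb_values)                # denominator: N
print(total, n, round(mean(tb_values.values()), 3))
```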

Page 27

Using a formula to calculate the mean:

The Usefulness of Formulas:

The mean introduces the usefulness of a formula, which may be defined as a shorthand way to explain what operations we need to follow to obtain a certain result.

Again, the formula that defines the mean is:

$$\bar{X} = \frac{\sum X}{N}$$

where

$\bar{X}$ = mean (read as X bar)
$\sum$ = sum (expressed as the Greek letter sigma)
$X$ = raw score in a set of scores
$N$ = total number of scores in a set

Page 28

Deviation:

The deviation indicates the distance and direction of any raw score from the mean.

To find the deviation of a particular score, we simply subtract the mean from the score:

$$\text{Deviation} = X - \bar{X}$$

where

$X$ = any raw score in the distribution
$\bar{X}$ = mean of the distribution
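A short sketch of the deviation calculation, using hypothetical scores invented for illustration. The sign of each deviation gives the direction from the mean and its size gives the distance. As a general property of the mean, the deviations sum to zero.

```python
def deviations(scores):
    """Return X - X-bar for every raw score X."""
    x_bar = sum(scores) / len(scores)
    return [x - x_bar for x in scores]


# Hypothetical raw scores, invented for illustration only.
scores = [2, 4, 6, 8, 10]
devs = deviations(scores)
print(devs)       # negative = below the mean, positive = above the mean
print(sum(devs))  # deviations from the mean cancel out
```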

Page 29

The Weighted Mean

When groups differ in size, you can’t just sum their means and divide by the number of groups. Instead, you must weight each group mean by its size:

$$\bar{X}_w = \frac{\sum N_{\text{group}} \bar{X}_{\text{group}}}{N_{\text{total}}}$$

where

$\bar{X}_{\text{group}}$ = mean of a particular group
$N_{\text{group}}$ = number in a particular group
$N_{\text{total}}$ = number in all groups combined
$\bar{X}_w$ = weighted mean
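The weighted mean formula can be sketched as follows. The group sizes and group means are hypothetical values chosen only to show how far the weighted and unweighted results can differ.

```python
def weighted_mean(groups):
    """Weighted mean of (N_group, X-bar_group) pairs.

    Each group mean is multiplied by its group size, the products are
    summed, and the sum is divided by N_total.
    """
    n_total = sum(n for n, _ in groups)
    return sum(n * x_bar for n, x_bar in groups) / n_total


# Hypothetical groups: (number in group, group mean).
groups = [(10, 80.0), (30, 60.0)]

naive = sum(x_bar for _, x_bar in groups) / len(groups)  # ignores group size
print(naive)                  # simple average of the two group means
print(weighted_mean(groups))  # pulled toward the larger group's mean
```

Here the larger group has the lower mean, so the weighted mean falls below the simple average of the two group means.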

Page 30

So what does this tell us?

The mode is the peak of the curve.

The mean is found closest to the tail, where the relatively few extreme cases will be found.

The median is found between the mode and mean or is aligned with them in a normal distribution.
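The relative positions of the three measures can be checked with Python's standard `statistics` module. Both data sets below are hypothetical and were invented for illustration: one has a long tail of high scores, the other is symmetric.

```python
import statistics

# Hypothetical scores with a few extreme high values (a tail to the right).
skewed = [1, 2, 2, 2, 3, 3, 4, 5, 9, 19]
# Hypothetical symmetric scores.
symmetric = [1, 2, 3, 3, 3, 4, 5]

for name, data in [("skewed", skewed), ("symmetric", symmetric)]:
    print(
        name,
        "mode:", statistics.mode(data),
        "median:", statistics.median(data),
        "mean:", statistics.mean(data),
    )
```

In the skewed set the mean is pulled toward the tail and the median sits between the mode and the mean. In the symmetric set all three coincide.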

Page 31

Did you know?

The shape or form of a distribution can influence the researcher’s choice of a measure of central tendency.

Why is that? Well, let’s see…