two variable tables

Two Variable Tables, February 23, 2011

Upload: phuong

Post on 08-Jan-2016



Page 1: Two Variable Tables

Two Variable Tables

February 23, 2011

Page 2: Two Variable Tables

Objectives

By the end of this meeting, participants should be able to:

a) Create and interpret a cross-tabulation using frequency and percentage tables.

b) Explain the considerations related to missing data in a cross-tabulation.

Page 3: Two Variable Tables

Cross Tabs

a) A popular and useful way to study the relationship between two ordinal or nominal variables is through a bivariate frequency distribution.

b) This is also called a contingency table, a cross-tabulation, or, more commonly, cross tabs.

c) These tables can be computed as percentages or counts.
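As a minimal sketch of how a bivariate frequency distribution is built, the Python below tallies paired observations into cell counts and then converts each column to percentages. The records, the variable values, and the helper name `cross_tab` are hypothetical illustrations, not data or code from this course.

```python
from collections import Counter

# Hypothetical paired observations: (column variable, row variable).
records = [
    ("Democrat", "Yes"), ("Democrat", "Yes"), ("Democrat", "No"),
    ("Republican", "No"), ("Republican", "No"), ("Republican", "Yes"),
    ("Republican", "No"),
]

def cross_tab(pairs):
    """Count how many cases fall in each (column, row) cell."""
    return Counter(pairs)

counts = cross_tab(records)

# Column totals are needed to turn counts into column percentages.
col_totals = Counter(col for col, _ in records)

percents = {
    (col, row): 100.0 * n / col_totals[col]
    for (col, row), n in counts.items()
}

for (col, row), n in sorted(counts.items()):
    print(f"{col:<10} {row:<3} count={n}  col%={percents[(col, row)]:.1f}")
```

The same cell counts feed both versions of the table; only the final division by the column total differs.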

Page 4: Two Variable Tables

Count Tables

a) Tables that simply count cases across two variables are generally less useful because they are easy to misread.

b) A fictional example:

Grade    Smith   Jones
A           12      10
B           23      10
C           30       8
D            9       6
F           10       6
Total       84      40

Page 5: Two Variable Tables

Count Tables

c) In which professor's class are you more likely to fail?

d) At a quick glance you might think that Smith was the harder of the two professors, having given 19 students a D or an F compared with 12 for Jones.

e) But Smith issued a D or an F in 22.6% of cases (19 of 84), while Jones issued a D or an F 30% of the time (12 of 40).

f) So if you want to be safe, Smith is the better choice.
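The comparison can be checked with a short Python sketch that converts the fictional grade counts into column percentages. The counts are copied from the grade table; the helper name `column_percentages` is just an illustrative choice.

```python
# Counts copied from the fictional grade table (rows A, B, C, D, F).
grades = ["A", "B", "C", "D", "F"]
counts = {
    "Smith": [12, 23, 30, 9, 10],
    "Jones": [10, 10, 8, 6, 6],
}

def column_percentages(column):
    """Express each cell as a percentage of its column total."""
    total = sum(column)
    return [100.0 * n / total for n in column]

pct = {prof: column_percentages(col) for prof, col in counts.items()}

# Share of each class receiving a D or an F (the last two rows).
d_or_f = {prof: sum(p[-2:]) for prof, p in pct.items()}

for prof, col in counts.items():
    print(f"{prof}: {sum(col[-2:])} of {sum(col)} students, {d_or_f[prof]:.1f}% D or F")
```

Dividing by each professor's own total is what makes the two classes comparable despite their different sizes.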

Page 6: Two Variable Tables

Count Tables

g) A real example: Lincoln County, MO is traditionally a bellwether in presidential elections. Look at the 2008 results from this county:

Precincts Won by Candidate in Lincoln County, MO

          City Precincts   Non-City Precincts   Total
McCain                 3                   18      21
Obama                  3                    0       3

h) Based upon this count it would be easy to conclude that the county had near-unanimous support for McCain. Instead, while the county went for McCain, it did so by 2,690 votes, or 11.6% of the 23,158 cast.
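To make the contrast concrete, the sketch below compares McCain's share of precincts won with his vote margin as a share of all votes cast. It uses only the figures quoted on this slide; the variable names are illustrative.

```python
# Precincts won by each candidate, from the Lincoln County table.
precincts_won = {"McCain": 21, "Obama": 3}

# Vote figures quoted in the text.
margin_votes = 2690
total_votes = 23158

precinct_share = 100.0 * precincts_won["McCain"] / sum(precincts_won.values())
margin_share = 100.0 * margin_votes / total_votes

print(f"McCain share of precincts won: {precinct_share:.1f}%")
print(f"McCain margin as share of votes cast: {margin_share:.1f}%")
```

Winning a precinct is an all-or-nothing outcome, so a count of precincts won can overstate how lopsided the underlying vote was.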

Page 7: Two Variable Tables

Percentage Tables

a) Because count tables are so easy to misread, most analysts lean toward percentage tables.

b) These tables make it easier for the reader to observe the relationship between two variables.

c) The best percentage tables should have some basic count information on them, such as the total number of cases.

Page 8: Two Variable Tables

Percentage Tables

d) Without some basic count information, it is still easy for the reader to be misled.

e) For example, let's examine Missouri Congressional District 5 in 2008, which includes all of Kansas City, MO and parts of Jackson and Cass counties. Let's say that our question was: did the vote in this district vary by county?

Page 9: Two Variable Tables

Percentage Tables

f) So from this table we see that there is substantial variation in District 5 by county. The problem is we cannot answer the more basic question of who won that election.

Congressional Vote by County, District 5

              Kansas City   Jackson     Cass
Cleaver (D)         79.0%     52.5%    42.8%
Turk (R)            21.0%     47.5%    57.2%
Total              100.0%    100.0%   100.0%

Page 10: Two Variable Tables

Percentage Tables

g) This table can be improved by adding some simple count information so that the final result is clear. How would we explain the differences in vote by county?

Congressional Vote by County, District 5

              Kansas City        Jackson            Cass               Total
Cleaver (D)   79.0% (115,757)    52.5% (70,266)     42.8% (11,226)     64.4% (197,249)
Turk (R)      21.0% (30,683)     47.5% (63,496)     57.2% (14,987)     35.6% (109,166)
Total         100.0% (146,440)   100.0% (133,762)   100.0% (26,213)    100.0% (306,415)
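The sketch below rebuilds this table from the vote counts and shows why the counts matter: the pooled result comes from adding votes, not from averaging the three column percentages. The counts are copied from the table; the function and variable names are illustrative.

```python
# Vote counts copied from the table (area -> candidate -> votes).
counts = {
    "Kansas City": {"Cleaver (D)": 115757, "Turk (R)": 30683},
    "Jackson": {"Cleaver (D)": 70266, "Turk (R)": 63496},
    "Cass": {"Cleaver (D)": 11226, "Turk (R)": 14987},
}
candidates = ["Cleaver (D)", "Turk (R)"]

# The Total column pools the raw votes across the three areas.
totals = {c: sum(col[c] for col in counts.values()) for c in candidates}
table = {**counts, "Total": totals}

def column_percent(col, candidate):
    """Candidate's share of the votes cast in one column."""
    return 100.0 * col[candidate] / sum(col.values())

for area, col in table.items():
    for c in candidates:
        print(f"{area:<12} {c:<12} {column_percent(col, c):5.1f}% ({col[c]:,})")

# Averaging the three percentages ignores how many votes each area cast.
unweighted = sum(column_percent(counts[a], "Cleaver (D)") for a in counts) / len(counts)
pooled = column_percent(totals, "Cleaver (D)")
print(f"Unweighted mean: {unweighted:.1f}%  Pooled share: {pooled:.1f}%")
```

Kansas City cast far more votes than Cass, so it pulls the pooled share well above the simple average of the three columns.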

Page 11: Two Variable Tables

Missing Data

a) In the previous example we were using the official results from the MO Secretary of State and we had access to full results. Therefore, we did not have to deal with the issue of missing data.

b) Generally, whenever survey data are used, there is the question of missing data.

c) Some missing data come from errors or omissions, while other cases stem from refusals.

Page 12: Two Variable Tables

Missing Data

d) When presenting data as a table, the analyst needs to decide how to present these categories.

e) Generally, it is necessary to determine how important the missing cases are to the analysis.
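One way to judge how much the missing cases matter is to compute the column percentages twice, once with a "Missing" row and once with those cases dropped. The sketch below does this for a small hypothetical survey; the responses, the category labels, and the `keep_missing` flag are all invented for illustration, and only the row variable has missing values here.

```python
from collections import Counter

# Hypothetical survey responses: (column variable, row variable).
# None marks a missing answer on the row variable.
responses = [
    ("Low", "Agree"), ("Low", None), ("Low", "Disagree"), ("Low", "Agree"),
    ("High", "Agree"), ("High", "Agree"), ("High", None), ("High", None),
]

def column_percentages(pairs, keep_missing):
    """Column percentages, either showing or dropping missing answers."""
    if keep_missing:
        pairs = [(c, r if r is not None else "Missing") for c, r in pairs]
    else:
        pairs = [(c, r) for c, r in pairs if r is not None]
    cells = Counter(pairs)
    col_totals = Counter(c for c, _ in pairs)
    return {cell: 100.0 * n / col_totals[cell[0]] for cell, n in cells.items()}

with_missing = column_percentages(responses, keep_missing=True)
without_missing = column_percentages(responses, keep_missing=False)

for cell in sorted(with_missing):
    dropped = without_missing.get(cell)
    shown = f"{dropped:.1f}%" if dropped is not None else "n/a"
    print(f"{cell[0]:<5} {cell[1]:<9} with={with_missing[cell]:.1f}%  without={shown}")
```

In this invented data the "High" column moves from 50% to 100% agreement once missing answers are dropped, which is the kind of shift that tells the analyst the missing cases cannot be ignored.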

Page 13: Two Variable Tables

For Next Time

a) Read WKB chapter 12.

b) Select two ordinal variables with a possible relationship from the PS-ARE data.

• Recode the variables (if necessary) so that they are both ordinal scales.

• Compute the cross tabulation of the two variables with both frequencies and percentages.

• Interpret this table. Are the variables substantively associated?

• Hint: Save the code you used to get this table. It’ll come in handy for your next homework.