stb1e_ppt_ch03

Upload: wilson-carlwinata-s

Post on 26-Dec-2015

DESCRIPTION

Powerpoint

TRANSCRIPT

Page 1: stb1e_ppt_ch03
Page 2: stb1e_ppt_ch03

Copyright © 2011 Pearson Education, Inc.

Describing Categorical Data

Chapter 3

Page 3: stb1e_ppt_ch03

3.1 Looking At Data

Which hosts send the most visitors to Amazon’s Web site?

Data set consists of 188,996 visits

Host is a categorical variable

To answer this question we must describe the variation in Host

Page 4: stb1e_ppt_ch03

3.1 Looking At Data

Frequency and Relative Frequency Tables

The distribution of a categorical variable is a list of its values, each with its associated count (frequency)

A frequency table summarizes the distribution of a categorical variable

A relative frequency table shows the proportion (or percentage) in each category
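
As a minimal sketch of the two tables, assuming a small made-up list of host labels (the values below are hypothetical, not the Amazon data, and the function names are illustrative), the counts and proportions could be tabulated like this:

```python
from collections import Counter

# Hypothetical host labels, one per visit (NOT the real Amazon data).
visits = ["search", "mail", "search", "news", "search", "mail", "direct", "search"]


def frequency_table(values):
    """Return {category: count}, ordered from most to least frequent."""
    return dict(Counter(values).most_common())


def relative_frequency_table(values):
    """Return {category: proportion}; the proportions sum to 1."""
    counts = frequency_table(values)
    total = sum(counts.values())
    return {category: count / total for category, count in counts.items()}


freq = frequency_table(visits)
rel_freq = relative_frequency_table(visits)

# Print category, count, and percentage side by side.
for category in freq:
    print(f"{category:<8}{freq[category]:>4}{rel_freq[category]:>8.1%}")
```

The relative frequencies are the counts divided by the total number of cases, so both tables describe the same distribution on different scales.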

Page 5: stb1e_ppt_ch03

3.1 Looking At Data

Page 6: stb1e_ppt_ch03

3.2 Charts of Categorical Data

Bar Charts and Pie Charts

Unless you need to know exact counts, charts are better than tables for summarizing more than five categories

The two most common displays of a categorical variable are a bar chart and a pie chart

Page 7: stb1e_ppt_ch03

3.2 Charts of Categorical Data

The Bar Chart

Uses horizontal or vertical bars to show the distribution of a categorical variable

Is called a Pareto chart when the categories are sorted in descending order of frequency (popular in quality control)

Becomes cluttered with too many categories

Is appropriate for ordinal categorical variables
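
The Pareto ordering can be sketched in a few lines. This is an illustration only: the defect labels are hypothetical, and the text bars stand in for a real plotting library.

```python
from collections import Counter

# Hypothetical defect categories from a quality-control log.
defects = ["scratch", "dent", "scratch", "crack", "scratch", "dent", "stain"]


def pareto_order(values):
    """Return (category, count) pairs sorted from most to least frequent."""
    return Counter(values).most_common()


ordered = pareto_order(defects)

for category, count in ordered:
    # One '#' per observation, so bar length is proportional to the count.
    print(f"{category:<8}{'#' * count}")
```

For an ordinal variable the bars would instead be kept in the variable's natural order rather than re-sorted by frequency.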

Page 8: stb1e_ppt_ch03

3.2 Charts of Categorical Data

Bar Chart (Horizontal) of Top 10 Hosts

Page 9: stb1e_ppt_ch03

3.2 Charts of Categorical Data

Bar Chart (Vertical) of Top 10 Hosts

Page 10: stb1e_ppt_ch03

3.2 Charts of Categorical Data

The Pie Chart

Uses wedges of a circle to show the distribution of a categorical variable

Commonly chosen to illustrate market shares or sources of revenue for a company

Less useful than bar charts if we want to compare actual counts (easier to compare bars than angles of wedges)
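
To see why wedges are harder to compare than bars, it may help to look at the angles a pie chart actually draws. The sketch below assumes hypothetical counts and a helper named `wedge_angles` introduced here for illustration; each wedge gets its proportion of the full 360 degrees.

```python
def wedge_angles(counts):
    """Map each category to its wedge angle in degrees (proportion * 360)."""
    total = sum(counts.values())
    return {category: 360 * count / total for category, count in counts.items()}


# Hypothetical counts for three categories.
counts = {"A": 50, "B": 30, "C": 20}
angles = wedge_angles(counts)

for category, angle in angles.items():
    print(f"{category}: {angle:.0f} degrees")
```

Here B and C differ by 10 cases, which a bar chart shows as a visible difference in length, while the pie chart shows it as a 36-degree difference in angle that the eye judges less reliably.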

Page 11: stb1e_ppt_ch03

3.2 Charts of Categorical Data

Pie Chart of Top 10 Hosts

Page 12: stb1e_ppt_ch03

3.3 The Area Principle

The Fundamental Rule for Data Displays

The area occupied by a part of the graph/chart that displays data should be proportional to the amount of data it represents

Charts decorated to attract attention often violate the area principle
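
A common way decorated charts break the rule is by scaling a picture in both width and height to match the data. The arithmetic can be sketched as follows; the function names are illustrative and the 2-to-1 ratio is a made-up example.

```python
import math


def area_exaggeration(value_ratio):
    """Area ratio that results when BOTH width and height are scaled by value_ratio."""
    return value_ratio ** 2


def fair_linear_scale(value_ratio):
    """Linear scale factor that makes the AREA ratio equal value_ratio."""
    return math.sqrt(value_ratio)


# Hypothetical case: one category has twice the count of another.
ratio = 2
print(area_exaggeration(ratio))            # the icon looks 4 times as large
print(round(fair_linear_scale(ratio), 3))  # scale each side by about 1.414 instead
```

Bars of equal width avoid the problem because only their length changes, so area stays proportional to the count.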

Page 13: stb1e_ppt_ch03

3.3 The Area Principle

An Example Violating the Area Principle

Page 14: stb1e_ppt_ch03

3.3 The Area Principle

The Same Example Respecting the Area Principle

Page 15: stb1e_ppt_ch03

4M Example 3.1: ROLLING OVER

Motivation

Are certain types of vehicles more prone to roll-over accidents than others?

Page 16: stb1e_ppt_ch03

4M Example 3.1: ROLLING OVER

Method

Data were gathered from the Fatality Analysis Reporting System (FARS) for roll-over accidents on interstate highways. The cases that make up the rows are accidents resulting in roll-overs in 2000. The column of interest is the model of the car involved.

Page 17: stb1e_ppt_ch03

4M Example 3.1: ROLLING OVER

Mechanics

Page 18: stb1e_ppt_ch03

4M Example 3.1: ROLLING OVER

Mechanics

Page 19: stb1e_ppt_ch03

4M Example 3.1: ROLLING OVER

Message

Ford Broncos were involved in more than twice as many roll-over accidents as the next-closest model.

Page 20: stb1e_ppt_ch03

4M Example 3.2: CHIP SALES

Motivation

Infineon pled guilty to price fixing for DRAMs in September 2004. Did Infineon gain a larger share of the market for chips during this period?

Page 21: stb1e_ppt_ch03

4M Example 3.2: CHIP SALES

Method

Page 22: stb1e_ppt_ch03

4M Example 3.2: CHIP SALES

Mechanics

Page 23: stb1e_ppt_ch03

4M Example 3.2: CHIP SALES

Message

Infineon and Samsung increased their shares from 1999 to 2002. These gains appear to have come at the expense of smaller companies.

Page 24: stb1e_ppt_ch03

3.4 Mode and Median

Mode

Category with the highest frequency

The longest bar in a bar chart

The widest slice in a pie chart

Two or more categories can tie for the highest frequency (bimodal or multimodal)
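
Because ties are possible, a mode calculation is safer when it returns a list. The sketch below is one way to do that with the standard library; the `modes` helper and the sample labels are hypothetical.

```python
from collections import Counter


def modes(values):
    """Return every category that attains the highest frequency."""
    counts = Counter(values)
    if not counts:
        return []
    top = max(counts.values())
    # Keep all categories tied at the top count, in first-seen order.
    return [category for category, count in counts.items() if count == top]


print(modes(["x", "y", "x"]))            # a single mode
print(modes(["a", "b", "a", "b", "c"]))  # a tie: bimodal
```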

Page 25: stb1e_ppt_ch03

3.4 Mode and Median

Median

Not appropriate for nominal data

Data must be ordinal

It is the category label of the middle observation in ordered data
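
A minimal sketch of the median for ordinal data follows. The rating scale and responses are hypothetical, and taking the lower of the two middle observations when the count is even is an assumption of this sketch, not a rule stated on the slide.

```python
def ordinal_median(values, order):
    """Return the label of the middle observation after sorting by `order`.

    `order` lists the categories from lowest to highest. With an even number
    of observations, the lower of the two middle labels is returned
    (an assumption made for this sketch).
    """
    if not values:
        raise ValueError("median of empty data is undefined")
    rank = {label: position for position, label in enumerate(order)}
    ranked = sorted(values, key=lambda label: rank[label])
    return ranked[(len(ranked) - 1) // 2]


# Hypothetical ordinal scale and survey responses.
scale = ["poor", "fair", "good", "excellent"]
ratings = ["good", "poor", "excellent", "fair", "good"]

print(ordinal_median(ratings, scale))
```

The function needs the explicit `order` argument because the sort has to follow the scale, which is exactly what nominal data lack.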

Page 26: stb1e_ppt_ch03

Best Practices

Use a bar chart to show the frequencies of a categorical variable.

Use a pie chart to show the proportions of a categorical variable.

Preserve the ordering of an ordinal variable.

Page 27: stb1e_ppt_ch03

Best Practices (Continued)

Respect the area principle.

Show the best plots to answer the motivating question.

Label your chart to show the categories and indicate whether some have been combined or omitted.

Page 28: stb1e_ppt_ch03

Pitfalls

Avoid elaborate plots that may be deceptive.

Do not show too many categories.

Do not put ordinal data in a pie chart.

Do not carelessly round data.
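
One way careless rounding shows up is a table of percentages that no longer adds to 100. The sketch below uses hypothetical counts for three equally sized categories to illustrate the effect.

```python
# Hypothetical counts: three categories of equal size.
counts = {"A": 1, "B": 1, "C": 1}
total = sum(counts.values())

# Round each percentage to a whole number, as a careless table might.
rounded = {category: round(100 * count / total) for category, count in counts.items()}
rounded_sum = sum(rounded.values())

print(rounded)      # each category shows 33
print(rounded_sum)  # the percentages add to 99, not 100
```

Keeping an extra digit, or noting that totals may not add to 100 because of rounding, avoids leaving readers to wonder where the missing percent went.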
