ns2 data presenting 14

Upload: donald-yum

Post on 20-Feb-2018


TRANSCRIPT

  • 7/24/2019 NS2 Data Presenting 14

    1/37

    1

    Chapter 2: Presenting Data in Tables and Charts

    David Chow

    Sep 2014


    A Picture Is Worth a Thousand Words


    Categorical Data

    Organizing Categorical Data: Summary Table

    A summary table indicates the frequency, amount, or percentage of items in a set of categories.

    You can easily see differences between categories.

    How do you spend the holidays?    Percent
    At home with family               45%
    Travel to visit family            38%
    Vacation                          5%
    Catching up on work               5%
    Other                             7%

    Organizing Categorical Data: Bar Chart & Pie Chart

    In a bar chart, a bar shows each category, the length of which represents the amount, frequency, or percentage.

    A pie chart is a circle broken up into slices representing categories. The size of each slice corresponds to the percentage share.

    [Figure: bar chart and pie chart, both titled "How Do You Spend the Holidays?". Categories: At home with family 45%, Travel to visit family 38%, Vacation 5%, Catching up on work 5%, Other 7%. Bar chart horizontal axis: 0% to 50%.]
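    As a rough illustration of how bar length and slice size follow from the percentages, here is a minimal Python sketch. It reuses the holiday survey percentages; the function names and the 50-character bar width are assumptions made for this example, not part of the slides.

```python
# Survey percentages from the holiday summary table.
holiday_pct = {
    "At home with family": 45,
    "Travel to visit family": 38,
    "Vacation": 5,
    "Catching up on work": 5,
    "Other": 7,
}


def text_bar_chart(data, width=50, max_value=50):
    """Return one line per category; bar length is proportional to the value.

    `width` characters correspond to `max_value` percent (illustrative choices).
    """
    label_width = max(len(label) for label in data)
    lines = []
    for label, value in data.items():
        bar = "#" * round(value * width / max_value)
        lines.append(f"{label:<{label_width}} | {bar} {value}%")
    return lines


def pie_angles(data):
    """Return the slice angle in degrees for each category (its share of 360)."""
    total = sum(data.values())
    return {label: 360 * value / total for label, value in data.items()}


for line in text_bar_chart(holiday_pct):
    print(line)
print(pie_angles(holiday_pct))
```

    With these settings each "#" stands for one percentage point, and the 45% category takes 162 of the pie's 360 degrees.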

    Organizing Categorical Data: Pareto Diagram

    Also for categorical data

    Essentially, it is a bar chart and a cumulative polygon in the same graph

    Categories are shown in descending order of frequency

    Easy to see the vital few versus the trivial many
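    The two ingredients of a Pareto diagram, the descending sort and the cumulative percentages, can be sketched in a few lines of Python. The holiday survey percentages are reused here purely as sample data (an assumption for illustration), and `pareto_table` is a hypothetical helper name.

```python
from itertools import accumulate

# Sample categorical data (holiday survey percentages), deliberately unsorted.
holiday_pct = {
    "Vacation": 5,
    "At home with family": 45,
    "Other": 7,
    "Travel to visit family": 38,
    "Catching up on work": 5,
}


def pareto_table(data):
    """Return (category, value, cumulative percentage) rows, largest value first."""
    ordered = sorted(data.items(), key=lambda item: item[1], reverse=True)
    total = sum(value for _, value in ordered)
    running = accumulate(value for _, value in ordered)
    return [
        (label, value, 100 * cum / total)
        for (label, value), cum in zip(ordered, running)
    ]


for label, value, cum_pct in pareto_table(holiday_pct):
    print(f"{label:<24} {value:>3}  cumulative {cum_pct:5.1f}%")
```

    The bars would use the second column and the line graph the third; the first two categories already account for 83% of responses, which is the "vital few" reading.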

    Organizing Categorical Data: Pareto Diagram

    [Figure: Pareto diagram, "Current Investment Portfolio". Categories: Stocks, Bonds, Savings, CD. Bar graph (axis 0% to 45%): % invested in each category. Line graph (axis 0% to 100%): cumulative % invested.]

    A Pareto diagram is a bar chart and a cumulative polygon together: categories appear in descending order of frequency, so it is easy to see the vital few.

    Numerical Data: Ordered Array & Stem-and-Leaf

    Organizing Numerical Data: Ordered Array

    An ordered array is a sequence of data, in rank order, from the smallest value to the largest value.

    Age of Surveyed College Students

    Day Students
    16 17 17 18 18 18
    19 19 20 20 21 22
    22 25 27 32 38 42

    Night Students
    18 18 19 19 20 21
    23 28 32 33 41 45

    Organizing Numerical Data: Stem-and-Leaf Display

    A stem-and-leaf display organizes data into groups (called stems) so that the values within each group (the leaves) branch out to the right on each row.

    Age of College Students (stem: 10s column)

    Day Students
    Stem  Leaf
    1     67788899
    2     0012257
    3     28
    4     2

    Night Students
    Stem  Leaf
    1     8899
    2     0138
    3     23
    4     15
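    A minimal Python sketch of the construction, assuming two-digit whole-number ages so that the tens digit is the stem and the units digit is the leaf. The function name `stem_and_leaf` is an illustrative choice.

```python
from collections import defaultdict

# Ages from the ordered arrays of surveyed college students.
day_students = [16, 17, 17, 18, 18, 18, 19, 19, 20, 20, 21, 22,
                22, 25, 27, 32, 38, 42]
night_students = [18, 18, 19, 19, 20, 21, 23, 28, 32, 33, 41, 45]


def stem_and_leaf(values):
    """Map each stem (tens digit) to a string of its sorted leaves (units digits)."""
    groups = defaultdict(list)
    for value in sorted(values):
        stem, leaf = divmod(value, 10)
        groups[stem].append(str(leaf))
    return {stem: "".join(leaves) for stem, leaves in sorted(groups.items())}


for name, ages in [("Day Students", day_students), ("Night Students", night_students)]:
    print(name)
    for stem, leaves in stem_and_leaf(ages).items():
        print(f"  {stem} | {leaves}")
```

    Sorting first keeps the leaves in rank order on each row, so the display doubles as an ordered array.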

    Stem-and-Leaf Display

    Construct a stem-and-leaf display for the following data sets:

    1. Midterm scores: 50, 74, 74, 76, 81

    2. Average daily expenditure:

    $36.15, $31.00, $35.05, $40.25, $33.75

    Numerical Data: Tables & Charts

    Organizing Numerical Data: Frequency Distribution

    The frequency distribution is a summary table in which the data are arranged into numerically ordered class groupings.

    You must give attention to selecting the appropriate number of class groupings, determining a suitable width of a class grouping, and establishing the boundaries of each to avoid overlapping.

    To determine the width of a class interval, you divide the range (highest value - lowest value) by the number of class groupings desired.

    Organizing Numerical Data: Frequency Distribution Example

    Example: A manufacturer of insulation randomly selects 20 winter days and records the daily high temperature (in degrees Fahrenheit):

    24, 35, 17, 21, 24, 37, 26, 46, 58, 30, 32, 13, 12, 38, 41, 43, 44, 27, 53, 27

    Organizing Numerical Data: Frequency Distribution Example

    Sort raw data in ascending order:

    12, 13, 17, 21, 24, 24, 26, 27, 27, 30, 32, 35, 37, 38, 41, 43, 44, 46, 53, 58

    Find range: 58 - 12 = 46

    Select number of classes: 5 (usually between 5 and 15)

    Compute class interval (width): 10 (46/5 then round up)

    Determine class boundaries (limits): 10, 20, 30, 40, 50, 60

    Compute class midpoints: 15, 25, 35, 45, 55

    Count observations & assign to classes

    Organizing Numerical Data: Frequency Distribution Example

    Class                  Frequency   Relative Frequency   Percentage
    10 but less than 20        3             .15                15
    20 but less than 30        6             .30                30
    30 but less than 40        5             .25                25
    40 but less than 50        4             .20                20
    50 but less than 60        2             .10                10
    Total                     20            1.00               100
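    The steps of the temperature example (range, class width, boundaries, midpoints, counts) can be sketched in Python as follows. This is an illustration under stated assumptions: the first class starts at the multiple of the width at or below the minimum, which reproduces the boundary 10 used in the example, and the function name is hypothetical. It does not handle the edge case where the maximum falls exactly on the last upper boundary.

```python
import math

# Daily high temperatures for 20 winter days (from the example).
temps = [24, 35, 17, 21, 24, 37, 26, 46, 58, 30,
         32, 13, 12, 38, 41, 43, 44, 27, 53, 27]


def frequency_distribution(data, num_classes):
    """Return rows of (lower, upper, midpoint, frequency, relative frequency)."""
    data_range = max(data) - min(data)           # 58 - 12 = 46
    width = math.ceil(data_range / num_classes)  # 46 / 5 = 9.2, rounded up to 10
    # Assumption: start at the multiple of the width at or below the minimum.
    start = (min(data) // width) * width
    rows = []
    for i in range(num_classes):
        lower = start + i * width
        upper = lower + width
        # "lower but less than upper": the upper boundary is excluded.
        freq = sum(lower <= x < upper for x in data)
        rows.append((lower, upper, (lower + upper) / 2, freq, freq / len(data)))
    return rows


for lower, upper, mid, freq, rel in frequency_distribution(temps, 5):
    print(f"{lower} but less than {upper}: midpoint {mid}, "
          f"frequency {freq}, relative {rel:.2f}, percentage {100 * rel:.0f}")
```

    Excluding the upper boundary from each class is what keeps the classes from overlapping: a reading of exactly 30 is counted once, in "30 but less than 40".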

    Organizing Numerical Data: The Histogram

    The graphical version of a frequency distribution is called a histogram.

    The class boundaries (or class midpoints) are shown on the horizontal axis. The vertical axis can be frequency, relative frequency, or percentage.

    Bars of the appropriate heights are used to represent the number of observations within each class.
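    To make the choice of vertical axis concrete, here is a small Python sketch that turns the temperature frequency distribution into bar heights and a crude text histogram. The helper name `bar_heights` and the use of "#" characters are assumptions made for this illustration.

```python
# (lower boundary, upper boundary, frequency) for the daily high temperatures.
classes = [(10, 20, 3), (20, 30, 6), (30, 40, 5), (40, 50, 4), (50, 60, 2)]


def bar_heights(rows, scale="frequency"):
    """Return (class midpoint, bar height) pairs for the chosen vertical axis."""
    total = sum(freq for _, _, freq in rows)
    factor = {"frequency": 1, "relative": 1 / total, "percentage": 100 / total}[scale]
    return [((lower + upper) / 2, freq * factor) for lower, upper, freq in rows]


# One "#" per observation: bar length represents the class frequency.
for lower, upper, freq in classes:
    print(f"{lower}-{upper} | " + "#" * freq)

print(bar_heights(classes, "percentage"))
```

    Switching the scale changes the numbers on the vertical axis but not the shape of the histogram, since every bar is multiplied by the same factor.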

    Organizing Numerical Data: The Histogram

    [Figure: "Histogram: Daily High Temperature", plotting the frequency distribution of the temperature example. Horizontal axis: 5, 15, 25, 35, 45, 55, More. Vertical axis: Frequency (0 to 7).]

    Histogram in Excel: Step 1

    Earlier versions: select Tools > Data Analysis

    Excel 2010 version: select Data > Data Analysis*

    * You may need to activate Data Analysis yourself. Simply click File > Options > Add-Ins.

    Histogram in Excel: Steps 2-4

    2. Choose Histogram

    3. Input data range and bin range (the bin range is a cell range containing the upper class boundaries for each class grouping)

    4. Select Chart Output and click OK

    Organizing Numerical Data: The Polygon

    A percentage polygon is formed by having the midpoint of each class represent the data in that class and then connecting the sequence of midpoints at their respective class percentages.

    The cumulative percentage polygon, or ogive, displays the variable of interest along the X axis, and the cumulative percentages along the Y axis.
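    The points that each polygon connects can be computed directly from a percentage distribution. The sketch below uses the class percentages of the temperature example; the function names are hypothetical, and the ogive is assumed to plot, at each class boundary, the percentage of observations below that boundary.

```python
from itertools import accumulate

# (lower boundary, upper boundary, percentage) for the daily high temperatures.
classes = [(10, 20, 15), (20, 30, 30), (30, 40, 25), (40, 50, 20), (50, 60, 10)]


def percentage_polygon(rows):
    """Return (class midpoint, class percentage) points to connect."""
    return [((lower + upper) / 2, pct) for lower, upper, pct in rows]


def ogive(rows):
    """Return (boundary, cumulative percentage below that boundary) points."""
    boundaries = [rows[0][0]] + [upper for _, upper, _ in rows]
    cumulative = [0] + list(accumulate(pct for _, _, pct in rows))
    return list(zip(boundaries, cumulative))


print(percentage_polygon(classes))
print(ogive(classes))
```

    The ogive starts at 0% at the lowest boundary and reaches 100% at the highest, so any boundary can be read off as "this percentage of days were colder than this temperature".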

    Organizing Numerical Data: The Polygon

    [Figure: "Frequency Polygon: Daily High Temperature", plotting the frequency distribution of the temperature example. Horizontal axis: 5, 15, 25, 35, 45, 55, More. Vertical axis: Frequency (0 to 7).]

    (In a percentage polygon the vertical axis would be defined to show the percentage of observations per class.)

    Organizing Numerical Data: The Cumulative Percentage Polygon

    [Figure: "Ogive: Daily High Temperature". Horizontal axis: 10, 20, 30, 40, 50, 60. Vertical axis: Cumulative Percentage (0 to 100).]

    Class Lower Boundary    % Less Than Lower Boundary
    10


    Cross Tabulation

    Cross Tabulations: The Contingency Table

    A cross-classification (or contingency) table presents the results of two categorical variables

    The categories of one variable are located in the rows, the categories of the other are located in the columns

    The joint responses are classified and shown in the cells

    A graphical representation is the side-by-side bar chart

    Cross Tabulations: The Contingency Table

    A survey was conducted to study the importance of brand name to consumers as compared to a few years ago. The results, classified by gender, were as follows:

    Importance of Brand Name    Male    Female    Total
    More                         450       300      750
    Equal or Less               3300      3450     6750
    Total                       3750      3750     7500
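    A minimal Python sketch of the brand-name survey as a contingency table, with the row and column totals recomputed from the cell counts. The percentages within each gender are an extra view added for illustration and do not appear on the slide; all function names are hypothetical.

```python
# Joint responses: rows are the answer, columns are the respondent's gender.
counts = {
    "More": {"Male": 450, "Female": 300},
    "Equal or Less": {"Male": 3300, "Female": 3450},
}


def row_totals(table):
    """Total responses for each row category."""
    return {row: sum(cells.values()) for row, cells in table.items()}


def column_totals(table):
    """Total responses for each column category."""
    totals = {}
    for cells in table.values():
        for col, n in cells.items():
            totals[col] = totals.get(col, 0) + n
    return totals


def column_percentages(table):
    """Each cell as a percentage of its column total (illustrative extra)."""
    col_tot = column_totals(table)
    return {
        row: {col: 100 * n / col_tot[col] for col, n in cells.items()}
        for row, cells in table.items()
    }


print(row_totals(counts))
print(column_totals(counts))
print(column_percentages(counts))
```

    Because both genders have the same number of respondents here, the raw counts and the within-gender percentages tell the same story: 12% of men and 8% of women answered "More".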

    Cross Tabulations: Side-By-Side Bar Charts

    [Figure: side-by-side bar chart, "Importance of Brand Name". Vertical axis: Response (More, Less or Equal). Horizontal axis: Number of Responses (0 to 4000). Series: Female, Male.]

    Numerical Data: Scatter Plots & Time Series Plots

    To create scatter plots & time-series plots in Excel, use the XY (Scatter) option in the chart wizard.

    Scatter Plots

    Scatter plots are used for numerical data consisting of paired observations taken from two numerical variables.

    One variable is measured on the vertical axis and the other variable is measured on the horizontal axis.

    Scatter Plot Example

    Volume per day    Cost per day
    23                125
    26                140
    29                146
    33                160
    38                167
    42                170
    50                188
    55                195
    60                200

    [Figure: scatter plot, "Cost per Day vs. Production Volume". Horizontal axis: Volume per Day (20 to 70). Vertical axis: Cost per Day (0 to 250).]

    Time Series Plot

    A time-series plot is used to study patterns in the values of a numerical variable over time.

    Attendance (in millions) at USA amusement/theme parks from 2000-2005

    Year    Year Number    Attendance
    2000    0              317
    2001    1              319
    2002    2              324
    2003    3              322
    2004    4              328
    2005    5              335

    [Figure: time-series plot, "Attendance (in millions) at US Theme Parks". Horizontal axis: Year (Since 2000), 0 to 6. Vertical axis: Attendance (316 to 336).]
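    As a rough sketch of how the plotted points are prepared, the Python below converts calendar years to year numbers (years since 2000) and lists the year-to-year changes, one simple way to look for a pattern over time. The function names are assumptions made for this example.

```python
# Attendance (in millions) at USA amusement/theme parks, by year.
attendance = {2000: 317, 2001: 319, 2002: 324, 2003: 322, 2004: 328, 2005: 335}


def time_series_points(series, base_year=2000):
    """Return (year number, value) points ordered by time."""
    return [(year - base_year, value) for year, value in sorted(series.items())]


def period_changes(points):
    """Return the change in value from each period to the next."""
    return [later - earlier for (_, earlier), (_, later) in zip(points, points[1:])]


points = time_series_points(attendance)
print(points)
print(period_changes(points))
```

    The time variable always goes on the horizontal axis, and the points are connected in time order. Here the changes are positive in every year except one, which is the upward pattern the plot shows.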

    Principles of Excellent Graphs

    The graph should not distort the data.

    The graph should not contain unnecessary adornments (chart junk).

    The scale on the vertical axis should begin at zero.

    The graph should contain a title and be properly labeled.

    Use the simplest possible graph.

    Graphical Errors: Chart Junk

    Example 1: Which one is a better presentation?

    [Figure: two presentations titled "Minimum Wage". One lists 1960: $1.00, 1970: $1.60, 1980: $3.10, 1990: $3.80. The other is a chart with 1960, 1970, 1980, 1990 on the horizontal axis and $ (0 to 4) on the vertical axis.]

    Graphical Errors: No Relative Basis

    Example 2: Which one is a better presentation?

    [Figure: two charts, each titled "A's received by students." One plots frequency (Freq., 0 to 300); the other plots percentage (%, 0% to 30%). Horizontal axis on both: FR, SO, JR, SR.]

    FR = Freshmen, SO = Sophomore, JR = Junior, SR = Senior

    Graphical Errors: Compressing the Vertical Axis

    Example 3: Which one is a better presentation?

    [Figure: two charts titled "Quarterly Sales", both with Q1, Q2, Q3, Q4 on the horizontal axis and $ on the vertical axis. One vertical axis runs from 0 to 200; the other runs from 0 to 50.]

    Graphical Errors: No Zero Point on the Vertical Axis

    Example 4: Which one is a better presentation?

    [Figure: two charts titled "Hang Seng Index", plotting the HSI weekly from 9/1/2008 to 12/29/2008. One vertical axis runs from 0 to 25000; the other runs from 10000 to 24000.]

    Impact of Financial Tsunami on HSI
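    The effect of dropping the zero point can be quantified with a short Python sketch. The two index levels below are hypothetical values chosen only to illustrate the arithmetic (they are not read from the charts), and `drawn_height_ratio` is an illustrative helper name.

```python
def drawn_height_ratio(value_a, value_b, axis_start=0):
    """Ratio of the plotted heights of two values above the axis baseline."""
    return (value_b - axis_start) / (value_a - axis_start)


# Hypothetical index levels, chosen only to illustrate the effect.
start_level, end_level = 20000, 12000

true_ratio = drawn_height_ratio(start_level, end_level)                   # axis starts at 0
truncated_ratio = drawn_height_ratio(start_level, end_level, axis_start=10000)

print(f"Axis from 0:     end point drawn at {true_ratio:.0%} of the start height")
print(f"Axis from 10000: end point drawn at {truncated_ratio:.0%} of the start height")
```

    Under these assumed values the index fell by 40%, but on an axis that starts at 10000 the end point is drawn at only 20% of the starting height, so the same fall looks like an 80% drop.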

    How to Sell a Lie

    Point to pictures or graphs

    Pictures (relevant or not) often alter our perceptions of truth

    Present numbers or tables

    Use words like "because"

    Tell a story