m35 chapter 4

Upload: derain123

Post on 03-Jun-2018


  • 8/12/2019 M35 Chapter 4

    1/24

    Analyzing and Summarizing Data

    Summarizing Data

    Most sets of data show a distinct tendency to group around a central value (a central tendency).

    The purpose of a measure of central tendency is to find a single value that best represents an entire distribution of scores.

    When people talk about an average value, the middle value, or the most frequent value, they are talking informally about the mean, median, and mode, the three measures of central tendency.

    TERMINOLOGY

    Central Tendency
    - the extent to which the data values group around a typical or central value

    Measures of Central Tendency
    - numerical values that locate, in some sense, the center of a set of data

    Variation
    - the amount of dispersion, or scattering, of values away from a central value

    Shape
    - the pattern of the distribution of values from the lowest value to the highest value

    IMPORTANCE

    1) To find a representative value
    It gives us one value for the distribution, and this value represents the entire distribution.

    2) To condense data
    An average converts the whole set of figures into just one figure and thus helps in condensation.

    3) To make comparisons
    To compare two or more distributions, we have to find the representative values of these distributions.

    4) Helpful in further statistical analysis
    Many techniques of statistical analysis (dispersion, skewness, correlation) are based on measures of central tendency.

    Mean
    - The average with which you are probably most familiar.
    - The sample mean is represented by $\bar{x}$ (read "x-bar" or "sample mean").
    - The mean is found by adding all the values of the variable x (this sum of x values is symbolized $\sum x$) and dividing the sum by the number of these values, n (the sample size).

    Sample Mean: $\bar{x} = \frac{\sum x}{n}$

    Population Mean: $\mu = \frac{\sum x}{N}$ (N is the population size)

    Activity:

    Typical Time It Takes To Get Ready In The Morning

    If you knew the typical time it takes you to get ready in the morning, you might be able to better plan your morning and minimize any excessive lateness (or earliness) going to your destination.

    Find the mean for the following times (in minutes) collected for 10 consecutive days.

    Day          1   2   3   4   5   6   7   8   9  10
    Time (min)  39  29  43  52  39  44  40  31  44  35

    Answer:

    Mean = 39.6 minutes

    Even though no individual day in the sample actually had the value 39.6 minutes, allotting about 40 minutes to get ready would be a good rule for planning your mornings.
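The arithmetic behind this answer can be sketched in a few lines of Python. This is only an illustration of the sample mean formula (sum of the values divided by n); the names `times` and `sample_mean` are made up for this sketch and do not come from the slides.

```python
# Getting-ready times in minutes for the 10 days in the activity.
times = [39, 29, 43, 52, 39, 44, 40, 31, 44, 35]


def sample_mean(values):
    """Return the sum of the values divided by the number of values (n)."""
    return sum(values) / len(values)


total = sum(times)              # the sum of all x values
n = len(times)                  # the sample size
mean_time = sample_mean(times)  # total divided by n

print(f"sum = {total}, n = {n}, mean = {mean_time} minutes")
```

The ten times add up to 396, so dividing by n = 10 gives the 39.6 minutes stated above.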

    What if, on Day 4, the time you spent was 102 minutes instead of 52 minutes?

    Day          1   2   3   4   5   6   7   8   9  10
    Time (min)  39  29  43 102  39  44  40  31  44  35

    Find the mean.

    Answer:

    Mean = 44.6 minutes

    The one extreme value has increased the mean from 39.6 to 44.6 minutes.

    In contrast to the original mean, which was in the middle of the data, the new mean is greater than 9 of the 10 getting-ready times.

    Because of the extreme value, the mean is no longer a good measure of central tendency.

    Ordered times with the mean shown in its position (in brackets):

    Original:            29  31  35  39  39  [39.6]  40  43  44  44  52
    With extreme value:  29  31  35  39  39  40  43  44  44  [44.6]  102
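One way to check the claim about the extreme value is to count how many observations fall below the mean in each data set. The sketch below assumes nothing beyond the two data sets given in the slides; the helper name `mean_and_count_below` is hypothetical.

```python
original = [39, 29, 43, 52, 39, 44, 40, 31, 44, 35]
with_outlier = [39, 29, 43, 102, 39, 44, 40, 31, 44, 35]  # Day 4 changed to 102


def mean_and_count_below(values):
    """Return the mean and the number of values that lie below it."""
    mean = sum(values) / len(values)
    below = sum(1 for v in values if v < mean)
    return mean, below


for label, data in [("original", original), ("with extreme value", with_outlier)]:
    mean, below = mean_and_count_below(data)
    print(f"{label}: mean = {mean}, values below the mean = {below} of {len(data)}")
```

For the original data the mean splits the values 5 and 5; with the extreme value, 9 of the 10 values lie below the mean, so it no longer sits near the middle.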

    Mean

    Use the mean to describe the middle of a set of data that does not have outliers (extreme values).

    Advantages:
    - Most popular measure in fields such as business, engineering and computer science.
    - It is unique - there is only one answer.
    - Useful when comparing sets of data.

    Disadvantages:
    - Affected by extreme values (outliers).

    Median
    - The value of the data that occupies the middle position when the data are ranked in order according to size.
    - The sample median is represented by $\tilde{x}$ (read "x-tilde" or "sample median").
    - The median is largely unaffected by extreme values, so you can use the median when extreme values are present.

    Steps in determining the Median:

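    1) Rank the data.

    2) Determine the depth of the median (the rank of the median value):

       $d(\tilde{x}) = \frac{n + 1}{2}$

    3) Determine the value of the median by counting through the ranked data to the position given by the depth. If the depth ends in .5, the median is the average of the two values on either side of that position.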
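The three steps can be sketched as a small Python function. This is one possible reading of the procedure, not a definitive implementation: the function name `median_by_depth` and the sample data are hypothetical, and averaging the two neighbouring values when the depth ends in .5 is the usual convention for an even number of values.

```python
def median_by_depth(values):
    """Find the median by ranking the data and counting to depth (n + 1) / 2."""
    if not values:
        raise ValueError("the median of an empty data set is undefined")

    ranked = sorted(values)            # Step 1: rank the data
    n = len(ranked)
    depth = (n + 1) / 2                # Step 2: depth of the median

    # Step 3: count to the depth. Ranks start at 1, list indexes start at 0.
    if depth.is_integer():             # odd n: the depth is a whole rank
        return ranked[int(depth) - 1]
    lower = ranked[int(depth) - 1]     # even n: the depth ends in .5, so take
    upper = ranked[int(depth)]         # the values just below and just above it
    return (lower + upper) / 2


# Hypothetical data, made up for illustration only.
print(median_by_depth([4, 1, 7]))      # depth 2: the 2nd ranked value
print(median_by_depth([4, 1, 7, 10]))  # depth 2.5: midway between 2nd and 3rd
```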

    Activity:

    A) Find the median for the set of data {6, 3, 8, 5, 3}.
       Ranked: 3, 3, 5, 6, 8
       Median = 5 (3rd value)

    B) Find the median of the sample 9, 6, 7, 9, 10, 8.
       Ranked: 6, 7, 8, 9, 9, 10
       Median = 8.5 (3.5th value, midway between 8 and 9)

    C) Find the median for both cases:
       a) Time (min) 29 31 35 39 39 40 43 44 44 52
          Median = 39.5 (5.5th value)
       b) Time (min) 29 31 35 39 39 40 43 44 44 102
          Median = 39.5 (5.5th value)
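Part C can be checked with Python's standard `statistics` module. The sketch below only uses the two data sets from the activity; the variable names are made up for illustration. It prints the mean next to the median so the contrast between the two measures is visible.

```python
import statistics

# The two ranked data sets from part C; only the largest value differs.
times_a = [29, 31, 35, 39, 39, 40, 43, 44, 44, 52]
times_b = [29, 31, 35, 39, 39, 40, 43, 44, 44, 102]

median_a = statistics.median(times_a)
median_b = statistics.median(times_b)
mean_a = statistics.mean(times_a)
mean_b = statistics.mean(times_b)

print(f"a) median = {median_a}, mean = {mean_a}")
print(f"b) median = {median_b}, mean = {mean_b}")
```

The median stays at 39.5 in both cases, while the mean moves from 39.6 to 44.6.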

    Median

    Use the median to describe the middle of a set of data that does have an outlier.

    Advantages:
    - Extreme values (outliers) do not affect the median as strongly as they do the mean.
    - Easy to calculate and, in some cases, can be obtained by inspection.
    - It is unique - there is only one answer.

    Disadvantages:
    - Not capable of further algebraic treatment.
    - Ranking a large number of data values can be tedious.

    Mode

    - The value of x that occurs most frequently
    - Can be used with categorical data
    - Like the median, the mode is not affected by extreme values
    - Often, there is no mode or there are several modes in a set of data
    - Distributions can be unimodal, bimodal, or multimodal

    Activity: For Categorical Data

    Find the mode.

    Flavor         f
    Vanilla       28
    Chocolate     22
    Strawberry    15
    Neapolitan     8
    Butter Pecan  12
    Rocky Road     9
    Fudge Ripple   6

    Mode: Vanilla
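For a frequency table like this one, finding the mode amounts to picking the category with the largest frequency. A minimal Python sketch, assuming the table is stored as a dictionary (the name `flavor_counts` is made up for illustration):

```python
# Frequency table from the activity: flavor -> frequency (f).
flavor_counts = {
    "Vanilla": 28,
    "Chocolate": 22,
    "Strawberry": 15,
    "Neapolitan": 8,
    "Butter Pecan": 12,
    "Rocky Road": 9,
    "Fudge Ripple": 6,
}

highest = max(flavor_counts.values())

# Collect every category that reaches the highest frequency,
# so ties would be reported as several modes.
modes = [flavor for flavor, f in flavor_counts.items() if f == highest]

print(f"highest frequency = {highest}, mode(s) = {modes}")
```

Collecting all categories that tie for the highest frequency keeps the sketch correct for bimodal or multimodal tables as well.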

    Activity: For Numerical Data

    A) Find the mode.

    Day          1   2   3   4   5   6   7   8   9  10
    Time (min)  39  29  43  52  39  44  40  31  44  35

    Mode = 39 and 44 (bimodal)

    B) The bounced check fees ($) for a sample of 10 banks are:

    26 28 20 21 22 25 18 23 15 30

    Find the mode.

    There is no mode (no value occurs more than once).
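Both parts can be sketched in Python with `collections.Counter`. The function name `modes_or_none` is hypothetical, and returning an empty list when no value repeats is a choice made here to match the "no mode" convention used in this activity.

```python
from collections import Counter


def modes_or_none(values):
    """Return the most frequent values, or [] when no value repeats."""
    if not values:
        return []
    counts = Counter(values)
    highest = max(counts.values())
    if highest == 1:        # every value occurs once: no mode
        return []
    return sorted(v for v, c in counts.items() if c == highest)


times = [39, 29, 43, 52, 39, 44, 40, 31, 44, 35]   # part A
fees = [26, 28, 20, 21, 22, 25, 18, 23, 15, 30]    # part B

print("A) modes:", modes_or_none(times))
print("B) modes:", modes_or_none(fees))
```

The standard library's `statistics.multimode` returns every value when none repeats, which is why this sketch handles the no-mode case itself.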

    Mode

    Use the mode when the data are non-numeric or when asked to choose the most popular item.

    Advantages:
    - Extreme values (outliers) do not affect the mode.

    Disadvantages:
    - Not necessarily unique - there may be more than one answer.
    - When no values repeat in the data set, there is no mode, so the measure may seem useless.
    - When there is more than one mode, it is difficult to interpret and/or compare.

    Considerations for Choosing a Measure of Central Tendency:

    For nominal variables, the mode is the only measure that can be used.

    For ordinal variables, the mode and the median may be used. The median provides more information (taking into account the ranking of categories).

    For numerical variables, the mode, median and mean may all be calculated. The mean provides the most information about the distribution, but the median is preferred if the distribution has extreme values.

    Midrange

    - The number exactly midway between the lowest-valued data value, L, and the highest-valued data value, H

    $\text{Midrange} = \frac{L + H}{2}$
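As a rough illustration, the midrange needs only the lowest and highest values. The sketch below applies it to the two getting-ready data sets used earlier in the chapter; the function name `midrange` and the variable names are made up for this example.

```python
def midrange(values):
    """Return the number midway between the lowest (L) and highest (H) value."""
    low = min(values)     # L
    high = max(values)    # H
    return (low + high) / 2


original = [39, 29, 43, 52, 39, 44, 40, 31, 44, 35]
with_outlier = [39, 29, 43, 102, 39, 44, 40, 31, 44, 35]

print("original:", midrange(original))                  # L = 29, H = 52
print("with extreme value:", midrange(with_outlier))    # L = 29, H = 102
```

Because the midrange depends only on L and H, changing the single highest value from 52 to 102 moves it from 40.5 to 65.5.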

    Activity:

    Find the mean, median, mode and midrange.

    {6, 7, 8, 9, 9, 10}

    Mean = 8.17

    Median = 8.5

    Mode = 9

    Midrange = 8
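These four answers can be reproduced with Python's standard `statistics` module. This is a sketch for checking the arithmetic only; the module has no midrange function, so that value is computed directly from the minimum and maximum.

```python
import statistics

data = [6, 7, 8, 9, 9, 10]

mean = statistics.mean(data)                 # 49 / 6
median = statistics.median(data)             # average of the 3rd and 4th values
mode = statistics.mode(data)                 # the only repeated value
midrange = (min(data) + max(data)) / 2       # (L + H) / 2

print(f"mean = {round(mean, 2)}")
print(f"median = {median}")
print(f"mode = {mode}")
print(f"midrange = {midrange}")
```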

    ASSIGNMENT: (1 whole sheet, due: THU, Dec. 12)

    1) Compute the mean, median and mode for the set of scores shown in the following frequency distribution table.

       X   f
       7   1
       6   1
       5   1
       4   1
       3   4
       2   3
       1   1

    2) Identify the circumstances where the median instead of the mean is the preferred measure of central tendency.

    3) Under what circumstances will the mean, the median, and the mode all have the same value?

    4) Under what circumstances is the mode the preferred measure of central tendency?

    5) Explain why the mean is often not a good measure of central tendency for a skewed distribution.

    6) Draw and determine the shape of the distribution when:
       a) The mean, median and mode are equal
       b) The mode is lowest, followed by the median and mean
       c) The mean is lowest, followed by the median and mode
