4b-1. descriptive statistics (part 2) standardized data standardized data percentiles and quartiles...
Post on 19-Dec-2015
4B-1
Descriptive Statistics (Part 2)
Standardized Data
Percentiles and Quartiles
Box Plots
Chapter 4B
McGraw-Hill/Irwin © 2008 The McGraw-Hill Companies, Inc. All rights reserved.
4B-3
• For any population with mean μ and standard deviation σ, the percentage of observations that lie within k standard deviations of the mean must be at least 100[1 – 1/k²].
• Developed by mathematicians Jules Bienaymé (1796-1878) and Pafnuty Chebyshev (1821-1894).
Standardized Data
Chebyshev’s Theorem
4B-4
• For k = 2 standard deviations, 100[1 – 1/2²] = 75%
• So, at least 75.0% will lie within μ ± 2σ
• For k = 3 standard deviations, 100[1 – 1/3²] = 88.9%
• So, at least 88.9% will lie within μ ± 3σ
• Although applicable to any data set, these limits tend to be too wide to be useful.
Standardized Data
Chebyshev’s Theorem
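The bound is simple enough to check numerically. The following is a minimal Python sketch; the function name `chebyshev_lower_bound` is only illustrative and does not come from the slides.

```python
def chebyshev_lower_bound(k: float) -> float:
    """Minimum percentage of observations within k standard deviations of the mean."""
    if k <= 1:
        # For k <= 1 the formula gives 0% or less, which says nothing useful.
        raise ValueError("Chebyshev's bound is only informative for k > 1")
    return 100 * (1 - 1 / k**2)


for k in (2, 3):
    print(f"k = {k}: at least {chebyshev_lower_bound(k):.1f}%")
```

Because the bound holds for any distribution, it is a guaranteed floor rather than an estimate of the actual percentage.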
4B-5
• The Empirical Rule states that for data from a normal distribution, we expect that for
  k = 1, about 68.26% will lie within μ ± 1σ
  k = 2, about 95.44% will lie within μ ± 2σ
  k = 3, about 99.73% will lie within μ ± 3σ
• The normal or Gaussian distribution was named for Karl Gauss (1777-1855).
• The normal distribution is symmetric and is also known as the bell-shaped curve.
Standardized Data
The Empirical Rule
4B-6
Note: no upper bound is given. Data values outside μ ± 3σ are rare.
• Distance from the mean is measured in terms of the number of standard deviations.
Standardized Data
The Empirical Rule
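The empirical-rule percentages can be reproduced from the standard normal distribution. This sketch uses the identity P(|Z| ≤ k) = erf(k/√2); the function name `normal_within` is an illustrative choice. The exact values for k = 1 and k = 2 round to 68.27% and 95.45%, so the 68.26% and 95.44% quoted in the rule appear to come from rounded table entries.

```python
import math


def normal_within(k: float) -> float:
    """Percentage of a normal distribution lying within k standard deviations of the mean."""
    return 100 * math.erf(k / math.sqrt(2))


for k in (1, 2, 3):
    print(f"k = {k}: about {normal_within(k):.2f}%")
```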
4B-7
• If 80 students take an exam, how many will score within 2 standard deviations of the mean?
• Assuming exam scores follow a normal distribution, the empirical rule states about 95.44% will lie within μ ± 2σ, so 95.44% × 80 ≈ 76 students will score within ± 2σ of μ.
• How many students will score more than 2 standard deviations from the mean?
Standardized Data
Example: Exam Scores
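The arithmetic for this example, including the remaining students beyond 2 standard deviations, can be sketched as follows. It assumes normally distributed scores, as the example does; the variable names are illustrative.

```python
n_students = 80
share_within_2sd = 0.9544  # empirical-rule share quoted in the example

within = share_within_2sd * n_students  # expected count within 2 standard deviations
beyond = n_students - within            # expected count more than 2 standard deviations away

print(f"within 2 standard deviations: about {round(within)} students")
print(f"beyond 2 standard deviations: about {round(beyond)} students")
```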
4B-8
• Unusual observations are those that lie beyond μ ± 2σ.
• Outliers are observations that lie beyond μ ± 3σ.
Standardized Data
Unusual Observations
4B-9
• For example, the P/E ratio data contains several large data values. Are they unusual or outliers?
7 8 8 10 10 10 10 12 13 13 13 13
13 13 13 14 14 14 15 15 15 15 15 16
16 16 17 18 18 18 18 19 19 19 19 19
20 20 20 21 21 21 22 22 23 23 23 24
25 26 26 26 26 27 29 29 30 31 34 36
37 40 41 45 48 55 68 91
Standardized Data
Unusual Observations
4B-10
• If the sample came from a normal distribution, then the Empirical Rule states
  $\bar{x} \pm 1s = 22.72 \pm 1(14.08) = (8.6, 36.8)$
  $\bar{x} \pm 2s = 22.72 \pm 2(14.08) = (-5.4, 50.9)$
  $\bar{x} \pm 3s = 22.72 \pm 3(14.08) = (-19.5, 65.0)$
Standardized Data
The Empirical Rule
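The three intervals follow directly from the sample mean 22.72 and sample standard deviation 14.08 given for the P/E data. A minimal sketch, with an illustrative helper named `limits`:

```python
mean, s = 22.72, 14.08  # sample mean and standard deviation of the P/E data


def limits(k: int) -> tuple[float, float]:
    """Lower and upper limits k standard deviations either side of the mean."""
    return (mean - k * s, mean + k * s)


for k in (1, 2, 3):
    lo, hi = limits(k)
    print(f"mean ± {k}s = ({lo:.1f}, {hi:.1f})")
```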
4B-11
[Figure: number line for the P/E data marking the mean (22.72) and the ±1s, ±2s, and ±3s limits; values beyond ±2s are labeled Unusual and values beyond ±3s are labeled Outliers.]
• Are there any unusual values or outliers? 7 8 . . . 48 55 68 91
Standardized Data
The Empirical Rule
4B-12
• A standardized variable (Z) redefines each observation in terms of the number of standard deviations from the mean.
Standardization formula for a population: $z_i = \frac{x_i - \mu}{\sigma}$
Standardization formula for a sample: $z_i = \frac{x_i - \bar{x}}{s}$
Standardized Data
Defining a Standardized Variable
4B-13
• $z_i$ tells how far away the observation is from the mean.
• For example, for the P/E data, the first value is $x_1 = 7$. The associated z value is
  $z_1 = \frac{x_1 - \bar{x}}{s} = \frac{7 - 22.72}{14.08} = -1.12$
Standardized Data
Defining a Standardized Variable
4B-14
• A negative z value means the observation is below the mean.
• Positive z means the observation is above the mean. For $x_{68} = 91$,
  $z_{68} = \frac{x_{68} - \bar{x}}{s} = \frac{91 - 22.72}{14.08} = 4.85$
Standardized Data
Defining a Standardized Variable
4B-15
• Here are the standardized z values for the P/E data:
[Table: standardized z values for the 68 P/E ratios]
• What do you conclude for these four values?
Standardized Data
Defining a Standardized Variable
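To see which P/E values stand out, the z values can be recomputed from the raw data. This is a sketch using Python's `statistics` module, with the sample standard deviation and the ±2 and ±3 cutoffs defined earlier; the names `z_score` and `flagged` are illustrative.

```python
import statistics

pe = [
    7, 8, 8, 10, 10, 10, 10, 12, 13, 13, 13, 13, 13, 13, 13, 14, 14,
    14, 15, 15, 15, 15, 15, 16, 16, 16, 17, 18, 18, 18, 18, 19, 19, 19,
    19, 19, 20, 20, 20, 21, 21, 21, 22, 22, 23, 23, 23, 24, 25, 26, 26,
    26, 26, 27, 29, 29, 30, 31, 34, 36, 37, 40, 41, 45, 48, 55, 68, 91,
]

mean = statistics.mean(pe)
s = statistics.stdev(pe)  # sample standard deviation (n - 1 divisor)


def z_score(x: float) -> float:
    """Number of sample standard deviations that x lies from the sample mean."""
    return (x - mean) / s


# Label values beyond 2 standard deviations as unusual, beyond 3 as outliers.
flagged = []
for x in pe:
    z = abs(z_score(x))
    if z > 3:
        flagged.append((x, "outlier"))
    elif z > 2:
        flagged.append((x, "unusual"))

for x, label in flagged:
    print(f"x = {x}: z = {z_score(x):.2f} ({label})")
```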
4B-16
• In Excel, use =STANDARDIZE(x, Mean, StDev) to calculate a standardized z value for a data value x.
• MegaStat calculates standardized values as well as checks for outliers.
Standardized Data
Defining a Standardized Variable
4B-17
• What do we do with outliers in a data set?
• If due to erroneous data, then discard.
• An outrageous observation (one completely outside of an expected range) is certainly invalid.
• Recognize unusual data points and outliers and their potential impact on your study.
• Research books and articles on how to handle outliers.
Standardized Data
Outliers
4B-18
• For a normal distribution, the range of values is approximately 6σ (from μ – 3σ to μ + 3σ).
• If you know the range R (high – low), you can estimate the standard deviation as σ = R/6.
• Useful for approximating the standard deviation when only R is known.
• This estimate depends on the assumption of normality.
Standardized Data
Estimating Sigma
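As a quick illustration, here is the R/6 estimate applied to the P/E data from this chapter (minimum 7, maximum 91). The function name `sigma_from_range` is illustrative. The P/E data are right-skewed rather than normal, so the agreement with the computed s = 14.08 should not be expected in general.

```python
def sigma_from_range(low: float, high: float) -> float:
    """Rough standard deviation estimate R/6, assuming approximately normal data."""
    return (high - low) / 6


estimate = sigma_from_range(7, 91)  # P/E data: smallest and largest values
print(f"estimated sigma = {estimate:.2f}")
```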
4B-19
• Percentiles are data that have been divided into 100 groups.
• For example, you score in the 83rd percentile on a standardized test. That means that 83% of the test-takers scored below you.
• Deciles are data that have been divided into 10 groups.
• Quintiles are data that have been divided into 5 groups.
• Quartiles are data that have been divided into 4 groups.
Percentiles and Quartiles
Percentiles
4B-20
• Percentiles are used to establish benchmarks for comparison purposes (e.g., health care, manufacturing and banking industries use 5, 25, 50, 75 and 90 percentiles).
• Quartiles (25, 50, and 75 percent) are commonly used to assess financial performance and stock portfolios.
• Percentiles are used in employee merit evaluation and salary benchmarking.
Percentiles and Quartiles
Percentiles
4B-21
• Quartiles are scale points that divide the sorted data into four groups of approximately equal size.
• The three values that separate the four groups are called Q1, Q2, and Q3, respectively.
Q1 Q2 Q3
Lower 25% | Second 25% | Third 25% | Upper 25%
Percentiles and Quartiles
Quartiles
4B-22
• The second quartile Q2 is the median, an important indicator of central tendency.
Q2
Lower 50% | Upper 50%
• Q1 and Q3 measure dispersion since the interquartile range Q3 – Q1 measures the degree of spread in the middle 50 percent of data values.
Q1 Q3
Lower 25% | Middle 50% | Upper 25%
Percentiles and Quartiles
Quartiles
4B-23
• The first quartile Q1 is the median of the data values below Q2, and the third quartile Q3 is the median of the data values above Q2.
Q1 Q2 Q3
Lower 25% | Second 25% | Third 25% | Upper 25%
For first half of data, 50% above, 50% below Q1.
For second half of data, 50% above, 50% below Q3.
Percentiles and Quartiles
Quartiles
4B-24
• Depending on n, the quartiles Q1, Q2, and Q3 may be members of the data set or may lie between two of the sorted data values.
Percentiles and Quartiles
Quartiles
4B-25
• For small data sets, find quartiles using the method of medians:
Step 1. Sort the observations.
Step 2. Find the median Q2.
Step 3. Find the median of the data values that lie below Q2.
Step 4. Find the median of the data values that lie above Q2.
Percentiles and Quartiles
Method of Medians
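The four steps can be sketched in Python as below. The function name `quartiles_by_medians` is illustrative. One assumption is made explicit: "below" and "above" Q2 are taken by position in the sorted list, and when n is odd the middle observation is left out of both halves. Other conventions exist and can give slightly different quartiles.

```python
import statistics


def quartiles_by_medians(values):
    """Return (Q1, Q2, Q3) using the method of medians; needs at least two values."""
    data = sorted(values)          # Step 1: sort the observations
    n = len(data)
    q2 = statistics.median(data)   # Step 2: median of all values
    lower = data[: n // 2]         # values positioned below the median
    upper = data[n // 2 + n % 2:]  # values positioned above the median
    q1 = statistics.median(lower)  # Step 3: median of the lower half
    q3 = statistics.median(upper)  # Step 4: median of the upper half
    return q1, q2, q3


print(quartiles_by_medians([1, 2, 4, 4, 8, 8, 8, 8]))
```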
4B-26
• Use Excel function =QUARTILE(Array, k) to return the kth quartile.
• Excel treats quartiles as a special case of percentiles. For example, to calculate Q3, use either
  =QUARTILE(Array, 3)
  =PERCENTILE(Array, 0.75)
• Excel calculates the quartile positions as:
  Position of Q1: 0.25n + 0.75
  Position of Q2: 0.50n + 0.50
  Position of Q3: 0.75n + 0.25
Percentiles and Quartiles
Excel Quartiles
4B-27
• Consider the following P/E ratios for 68 stocks in a portfolio.
• Use quartiles to define benchmarks for stocks that are low-priced (bottom quartile) or high-priced (top quartile).
7 8 8 10 10 10 10 12 13 13 13 13 13 13 13 14 14
14 15 15 15 15 15 16 16 16 17 18 18 18 18 19 19 19
19 19 20 20 20 21 21 21 22 22 23 23 23 24 25 26 26
26 26 27 29 29 30 31 34 36 37 40 41 45 48 55 68 91
Percentiles and Quartiles
Example: P/E Ratios and Quartiles
4B-28
• Using Excel’s method of interpolation, the quartile positions are:
Quartile | Position Formula | Interpolate Between
Q1 | 0.25(68) + 0.75 = 17.75 | X17 and X18
Q2 | 0.50(68) + 0.50 = 34.50 | X34 and X35
Q3 | 0.75(68) + 0.25 = 51.25 | X51 and X52
Percentiles and Quartiles
Example: P/E Ratios and Quartiles
4B-29
• The quartiles are:
Quartile | Formula
First (Q1) | Q1 = X17 + 0.75 (X18 – X17) = 14 + 0.75 (14 – 14) = 14
Second (Q2) | Q2 = X34 + 0.50 (X35 – X34) = 19 + 0.50 (19 – 19) = 19
Third (Q3) | Q3 = X51 + 0.25 (X52 – X51) = 26 + 0.25 (26 – 26) = 26
Percentiles and Quartiles
Example: P/E Ratios and Quartiles
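The position-and-interpolate calculation can be sketched in Python as follows. This follows the position formulas quoted for Excel (0.25n + 0.75 and so on) and is not Excel's actual implementation; the function name `interpolated_quartile` is illustrative.

```python
import math

pe = [
    7, 8, 8, 10, 10, 10, 10, 12, 13, 13, 13, 13, 13, 13, 13, 14, 14,
    14, 15, 15, 15, 15, 15, 16, 16, 16, 17, 18, 18, 18, 18, 19, 19, 19,
    19, 19, 20, 20, 20, 21, 21, 21, 22, 22, 23, 23, 23, 24, 25, 26, 26,
    26, 26, 27, 29, 29, 30, 31, 34, 36, 37, 40, 41, 45, 48, 55, 68, 91,
]


def interpolated_quartile(values, q: int) -> float:
    """Quartile q (1, 2 or 3) at 1-based position p*n + (1 - p), where p = q/4."""
    data = sorted(values)
    n = len(data)
    p = q / 4
    position = p * n + (1 - p)    # e.g. 0.25n + 0.75 for Q1
    lower = math.floor(position)  # 1-based index of the value just below the position
    fraction = position - lower
    if lower >= n:                # only happens for a single-value data set
        return float(data[-1])
    # Interpolate between the two neighbouring sorted values.
    return data[lower - 1] + fraction * (data[lower] - data[lower - 1])


for q in (1, 2, 3):
    print(f"Q{q} = {interpolated_quartile(pe, q)}")
```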
4B-30
• So, to summarize:
Lower 25% of P/E Ratios | Q1 = 14 | Second 25% of P/E Ratios | Q2 = 19 | Third 25% of P/E Ratios | Q3 = 26 | Upper 25% of P/E Ratios
• These quartiles express central tendency and dispersion. What is the interquartile range?
• Because of clustering of identical data values, these quartiles do not provide clean cut points between groups of observations.
Percentiles and Quartiles
Example: P/E Ratios and Quartiles
4B-31
Whether you use the method of medians or Excel, your quartiles will be about the same. Small differences in calculation techniques typically do not lead to different conclusions in business applications.
Percentiles and Quartiles
Tip
4B-32
• Quartiles generally resist outliers.
• However, quartiles do not provide clean cut points in the sorted data, especially in small samples with repeating data values.
Data set A: 1, 2, 4, 4, 8, 8, 8, 8 | Q1 = 3, Q2 = 6, Q3 = 8
Data set B: 0, 3, 3, 6, 6, 6, 10, 15 | Q1 = 3, Q2 = 6, Q3 = 8
• Although they have identical quartiles, these two data sets are not similar. The quartiles do not represent either data set well.
Percentiles and Quartiles
Caution
4B-33
• Some robust measures of central tendency and dispersion using quartiles are:
Statistic | Formula | Excel | Pro | Con
Midhinge | $\frac{Q_1 + Q_3}{2}$ | =0.5*(QUARTILE(Data,1)+QUARTILE(Data,3)) | Robust to presence of extreme data values. | Less familiar to most people.
Percentiles and Quartiles
Dispersion Using Quartiles
4B-34
Statistic | Formula | Excel | Pro | Con
Midspread | $Q_3 - Q_1$ | =QUARTILE(Data,3)-QUARTILE(Data,1) | Stable when extreme data values exist. | Ignores magnitude of extreme data values.
Coefficient of quartile variation (CQV) | $100 \times \frac{Q_3 - Q_1}{Q_3 + Q_1}$ | None | Relative variation in percent so we can compare data sets. | Less familiar to non-statisticians.
Percentiles and Quartiles
Dispersion Using Quartiles
4B-35
• The mean of the first and third quartiles.
  Midhinge = $\frac{Q_1 + Q_3}{2}$
• For the 68 P/E ratios,
  Midhinge = $\frac{Q_1 + Q_3}{2} = \frac{14 + 26}{2} = 20$
• A robust measure of central tendency since quartiles ignore extreme values.
Percentiles and Quartiles
Midhinge
4B-36
• A robust measure of dispersion.
  Midspread = $Q_3 - Q_1$
• For the 68 P/E ratios,
  Midspread = $Q_3 - Q_1 = 26 - 14 = 12$
Percentiles and Quartiles
Midspread (Interquartile Range)
4B-37
• Measures relative dispersion; expresses the midspread as a percent of twice the midhinge (Q3 + Q1).
  $CQV = 100 \times \frac{Q_3 - Q_1}{Q_3 + Q_1}$
• For the 68 P/E ratios,
  $CQV = 100 \times \frac{Q_3 - Q_1}{Q_3 + Q_1} = 100 \times \frac{26 - 14}{26 + 14} = 30.0\%$
• Similar to the CV, CQV can be used to compare data sets measured in different units or with different means.
Percentiles and Quartiles
Coefficient of Quartile Variation (CQV)
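The three quartile-based statistics are one-line formulas. A minimal Python sketch using the P/E quartiles Q1 = 14 and Q3 = 26 (the function names are illustrative):

```python
def midhinge(q1: float, q3: float) -> float:
    """Mean of the first and third quartiles."""
    return (q1 + q3) / 2


def midspread(q1: float, q3: float) -> float:
    """Interquartile range Q3 - Q1."""
    return q3 - q1


def cqv(q1: float, q3: float) -> float:
    """Coefficient of quartile variation, in percent."""
    return 100 * (q3 - q1) / (q3 + q1)


q1, q3 = 14, 26  # quartiles of the 68 P/E ratios
print(f"midhinge  = {midhinge(q1, q3)}")
print(f"midspread = {midspread(q1, q3)}")
print(f"CQV       = {cqv(q1, q3):.1f}%")
```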
4B-38
• A useful tool of exploratory data analysis (EDA).
• Also called a box-and-whisker plot.
• Based on a five-number summary: Xmin, Q1, Q2, Q3, Xmax
• Consider the five-number summary for the 68 P/E ratios:
  Xmin, Q1, Q2, Q3, Xmax
  7 14 19 26 91
Box Plots
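A five-number summary can be computed with Python's standard library. This sketch assumes Python 3.8 or later and uses `statistics.quantiles` with `method="inclusive"`, which interpolates at position p(n – 1) + 1, the same positions as 0.25n + 0.75, 0.50n + 0.50, and 0.75n + 0.25.

```python
import statistics

pe = [
    7, 8, 8, 10, 10, 10, 10, 12, 13, 13, 13, 13, 13, 13, 13, 14, 14,
    14, 15, 15, 15, 15, 15, 16, 16, 16, 17, 18, 18, 18, 18, 19, 19, 19,
    19, 19, 20, 20, 20, 21, 21, 21, 22, 22, 23, 23, 23, 24, 25, 26, 26,
    26, 26, 27, 29, 29, 30, 31, 34, 36, 37, 40, 41, 45, 48, 55, 68, 91,
]

# n=4 asks for the three cut points that split the data into quarters.
q1, q2, q3 = statistics.quantiles(pe, n=4, method="inclusive")
five_number_summary = (min(pe), q1, q2, q3, max(pe))
print(five_number_summary)
```

With the default `method="exclusive"` the cut points are placed differently, so the quartiles may not match these values.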
4B-39
[Figure: box plot of the P/E ratios labeling the minimum, Q1, median (Q2), Q3, and maximum, with the box and whiskers marked. The distribution is right-skewed. The center of the box is the midhinge.]
Box Plots
4B-40
• Use quartiles to detect unusual data points.
• The cutoff points are called fences and can be found using the following formulas:
 | Inner fences | Outer fences
Lower fence | Q1 – 1.5 (Q3 – Q1) | Q1 – 3.0 (Q3 – Q1)
Upper fence | Q3 + 1.5 (Q3 – Q1) | Q3 + 3.0 (Q3 – Q1)
• Values outside the inner fences are unusual while those outside the outer fences are outliers.
Box Plots
Fences and Unusual Data Values
4B-41
• For example, consider the P/E ratio data:
 | Inner fences | Outer fences
Lower fence | 14 – 1.5 (26 – 14) = –4 | 14 – 3.0 (26 – 14) = –22
Upper fence | 26 + 1.5 (26 – 14) = +44 | 26 + 3.0 (26 – 14) = +62
• Ignore the lower fence since it is negative and P/E ratios are only positive.
Box Plots
Fences and Unusual Data Values
4B-42
• Truncate the whisker at the fences and display unusual values and outliers as dots.
[Figure: box plot of the P/E ratios with the inner fence and outer fence marked; points between the two fences are labeled Unusual and points beyond the outer fence are labeled Outliers.]
• Based on these fences, there are three unusual P/E values and two outliers.
Box Plots
Fences and Unusual Data Values
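The fence calculation and the resulting classification can be checked with a short Python sketch. It takes Q1 = 14 and Q3 = 26 as given for the P/E data; the variable names are illustrative.

```python
pe = [
    7, 8, 8, 10, 10, 10, 10, 12, 13, 13, 13, 13, 13, 13, 13, 14, 14,
    14, 15, 15, 15, 15, 15, 16, 16, 16, 17, 18, 18, 18, 18, 19, 19, 19,
    19, 19, 20, 20, 20, 21, 21, 21, 22, 22, 23, 23, 23, 24, 25, 26, 26,
    26, 26, 27, 29, 29, 30, 31, 34, 36, 37, 40, 41, 45, 48, 55, 68, 91,
]

q1, q3 = 14, 26  # quartiles of the P/E data
iqr = q3 - q1

inner = (q1 - 1.5 * iqr, q3 + 1.5 * iqr)  # (lower, upper) inner fences
outer = (q1 - 3.0 * iqr, q3 + 3.0 * iqr)  # (lower, upper) outer fences

# Outliers lie beyond an outer fence; unusual values lie between inner and outer fences.
outliers = [x for x in pe if x < outer[0] or x > outer[1]]
unusual = [x for x in pe if (x < inner[0] or x > inner[1]) and x not in outliers]

print("inner fences:", inner)
print("outer fences:", outer)
print("unusual:", unusual)
print("outliers:", outliers)
```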
4B-43
• Although some information is lost, grouped data are easier to display than raw data.
• When bin limits are given, the mean and standard deviation can be estimated.
• Accuracy of grouped estimates depends on
  - the number of bins
  - distribution of data within bins
  - bin frequencies
Grouped Data
Nature of Grouped Data