data summary using descriptive measures
DESCRIPTION
Data Summary Using Descriptive Measures. Introduction to Business Statistics, 5e, Kvanli/Guynes/Pavur, (c)2000 South-Western College Publishing. Types of Descriptive Measures: Central Tendency, Variation, Position, Shape.
![Page 1: Data Summary Using Descriptive Measures](https://reader036.vdocuments.us/reader036/viewer/2022062321/56813723550346895d9eaf43/html5/thumbnails/1.jpg)
Data Summary Using Descriptive Measures
Introduction to Business Statistics, 5e
Kvanli/Guynes/Pavur
(c)2000 South-Western College Publishing
![Page 2: Data Summary Using Descriptive Measures](https://reader036.vdocuments.us/reader036/viewer/2022062321/56813723550346895d9eaf43/html5/thumbnails/2.jpg)
Types of Descriptive Measures
• Central Tendency
• Variation
• Position
• Shape
![Page 3: Data Summary Using Descriptive Measures](https://reader036.vdocuments.us/reader036/viewer/2022062321/56813723550346895d9eaf43/html5/thumbnails/3.jpg)
Measures of Central Tendency
• Mean
• Median
• Midrange
• Mode
![Page 4: Data Summary Using Descriptive Measures](https://reader036.vdocuments.us/reader036/viewer/2022062321/56813723550346895d9eaf43/html5/thumbnails/4.jpg)
The Mean
The Mean is simply the average of the data.
![Page 5: Data Summary Using Descriptive Measures](https://reader036.vdocuments.us/reader036/viewer/2022062321/56813723550346895d9eaf43/html5/thumbnails/5.jpg)
Sample Mean
$$\bar{x} = \frac{\sum x}{n}$$

Each value in the sample is represented by x. Thus, to get the mean, simply add all the values in the sample and divide by the number of values in the sample.
![Page 6: Data Summary Using Descriptive Measures](https://reader036.vdocuments.us/reader036/viewer/2022062321/56813723550346895d9eaf43/html5/thumbnails/6.jpg)
Accident Data Set
$$\bar{x} = \frac{6 + 9 + 7 + 23 + 5}{5} = \frac{50}{5} = 10.0$$
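The same calculation can be sketched in a few lines of Python. The function name `sample_mean` is an illustrative choice, not something from the slides.

```python
# Accident data set from the slide.
accidents = [6, 9, 7, 23, 5]

def sample_mean(values):
    """Add all the values and divide by the number of values."""
    return sum(values) / len(values)

mean_accidents = sample_mean(accidents)
print(mean_accidents)  # 10.0
```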
![Page 7: Data Summary Using Descriptive Measures](https://reader036.vdocuments.us/reader036/viewer/2022062321/56813723550346895d9eaf43/html5/thumbnails/7.jpg)
The Median
The Median (Md) of a set of data is the value in the center of the data values when they are arranged from lowest to highest.
![Page 8: Data Summary Using Descriptive Measures](https://reader036.vdocuments.us/reader036/viewer/2022062321/56813723550346895d9eaf43/html5/thumbnails/8.jpg)
Accident Data
Ordered array: 5, 6, 7, 9, 23
The value that has an equal number of items to the right and left is the median. Thus Md = 7
![Page 9: Data Summary Using Descriptive Measures](https://reader036.vdocuments.us/reader036/viewer/2022062321/56813723550346895d9eaf43/html5/thumbnails/9.jpg)
The Median
$$Md = \left(\frac{n + 1}{2}\right)^{\text{st}} \text{ ordered value}$$
In general if n is odd, Md is the center data value of the ordered data set.
![Page 10: Data Summary Using Descriptive Measures](https://reader036.vdocuments.us/reader036/viewer/2022062321/56813723550346895d9eaf43/html5/thumbnails/10.jpg)
Accident Data
Ordered array: 5, 6, 7, 9, 23
$$Md = \left(\frac{5 + 1}{2}\right)^{\text{st}} \text{ ordered value} = \text{3rd value}$$
![Page 11: Data Summary Using Descriptive Measures](https://reader036.vdocuments.us/reader036/viewer/2022062321/56813723550346895d9eaf43/html5/thumbnails/11.jpg)
The Median
If n is even, Md is the average of the two center values of the ordered data set.
For the ordered data set: 3, 8, 12, 14
$$Md = \frac{8 + 12}{2} = 10.0$$
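The odd and even cases can be combined in one short Python sketch. The function name `median` is an illustrative choice; the two data sets are the ones used on the slides.

```python
def median(values):
    """Center value of the ordered data (odd n) or the average
    of the two center values (even n)."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Index mid is the ((n + 1) / 2)-th ordered value, counting from 1.
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

md_odd = median([6, 9, 7, 23, 5])   # accident data, n = 5
md_even = median([3, 8, 12, 14])    # n = 4
print(md_odd, md_even)  # 7 10.0
```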
![Page 12: Data Summary Using Descriptive Measures](https://reader036.vdocuments.us/reader036/viewer/2022062321/56813723550346895d9eaf43/html5/thumbnails/12.jpg)
The Midrange
The Midrange (Mr) provides an easy-to-grasp measure of central tendency.
$$Mr = \frac{L + H}{2}$$

where L is the lowest and H is the highest value in the data set.
![Page 13: Data Summary Using Descriptive Measures](https://reader036.vdocuments.us/reader036/viewer/2022062321/56813723550346895d9eaf43/html5/thumbnails/13.jpg)
Accident Data
$$Mr = \frac{L + H}{2} = \frac{5 + 23}{2} = 14.0$$

Note that the Midrange is severely affected by outliers. Compare: $\bar{x} = 10.0$ and $Md = 7$.
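A small Python sketch makes the outlier sensitivity concrete. The function name `midrange` and the idea of recomputing without the value 23 are illustrative assumptions, not part of the slides.

```python
def midrange(values):
    """Average of the lowest and highest values."""
    return (min(values) + max(values)) / 2

accidents = [6, 9, 7, 23, 5]
mr_all = midrange(accidents)

# Hypothetical comparison: drop the outlying value 23 and recompute.
mr_without_outlier = midrange([v for v in accidents if v != 23])

print(mr_all, mr_without_outlier)  # 14.0 7.0
```

Removing a single extreme value halves the midrange here, while the median of the full data set is already 7.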
![Page 14: Data Summary Using Descriptive Measures](https://reader036.vdocuments.us/reader036/viewer/2022062321/56813723550346895d9eaf43/html5/thumbnails/14.jpg)
The Mode
The Mode (Mo) of a data set is the value that occurs more than once and the most often.
The Mode is not always a measure of central tendency; this value need not occur in the center of the data.
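As a rough sketch of this definition in Python: the function name `modes` and the small data set `[1, 1, 5, 8, 9]` are hypothetical, chosen only to show a mode that does not sit in the center of the data.

```python
from collections import Counter

def modes(values):
    """Values that occur most often, provided they occur more than once.
    Returns an empty list when no value repeats."""
    counts = Counter(values)
    if not counts:
        return []
    top = max(counts.values())
    if top < 2:
        return []
    return sorted(v for v, c in counts.items() if c == top)

no_mode = modes([6, 9, 7, 23, 5])    # accident data: no value repeats
edge_mode = modes([1, 1, 5, 8, 9])   # hypothetical data: mode at the low end
print(no_mode, edge_mode)  # [] [1]
```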
![Page 15: Data Summary Using Descriptive Measures](https://reader036.vdocuments.us/reader036/viewer/2022062321/56813723550346895d9eaf43/html5/thumbnails/15.jpg)
Level of Measurement and Measure of Central Tendency
![Page 16: Data Summary Using Descriptive Measures](https://reader036.vdocuments.us/reader036/viewer/2022062321/56813723550346895d9eaf43/html5/thumbnails/16.jpg)
Measures of Variation
• Homogeneity refers to the degree of similarity within a set of data.
• The more homogeneous a set of data is, the better the mean will represent a typical value.
• Variation is the tendency of data values to scatter about the mean, $\bar{x}$.
![Page 17: Data Summary Using Descriptive Measures](https://reader036.vdocuments.us/reader036/viewer/2022062321/56813723550346895d9eaf43/html5/thumbnails/17.jpg)
Common Measures of Variation
• Range
• Variance
• Standard Deviation
• Coefficient of Variation
![Page 18: Data Summary Using Descriptive Measures](https://reader036.vdocuments.us/reader036/viewer/2022062321/56813723550346895d9eaf43/html5/thumbnails/18.jpg)
The Range
For the Accident data:
Range = H - L = 23 - 5 = 18
![Page 19: Data Summary Using Descriptive Measures](https://reader036.vdocuments.us/reader036/viewer/2022062321/56813723550346895d9eaf43/html5/thumbnails/19.jpg)
The Variance and Standard Deviation
Both measures describe the variation of the values about the mean.
![Page 20: Data Summary Using Descriptive Measures](https://reader036.vdocuments.us/reader036/viewer/2022062321/56813723550346895d9eaf43/html5/thumbnails/20.jpg)
Accident Data
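| Data Value ($x$) | $(x - \bar{x})$ | $(x - \bar{x})^2$ |
|---|---|---|
| 5 | -5 | 25 |
| 6 | -4 | 16 |
| 7 | -3 | 9 |
| 9 | -1 | 1 |
| 23 | 13 | 169 |
| | | $\sum (x - \bar{x})^2 = 220$ |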
![Page 21: Data Summary Using Descriptive Measures](https://reader036.vdocuments.us/reader036/viewer/2022062321/56813723550346895d9eaf43/html5/thumbnails/21.jpg)
Definition: Sample Variance
$$s^2 = \frac{\sum (x - \bar{x})^2}{n - 1}$$

For the accident data:

$$s^2 = \frac{220}{5 - 1} = \frac{220}{4} = 55.0$$
![Page 22: Data Summary Using Descriptive Measures](https://reader036.vdocuments.us/reader036/viewer/2022062321/56813723550346895d9eaf43/html5/thumbnails/22.jpg)
Definition: Sample Standard Deviation
$$s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$$

$$s = \sqrt{55.0} = 7.416$$
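The sample variance and standard deviation of the accident data can be checked with a short Python sketch. The function name `sample_variance` is an illustrative choice.

```python
import math

def sample_variance(values):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / (n - 1)

accidents = [6, 9, 7, 23, 5]
s2 = sample_variance(accidents)
s = math.sqrt(s2)
print(s2, round(s, 3))  # 55.0 7.416
```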
![Page 23: Data Summary Using Descriptive Measures](https://reader036.vdocuments.us/reader036/viewer/2022062321/56813723550346895d9eaf43/html5/thumbnails/23.jpg)
Definition: Population Variance

$$\sigma^2 = \frac{\sum (x - \mu)^2}{N}$$
![Page 24: Data Summary Using Descriptive Measures](https://reader036.vdocuments.us/reader036/viewer/2022062321/56813723550346895d9eaf43/html5/thumbnails/24.jpg)
Definition: Population Standard Deviation

$$\sigma = \sqrt{\frac{\sum (x - \mu)^2}{N}}$$
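The population formulas differ from the sample formulas only in the divisor (N instead of n - 1). A minimal Python sketch follows; it treats the five accident values as an entire population purely for illustration, which is an assumption (the slides treat them as a sample), and the function name is hypothetical.

```python
import math

def population_variance(values):
    """Sum of squared deviations from the population mean, divided by N."""
    big_n = len(values)
    mu = sum(values) / big_n
    return sum((x - mu) ** 2 for x in values) / big_n

# Assumption for illustration: these five values are the whole population.
population = [6, 9, 7, 23, 5]
sigma2 = population_variance(population)
sigma = math.sqrt(sigma2)
print(sigma2, round(sigma, 3))  # 44.0 6.633
```

Dividing by N = 5 gives 44.0, compared with 55.0 when the same values are treated as a sample and divided by n - 1 = 4.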
![Page 25: Data Summary Using Descriptive Measures](https://reader036.vdocuments.us/reader036/viewer/2022062321/56813723550346895d9eaf43/html5/thumbnails/25.jpg)
The Coefficient of Variation
The Coefficient of Variation (CV) is used to compare the variation of two or more data sets where the values of the data differ greatly.
$$CV = \frac{s}{\bar{x}} \cdot 100$$
![Page 26: Data Summary Using Descriptive Measures](https://reader036.vdocuments.us/reader036/viewer/2022062321/56813723550346895d9eaf43/html5/thumbnails/26.jpg)
Example
Data Set 1: 5, 6, 7, 9, 23
Data Set 2: 5000, 6000, 7000, 9000, 23,000

Data Set 1:

$$CV = \frac{7.416}{10} \cdot 100 = 74.16$$

Data Set 2:

$$CV = \frac{7{,}416}{10{,}000} \cdot 100 = 74.16$$

Thus both data sets exhibit the same relative variation.
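The comparison of the two data sets can be reproduced with a short Python sketch. The function name `coefficient_of_variation` is an illustrative choice.

```python
import math

def coefficient_of_variation(values):
    """Sample standard deviation as a percentage of the mean."""
    n = len(values)
    mean = sum(values) / n
    s = math.sqrt(sum((x - mean) ** 2 for x in values) / (n - 1))
    return s / mean * 100

cv1 = coefficient_of_variation([5, 6, 7, 9, 23])
cv2 = coefficient_of_variation([5000, 6000, 7000, 9000, 23000])
print(round(cv1, 2), round(cv2, 2))  # 74.16 74.16
```

Multiplying every value by 1000 scales both s and the mean by the same factor, so the ratio is unchanged.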
![Page 27: Data Summary Using Descriptive Measures](https://reader036.vdocuments.us/reader036/viewer/2022062321/56813723550346895d9eaf43/html5/thumbnails/27.jpg)
Measures of Position
• Percentile (Quartile)
• Z Score
![Page 28: Data Summary Using Descriptive Measures](https://reader036.vdocuments.us/reader036/viewer/2022062321/56813723550346895d9eaf43/html5/thumbnails/28.jpg)
Percentile
The 35th Percentile (P35) is that value such that at most 35% of the data values are less than P35 and at most 65% of the data values are greater than P35.
![Page 29: Data Summary Using Descriptive Measures](https://reader036.vdocuments.us/reader036/viewer/2022062321/56813723550346895d9eaf43/html5/thumbnails/29.jpg)
Percentile: Texon Industries Data

$$n \cdot \frac{P}{100} = 50(0.35) = 17.5$$

17.5 represents the position of the 35th percentile.
![Page 30: Data Summary Using Descriptive Measures](https://reader036.vdocuments.us/reader036/viewer/2022062321/56813723550346895d9eaf43/html5/thumbnails/30.jpg)
Percentile: Location Rules
• If $n \cdot P/100$ is not a counting number, round it up, and the Pth percentile will be the value in this position of the ordered data.
• If $n \cdot P/100$ is a counting number, the Pth percentile is the average of the number in this location (of the ordered data) and the number in the next larger location.
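The two location rules can be sketched in Python. The function name `percentile` is hypothetical, and because the Texon Industries values are not reproduced in this transcript, the integers 1 through 50 are used as a stand-in data set with n = 50.

```python
import math

def percentile(values, p):
    """Pth percentile using the two location rules (0 < p < 100)."""
    if not 0 < p < 100:
        raise ValueError("p must be strictly between 0 and 100")
    ordered = sorted(values)
    position = len(ordered) * p / 100
    if not position.is_integer():
        # Not a counting number: round up and take that ordered value.
        return ordered[math.ceil(position) - 1]
    # Counting number: average this location and the next larger one.
    k = int(position)
    return (ordered[k - 1] + ordered[k]) / 2

stand_in = list(range(1, 51))             # assumed data, n = 50
p35 = percentile(stand_in, 35)            # position 17.5 rounds up to 18
q2_odd = percentile([6, 9, 7, 23, 5], 50)   # position 2.5 rounds up to 3
q2_even = percentile([3, 8, 12, 14], 50)    # position 2 is a counting number
print(p35, q2_odd, q2_even)  # 18 7 10.0
```

With these rules the 50th percentile agrees with the median computed earlier for both the odd and the even data set.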
![Page 31: Data Summary Using Descriptive Measures](https://reader036.vdocuments.us/reader036/viewer/2022062321/56813723550346895d9eaf43/html5/thumbnails/31.jpg)
Quartiles
Quartiles are merely particular percentiles that divide the data into quarters, namely:
• Q1 = 1st quartile = 25th percentile (P25)
• Q2 = 2nd quartile = 50th percentile (P50)
• Q3 = 3rd quartile = 75th percentile (P75)
![Page 32: Data Summary Using Descriptive Measures](https://reader036.vdocuments.us/reader036/viewer/2022062321/56813723550346895d9eaf43/html5/thumbnails/32.jpg)
Z Scores
• A Z score determines the relative position of any particular data value x and is based on the mean and standard deviation of the data set.
• The Z score expresses the number of standard deviations the value x is from the mean.
• A negative Z score implies that x is to the left of the mean, and a positive Z score implies that x is to the right of the mean.
![Page 33: Data Summary Using Descriptive Measures](https://reader036.vdocuments.us/reader036/viewer/2022062321/56813723550346895d9eaf43/html5/thumbnails/33.jpg)
Z Score Equation
$$z = \frac{x - \bar{x}}{s}$$
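As an illustration, the Z scores of the largest and smallest accident values can be computed in Python. The function name `z_score` is an illustrative choice; `statistics.mean` and `statistics.stdev` (the sample standard deviation) come from the Python standard library.

```python
import statistics

def z_score(x, values):
    """Number of sample standard deviations that x lies from the mean."""
    return (x - statistics.mean(values)) / statistics.stdev(values)

accidents = [6, 9, 7, 23, 5]
z_high = z_score(23, accidents)  # positive: to the right of the mean
z_low = z_score(5, accidents)    # negative: to the left of the mean
print(round(z_high, 3), round(z_low, 3))  # 1.753 -0.674
```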
![Page 34: Data Summary Using Descriptive Measures](https://reader036.vdocuments.us/reader036/viewer/2022062321/56813723550346895d9eaf43/html5/thumbnails/34.jpg)
Measures of Shape
• Skewness
• Kurtosis
![Page 35: Data Summary Using Descriptive Measures](https://reader036.vdocuments.us/reader036/viewer/2022062321/56813723550346895d9eaf43/html5/thumbnails/35.jpg)
Skewness
Skewness measures the tendency of a distribution to stretch out in a particular direction
![Page 36: Data Summary Using Descriptive Measures](https://reader036.vdocuments.us/reader036/viewer/2022062321/56813723550346895d9eaf43/html5/thumbnails/36.jpg)
Skewness
• In a symmetrical distribution, the mean, median, and mode would all be the same value, and Sk = 0 (Figure 3.7).
• A positive Sk value implies a shape that is skewed right (Figure 3.8): mode < median < mean.
• A negative Sk value implies a shape that is skewed left (Figure 3.9): mean < median < mode.
![Page 37: Data Summary Using Descriptive Measures](https://reader036.vdocuments.us/reader036/viewer/2022062321/56813723550346895d9eaf43/html5/thumbnails/37.jpg)
Figure 3.7
![Page 38: Data Summary Using Descriptive Measures](https://reader036.vdocuments.us/reader036/viewer/2022062321/56813723550346895d9eaf43/html5/thumbnails/38.jpg)
Figure 3.8
![Page 39: Data Summary Using Descriptive Measures](https://reader036.vdocuments.us/reader036/viewer/2022062321/56813723550346895d9eaf43/html5/thumbnails/39.jpg)
Figure 3.9
![Page 40: Data Summary Using Descriptive Measures](https://reader036.vdocuments.us/reader036/viewer/2022062321/56813723550346895d9eaf43/html5/thumbnails/40.jpg)
Skewness Calculation
$$Sk = \frac{3(\bar{x} - Md)}{s}$$
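Applied to the accident data, the skewness formula can be sketched in Python. The slides do not work this example, so the numeric result is a computed illustration; the function name `skewness` is hypothetical.

```python
import statistics

def skewness(values):
    """Sk = 3 * (mean - median) / sample standard deviation."""
    mean = statistics.mean(values)
    md = statistics.median(values)
    s = statistics.stdev(values)
    return 3 * (mean - md) / s

accidents = [6, 9, 7, 23, 5]
sk = skewness(accidents)
print(round(sk, 3))  # 1.214
```

The positive value is consistent with the mean (10.0) lying above the median (7), which indicates a right-skewed data set.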
![Page 41: Data Summary Using Descriptive Measures](https://reader036.vdocuments.us/reader036/viewer/2022062321/56813723550346895d9eaf43/html5/thumbnails/41.jpg)
Kurtosis
Kurtosis measures the peakedness of the distribution.
![Page 42: Data Summary Using Descriptive Measures](https://reader036.vdocuments.us/reader036/viewer/2022062321/56813723550346895d9eaf43/html5/thumbnails/42.jpg)
Chebyshev’s Inequality
• At least 75% of the data values are between $\bar{x} - 2s$ and $\bar{x} + 2s$, or, equivalently, at least 75% of the data values have a Z score value between -2 and +2.
• At least 89% of the data values are between $\bar{x} - 3s$ and $\bar{x} + 3s$.
• In general, at least $(1 - 1/k^2) \times 100\%$ of the data values lie between $\bar{x} - ks$ and $\bar{x} + ks$ for any $k > 1$.
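A brief Python sketch of the bound, checked against the accident data. Both function names are illustrative assumptions.

```python
import math

def chebyshev_bound(k):
    """Minimum percentage of values within k standard deviations (k > 1)."""
    if k <= 1:
        raise ValueError("k must be greater than 1")
    return (1 - 1 / k ** 2) * 100

def percent_within(values, k):
    """Observed percentage of values between mean - k*s and mean + k*s."""
    n = len(values)
    mean = sum(values) / n
    s = math.sqrt(sum((x - mean) ** 2 for x in values) / (n - 1))
    inside = [x for x in values if mean - k * s <= x <= mean + k * s]
    return len(inside) / n * 100

accidents = [6, 9, 7, 23, 5]
bound_2 = chebyshev_bound(2)
bound_3 = chebyshev_bound(3)
observed_2 = percent_within(accidents, 2)
print(bound_2, round(bound_3, 1), observed_2)  # 75.0 88.9 100.0
```

The inequality only guarantees a minimum, so the observed percentage (here 100%) can be well above the bound.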
![Page 43: Data Summary Using Descriptive Measures](https://reader036.vdocuments.us/reader036/viewer/2022062321/56813723550346895d9eaf43/html5/thumbnails/43.jpg)
Empirical Rule
Under the assumption of a bell-shaped population:
• Approximately 68% of the data values lie between $\bar{x} - s$ and $\bar{x} + s$.
• Approximately 95% of the data values lie between $\bar{x} - 2s$ and $\bar{x} + 2s$.
• Approximately 99.7% of the data values lie between $\bar{x} - 3s$ and $\bar{x} + 3s$.
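The three percentages can be recovered numerically if "bell shaped" is taken to mean an exactly normal population, which is an assumption stronger than the slide states. For a normal distribution, the proportion within k standard deviations of the mean is erf(k / sqrt(2)); the function name below is illustrative.

```python
import math

def normal_percent_within(k):
    """Percentage of a normal population within k standard deviations."""
    return math.erf(k / math.sqrt(2)) * 100

within = [round(normal_percent_within(k), 1) for k in (1, 2, 3)]
print(within)  # [68.3, 95.4, 99.7]
```

These values are noticeably higher than the Chebyshev minimums of 75% and 89% for k = 2 and k = 3, because the normality assumption is much more specific.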
![Page 44: Data Summary Using Descriptive Measures](https://reader036.vdocuments.us/reader036/viewer/2022062321/56813723550346895d9eaf43/html5/thumbnails/44.jpg)
Chebyshev’s versus Empirical
![Page 45: Data Summary Using Descriptive Measures](https://reader036.vdocuments.us/reader036/viewer/2022062321/56813723550346895d9eaf43/html5/thumbnails/45.jpg)
Grouped Data Approximations

$$\bar{x} \approx \frac{\sum f \cdot m}{n}$$

$$s^2 \approx \frac{\sum f \cdot m^2 - \left(\sum f \cdot m\right)^2 / n}{n - 1}$$

Where: f is the frequency of the class and m is the midpoint of the class.
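A minimal Python sketch of the grouped-data approximations. The frequency table below (three classes with midpoints 5, 15, 25 and frequencies 2, 5, 3) is entirely hypothetical, as are the function and variable names.

```python
def grouped_mean_and_variance(midpoints, frequencies):
    """Approximate the sample mean and variance from class midpoints (m)
    and class frequencies (f)."""
    n = sum(frequencies)
    sum_fm = sum(f * m for f, m in zip(frequencies, midpoints))
    sum_fm2 = sum(f * m ** 2 for f, m in zip(frequencies, midpoints))
    mean = sum_fm / n
    variance = (sum_fm2 - sum_fm ** 2 / n) / (n - 1)
    return mean, variance

# Hypothetical frequency table, for illustration only.
midpoints = [5, 15, 25]
frequencies = [2, 5, 3]
g_mean, g_var = grouped_mean_and_variance(midpoints, frequencies)
print(g_mean, round(g_var, 2))  # 16.0 54.44
```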