statistics · 2016. 7. 9.
TRANSCRIPT
Statistics
The collection, evaluation, and interpretation
of data.
Statistics
Descriptive Statistics
Describe collected data.
Inferential Statistics
Generalize and
evaluate a population
based on sample
data.
Data
Categorical or Qualitative Data
Values that possess names or labels
(color of M&Ms, breed of dog, etc.).
Numerical or Quantitative Data
Values that represent a measurable quantity
(population, number of M&Ms, number of defective parts, etc.).
Data Collection
Sampling
Random
Systematic
Stratified
Cluster
Convenience
Graphic Data Representation
Histogram: frequency distribution graph
Frequency Polygons: frequency distribution graph
Bar Chart: categorical data graph
Pie Chart: categorical data graph (%)
Measures of Central Tendency
Mean
Arithmetic average.
Sum of all data values divided by the
number of data values within the array.
\( \bar{x} = \frac{\sum x}{n} \)
Most frequently used measure of central
tendency.
Strongly influenced by outliers (very large
or very small values).
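Measures of Central Tendency
Mean
Determine the mean value of
48, 63, 62, 49, 58, 2, 63, 5, 60, 59, 55
\( \bar{x} = \frac{\sum x}{n} \)
\( \bar{x} = \frac{48 + 63 + 62 + 49 + 58 + 2 + 63 + 5 + 60 + 59 + 55}{11} \)
\( \bar{x} = \frac{524}{11} \)
\( \bar{x} = 47.64 \)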
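The mean calculation above can be checked with a few lines of Python. This is only a sketch; the variable names are illustrative and do not come from the slides.

```python
# Data array from the mean example.
data = [48, 63, 62, 49, 58, 2, 63, 5, 60, 59, 55]

total = sum(data)   # sum of all data values
n = len(data)       # number of data values in the array
mean = total / n    # arithmetic average

print(total, n, round(mean, 2))
```

The two low values (2 and 5) pull the mean well below most of the other values, which illustrates the point about outliers.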
Measures of Central Tendency
Mean
Fun Facts:
Average (mean) height of NBA player: 6 feet, 7.62 inches
Average (mean) cost of a wedding: $25,200
Average (mean) U.S. cost of a prom: $978
Average (mean) life of a cell phone: 24 months
Measures of Central Tendency
Median
Data value that divides a data array into
two equal groups.
Data values must be ordered from lowest
to highest.
Useful in situations with skewed data
and outliers (e.g., wealth management).
Measures of Central Tendency
Median
Determine the median value of
48, 63, 62, 49, 58, 2, 63, 5, 60, 59, 55
Organize the data array from lowest to
highest value.
2, 5, 48, 49, 55, 58, 59, 60, 62, 63, 63
Select the data value that splits the data set
evenly.
Median = 58
What if the data array had an even number of
values?
5, 48, 49, 55, 58, 59, 60, 62, 63, 63
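A short Python sketch of both cases follows. For an even number of values it assumes the usual convention of averaging the two middle values, which the slide leaves as an open question. The variable names are illustrative.

```python
import statistics

data = [48, 63, 62, 49, 58, 2, 63, 5, 60, 59, 55]

# Odd number of values: order the data and take the middle one.
ordered = sorted(data)
median_odd = ordered[len(ordered) // 2]

# Even number of values: drop the lowest value (2) to leave ten values,
# then average the two middle values (assumed convention).
even_data = ordered[1:]
mid = len(even_data) // 2
median_even = (even_data[mid - 1] + even_data[mid]) / 2

print(median_odd, median_even)
# The standard library follows the same convention.
print(statistics.median(data), statistics.median(even_data))
```

With the ten-value array the two middle values are 58 and 59, so the median falls halfway between them.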
Measures of Central Tendency
Median
Fun Facts:
Median U.S. household income: $51,939
Median home price in San Francisco: $1,000,000
Measures of Central Tendency
Mode
Most frequently occurring response within a
data array.
Usually the highest point of curve.
May not be typical.
May not exist at all.
Modal, bimodal, and multimodal.
Measures of Central Tendency
Mode
Determine the mode of
48, 63, 62, 49, 58, 2, 63, 5, 60, 59, 55
Mode = 63
Determine the mode of
48, 63, 62, 59, 58, 2, 63, 5, 60, 59, 55
Mode = 63 & 59 (Bimodal)
Determine the mode of
48, 63, 62, 59, 48, 2, 63, 5, 60, 59, 55
Mode = 63, 59, & 48 (Multimodal)
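The three mode examples can be reproduced with `statistics.multimode` (available in Python 3.8 and later), which returns every value tied for the highest count. The list names below are illustrative.

```python
import statistics

one_mode = [48, 63, 62, 49, 58, 2, 63, 5, 60, 59, 55]
two_modes = [48, 63, 62, 59, 58, 2, 63, 5, 60, 59, 55]
three_modes = [48, 63, 62, 59, 48, 2, 63, 5, 60, 59, 55]

# multimode returns a list of all most-frequent values,
# in the order they first appear in the data.
modes_1 = statistics.multimode(one_mode)
modes_2 = statistics.multimode(two_modes)
modes_3 = statistics.multimode(three_modes)

print(modes_1, modes_2, modes_3)
```

A result of length one corresponds to a single mode, length two to bimodal data, and three or more to multimodal data.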
Data Variation
Measure of data scatter.
Range: Difference between the lowest and
highest data value.
Standard Deviation: Square root of the variance.
Range
Calculate by subtracting the lowest value
from the highest value.
\( R = h - l \)
Calculate the range for the data array.
2, 5, 48, 49, 55, 58, 59, 60, 62, 63, 63
\( R = h - l \)
\( R = 63 - 2 \)
\( R = 61 \)
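As a quick check, the range calculation is a one-liner in Python. The names `highest`, `lowest`, and `data_range` are illustrative.

```python
data = [2, 5, 48, 49, 55, 58, 59, 60, 62, 63, 63]

highest = max(data)            # h
lowest = min(data)             # l
data_range = highest - lowest  # R = h - l

print(data_range)
```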
Standard Deviation – Sample vs. Population
Sample standard deviation:
\( s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}} \)
In practice, only the sample standard deviation can be measured, so it is more
useful for applications.
Population Standard Deviation
A population standard deviation is a parameter, not a statistic. It gives researchers
the amount of dispersion in the data for an entire population, such as every
respondent to a survey.
Sample Standard Deviation
The standard deviation of a sample estimates the standard deviation of a population based
on a random sample. Unlike the population standard deviation, the sample standard
deviation is a statistic; it measures the dispersion of the data around the sample
mean.
Sample Standard Deviation
\( s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}} \)
\( s \) plays the role of \( \sigma \) for a sample, not a population.
1. Calculate the mean, \( \bar{x} \).
2. Subtract the mean from each value and then
square it.
3. Sum all squared differences.
4. Divide the summation by the number of
values in the array minus 1.
5. Calculate the square root of the result.
Sample Standard Deviation
\( s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}} \)
Calculate the standard
deviation for the data array.
2, 5, 48, 49, 55, 58, 59, 60, 62, 63, 63
1. \( \bar{x} = \frac{\sum x}{n} = \frac{524}{11} = 47.64 \)
2. \( (x - \bar{x})^2 \)
\( (2 - 47.64)^2 = 2083.01 \)
\( (5 - 47.64)^2 = 1818.17 \)
\( (48 - 47.64)^2 = 0.13 \)
\( (49 - 47.64)^2 = 1.85 \)
\( (55 - 47.64)^2 = 54.17 \)
\( (58 - 47.64)^2 = 107.33 \)
\( (59 - 47.64)^2 = 129.05 \)
\( (60 - 47.64)^2 = 152.77 \)
\( (62 - 47.64)^2 = 206.21 \)
\( (63 - 47.64)^2 = 235.93 \)
\( (63 - 47.64)^2 = 235.93 \)
Sample Standard Deviation
\( s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}} \)
Calculate the standard
deviation for the data array.
2, 5, 48, 49, 55, 58, 59, 60, 62, 63, 63
3. \( \sum (x - \bar{x})^2 \)
= 2083.01 + 1818.17 + 0.13 + 1.85 + 54.17 + 107.33
+ 129.05 + 152.77 + 206.21 + 235.93 + 235.93
= 5,024.55
4. \( \frac{\sum (x - \bar{x})^2}{n - 1} = \frac{5,024.55}{10} = 502.46 \)
5. \( s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}} = \sqrt{502.46} \)
\( s = 22.42 \)
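The five steps of the sample standard deviation can be sketched in Python as follows. The variable names are illustrative. The code keeps the mean unrounded, so the intermediate sum differs slightly from the slide's 5,024.55, which was computed from the rounded mean 47.64. The final result still rounds to the same value.

```python
import math
import statistics

data = [2, 5, 48, 49, 55, 58, 59, 60, 62, 63, 63]

n = len(data)
mean = sum(data) / n                             # step 1: mean
squared_diffs = [(x - mean) ** 2 for x in data]  # step 2: squared differences
sum_sq = sum(squared_diffs)                      # step 3: sum them
variance = sum_sq / (n - 1)                      # step 4: divide by n - 1
s = math.sqrt(variance)                          # step 5: square root

print(round(s, 2))
# Cross-check against the standard library's sample standard deviation.
print(math.isclose(s, statistics.stdev(data)))
```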
Population Standard Deviation
Calculate the population standard
deviation for the data array
2, 5, 48, 49, 55, 58, 59, 60, 62, 63, 63
1. Calculate the mean
\( \bar{x} = \frac{524}{11} \)
2. Subtract the mean from each data value and square each
difference
3. Sum all squared differences
2082.6777 + 1817.8595 + 0.1322 + 1.8595 + 54.2231 +
107.4050 + 129.1322 + 152.8595 + 206.3140
+ 236.0413 + 236.0413
= 5,024.5455
Note that this is the sum of the
unrounded squared differences.
4. Divide the summation by the number of data values
5. Calculate the square root of the result
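Steps 4 and 5 of the population calculation are stated without numeric results. A minimal Python sketch that carries them through, with illustrative variable names, is:

```python
import math
import statistics

data = [2, 5, 48, 49, 55, 58, 59, 60, 62, 63, 63]

N = len(data)
mean = sum(data) / N                         # step 1
sum_sq = sum((x - mean) ** 2 for x in data)  # steps 2 and 3 (unrounded)
pop_variance = sum_sq / N                    # step 4: divide by N, not N - 1
sigma = math.sqrt(pop_variance)              # step 5

print(round(sum_sq, 4), round(sigma, 2))
# Cross-check against the standard library's population standard deviation.
print(math.isclose(sigma, statistics.pstdev(data)))
```

Under these steps the population standard deviation comes out near 21.37, slightly smaller than the sample value of 22.42, because the sum is divided by 11 rather than 10.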
Population Standard Deviation
Find the mean and the population standard
deviation for each group of numbers

Group           x̄    σ
0, 0, 14, 14    7    7
0, 6, 8, 14     7    5
6, 6, 8, 8      7    1
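The three groups share the same mean but differ in spread, which a short Python check makes concrete. The names `groups` and `results` are illustrative.

```python
import statistics

groups = [
    [0, 0, 14, 14],
    [0, 6, 8, 14],
    [6, 6, 8, 8],
]

# For each group, compute the mean and the population standard deviation.
results = [(statistics.mean(g), statistics.pstdev(g)) for g in groups]

for group, (mean, sigma) in zip(groups, results):
    print(group, mean, sigma)
```

The further the values sit from the shared mean of 7, the larger the standard deviation.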
Graphing Frequency Distribution
Numerical assignment of each outcome of a
chance experiment.
A coin is tossed 3 times. Assign the variable
X to represent the number of heads
occurring in the three tosses.

Toss Outcome    X Value
HHH             3
HHT             2
HTH             2
THH             2
HTT             1
THT             1
TTH             1
TTT             0

X = 1 when?
HTT, THT, TTH
Graphing Frequency Distribution
The calculated likelihood that an outcome
variable will occur within an experiment.
\( P_x = \frac{F_x}{F_a} \)

x    P(x)
0    \( P_0 = \frac{1}{8} \)
1    \( P_1 = \frac{3}{8} \)
2    \( P_2 = \frac{3}{8} \)
3    \( P_3 = \frac{1}{8} \)
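The probabilities for the three-toss experiment can be derived by enumerating the outcomes. The sketch below assumes a fair coin, so that all eight outcomes are equally likely. The slides do not state this assumption explicitly. The variable names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Enumerate all outcomes of three tosses: HHH, HHT, ..., TTT.
outcomes = ["".join(tosses) for tosses in product("HT", repeat=3)]

# X is the number of heads in each outcome; count how often each X occurs.
heads = Counter(outcome.count("H") for outcome in outcomes)

# Probability of each X value, assuming equally likely outcomes (fair coin).
dist = {x: Fraction(heads[x], len(outcomes)) for x in sorted(heads)}

for x, p in dist.items():
    print(x, p)
```

Using `Fraction` keeps the results as exact eighths rather than decimals, and the probabilities sum to 1.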
Graphing Frequency Distribution
Histogram of P(x) against x.
Histogram
Available airplane passenger seats one week
before departure.
[Figure: histogram; x-axis: open seats; y-axis: percent of the time]
What information does
the histogram provide
the airline carriers?
What information
does the histogram
provide prospective
customers?