What does Statistics Mean?
• Descriptive statistics
  – Number of people
  – Trends in employment
  – Data
• Inferential statistics
  – Make an inference about a population from a sample
Population Parameter Versus Sample Statistics
Population Parameter
• Variables in a population
• Measured characteristics of a population
• Greek lower-case letters as notation
Sample Statistics
• Variables in a sample
• Measures computed from data
• English letters for notation
Making Data Usable
• Frequency distributions
• Proportions
• Central tendency
  – Mean
  – Median
  – Mode
• Measures of dispersion
Frequency Distribution of Deposits
Amount               Frequency (number of people making deposits in each range)
less than $3,000       499
$3,000 - $4,999        530
$5,000 - $9,999        562
$10,000 - $14,999      718
$15,000 or more        811
Total                3,120
Percentage Distribution of Amounts of Deposits
Amount               Percent
less than $3,000        16
$3,000 - $4,999         17
$5,000 - $9,999         18
$10,000 - $14,999       23
$15,000 or more         26
Total                  100
Probability Distribution of Amounts of Deposits
Amount               Probability
less than $3,000        .16
$3,000 - $4,999         .17
$5,000 - $9,999         .18
$10,000 - $14,999       .23
$15,000 or more         .26
Total                  1.00
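The percentage and probability distributions are the frequency distribution divided by its total. A minimal Python sketch of that conversion, using the deposit counts from the frequency table (the variable names are illustrative only):

```python
# Deposit counts per amount range, taken from the frequency table.
frequencies = {
    "less than $3,000": 499,
    "$3,000 - $4,999": 530,
    "$5,000 - $9,999": 562,
    "$10,000 - $14,999": 718,
    "$15,000 or more": 811,
}

total = sum(frequencies.values())  # 3,120 depositors in all

# Probability = share of all depositors falling in each range.
probabilities = {label: count / total for label, count in frequencies.items()}

# Percentage = probability times 100, rounded to a whole percent.
percentages = {label: round(100 * p) for label, p in probabilities.items()}

for label, count in frequencies.items():
    print(f"{label:<20}{count:>6}{percentages[label]:>5}%{probabilities[label]:>7.2f}")
```

The rounded percentages reproduce the 16, 17, 18, 23, 26 column and add up to 100.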
Measures of Central Tendency
• Mean - arithmetic average
  – \(\mu\), population; \(\bar{X}\), sample
• Median - midpoint of the distribution
• Mode - the value that occurs most often
Population Mean
\[ \mu = \frac{\sum X_i}{N} \]
Sample Mean
\[ \bar{X} = \frac{\sum X_i}{n} \]
Number of Sales Calls Per Day by Salespersons
Salesperson    Number of sales calls
Mike            4
Patty           3
Billie          2
Bob             5
John            3
Frank           3
Chuck           1
Samantha        5
Total          26
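The three measures of central tendency can be checked against the sales-call data. This is a small sketch using Python's standard `statistics` module; the dictionary name is an illustrative choice:

```python
import statistics

# Sales calls per day for each salesperson, from the table.
calls = {
    "Mike": 4, "Patty": 3, "Billie": 2, "Bob": 5,
    "John": 3, "Frank": 3, "Chuck": 1, "Samantha": 5,
}
values = list(calls.values())

mean = statistics.mean(values)      # sum of calls (26) divided by n (8)
median = statistics.median(values)  # midpoint of the sorted values
mode = statistics.mode(values)      # value that occurs most often

print(mean, median, mode)
```

With eight salespersons and 26 calls in total, the mean is 3.25, while the median and the mode are both 3.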
Sales for Products A and B, Both Average 200
Product A    Product B
196          150
198          160
199          176
199          181
200          192
200          200
200          201
201          202
201          213
201          224
202          240
202          261
Measures of Dispersion or Spread
• Range
• Mean absolute deviation
• Variance
• Standard deviation
The Range as a Measure of Spread
• The range is the distance between the smallest and the largest value in the set.
• Range = largest value – smallest value
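As a quick illustration, the range can be computed for the Product A and Product B sales figures, which share nearly the same average but differ widely in spread. A minimal sketch (the function name `value_range` is illustrative):

```python
# Sales figures for Products A and B, from the sales table.
product_a = [196, 198, 199, 199, 200, 200, 200, 201, 201, 201, 202, 202]
product_b = [150, 160, 176, 181, 192, 200, 201, 202, 213, 224, 240, 261]


def value_range(values):
    """Range = largest value - smallest value."""
    return max(values) - min(values)


range_a = value_range(product_a)
range_b = value_range(product_b)
print(range_a, range_b)
```

Product A spans only 6 units while Product B spans 111, even though both average about 200.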
Deviation Scores
• The differences between each observation value and the mean:
\[ d_i = x_i - \bar{x} \]
Low Dispersion Versus High Dispersion
[Figure: two frequency histograms. x-axis: Value on Variable (150 to 210); y-axis: Frequency (1 to 5). One panel shows low dispersion, the other high dispersion.]
Average Deviation
\[ \frac{\sum (X_i - \bar{X})}{n} = 0 \]
Mean Squared Deviation
\[ \frac{\sum (X_i - \bar{X})^2}{n} \]
The Variance
Population: \(\sigma^2\)
Sample: \(S^2\)
Variance
\[ S^2 = \frac{\sum (X_i - \bar{X})^2}{n - 1} \]
Variance
• The variance is given in squared units
• The standard deviation is the square root of variance:
Sample Standard Deviation
\[ S = \sqrt{\frac{\sum (X_i - \bar{X})^2}{n - 1}} \]
Population Standard Deviation
\[ \sigma = \sqrt{\sigma^2} \]
Sample Standard Deviation
\[ S = \sqrt{S^2} \]
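The dispersion measures can be traced step by step on the eight sales-call counts (4, 3, 2, 5, 3, 3, 1, 5). The sketch below is illustrative only; it follows the formulas as written, dividing by n for the mean squared deviation and by n - 1 for the sample variance:

```python
import math
import statistics

# Sales calls per day for the eight salespersons.
calls = [4, 3, 2, 5, 3, 3, 1, 5]
n = len(calls)
mean = sum(calls) / n

# Deviation scores: each observation minus the mean.
deviations = [x - mean for x in calls]

# The deviations cancel out, so their average is zero.
average_deviation = sum(deviations) / n

# Squaring removes the signs before averaging.
sum_of_squares = sum(d ** 2 for d in deviations)
mean_squared_deviation = sum_of_squares / n

# Sample variance divides by n - 1; its square root is the standard deviation.
sample_variance = sum_of_squares / (n - 1)
sample_std = math.sqrt(sample_variance)

print(average_deviation, mean_squared_deviation)
print(round(sample_variance, 4), round(sample_std, 4))
```

The sample variance agrees with `statistics.variance(calls)`, which also uses the n - 1 divisor.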
The Normal Distribution
• Normal curve
• Bell shaped
• Almost all of its values are within plus or minus 3 standard deviations
• I.Q. is an example
[Figure: Normal Distribution. Bell curve centered on the mean; areas under the curve, left to right: 2.14%, 13.59%, 34.13%, 34.13%, 13.59%, 2.14%.]
[Figure: Normal Curve: IQ Example. x-axis marked at 70, 85, 100, 115, 145.]
Standardized Normal Distribution
• Symmetrical about its mean
• Mean identifies highest point
• Infinite number of cases - a continuous distribution
• Total area under the curve equals a probability of 1.0
Standard Normal Curve
• Mean of zero, standard deviation of 1
• The curve is bell-shaped or symmetrical
• About 68% of the observations will fall within 1 standard deviation of the mean
• About 95% of the observations will fall within approximately 2 (1.96) standard deviations of the mean
• Almost all of the observations will fall within 3 standard deviations of the mean
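These coverage figures can be verified numerically. For a standard normal variable, the probability of falling within k standard deviations of the mean equals erf(k / sqrt(2)). A short sketch using only Python's `math` module (the helper name `coverage` is illustrative):

```python
import math


def coverage(k):
    """P(-k <= Z <= k) for a standard normal Z, via the error function."""
    return math.erf(k / math.sqrt(2))


for k in (1, 1.96, 3):
    print(f"within {k} standard deviations: {coverage(k):.4f}")
```

The three values come out near 0.68, 0.95 and 0.997, matching the "about 68%", "about 95%" and "almost all" statements.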
[Figure: A Standardized Normal Curve. Horizontal z axis marked at -2, -1, 0, 1, 2.]
The Standardized Normal is the Distribution of Z
[Figure: standardized normal curve with the z axis running from –z to +z.]
\[ z = \frac{x - \mu}{\sigma} \]
Standardized Scores
\[ z = \frac{x - \mu}{\sigma} \]
Standardized Values
• Used to compare an individual value to the population mean in units of the standard deviation
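A standardized value is the distance from the mean divided by the standard deviation. The sketch below applies this to the IQ example, assuming a mean of 100 and a standard deviation of 15; the standard deviation is an assumption read off the 85, 100, 115 spacing of the IQ curve, not a figure stated in the text:

```python
def z_score(x, mu, sigma):
    """Standardized value: distance from the mean in standard deviation units."""
    return (x - mu) / sigma


IQ_MEAN = 100  # center of the IQ curve
IQ_SD = 15     # assumption: inferred from the 85 / 100 / 115 axis marks

for iq in (70, 85, 100, 115):
    print(iq, z_score(iq, IQ_MEAN, IQ_SD))
```

Under these assumptions an IQ of 115 sits one standard deviation above the mean (z = 1) and an IQ of 70 sits two below it (z = -2).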
Linear Transformation of Any Normal Variable Into a Standardized Normal Variable
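[Figure: a normal variable X mapped onto the standardized scale (-2, -1, 0, 1, 2). Sometimes the scale is stretched; sometimes the scale is shrunk.]
\[ z = \frac{X - \mu}{\sigma} \]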
• Population distribution
• Sample distribution
• Sampling distribution
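Population Distribution
[Figure: population distribution curve over x, with mean \(\mu\) and standard deviation \(\sigma\).]
Sample Distribution
[Figure: sample distribution curve over X, with mean \(\bar{X}\) and standard deviation \(S\).]
Sampling Distribution
[Figure: sampling distribution curve over \(\bar{X}\), with mean \(\mu_{\bar{X}}\) and standard deviation \(S_{\bar{X}}\).]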
Standard Error of the Mean
• Standard deviation of the sampling distribution
Central Limit Theorem
The theory that, as sample size increases, the distribution of sample means of size n, randomly selected, approaches a
normal distribution.
Standard Error of the Mean
\[ S_{\bar{X}} = \frac{\sigma}{\sqrt{n}} \]
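A small simulation can make the central limit theorem and the standard error concrete. The sketch below is illustrative only: it assumes a skewed exponential population with mean 1 and standard deviation 1 (a choice made for the demonstration, not taken from the slides), draws many samples of size n, and compares the spread of the sample means with the population standard deviation divided by the square root of n:

```python
import math
import random
import statistics

random.seed(0)  # fixed seed so repeated runs give the same draws

POP_SD = 1.0  # an exponential population with rate 1 has mean 1 and SD 1


def sample_mean(n):
    """Mean of n random draws from the exponential population."""
    return statistics.fmean([random.expovariate(1.0) for _ in range(n)])


n = 30
means = [sample_mean(n) for _ in range(2000)]

mean_of_means = statistics.fmean(means)  # should sit close to the population mean
sd_of_means = statistics.stdev(means)    # empirical standard error of the mean
theoretical_se = POP_SD / math.sqrt(n)   # population SD divided by sqrt(n)

print(round(mean_of_means, 3), round(sd_of_means, 3), round(theoretical_se, 3))
```

Although the population is far from normal, the sample means cluster around 1 with a spread close to the theoretical standard error of about 0.18.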
Distribution    Mean                  Standard Deviation
Population      \(\mu\)               \(\sigma\)
Sample          \(\bar{X}\)           \(S\)
Sampling        \(\mu_{\bar{X}}\)     \(S_{\bar{X}}\)
Estimation of Parameter
• Point estimates
• Confidence interval estimates
\[ \mu = \bar{X} \pm \text{a small sampling error} \]
Confidence Interval
\[ \text{small sampling error} = Z_{cl} S_{\bar{X}} \]
\[ E = Z_{cl} S_{\bar{X}} \]
\[ \mu = \bar{X} \pm E \]
Estimating the Standard Error of the Mean
\[ S_{\bar{X}} = \frac{S}{\sqrt{n}} \]
\[ \mu = \bar{X} \pm Z_{cl} \frac{S}{\sqrt{n}} \]
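The interval estimate can be sketched in a few lines of Python. The inputs below (a sample mean of 50, a sample standard deviation of 10 and a sample of 100) are hypothetical values chosen only to keep the arithmetic easy to follow; they do not come from the slides:

```python
import math


def confidence_interval(sample_mean, s, n, z=1.96):
    """Return (low, high) for the mean; z=1.96 corresponds to 95% confidence."""
    standard_error = s / math.sqrt(n)  # estimated standard error of the mean
    e = z * standard_error             # the "small sampling error" E
    return sample_mean - e, sample_mean + e


# Hypothetical inputs for illustration only.
low, high = confidence_interval(sample_mean=50.0, s=10.0, n=100)
print(round(low, 2), round(high, 2))
```

With these inputs the standard error is 1.0, so E is 1.96 and the interval runs from 48.04 to 51.96.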
Random Sampling Error and Sample Size are Related
Sample Size
• Variance (standard deviation)
• Magnitude of error
• Confidence level
Sample Size Formula
\[ n = \left( \frac{zs}{E} \right)^2 \]
\[ E = Z_{cl} \frac{S}{\sqrt{n}} \]
\[ \sqrt{n} = \frac{Z_{cl} S}{E} \]
Sample Size Formula - Example
Suppose a survey researcher, studying expenditures on lipstick, wishes to have a 95 percent confidence level (Z) and a range of error (E) of less than $2.00. The estimate of the standard deviation is $29.00.
\[ n = \left( \frac{zs}{E} \right)^2 = \left( \frac{(1.96)(29.00)}{2.00} \right)^2 = \left( \frac{56.84}{2.00} \right)^2 = (28.42)^2 = 808 \]
Sample Size Formula - Example
Suppose that, in the same example as before, a range of error (E) of $4.00 is acceptable. The required sample size is then reduced.
\[ n = \left( \frac{zs}{E} \right)^2 = \left( \frac{(1.96)(29.00)}{4.00} \right)^2 = \left( \frac{56.84}{4.00} \right)^2 = (14.21)^2 = 202 \]
Sample Size Formula - Example
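Calculating Sample Size: 99% Confidence
\[ n = \left( \frac{(2.57)(29)}{2} \right)^2 = \left( \frac{74.53}{2} \right)^2 = (37.265)^2 = 1389 \]
\[ n = \left( \frac{(2.57)(29)}{4} \right)^2 = \left( \frac{74.53}{4} \right)^2 = (18.6325)^2 = 347 \]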
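The four lipstick calculations can be reproduced with one small function. This is a sketch rather than a general-purpose tool: it rounds to the nearest whole respondent to match the worked figures, whereas rounding up would be the more conservative choice in practice.

```python
def sample_size(z, s, e):
    """n = (z * s / E)^2, before rounding to a whole number of respondents."""
    return (z * s / e) ** 2


S = 29.00  # estimated standard deviation of lipstick expenditures, in dollars

# (confidence level, z value, range of error E in dollars)
cases = [
    ("95%", 1.96, 2.00),
    ("95%", 1.96, 4.00),
    ("99%", 2.57, 2.00),
    ("99%", 2.57, 4.00),
]

results = [round(sample_size(z, S, e)) for _, z, e in cases]

for (level, z, e), n in zip(cases, results):
    print(f"{level} confidence, E = ${e:.2f}: n = {n}")
```

Doubling the acceptable error from $2.00 to $4.00 cuts the required sample to about a quarter, because E enters the formula squared.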
Standard Error of the Proportion
\[ S_p = \sqrt{\frac{pq}{n}} \quad \text{or} \quad S_p = \sqrt{\frac{p(1 - p)}{n}} \]
Confidence Interval for a Proportion
\[ p \pm Z_{cl} S_p \]
Sample Size for a Proportion
\[ n = \frac{Z^2 pq}{E^2} \]
\[ n = \frac{Z^2 pq}{E^2} \]
Where:
n = number of items in the sample
\(Z^2\) = the square of the confidence level in standard error units
p = estimated proportion of successes
q = (1 - p), or the estimated proportion of failures
\(E^2\) = the square of the maximum allowance for error between the true proportion and the sample proportion, or \(Z S_p\) squared
Calculating Sample Size at the 95% Confidence Level
p = .6
q = .4
\[ n = \frac{(1.96)^2 (.6)(.4)}{(.035)^2} = \frac{(3.8416)(.24)}{.001225} = \frac{.922}{.001225} = 753 \]
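The proportion calculation can be checked in Python, along with a round trip through the standard error of the proportion to confirm that a sample of this size gives back the intended error of about .035. The function name is illustrative:

```python
import math


def sample_size_proportion(z, p, e):
    """n = z^2 * p * q / E^2, with q = 1 - p."""
    q = 1 - p
    return z ** 2 * p * q / e ** 2


Z = 1.96   # 95% confidence level
P = 0.6    # estimated proportion of successes
E = 0.035  # maximum allowance for error

n = round(sample_size_proportion(Z, P, E))

# Round trip: standard error of the proportion and the resulting error at this n.
s_p = math.sqrt(P * (1 - P) / n)
margin = Z * s_p

print(n, round(s_p, 4), round(margin, 4))
```

The sample size comes to 753, and the confidence interval for that sample is roughly the proportion plus or minus .035.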