top 10 concepts of statistics
TRANSCRIPT
-
7/30/2019 Top 10 concepts of Statistics
1/111
Review of Top 10 Concepts in Statistics
(reordered slightly for review in the interactive session)
NOTE: This PowerPoint file is not an introduction, but rather a checklist of topics to review
-
Top Ten #10
Qualitative vs. Quantitative
-
Qualitative
Categorical data:
success vs. failure
ethnicity
marital status
color
zip code
4-star hotel in tour guide
-
Qualitative
If you need an average, do not calculate the mean
However, you can compute the mode (average person is married, buys a blue car made in America)
-
Quantitative
Two cases
Case 1: discrete
Case 2: continuous
-
Discrete
(1) integer values (0, 1, 2, ...)
(2) example: binomial
(3) finite (or countable) number of possible values
(4) counting
(5) number of brothers
(6) number of cars arriving at gas station
-
Continuous
Real numbers, such as decimal values ($22.22)
Examples: Z, t
Infinite number of possible values
Measurement
Miles per gallon, distance, duration of time
-
Graphical Tools
Pie chart or bar chart: qualitative
Joint frequency table: qualitative (relate marital status vs zip code)
Scatter diagram: quantitative (distance from CSUN vs duration of time to reach CSUN)
-
Hypothesis Testing
Confidence Intervals
Quantitative: Mean
Qualitative: Proportion
-
Top Ten #9
Population vs. Sample
-
Population
Collection of all items (all light bulbs made at factory)
Parameter: measure of population
(1) population mean (average number of hours in life of all bulbs)
(2) population proportion (% of all bulbs that are defective)
-
Sample
Part of population (bulbs tested by inspector)
Statistic: measure of sample = estimate of parameter
(1) sample mean (average number of hours in life of bulbs tested by inspector)
(2) sample proportion (% of bulbs in sample that are defective)
-
Top Ten #1
Descriptive Statistics
-
Measures of Central Location
Mean
Median
Mode
-
Mean
Population mean = \(\mu = \sum x / N\) = (5+1+6)/3 = 12/3 = 4
Algebra: \(\sum x = N\mu\) = 3*4 = 12
Sample mean = \(\bar{x} = \sum x / n\)
Example: the number of hours spent on the Internet: 4, 8, and 9
\(\bar{x}\) = (4+8+9)/3 = 7 hours
Do NOT use if the number of observations is small or with extreme values
Ex: Do NOT use if 3 houses were sold this week, and one was a mansion
-
Median
Median = middle value
Example: 5,1,6
Step 1: Sort data: 1,5,6
Step 2: Middle value = 5
When there is an even number of observations, the median is computed by averaging the two observations in the middle.
OK even if there are extreme values
Home sales: 100K, 200K, 900K, so mean = 400K, but median = 200K
-
Mode
Mode: most frequent value
Ex: female, male, female
Mode = female
Ex: 1,1,2,3,5,8
Mode = 1
It may not be a very good measure; see the following example
-
Measures of Central Location - Example
Sample: 0, 0, 5, 7, 8, 9, 12, 14, 22, 23
Sample Mean = \(\bar{x} = \sum x / n\) = 100/10 = 10
Median = (8+9)/2 = 8.5
Mode = 0
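The three measures for this sample can be checked with a minimal Python sketch. It uses only the standard `statistics` module; the variable names are illustrative and not part of the slides.

```python
import statistics

# Sample from the slide
sample = [0, 0, 5, 7, 8, 9, 12, 14, 22, 23]

mean = statistics.mean(sample)      # sum of the values divided by n
median = statistics.median(sample)  # n is even, so the two middle values are averaged
mode = statistics.mode(sample)      # most frequent value

print(mean, median, mode)
```

Here the mode (0) sits far from the mean and median, which is why the mode may not be a very good measure of central location for this sample.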
-
Relationship
Case 1: if probability distribution symmetric (ex. bell-shaped, normal distribution), Mean = Median = Mode
Case 2: if distribution positively skewed to right (ex. incomes of employees in large firm: a large number of relatively low-paid workers and a small number of high-paid executives), Mode < Median < Mean
-
Relationship (cont'd)
Case 3: if distribution negatively skewed to left (ex. the time taken by students to write exams: few students hand in their exams early and the majority of students turn in their exam at the end of the exam), Mean < Median < Mode
-
Dispersion: Measures of Variability
How much spread of data
How much uncertainty
Measures:
Range
Variance
Standard deviation
-
Range
Range = Max - Min ≥ 0
But range is affected by unusual values
Ex: Santa Monica has a high of 105 degrees and a low of 30 once a century, but range would be 105-30 = 75
-
Standard Deviation (SD)
Better than range because all data used
Population SD = square root of variance: \(\sigma = \sqrt{\sum (x - \mu)^2 / N}\)
SD ≥ 0
-
Empirical Rule
Applies to mound or bell-shaped curves
Ex: normal distribution
68% of data within ± one SD of mean
95% of data within ± two SD of mean
99.7% of data within ± three SD of mean
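The three percentages can be reproduced for an exactly normal distribution with the short sketch below. It relies on `statistics.NormalDist` from the Python standard library; the helper name `share_within` is hypothetical.

```python
from statistics import NormalDist

z = NormalDist()  # standard normal: mean 0, SD 1

def share_within(k):
    """Share of a normal population within k standard deviations of the mean."""
    return z.cdf(k) - z.cdf(-k)

for k in (1, 2, 3):
    print(k, round(share_within(k), 4))
```

For data that are only roughly mound-shaped the percentages are approximations, not exact values.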
-
Standard Deviation = Square Root of Variance
\[ s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}} \]
-
Sample Standard Deviation
x           \(x - \bar{x}\)      \((x - \bar{x})^2\)
6           6-8 = -2       (-2)(-2) = 4
6           6-8 = -2       4
7           7-8 = -1       (-1)(-1) = 1
8           8-8 = 0        0
13          13-8 = 5       (5)(5) = 25
Sum = 40    Sum = 0        Sum = 34
Mean = 40/5 = 8
-
Standard Deviation
Total variation = 34
Sample variance = 34/4 = 8.5
Sample standard deviation = square root of 8.5 = 2.9
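The same three steps (total variation, sample variance, sample standard deviation) can be traced in a small Python sketch. The data are the five values from the table on the previous slide; the variable names are illustrative.

```python
import math

data = [6, 6, 7, 8, 13]
n = len(data)

mean = sum(data) / n                                  # 40 / 5
total_variation = sum((x - mean) ** 2 for x in data)  # sum of squared deviations
variance = total_variation / (n - 1)                  # divide by n - 1 for a sample
sd = math.sqrt(variance)

print(total_variation, variance, round(sd, 1))
```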
-
Measures of Variability - Example
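The hourly wages earned by a sample of five students are: $7, $5, $11, $8, and $6
Range: 11 - 5 = 6
Variance:
\[ s^2 = \frac{\sum (X - \bar{X})^2}{n - 1} = \frac{(7 - 7.4)^2 + \dots + (6 - 7.4)^2}{5 - 1} = \frac{21.2}{5 - 1} = 5.30 \]
Standard deviation:
\[ s = \sqrt{s^2} = \sqrt{5.30} = 2.30 \]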
-
Graphical Tools
Line chart: trend over time
Scatter diagram: relationship between two variables
Bar chart: frequency for each category
Histogram: frequency for each class of measured data (graph of frequency distribution)
Box plot: graphical display based on quartiles, which divide data into 4 parts
-
Top Ten #8
Variation Creates Uncertainty
-
No Variation
Certainty, exact prediction
Standard deviation = 0
Variance = 0
All data exactly same
Example: all workers in minimum wage job
-
High Variation
Uncertainty, unpredictable
High standard deviation
Ex #1: Workers in downtown L.A. have variation between CEOs and garment workers
Ex #2: New York temperatures in spring range from below freezing to very hot
-
Comparing Standard Deviations
Temperature Example
Beach city: small standard deviation (single temperature reading close to mean)
High Desert city: high standard deviation (hot days, cool nights in spring)
-
Standard Error of the Mean
Standard deviation of sample mean = standard deviation / square root of n
Ex: standard deviation = 10, n = 4, so standard error of the mean = 10/2 = 5
Note that 5 < 10: the standard error of the mean is smaller than the standard deviation
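A minimal sketch of the standard error formula follows. The function name `standard_error` is hypothetical; the first call uses the slide's numbers, and the larger sample sizes are added only to show how the standard error shrinks as n grows.

```python
import math

def standard_error(sd, n):
    """Standard error of the mean: standard deviation divided by sqrt(n)."""
    return sd / math.sqrt(n)

print(standard_error(10, 4))    # slide example: 10 / 2
print(standard_error(10, 16))   # illustrative larger sample
print(standard_error(10, 100))  # illustrative larger sample
```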
-
Sampling Distribution
Expected value of sample mean = population mean, but an individual sample mean could be smaller or larger than the population mean
Population mean is a constant parameter, but sample mean is a random variable
Sampling distribution is distribution of sample means
-
Example
Mean age of all students in the building is population mean
Each classroom has a sample mean
Distribution of sample means from all classrooms is sampling distribution
-
Central Limit Theorem (CLT)
If population standard deviation is known, sampling distribution of sample means is approximately normal if n > 30
CLT applies even if original population is skewed
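The CLT can be illustrated with a small simulation. This is only a sketch under stated assumptions: the skewed population is taken to be exponential with mean 1 and standard deviation 1, and the sample size (36), the number of repeated samples (2000), and the seed are arbitrary choices that do not come from the slides.

```python
import random
import statistics

random.seed(0)      # fixed seed so the run is repeatable

n = 36              # size of each sample (> 30)
num_samples = 2000  # number of repeated samples

# Assumed skewed population: exponential with mean 1 and SD 1.
sample_means = [
    statistics.mean([random.expovariate(1.0) for _ in range(n)])
    for _ in range(num_samples)
]

center = statistics.mean(sample_means)   # should be close to the population mean, 1
spread = statistics.stdev(sample_means)  # should be close to 1 / sqrt(36)

print(round(center, 2), round(spread, 2))
```

Although the population is strongly skewed, the sample means cluster around the population mean with a spread close to the standard error of the mean.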
-
Top Ten #5
Expected Value
-
Expected Value
Expected Value = \(E(x) = \sum x P(x)\)
= \(x_1 P(x_1) + x_2 P(x_2) + \dots\)
Expected value is a weighted average, also a long-run average
-
Example
Find the expected age at high school graduation if 11 were 17 years old, 80 were 18 years old, and 5 were 19 years old
Step 1: 11+80+5=96
-
Step 2
x      P(x)             x P(x)
17     11/96 = .115     17(.115) = 1.955
18     80/96 = .833     18(.833) = 14.994
19     5/96 = .052      19(.052) = .988
                        E(x) = 17.937
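The two steps can be sketched in Python as a weighted average. The dictionary `counts` holds the slide's data; the names are illustrative. Because the probabilities are not rounded to three decimals here, the result is 17.9375 rather than the table's 17.937.

```python
# Number of graduates at each age, from the slide
counts = {17: 11, 18: 80, 19: 5}

total = sum(counts.values())  # Step 1: 11 + 80 + 5

# Step 2: weight each age by its probability, count / total
expected_age = sum(age * count / total for age, count in counts.items())

print(total, expected_age)
```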
-
Top Ten #4
Linear Regression
-
Linear Regression
Regression equation: \(\hat{y} = b_0 + b_1 x\)
\(\hat{y}\) = dependent variable = predicted value
x = independent variable
b0 = y-intercept = predicted value of y if x = 0
b1 = slope = regression coefficient = change in y per unit change in x
-
Slope vs Correlation
Positive slope (b1 > 0): positive correlation between x and y (y increases if x increases)
Negative slope (b1 < 0): negative correlation between x and y (y decreases if x increases)
-
Simple Linear Regression
Simple: one independent variable, one dependent variable
Linear: graph of regression equation is straight line
-
Example
y = salary (female manager, in thousands of dollars)
x = number of children
n = number of observations
-
Given Data
x y
2 48
1 52
4 33
-
Totals
x y
2 48
1 52
4 33 n=3
Sum=7 Sum=133
-
Slope (b1) = -6.5
Method of Least Squares formulas not on BUS 302 exam
b1 = -6.5 given
Interpretation: If one female manager has 1 more child than another, salary is $6,500 lower; that is, salary of female managers is expected to decrease by 6.5 (in thousands of dollars) per child
-
Intercept (b0)
\(\bar{x} = \sum x / n = 7/3 = 2.33\)
\(\bar{y} = \sum y / n = 133/3 = 44.33\)
\(b_0 = \bar{y} - b_1 \bar{x}\)
b0 = 44.33 - (-6.5)(2.33) = 59.5
If number of children is zero, expected salary is $59,500
-
Regression Equation
\(\hat{y} = 59.5 - 6.5x\)
-
Forecast Salary If 3 Children
59.5 - 6.5(3) = 40
$40,000 = expected salary
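The slides give the slope as a known value and leave out the least-squares formulas. As a check only, the sketch below applies the standard least-squares formula for the slope (sum of cross-deviations divided by sum of squared x-deviations, which is not shown in the slides) to the three data points, then reproduces the intercept and the forecast. All names are illustrative.

```python
# Given data: x = number of children, y = salary in thousands of dollars
x = [2, 1, 4]
y = [48, 52, 33]
n = len(x)

x_bar = sum(x) / n
y_bar = sum(y) / n

# Standard least-squares slope (assumed formula, not taken from the slides)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
sxx = sum((xi - x_bar) ** 2 for xi in x)
b1 = sxy / sxx

# Intercept as on the slide: b0 = y-bar - b1 * x-bar
b0 = y_bar - b1 * x_bar

def forecast(children):
    """Predicted salary (thousands of dollars) for a given number of children."""
    return b0 + b1 * children

print(round(b1, 2), round(b0, 2), round(forecast(3), 2))
```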
-
Standard Error of Estimate
Forecast: \(\hat{y} = b_0 + b_1 x\)
Error: \(y - \hat{y}\)
\[ S = \sqrt{\frac{SSE}{n - 2}} = \sqrt{\frac{\sum (y - \hat{y})^2}{n - 2}} \]
-
Standard Error of Estimate
(1) = x    (2) = y    (3) = \(\hat{y}\) = 59.5 - 6.5x    (4) = (2) - (3) = \(y - \hat{y}\)    \((y - \hat{y})^2\)
2          48         46.5                       1.5                          2.25
1          52         53                         -1                           1
4          33         33.5                       -.5                          .25
                                                                              SSE = 3.5
-
Standard Error of Estimate
\[ S = \sqrt{\frac{3.5}{3 - 2}} = \sqrt{3.5} = 1.9 \]
Actual salary typically $1,900 away from expected salary
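The error column, SSE, and the standard error of estimate can be recomputed with a short sketch that uses the regression equation from the slides (b0 = 59.5, b1 = -6.5). Variable names are illustrative.

```python
import math

x = [2, 1, 4]
y = [48, 52, 33]
b0, b1 = 59.5, -6.5  # regression equation from the slides

# Error = actual y minus predicted y
residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]

sse = sum(e ** 2 for e in residuals)  # sum of squared errors
s = math.sqrt(sse / (len(x) - 2))     # divide by n - 2 in simple regression

print(residuals, sse, round(s, 1))
```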
-
Coefficient of Determination
\(R^2\) = % of total variation in y that can be explained by variation in x
Measure of how closely the linear regression line fits the points in a scatter diagram
\(R^2\) = 1: max. possible value: perfect linear relationship between y and x (straight line)
\(R^2\) = 0: min. value: no linear relationship
-
Sources of Variation (V)
Total V = Explained V + Unexplained V
SS = Sum of Squares = V
Total SS = Regression SS + Error SS
SST = SSR + SSE
SSR = Explained V, SSE = Unexplained
-
Coefficient of Determination
\(R^2 = SSR / SST\)
\(R^2\) = 197/200.5 = .98
Interpretation: 98% of total variation in salary can be explained by variation in number of children
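The decomposition SST = SSR + SSE can be checked numerically with the sketch below, which uses the salary data and the fitted values 46.5, 53, and 33.5 from the regression example. Without intermediate rounding the sums come out near 197.17 and 200.67 rather than the slide's 197 and 200.5, and the ratio still rounds to .98. Names are illustrative.

```python
y = [48, 52, 33]           # actual salaries (thousands of dollars)
y_hat = [46.5, 53.0, 33.5]  # predicted salaries from y-hat = 59.5 - 6.5x

y_bar = sum(y) / len(y)

sst = sum((yi - y_bar) ** 2 for yi in y)               # total variation
sse = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))  # unexplained variation
ssr = sst - sse                                        # explained variation

r2 = ssr / sst

print(round(ssr, 2), round(sst, 2), round(r2, 2))
```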
-
\(0 \le R^2 \le 1\)
0: No linear relationship since SSR = 0 (explained variation = 0)
1: Perfect linear relationship since SSR = SST (unexplained variation = SSE = 0), but does not prove cause and effect
-
R=Correlation Coefficient
Case 1: slope (b1) < 0
R < 0
R is negative square root of coefficient of determination
\[ R = -\sqrt{R^2} \]
-
Our Example
Slope = b1 = -6.5
\(R^2\) = .98
R = -.99
-
Case 2: Slope > 0
R is positive square root of coefficient of determination
Ex: \(R^2\) = .49
R = .70
R has no interpretation
R overstates relationship
-
Caution
Nonlinear relationship (parabola, hyperbola, etc.) can NOT be measured by \(R^2\)
In fact, you could get \(R^2\) = 0 with a nonlinear graph on a scatter diagram
-
Summary: Correlation Coefficient
Case 1: If b1 > 0, R is the positive square root of the coefficient of determination
Ex#1: y = 4+3x, \(R^2\) = .36: R = +.60
Case 2: If b1 < 0, R is the negative square root of the coefficient of determination
Ex#2: y = 80-10x, \(R^2\) = .49: R = -.70
NOTE! Ex#2 has stronger relationship, as measured by coefficient of determination
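The sign rule in the two cases can be written as a one-line helper. This is a minimal sketch; the function name `correlation_from_r2` is hypothetical, and it simply attaches the sign of the slope to the square root of the coefficient of determination.

```python
import math

def correlation_from_r2(r2, slope):
    """Return R: the square root of R^2 carrying the sign of the slope b1."""
    return math.copysign(math.sqrt(r2), slope)

print(round(correlation_from_r2(0.36, 3), 2))    # Ex#1: y = 4 + 3x
print(round(correlation_from_r2(0.49, -10), 2))  # Ex#2: y = 80 - 10x
```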
-
Extreme Values
R=+1: perfect positive correlation
R= -1: perfect negative correlation
R=0: zero correlation
-
MS Excel Output
Correlation Coefficient (-0.9912): Note
that you need to change the sign because
the sign of slope (b1) is negative (-6.5)
Coefficient of Determination
Standard Error of Estimate
Regression Coefficient
-
Top Ten #6
What Distribution to Use?
-
Use Binomial Distribution If:
Random variable (x) is number of successes in n trials
Each trial is success or failure
Independent trials
Constant probability of success (\(\pi\)) on each trial
Sampling with replacement (in practice, people may use binomial w/o replacement, but theory is with replacement)
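When these conditions hold, the probability of exactly x successes follows the standard binomial formula, which the slides do not spell out. The sketch below is an illustration only: the function name `binomial_pmf` and the example values (10 trials, success probabilities of 0.5 and 0.1) are assumptions, not data from the slides.

```python
from math import comb

def binomial_pmf(x, n, pi):
    """P(exactly x successes in n independent trials with success probability pi)."""
    return comb(n, x) * pi ** x * (1 - pi) ** (n - x)

# Hypothetical example: 10 trials
print(binomial_pmf(5, 10, 0.5))  # symmetric case, pi = 0.5
print(binomial_pmf(0, 10, 0.1))  # skewed case, pi = 0.1

# The probabilities over all possible values 0, 1, ..., n add up to 1
print(sum(binomial_pmf(x, 10, 0.1) for x in range(11)))
```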
-
Success vs. Failure
The binomial experiment can result in only one of two possible outcomes:
Male vs. Female
Defective vs. Non-defective
Yes or No
Pass (8 or more right answers) vs. Fail (fewer than 8)
Buy drink (21 or over) vs. Cannot buy drink
-
Binomial Is Discrete
Integer values
0, 1, 2, ..., n
Binomial is often skewed, but may be symmetric
-
Normal Distribution
Continuous, bell-shaped, symmetric
Mean=median=mode
Measurement (dollars, inches, years)
Cumulative probability under normal curve: use Z table if you know population mean and population standard deviation
Sample mean: use Z table if you know population standard deviation and either normal population or n > 30
-
t Distribution
Continuous, mound-shaped, symmetric
Applications similar to normal
More spread out than normal
Use t if normal population but population standard deviation not known
Degrees of freedom = df = n-1 if estimating the mean of one population
t approaches z as df increases
-
Normal or t Distribution?
Use t table if normal population but population standard deviation (\(\sigma\)) is not known
If you are given the sample standard deviation (s), use t table, assuming normal population
-
Top Ten #3
Confidence Intervals: Mean and Proportion
-
Confidence Interval
A confidence interval is a range of values within which the population parameter is expected to occur.
-
Factors for Confidence Interval
The factors that determine the width of a confidence interval are:
1. The sample size, n
2. The variability in the population, usually estimated by standard deviation.
3. The desired level of confidence.
-
Confidence Interval: Mean
Use normal distribution (Z table) if:
population standard deviation (sigma) known and either (1) or (2):
(1) Normal population
(2) Sample size > 30
-
Confidence Interval: Mean
If normal table, then
\[ \bar{x} \pm z \frac{\sigma}{\sqrt{n}} \]
-
Normal Table
Tail = .5(1 - confidence level)
NOTE! Different statistics texts have different normal tables
This review uses the tail of the bell curve
Ex: 95% confidence: tail = .5(1-.95)= .025
Z.025 = 1.96
-
Example
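n = 49, \(\sum x = 490\), \(\sigma = 2\), 95% confidence
\[ \bar{x} \pm z \frac{\sigma}{\sqrt{n}} = \frac{490}{49} \pm 1.96 \frac{2}{\sqrt{49}} = 10 \pm 0.56 \]
\(9.44 < \mu < 10.56\)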
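This interval can be reproduced with a short sketch. The function name `z_interval` is hypothetical, and instead of reading 1.96 from a table it obtains the z value from `statistics.NormalDist` using the tail rule from the previous slide (tail = .5(1 - confidence level)).

```python
import math
from statistics import NormalDist

def z_interval(x_bar, sigma, n, confidence):
    """Confidence interval for a mean when the population SD (sigma) is known."""
    tail = 0.5 * (1 - confidence)       # area in one tail of the bell curve
    z = NormalDist().inv_cdf(1 - tail)  # about 1.96 for 95% confidence
    margin = z * sigma / math.sqrt(n)
    return x_bar - margin, x_bar + margin

# Slide example: sum of x = 490, n = 49, sigma = 2, 95% confidence
low, high = z_interval(490 / 49, 2, 49, 0.95)
print(round(low, 2), round(high, 2))
```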
-
Another Example
One of the SOM professors wants to estimate the mean number of hours worked per week by students. A sample of 49 students showed a mean of 24 hours. It is assumed that the population standard deviation is 4 hours. What is the population mean?
-
Confidence Interval: Mean
Another Example (cont'd)
95 percent confidence interval for the population mean:
\[ \bar{X} \pm 1.96 \frac{\sigma}{\sqrt{n}} = 24.00 \pm 1.96 \frac{4}{\sqrt{49}} = 24.00 \pm 1.12 \]
The confidence limits range from 22.88 to 25.12. We estimate with 95 percent confidence that the average number of hours worked per week by students lies between these two values.
-
Confidence Interval: Mean (t distribution)
Use if normal population but population standard deviation (\(\sigma\)) not known
If you are given the sample standard deviation (s), use t table, assuming normal population
If one population, n-1 degrees of freedom
-
Confidence Interval: Mean (t distribution)
\[ \bar{x} \pm t_{n-1} \frac{s}{\sqrt{n}} \]
-
Confidence Interval: Proportion
Use if success or failure (ex: defective or not-defective, satisfactory or unsatisfactory)
Normal approximation to binomial ok if (n)(\(\pi\)) > 5 and (n)(1-\(\pi\)) > 5, where
n = sample size
\(\pi\) = population proportion
NOTE: NEVER use the t table if proportion!!
-
Confidence Interval: Proportion
Ex: 8 defectives out of 100, so p = .08 and n = 100, 95% confidence
\[ p \pm z \sqrt{\frac{p(1 - p)}{n}} = .08 \pm 1.96 \sqrt{\frac{(.08)(.92)}{100}} = .08 \pm .05 \]
-
Confidence Interval: Proportion
A sample of 500 people who own their house revealed that 175 planned to sell their homes within five years. Develop a 98% confidence interval for the proportion of people who plan to sell their house within five years.
\[ p = \frac{175}{500} = 0.35 \]
\[ .35 \pm 2.33 \sqrt{\frac{(.35)(.65)}{500}} = .35 \pm .0497 \]
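Both proportion examples can be checked with the sketch below. The function name `proportion_interval` is hypothetical. It computes z from `statistics.NormalDist` rather than a printed table, so the 98% margin comes out near .0496 instead of the .0497 obtained with the rounded table value 2.33.

```python
import math
from statistics import NormalDist

def proportion_interval(successes, n, confidence):
    """Return (p, margin) for a normal-approximation interval for a proportion."""
    p = successes / n
    tail = 0.5 * (1 - confidence)
    z = NormalDist().inv_cdf(1 - tail)
    margin = z * math.sqrt(p * (1 - p) / n)
    return p, margin

p1, m1 = proportion_interval(8, 100, 0.95)    # 8 defectives out of 100
p2, m2 = proportion_interval(175, 500, 0.98)  # 175 of 500 homeowners

print(p1, round(m1, 2))
print(p2, round(m2, 3))
```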
-
Interpretation
If 95% confidence, then 95% of all confidence intervals will include the true population parameter
NOTE! Never use the term "probability" when estimating a parameter!! (ex: Do NOT say "Probability that population mean is between 23 and 32 is .95" because parameter is not a random variable. In fact, the population mean is a fixed but unknown quantity.)
-
Point vs Interval Estimate
Point estimate: statistic (single number)
Ex: sample mean, sample proportion
Each sample gives different point estimate
Interval estimate: range of values
Ex: Population mean = sample mean ± error
Parameter = statistic ± error
-
Width of Interval
Ex: sample mean =23, error = 3
Point estimate = 23
Interval estimate = 23 ± 3, or (20,26)
Width of interval = 26-20 = 6
Wide interval: Point estimate unreliable
-
Wide Confidence Interval If
(1) small sample size (n)
(2) large standard deviation
(3) high confidence level (ex: 99% confidence interval wider than 95% confidence interval)
If you want a narrow interval, you need a large sample size or small standard deviation or low confidence level.
-
Top Ten #7
P-value
-
P-value
P-value = probability of getting a sample statistic as extreme as (or more extreme than) the sample statistic you got from your sample, given that the null hypothesis is true
-
P-value Example: one tail test
H0: \(\mu\) = 40
HA: \(\mu\) > 40
Sample mean = 43
P-value = P(sample mean > 43, given H0 true)
Meaning: probability of observing a sample mean at least as large as 43 when the population mean is 40
How to use it: Reject H0 if p-value < \(\alpha\) (significance level)
-
Two Cases
Suppose \(\alpha\) = .05
Case 1: suppose p-value = .02, then reject H0 (unlikely H0 is true; you believe population mean > 40)
Case 2: suppose p-value = .08, then do not reject H0 (H0 may be true; you have reason to believe that the population mean may be 40)
-
P-value Example: two tail test
H0: \(\mu\) = 70
HA: \(\mu \ne 70\)
Sample mean = 72
If two-tails, then P-value = 2 × P(sample mean > 72) = 2(.04) = .08
If \(\alpha\) = .05, p-value > \(\alpha\), so do not reject H0
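The p-value decision rule, including the doubling for a two-tail test, can be sketched as follows. The function names `p_value` and `reject_null` are hypothetical; the one-tail probability of .04 is taken from the slide as given.

```python
def p_value(one_tail_probability, two_tailed):
    """Double the one-tail probability when the alternative is two-tailed."""
    return 2 * one_tail_probability if two_tailed else one_tail_probability

def reject_null(p, alpha):
    """Reject H0 only when the p-value is below the significance level."""
    return p < alpha

p = p_value(0.04, two_tailed=True)  # slide example: 2(.04) = .08
print(p, reject_null(p, 0.05))
```

With the same one-tail probability of .04, a one-tail test at the .05 level would reject H0, so the choice between one and two tails can change the conclusion.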
-
Top Ten #2
Hypothesis Testing
-
H0: Null Hypothesis
Population mean = \(\mu\)
Population proportion = \(\pi\)
A statement about the value of a population parameter
Never include sample statistic (such as \(\bar{x}\)) in hypothesis
-
HA or H1:Alternative Hypothesis
ONE TAIL ALTERNATIVE
Right tail: \(\mu\) > number (smog check)
\(\pi\) > fraction (% defectives)
Left tail: \(\mu\) < number
\(\pi\) < fraction
-
One-Tailed Tests
A test is one-tailed when the alternate hypothesis, H1 or HA, states a direction, such as:
H1: The mean yearly salaries earned by full-time employees is more than $45,000. (\(\mu\) > $45,000)
H1: The average speed of cars traveling on freeway is less than 75 miles per hour. (\(\mu\) < 75)
-
Two-Tail Alternative
Population mean not equal to number (too hot or too cold)
Population proportion not equal to fraction (% alcohol too weak or too strong)
-
Two Tailed Tests
A test is two-tailed when no direction is specified in the alternate hypothesis
H1: The mean amount of time spent on the Internet is not equal to 5 hours. (\(\mu \ne 5\))
H1: The mean price for a gallon of gasoline is not equal to $2.54. (\(\mu \ne\) $2.54)
-
Reject Null Hypothesis (H0) If
Absolute value of test statistic* > critical value*
Reject H0 if |Z Value| > critical Z
Reject H0 if |t Value| > critical t
Reject H0 if p-value < significance level (alpha)
Note that direction of inequality is reversed!
Reject H0 if very large difference between sample statistic and population parameter in H0
* Test statistic: A value, determined from sample information, used to determine whether or not to reject the null hypothesis.
* Critical value: The dividing point between the region where the null hypothesis is rejected and the region where it is not rejected.
-
Example: Smog Check
H0: \(\mu\) = 80
HA: \(\mu\) > 80
If test statistic = 2.2 and critical value = 1.96, reject H0, and conclude that the population mean is likely > 80
If test statistic = 1.6 and critical value = 1.96, do not reject H0, and reserve judgment about H0
-
Type I vs Type II Error
Alpha = \(\alpha\) = P(type I error) = Significance level = probability that you reject true null hypothesis
Beta = \(\beta\) = P(type II error) = probability you do not reject a null hypothesis, given H0 false
Ex: H0: Defendant innocent
\(\alpha\) = P(jury convicts innocent person)
\(\beta\) = P(jury acquits guilty person)
-
Type I vs Type II Error
                    H0 true                                  H0 false
Reject H0           Alpha = \(\alpha\) = P(type I error)      1 - \(\beta\) (Correct Decision)
Do not reject H0    1 - \(\alpha\) (Correct Decision)         Beta = \(\beta\) = P(type II error)
-
Example: Smog Check
H0: \(\mu\) = 80
HA: \(\mu\) > 80
If p-value = 0.01 and alpha = 0.05, reject H0, and conclude that the population mean is likely > 80
If p-value = 0.07 and alpha = 0.05, do not reject H0, and reserve judgment about H0
-
Test Statistic
When testing for the population mean from a large sample and the population standard deviation is known, the test statistic is given by:
\[ z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} \]
-
Example
The processors of Best Mayo indicate on the label that the bottle contains 16 ounces of mayo. The standard deviation of the process is 0.5 ounces. A sample of 36 bottles from last hour's production showed a mean weight of 16.12 ounces per bottle. At the .05 significance level, can we conclude that the mean amount per bottle is greater than 16 ounces?
-
Example (cont'd)
1. State the null and the alternative hypotheses: H0: \(\mu\) = 16, H1: \(\mu\) > 16
2. Select the level of significance. In this case, we selected the .05 significance level.
3. Identify the test statistic. Because we know the population standard deviation, the test statistic is z.
4. State the decision rule. Because the alternative is right-tailed, reject H0 if z > 1.645 (= \(z_{0.05}\))
-
Example (cont'd)
5. Compute the value of the test statistic
\[ z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} = \frac{16.12 - 16.00}{0.5 / \sqrt{36}} = 1.44 \]
6. Conclusion: Do not reject the null hypothesis. We cannot conclude the mean is greater than 16 ounces.
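The whole Best Mayo test can be traced in a few lines of Python. This is a sketch with illustrative variable names; the critical value and the p-value come from `statistics.NormalDist` rather than a printed Z table, and the p-value itself is not stated in the slides.

```python
import math
from statistics import NormalDist

mu0 = 16.0     # mean claimed under H0 (ounces)
sigma = 0.5    # known population standard deviation
n = 36         # sample size
x_bar = 16.12  # sample mean
alpha = 0.05   # significance level

# Test statistic: z = (x-bar - mu) / (sigma / sqrt(n))
z = (x_bar - mu0) / (sigma / math.sqrt(n))

critical = NormalDist().inv_cdf(1 - alpha)  # right-tail critical value, about 1.645
p = 1 - NormalDist().cdf(z)                 # right-tail p-value

reject = z > critical
print(round(z, 2), round(critical, 3), round(p, 4), reject)
```

Both rules agree here: z is below the critical value and the p-value is above .05, so H0 is not rejected.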