hypothesis testing
![Page 1: Hypothesis Testing](https://reader033.vdocuments.us/reader033/viewer/2022042704/56814a74550346895db78bd5/html5/thumbnails/1.jpg)
Hypothesis Testing
![Page 2: Hypothesis Testing](https://reader033.vdocuments.us/reader033/viewer/2022042704/56814a74550346895db78bd5/html5/thumbnails/2.jpg)
Example:
• Test the performance of two lists in terms of response rates
• Sample (1,000) from the first list provides a response rate of 3.5%
• Sample (1,200) from the second list provides a response rate of 4.5%
• Do the two lists (population) really have a difference or is it an artifact of the sample?
![Page 3: Hypothesis Testing](https://reader033.vdocuments.us/reader033/viewer/2022042704/56814a74550346895db78bd5/html5/thumbnails/3.jpg)
The Logic of Hypothesis Testing
• When you want to make statements about a population, you usually draw samples
• How generalizable is your sample-based finding?
• Evidence has to be evaluated statistically before arriving at a conclusion regarding the hypothesis
• The strength of the evidence depends on whether the information comes from a sample with fewer or more observations
![Page 4: Hypothesis Testing](https://reader033.vdocuments.us/reader033/viewer/2022042704/56814a74550346895db78bd5/html5/thumbnails/4.jpg)
Steps in Hypothesis Testing
1) Problem definition
2) Clearly state the null and alternate hypotheses
3) Choose the relevant test and the appropriate probability distribution
4) Choose the critical value, which requires you to
- Determine the significance level
- Determine the degrees of freedom
- Decide if one- or two-tailed test
5) Compute the relevant test statistic
6) Compare the test statistic and the critical value
7) Does the test statistic fall in the critical region?
- Yes: reject null
- No: do not reject null
![Page 5: Hypothesis Testing](https://reader033.vdocuments.us/reader033/viewer/2022042704/56814a74550346895db78bd5/html5/thumbnails/5.jpg)
Basic Concepts of Hypothesis Testing
• The Null and Alternate hypothesis
• Choosing the relevant statistical test and appropriate probability distribution. Depends on
- Size of the sample
- Whether the population standard deviation is known or not
• Choosing the Critical Value. The three criteria used are
- Significance Level
- Degrees of Freedom
- One or Two Tailed Test
![Page 6: Hypothesis Testing](https://reader033.vdocuments.us/reader033/viewer/2022042704/56814a74550346895db78bd5/html5/thumbnails/6.jpg)
Significance Level
• Indicates the percentage of sample means that is outside the cut-off limits (critical value)
• The higher the significance level ($\alpha$) used for testing a hypothesis, the higher the probability of rejecting a null hypothesis when it is true (Type I error)
• Accepting a null hypothesis when it is false is called a Type II error and its probability is $\beta$
![Page 7: Hypothesis Testing](https://reader033.vdocuments.us/reader033/viewer/2022042704/56814a74550346895db78bd5/html5/thumbnails/7.jpg)
Significance Level (Contd.)
• When choosing a level of significance, there is an inherent tradeoff between these two types of errors
• Power of hypothesis test ($1 - \beta$)
• A good test of hypothesis ought to reject a null hypothesis when it is false
• $1 - \beta$ should be as high a value as possible
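The tradeoff between the two error types can be made concrete with a small sketch. It computes the power of a two-tailed z-test for several significance levels. All numbers here (hypothesized mean 5000, true mean 4950, standard deviation 250, sample size 100) are illustrative assumptions and the function name is hypothetical, not part of the slides.

```python
from statistics import NormalDist


def power_two_tailed_z(mu0, mu_true, sigma, n, alpha):
    """Power (1 - beta) of a two-tailed z-test of Ho: mu = mu0
    when the population mean is actually mu_true."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    se = sigma / n ** 0.5
    # Under the true mean, z_calc is normal with this mean and unit variance.
    shift = (mu_true - mu0) / se
    # beta = probability that z_calc lands inside (-z_crit, z_crit).
    beta = std_normal.cdf(z_crit - shift) - std_normal.cdf(-z_crit - shift)
    return 1 - beta


# Assumed illustrative values: a larger alpha buys more power.
for alpha in (0.01, 0.05, 0.10):
    print(alpha, round(power_two_tailed_z(5000, 4950, 250, 100, alpha), 3))
```

When the true mean equals the hypothesized mean, the function returns the significance level itself, which is the Type I error probability.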
![Page 8: Hypothesis Testing](https://reader033.vdocuments.us/reader033/viewer/2022042704/56814a74550346895db78bd5/html5/thumbnails/8.jpg)
Degree of Freedom
• The number of bits of "free" or unconstrained data used in calculating a sample statistic or test statistic
• A sample mean ($\bar{X}$) has $n$ degrees of freedom
• A sample variance ($s^2$) has $(n-1)$ degrees of freedom
![Page 9: Hypothesis Testing](https://reader033.vdocuments.us/reader033/viewer/2022042704/56814a74550346895db78bd5/html5/thumbnails/9.jpg)
One or Two-tail Test
• One-tailed Hypothesis Test
- Determines whether a particular population parameter is larger or smaller than some predefined value
- Uses one critical value of test statistic
• Two-tailed Hypothesis Test
- Determines the likelihood that a population parameter is within certain upper and lower bounds
- May use one or two critical values
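As a rough illustration of how the choice of tails changes the critical value, the sketch below looks up standard normal critical values with Python's standard library. The helper name `z_critical` is hypothetical.

```python
from statistics import NormalDist


def z_critical(alpha, two_tailed):
    """Standard normal critical value for a given significance level.
    A two-tailed test splits alpha across both tails."""
    tail_area = alpha / 2 if two_tailed else alpha
    return NormalDist().inv_cdf(1 - tail_area)


print(round(z_critical(0.05, two_tailed=True), 2))   # 1.96
print(round(z_critical(0.05, two_tailed=False), 3))  # 1.645
print(round(z_critical(0.01, two_tailed=True), 2))   # 2.58
```

For the same significance level, the one-tailed critical value is smaller because the whole rejection area sits in a single tail.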
![Page 10: Hypothesis Testing](https://reader033.vdocuments.us/reader033/viewer/2022042704/56814a74550346895db78bd5/html5/thumbnails/10.jpg)
Hypothesis Testing
DATA ANALYSIS OUTCOME
In Population | Accept Null Hypothesis | Reject Null Hypothesis
Null Hypothesis True | Correct Decision | Type I Error
Null Hypothesis False | Type II Error | Correct Decision
![Page 11: Hypothesis Testing](https://reader033.vdocuments.us/reader033/viewer/2022042704/56814a74550346895db78bd5/html5/thumbnails/11.jpg)
Hypothesis Testing About a Single Mean - Step-by-Step
1) Formulate Hypotheses
2) Select appropriate formula
3) Select significance level
4) Calculate z or t statistic
5) Calculate degrees of freedom (for t-test)
6) Obtain critical value from table
7) Make decision regarding the Null-hypothesis
![Page 12: Hypothesis Testing](https://reader033.vdocuments.us/reader033/viewer/2022042704/56814a74550346895db78bd5/html5/thumbnails/12.jpg)
Hypothesis Testing About a Single Mean - Example 1 (2 tailed)
• Ho: $\mu = 5000$ (hypothesized value of population)
• Ha: $\mu \neq 5000$ (alternative hypothesis)
• $n = 100$
• $\bar{X} = 4960$
• $\sigma = 250$
• $\alpha = 0.05$
Rejection rule: if $|z_{calc}| > z_{\alpha/2}$ then reject Ho.
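The numbers in Example 1 can be run through the rejection rule with a few lines of Python. This is a sketch using only the values given on the slide; the variable names are my own.

```python
from math import sqrt
from statistics import NormalDist

# Values from Example 1 (two-tailed, population sigma known).
mu0, n, x_bar, sigma, alpha = 5000, 100, 4960, 250, 0.05

se = sigma / sqrt(n)                           # standard error of the mean
z_calc = (x_bar - mu0) / se                    # calculated z statistic
z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # z_{alpha/2}

reject_h0 = abs(z_calc) > z_crit
print(z_calc, round(z_crit, 2), reject_h0)
```

With these inputs the calculated statistic does not exceed the critical value in absolute terms, so Ho is not rejected.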
![Page 13: Hypothesis Testing](https://reader033.vdocuments.us/reader033/viewer/2022042704/56814a74550346895db78bd5/html5/thumbnails/13.jpg)
Hypothesis Testing About a Single Mean - Example 2
• Ho: $\mu = 1000$ (hypothesized value of population)
• Ha: $\mu \neq 1000$ (alternative hypothesis)
• $n = 12$
• $\bar{X} = 1087.1$
• $s = 191.6$
• $\alpha = 0.01$
Rejection rule: if $|t_{calc}| > t_{df, \alpha/2}$ then reject Ho.
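Example 2 uses the sample standard deviation with a small sample, so the t distribution applies. The sketch below computes the statistic from the slide's values. Python's standard library has no t distribution, so the critical value 3.106 (11 degrees of freedom, 0.005 in each tail) is hard-coded from a standard t-table; treat it as an assumption brought in from outside the slide.

```python
from math import sqrt

# Values from Example 2 (two-tailed, population sigma unknown).
mu0, n, x_bar, s, alpha = 1000, 12, 1087.1, 191.6, 0.01

df = n - 1                        # degrees of freedom for the t-test
se = s / sqrt(n)                  # estimated standard error of the mean
t_calc = (x_bar - mu0) / se

# Assumption: t-table value for df = 11 and alpha / 2 = 0.005.
T_CRIT_DF11 = 3.106

reject_h0 = abs(t_calc) > T_CRIT_DF11
print(df, round(t_calc, 3), reject_h0)
```

The calculated t falls well inside the critical bounds, so Ho is not rejected at the 0.01 level.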
![Page 14: Hypothesis Testing](https://reader033.vdocuments.us/reader033/viewer/2022042704/56814a74550346895db78bd5/html5/thumbnails/14.jpg)
Hypothesis Testing About a Single Mean - Example 3 (1 tailed)
• Ho: $\mu \geq 5000$ (hypothesized value of population)
• Ha: $\mu < 5000$ (alternative hypothesis)
• $n = 50$
• $\bar{X} = 4970$
• $\sigma = 250$
• $\alpha = 0.01$
Rejection rule: if $z_{calc} < -z_{\alpha}$ then reject Ho.
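For the one-tailed Example 3, the whole significance level sits in the lower tail. The following sketch assumes the left-tailed rule (reject when the calculated z is below the negative critical value), which matches the direction of Ha on the slide.

```python
from math import sqrt
from statistics import NormalDist

# Values from Example 3 (one-tailed, Ha: mu < 5000).
mu0, n, x_bar, sigma, alpha = 5000, 50, 4970, 250, 0.01

z_calc = (x_bar - mu0) / (sigma / sqrt(n))
z_crit = NormalDist().inv_cdf(1 - alpha)   # one-tailed z_alpha

# Left-tailed rejection region: only large negative z values count.
reject_h0 = z_calc < -z_crit
print(round(z_calc, 3), round(z_crit, 3), reject_h0)
```

The sample mean is below 5000, but not by enough standard errors to reject Ho at the 0.01 level.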
![Page 15: Hypothesis Testing](https://reader033.vdocuments.us/reader033/viewer/2022042704/56814a74550346895db78bd5/html5/thumbnails/15.jpg)
Hypothesis Test of Difference between Means
• Mayor of a city wants to see if males and females earn the same
• A random sample of 400 males and 576 females was taken and the following was found
Males Females
Mean $105.70 $112.80
Standard dev $5.00 $4.80
![Page 16: Hypothesis Testing](https://reader033.vdocuments.us/reader033/viewer/2022042704/56814a74550346895db78bd5/html5/thumbnails/16.jpg)
Hypothesis Test of Difference between Means
• The appropriate test depends on
- whether the samples are related or unrelated
- whether population standard deviations are known or not
- if not, whether they can be assumed to be equal or not
![Page 17: Hypothesis Testing](https://reader033.vdocuments.us/reader033/viewer/2022042704/56814a74550346895db78bd5/html5/thumbnails/17.jpg)
Hypothesis Test of Difference between Means
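• In salary example, the null hypothesis is Ho: $\mu_1 - \mu_2 = c$ (with $c = 0$) and the alternate hypothesis is Ha: $\mu_1 - \mu_2 \neq c$
• Since we have unrelated samples with known (for large samples, we can use sample SD as pop SD) but unequal $\sigma$'s, the standard error of difference in means is
$$S_{\bar{X}_1 - \bar{X}_2} = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} = \sqrt{\frac{5^2}{400} + \frac{(4.80)^2}{576}} = 0.32$$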
![Page 18: Hypothesis Testing](https://reader033.vdocuments.us/reader033/viewer/2022042704/56814a74550346895db78bd5/html5/thumbnails/18.jpg)
Hypothesis Test of Difference between Means
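• The calculated value of z is
$$z_{calc} = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{S_{\bar{X}_1 - \bar{X}_2}}, \qquad |z_{calc}| = \frac{|105.70 - 112.80 - 0|}{0.32} = 22.19$$
• For $\alpha = .01$ and a two-tailed test, the Z-table value is 2.58
• Since $|z_{calc}|$ is greater than $z_{\alpha/2}$, the null hypothesis is rejected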
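The salary comparison can be reproduced as a short sketch. It uses the sample figures from the slides and treats the sample standard deviations as population values, as the slide does for large samples. Because the code keeps the unrounded standard error, the statistic comes out near 22.18 rather than the slide's 22.19, which divides by the rounded 0.32.

```python
from math import sqrt
from statistics import NormalDist

# Sample figures from the salary example (group 1 = males, 2 = females).
n1, mean1, sd1 = 400, 105.70, 5.00
n2, mean2, sd2 = 576, 112.80, 4.80
alpha = 0.01
hypothesized_diff = 0.0            # Ho: mu1 - mu2 = 0

# Standard error of the difference for unrelated samples, unequal sigmas.
se_diff = sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
z_calc = ((mean1 - mean2) - hypothesized_diff) / se_diff
z_crit = NormalDist().inv_cdf(1 - alpha / 2)

reject_h0 = abs(z_calc) > z_crit
print(round(se_diff, 2), round(z_calc, 2), round(z_crit, 2), reject_h0)
```

The sign of the statistic is negative here only because males are taken as group 1; the two-tailed decision depends on its absolute value.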
![Page 19: Hypothesis Testing](https://reader033.vdocuments.us/reader033/viewer/2022042704/56814a74550346895db78bd5/html5/thumbnails/19.jpg)
Hypothesis Testing of Proportion
• Quality control dept of a light bulb company claims 95% of its products are defect free
• The CEO checks 225 bulbs and finds only 87% to be defect free
• Is the claim of 95% true at .05 level of significance?
• So we have hypothesized values $p_o = 0.95$, $q_o = 0.05$ and sample values $p = 0.87$, $q = 0.13$
![Page 20: Hypothesis Testing](https://reader033.vdocuments.us/reader033/viewer/2022042704/56814a74550346895db78bd5/html5/thumbnails/20.jpg)
Hypothesis Testing of Proportion
• The null hypothesis is Ho: $p = 0.95$
• The alternate hypothesis is Ha: $p \neq 0.95$
• First, calculate the standard error of the proportion using hypothesized values as
$$\sigma_p = \sqrt{\frac{p_o q_o}{n}} = \sqrt{\frac{0.95 \times 0.05}{225}} = 0.0145$$
• Since np and nq are large, we can use the Z table. The appropriate z value is 1.96
![Page 21: Hypothesis Testing](https://reader033.vdocuments.us/reader033/viewer/2022042704/56814a74550346895db78bd5/html5/thumbnails/21.jpg)
Hypothesis Testing of Proportion
• The limits of the acceptance region are
$$p_o \pm 1.96\,\sigma_p = 0.95 \pm (1.96 \times 0.0145) = (0.922,\ 0.978)$$
• Since the sample proportion of 0.87 does not fall within the acceptance region, the CEO should reject the quality control department’s claim
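The light bulb example can be checked with a brief sketch that builds the acceptance region from the hypothesized proportion and tests whether the sample proportion lies inside it. Variable names are my own; the inputs are the slide's.

```python
from math import sqrt
from statistics import NormalDist

# Hypothesized and sample values from the light bulb example.
p0, n, p_sample, alpha = 0.95, 225, 0.87, 0.05
q0 = 1 - p0

# Standard error uses the hypothesized proportion, not the sample one.
se_p = sqrt(p0 * q0 / n)
z_crit = NormalDist().inv_cdf(1 - alpha / 2)

lower = p0 - z_crit * se_p
upper = p0 + z_crit * se_p
reject_h0 = not (lower <= p_sample <= upper)
print(round(se_p, 4), round(lower, 3), round(upper, 3), reject_h0)
```

The sample proportion of 0.87 lies below the lower limit, which is why the claim is rejected.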
![Page 22: Hypothesis Testing](https://reader033.vdocuments.us/reader033/viewer/2022042704/56814a74550346895db78bd5/html5/thumbnails/22.jpg)
Hypothesis Testing of Difference between Proportions
• Manager wants to see if John and Linda, two salespeople, have the same conversion
• He picks samples and finds that
Salesperson | Sample size | Number converted | Proportion converted
John | 100 | 84 | 0.84 (= $p_j$)
Linda | 100 | 82 | 0.82 (= $p_l$)
![Page 23: Hypothesis Testing](https://reader033.vdocuments.us/reader033/viewer/2022042704/56814a74550346895db78bd5/html5/thumbnails/23.jpg)
Hypothesis Testing of Difference between Proportions
• Are their conversion rates different at 0.05 significance level?
• The null hypothesis is Ho: $p_j = p_l$
• The alternate hypothesis is Ha: $p_j \neq p_l$
• The best estimate of p (proportion of success) is
$$\hat{p} = \frac{n_1 p_j + n_2 p_l}{n_1 + n_2} = 0.83$$
also, $\hat{q} = 1 - \hat{p} = 0.17$
![Page 24: Hypothesis Testing](https://reader033.vdocuments.us/reader033/viewer/2022042704/56814a74550346895db78bd5/html5/thumbnails/24.jpg)
Hypothesis Testing of Difference between Proportions
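• An estimate of the standard error of the difference of proportions is
$$\hat{\sigma}_{p_j - p_l} = \sqrt{\frac{\hat{p}\hat{q}}{n_1} + \frac{\hat{p}\hat{q}}{n_2}} = 0.053$$
• The z value can be calculated as
$$z_{calc} = \frac{(p_j - p_l) - 0}{\hat{\sigma}_{p_j - p_l}} = 0.38$$
• The z value obtained from the table is 1.96 (for $\alpha = .05$). Thus, we fail to reject the null hypothesis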
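The comparison of John's and Linda's conversion rates can be sketched as a pooled two-proportion z-test. The counts come from the slide's table; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Counts from the salesperson example.
n_john, converted_john = 100, 84
n_linda, converted_linda = 100, 82
alpha = 0.05

p_j = converted_john / n_john
p_l = converted_linda / n_linda

# Pooled estimate of the common proportion under Ho: p_j = p_l.
p_hat = (converted_john + converted_linda) / (n_john + n_linda)
q_hat = 1 - p_hat

se_diff = sqrt(p_hat * q_hat / n_john + p_hat * q_hat / n_linda)
z_calc = ((p_j - p_l) - 0) / se_diff
z_crit = NormalDist().inv_cdf(1 - alpha / 2)

reject_h0 = abs(z_calc) > z_crit
print(round(p_hat, 2), round(se_diff, 3), round(z_calc, 2), reject_h0)
```

A two-point gap in conversion rates on samples of 100 each is far too small relative to the standard error to reject Ho.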
![Page 25: Hypothesis Testing](https://reader033.vdocuments.us/reader033/viewer/2022042704/56814a74550346895db78bd5/html5/thumbnails/25.jpg)
The Probability Values (P-value) Approach to Hypothesis Testing
• P-value provides researcher with alternative method of testing hypothesis without pre-specifying $\alpha$
• Largest level of significance at which we would not reject Ho
![Page 26: Hypothesis Testing](https://reader033.vdocuments.us/reader033/viewer/2022042704/56814a74550346895db78bd5/html5/thumbnails/26.jpg)
The Probability Values (P-value) Approach to Hypothesis Testing
Difference Between Using $\alpha$ and p-value
• Hypothesis testing with a pre-specified $\alpha$
- Researcher is trying to determine, "is the probability of what has been observed less than $\alpha$?"
- Reject or fail to reject Ho accordingly
![Page 27: Hypothesis Testing](https://reader033.vdocuments.us/reader033/viewer/2022042704/56814a74550346895db78bd5/html5/thumbnails/27.jpg)
The Probability Values (P-value) Approach to Hypothesis Testing
Using the p-Value
• Researcher can determine "how unlikely is the result that has been observed?"
• Decide whether to reject or fail to reject Ho without being bound by a pre-specified significance level
• In general, the smaller the p-value, the greater is the researcher's confidence in sample findings
![Page 28: Hypothesis Testing](https://reader033.vdocuments.us/reader033/viewer/2022042704/56814a74550346895db78bd5/html5/thumbnails/28.jpg)
The Probability Values (P-value) Approach to Hypothesis Testing: Example
• Ho: $\mu = 25$ (hypothesized value of population)
• Ha: $\mu \neq 25$ (alternative hypothesis)
• $n = 50$
• $\bar{X} = 25.2$
• $\sigma = 0.7$
• $SE(\bar{X}) = \sigma / \sqrt{n} = 0.1$; $Z = (\bar{X} - \mu) / SE(\bar{X}) = 2$
• From Z-table, prob $Z > 2$ is 0.0228. As this is a 2-tailed test, the p-value is $2 \times 0.0228 = .0456$
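A two-tailed p-value like the one in this example can be computed directly instead of being read from a table. The sketch below does it twice: once with the slide's rounded Z of 2, and once with the unrounded standard error, which gives a slightly larger Z and a slightly smaller p-value. The function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def p_value_two_tailed(z):
    """Probability of a standard normal value at least as extreme as z,
    counting both tails."""
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Values from the p-value example.
mu0, n, x_bar, sigma = 25, 50, 25.2, 0.7

p_slide = p_value_two_tailed(2.0)          # Z rounded to 2, as on the slide
z_exact = (x_bar - mu0) / (sigma / sqrt(n))
p_exact = p_value_two_tailed(z_exact)      # no intermediate rounding

print(round(p_slide, 4), round(z_exact, 3), round(p_exact, 4))
```

Either way the p-value is below 0.05 and above 0.01, so Ho would be rejected at the 0.05 level but not at the 0.01 level.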
![Page 29: Hypothesis Testing](https://reader033.vdocuments.us/reader033/viewer/2022042704/56814a74550346895db78bd5/html5/thumbnails/29.jpg)
The Probability Values (P-value) Approach to Hypothesis Testing
Using the p-Value
• P-value is generally sensitive to sample size
• For a given observed difference, a larger sample yields a lower p-value
• P-value can report the impact of the sample size on the reliability of the results
![Page 30: Hypothesis Testing](https://reader033.vdocuments.us/reader033/viewer/2022042704/56814a74550346895db78bd5/html5/thumbnails/30.jpg)
Relationship between C.I and Hypothesis Testing (Example 1)
• A direct marketer knows that the average number of purchases per month in the entire database is 5.6
• By sampling 'loyals' he finds that their average is 6.1 (i.e., $\bar{X} = 6.1$)
• Is it merely a sampling accident?
• Ho: $\mu = 5.6$ (hypothesized value of population)
• Ha: $\mu \neq 5.6$ (alternative hypothesis)
• $n = 35$
• $\sigma = 2.5$
![Page 31: Hypothesis Testing](https://reader033.vdocuments.us/reader033/viewer/2022042704/56814a74550346895db78bd5/html5/thumbnails/31.jpg)
Relationship between C.I and Hypothesis Testing (Example 1)
• Std err $\sigma_{\bar{X}} = \sigma / \sqrt{n} = 0.42$
• The appropriate Z for $\alpha = .05$ is 1.96
• The Confidence Interval is $\mu_0 \pm 1.96\,\sigma_{\bar{X}} = (4.78, 6.42)$
• Since 6.1 falls in the interval, we cannot reject the null hypothesis
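The interval around the hypothesized mean in this example can be sketched as follows. The inputs are the slide's; the code keeps the unrounded standard error, so the limits differ from the slide's (4.78, 6.42) in the second decimal place.

```python
from math import sqrt
from statistics import NormalDist

# Values from the 'loyals' example.
mu0, x_bar, n, sigma, alpha = 5.6, 6.1, 35, 2.5, 0.05

se = sigma / sqrt(n)
z_crit = NormalDist().inv_cdf(1 - alpha / 2)

# Interval centred on the hypothesized mean; a sample mean inside it
# is consistent with Ho at this significance level.
lower = mu0 - z_crit * se
upper = mu0 + z_crit * se
reject_h0 = not (lower <= x_bar <= upper)
print(round(se, 2), round(lower, 2), round(upper, 2), reject_h0)
```

The decision is the same as comparing the calculated z for 6.1 against 1.96, which is the sense in which the two approaches agree.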
![Page 32: Hypothesis Testing](https://reader033.vdocuments.us/reader033/viewer/2022042704/56814a74550346895db78bd5/html5/thumbnails/32.jpg)
Confidence Intervals and Hypothesis Testing
• Hypothesis testing and Confidence Intervals are two sides of the same coin.
$t = \frac{\bar{X} - \mu}{s_{\bar{x}}}$
$\bar{X} \pm t\, s_{\bar{x}}$ = Interval estimate for $\mu$
![Page 33: Hypothesis Testing](https://reader033.vdocuments.us/reader033/viewer/2022042704/56814a74550346895db78bd5/html5/thumbnails/33.jpg)
Relationship between C.I and Hypothesis Testing (Example 2)
• Revisit the first example we started with
• Test the performance of two lists in terms of response rates
• Sample (1,000) from the first list provides a response rate of 3.5%
• Sample (1,200) from the second list provides a response rate of 4.5%
• Do the two lists (population) really have a difference or is it an artifact of the sample?
![Page 34: Hypothesis Testing](https://reader033.vdocuments.us/reader033/viewer/2022042704/56814a74550346895db78bd5/html5/thumbnails/34.jpg)
Relationship between C.I and Hypothesis Testing (Example 2)
– C.I. of list 1:
• (0.035) +/- 1.96*(SE1)
• SE1 = Sqrt[(0.035*0.965)/1000] = 0.006
• C.I.1 = (0.0232, 0.0467)
– C.I. of list 2:
• (0.045) +/- 1.96*(SE2)
• SE2 = Sqrt[(0.045*0.955)/1200] = 0.006
• C.I.2 = (0.033, 0.0568)
– What can we infer based on these confidence intervals?
• The two intervals overlap, so there is a lack of sufficient evidence to infer that there is any difference between the response rates of the two lists.
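The two list intervals can be reproduced with a small helper. This sketch follows the slide's approach of comparing separate 95% intervals and uses 1.96 as given there; the helper name `proportion_ci` is hypothetical. The standard errors are not rounded to 0.006 before use, so the limits differ from the slide's in the third or fourth decimal place.

```python
from math import sqrt


def proportion_ci(p, n, z=1.96):
    """Confidence interval for a proportion: p +/- z * sqrt(p * (1 - p) / n)."""
    se = sqrt(p * (1 - p) / n)
    return (p - z * se, p + z * se)


ci_list1 = proportion_ci(0.035, 1000)
ci_list2 = proportion_ci(0.045, 1200)

# The intervals overlap when each one starts before the other ends.
overlap = ci_list1[0] <= ci_list2[1] and ci_list2[0] <= ci_list1[1]

print([round(x, 4) for x in ci_list1])
print([round(x, 4) for x in ci_list2])
print(overlap)
```

The upper limit of list 1 lies above the lower limit of list 2, which is the overlap the slide's conclusion rests on.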