6 testing of hypothesis1
TRANSCRIPT
-
8/7/2019 6 TESTING OF HYPOTHESIS1
Sampling and Sampling Distributions
-
A statistical population is the aggregate of all the units pertaining to a study, i.e. it is the set of all elements about which we wish to make inferences.
A sample is a subset of a population.
The process of drawing a sample from a large population is called sampling.
STATISTIC: Characteristic or measure obtained from a sample.
PARAMETER: Characteristic or measure obtained from a population.
A sampling distribution is the probability distribution, under repeated sampling of the population, of a given statistic.
-
Consider a very large population.
Assume we repeatedly take samples of a given size from the population and calculate the sample mean for each sample.
Different samples will lead to different sample means.
The distribution of these means is the sampling distribution of the sample mean.
-
When all of the possible sample means are computed, then the following properties are true:
The mean of the sample means will be the mean of the population (μ).
The variance of the sample means will be the variance of the population divided by the sample size (σ²/n). This holds when sampling with replacement or from a very large population; when sampling without replacement from a small population of size N, the variance is (σ²/n) × (N − n)/(N − 1).
The standard deviation of the distribution of a sample statistic is known as the standard error of the statistic.
The nature of the sampling distribution depends on the distribution of the population and/or the statistic being considered and the sample size used.
-
A population consists of four numbers:
3, 5, 7 and 9
(a) List all possible samples of size 2 that can be drawn from the population without replacement.
(b) Show that the mean of the sampling distribution of sample means is equal to the population mean.
(c) Calculate the standard deviation of the sampling distribution of sample means and hence show that it is less than the population standard deviation.
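The exercise above can be checked numerically. The following is a minimal Python sketch, assuming that samples are unordered (so 3, 5 and 5, 3 count as the same sample) and that standard deviations are computed with the population formula (divide by the number of values). The variable names are illustrative only.

```python
from itertools import combinations
from statistics import mean, pstdev

population = [3, 5, 7, 9]

# (a) All samples of size 2 drawn without replacement (order ignored).
samples = list(combinations(population, 2))

# (b) Mean of each sample, then the mean of those sample means.
sample_means = [mean(s) for s in samples]
mean_of_means = mean(sample_means)

# (c) Standard error: the standard deviation of the sample means,
# taken over all possible samples, compared with the population value.
standard_error = pstdev(sample_means)
population_sd = pstdev(population)

print(samples)
print(mean_of_means, mean(population))
print(round(standard_error, 3), round(population_sd, 3))
```

There are six possible samples, and the mean of their means equals the population mean of 6. The standard error comes out as the square root of 5/3, which is smaller than the population standard deviation (the square root of 5).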
-
Testing of Hypothesis
-
A hypothesis is an assumption about a population.
A few examples are as follows:
1. Mean purchases made by females (μ1) is more than or equal to the mean purchases made by males (μ2) in a textile store (μ1 ≥ μ2).
2. Mean age of female shoppers (μ1) is less than or equal to that of male shoppers (μ2) in a book exhibition (μ1 ≤ μ2).
3. Mean monthly income of buyers (μ) in a shop is more than or equal to Rs 10000/- (μ ≥ 10000).
4. The mean stay-over time of customers (μ) in a shop is at most 45 minutes (μ ≤ 45).
-
Definitions
Parameter: It is a function of population values.
Statistic: It is a function of sample values.
Null Hypothesis: It is an assumption about the population parameter which is the statement of no change. It is denoted by H0.
Alternate Hypothesis: It is the statement or assumption which can be considered to be the alternative to the null hypothesis. It is denoted by H1.
-
As long as there is no apparent contradiction to the null hypothesis, we retain this belief. But when we find observations contradicting it, there is a reason to suspect the validity of this null hypothesis, and the problem of testing the null hypothesis arises.
When we proceed to test H0, we must be aware of the assumption that is expected to be valid if the null hypothesis turns out to be invalid. This assumption is known as the alternative hypothesis.
-
H0: The mean I.Q. of all persons in a city is 105
H1: The mean I.Q. of all persons in the city is 100
(if it is known that the mean I.Q. is 105 or 100 and nothing else)
OR
H1: The mean I.Q. of all the persons in the city is less than 105
(if it is known that the mean I.Q. is not more than 105)
OR
H1: The mean I.Q. of all the persons in the city is more than 105
(if it is known that the mean I.Q. is not less than 105)
OR
H1: The mean I.Q. of all the persons in the city is not equal to 105
(if no such information is available)
-
The first thing to do when given a claim is to write the claim mathematically (if possible), and decide whether the given claim is the null or alternative hypothesis.
If the given claim contains equality, or a statement of no change from the given or accepted condition, then it is the null hypothesis; otherwise, if it represents change, it is the alternative hypothesis.
-
Example
"He's dead," said Dr. X to Captain K.
Mr. S, as the science officer, is put in charge of statistically determining the correctness of X's statement and deciding the fate of the crew member (to vaporize or try to revive).
His first step is to arrive at the hypothesis to be tested.
Does the statement represent a change in previous condition?
Yes, there is change, thus it is the alternative hypothesis, H1.
No, there is no change, therefore it is the null hypothesis, H0.
-
The correct answer is that there is change.
Dead represents a change from the accepted state of alive.
The null hypothesis always represents no change.
Therefore, the hypotheses are:
H0: Patient is alive.
H1: Patient is not alive (dead).
-
PROCEDURE IN HYPOTHESIS TESTING
1. Formulate the Hypothesis: Set up a null hypothesis based on the belief and an appropriate alternate hypothesis.
2. Set up a Suitable Significance Level: The confidence with which a null hypothesis is rejected or accepted depends upon the significance level used for the purpose.
A level of significance of, say, 5% means that the risk of wrongly rejecting a true null hypothesis is at most 5 in 100 cases. The levels of significance most widely used are 5% and 1%. Thus, a 1% level of significance provides greater confidence in a decision to reject than a 5% significance level, as the risk of wrongly rejecting is at most 1 in 100 cases. It is denoted by the Greek letter alpha (α), where (1 − α) is the CONFIDENCE LEVEL.
-
3. Select Test Criterion: The test criterion is selected on the basis of sample size. If the sample is large (n ≥ 30), the z-test implying normal distribution is used; whereas if the sample size is small (n < 30), the t-test is more suitable. The most commonly used tests are z, t, F and χ².
A corresponding TEST STATISTIC is calculated.
4. Decision Criterion: The Test Statistic calculated in the previous step is now classified to fall within the acceptance region or the rejection region at the given level of significance. Accordingly the null hypothesis is accepted or rejected.
5. Conclusion: On the basis of the decision the conclusion is stated.
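The five steps can be traced in a short Python sketch. It reuses the stay-over time claim (mean at most 45 minutes) as H0 and runs a right-tailed z-test. The sample size, sample mean and population standard deviation are hypothetical values chosen for illustration; they are not from the slides.

```python
from math import sqrt
from statistics import NormalDist

# Step 1: hypotheses. H0: mu <= 45 minutes, H1: mu > 45 minutes (right-tailed).
mu0 = 45.0

# Hypothetical sample summary (assumed values, for illustration only).
n = 36              # large sample (n >= 30), so a z-test is used
sample_mean = 47.0
sigma = 6.0         # population standard deviation, assumed known

# Step 2: level of significance.
alpha = 0.05

# Step 3: test statistic z = (sample mean - mu0) / standard error.
standard_error = sigma / sqrt(n)
z = (sample_mean - mu0) / standard_error

# Step 4: the rejection region of a right-tailed test is z > z_critical.
z_critical = NormalDist().inv_cdf(1 - alpha)
reject_h0 = z > z_critical

# Step 5: conclusion.
print(f"z = {z:.3f}, critical value = {z_critical:.3f}")
print("Reject H0" if reject_h0 else "Do not reject H0")
```

With these assumed numbers the test statistic is 2.0, which falls in the rejection region beyond the critical value of about 1.645, so H0 would be rejected at the 5% level.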
-
ERRORS IN DECISION MAKING
The problem of testing a hypothesis is actually a problem of deciding whether to accept or to reject the null hypothesis H0, in favor of the alternate hypothesis H1.
The decision of rejecting or accepting the null hypothesis is taken on the basis of observations made only on a sample of units selected from the population. This decision cannot always be correct. When this decision is not correct, an error is said to occur.
-
States of nature are something that you, as a decision maker, have no control over.
Either it is, or it isn't. This represents the true nature of things.
Possible states of nature (Based on H0)
Crew member is alive (H0 true / H1 false)
Crew member is dead (H0 false / H1 true)
-
Decisions are something that you have control over.
You may make a correct decision or an incorrect decision.
It depends on the state of nature as to whether your decision is correct or incorrect.
Possible decisions (Based on H0) / conclusions (Based on claim)
Reject H0 if sufficient evidence is available to say the patient is dead
Fail to Reject H0 if sufficient evidence is not available to say the patient is dead
-
Statistically speaking

                     State of Nature
Decision             H0 True                          H0 False
Reject H0            Patient is alive,                Patient is dead,
                     sufficient evidence of death     sufficient evidence of death
Fail to Reject H0    Patient is alive,                Patient is dead,
                     insufficient evidence of death   insufficient evidence of death
-
                              State of Nature
Decision                      Crew member alive    Crew member dead
Vaporize the crew member      Error                Right decision
Try to revive crew member     Right decision       Error
-
The following table gives the possibilities that exist in reality.

                      Null Hypothesis H0 is
Decision              True            Not True
Reject H0             Type I Error    No Error
Do not reject H0      No Error        Type II Error
-
Type I Error
Reject H0, when H0 is True
Type II Error
Do Not Reject H0, when H0 is Not True
Which of the two errors is more serious?
Type I or Type II?
-
Level of significance
To design a good test we would like to arrive at a decision criterion in such a way that neither of the two errors (Type I Error and Type II Error) occurs.
But when P(Type I Error) → 0, P(Type II Error) → 1, and when P(Type II Error) → 0, P(Type I Error) → 1.
Hence, no test can be perfect. We therefore design a test such that one of the two probabilities is restricted to a small value α (0 < α < 1 and α is closer to 0) and then minimize the probability of the other error.
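The trade-off between the two error probabilities can be seen numerically. The sketch below assumes a right-tailed z-test of H0: μ = 105 against one specific alternative, μ = 108, with a known standard deviation. All of these numbers, and the helper name type_ii_error, are hypothetical and chosen only to illustrate the idea.

```python
from math import sqrt
from statistics import NormalDist

# Hypothetical setting (all values assumed): right-tailed z-test of
# H0: mu = 105 against the specific alternative H1: mu = 108.
mu0, mu1 = 105.0, 108.0
sigma, n = 15.0, 50
standard_error = sigma / sqrt(n)
std_normal = NormalDist()

def type_ii_error(alpha):
    """P(do not reject H0 | mu = mu1) for a right-tailed z-test at level alpha."""
    # H0 is rejected when the sample mean exceeds this cut-off.
    cutoff = mu0 + std_normal.inv_cdf(1 - alpha) * standard_error
    # Under H1 the sample mean is Normal(mu1, standard_error).
    return NormalDist(mu1, standard_error).cdf(cutoff)

alphas = [0.10, 0.05, 0.01, 0.001]
betas = [type_ii_error(a) for a in alphas]
for a, b in zip(alphas, betas):
    print(f"alpha = {a:<6} beta = {b:.3f}")
```

As the permitted probability of a Type I Error shrinks, the probability of a Type II Error grows, which is why only one of the two is fixed in advance for a given sample size.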
-
The error in rejecting H0 when it is true (Type I Error) is treated as a more serious error than Type II Error. Therefore an upper limit is put on P(Type I Error), and P(Type II Error) is simultaneously minimized. This upper limit is known as the level of significance.
Thus, a test is so designed that
P(Type I Error) ≤ α
then α is called the level of significance.
Hence, α = Max. P(Type I Error).
-
DECISION CRITERION
If the p-value of the test statistic is less than the level of significance α, reject H0.
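The p-value rule can be illustrated with a two-tailed z-test of H0: μ = 105 against H1: μ ≠ 105. This is a minimal sketch: the sample summary and the helper name two_sided_p_value are hypothetical, and the population standard deviation is assumed to be known.

```python
from math import sqrt
from statistics import NormalDist

def two_sided_p_value(z):
    """Probability, under H0, of a z statistic at least as extreme as the one observed."""
    return 2 * (1 - NormalDist().cdf(abs(z)))

# Hypothetical sample summary (assumed values, for illustration only).
mu0 = 105.0
n, sample_mean, sigma = 64, 101.5, 15.0
alpha = 0.05

z = (sample_mean - mu0) / (sigma / sqrt(n))
p_value = two_sided_p_value(z)
reject_h0 = p_value < alpha

print(f"z = {z:.3f}, p-value = {p_value:.4f}")
print("Reject H0" if reject_h0 else "Do not reject H0")
```

With these assumed numbers the p-value is a little above 0.05, so H0 is not rejected at the 5% level, although it would be rejected at a 10% level.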
-
Distributions used in testing of hypothesis
In order to test different parameters, for different sample sizes and comparisons of such parameters for multiple populations, different statistical distributions are used.
-
Testing of Hypotheses
Testing mean Testing variance
-
Testing mean
Single sample
Sample size ≥ 30: Z test
Sample size < 30: t test
-
Testing variance
Single sample: Chi Square test
Two samples: F test
-
For testing association between two variables, the Chi-Square test for Independence of Attributes is used.
Expected frequencies are calculated using the following formula:
E = (RT × CT) / N
where RT = row total, CT = column total and N = total number of observations.
O = Observed frequencies
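The expected frequency formula (row total times column total, divided by the grand total) can be applied cell by cell. The sketch below uses a hypothetical 2 × 2 table of observed frequencies and also computes the usual chi-square statistic, the sum of (O − E)² / E over all cells. The data are invented for illustration.

```python
# Hypothetical 2 x 2 table of observed frequencies (rows and columns are
# the categories of the two attributes); the counts are assumed values.
observed = [
    [40, 10],
    [20, 30],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected frequency of each cell: (row total x column total) / grand total.
expected = [
    [rt * ct / grand_total for ct in col_totals]
    for rt in row_totals
]

# Chi-square statistic: sum over all cells of (O - E)^2 / E.
chi_square = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

# Degrees of freedom: (number of rows - 1) x (number of columns - 1).
degrees_of_freedom = (len(observed) - 1) * (len(observed[0]) - 1)

print(expected)
print(round(chi_square, 3), degrees_of_freedom)
```

The statistic is then compared with the chi-square critical value for the computed degrees of freedom at the chosen level of significance.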
-
For fitting a distribution to a given data
Chi-Square test for Goodness of Fit is used
Expected frequencies are calculated
depending upon the distribution.
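As an illustration of goodness of fit, the sketch below tests whether a die is fair, so the fitted distribution is uniform and every face has the same expected frequency. The observed counts are hypothetical. The critical value 11.070 is the standard chi-square table value for 5 degrees of freedom at the 5% level, entered here as a constant instead of being computed.

```python
# Hypothetical counts of faces 1 to 6 in 60 rolls of a die (assumed data).
observed = [8, 12, 9, 11, 6, 14]
total_rolls = sum(observed)

# Under H0 (fair die) the distribution is uniform, so each face is
# expected total_rolls / 6 times.
expected = [total_rolls / len(observed)] * len(observed)

# Chi-square statistic: sum of (O - E)^2 / E over all categories.
chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Degrees of freedom: categories - 1 (no parameters were estimated).
degrees_of_freedom = len(observed) - 1

# Table value of chi-square for 5 degrees of freedom at alpha = 0.05.
critical_value = 11.070
reject_h0 = chi_square > critical_value

print(round(chi_square, 3), degrees_of_freedom)
print("Reject H0" if reject_h0 else "Do not reject H0")
```

For other distributions the expected frequencies would instead be the total count multiplied by the fitted probability of each class, and one degree of freedom is lost for each parameter estimated from the data.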
-
Thank You