
Chapter VI Hypothesis Testing

Inferential statistics is concerned with drawing conclusions or generalizations about a population based on a sample drawn from that population. There are two areas of inferential statistics: estimation, which is concerned with determining the true value of population parameters; and hypothesis testing, which is concerned with determining the validity of assertions about population parameters. These assertions take the form of statistical hypotheses.

Basic Concepts in Statistical Hypothesis Testing

Statistical Hypothesis

an assertion, statement or conjecture concerning the value(s) of one or more unknown parameters of the population

Example: The average score of pupils in classes with provision of multimedia is higher than the average score of pupils in a traditional classroom setting.

Hypotheses are classified into two types: null and alternative.

Types of Statistical Hypothesis

Null Hypothesis

denoted by H0

the statement being tested

a hypothesis of equality, no difference, or no improvement

Alternative Hypothesis

denoted by Ha or H1

a hypothesis believed to be true when the null hypothesis is rejected

Example: The manager of a certain bank claims that their non-ATM customers need to wait, on the average, at most ten minutes before they are served. The null and alternative hypotheses are:

H0: Non-ATM customers need to wait an average of at most ten minutes before they are served. That is, μ ≤ 10.

Ha: Non-ATM customers need to wait an average of more than ten minutes before they are served. That is, μ > 10.

Example: There is no difference between the performance of pupils in a traditional classroom setting (B) and in classes with multimedia facilities (A).

H0: The average grade of pupils in classes with multimedia is equal to the average grade of pupils in a traditional classroom. That is, μA = μB.

Ha: The average grade of pupils in classes with multimedia is not equal to the average grade of pupils in a traditional classroom. That is, μA ≠ μB.

Example: There is no difference between the performance of pupils in a traditional classroom setting (B) and in classes with multimedia facilities (A).

H0: The average grade of pupils in classes with multimedia is equal to the average grade of pupils in a traditional classroom. That is, μA = μB.

Ha: The average grade of pupils in classes with multimedia is greater than the average grade of pupils in a traditional classroom. That is, μA > μB.


Information provided by the random sample is used to decide whether a hypothesis is likely to be true or false by performing a test of hypothesis.

Test of Hypothesis

a statistical tool used to decide whether or not to reject a statistical hypothesis

Tests of hypothesis are classified into two types: one-tailed and two-tailed tests.

Types of Tests of Hypothesis

One-tailed Test

used to test a null hypothesis against a directional alternative hypothesis

Two-tailed Test

used to test a null hypothesis against a non-directional alternative hypothesis

Types of Alternative Hypothesis

Non-directional Hypothesis

a statement which asserts that one value is different from another

Directional Hypothesis

an assertion that one measure is less than (or greater than) another measure of similar nature

In testing a statistical hypothesis, the null hypothesis is evaluated on the basis of a random sample. Four possibilities exist in connection with the decision procedure of hypothesis testing.

                        Fact
Decision       H0 is true          H0 is false
Accept H0      Correct Decision    Type II Error
Reject H0      Type I Error        Correct Decision

Incorrect decisions are possible in hypothesis testing. These errors are called Type I and Type II errors.

Types of Errors

Type I Error

error of rejecting a true null hypothesis

Type II Error

error of accepting (failing to reject) a false null hypothesis

The risk of committing a Type I error is measured by its probability, denoted by α, while the probability of committing a Type II error is denoted by β.

α and β are inversely related. For a fixed sample size, as α increases, β decreases. Both α and β can be reduced by increasing the sample size. In practice it is impossible to tell whether a correct decision has been made or an error has been committed. The most that can be done is to attach some degree of confidence to a decision.


Level of Significance

denoted by α

also known as the maximum probability of committing a Type I error

a measure of the degree of confidence with a decision

Usually, α is freely determined by the researcher. The choice of α depends on the consequences of making a Type I error. Common choices of α are 0.10, 0.05, and 0.01.

The decision whether or not to reject the null hypothesis is based on the information provided by the sample. This information can be in the form of a test statistic or a p-value, since both measure the agreement between the sample data and the null hypothesis.

Measures in Decision Making

Test Statistic

a statistic computed from the sample data that is sensitive to the difference between H0 and Ha

p-value

the probability of obtaining a value of the test statistic at least as extreme as the one observed (in the direction of Ha), assuming that H0 is true

The test statistic assumes a set of values divided into two regions. One region, called the rejection region or critical region, consists of values that support H1 and lead to the rejection of H0. The other, called the non-rejection or "acceptance" region, consists of values that support H0.

Specifying α determines a critical value that defines the regions of rejection and non-rejection, while the type of test determines the location of the critical region. In a two-tailed test, the critical region is split into two equal parts, one in each tail of the distribution of the test statistic. In a one-tailed test, the critical region lies entirely in one tail of the distribution, with the inequality symbol in Ha pointing in its direction.

Example: If Ha is formulated in such a way that the less than (<) sign is appropriate, then the critical region lies in the left tail.

[Figure: critical region in the left tail, bounded by the critical value]

If Ha is formulated in such a way that the greater than (>) sign is appropriate, then the critical region lies in the right tail.

[Figure: critical region in the right tail, bounded by the critical value]


If Ha is formulated in such a way that the not equal to (≠) sign is appropriate, then the critical region is split between the two tails.

[Figure: critical regions in both tails, bounded by the two critical values]

If the observed value of the test statistic falls into the critical region, then H0 is rejected. If the observed value of the test statistic falls into the non-rejection region, then do not reject H0.

The p-value is an alternative measure that can be used to arrive at a decision. A small p-value (close to zero) indicates that the sample is not consistent with the null hypothesis. That is, the observed value of the test statistic lies far from what is expected under the hypothesized value of the parameter of interest. On the other hand, a large p-value indicates agreement between the sample data and the null hypothesis.

Decision Rule Based on Test Statistic

Reject H0 if the computed value of the test statistic falls into the critical region. Otherwise, fail to reject H0.

Decision Rule Based on p-value

Reject H0 if the p-value is less than or equal to the specified level of significance α. Otherwise, fail to reject H0.

p-value > 0.05: fail to reject H0 (not significant)
p-value ≤ 0.05: reject H0 (significant)
p-value ≤ 0.01: reject H0 (highly significant)

Steps in Hypothesis Testing

1. Formulate the null hypothesis H0 and the alternative hypothesis H1.
2. Decide on a level of significance α.
3. Decide on the type of data to be collected and choose an appropriate test statistic and testing procedure.
4. State the decision rule.
5. Compute the value of the test statistic from the sample data.
6. Make a decision.
7. Interpret the results.
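The decision rules above can be sketched in a few lines of Python for the special case of a Z test. This is only an illustration: the function names, the tail labels ("left", "right", "two") and the observed value z = 2.10 are assumptions made for the example, not part of the handout, and the standard normal distribution is used because it ships with Python's standard library.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution (mean 0, sd 1)


def critical_values(alpha, tail):
    """Critical value(s) of a z test for a given alpha and tail type."""
    if tail == "left":       # Ha uses <
        return (STD_NORMAL.inv_cdf(alpha),)
    if tail == "right":      # Ha uses >
        return (STD_NORMAL.inv_cdf(1 - alpha),)
    if tail == "two":        # Ha uses !=, alpha is split equally
        z = STD_NORMAL.inv_cdf(1 - alpha / 2)
        return (-z, z)
    raise ValueError("tail must be 'left', 'right' or 'two'")


def p_value(z, tail):
    """p-value of an observed z statistic in the direction of Ha."""
    if tail == "left":
        return STD_NORMAL.cdf(z)
    if tail == "right":
        return 1 - STD_NORMAL.cdf(z)
    if tail == "two":
        return 2 * (1 - STD_NORMAL.cdf(abs(z)))
    raise ValueError("tail must be 'left', 'right' or 'two'")


def decide(p, alpha=0.05):
    """Decision rule based on the p-value."""
    return "reject H0" if p <= alpha else "fail to reject H0"


# Hypothetical observed statistic, for illustration only.
z_obs = 2.10
print(critical_values(0.05, "two"))      # roughly (-1.96, 1.96)
print(decide(p_value(z_obs, "two")))     # z_obs lies beyond 1.96
```

Both routes lead to the same decision: an observed statistic that falls in the critical region always has a p-value no larger than α.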


Tests of Hypothesis for the Difference of Two Population Means

Independent vs Related Samples/Groups

It is important to identify the type of samples or groups one is working with, because the type of group dictates the type of analysis to perform on the data.

Related or dependent samples arise from the following study designs:

1. Pre-post test design: A group of subjects (students, employees) is evaluated using the same (or a similar) instrument or test at two different time periods (pre- and post-).

2. Matched-pairs design: Two matched groups of subjects are used in this design. Some form of matching is applied to the subjects in each group to ensure that subjects in the first group are as nearly similar as possible to the subjects in the second group. An example is the use of identical twins: one twin is assigned to a treatment group (classroom with multimedia) and the other twin to the control group (traditional classroom).

Any study design involving two groups that does not follow either of the above related-samples designs is said to use independent groups or samples.

Case 1. Independent Samples

Assumptions:

1. Both samples are random samples from normal populations.
2. The two samples are independent of one another.

Test Procedure:

Hypotheses: H0: μ1 = μ2 vs (a) H1: μ1 ≠ μ2

(b) H1: μ1 > μ2

(c) H1: μ1 < μ2

Test Statistic: T test for small samples (Z test for large samples)

In the formulas below, d0 is the hypothesized difference between the two population means (d0 = 0 under H0: μ1 = μ2).

$$ z = \frac{(\bar{x}_1 - \bar{x}_2) - d_0}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}} $$

if the samples were drawn from normal populations with σ1² and σ2² known, or whenever both n1 and n2 exceed 30;

or

$$ t = \frac{(\bar{x}_1 - \bar{x}_2) - d_0}{s_p \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}} $$

if the samples were drawn from approximately normal populations with σ1² = σ2² unknown, where

$$ s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} $$

is the pooled estimate of σ², and with df = n1 + n2 – 2;

or

$$ t = \frac{(\bar{x}_1 - \bar{x}_2) - d_0}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} $$

if the samples were drawn from approximately normal populations with σ1² ≠ σ2² unknown, with

$$ df = \frac{\left(\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)^2}{\dfrac{\left(s_1^2/n_1\right)^2}{n_1 - 1} + \dfrac{\left(s_2^2/n_2\right)^2}{n_2 - 1}} $$
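As a rough illustration of the pooled-variance case, the sketch below computes the pooled variance, the t statistic and its degrees of freedom from summary statistics. The function name `pooled_t` and the numbers passed to it are hypothetical and chosen only so the arithmetic is easy to follow by hand.

```python
import math


def pooled_t(mean1, s1, n1, mean2, s2, n2, d0=0.0):
    """Pooled-variance t statistic and its degrees of freedom."""
    df = n1 + n2 - 2
    # Pooled estimate of the common variance.
    sp2 = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / df
    # Standard error of the difference in sample means.
    se = math.sqrt(sp2) * math.sqrt(1 / n1 + 1 / n2)
    return ((mean1 - mean2) - d0) / se, df


# Hypothetical summary statistics, for illustration only.
t, df = pooled_t(mean1=12.0, s1=2.0, n1=10, mean2=10.0, s2=2.0, n2=10)
print(round(t, 4), df)  # prints 2.2361 18
```

With equal standard deviations of 2 the pooled variance is 4, so the standard error is 2·√0.2 and t works out to √5.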

Example: A firm has a generous but rather complicated policy concerning end-of-year bonuses for its lower-level managerial personnel. The policy's key factor is a subjective judgment of "contribution to corporate goals." A personnel officer took samples of 24 female and 36 male managers to see whether there was any difference in bonuses, expressed as a percentage of yearly salary. The data are given below.

Female

9.2 8.0 8.4 7.7 9.9 9.6 10.9 6.9

8.4 9.0 9.0 9.3 9.0 8.4 9.1 8.4

7.6 9.2 7.4 9.1 6.7 7.7 6.2 8.7

Male

10.4 9.0 8.7 9.3 8.9 9.1 9.6 10.4 10.1

9.2 8.7 10.5 7.9 9.7 9.0 8.7 9.9 8.9

9.4 10.0 9.2 9.6 9.8 10.1 9.4 9.2 9.0

9.7 9.9 9.2 10.4 8.9 9.0 8.8 9.6 9.0

Objective: Determine if male and female managers received the same percent bonuses.

Analysis:

1. Test if the data for the two samples are approximately normal.

Ho: Percent bonuses (both for males and females) are normally distributed.

Ha: Percent bonuses (both for males and females) are not normally distributed.

Test of normality in STATA

1. From the menu at the top of the screen, click on Statistics > Summaries, tables, and tests > Distributional plots and tests > Shapiro-Wilk normality test.

2. In the Main tab of the dialog box, select Pctbonus in the pull down menu under the Variables box.

3. Click the by/if/in tab and check the box Repeat command by groups. In the pull down menu under Variables that define groups select Sex.

4. Click OK.

Command

. by Sex, sort : swilk Pctbonus

OUTPUT

-> Sex = Female

                Shapiro-Wilk W test for normal data

    Variable |   Obs        W         V         z     Prob>z
    Pctbonus |    24   0.97755     0.606    -1.022   0.84670

-> Sex = Male

                Shapiro-Wilk W test for normal data

    Variable |   Obs        W         V         z     Prob>z
    Pctbonus |    36   0.96911     1.126     0.249   0.40174


Interpretation: Since the p-values (0.84670 for females and 0.40174 for males) are both greater than the 0.05 level of significance, we fail to reject the null hypothesis (Ho). The data give no evidence against normality, so the percent bonuses of males and females may be treated as approximately normally distributed.

2. Test of equality of variances (homogeneity of variances)

Ho: The variance of the percent bonuses of males is equal to the variance of the percent bonuses of females.

Ha: The variance of the percent bonuses of males is not equal to the variance of the percent bonuses of females.

Test of equality of two variances in STATA

1. From the menu at the top of the screen, click on Statistics > Summaries, tables, and tests > Classical tests of hypotheses > Variance-comparison test.
2. In the Main tab of the dialog box, tick the Two-sample using groups radio button.
3. Select Pctbonus in the pull down menu under the Variable name box and Sex under the Group variable name.
4. Click OK.

Command

. sdtest Pctbonus, by(Sex)

OUTPUT

Variance ratio test

   Group |  Obs       Mean   Std. Err.   Std. Dev.   [95% Conf. Interval]
  Female |   24   8.491667    .2200886     1.07821    8.036379   8.946955
    Male |   36   9.394444    .0991987    .5951924     9.19306   9.595829
combined |   60   9.033333    .1198791    .9285795    8.793456   9.273211

    ratio = sd(Female) / sd(Male)                         f = 3.2816
Ho: ratio = 1                            degrees of freedom = 23, 35

    Ha: ratio < 1           Ha: ratio != 1            Ha: ratio > 1
  Pr(F < f) = 0.9992    2*Pr(F > f) = 0.0015      Pr(F > f) = 0.0008

Interpretation: Since the two-tailed p-value (2*Pr(F > f) = 0.0015) is less than the 0.05 level of significance, reject the null hypothesis (Ho). Therefore, the variance of the percent bonuses of males is not equal to the variance of the percent bonuses of females.

3. Independent samples T test

Having checked that the data are approximately normally distributed and having observed that the variances of the two groups are unequal, we now proceed to test whether there is a significant difference in the average percent bonuses of male and female managers.

Ho: There is no significant difference in the mean percent bonus of male and female managers.

Ha: There is a significant difference in the mean percent bonus of male and female managers.

Independent Samples T-test in STATA

1. From the menu at the top of the screen, click on Statistics > Summaries, tables, and tests > Classical tests of hypotheses > t-test (mean-comparison test).
2. In the Main tab of the dialog box, tick the Two-sample using groups radio button.
3. Select Pctbonus in the pull down menu under the Variable name box and Sex under the Group variable name.
4. Check the Unequal variances box.
5. Click OK.

Command

. ttest Pctbonus, by(Sex) unequal

OUTPUT

Two-sample t test with unequal variances

   Group |  Obs       Mean   Std. Err.   Std. Dev.   [95% Conf. Interval]
  Female |   24   8.491667    .2200886     1.07821    8.036379   8.946955
    Male |   36   9.394444    .0991987    .5951924     9.19306   9.595829
combined |   60   9.033333    .1198791    .9285795    8.793456   9.273211
    diff |        -.9027778   .2414113                -1.39427  -.4112859

    diff = mean(Female) - mean(Male)                          t = -3.7396
Ho: diff = 0                 Satterthwaite's degrees of freedom = 32.415

    Ha: diff < 0             Ha: diff != 0              Ha: diff > 0
 Pr(T < t) = 0.0004    Pr(|T| > |t|) = 0.0007     Pr(T > t) = 0.9996


Interpretation: Since the two-tailed p-value (0.0007) is less than the 0.05 level of significance, we reject the null hypothesis. Therefore, there is a significant difference in the average percent bonus received by male and female managers. In fact, male managers received a higher mean percent bonus (mean = 9.394) than female managers (mean = 8.492).
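The unequal-variances t statistic and its Satterthwaite degrees of freedom can be cross-checked by hand from the group means, standard deviations and sample sizes that Stata reports for this example. The sketch below does this in Python; the function name `welch_t` is an assumption made for illustration, and because the inputs are rounded summary values, the results agree with Stata only up to rounding.

```python
import math


def welch_t(mean1, s1, n1, mean2, s2, n2, d0=0.0):
    """Unequal-variances t statistic with Satterthwaite's df."""
    v1 = s1 ** 2 / n1  # squared standard error of the first mean
    v2 = s2 ** 2 / n2  # squared standard error of the second mean
    t = ((mean1 - mean2) - d0) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


# Summary statistics copied from the Stata output (Female first, then Male).
t, df = welch_t(8.491667, 1.07821, 24, 9.394444, 0.5951924, 36)
print(t, df)  # approximately -3.74 and 32.4
```

The p-value itself requires the t distribution, which is not in Python's standard library, so this sketch stops at the statistic and its degrees of freedom.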

Case 2: Related or Paired Samples

Assumptions:

1. The two samples are related to one another.
2. The random sample of n differences is from a normal population.

Test Procedure:

Hypotheses: H0: μ1 = μ2 vs (a) H1: μ1 ≠ μ2

(b) H1: μ1 > μ2

(c) H1: μ1 < μ2

Test Statistic:

$$ t = \frac{\bar{d}}{s_d / \sqrt{n}} $$

with df = n – 1, where d̄ and s_d are the mean and the standard deviation of the n pairwise differences.

Example: A study was designed to measure the effect of home environment on academic

achievement of 12-year-old students. Because genetic differences may also contribute to academic achievement, the researcher wanted to control for this factor. Thirty sets of identical twins were identified who had been adopted prior to their first birthday, with one twin placed in a home in which academics were emphasized (Academic) and the other twin placed in a home in which academics were not emphasized (Nonacademic). The final grades (based on 100 points) for the 60 students are given below.

Twin   Academic   Non-Academic      Twin   Academic   Non-Academic
  1       78           71            16       90           88
  2       75           70            17       89           80
  3       68           66            18       73           65
  4       92           85            19       61           60
  5       55           60            20       76           74
  6       74           72            21       81           76
  7       65           57            22       89           78
  8       80           75            23       82           78
  9       98           92            24       70           62
 10       52           56            25       68           73
 11       67           63            26       74           73
 12       55           52            27       85           75
 13       49           48            28       97           88
 14       66           67            29       95           94
 15       75           70            30       78           75

Objective: Determine if home environment has an effect on academic achievement. Specifically, determine if students in an academically oriented home environment perform better than students in a non-academically oriented home environment.


Analysis:

1. Test if the pairwise differences in the grades are normally distributed.

Computing pairwise differences

a) In the Command window, type gen diff = Academic - NonAcademic.
b) Hit ENTER.

Test of normality of pairwise differences using STATA

1. From the menu at the top of the screen, click on Statistics > Summaries, tables, and tests > Distributional plots and tests > Shapiro-Wilk normality test.
2. In the Main tab of the dialog box, select diff in the pull down menu under the Variables box.
3. Click OK.

Command

. swilk diff

OUTPUT

                Shapiro-Wilk W test for normal data

    Variable |   Obs        W         V         z     Prob>z
        diff |    30   0.96539     1.100     0.197   0.42178

Interpretation: The p-value (0.42178) is greater than the 0.05 level of significance, hence the null hypothesis is not rejected. There is no evidence against normality, so the pairwise differences may be treated as approximately normally distributed.

2. Paired samples T test

Ho: There is no significant difference in the mean grade of students in academically oriented home environment and non-academically oriented home environment.

Ha: The mean grade of students in academically oriented home environment is significantly higher than the mean grade of students in non-academically oriented home environment.

Paired-samples T-test using STATA

1. From the menu at the top of the screen, click on Statistics > Summaries, tables, and tests > Classical tests of hypotheses > t-test (mean-comparison test).
2. In the Main tab of the dialog box, tick the Paired radio button.
3. Select Academic in the pull down menu under the First variable box and NonAcademic under the Second variable box.
4. Click OK.

Command

. ttest Academic == NonAcademic

OUTPUT

Paired t test

Variable |  Obs       Mean   Std. Err.   Std. Dev.   [95% Conf. Interval]
Academic |   30   75.23333    2.426237    13.28905    70.27112   80.19555
NonAca~c |   30   71.43333    2.085904    11.42497    67.16718   75.69949
    diff |   30        3.8    .7677404    4.205087    2.229795   5.370205

     mean(diff) = mean(Academic - NonAcademic)               t = 4.9496
 Ho: mean(diff) = 0                          degrees of freedom = 29

 Ha: mean(diff) < 0       Ha: mean(diff) != 0       Ha: mean(diff) > 0
 Pr(T < t) = 1.0000    Pr(|T| > |t|) = 0.0000    Pr(T > t) = 0.0000

Interpretation: The mean grade of students in academically oriented home environment is significantly higher than the mean grade of students in non-academically oriented home environment (one-sided p < 0.001). On the average, grades of students in academically oriented home environment are 3.8 points higher than grades of students in non-academically oriented home environment. This suggests that home environment has a significant effect on academic achievement.
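The paired t statistic for the twin data can also be reproduced directly from the table of grades, which makes the link between the formula t = d̄ / (s_d/√n) and the Stata result explicit. The following is a minimal sketch in Python using only the standard library; the variable names are chosen here for illustration.

```python
import math
from statistics import mean, stdev

# Final grades from the twin table, listed in twin order 1 to 30.
academic = [78, 75, 68, 92, 55, 74, 65, 80, 98, 52, 67, 55, 49, 66, 75,
            90, 89, 73, 61, 76, 81, 89, 82, 70, 68, 74, 85, 97, 95, 78]
nonacademic = [71, 70, 66, 85, 60, 72, 57, 75, 92, 56, 63, 52, 48, 67, 70,
               88, 80, 65, 60, 74, 76, 78, 78, 62, 73, 73, 75, 88, 94, 75]

# Pairwise differences, as in: gen diff = Academic - NonAcademic
diff = [a - b for a, b in zip(academic, nonacademic)]

n = len(diff)
d_bar = mean(diff)              # mean of the differences
s_d = stdev(diff)               # sample standard deviation (n - 1 divisor)
t = d_bar / (s_d / math.sqrt(n))
df = n - 1

print(d_bar, round(s_d, 4), round(t, 4), df)
```

The mean difference of 3.8 and the t statistic of about 4.95 on 29 degrees of freedom match the values reported by `ttest Academic == NonAcademic`.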


Test of Hypothesis About Two Population Proportions

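Assumptions:

1. The random samples taken from the populations are sufficiently large.
2. The two samples are independent of one another.

Test Procedure:

Hypotheses: H0: p1 = p2 vs (a) H1: p1 ≠ p2

(b) H1: p1 > p2

(c) H1: p1 < p2

Test Statistic:

$$ z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}\,\hat{q}\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}} $$

where

$$ \hat{p} = \frac{x_1 + x_2}{n_1 + n_2}, \qquad \hat{q} = 1 - \hat{p}, $$

and p̂1 = x1/n1 and p̂2 = x2/n2 are the two sample proportions.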

Example: An educational researcher designs a study to compare the effectiveness of teaching

English to non-English-speaking people by a computer software program and by the traditional classroom system. The researcher randomly assigns 125 students from a class of 300 to instruction using the computer. The remaining 175 students are instructed using the traditional method. At the end of a 6-month instructional period, all 300 students are given an examination with the results reported in the following table.

Does instruction using the computer software program appear to increase the

proportion of students passing the examination in comparison to the passing rate using the traditional method of instruction?

Analysis: First, the data must be encoded in STATA in this format.

Method   Result   No. of students
   1        1            94
   1        2            31
   2        1           113
   2        2            62

where: Method=1 (Computer), Method=2 (Traditional) and Result=1 (Pass), Result=2 (Fail)

Objective: Determine if using the computer software program increases the proportion

of students passing the examination in comparison to the passing rate using the traditional method of instruction

Hypotheses:

Ho: There is no significant difference in the passing rate between computer-aided instruction and traditional instruction.

Ha: The passing rate in computer-aided instruction is higher than the passing rate in traditional instruction.
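For reference, the z statistic defined in the test procedure can be evaluated directly from the counts in the data table (94 of 125 passing under the computer method, 113 of 175 under the traditional method). The sketch below is a hand calculation in Python, not the procedure the handout runs in Stata; the function name `two_proportion_z` is hypothetical.

```python
import math
from statistics import NormalDist


def two_proportion_z(x1, n1, x2, n2):
    """Pooled z statistic for H0: p1 = p2."""
    p1, p2 = x1 / n1, x2 / n2
    p_pool = (x1 + x2) / (n1 + n2)   # pooled proportion under H0
    q_pool = 1 - p_pool
    se = math.sqrt(p_pool * q_pool * (1 / n1 + 1 / n2))
    return (p1 - p2) / se


# Group 1: computer method, group 2: traditional method.
z = two_proportion_z(94, 125, 113, 175)

# Right-tailed p-value, since Ha states p1 > p2.
p_right = 1 - NormalDist().cdf(z)
print(round(z, 2), round(p_right, 3))
```

The statistic comes out close to 1.96, so at the 0.05 level the one-sided z test also rejects H0.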

Z test for Comparing Two Proportions in STATA

1. From the menu at the top of the screen, click on Statistics > Summaries, tables, and tests > Frequency tables > Two-way table with measures of association.
2. In the Main tab of the dialog box, select Method in the pull down menu under the Row variable box and Result under the Column variable box.
3. Check the Fisher's exact test under Test statistics.
4. Click the Weights tab, tick the Frequency weights radio button and in the pull down menu under Frequency weight select numstudents.
5. Click OK.
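Note that the steps above request Fisher's exact test rather than the z statistic. With the row and column totals of the 2×2 table held fixed, the one-sided Fisher p-value is a hypergeometric tail probability: the chance of seeing at least as many passers in the computer group as were observed. The sketch below computes it with exact integer arithmetic; the function name `fisher_one_sided` is an assumption for illustration, and the counts are those of the encoded data table.

```python
from math import comb


def fisher_one_sided(a, b, c, d):
    """P(top-left cell >= a) for a 2x2 table with all margins fixed.

    Table layout:      col 1   col 2
            row 1        a       b
            row 2        c       d
    """
    row1, col1, n = a + b, a + c, a + b + c + d
    k_max = min(row1, col1)          # largest possible top-left count
    total = comb(n, row1)            # ways to choose who is in row 1
    tail = sum(comb(col1, k) * comb(n - col1, row1 - k)
               for k in range(a, k_max + 1))
    return tail / total


# Rows: Method 1 (Computer), Method 2 (Traditional); columns: Pass, Fail.
p_one_sided = fisher_one_sided(94, 31, 113, 62)
print(round(p_one_sided, 3))
```

This value can be compared with the 1-sided Fisher's exact figure of 0.033 that Stata reports for this example.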

Command

. tabulate Method Result [fweight = numstudents], exact

OUTPUT

           |        Result
    Method |         1          2 |     Total
         1 |        94         31 |       125
         2 |       113         62 |       175
     Total |       207         93 |       300

           Fisher's exact =                 0.058
   1-sided Fisher's exact =                 0.033

Interpretation: The passing rate in computer-aided instruction is significantly higher than the passing rate in traditional instruction (one-sided Fisher's exact p = 0.033). About 75% (94 of 125) of students under computer-assisted instruction passed the examination, while only 64.6% (113 of 175) of students under traditional instruction passed.

Comparing Three or More Population Means (One-way Analysis of Variance)

Basic Idea of Analysis of Variance

A one-way between-subjects ANOVA is the generalized form of an independent-samples T test. The T test involves the comparison of just two groups representing a single independent variable; the ANOVA permits us to compare three or more groups.

The statistical strategy underlying the ANOVA is to partition (analyze) the total variance of the dependent variable into its constituent sources of variance. This total variance is defined in terms of the difference between the grand or overall mean of the entire sample and each score associated with each case. Different ANOVA designs partition the variance of the dependent variable into somewhat different sources of variance. In a one-way between-subjects design, the two sources of variance into which the total variance of the dependent variable is partitioned are as follows:

Between-Groups Variance. This reflects the differences in means between the groups and is defined in terms of the differences between the group means and the grand mean. It represents the effect of the independent variable.

Within-Groups Variance. This reflects the variability within each group and is defined in terms of the difference between the group mean and each score within that group. It represents the error of measurement in the study.

Assumptions:

1. The random samples are taken from independent normal populations (normality).
2. The populations have the same variance (homogeneity of variances).

Test Procedure:

Hypotheses: H0: There is no difference in the population means. (μ1 = μ2 =…= μt)

Ha: There is a difference in the population means. (μi ≠ μj, for at least one pair (i,j))

Test Statistic: F

$$ F = \frac{\text{Variation between groups}}{\text{Variation within groups}} $$
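For readers who want the F ratio spelled out, the standard one-way ANOVA decomposition is sketched below. The notation is introduced here for illustration and is not taken from the handout: t is the number of groups, n_i the size of group i, N the total sample size, x̄_i the mean of group i, and x̄ the grand mean.

```latex
\begin{aligned}
SS_{\text{between}} &= \sum_{i=1}^{t} n_i \left(\bar{x}_i - \bar{x}\right)^2,
&\qquad df_{\text{between}} &= t - 1, \\
SS_{\text{within}} &= \sum_{i=1}^{t} \sum_{j=1}^{n_i} \left(x_{ij} - \bar{x}_i\right)^2,
&\qquad df_{\text{within}} &= N - t, \\
F &= \frac{SS_{\text{between}} / (t - 1)}{SS_{\text{within}} / (N - t)}. &&
\end{aligned}
```

The two sums of squares add up to the total sum of squares about the grand mean, which is the partition of total variance described in the text.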


Example: A clinical psychologist wished to compare three methods for reducing hostility levels in university students, and used a certain test (HLT) to measure the degree of hostility. A high score on the test indicated great hostility. The psychologist used 24 students who obtained high and nearly equal scores in the experiment. Eight were selected at random from among the 24 problem cases and were treated with method 1. Seven of the remaining 16 students were selected at random and treated with method 2. The remaining nine students were treated with method 3. All treatments were continued for a one-semester period. Each student was given the HLT test at the end of the semester, with the results shown below.

Analysis:

A. Test of normality in STATA

1. From the menu at the top of the screen, click on Statistics > Summaries, tables, and tests > Distributional plots and tests > Shapiro-Wilk normality test.
2. In the Main tab of the dialog box, select Score in the pull down menu under the Variables box.
3. Click the by/if/in tab and check the box Repeat command by groups. In the pull down menu under Variables that define groups select Method.
4. Click OK.

Command

. by Method, sort : swilk Score

OUTPUT

-> Method = 1

                Shapiro-Wilk W test for normal data

    Variable |   Obs        W         V         z     Prob>z
       Score |     8   0.97133     0.399    -1.330   0.90820

-> Method = 2

                Shapiro-Wilk W test for normal data

    Variable |   Obs        W         V         z     Prob>z
       Score |     7   0.98674     0.174    -2.185   0.98557

-> Method = 3

                Shapiro-Wilk W test for normal data

    Variable |   Obs        W         V         z     Prob>z
       Score |     9   0.95466     0.666    -0.647   0.74112

Interpretation: The p-values associated with the Shapiro-Wilk test for normality for each method are all greater than the 0.05 level of significance, hence the null hypothesis is not rejected. There is no evidence against normality, so the HLT scores of students in each method may be treated as approximately normally distributed.


B. Test of homogeneity of variances

H0: There is no difference in the variances of HLT scores of students in the three methods.
Ha: There is a difference in the variances of HLT scores of students in the three methods.

1. From the menu at the top of the screen, click on Statistics > Summaries, tables, and tests > Classical tests of hypotheses > Robust equal-variance test.
2. In the Main tab of the dialog box, select Score in the pull-down menu under the Variable box and Method under the Variable defining comparison groups box.
3. Click OK.


OUTPUT

. robvar Score, by(Method)

            |          Summary of Score
     Method |        Mean   Std. Dev.       Freq.
------------+------------------------------------
          1 |       86.75   5.6251984           8
          2 |   75.571429    3.101459           7
          3 |          71   3.6742346           9
------------+------------------------------------
      Total |   77.583333   8.0158358          24

W0  = 1.7351436   df(2, 21)   Pr > F = 0.2007208
W50 = 1.6817088   df(2, 21)   Pr > F = 0.21016065
W10 = 1.7351436   df(2, 21)   Pr > F = 0.2007208

Remarks: Three test statistics for testing equality of variances are displayed: Levene's statistic (W0), Brown and Forsythe's statistic based on the median (W50), and Brown and Forsythe's statistic based on the 10% trimmed mean (W10).

Interpretation: The p-value associated with each test statistic is greater than the 0.05 level of significance, hence the null hypothesis is not rejected. Therefore, the variances of HLT scores of students in the three methods are not significantly different.

C. Running one-way ANOVA

H0: There is no difference in the mean HLT scores of students among the three methods.
Ha: There is a difference in the mean HLT scores of students among the three methods.

1. From the menu at the top of the screen, click on Statistics > Linear models and related > ANOVA/MANOVA > One-way ANOVA.
2. In the Main tab of the dialog box, select Score in the pull-down menu under the Response variable box and Method under the Factor variable box.
3. Check the box opposite Scheffe under Multiple-comparison tests.
4. Under Output, check the Produce summary table box.
5. Click OK.


OUTPUT

. oneway Score Method, scheffe tabulate

            |          Summary of Score
     Method |        Mean   Std. Dev.       Freq.
------------+------------------------------------
          1 |       86.75   5.6251984           8
          2 |   75.571429    3.101459           7
          3 |          71   3.6742346           9
------------+------------------------------------
      Total |   77.583333   8.0158358          24

                        Analysis of Variance
    Source              SS         df      MS            F     Prob > F
------------------------------------------------------------------------
Between groups      1090.61905      2   545.309524     29.57     0.0000
 Within groups      387.214286     21   18.4387755
------------------------------------------------------------------------
    Total           1477.83333     23   64.2536232

Bartlett's test for equal variances:  chi2(2) =   2.4594  Prob>chi2 = 0.292

                 Comparison of Score by Method
                           (Scheffe)
Row Mean-|
Col Mean |          1          2
---------+----------------------
       2 |   -11.1786
         |      0.000
         |
       3 |     -15.75   -4.57143
         |      0.000      0.132

Interpretations:
1. Bartlett's test, another test for homogeneity of variances, also indicates that the variances are homogeneous (Ho is not rejected, p-value = 0.292 > 0.05).
2. The ANOVA table shows that there is a significant difference in the mean HLT scores of students among the three methods (F = 29.574, p < 0.001).
3. The Scheffe test for pairwise mean comparisons shows that the mean HLT score of students in method 2 is not significantly different from that of students in method 3 (p = 0.132), while the mean HLT score of students in method 1 is significantly different from the means of methods 2 and 3 (p < 0.001 in both cases). Students treated with methods 2 and 3 ended the semester with significantly lower mean hostility scores than students treated with method 1.
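The ANOVA table, Bartlett's test and the Scheffe p-values can be cross-checked from the Summary of Score table alone. The following is a minimal sketch in Python, outside STATA, and is not how STATA itself computes these results. The variable and function names are our own. The two closed-form tail probabilities are valid only because there are three groups here, so that the F statistics have 2 numerator degrees of freedom and Bartlett's chi-square has 2 degrees of freedom.

```python
import math

# Summary statistics copied from the "Summary of Score" table:
# method -> (mean, standard deviation, number of students)
summary = {
    1: (86.75, 5.6251984, 8),
    2: (75.571429, 3.101459, 7),
    3: (71.0, 3.6742346, 9),
}

k = len(summary)                                   # number of groups
n_total = sum(n for _, _, n in summary.values())   # total sample size
grand_mean = sum(m * n for m, _, n in summary.values()) / n_total

# Sums of squares, degrees of freedom and mean squares
ss_between = sum(n * (m - grand_mean) ** 2 for m, _, n in summary.values())
ss_within = sum((n - 1) * sd ** 2 for _, sd, n in summary.values())
df_between = k - 1
df_within = n_total - k
ms_between = ss_between / df_between
ms_within = ss_within / df_within
f_stat = ms_between / ms_within


def f_upper_tail_df1_2(f, df2):
    """P(F > f) when the numerator df is exactly 2 (closed form)."""
    return (1 + 2 * f / df2) ** (-df2 / 2)


def chi2_upper_tail_df2(x):
    """P(X > x) for a chi-square variable with exactly 2 df (closed form)."""
    return math.exp(-x / 2)


p_anova = f_upper_tail_df1_2(f_stat, df_within)

# Bartlett's test for equal variances (chi-square with k - 1 = 2 df)
log_term = df_within * math.log(ms_within) - sum(
    (n - 1) * math.log(sd ** 2) for _, sd, n in summary.values()
)
correction = 1 + (
    sum(1 / (n - 1) for _, _, n in summary.values()) - 1 / df_within
) / (3 * (k - 1))
bartlett_chi2 = log_term / correction
p_bartlett = chi2_upper_tail_df2(bartlett_chi2)


def scheffe(row, col):
    """Difference in means (row minus column) and its Scheffe p-value."""
    mean_r, _, n_r = summary[row]
    mean_c, _, n_c = summary[col]
    diff = mean_r - mean_c
    # Squared standardized difference divided by (k - 1), referred to F(k - 1, N - k)
    f_pair = diff ** 2 / (ms_within * (1 / n_r + 1 / n_c)) / (k - 1)
    return diff, f_upper_tail_df1_2(f_pair, df_within)


print(f"F({df_between}, {df_within}) = {f_stat:.2f}, p = {p_anova:.4f}")
print(f"Bartlett chi2({k - 1}) = {bartlett_chi2:.4f}, p = {p_bartlett:.3f}")
for row, col in [(2, 1), (3, 1), (3, 2)]:
    diff, p = scheffe(row, col)
    print(f"Method {row} - Method {col}: {diff:.4f}, p = {p:.3f}")
```

Working from the summary table rather than the raw scores is a deliberate choice: it shows that the between-groups and within-groups sums of squares depend only on the group means, standard deviations and sample sizes.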


Analysis of Frequency Data: The Chi-square Test

Cross-tabulation is one of the most frequently used methods of analysis for categorical data. It enables us to examine the relationship between categorical variables in greater detail than simple frequencies for individual variables.

Example: Cross-tabulation of teachers by position and sex

Sex       T-1   T-2   T-3   Head Teacher/Principal   Total
Male       10     9     7             3                29
Female     13    12     6             4                35
Total      23    21    13             7                64

The statistical analysis associated with cross-tabulation is the chi-square (pronounced 'kye square') test. A popular application of the chi-square test in social science research is evaluating the association between two categorical variables (chi-square test of independence).

The null and alternative hypotheses:

Ho: The two variables are independent.
Ha: The two variables are not independent.

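The test statistic:

$$\chi^2 = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}$$

where r and c are the numbers of rows and columns of the table,
$O_{ij}$ = observed number of cases in the ith row and jth column, and
$E_{ij}$ = expected number of cases under Ho:

$$E_{ij} = \frac{(\text{row total}) \times (\text{column total})}{\text{grand total}}$$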
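As a worked illustration (the choice of cell is ours), take the cell for male T-1 teachers in the cross-tabulation of teachers by position and sex, where the observed count is 10, the row total is 29, the column total is 23 and the grand total is 64:

$$E_{11} = \frac{29 \times 23}{64} \approx 10.42, \qquad \frac{(O_{11} - E_{11})^2}{E_{11}} = \frac{(10 - 10.42)^2}{10.42} \approx 0.017$$

The test statistic is the sum of such contributions over all eight cells of that table.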

Remarks:
1. The test is valid if at least 80% of the cells have expected frequencies of at least 5 and no cell has an expected frequency less than 1.
2. If many expected frequencies are very small, researchers commonly combine categories of variables to obtain a table having larger cell frequencies. Generally, one should not pool categories unless there is a natural way to combine them.
3. For a 2 x 2 contingency table, a correction called Yates' correction for continuity is applied. The formula then becomes

$$\chi^2 = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{\left(|O_{ij} - E_{ij}| - 0.5\right)^2}{E_{ij}}$$
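The rule of thumb in Remark 1 and the continuity correction in Remark 3 can be sketched in a few lines of Python, outside STATA. This is only an illustration under stated assumptions: the function names are our own, and the check reads the second half of the rule as "no expected frequency below 1". The example applies the check to the cross-tabulation of teachers by position and sex.

```python
def expected_counts(table):
    """Expected frequencies under independence: row total x column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def chi_square(table, yates=False):
    """Pearson chi-square; yates=True subtracts 0.5 from each |O - E| (2 x 2 only)."""
    if yates and (len(table) != 2 or len(table[0]) != 2):
        raise ValueError("the continuity correction is applied to 2 x 2 tables only")
    total = 0.0
    for obs_row, exp_row in zip(table, expected_counts(table)):
        for o, e in zip(obs_row, exp_row):
            deviation = abs(o - e) - 0.5 if yates else o - e
            total += deviation ** 2 / e
    return total


def rule_of_thumb_ok(table):
    """At least 80% of expected counts are >= 5 and none is below 1 (assumed reading)."""
    expected = [e for row in expected_counts(table) for e in row]
    share_at_least_5 = sum(e >= 5 for e in expected) / len(expected)
    return share_at_least_5 >= 0.80 and min(expected) >= 1


# Cross-tabulation of teachers by position and sex
teachers = [
    [10, 9, 7, 3],    # Male:   T-1, T-2, T-3, Head Teacher/Principal
    [13, 12, 6, 4],   # Female: T-1, T-2, T-3, Head Teacher/Principal
]

for row in expected_counts(teachers):
    print([round(e, 2) for e in row])
print("Rule of thumb satisfied:", rule_of_thumb_ok(teachers))
```

For the teachers table, both Head Teacher/Principal cells have expected frequencies below 5, so only 6 of the 8 cells (75%) reach 5 and the check fails. This is the kind of situation Remark 2 addresses.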

Example: Suppose we test whether there is a relationship between music preference and IQ among 480 senior high school students. The observed frequencies are given in the table below.

                               IQ
Music Preference    High   Medium   Low   Total
Classical             40       26    17      83
Pop                   47       59    25     131
Rock                  83      104    79     266
Total                170      189   121     480


Analysis:

Ho: Music preference and intelligence of students are independent.
Ha: Music preference and intelligence of students are not independent.

The above cross-tabulation has to be entered in STATA in the same way as in the example on the Z test for two proportions, with one row per cell of the table. When entered in STATA, the data will look like this:

Music   IQ   No. of Students (Numstuds)
  1      1        40
  1      2        26
  1      3        17
  2      1        47
  2      2        59
  2      3        25
  3      1        83
  3      2       104
  3      3        79

where: Music = 1 if classical, Music = 2 if pop, Music = 3 if rock; IQ = 1 if high, IQ = 2 if medium, IQ = 3 if low.
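The step from the cross-tabulation to this one-row-per-cell layout can be sketched as follows. This is a small Python illustration, outside STATA, using the coding given above; the names observed and records are our own.

```python
# Observed counts from the cross-tabulation of music preference by IQ.
# Rows:    Music (1 = classical, 2 = pop, 3 = rock)
# Columns: IQ    (1 = high, 2 = medium, 3 = low)
observed = [
    [40, 26, 17],
    [47, 59, 25],
    [83, 104, 79],
]

# One record per cell: (Music, IQ, Numstuds)
records = [
    (music, iq, count)
    for music, row in enumerate(observed, start=1)
    for iq, count in enumerate(row, start=1)
]

print("Music  IQ  Numstuds")
for music, iq, count in records:
    print(f"{music:>5}  {iq:>2}  {count:>8}")
```

Each combination of Music and IQ should appear exactly once, and the counts should add up to the 480 students in the table.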

Chi-square test of independence in STATA

1. From the menu at the top of the screen, click on Statistics > Summaries, tables, and tests > Frequency tables > Two-way table with measures of association.
2. In the Main tab of the dialog box, select Music in the pull-down menu under the Row variable box and IQ under the Column variable box.
3. Check the Pearson's chi-squared box under Test statistics.
4. Check the Expected frequencies box under Cell contents.
5. Click the Weights tab, tick the Frequency weights radio button and, in the pull-down menu under Frequency weight, select Numstuds.
6. Click OK.


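OUTPUT

. tabulate Music IQ [fweight = Numstuds], chi2 expected

+--------------------+
| Key                |
|--------------------|
|     frequency      |
| expected frequency |
+--------------------+

           |                IQ
     Music |         1          2          3 |     Total
-----------+---------------------------------+----------
         1 |        40         26         17 |        83
           |      29.4       32.7       20.9 |      83.0
-----------+---------------------------------+----------
         2 |        47         59         25 |       131
           |      46.4       51.6       33.0 |     131.0
-----------+---------------------------------+----------
         3 |        83        104         79 |       266
           |      94.2      104.7       67.1 |     266.0
-----------+---------------------------------+----------
     Total |       170        189        121 |       480
           |     170.0      189.0      121.0 |     480.0

          Pearson chi2(4) =  12.4176   Pr = 0.015

Interpretation: Since the p-value is less than the 0.05 level of significance, Ho is rejected. There is reason to believe that music preference and IQ of students are not independent. That is, music preference is related to the IQ level of students (χ² = 12.4176, df = 4, p = 0.015).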
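The expected frequencies and the Pearson chi-square statistic for this example can be reproduced by hand from the formula for the test statistic. The sketch below does so in Python, outside STATA. The names are our own, and the closed-form tail probability used for the p-value holds only for an even number of degrees of freedom, which is the case here (df = 4).

```python
import math

# Observed counts: rows are Music (1 = classical, 2 = pop, 3 = rock),
# columns are IQ (1 = high, 2 = medium, 3 = low).
observed = [
    [40, 26, 17],
    [47, 59, 25],
    [83, 104, 79],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected frequency of each cell under independence
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

# Pearson chi-square: sum of (O - E)^2 / E over all cells
chi2 = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)
df = (len(row_totals) - 1) * (len(col_totals) - 1)


def chi2_upper_tail_even_df(x, df):
    """P(X > x) for a chi-square variable with an even number of df (closed form)."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires a positive, even df")
    half = x / 2
    return math.exp(-half) * sum(half ** j / math.factorial(j) for j in range(df // 2))


p_value = chi2_upper_tail_even_df(chi2, df)

for row in expected:
    print([round(e, 1) for e in row])
print(f"chi2({df}) = {chi2:.4f}, p = {p_value:.4f}")
```

All nine expected frequencies exceed 5 (the smallest is about 20.9), so the validity condition on expected frequencies is met for this table.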