Chapter 11: Chi-Square Distribution
Post on 02-Jan-2016
Review
• So far, we have used several probability distributions for hypothesis testing and confidence intervals, including the normal distribution and Student’s t distribution.
• In this section, we will be using the chi-square distribution.
What is Chi-Square?
• χ² = Chi-Square
• The values of χ² begin at 0 and are all positive. The graph of χ² is not symmetrical and, like Student’s t distribution, it depends on the number of degrees of freedom.
• It can determine if random variables are dependent or independent.
• It can determine if different populations share the same proportions of specified characteristics.
Mode (high point)
• The mode (high point) of a chi-square distribution with n degrees of freedom occurs over n − 2 (for n ≥ 3).
Degrees of Freedom
• Degrees of freedom = (number of rows − 1)(number of columns − 1) = (R − 1)(C − 1)
• R = number of cell rows
• C = number of cell columns
Example: (The situation)
• Innovative Machines Incorporated has developed two new letter arrangements for computer keyboards. The company wishes to see if there is any relationship between the arrangement of letters on the keyboard and the number of hours it takes a new typing student to learn to type at 20 words per minute. Or, from another point of view, is the time it takes a student to learn to type independent of the arrangement of the letters on a keyboard? Use a 5% level of significance.
Example: (step 1)
• H₀: Keyboard arrangement and learning times are independent
• H₁: Keyboard arrangement and learning times are not independent
Answer for E (will show in class)
Keyboard        21-40 h       41-60 h       61-80 h       Row Total
A               O:25 E:24     O:30 E:40     O:25 E:16     80
B               O:30 E:36     O:71 E:60     O:19 E:24     120
Standard        O:35 E:30     O:49 E:50     O:16 E:20     100
Column Total    90            150           60            300 (sample size)
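The E values in the table above can be reproduced from the row and column totals. The sketch below assumes the usual contingency-table formula, E = (row total)(column total) / (sample size); the function name `expected_frequencies` is illustrative and does not come from the slides.

```python
def expected_frequencies(observed):
    """Return E = (row total * column total) / sample size for every cell."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)  # sample size
    return [[r * c / n for c in col_totals] for r in row_totals]


# Observed counts from the keyboard example (rows: A, B, Standard).
keyboard_observed = [
    [25, 30, 25],
    [30, 71, 19],
    [35, 49, 16],
]
keyboard_expected = expected_frequencies(keyboard_observed)
for row in keyboard_expected:
    print(row)
```

Each printed row should match the E entries of the corresponding keyboard row.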
Remember
Chart to find χ²
Cell   O     E     O − E   (O − E)²   (O − E)²/E
1      25    24    1       1          0.04
2      30    40    -10     100        2.50
3      25    16    9       81         5.06
4      30    36    -6      36         1.00
5      71    60    11      121        2.02
6      19    24    -5      25         1.04
7      35    30    5       25         0.83
8      49    50    -1      1          0.02
9      16    20    -4      16         0.80
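The last column of the chart can be recomputed directly from the O and E values. This is a minimal sketch; the variable names are illustrative only.

```python
# (O, E) pairs for cells 1-9 of the keyboard example, read row by row.
cells = [(25, 24), (30, 40), (25, 16),
         (30, 36), (71, 60), (19, 24),
         (35, 30), (49, 50), (16, 20)]

# Each cell contributes (O - E)^2 / E to the chi-square statistic.
contributions = [(o - e) ** 2 / e for o, e in cells]
chi_square = sum(contributions)

for i, ((o, e), c) in enumerate(zip(cells, contributions), start=1):
    print(f"cell {i}: O={o} E={e} (O-E)^2/E={c:.2f}")
print(f"chi-square = {chi_square:.3f}")
```

Summing the unrounded contributions gives roughly 13.316, while adding the two-decimal entries of the chart gives 13.31. The small difference is rounding only and does not change the conclusion of the test.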
Conclusion
• Look up the chi-square table in the book.
• We have χ² = 13.31 with d.f. = 4.
• The corresponding P-value falls between 0.005 and 0.010.
• Since 0.005 < P-value < 0.010, the P-value is less than 0.05, so we reject the null hypothesis and accept the alternate. At the 5% level of significance, we conclude that keyboard arrangement and learning time are not independent.
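The table lookup can be cross-checked numerically. For an even number of degrees of freedom, the right-tail area of the chi-square distribution has a closed form: P(χ² > x) = e^(−x/2) · Σ (x/2)^j / j!, summed over j = 0, ..., d.f./2 − 1. The sketch below relies on that identity and uses only the standard library; the function name is hypothetical, and it deliberately rejects odd d.f. because the formula does not apply there.

```python
import math


def chi_square_right_tail_even_df(x, df):
    """Right-tail area P(chi-square > x); valid only for even df >= 2."""
    if df < 2 or df % 2 != 0:
        raise ValueError("this closed form needs an even df of at least 2")
    half = x / 2
    terms = sum(half ** j / math.factorial(j) for j in range(df // 2))
    return math.exp(-half) * terms


# Keyboard example: chi-square = 13.31 with d.f. = 4, tested at the 5% level.
p_value = chi_square_right_tail_even_df(13.31, 4)
print(f"P-value = {p_value:.4f}")
print("reject H0 at 5%" if p_value < 0.05 else "fail to reject H0 at 5%")
```

The computed value should land inside the 0.005 to 0.010 interval read from the table.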
Group Work (the situation)
• A vending machine company is to install soda machines in elementary schools and high schools. The market analyst wishes to know if flavor preference and school level are independent. A random sample of 200 students was taken. Their school level and soda preferences are given. Is independence indicated at the 1% level of significance?
Group Work (table)
Soda            High School   Elementary    Row Total
Coke            O:33 E:       O:57 E:       90
Pepsi           O:30 E:       O:20 E:       50
Mountain Dew    O:5 E:        O:35 E:       40
Fanta           O:12 E:       O:8 E:        20
Column Total    80            120           200 (sample size)
Test of homogeneity
• A test of the claim that different populations share the same proportions of specified characteristics.
Test of Homogeneity
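• The procedure is very much the same as the test for independence, except that the hypotheses are different.
• For test of independence: H₀: the variables are independent. H₁: the variables are not independent.
• For test of homogeneity: H₀: each population shares the same proportions of the specified characteristics. H₁: some populations have different proportions.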
Example:
• If you could own one pet, what kind would you choose? The possible responses are listed below. Do the same proportions of males as females prefer each type of pet? Use a 1% level of significance.
Gender          Dog           Cat           Other pet     No Pet
Female          120           132           18            30
Male            135           70            20            25
Fill this out
Gender          Dog           Cat           Other pet     No Pet        Row Total
Female          O:120 E:      O:132 E:      O:18 E:       O:30 E:
Male            O:135 E:      O:70 E:       O:20 E:       O:25 E:
Column Total
Answer
Gender          Dog               Cat               Other pet        No Pet         Row Total
Female          O:120 E:139.09    O:132 E:110.18    O:18 E:20.73     O:30 E:30      300
Male            O:135 E:115.91    O:70 E:91.82      O:20 E:17.27     O:25 E:25      250
Column Total    255               202               38               55             550 (sample size)
Answer
Cell   O      E        (O − E)²/E
1      120    139.09   2.62
2      132    110.18   4.320
3      18     20.73    0.359
4      30     30       0
5      135    115.91   3.144
6      70     91.82    5.185
7      20     17.27    0.431
8      25     25       0
Final Answer
• χ² = 16.059
• d.f. = 3
• P-value ≈ 0.001
• Since the P-value is less than 0.01, we reject the null hypothesis (the preferences are the same) and accept the alternate (the preferences are different). At the 1% level of significance, we conclude that male and female students have different preferences when it comes to selecting a pet.
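The P-value for this example can also be checked without a table. For exactly 3 degrees of freedom, the right-tail area of the chi-square distribution can be written as erfc(√(x/2)) + √(2x/π) · e^(−x/2). The sketch below assumes that identity and uses only the standard library; the function name is hypothetical and the formula does not carry over to other d.f.

```python
import math


def chi_square_right_tail_df3(x):
    """Right-tail area P(chi-square > x) for exactly 3 degrees of freedom."""
    if x < 0:
        raise ValueError("the chi-square statistic cannot be negative")
    tail = math.erfc(math.sqrt(x / 2))
    tail += math.sqrt(2 * x / math.pi) * math.exp(-x / 2)
    return tail


# Pet example: chi-square = 16.059 with d.f. = 3, tested at the 1% level.
p_value = chi_square_right_tail_df3(16.059)
print(f"P-value = {p_value:.4f}")
print("reject H0 at 1%" if p_value < 0.01 else "fail to reject H0 at 1%")
```

The result should sit just above 0.001, in line with the rounded P-value reported above, and well below the 0.01 cutoff.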
Reason Behind Goodness of Fit
• Set up a test to investigate how well a sample distribution fits a given distribution
• Use observed and expected frequencies to compute the sample chi-square statistic
• Find or estimate the P-value and complete the test
Sample statistic
• χ² = Σ (O − E)² / E
• With degrees of freedom = k − 1
• E = Expected frequency
• O = Observed frequency
• k = number of categories in the distribution
Question
• Is the present distribution of favorable responses the same as last year's, or different? To test this hypothesis, a random sample of 500 employees was taken. The chart is on the next slide. Use a 1% level of significance.
Example
Category                          Percentage of Favorable Responses
Vacation time                     4%
Salary                            65%
Safety regulations                13%
Health and retirement benefits    12%
Overtime policy and pay           6%

Category                          Observed
Vacation time                     30
Salary                            290
Safety regulations                70
Health and retirement benefits    70
Overtime                          40
Answer
Category                          O      E      (O − E)²   (O − E)²/E
Vacation time                     30     20     100        5.00
Salary                            290    325    1225       3.77
Safety regulations                70     65     25         0.38
Health and retirement benefits    70     60     100        1.67
Overtime                          40     30     100        3.33
Total                             500    500               14.15
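The goodness-of-fit table above can be reproduced in a few lines. This sketch assumes each expected frequency is the sample size times the given percentage (E = n · p), which matches the E column; the function name `goodness_of_fit` is illustrative.

```python
def goodness_of_fit(observed, proportions):
    """Return (chi-square statistic, degrees of freedom) for a goodness-of-fit test."""
    if len(observed) != len(proportions):
        raise ValueError("need one proportion per category")
    n = sum(observed)  # sample size
    expected = [n * p for p in proportions]
    chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return chi_square, len(observed) - 1  # d.f. = k - 1


# Categories: vacation time, salary, safety, health/retirement, overtime.
observed = [30, 290, 70, 70, 40]
proportions = [0.04, 0.65, 0.13, 0.12, 0.06]
chi_square, df = goodness_of_fit(observed, proportions)
print(f"chi-square = {chi_square:.2f}, d.f. = {df}")
```

The statistic should agree with the table total of 14.15, with 4 degrees of freedom.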
Answer
• d.f. = k − 1 = 5 − 1 = 4
• 0.005 < P-value < 0.010, so the P-value is less than 0.01
• Reject the null, accept the alternate
• At the 1% level of significance, the evidence supports the conclusion that this year’s responses to the issues are different from last year’s: we reject the null hypothesis, which says they are the same, and accept the alternate, which says they are different.
Group Work
• The age distribution of the Canadian population and the age distribution of a random sample of 455 residents in the Indian community (Red Lake Village) are given below.
• Use a 5% level of significance to test the claim that the age distribution of the Canadian population fits the age distribution of Red Lake Village.
Age         % of population   Observed in Red Lake Village
Under 5     7.2%              47
5-14        13.6%             75
15-64       67.1%             288
65 +        12.1%             45