PUAF 610 TA
Session 8
10/22/2015
Recover from midterm
TODAY
• F - Distribution
• Analysis-of-variance
• Kruskal-Wallis test
• Correlation
F - Distribution
• An F-distribution has two numbers of degrees of freedom:
– Degrees of Freedom for the Numerator (dfn)
– Degrees of Freedom for the Denominator (dfd)
• There is a different F distribution for each combination of the degrees of freedom of the numerator and denominator.
F - Distribution
• Using the F table
– Select the significance level to be used
– Determine the appropriate combination of degrees of freedom
F - Distribution
• Suppose the α = 0.10 level of significance is selected.
• There are 5 degrees of freedom in the numerator and 7 degrees of freedom in the denominator.
• The F value from the table is 2.88.
• This means that 0.10 of the area under the F curve lies to the right of F = 2.88.
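The table lookup above can be checked numerically. This is only a sketch using the Python standard library: it integrates the F density with Simpson's rule and subtracts from 1. The function names `f_pdf` and `f_upper_tail` are made up for illustration, and the code assumes dfn > 2 so that the density is 0 at x = 0.

```python
import math

def f_pdf(x, dfn, dfd):
    """Density of the F distribution with (dfn, dfd) degrees of freedom.

    Assumes dfn > 2, so the density is 0 at x = 0.
    """
    if x <= 0:
        return 0.0
    # log of the Beta function B(dfn/2, dfd/2)
    log_beta = (math.lgamma(dfn / 2) + math.lgamma(dfd / 2)
                - math.lgamma((dfn + dfd) / 2))
    log_pdf = ((dfn / 2) * math.log(dfn / dfd)
               + (dfn / 2 - 1) * math.log(x)
               - ((dfn + dfd) / 2) * math.log(1 + dfn * x / dfd)
               - log_beta)
    return math.exp(log_pdf)

def f_upper_tail(f_value, dfn, dfd, steps=2000):
    """Area to the right of f_value (steps must be even for Simpson's rule)."""
    h = f_value / steps
    total = f_pdf(0.0, dfn, dfd) + f_pdf(f_value, dfn, dfd)
    for i in range(1, steps):
        weight = 4 if i % 2 else 2
        total += weight * f_pdf(i * h, dfn, dfd)
    area_left = total * h / 3
    return 1.0 - area_left

# Values from the slide: alpha = 0.10, dfn = 5, dfd = 7, table value 2.88
tail = f_upper_tail(2.88, dfn=5, dfd=7)
print(f"Area to the right of 2.88: {tail:.3f}")
```

The printed area should come out close to 0.10, which is what the table entry means.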
Analysis-of-variance
• Introduction Question:
• Do the means of the quantitative variable depend on which group (given by the categorical variable) the individual is in?
– If the categorical variable has only 2 values: 2-sample t-test
– If the categorical variable has 3 or more values: ANOVA
Assumptions of ANOVA
Logic of ANOVA
• Hypothesis
• F-statistic
• Compare F with the critical value (CV), or the p-value with α
• Conclusion
Analysis-of-variance
• ANOVA measures two sources of variation in the data and compares their relative sizes
• variation BETWEEN groups
– for each data value, the difference between its group mean and the overall mean
• variation WITHIN groups
– for each data value, the difference between that value and the mean of its group
Analysis-of-variance
• The ANOVA F-statistic is the ratio of the “Between Group Variation” to the “Within Group Variation”:
$F = \dfrac{\text{Between}}{\text{Within}} = \dfrac{MSTR}{MSE}$, where MSTR is the treatment mean square and MSE is the error mean square.
• A large F indicates that there is more difference between groups than within groups.
Example
• A manager wishes to determine whether the mean times required to complete a certain task differ for the three levels of employee training.
• He randomly selected 10 employees with each of the three levels of training (Beginner, Intermediate and Advanced).
• Do the data provide sufficient evidence to indicate that the mean times required to complete a certain task differ for at least two of the three levels of training?
Example
| Level of Training | N | x̄ | s² |
|---|---|---|---|
| Advanced | 10 | 24.2 | 21.54 |
| Intermediate | 10 | 27.1 | 18.64 |
| Beginner | 10 | 30.2 | 17.76 |
Example
• Ho: The mean times required to complete a certain task do not differ across the three levels of training. (µB = µI = µA)
• Ha: The mean times required to complete a certain task differ for at least two of the three levels of training.
Example
| Source | df | SS | MS | F |
|---|---|---|---|---|
| Treatments | 2 | 180.067 | 90.033 | 4.662 |
| Error | 27 | 521.46 | 19.313 | |
| Total | 29 | 701.527 | | |
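The ANOVA table can be rebuilt from the summary statistics (N, mean, variance) of the three training levels. A minimal sketch follows. The helper name `anova_from_summary` is made up for illustration. Note that SS Total must equal SS Treatments + SS Error, so it works out to 701.527.

```python
# (n, sample mean, sample variance) for each training level, from the slide
groups = {
    "Advanced": (10, 24.2, 21.54),
    "Intermediate": (10, 27.1, 18.64),
    "Beginner": (10, 30.2, 17.76),
}

def anova_from_summary(groups):
    """Build a one-way ANOVA table from per-group n, mean and variance."""
    n_total = sum(n for n, _, _ in groups.values())
    k = len(groups)
    grand_mean = sum(n * mean for n, mean, _ in groups.values()) / n_total
    # Between-group variation: group means around the overall mean
    ss_treat = sum(n * (mean - grand_mean) ** 2 for n, mean, _ in groups.values())
    # Within-group variation: (n - 1) * s^2 recovers each group's sum of squares
    ss_error = sum((n - 1) * var for n, _, var in groups.values())
    df_treat, df_error = k - 1, n_total - k
    ms_treat = ss_treat / df_treat
    ms_error = ss_error / df_error
    return {
        "df_treat": df_treat, "df_error": df_error,
        "ss_treat": ss_treat, "ss_error": ss_error,
        "ss_total": ss_treat + ss_error,
        "ms_treat": ms_treat, "ms_error": ms_error,
        "F": ms_treat / ms_error,
    }

table = anova_from_summary(groups)
for name, value in table.items():
    print(f"{name:>9}: {value:.3f}")
```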
Example
• Decision: Reject Ho.
• Conclusion: There is sufficient evidence to indicate that the mean times required to complete a certain task differ for at least two of the three levels of training.
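The decision can be checked with a p-value. This is a sketch under two assumptions: α = 0.05 (the significance level is not stated here), and the closed form for the right-tail area of F that holds only when the numerator has exactly 2 degrees of freedom, P(F > f) = (1 + 2f/dfd)^(-dfd/2). The function name is made up for illustration.

```python
def f_upper_tail_dfn2(f_value, dfd):
    """Right-tail area of an F(2, dfd) distribution (valid only for dfn = 2)."""
    return (1 + 2 * f_value / dfd) ** (-dfd / 2)

alpha = 0.05          # assumed significance level
f_stat = 4.662        # F from the ANOVA table (df = 2 and 27)
p_value = f_upper_tail_dfn2(f_stat, dfd=27)

decision = "Reject Ho" if p_value < alpha else "Do not reject Ho"
print(f"p-value = {p_value:.4f} -> {decision}")
```

With these inputs the p-value falls below 0.05, which agrees with the decision to reject Ho.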
Example
• Consider the following random samples from three different populations.
• Do the data provide sufficient evidence to indicate that the means differ?
Example
• The data below resulted from measuring the difference in resistance resulting from subjecting identical resistors to three different temperatures for a period of 24 hours. The sample size of each group was 5. In the language of Design of Experiments, we have an experiment in which each of three treatments was replicated 5 times.
Example
| Level 1 | Level 2 | Level 3 |
|---|---|---|
| 6.9 | 8.3 | 8.0 |
| 5.4 | 6.8 | 10.5 |
| 5.8 | 7.8 | 8.1 |
| 4.6 | 9.2 | 6.9 |
| 4.0 | 6.5 | 9.3 |
Is there a difference among the population means?
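One way to answer the question is to compute the F-statistic directly from the raw data. The sketch below uses only the Python standard library, and the helper name `one_way_anova` is made up for illustration. The resulting F would then be compared with a table value for 2 and 12 degrees of freedom.

```python
from statistics import mean

# Resistance data from the slide, one list per temperature level
levels = {
    "Level 1": [6.9, 5.4, 5.8, 4.6, 4.0],
    "Level 2": [8.3, 6.8, 7.8, 9.2, 6.5],
    "Level 3": [8.0, 10.5, 8.1, 6.9, 9.3],
}

def one_way_anova(samples):
    """Return (F, df_between, df_within) for a list of samples."""
    all_values = [x for sample in samples for x in sample]
    grand_mean = mean(all_values)
    # Between groups: each group mean versus the overall mean
    ss_between = sum(len(s) * (mean(s) - grand_mean) ** 2 for s in samples)
    # Within groups: each value versus the mean of its own group
    ss_within = sum((x - mean(s)) ** 2 for s in samples for x in s)
    df_between = len(samples) - 1
    df_within = len(all_values) - len(samples)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

f_stat, df_between, df_within = one_way_anova(list(levels.values()))
print(f"F = {f_stat:.2f} with df = ({df_between}, {df_within})")
```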
Kruskal-Wallis test
• The Kruskal-Wallis test is a nonparametric (distribution free) test, which is used to compare three or more independent groups of sampled data.
• The Kruskal-Wallis test is used when the assumptions of ANOVA are not met.
– In ANOVA, we assume that each group is normally distributed.
– In the Kruskal-Wallis test, we do not assume a particular distribution.
• If the normality assumptions are met, then the Kruskal-Wallis test is not as powerful as ANOVA.
Kruskal-Wallis test
• The hypotheses for the comparison of three independent groups are:
– H0 : μ1 = μ2 = μ3
– Ha : Not all the means are equal.
Kruskal-Wallis test
• The test statistic for the Kruskal-Wallis test is H.
• When sample sizes are small in each group (< 5) and the number of groups is less than 4, a tabled value for the Kruskal-Wallis test should be compared to the H statistic to determine the significance level.
• Otherwise, a Chi-square with k-1 (the number of groups - 1) degrees of freedom can be used to approximate the significance level for the test.
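To make H concrete, here is a sketch that computes it by hand for the resistor data from the earlier ANOVA example (reused here only for illustration). It uses the usual rank-sum formula H = 12 / (N(N+1)) · Σ Rᵢ²/nᵢ − 3(N+1), with average ranks for tied values and the standard tie correction. The helper names are made up. The p-value line relies on the fact that a Chi-square with exactly 2 degrees of freedom has right-tail area exp(−H/2), so it applies only to k = 3 groups.

```python
import math
from collections import Counter

def average_ranks(values):
    """Rank values from 1..N, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # positions i..j share this 1-based rank
        for idx in order[i:j + 1]:
            ranks[idx] = avg
        i = j + 1
    return ranks

def kruskal_wallis_h(samples):
    """Kruskal-Wallis H statistic with tie correction."""
    pooled = [x for s in samples for x in s]
    n_total = len(pooled)
    ranks = average_ranks(pooled)
    weighted = 0.0
    start = 0
    for s in samples:
        rank_sum = sum(ranks[start:start + len(s)])
        weighted += rank_sum ** 2 / len(s)
        start += len(s)
    h = 12 / (n_total * (n_total + 1)) * weighted - 3 * (n_total + 1)
    # Tie correction: divide by 1 - sum(t^3 - t) / (N^3 - N)
    ties = sum(t ** 3 - t for t in Counter(pooled).values())
    return h / (1 - ties / (n_total ** 3 - n_total))

samples = [
    [6.9, 5.4, 5.8, 4.6, 4.0],   # Level 1
    [8.3, 6.8, 7.8, 9.2, 6.5],   # Level 2
    [8.0, 10.5, 8.1, 6.9, 9.3],  # Level 3
]
h_stat = kruskal_wallis_h(samples)
p_approx = math.exp(-h_stat / 2)  # Chi-square right tail, valid for df = 2 only
print(f"H = {h_stat:.2f}, approximate p = {p_approx:.3f}")
```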
Kruskal-Wallis test
• Do not worry. You only need to know how to interpret the Stata output.
When to use what?
Correlation
• Find out the relationship between two variables: are they associated or not?
• The standard measure of the relationship is called correlation.
• This relationship could be
– negative (as x increases, y decreases)
– positive (as x increases, y also increases).
Correlation
• Scatter plots provide a graphical way of interpreting the correlation.
– Scatter plots are similar to line graphs in that they use horizontal and vertical axes to plot data points.
– The closer the plotted data points come to forming a straight line, the higher the correlation between the two variables, or the stronger the relationship.
Correlation
• Scatter plots provide an approximation.
– If the data points make a straight line going from the origin out to high x- and y-values, then the variables have a positive correlation.
– If the line goes from a high value on the y-axis down to a high value on the x-axis, the variables have a negative correlation.
Correlation
• Variables may not have a correlation at all. There is no pattern to where the data points lie. They do not seem to go in any particular direction.
Correlation
• We use the correlation coefficient to put an exact value on the relationship between two variables.
– It measures the strength and the direction of a linear relationship between two variables.
• The correlation coefficient (r) has a range of -1 to +1.
• The absolute value of the coefficient is a measure of magnitude.
Correlation
• Positive correlation:
– If x and y have a strong positive linear correlation, r is close to +1.
– An r value of exactly +1 indicates a perfect positive fit.
– As values for x increase, values for y also increase.
• Negative correlation:
– If x and y have a strong negative linear correlation, r is close to -1.
– An r value of exactly -1 indicates a perfect negative fit.
– As values for x increase, values for y decrease.
• No correlation:
– If there is no linear correlation or a weak linear correlation, r is close to 0.
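The three cases can be illustrated with a small sketch of the Pearson correlation coefficient. The data below are made up for illustration only (they are not from the course), and `pearson_r` is a hypothetical helper name.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data chosen to show each case
x = [1, 2, 3, 4, 5]
y_positive = [2, 4, 6, 8, 10]   # y rises with x in a straight line
y_negative = [10, 8, 6, 4, 2]   # y falls as x rises in a straight line
y_none = [3, 1, 4, 1, 3]        # no linear pattern

r_pos = pearson_r(x, y_positive)
r_neg = pearson_r(x, y_negative)
r_none = pearson_r(x, y_none)
print(f"positive: {r_pos:+.2f}, negative: {r_neg:+.2f}, none: {r_none:+.2f}")
```

The sign of r gives the direction of the relationship and its absolute value gives the strength.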
Correlation
• Interpret the correlation coefficient!
• Suppose that a research study reported a correlation of r = .75 between compensation and number of years of education. What could you say about the relationship between these two variables?
Correlation
• There is a positive relationship between level of education and compensation.
• This means that people with more education tend to earn higher salaries. Similarly, people with low levels of education tend to have correspondingly lower salaries.