TRANSCRIPT
Chapter 10
Copyright © Allyn & Bacon 2008
This multimedia product and its contents are protected under copyright law. The following are prohibited by law:
• Any public performance or display, including transmission of any image over a network;
• Preparation of any derivative work, including the extraction, in whole or in part, of any images;
• Any rental, lease, or lending of the program.
Inferential statistics
Purpose
Error
Terminology
Hypothesis testing
Inferential tests
Criteria for evaluating the inferential statistics reports in studies
The purpose of inferential statistics is to draw inferences about a population on the basis of an estimate from a sample
Inferential statistics - specific statistical procedures that accomplish this purpose
The ultimate goal is to draw accurate conclusions about the population
Two types of errors
Sampling errors
▪ Without measuring the entire population, the results can be inaccurate due to sampling error
▪ The larger the proportion of the population that is sampled, the lower the sampling error; the smaller the proportion of the population that is sampled, the higher the sampling error
▪ A sample of 99% of a population is likely to show results that are very similar to those that would have been found if everyone in the population were measured
▪ A sample of 1% is likely to show results that are different from those in the population - the question is how different the sample results are
▪ Need to estimate the level of sampling error relative to the inferences being drawn
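The relationship between the proportion sampled and sampling error can be sketched with the standard error of a sample mean. This is only an illustration: it assumes simple random sampling without replacement (so the finite population correction applies), and the population size and standard deviation are made-up values, not figures from the chapter.

```python
import math

def standard_error_of_mean(population_sd, population_size, sample_size):
    """Standard error of a sample mean, with the finite population correction.

    Assumes simple random sampling without replacement.
    """
    if not 1 <= sample_size <= population_size:
        raise ValueError("sample_size must be between 1 and population_size")
    if population_size == 1:
        return 0.0
    fpc = math.sqrt((population_size - sample_size) / (population_size - 1))
    return population_sd / math.sqrt(sample_size) * fpc

# Hypothetical population: 10,000 students, score standard deviation of 15.
POPULATION_SIZE = 10_000
POPULATION_SD = 15.0

se_by_fraction = {
    fraction: standard_error_of_mean(
        POPULATION_SD, POPULATION_SIZE, round(fraction * POPULATION_SIZE)
    )
    for fraction in (0.01, 0.50, 0.99)
}

for fraction, se in se_by_fraction.items():
    print(f"sampled {fraction:.0%} of the population: standard error = {se:.3f}")
```

Under these assumptions the standard error shrinks steadily as the sampled proportion grows, and it reaches zero only when the whole population is measured.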
Measurement errors
Regardless of the sample size, the results can be inaccurate due to measurement error
▪ Lack of validity
▪ Lack of reliability
Need to estimate the level of measurement error relative to the inferences being drawn
Terminology
Null hypothesis
▪ No differences between groups
▪ No relationships between variables
Level of significance
▪ Probability of being wrong in rejecting the null hypothesis
▪ Known as alpha (α)
Types of errors
▪ Type I - rejecting the null hypothesis when it is true
▪ Type II - not rejecting (i.e., accepting) the null hypothesis when it is not true
Hypothesis testing exemplified with an experimental control group comparison
The five stages of the process
▪ State the null hypothesis - no difference between the mean scores for the experimental and control groups
▪ Assume the null hypothesis is true to establish a base from which the statistician can work
  ▪ The base is actually the sampling distribution of the test statistic, in this case the sampling distribution of the difference between two means, t
  ▪ Through statistical theory we can establish the characteristics of this sampling distribution (i.e., mean; standard deviation, known as the standard error in this situation; and shape)
The five stages of the process (continued)
▪ Calculate the observed difference between the mean scores for the two groups
▪ Compare the observed difference between mean scores to the sampling distribution of the test statistic
▪ Accept or reject the null hypothesis based on this comparison
  ▪ If the observed difference is typical of the sampling distribution, the null hypothesis is likely true and it is accepted
  ▪ If the observed difference is atypical of the sampling distribution, the null hypothesis is likely untrue and it is rejected
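The five stages can be walked through in a short sketch. Note the substitution: instead of the theoretical t distribution described above, this example builds the sampling distribution of the mean difference under the null hypothesis by exhaustively re-assigning scores to groups (an exact permutation test). The scores and the alpha level of .05 are hypothetical.

```python
from itertools import combinations
from statistics import mean

# Hypothetical posttest scores (not from the chapter).
experimental = [85, 90, 88, 92, 95]
control = [70, 75, 80, 72, 78]

# Stage 1: null hypothesis - no difference between the two group means.
# Stage 3: the observed difference between the mean scores.
observed = mean(experimental) - mean(control)

# Stage 2: assume the null is true. If group labels do not matter, every
# split of the ten scores into two groups of five is equally likely, so the
# differences from all splits form the sampling distribution under the null.
pooled = experimental + control
null_diffs = []
for idx in combinations(range(len(pooled)), len(experimental)):
    chosen = set(idx)
    group1 = [pooled[i] for i in chosen]
    group2 = [pooled[i] for i in range(len(pooled)) if i not in chosen]
    null_diffs.append(mean(group1) - mean(group2))

# Stage 4: compare the observed difference to that distribution (two-sided).
extreme = sum(1 for d in null_diffs if abs(d) >= abs(observed) - 1e-9)
p_value = extreme / len(null_diffs)

# Stage 5: reject the null hypothesis if the observed difference is atypical.
ALPHA = 0.05  # assumed level of significance
reject_null = p_value < ALPHA

print(f"observed difference = {observed}, p = {p_value:.4f}, reject = {reject_null}")
```

With these scores only 2 of the 252 possible splits produce a difference as large as the observed one, so the observed difference is atypical of the null distribution and the null hypothesis is rejected.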
Issues related to statistical and practical significance
▪ Statistical significance
  ▪ The typical or atypical nature of the comparison of the observed difference to the sampling distribution can be estimated using statistical theory
    ▪ The estimate is the probability of being wrong in rejecting the null hypothesis
    ▪ It is stated as p = x where x is the specific probability of the comparison (e.g., p = .001, p = .042, p = .56) or as p < y where y is the alpha level (e.g., .10, .05, .01)
▪ There is always the possibility of making a mistake given that this is based on a probability model
  ▪ Type I error - deciding to reject the null hypothesis when in reality it is true
  ▪ Type II error - accepting the null hypothesis when in reality it is false
▪ Typical levels of significance in education - .10, .05, and .01
▪ Factors affecting the level of significance
  ▪ The actual differences between the groups
  ▪ The degree to which sampling and measurement errors exist
  ▪ The size of the sample
Practical significance
▪ Practical significance is related to the importance and usefulness of the results
▪ Estimates of practical significance
  ▪ For correlations the coefficient of determination (i.e., r²) is used
  ▪ For comparisons an effect size is used
    ▪ Effect size is the difference between two group means in terms of the control group standard deviation
    ▪ Evaluating effect sizes – small (.30), moderate (.50), and large (.75)
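Both estimates of practical significance are simple to compute. The sketch below follows the definitions above; the scores are hypothetical, and treating the .30, .50, and .75 benchmarks as lower bounds of each category is an assumption made for illustration.

```python
from statistics import mean, stdev

def effect_size(experimental, control):
    """Difference between group means in control-group standard deviation units."""
    return (mean(experimental) - mean(control)) / stdev(control)

def label_effect_size(es):
    """Label an effect size using the benchmarks above as assumed lower bounds."""
    size = abs(es)
    if size >= 0.75:
        return "large"
    if size >= 0.50:
        return "moderate"
    if size >= 0.30:
        return "small"
    return "below small"

# Hypothetical scores (not from the chapter).
experimental = [78, 80, 76, 79, 77]
control = [70, 75, 80, 72, 78]

es = effect_size(experimental, control)
print(f"effect size = {es:.2f} ({label_effect_size(es)})")

# Coefficient of determination for a correlation of .63.
r = 0.63
r_squared = r ** 2
print(f"r = {r}, r squared = {r_squared:.4f}")
```

Here r squared is about .40, meaning roughly 40% of the variance in one variable is shared with the other.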
Each consumer of the research must judge the balance between the statistical significance and the practical significance of the statistical results given the context in which the results might be used
Two types of inferential tests
Parametric - inferential procedures using interval or ratio level data
Non-parametric - inferential procedures using nominal or ordinal data
T-test
A comparison of the means for two groups
▪ Do the mean scores on the final exam differ for the experimental and control groups?
▪ Independent samples t-test - compares the means of two separate groups on one variable
  ▪ Posttest means for Group 1 and Group 2
▪ Dependent samples t-test - compares the means of two variables for one group
  ▪ Pretest and posttest means for Group 1
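The two t statistics can be sketched as follows. The data are hypothetical, the independent samples version assumes equal group variances (the pooled-variance form), and only the t value and degrees of freedom are computed; turning t into a p value requires the t distribution, which is not included here.

```python
import math
from statistics import mean, stdev, variance

def independent_t(group1, group2):
    """Pooled-variance t for two separate groups; returns (t, degrees of freedom)."""
    n1, n2 = len(group1), len(group2)
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * variance(group1) + (n2 - 1) * variance(group2)) / df
    standard_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean(group1) - mean(group2)) / standard_error, df

def dependent_t(pretest, posttest):
    """t for paired scores from one group; returns (t, degrees of freedom)."""
    diffs = [post - pre for pre, post in zip(pretest, posttest)]
    standard_error = stdev(diffs) / math.sqrt(len(diffs))
    return mean(diffs) / standard_error, len(diffs) - 1

# Hypothetical posttest scores for two separate groups.
group1_posttest = [85, 90, 88, 92, 95]
group2_posttest = [70, 75, 80, 72, 78]
t_ind, df_ind = independent_t(group1_posttest, group2_posttest)

# Hypothetical pretest and posttest scores for the same five students.
group1_pretest = [60, 65, 70, 75, 80]
group1_retest = [65, 68, 76, 79, 87]
t_dep, df_dep = dependent_t(group1_pretest, group1_retest)

print(f"independent samples: t({df_ind}) = {t_ind:.2f}")
print(f"dependent samples:   t({df_dep}) = {t_dep:.2f}")
```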
T-test (continued)
A determination of whether a relationship exists
▪ Does a correlation of +.63 between students’ math attitudes and math achievement indicate a relationship exists between these two variables?
▪ Correlation t-test - compares the magnitude of the difference between a correlation coefficient and 0.00
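A minimal sketch of the correlation t-test for the +.63 example, using the standard formula t = r * sqrt(n - 2) / sqrt(1 - r²). The sample size of 30 students is an assumption; the slide does not give one.

```python
import math

def correlation_t(r, n):
    """t statistic for testing whether a correlation differs from 0.00."""
    if n < 3 or not -1 < r < 1:
        raise ValueError("need n >= 3 and -1 < r < 1")
    df = n - 2
    return r * math.sqrt(df) / math.sqrt(1 - r ** 2), df

R_OBSERVED = 0.63  # correlation from the example above
N_STUDENTS = 30    # hypothetical sample size

t_value, df = correlation_t(R_OBSERVED, N_STUDENTS)
print(f"t({df}) = {t_value:.2f}")
```

The same correlation yields a larger t with more students and a smaller t with fewer, which is one way sample size affects statistical significance.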
Analysis of variance (ANOVA)
A comparison of the means for two or more groups
Omnibus ANOVA - a procedure that indicates whether one or more pairs of means are different
Do the mean scores differ for the groups using co-operative group, lecture, or web-based instruction?
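The omnibus test rests on the F ratio, which compares the variability between group means to the variability within groups. A minimal sketch with hypothetical scores for the three instructional methods:

```python
from statistics import mean

def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    all_scores = [score for group in groups for score in group]
    grand_mean = mean(all_scores)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((score - mean(g)) ** 2 for g in groups for score in g)
    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within

# Hypothetical scores (not from the chapter).
cooperative = [86, 88, 90]
lecture = [78, 80, 82]
web_based = [82, 84, 86]

f_ratio, df_between, df_within = one_way_anova_f([cooperative, lecture, web_based])
print(f"F({df_between}, {df_within}) = {f_ratio:.2f}")
```

A large F only says that at least one pair of means differs; it does not say which pair.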
ANOVA (continued)
Multiple comparisons (i.e., post-hoc)
▪ Procedures that indicate which specific pairs of means are different as a follow-up to a significant omnibus ANOVA result
▪ Do the mean scores differ between the co-operative group and lecture, co-operative group and web-based, and lecture and web-based instruction?
▪ Two common tests
  ▪ Tukey
  ▪ Scheffé
Factorial ANOVA
A procedure that analyzes the difference between groups across two or more independent variables
Do the mean scores differ for co-operative group, lecture, and web-based instruction for males and females?
Effects
▪ Main effects - differences between the levels of each independent variable
▪ Interaction effects - differences between combinations of the levels of each independent variable
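The distinction between main and interaction effects can be seen in a table of cell means. This sketch is descriptive only (no significance test), assumes equal numbers of students in every cell, and uses made-up means.

```python
from statistics import mean

# Hypothetical cell means for a 3 (method) x 2 (sex) design.
cell_means = {
    ("co-operative", "male"): 80, ("co-operative", "female"): 90,
    ("lecture", "male"): 80,      ("lecture", "female"): 80,
    ("web-based", "male"): 85,    ("web-based", "female"): 75,
}
methods = ["co-operative", "lecture", "web-based"]
sexes = ["male", "female"]

# Main effects: compare the marginal means for the levels of each variable.
method_means = {m: mean(cell_means[(m, s)] for s in sexes) for m in methods}
sex_means = {s: mean(cell_means[(m, s)] for m in methods) for s in sexes}

# Interaction: does the female-male gap change from one method to another?
sex_gap_by_method = {
    m: cell_means[(m, "female")] - cell_means[(m, "male")] for m in methods
}
has_interaction_pattern = len(set(sex_gap_by_method.values())) > 1

print("method means:", method_means)
print("sex means:", sex_means)
print("female - male gap by method:", sex_gap_by_method)
```

In this made-up table the overall male and female means are identical, yet the gap between them reverses across methods, which is the pattern an interaction effect describes.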
Analysis of covariance (ANCOVA)
A procedure that compares means after statistically adjusting them for pretest differences between groups
Very stringent assumptions must be met to use this procedure
Adjusts for small to moderate - not large - pretest differences
Multivariate statistics
Comparisons or relationships involving two or more dependent variables
Comparison of means
▪ Are there differences in the attitudes and performances of students being taught with lecture or web-based instruction?
▪ Specific tests
  ▪ Multivariate ANOVA (MANOVA)
  ▪ Multivariate ANCOVA (MANCOVA)
  ▪ Hotelling’s T²
Multivariate statistics (continued)
Relationships
▪ Are students’ affective traits (e.g., attitudes, self-esteem, preferences, etc.) predictive of their knowledge (i.e., test scores) and skills (i.e., performances)?
▪ Canonical correlation
Chi-square - differences in frequencies across different categories
Do mothers and fathers differ in their support of a year-round school calendar?
Do the percentages of undergraduate, graduate, and doctoral students differ in terms of their support for the new class attendance policy?
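The first question can be sketched as a chi-square test on a 2 x 2 table of frequencies. The counts are hypothetical; 3.841 is the standard tabled critical value for one degree of freedom at the .05 level.

```python
def chi_square(observed):
    """Return (chi-square statistic, degrees of freedom) for a frequency table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(observed):
        for j, count in enumerate(row):
            # Frequency expected in this cell if the null hypothesis is true.
            expected = row_totals[i] * col_totals[j] / total
            statistic += (count - expected) ** 2 / expected
    df = (len(observed) - 1) * (len(observed[0]) - 1)
    return statistic, df

# Hypothetical counts: rows are mothers and fathers,
# columns are "support" and "do not support".
observed = [
    [30, 20],  # mothers
    [20, 30],  # fathers
]

statistic, df = chi_square(observed)
CRITICAL_VALUE_05_DF1 = 3.841  # tabled chi-square value, alpha = .05, df = 1
reject_null = statistic > CRITICAL_VALUE_05_DF1

print(f"chi-square({df}) = {statistic:.2f}, reject null at .05: {reject_null}")
```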
Comparison of means
▪ Mann-Whitney U-test
▪ Wilcoxon test
▪ Kruskal-Wallis ANOVA
Relationships
▪ Spearman r
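As one example from this list, Spearman r works on ranks rather than raw scores, which is why it suits ordinal data. The sketch below uses the shortcut formula 1 - 6Σd² / (n(n² - 1)), which is valid only when there are no tied scores; the data are hypothetical.

```python
def ranks(values):
    """Rank scores from 1 (lowest) to n (highest); assumes no tied scores."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]

def spearman_r(x, y):
    """Spearman rank correlation using the no-ties shortcut formula."""
    n = len(x)
    sum_d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks(x), ranks(y)))
    return 1 - 6 * sum_d_squared / (n * (n ** 2 - 1))

# Hypothetical attitude and achievement scores for five students.
attitude = [12, 5, 18, 9, 20]
achievement = [55, 40, 90, 60, 80]

rho = spearman_r(attitude, achievement)
print(f"Spearman r = {rho:.2f}")
```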
Basic descriptive statistics are needed to evaluate the inferential results
Inferential analyses report statistical significance, not practical significance
Inferential analyses do not indicate internal or external validity
The results depend on sample sizes
The appropriate statistical procedures are used
The level of significance is interpreted correctly
Caution is used to interpret non-parametric results from studies with few subjects in one or more groups or categories