
Chapter 10

Copyright © Allyn & Bacon 2008

This multimedia product and its contents are protected under copyright law. The following are prohibited by law:• Any public performance or display, including transmission of any image over a network;• Preparation of any derivative work, including the extraction, in whole or in part, of any images;• Any rental, lease, or lending of the program.

Inferential statistics

▪ Purpose

▪ Error

▪ Terminology

▪ Hypothesis testing

Inferential tests

Criteria for evaluating the inferential statistics reports in studies


The purpose of inferential statistics is to draw inferences about a population on the basis of an estimate from a sample

Inferential statistics - specific statistical procedures that accomplish this purpose

The ultimate goal is to draw accurate conclusions about the population


Two types of errors

Sampling errors

▪ Without measuring the entire population, the results can be inaccurate due to sampling error

▪ The larger the proportion of the population that is sampled, the lower the sampling error; the smaller the proportion of the population that is sampled, the higher the sampling error

▪ A sample of 99% of a population is likely to show results that are very similar to those that would have been found if everyone in the population were measured

▪ A sample of 1% is likely to show results that differ from those in the population; the question is how different the sample results are

▪ Need to estimate the level of sampling error relative to the inferences being drawn
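The contrast between a 1% sample and a 99% sample can be illustrated with a small simulation. This is only a sketch: the population of 1,000 scores, its mean and standard deviation, and the helper name `mean_sampling_error` are all hypothetical and are not taken from the chapter.

```python
import random
import statistics

def mean_sampling_error(population, fraction, trials=200, seed=0):
    """Average absolute gap between the sample mean and the population mean."""
    rng = random.Random(seed)
    pop_mean = statistics.fmean(population)
    n = max(1, round(len(population) * fraction))
    gaps = []
    for _ in range(trials):
        sample = rng.sample(population, n)  # simple random sample, no replacement
        gaps.append(abs(statistics.fmean(sample) - pop_mean))
    return statistics.fmean(gaps)

# Hypothetical population: 1,000 scores drawn with mean near 70 and SD near 10.
rng = random.Random(42)
population = [rng.gauss(70, 10) for _ in range(1000)]

error_1_percent = mean_sampling_error(population, 0.01)
error_99_percent = mean_sampling_error(population, 0.99)
print(f"1% sample:  average error {error_1_percent:.3f}")
print(f"99% sample: average error {error_99_percent:.3f}")
```

Under these assumptions the 99% sample should land much closer to the population mean than the 1% sample, which is the pattern the bullets above describe.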


Measurement errors

▪ Regardless of the sample size, the results can be inaccurate due to measurement error
  ▪ Lack of validity
  ▪ Lack of reliability

▪ Need to estimate the level of measurement error relative to the inferences being drawn


Terminology

Null hypothesis
  ▪ No differences between groups
  ▪ No relationships between variables

Level of significance
  ▪ Probability of being wrong in rejecting the null hypothesis
  ▪ Known as alpha (α)

Types of errors
  ▪ Type I - rejecting the null hypothesis when it is true
  ▪ Type II - not rejecting (i.e., accepting) the null hypothesis when it is not true
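The two error types can be made concrete by simulating many studies. The sketch below is an illustration under stated assumptions, not the chapter's procedure: it swaps in a z-test with a known population standard deviation (so that only the standard library is needed), and the group size, means, and the function names `z_test_p_value` and `rejection_rate` are hypothetical.

```python
import random
from statistics import NormalDist, fmean

def z_test_p_value(group_a, group_b, sigma):
    """Two-sided p-value for a difference in means when sigma is known."""
    se = sigma * (1 / len(group_a) + 1 / len(group_b)) ** 0.5
    z = (fmean(group_a) - fmean(group_b)) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))

def rejection_rate(true_difference, alpha=0.05, n=30, trials=2000, seed=1):
    """Share of simulated studies in which the null hypothesis is rejected."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        a = [rng.gauss(50 + true_difference, 10) for _ in range(n)]
        b = [rng.gauss(50, 10) for _ in range(n)]
        if z_test_p_value(a, b, sigma=10) < alpha:
            rejections += 1
    return rejections / trials

# Null hypothesis true: every rejection is a Type I error.
type_1_rate = rejection_rate(true_difference=0)
# Null hypothesis false: every failure to reject is a Type II error.
type_2_rate = 1 - rejection_rate(true_difference=5)
print(f"Type I rate:  {type_1_rate:.3f}")
print(f"Type II rate: {type_2_rate:.3f}")
```

With alpha set to .05, the simulated Type I rate should sit close to .05, while the Type II rate depends on the size of the true difference and the sample size chosen here.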


Hypothesis testing exemplified with an experimental control group comparison

The five stages of the process

▪ State the null hypothesis - no difference between the mean scores for the experimental and control groups

▪ Assume the null hypothesis is true to establish a base from which the statistician can work
  ▪ The base is actually the sampling distribution of the test statistic, in this case the sampling distribution of the difference between two means, t
  ▪ Through statistical theory we can establish the characteristics of this sampling distribution (i.e., mean; standard deviation, known as the standard error in this situation; and shape)


The five stages of the process (continued)

▪ Calculate the observed difference between the mean scores for the two groups

▪ Compare the observed difference between mean scores to the sampling distribution of the test statistic

▪ Accept or reject the null hypothesis based on this comparison
  ▪ If the observed difference is typical of the sampling distribution, the null hypothesis is likely true and it is accepted
  ▪ If the observed difference is atypical of the sampling distribution, the null hypothesis is likely untrue and it is rejected
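The five stages can be sketched in code. Note the substitution: instead of deriving the sampling distribution from statistical theory, this sketch builds it by repeatedly shuffling group labels (a permutation test), which makes the "assume the null hypothesis is true" step explicit. The scores, the alpha of .05, and the function name `permutation_test` are hypothetical.

```python
import random
from statistics import fmean

def permutation_test(experimental, control, shuffles=5000, seed=0):
    """Return the observed mean difference and a two-sided permutation p-value."""
    rng = random.Random(seed)
    # Stage 3: the observed difference between the two group means.
    observed = fmean(experimental) - fmean(control)
    pooled = list(experimental) + list(control)
    n_exp = len(experimental)
    at_least_as_extreme = 0
    for _ in range(shuffles):
        # Stage 2: if the null is true, group labels are interchangeable,
        # so each shuffle yields one draw from the sampling distribution.
        rng.shuffle(pooled)
        diff = fmean(pooled[:n_exp]) - fmean(pooled[n_exp:])
        # Stage 4: compare the observed difference to that distribution.
        if abs(diff) >= abs(observed):
            at_least_as_extreme += 1
    return observed, at_least_as_extreme / shuffles

# Stage 1: null hypothesis - no difference between the group means.
experimental = [82, 88, 75, 91, 85, 79, 90, 84]  # hypothetical final exam scores
control = [74, 70, 81, 68, 77, 72, 79, 73]       # hypothetical final exam scores

observed, p_value = permutation_test(experimental, control)
# Stage 5: reject the null only if the observed difference is atypical.
alpha = 0.05
decision = "reject" if p_value < alpha else "fail to reject"
print(f"difference = {observed:.2f}, p = {p_value:.4f}, decision: {decision}")
```

For these made-up scores the observed difference is rarely matched by shuffled data, so the difference is atypical of the sampling distribution and the null hypothesis is rejected.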


Issues related to statistical and practical significance

Statistical significance

▪ The typical or atypical nature of the comparison of the observed difference to the sampling distribution can be estimated using statistical theory
  ▪ The estimate is the probability of being wrong in rejecting the null hypothesis
  ▪ It is stated as p = x, where x is the specific probability of the comparison (e.g., p = .001, p = .042, p = .56), or as p < y, where y is the alpha level (e.g., .10, .05, .01)


▪ There is always the possibility of making a mistake given that this is based on a probability model
  ▪ Type I error - deciding to reject the null hypothesis when in reality it is true
  ▪ Type II error - accepting the null hypothesis when in reality it is false

▪ Typical levels of significance in education - .10, .05, and .01

▪ Factors affecting the level of significance
  ▪ The actual differences between the groups
  ▪ The degree to which sampling and measurement errors exist
  ▪ The size of the sample


Practical significance

▪ Practical significance is related to the importance and usefulness of the results

▪ Estimates of practical significance
  ▪ For correlations the coefficient of determination (i.e., r²) is used
  ▪ For comparisons an effect size is used
    ▪ Effect size is the difference between two group means in terms of the control group standard deviation
    ▪ Evaluating effect sizes - small (.30), moderate (.50), and large (.75)
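Both estimates of practical significance reduce to short formulas. The sketch below follows the definition given above, dividing the mean difference by the control group standard deviation (a form often called Glass's delta); the scores and function names are hypothetical, and the correlation of .63 is borrowed from the chapter's own math attitude example.

```python
from statistics import fmean, stdev

def effect_size(experimental, control):
    """Mean difference expressed in control-group standard deviations."""
    return (fmean(experimental) - fmean(control)) / stdev(control)

def coefficient_of_determination(r):
    """Proportion of variance shared by two correlated variables."""
    return r ** 2

# Hypothetical scores: the experimental group averages 4 points higher.
control = [70, 75, 80, 85, 90]
experimental = [74, 79, 84, 89, 94]

es = effect_size(experimental, control)
r_squared = coefficient_of_determination(0.63)
print(f"effect size = {es:.2f}")
print(f"r squared   = {r_squared:.4f}")
```

An effect size near .50 would count as moderate by the benchmarks above, and an r of .63 means the two variables share roughly 40% of their variance.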


Each consumer of the research must judge the balance between the statistical significance and the practical significance of the statistical results given the context in which the results might be used


Two types of inferential tests

▪ Parametric - inferential procedures using interval or ratio level data

▪ Non-parametric - inferential procedures using nominal or ordinal data


T-test

A comparison of the means for two groups

▪ Do the mean scores on the final exam differ for the experimental and control groups?

▪ Independent samples t-test - compares the means of two separate groups on one variable
  ▪ Posttest means for Group 1 and Group 2

▪ Dependent samples t-test - compares the means of two variables for one group
  ▪ Pretest and posttest means for Group 1
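The two t-tests differ in how the standard error is built. The following sketch computes both test statistics by hand; it assumes the pooled-variance form for the independent samples case, and the scores and function names are hypothetical. Turning t into a p-value requires the t distribution, which is left to a table or a statistics library.

```python
from math import sqrt
from statistics import fmean, stdev, variance

def independent_t(group_1, group_2):
    """Pooled-variance t statistic and df for two separate groups."""
    n1, n2 = len(group_1), len(group_2)
    pooled_var = ((n1 - 1) * variance(group_1)
                  + (n2 - 1) * variance(group_2)) / (n1 + n2 - 2)
    se = sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (fmean(group_1) - fmean(group_2)) / se, n1 + n2 - 2

def dependent_t(pretest, posttest):
    """t statistic and df for paired scores from one group."""
    diffs = [post - pre for pre, post in zip(pretest, posttest)]
    n = len(diffs)
    return fmean(diffs) / (stdev(diffs) / sqrt(n)), n - 1

# Hypothetical scores.
group_1_posttest = [82, 88, 75, 91, 85, 79, 90, 84]
group_2_posttest = [74, 70, 81, 68, 77, 72, 79, 73]
group_1_pretest = [70, 75, 68, 80, 77, 72, 81, 74]

t_ind, df_ind = independent_t(group_1_posttest, group_2_posttest)
t_dep, df_dep = dependent_t(group_1_pretest, group_1_posttest)
print(f"independent: t({df_ind}) = {t_ind:.3f}")
print(f"dependent:   t({df_dep}) = {t_dep:.3f}")
```

The dependent test works on each person's change score, which is why its degrees of freedom come from the number of pairs rather than the total number of scores.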


T-test (continued)

A determination of whether a relationship exists

▪ Does a correlation of +.63 between students’ math attitudes and math achievement indicate a relationship exists between these two variables?

▪ Correlation t-test - compares the magnitude of the difference between a correlation coefficient and 0.00
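A correlation can be tested against 0.00 with the standard statistic t = r·sqrt(n − 2) / sqrt(1 − r²) on n − 2 degrees of freedom. The chapter does not give a sample size for the +.63 example, so the n of 30 below is a hypothetical value chosen for illustration, as is the function name.

```python
from math import sqrt

def correlation_t(r, n):
    """t statistic and df for testing whether a correlation differs from 0."""
    return r * sqrt(n - 2) / sqrt(1 - r ** 2), n - 2

r = 0.63
n = 30  # hypothetical sample size; not stated in the source
t_corr, df_corr = correlation_t(r, n)

# Two-tailed critical value for alpha = .05 and df = 28, from a standard t table.
critical_t = 2.048
relationship_exists = abs(t_corr) > critical_t
print(f"t({df_corr}) = {t_corr:.2f}, significant at .05: {relationship_exists}")
```

The same r can be significant or not depending on n, so the conclusion here holds only for the assumed sample size.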


Analysis of variance (ANOVA)

A comparison of the means for two or more groups

▪ Omnibus ANOVA - a procedure that indicates whether one or more pairs of means are different

▪ Do the mean scores differ for the groups using co-operative group, lecture, or web-based instruction?
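The omnibus F statistic compares the variation between group means to the variation within groups. Here is a minimal sketch of a one-way ANOVA for the three instruction types; the scores and the function name `one_way_anova_f` are hypothetical.

```python
from statistics import fmean

def one_way_anova_f(*groups):
    """Return F, df between, and df within for a one-way ANOVA."""
    all_scores = [x for g in groups for x in g]
    grand_mean = fmean(all_scores)
    # Between-groups sum of squares: group means around the grand mean.
    ss_between = sum(len(g) * (fmean(g) - grand_mean) ** 2 for g in groups)
    # Within-groups sum of squares: scores around their own group mean.
    ss_within = sum((x - fmean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within

# Hypothetical scores for each instruction type.
cooperative = [85, 88, 90, 84, 87]
lecture = [78, 80, 75, 82, 79]
web_based = [83, 81, 86, 80, 84]

f_stat, df_b, df_w = one_way_anova_f(cooperative, lecture, web_based)
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

A large F says only that at least one pair of means differs; it does not say which pair.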


ANOVA (continued) Multiple comparisons (i.e., post-hoc)

▪ Procedures that indicate which specific pairs of means are different as a follow-up to a significant omnibus ANOVA result

▪ Do the mean scores differ between the co-operative group and lecture, co-operative group and web-based, and lecture and web-based instruction?

▪ Two common tests
  ▪ Tukey
  ▪ Scheffé


Factorial ANOVA

▪ A procedure that analyzes the difference between groups across two or more independent variables

▪ Do the mean scores differ for co-operative group, lecture, and web-based instruction for males and females?

▪ Effects
  ▪ Main effects - differences between the levels of each independent variable
  ▪ Interaction effects - differences between combinations of the levels of each independent variable
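Main effects and interaction effects can be read from a table of cell means before any significance test is run. The sketch below only describes the pattern and does not compute F ratios; the cell means are hypothetical and equal cell sizes are assumed, so marginal means are simple averages.

```python
from statistics import fmean

methods = ["co-operative", "lecture", "web-based"]
sexes = ["female", "male"]

# Hypothetical mean scores for each method-by-sex combination.
cell_means = {
    ("co-operative", "female"): 88, ("co-operative", "male"): 84,
    ("lecture", "female"): 78, ("lecture", "male"): 80,
    ("web-based", "female"): 83, ("web-based", "male"): 83,
}

# Main effects: compare the marginal means across levels of one variable.
method_means = {m: fmean([cell_means[(m, s)] for s in sexes]) for m in methods}
sex_means = {s: fmean([cell_means[(m, s)] for m in methods]) for s in sexes}

# Interaction: does the female-male gap change from one method to another?
sex_gap_by_method = {
    m: cell_means[(m, "female")] - cell_means[(m, "male")] for m in methods
}

print("method means:", method_means)
print("sex means:", sex_means)
print("female minus male gap by method:", sex_gap_by_method)
```

In this made-up table the gap between females and males is not constant across methods, which is the descriptive signature of an interaction; whether it is statistically significant is what the factorial ANOVA tests.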


Analysis of covariance (ANCOVA)

▪ A procedure that compares means after statistically adjusting them for pretest differences between groups

▪ Very stringent assumptions must be met to use this procedure

▪ Adjusts for small to moderate - not large - pretest differences


Multivariate statistics

Comparisons or relationships involving two or more dependent variables

Comparison of means

▪ Are there differences in the attitudes and performances of students being taught with lecture or web-based instruction?

▪ Specific tests
  ▪ Multivariate ANOVA (MANOVA)
  ▪ Multivariate ANCOVA (MANCOVA)
  ▪ Hotelling’s T²


Multivariate statistics (continued)

Relationships

▪ Are students’ affective traits (e.g., attitudes, self-esteem, preferences, etc.) predictive of their knowledge (i.e., test scores) and skills (i.e., performances)?

▪ Canonical correlation


Chi-square - differences in frequencies across different categories

▪ Do mothers and fathers differ in their support of a year-round school calendar?

▪ Do the percentages of undergraduate, graduate, and doctoral students differ in terms of their support for the new class attendance policy?
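The chi-square statistic compares observed frequencies with the frequencies expected if the categories did not differ. A minimal sketch for the mothers and fathers question follows; the counts and the function name `chi_square` are hypothetical.

```python
def chi_square(observed):
    """Chi-square statistic and df for a table of observed frequencies."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(observed):
        for j, count in enumerate(row):
            # Expected count if row and column categories are unrelated.
            expected = row_totals[i] * col_totals[j] / n
            statistic += (count - expected) ** 2 / expected
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, df

# Hypothetical counts: rows are mothers and fathers,
# columns are support and oppose.
observed = [
    [40, 10],
    [30, 20],
]

chi2, df = chi_square(observed)
# Critical value for alpha = .05 and df = 1, from a standard chi-square table.
critical_value = 3.841
print(f"chi-square({df}) = {chi2:.3f}, significant at .05: {chi2 > critical_value}")
```

The same function handles the three-category student example by passing a table with three rows.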


Comparison of means

▪ Mann-Whitney U-test

▪ Wilcoxon test

▪ Kruskal-Wallis ANOVA

Relationships

▪ Spearman r
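Two of these non-parametric statistics are simple enough to compute by hand because they rely only on the order of the scores. The sketch below is a simplified illustration: the Spearman formula used assumes no tied scores, only the test statistics are computed (not their p-values), and all data and function names are hypothetical.

```python
def ranks(values):
    """Rank scores from 1 (lowest) upward; assumes there are no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result

def spearman_r(x, y):
    """Spearman rank correlation using the no-ties shortcut formula."""
    n = len(x)
    d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks(x), ranks(y)))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))

def mann_whitney_u(group_a, group_b):
    """U for group_a: pairs where a score in A exceeds one in B (ties count half)."""
    return sum(
        1.0 if a > b else 0.5 if a == b else 0.0
        for a in group_a
        for b in group_b
    )

# Hypothetical paired scores for six students.
attitude = [12, 15, 9, 20, 17, 11]
achievement = [55, 70, 50, 82, 68, 60]
rho = spearman_r(attitude, achievement)

# Hypothetical scores for two separate groups.
u_a = mann_whitney_u([82, 88, 75, 91], [74, 70, 81, 68])
print(f"Spearman r = {rho:.3f}, Mann-Whitney U = {u_a}")
```

With few subjects, as here, the resulting statistics should be interpreted cautiously.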


Basic descriptive statistics are needed to evaluate the inferential results

Inferential analyses report statistical significance, not practical significance

Inferential analyses do not indicate internal or external validity

The results depend on sample sizes

The appropriate statistical procedures are used

The level of significance is interpreted correctly

Caution is used to interpret non-parametric results from studies with few subjects in one or more groups or categories

