Handout Three: Review of t-Tests; Partitioning of Variance, F-Statistic, & F-Distributions. EPSE 592: Experimental Designs and Analysis in Educational Research. Instructor: Dr. Amery Wu

Upload: candace-lawson

Post on 08-Jan-2018


Handout Three: Review of t-Tests

Partitioning of Variance, F-Statistic, & F-Distributions

EPSE 592: Experimental Designs and Analysis in Educational Research

Instructor: Dr. Amery Wu

About Analysis of Variance Designs

• Measurement of the data: quantitative
• Type of statistical inference: descriptive or inferential
• Type of modeling: Within each of the four cells, the statistical model can be summative/descriptive or explanatory/predictive.

Analysis of Variance Design

                     Measurement of Data
Type of Inference    Quantitative              Categorical
Descriptive          Summative/Descriptive     Summative/Descriptive
                     Explanatory/Predictive    Explanatory/Predictive
Inferential          Summative/Descriptive     Summative/Descriptive
                     Explanatory/Predictive    Explanatory/Predictive

Goals of Today’s Class

Review of “Describing and Explaining Quantitative Data”

Brushing up Your SPSS


Where We Have Been & Where We Are Today

So far, this course has reviewed descriptive statistics (model) for the sample data.

Today, we will review the two pieces of machinery that move our discussion from sample statistics to inferences about population parameters.

We will use the one-sample t-test as an example for our review of inferential statistics.

                     Measurement of Data
Type of Inference    Continuous    Categorical
Descriptive          A             B
Inferential          C             D

Sampling Errors

The sampling error is the difference between a sample statistic and the population parameter.

Sampling errors are caused by sporadic and unpredictable sources that arise from sampling, i.e., sample-to-sample variation.

For example, among the randomly sampled participants, one got fired at work and another won the lottery just before being surveyed to rate their happiness.

Problems with Inferring the Population

Modeling the unknown population is analogous to putting a jigsaw puzzle together without knowing its real image.

[Figure: a sample with M = 65.32 and an unknown population (μ = 0, 50, 65, or 70?), linked by the sampling distribution and hypothesis testing]

Two Pieces of Statistical Machinery for Inferential Statistics

Inferential Statistics

Inferential statistics takes sampling error into account when inferring from the statistic of one single sample to the population parameter.

Two essential but esoteric pieces of machinery for making inferences about the population based on one single sample:

1. Sampling Distribution
2. Hypothesis Testing

Sampling Distributions

Consider a very large normal population. Assume we take 1000 samples of size 100 from the population and calculate the sample mean, our “statistic of interest”, for each sample.

Each sample of 100 will yield a somewhat different sample mean from the rest. The distribution of the 1000 means approximates the “sampling distribution of the sample mean”.

A sampling distribution is the (probability) distribution of a statistic of interest under repeated sampling from the population.
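The repeated-sampling idea can be tried out in a few lines of Python. This is only a sketch: the population mean of 50 and SD of 10 are made-up values (the slide does not specify any), and the function name `simulate_sample_means` is hypothetical.

```python
import random
import statistics

def simulate_sample_means(mu=50.0, sigma=10.0, n=100, reps=1000, seed=0):
    """Draw `reps` samples of size `n` from Normal(mu, sigma); return their means."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [
        statistics.fmean(rng.gauss(mu, sigma) for _ in range(n))
        for _ in range(reps)
    ]

# mu and sigma are assumed values chosen only for illustration
means = simulate_sample_means()
print(f"number of sample means:   {len(means)}")
print(f"mean of the sample means: {statistics.fmean(means):.2f}")
print(f"SD of the sample means:   {statistics.stdev(means):.2f}")
```

The list `means` plays the role of the 1000 sample means on the slide. Its mean should land close to the assumed population mean, and its SD should be much smaller than the assumed population SD.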

Sampling Distributions

A sampling distribution is a theoretical distribution against which the sample statistic is tested.

A sampling distribution is a theoretical distribution that takes into account the sampling errors by considering the size of the sample (i.e., the degrees of freedom).

The shape of the sampling distribution depends on the population distribution, the statistic of interest, and the degrees of freedom.

Sampling Distributions of the Sample Mean

According to the central limit theorem, even if the population is not normal, the sampling distribution of the sample mean will still be approximately normal provided the sample size is sufficiently large.

Also, the central limit theorem states that, given a population distribution with a mean of μ and an SD of σ, the sampling distribution of the mean for samples of size N has a mean of μ and an SD of σ/√N.

Namely, the sampling distribution of the mean, with a sufficient sample size, will be normally distributed with a mean of μ and a standard deviation of σ/√N.

The standard deviation of the sampling distribution, σ/√N, has a special name: the standard error of the mean.
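As a quick numerical check of the standard error formula, the sketch below plugs in the kidrate values reported in the SPSS output later in this handout (SD = 17.34327, N = 40). It uses the sample SD as an estimate of σ, and the function name `standard_error` is only illustrative.

```python
import math

def standard_error(sd, n):
    """Standard error of the mean: SD divided by the square root of n."""
    return sd / math.sqrt(n)

# SD and N taken from the kidrate output shown later in this handout
se = standard_error(17.34327, 40)
print(f"SE of the mean = {se:.5f}")
```

The result should agree with the "Std. Error Mean" of 2.74221 in the SPSS table.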

Lab Activity: Online Sampling Distribution Simulation

Go to the home page of the Rice Virtual Lab in Statistics. Choose “Simulation and Demonstrations” on the menu. Choose “Sampling Distribution” and enter.

Or go directly to http://onlinestatbook.com/stat_sim/sampling_dist/index.html

Read and follow the instructions.

The Family of t-sampling (Theoretical) Distributions

Side Notes for Sampling Distribution

In this class, we used the mean to demonstrate the notion of a sampling distribution. Statistics other than the mean have sampling distributions too. Students often equate “sampling distribution” with the “sampling distribution of the mean”. That is an unfortunate mistake.

For instance, the sampling distribution of the median is the distribution that would result if the median, instead of the mean, were computed for each re-sampling.

It is crucial to understand the notion and mechanism of the sampling distribution, since almost all the inferential statistics you will learn are based on reference to appropriate sampling distributions.

Hypothesis Testing (HT)

HT is a statistical mechanism for making an inference from the sample statistic to the population parameter by taking into account the “random error” (sample-to-sample error).

In other words, one uses the data to provide evidence for one’s hypothesis about the population via the “sampling distribution”.

Step One: Specify the Hypotheses

Specify the null hypothesis (H0) and the alternative hypothesis (H1). Typically, the null hypothesis is the statement a researcher would like to “reject”, and the alternative hypothesis is the statement a researcher would like to “retain”.

Take the kid’s self-reported injection pain (kidrate) for example. If a researcher hypothesizes that the pain score would be different from 0, then

H0: μkidrate = 0
H1: μkidrate ≠ 0

Note that the hypotheses are specified about the population parameter, NOT the sample statistic. Generally, Greek letters are used to denote population parameters.

Step Two: Specify the Significance Level (α)

The significance level (α) is the criterion chosen for rejecting the H0. Simply put, the significance level is the maximum Type I error (false positive) rate a researcher is willing to tolerate.

If the p-value (explained in Step Six) is less than α, then the result is said to be statistically significant, and the H0 is rejected.

Typically, the α level is set to 0.05 or 0.01. The lower the α level, the more the data must diverge from the null hypothesis (i.e., the smaller the p-value must be) to be significant.

Therefore, an α of 0.01 is more conservative than an α of 0.05 in rejecting the H0.

Step Three: Calculate the Statistic of Interest

The third step is to calculate the “statistic of interest”, which is a sample estimate of the population parameter specified in H0.

In our example, the mean of kidrate, Mkidrate, is the statistic of interest, on which we base our inference about the population mean μkidrate.

Step Four: Identify the Sampling Distribution of the Statistic of Interest

Next, one needs to know the theoretical distribution of the “statistic of interest”. This is where the mechanism of the sampling distribution comes into play and helps identify the appropriate theoretical distribution for the statistic of interest.

In our example, our “statistic of interest” is the mean. We have learned that, when the population SD is estimated from the sample, the sampling distribution of the mean follows a family of t-distributions distinguished by their degrees of freedom, which depend on the sample size. In our case, we identified the t-distribution with 39 degrees of freedom (N-1), which has a mean of 0 (under the null hypothesis) and an SE of 2.74, as our sampling distribution.

Step Five: Calculate the Test Statistic

The “test statistic” is the ratio of the “statistic of interest” (taken as its difference from the value specified in H0) to the “standard error” of the same statistic. The test statistic is denoted as “t” if the statistic follows a t-distribution, as “F” if the statistic follows an F-distribution, or as “χ²” if the statistic follows a χ² distribution, etc.

$$\text{Test statistic } (t, F, \chi^2, \ldots) = \frac{\text{statistic}}{SE_{\text{statistic}}}$$

The standard error (SE) is a measure of how much the value of the statistic, for a given sample size, may vary from sample to sample taken from the same distribution. In other words, it is the standard deviation of the sampling distribution of the statistic for a given sample size.

The purpose of calculating the test statistic is to show the location (on the X-axis) of the statistic on the sampling distribution, so that the probability of the statistic can be obtained.

In our example, the sample statistic is the mean = 65.32 and the standard error is 2.74, so the “test statistic” t = (65.32 - 0)/2.74 = 23.82 (computed from the unrounded mean and standard error).
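The arithmetic of Step Five can be reproduced as follows. This is a minimal sketch using the unrounded mean (65.325) and standard error (2.74221) from the SPSS output later in this handout; the helper name `one_sample_t` is hypothetical.

```python
def one_sample_t(mean, se, mu0=0.0):
    """Test statistic for a one-sample t-test: (M - mu0) / SE."""
    return (mean - mu0) / se

# Unrounded values from the kidrate SPSS output; mu0 = 0 comes from H0
t = one_sample_t(65.325, 2.74221, mu0=0.0)
print(f"t = {t:.3f}")
```

Using the rounded values 65.32 and 2.74 instead gives a slightly different quotient, which is why the unrounded inputs are used here.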

Step Six: Obtain the p-value

The p-value is “the probability of getting a value of the test statistic as large as or larger than that observed by chance alone, given that the null hypothesis is true”, e.g., the probability of getting a t value of 23.82 or greater given that the population mean is zero is... One obtains the p-value by examining the location of the sample statistic on the theoretical sampling distribution.

When PCs were rare, the “test statistic” was compared to a critical value that was pre-calculated and listed in a table (for a particular sampling distribution with specific α levels and sample sizes) to see whether the p-value was less than α.

Today, statistical packages such as SPSS can directly produce a p-value.

In our example, SPSS produced a p-value < 0.001. Alternatively, using the t-table: 23.82 is larger than the critical value of 2.023 for the t-distribution with 39 degrees of freedom and α = 0.05. Hence, p < 0.05.
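To make the link between the t-table and the p-value concrete, the sketch below computes a two-tailed p-value by numerically integrating the t density. This is an illustration only, not how SPSS does it: a statistics library would normally supply the t distribution, and the function names `t_pdf` and `two_tailed_p` are hypothetical.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with `df` degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def two_tailed_p(t, df, steps=20000):
    """P(|T| >= |t|) under H0, using Simpson's rule on [-|t|, |t|]."""
    a, b = -abs(t), abs(t)
    h = (b - a) / steps  # `steps` must be even for Simpson's rule
    total = t_pdf(a, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(a + i * h, df)
    central_area = total * h / 3
    return max(0.0, 1.0 - central_area)  # clamp tiny negative rounding error

p_at_critical = two_tailed_p(2.0227, 39)   # critical value quoted in this handout
p_observed = two_tailed_p(23.822, 39)      # observed t from the SPSS output
print(f"p at t = 2.0227: {p_at_critical:.4f}")
print(f"observed p < 0.001: {p_observed < 0.001}")
```

The first call shows why 2.0227 is the table's critical value for α = 0.05 with 39 degrees of freedom: the area outside ±2.0227 is about 0.05. Any observed |t| larger than that leaves a smaller tail area, hence a p-value below α.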

Step Seven: Conclude

The probability obtained in Step Six is compared with the significance level chosen in Step Two. If the probability is less than or equal to the significance level, then the null hypothesis is rejected; if the probability is greater than the significance level, then the null hypothesis is retained. When the null hypothesis is rejected, the outcome is said to be “statistically significant”; when the null hypothesis is not rejected, the outcome is said to be “not statistically significant”.

In our example, because p is less than the significance level of 0.05, the null hypothesis μkidrate = 0 is rejected. We conclude that the population mean is “significantly different” from zero.

Lab Activity: One-Sample t-test Using SPSS

Using the injection pain data, run a one-sample t-test in SPSS to test whether the population mean of kids’ self-reported pain is different from zero.

Use the following path of the drop-down menu: Analyze > Compare Means > One-Sample T-test (specify the test value to be 0).

Compare the SPSS results to those that we hand-calculated in the 7-step hypothesis testing procedure.

Lab Activity: Calculating Confidence Interval

SPSS Results for One-Sample t-test with α set to 0.05

            N     Mean      Std. Deviation   Std. Error Mean
kidrate     40    65.3250   17.34327         2.74221

One-Sample Test (Test Value = 0)

                                                               95% Confidence Interval of the Difference
            t        df    Sig. (2-tailed)   Mean Difference   Lower      Upper
kidrate     23.822   39    .000              65.32500          59.7784    70.8716

Using the above information, hand-calculate the 95% confidence interval of the population mean of injection pain, which is between ________ and ________. Compare your answers to the results of SPSS. Note that the critical value for the t-distribution with df = 39 at α = 0.05 is 2.0227.

Lab Activity: Calculating Confidence Interval

SPSS Results for One-Sample t-test with α set to 0.01

            N     Mean      Std. Deviation   Std. Error Mean
kidrate     40    65.3250   17.34327         2.74221

One-Sample Test (Test Value = 0)

                                                               99% Confidence Interval of the Difference
            t        df    Sig. (2-tailed)   Mean Difference   Lower      Upper
kidrate     23.822   39    .000              65.32500          57.8993    72.7507

Using the above information, hand-calculate the 99% confidence interval of the population mean of injection pain, which is between ________ and ________. Compare your answers to the results of SPSS. Note that the critical value for the t-distribution with df = 39 at α = 0.01 is 2.7079.
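One way to check the hand calculation for both confidence-interval activities is sketched below. It assumes the usual form of the interval, mean ± critical value × standard error, and uses only the numbers given on the two slides; the function name `confidence_interval` is hypothetical.

```python
def confidence_interval(mean, se, t_crit):
    """Return (lower, upper) for mean +/- t_crit * se."""
    margin = t_crit * se
    return mean - margin, mean + margin

mean, se = 65.325, 2.74221          # from the SPSS output above
ci95 = confidence_interval(mean, se, 2.0227)   # df = 39, alpha = 0.05
ci99 = confidence_interval(mean, se, 2.7079)   # df = 39, alpha = 0.01
print(f"95% CI: {ci95[0]:.3f} to {ci95[1]:.3f}")
print(f"99% CI: {ci99[0]:.3f} to {ci99[1]:.3f}")
```

Small differences from the SPSS limits in the fourth decimal place are expected, because the critical values quoted on the slides are rounded to four decimals.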

Lab Activity: One- or Two-tailed Hypothesis Testing

http://davidmlane.com/hyperstat/z_table.html

Getting Ready for Analysis of Variance

Partitioning Variance (Partitioning the Sum of Squares)
F Statistic
F Sampling Distributions

One-way ANOVA Partitions the Total Sum of Squares by the IV

[Figure: raw scores for three groups (1, 2, 3, 2; 4, 5, 3, 4; 6, 8, 5, 5), with group means of 2, 4, and 6 and a grand mean of 4]

One-way ANOVA Partitions the Total Sum of Squares of the DV by One IV

[Figure: SStot = 42 is partitioned into SSw-g = 10 and SSb-g = 32, the part associated with the IV]

Between-subject (Independent) One-Way Analysis of Variance

Between-subject one-way analysis of variance (ANOVA) is used to test a hypothesis about differences between the means of two or more independent groups.

The t-test can only be used to test the difference between two means. When there are more than two means, it is possible to compare all possible pairs of means using multiple t-tests. However, conducting multiple t-tests can lead to severe inflation of the Type I error rate.

ANOVA can be used to test differences among several means “without” inflating the Type I error rate.

Data Assumption: Same as the independent-samples t-test.
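The inflation of the Type I error rate can be illustrated with a rough calculation. The sketch below assumes the pairwise tests are independent, which is not strictly true for t-tests that share groups, so the numbers are an approximation; the function name `familywise_error` is hypothetical.

```python
from math import comb

def familywise_error(n_groups, alpha=0.05):
    """Number of pairwise tests and the chance of at least one false positive.

    Assumes independent tests, so the result is only an approximation.
    """
    n_tests = comb(n_groups, 2)
    return n_tests, 1 - (1 - alpha) ** n_tests

for k in (3, 4, 5):
    n_tests, rate = familywise_error(k)
    print(f"{k} groups: {n_tests} pairwise t-tests, familywise error about {rate:.3f}")
```

Under this assumption, even three groups push the chance of at least one false positive well above the nominal 0.05, and the rate climbs quickly as groups are added.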

Activity: Hypothesis Testing Using Between-subject One-Way Analysis of Variance

1. Specify the null hypothesis (H0) and the alternative hypothesis (H1).
2. Specify the significance level (α).
3. Calculate the statistic of interest.
4. Calculate the test statistic.
5. Identify the sampling distribution of the statistic of interest.
6. Obtain the p-value.
7. Conclude.

$$F = \frac{\mathrm{Var}_{b\text{-}g}}{\mathrm{Var}_{w\text{-}g}}$$

Within-group Variance

Within-group variance is the variation of the individuals’ raw scores from the mean of their particular group (of the independent variable).

For a given data set, say with three groups, the total within-group variance is the sum of the within-group variances across the three groups.

The within-group variation is caused by chance. It is the sample-to-sample variation of the individuals being recruited (i.e., random error or noise), namely, the individual differences due to sampling.

Between-group Variance

Between-group variance is the variation of the group means from the grand mean.

Between-group variance can be caused by
(1) Chance (i.e., random error), and
(2) True group difference (e.g., a true difference between the means of the control group and the treatment groups, seen as deviations from the grand mean). Such variation is NOT caused randomly by chance but systematically by the independent variable (i.e., groups).

Logic behind ANOVA

Under the null hypothesis that there is no true mean difference between any pair of groups, the between-group variance will be caused ONLY by random error and NOT by a true difference among the groups.

Thus, the between-group variance is expected to equal the within-group variance because there is no true between-group variance. Hence, F is expected to be about 1.

B-G Variance = random variance + true between-group variance
W-G Variance = random variance

$$F = \frac{\mathrm{Var}_{b\text{-}g}}{\mathrm{Var}_{w\text{-}g}} = \frac{\text{random variance} + \text{true between-group variance}}{\text{random variance}}$$

Lab Activity: Calculate the Test Statistic: F Ratio

First Step: Total Sum of Squares

Group     Raw score   Deviation score   Squared deviation
yellow    1           -3                9
yellow    2           -2                4
yellow    3           -1                1
yellow    2           -2                4
blue      4            0                0
blue      5            1                1
blue      3           -1                1
blue      4            0                0
green     6            2                4
green     8            4                16
green     5            1                1
green     5            1                1
Sum       M = 4        0                42
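The total sum of squares in the table can be recomputed as a check. The sketch below uses the twelve raw scores from the table; the variable names are only illustrative.

```python
# Raw scores from the table above, keyed by group
scores = {
    "yellow": [1, 2, 3, 2],
    "blue": [4, 5, 3, 4],
    "green": [6, 8, 5, 5],
}

all_scores = [x for group in scores.values() for x in group]
grand_mean = sum(all_scores) / len(all_scores)

# Deviation of every raw score from the grand mean, then square and sum
deviations = [x - grand_mean for x in all_scores]
ss_total = sum(d ** 2 for d in deviations)

print(f"grand mean = {grand_mean}")
print(f"sum of deviations = {sum(deviations)}")
print(f"SS total = {ss_total}")
```

The deviations sum to zero, as in the "Sum" row of the table, and the squared deviations add up to the total sum of squares.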

Lab Activity: Calculate the Test Statistic: F Ratio

Second Step: Within-group Sum of Squares

Lab Activity: Calculate the Test Statistic: F Ratio

Third Step: Between-group Sum of Squares

Group     Raw score   Deviation score   Squared deviation
yellow    ____        ____              ____
yellow    ____        ____              ____
yellow    ____        ____              ____
yellow    ____        ____              ____
          M = 2
blue      ____        ____              ____
blue      ____        ____              ____
blue      ____        ____              ____
blue      ____        ____              ____
          M = 4
green     ____        ____              ____
green     ____        ____              ____
green     ____        ____              ____
green     ____        ____              ____
          M = 6
          Grand M = 4
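The second and third steps can be checked with a short sketch that uses the same twelve raw scores as the first step. It assumes the standard definitions: within-group deviations are taken from each score's own group mean, and between-group deviations are taken from the grand mean, counted once per person in the group. The names are illustrative.

```python
scores = {
    "yellow": [1, 2, 3, 2],
    "blue": [4, 5, 3, 4],
    "green": [6, 8, 5, 5],
}

def mean(xs):
    return sum(xs) / len(xs)

grand_mean = mean([x for g in scores.values() for x in g])

# Within-group SS: each raw score against the mean of its own group
ss_within = sum((x - mean(g)) ** 2 for g in scores.values() for x in g)

# Between-group SS: each group mean against the grand mean, once per member
ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in scores.values())

print(f"SS within  = {ss_within}")
print(f"SS between = {ss_between}")
print(f"SS within + SS between = {ss_within + ss_between}")
```

The two parts add up to the total sum of squares of 42 from the first step, which is the partition the earlier figure slide depicts.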

The Family of F-distributions

F Table

Calculate the p-value and Conclude

df-between = P-1 (P is the number of groups)
df-within = N-P (N is the overall sample size)
df-total = N-1
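Putting the degrees of freedom together with the sums of squares gives the F ratio for the twelve-score lab example. This is a sketch under the usual definition of each variance (mean square) as a sum of squares divided by its degrees of freedom; the function name `one_way_anova` is hypothetical.

```python
def one_way_anova(groups):
    """Return df, mean squares, and F for a list of groups of raw scores."""
    all_scores = [x for g in groups for x in g]
    n_total, n_groups = len(all_scores), len(groups)
    grand_mean = sum(all_scores) / n_total
    group_means = [sum(g) / len(g) for g in groups]

    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)

    df_between = n_groups - 1        # P - 1
    df_within = n_total - n_groups   # N - P
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return {
        "df_between": df_between,
        "df_within": df_within,
        "ms_between": ms_between,
        "ms_within": ms_within,
        "F": ms_between / ms_within,
    }

# yellow, blue, and green scores from the lab activity
result = one_way_anova([[1, 2, 3, 2], [4, 5, 3, 4], [6, 8, 5, 5]])
print(result)
```

The resulting F would then be compared with the critical value in the F table for 2 and 9 degrees of freedom (about 4.26 at α = 0.05 in standard tables) to conclude.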

Side Notes

Note that the number of units involved in the calculation of Varb-g, Varw-g, and Vartot should be the same and equal to the total sample size.

Also note that the F value for a one-way ANOVA with two groups equals t², where t is the test statistic for an independent-samples t-test.
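The relation between F and t for two groups can be verified numerically. The sketch below reuses the yellow and green scores from the lab activity and assumes the pooled-variance form of the independent-samples t-test; the function names are hypothetical.

```python
import math

def pooled_t(a, b):
    """Independent-samples t statistic with a pooled variance estimate."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    ss_a = sum((x - mean_a) ** 2 for x in a)
    ss_b = sum((x - mean_b) ** 2 for x in b)
    pooled_var = (ss_a + ss_b) / (len(a) + len(b) - 2)
    return (mean_a - mean_b) / math.sqrt(pooled_var * (1 / len(a) + 1 / len(b)))

def two_group_f(a, b):
    """One-way ANOVA F ratio for exactly two groups."""
    grand_mean = (sum(a) + sum(b)) / (len(a) + len(b))
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    ss_between = len(a) * (mean_a - grand_mean) ** 2 + len(b) * (mean_b - grand_mean) ** 2
    ss_within = sum((x - mean_a) ** 2 for x in a) + sum((x - mean_b) ** 2 for x in b)
    return (ss_between / 1) / (ss_within / (len(a) + len(b) - 2))

yellow, green = [1, 2, 3, 2], [6, 8, 5, 5]
t = pooled_t(yellow, green)
f = two_group_f(yellow, green)
print(f"t squared = {t ** 2:.4f}, F = {f:.4f}")
```

The two printed values coincide, which holds only in the two-group case described in the note.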

Lab Activity: Between-Group One-Way ANOVA in SPSS

Research Question

Does the dose of the drug treatment have an effect on the patients’ depression level?

These data were originally contrived by Dr. Karl L. Wuensch and were modified by the instructor for pedagogical reasons. The independent variable is the daily dose of the new drug with 3 levels (control, 10 mg, & 20 mg) that were randomly prescribed to 3 groups of patients (20 in each group). The dependent variable is the patients’ depression level, measured quantitatively after two months of the new treatment.

SPSS Activity: Run the descriptive statistics separately for each group and report what you observe.

Lab Activity: Between-Group One Way ANOVA in SPSS

Lab Activity: SPSS Output of One-Way ANOVA

ANOVA (dependent variable: depression)

                  Sum of Squares   df    Mean Square   F        Sig.
Between Groups    4345.033         2     2172.517      26.874   .000
Within Groups     4607.950         57    80.841
Total             8952.983         59
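The entries in the SPSS table can be cross-checked by hand. The sketch below uses only the sums of squares from the output and the design stated in the research question (3 groups of 20 patients); the variable names are illustrative.

```python
n_groups, n_total = 3, 60             # 3 dose levels, 20 patients in each
ss_between, ss_within = 4345.033, 4607.950   # from the SPSS ANOVA table

df_between = n_groups - 1             # P - 1
df_within = n_total - n_groups        # N - P
df_total = n_total - 1                # N - 1

ms_between = ss_between / df_between
ms_within = ss_within / df_within
f_ratio = ms_between / ms_within

print(f"df: {df_between}, {df_within}, {df_total}")
print(f"MS between = {ms_between:.3f}, MS within = {ms_within:.3f}")
print(f"F = {f_ratio:.3f}")
```

Each mean square is its sum of squares divided by its degrees of freedom, and F is the ratio of the two mean squares, matching the layout of the table.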