Lecture 7: Analysis and Interpretation of Inferential Data Using the t Distribution


Upload: stewart-hoover

Post on 12-Jan-2016


EDRS 6208: Fundamentals of Educational Research 1
Lecture 7
Analysis and Interpretation of Inferential Data Using the t Distribution

Reading
Best and Kahn, Chapter 11, pages 406 to 423

Outline
- Independent versus dependent samples
- Assumptions about the independent-samples t-test
- Calculating the independent-samples t-test
- Degrees of freedom for the independent-samples t-test
- Using Excel to calculate the t-test
- Interpretation of the t-test output from SPSS
- Presenting the results in APA format

Review: Hypothesis Testing
1. Identify the hypothesis to be tested and put it in symbolic form.
2. Identify the null hypothesis.
3. Identify the alternative hypothesis.
4. Select the significance level $\alpha$ based on the seriousness of a Type I error.
5. Identify the statistic that is relevant to the test and identify its sampling distribution.
6. Determine the test statistic and either the P-value or the critical value(s).
7. Draw the graph.
8. Reject $H_0$ if the test statistic is in the critical region or the P-value $\le \alpha$. Fail to reject $H_0$ if the test statistic is not in the critical region or the P-value $> \alpha$.

Finding P-Values
- Left-tailed test: P-value = area to the left of the test statistic.
- Right-tailed test: P-value = area to the right of the test statistic.
- Two-tailed test: if the test statistic is to the left of center, P-value = twice the area to the left of the test statistic; if it is to the right of center, P-value = twice the area to the right of the test statistic.

Independent versus Dependent Samples
Definition: Two samples drawn from two populations are independent if the selection of one sample from one population does not affect the selection of the second sample from the second population. Otherwise, the samples are dependent.
Prem Mann, Introductory Statistics, 7/E. Copyright 2010 John Wiley & Sons. All rights reserved.

Example 1
Suppose we want to estimate the difference between the mean salaries of all male and all female executives. To do so, we draw two samples, one from the population of male executives and another from the population of female executives. These two samples are independent because they are drawn from two different populations, and the samples have no effect on each other.

Example 2
Suppose we want to estimate the difference between the mean weights of all participants before and after a weight loss program. To accomplish this, suppose we take a sample of 40 participants and measure their weights before and after the completion of this program. Note that these two samples include the same 40 participants. This is an example of two dependent samples. Such samples are also called paired or matched samples.

T-test: Definition
The t-test compares the means of two groups of individuals. It is a versatile statistical test that can be used to test whether two group means are different. For example:
- Is Nightmare on Elm Street 2 scarier than Nightmare on Elm Street 1? Watch both movies and measure heart rate.
- Does listening to music while you work improve your attention? Get some people to write an essay while listening to music and then write a different essay while working in silence. Then compare their essay grades.

Why use the t-test?
Case 1
- The population standard deviation is not known.
- The sample size is small ($n < 30$).
- The population from which the sample is selected is normally distributed.
Case 2
- The population standard deviation is not known.
- The sample size is large ($n \ge 30$).

Hypotheses about two groups
Suppose I ask you about your anxiety in taking the basic statistics course, and I ask you to rate your anxiety level on a scale of 1 to 10, where 1 means very little anxiety and 10 means high anxiety. I then pose the following questions:
- Is male anxiety any different from female anxiety?
- Do students who have previous mathematics experience suffer less anxiety?
- Do part-time students experience as much anxiety as full-time students?
- Are undergraduate students more anxious than postgraduate students?
Each of these questions can be examined using an independent-samples t-test.

Is there a difference between the two group means?
If you calculate two sample means and they are different, there are two possible reasons for the difference:
- Each group comes from a different population, and the sample means represent two different population means. When this happens, you reject the null hypothesis.
- The groups come from the same population and the means vary by chance. You just happened to pick two groups with means that are far apart. You fail to reject the null hypothesis.

The independent t-test
Used in situations in which there are two experimental conditions and different participants are used in each condition.

Assumptions of the independent-samples t-test
- The variable being measured is normally distributed.
- The variances of the groups being assessed are equivalent (homogeneous).
- Sample 1 is randomly sampled from population 1 and sample 2 is randomly sampled from population 2.

Independent-samples t-test equation
$$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\text{estimate of the standard error}}$$
Under the null hypothesis $\mu_1 - \mu_2 = 0$, so the numerator reduces to the observed difference between the sample means, $\bar{x}_1 - \bar{x}_2$.

Estimate of the standard error
Recall: the standard error tells us how variable the differences between sample means are by chance alone. If the standard error is high, then large differences between sample means can occur by chance. If the standard error is small, then only small differences between the sample means are expected. The standard error of the sampling distribution is used to assess whether the difference between two sample means is statistically significant or simply a chance result.

Variance Sum Law
The variance sum law is used to calculate the standard deviation of the sampling distribution of differences between sample means. It states that the variance of the difference between two independent variables is equal to the sum of their variances (Howell, 2006). In essence, this tells you that the variance of the sampling distribution of differences is equal to the sum of the variances of the sampling distributions of the two populations from which the samples were taken.

Calculate the standard error of each population
Using the sample standard deviations, we calculate the standard error of each population's sampling distribution:
$$SE_1 = \frac{s_1}{\sqrt{n_1}}, \qquad SE_2 = \frac{s_2}{\sqrt{n_2}}$$

Recall: variance is equal to the standard deviation squared. The variance of each sampling distribution is therefore
$$SE_1^2 = \frac{s_1^2}{n_1}, \qquad SE_2^2 = \frac{s_2^2}{n_2}$$

The variance sum law: to find the variance of the sampling distribution of differences, we add the variances of the two sampling distributions:
$$\text{variance of the sampling distribution of differences} = \frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}$$

To find the standard error of the sampling distribution of differences, we take the square root of this variance:
$$SE_{\bar{x}_1 - \bar{x}_2} = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$

Therefore, substitute this SE into the previous equation for t, with $\mu_1 - \mu_2 = 0$ under the null hypothesis (see page 409, Best and Kahn):
$$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$$
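The chain of steps above (standard error of each sample, variance sum law, square root, then t) can be traced in a few lines of Python. This is only a sketch: the function names and the two groups of 25 students are made up for illustration and do not come from Best and Kahn.

```python
import math


def standard_error(sd, n):
    """Standard error of one sample mean: s / sqrt(n)."""
    return sd / math.sqrt(n)


def equal_n_t(mean1, sd1, mean2, sd2, n):
    """t for two independent samples of the same size n.

    Variance sum law: the variance of the difference between the two
    sample means is the sum of the two squared standard errors.
    """
    var_diff = standard_error(sd1, n) ** 2 + standard_error(sd2, n) ** 2
    se_diff = math.sqrt(var_diff)
    return (mean1 - mean2) / se_diff, se_diff


# Hypothetical data (not from the lecture): two groups of 25 students each.
t_value, se_diff = equal_n_t(mean1=72.0, sd1=8.0, mean2=68.0, sd2=6.0, n=25)
print(f"SE of the difference = {se_diff:.3f}, t = {t_value:.3f}")
```

With these made-up numbers the two standard errors are 1.6 and 1.2, their squares sum to 4, so the standard error of the difference is 2 and t is 2.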

This equation works only when the sample sizes are equal. Sometimes we want to compare two groups that contain different numbers of participants; the equation above is then not appropriate, and the pooled variance estimate t-test is used instead.

The pooled variance estimate for two samples
$$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$$

Pooled Standard Deviation for Two Samples
The pooled standard deviation for two samples is computed as
$$s_p = \sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}}$$
where $n_1$ and $n_2$ are the sizes of the two samples and $s_1^2$ and $s_2^2$ are the variances of the two samples, respectively. Here $s_p$ is an estimator of $\sigma$.

Estimator of the Standard Deviation of $\bar{x}_1 - \bar{x}_2$
The estimator of the standard deviation of $\bar{x}_1 - \bar{x}_2$ is
$$s_{\bar{x}_1 - \bar{x}_2} = s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$$

Degrees of freedom (df) for the independent-samples t-test
The degrees of freedom that we calculate for the independent-samples t-test must reflect the number in each sample minus one:
$df = n_1 + n_2 - 2$, or $df = (n_1 - 1) + (n_2 - 1)$, or $df = df_1 + df_2$
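The pooled formulas and the degrees of freedom can be sketched in Python as follows. The function names and the sample values (10 and 15 participants) are hypothetical and chosen only to show unequal group sizes.

```python
import math


def pooled_sd(n1, sd1, n2, sd2):
    """Pooled standard deviation s_p of two samples."""
    numerator = (n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2
    return math.sqrt(numerator / (n1 + n2 - 2))


def se_of_difference(n1, sd1, n2, sd2):
    """Estimator of the standard deviation of x1_bar - x2_bar."""
    return pooled_sd(n1, sd1, n2, sd2) * math.sqrt(1 / n1 + 1 / n2)


def degrees_of_freedom(n1, n2):
    """df = (n1 - 1) + (n2 - 1), which equals n1 + n2 - 2."""
    return (n1 - 1) + (n2 - 1)


# Hypothetical samples of unequal size (illustration only).
n1, sd1 = 10, 2.0
n2, sd2 = 15, 3.0
sp = pooled_sd(n1, sd1, n2, sd2)
se = se_of_difference(n1, sd1, n2, sd2)
df = degrees_of_freedom(n1, n2)
print(f"s_p = {sp:.4f}, SE = {se:.4f}, df = {df}")
```

When the two samples have the same size, `se_of_difference` gives the same value as the equal-sample-size formula $\sqrt{s_1^2/n_1 + s_2^2/n_2}$, so the pooled version can be used in both situations.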

Example
A sample of 14 cans of Brand I diet soda gave a mean of 23 calories per can with a standard deviation of 3 calories. Another sample of 16 cans of Brand II diet soda gave a mean of 25 calories per can with a standard deviation of 4 calories.

Question
At the 1% significance level, can you conclude that the mean numbers of calories per can are different for these two brands of diet soda? Assume that the calories per can of diet soda are normally distributed for each of the two brands and that the standard deviations for the two populations are equal.

Solution
Step 1:
$H_0: \mu_1 - \mu_2 = 0$ (The mean numbers of calories are not different.)
$H_1: \mu_1 - \mu_2 \ne 0$ (The mean numbers of calories are different.)

Step 2:
- The two samples are independent.
- $\sigma_1$ and $\sigma_2$ are unknown but equal.
- The sample sizes are small, but both populations are normally distributed.
- We will use the t distribution.

Step 3:
The $\ne$ sign in the alternative hypothesis indicates that the test is two-tailed.
$\alpha = .01$
Area in each tail = $\alpha / 2$ = .01 / 2 = .005
$df = n_1 + n_2 - 2 = 14 + 16 - 2 = 28$
Critical values of t are -2.763 and 2.763 (page 483, Best and Kahn).
Draw the figure.

Step 4:
$$s_p = \sqrt{\frac{(14 - 1)(3)^2 + (16 - 1)(4)^2}{14 + 16 - 2}} = \sqrt{\frac{357}{28}} \approx 3.5707$$
$$s_{\bar{x}_1 - \bar{x}_2} = 3.5707 \sqrt{\frac{1}{14} + \frac{1}{16}} \approx 1.3067$$
$$t = \frac{(23 - 25) - 0}{1.3067} \approx -1.531$$

Step 5:
The value of the test statistic, t = -1.531, falls in the nonrejection region. Therefore, we fail to reject the null hypothesis. Consequently, we conclude that there is no difference in the mean numbers of calories per can for the two brands of diet soda.

Example 2
A sample of 40 children from New York State showed that the mean time they spend watching television is 28.50 hours per week with a standard deviation of 4 hours. Another sample of 35 children from California showed that the mean time they spend watching television is 23.25 hours per week with a standard deviation of 5 hours.

Question
Using a 2.5% significance level, can you conclude that the mean time spent watching television by children in New York State is greater than that for children in California? Assume that the standard deviations for the two populations are equal.

Solution
Step 1:
$H_0: \mu_1 - \mu_2 = 0$
$H_1: \mu_1 - \mu_2 > 0$

Step 2:
- The two samples are independent.
- The standard deviations of the two populations are unknown but assumed to be equal.
- Both samples are large.
- We use the t distribution to make the test.

Step 3:
$\alpha = .025$
Area in the right tail of the t distribution = $\alpha$ = .025
$df = n_1 + n_2 - 2 = 40 + 35 - 2 = 73$
Critical value of t is 1.993.
Figure 10.4

Step 4:
$$s_p = \sqrt{\frac{(40 - 1)(4)^2 + (35 - 1)(5)^2}{40 + 35 - 2}} = \sqrt{\frac{1474}{73}} \approx 4.4935$$
$$s_{\bar{x}_1 - \bar{x}_2} = 4.4935 \sqrt{\frac{1}{40} + \frac{1}{35}} \approx 1.0400$$
$$t = \frac{(28.50 - 23.25) - 0}{1.0400} \approx 5.048$$

Step 5:
The value of the test statistic, t = 5.048, falls in the rejection region. Therefore, we reject the null hypothesis $H_0$. Hence, we conclude that children in New York State spend more time, on average, watching TV than children in California.

Using Excel to calculate t
http://www.ehow.com/video_4983079_use-excels-ttest-function.html?sms_ss=gmail&at_xt=4cc6188f5805f4b4,0
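As an alternative to a spreadsheet, the two worked examples (diet soda and television viewing) can be cross-checked with a short Python script. This is a sketch, not part of the lecture: the helper names `pooled_t` and `decide` are invented here, and the critical values are simply the ones quoted in the examples rather than computed.

```python
import math


def pooled_t(n1, mean1, sd1, n2, mean2, sd2):
    """Pooled-variance t statistic and df for H0: mu1 - mu2 = 0."""
    df = n1 + n2 - 2
    sp = math.sqrt(((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df)
    se = sp * math.sqrt(1 / n1 + 1 / n2)
    return (mean1 - mean2) / se, df


def decide(t_value, critical, tail):
    """Compare t with a positive critical value looked up in a t table."""
    if tail == "two":
        reject = abs(t_value) > critical
    elif tail == "right":
        reject = t_value > critical
    elif tail == "left":
        reject = t_value < -critical
    else:
        raise ValueError("tail must be 'two', 'right' or 'left'")
    return "reject H0" if reject else "fail to reject H0"


# Diet soda example: two-tailed, alpha = .01, critical value 2.763 (df = 28).
t_soda, df_soda = pooled_t(14, 23.0, 3.0, 16, 25.0, 4.0)
print(f"soda: t({df_soda}) = {t_soda:.3f}, {decide(t_soda, 2.763, 'two')}")

# Television example: right-tailed, alpha = .025, critical value 1.993 (df = 73).
t_tv, df_tv = pooled_t(40, 28.50, 4.0, 35, 23.25, 5.0)
print(f"tv: t({df_tv}) = {t_tv:.3f}, {decide(t_tv, 1.993, 'right')}")
```

The script reproduces t = -1.531 for the soda data and t = 5.048 for the television data, with the same decisions as in Step 5 of each example.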

Output from the independent-samples t-test
SPSS gives the output for the independent t-test in two tables. Let us assume that we are assessing your statistics anxiety: from the population I have two groups, each with a sample size of 12. Group one is exposed to an innovative teaching strategy and group two to the traditional statistics lecture. The group exposed to the innovative strategy had a mean anxiety of 5.50 with a standard deviation of 2.78 and an SE of .802. The group exposed to the traditional lecture had a mean anxiety level of 5.58, with a standard deviation of 2.503 and an SE of .723.

SPSS Output (Table 1)
Anxiety: Teaching strategy | N  | Mean | Std. Deviation | Std. Error Mean
Lecture                    | 12 | 5.58 | 2.77980        | .80246
Innovative                 | 12 | 5.50 | 2.50303        | .72256

Table 2
Columns F and Sig. belong to Levene's test for equality of variances; the remaining columns belong to the t-test for equality of means.
Anxiety                     | F    | Sig. | t     | df     | Sig. (2-tailed) | Mean difference | Std. error difference | 95% CI of the difference: Lower | Upper
Equal variances assumed     | .201 | .659 | -.077 | 22     | .939            | -.083           | 1.079                 | -2.322                          | 2.156
Equal variances not assumed |      |      | -.077 | 21.762 | .939            | -.083           | 1.079                 | -2.324                          | 2.158

Explanation
Notice that there is more information in the second table. The top row labels the statistics computed; below each label are the values calculated by SPSS. The first column is divided into two rows:
- Row 1: equal variances assumed
- Row 2: equal variances not assumed
Remember: one of the assumptions of the t-test is equal variances. When this is violated, we have the option of using a more conservative estimate.

Levene's test for equality of variances
F: Levene's test of homogeneity of variances computes a statistic called F. For our data, F = .201.
Sig.: In this column, SPSS reports the significance of Levene's F. If the significance level is less than or equal to .05, we conclude that the variances of the two groups differ significantly. The significance associated with Levene's F is .659. Since .659 is greater than .05, the difference between the variances is not significant. Thus we do not have evidence that we have violated the assumption of equal variances.

t-test for equality of means
Because Levene's test was not significant, we use the top row of the output, labeled "equal variances assumed".
t: The t value is -.077.
df: There are 22 degrees of freedom.
Sig. (2-tailed): The p value associated with t is .939. Since we use $p \le .05$ as the criterion for significance and .939 is greater than .05, the difference in statistics anxiety between students exposed to the innovative teaching strategy and those exposed to the traditional lecture is not significant.
Mean difference: The difference in mean anxiety between the samples is -.0833.
Std. error difference: The standard error of the mean difference is 1.079.

95% confidence interval of the difference
Another way of determining whether there is a significant difference between the two means is to compute a confidence band around the observed mean difference. If 0 falls within the band, we do not have a significant difference; if the band does not include 0, the difference is significant.
Lower: The lower point of the band is -2.322.
Upper: The upper point of the band is 2.156.
Since 0 falls within the 95% confidence band, the difference between the two samples' statistics anxiety levels is not significant.

What do the results say?
Based on our two samples, t = -.077. The absolute value of our calculated t does not exceed the critical value of t = 2.074. Thus we fail to reject the null hypothesis. There is no significant difference in statistics anxiety between students exposed to traditional lecture methods and those exposed to innovative strategies.
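The SPSS figures can be reproduced by hand from the two standard errors in Table 1. The sketch below assumes equal group sizes (12 and 12), in which case the pooled standard error equals the square root of the sum of the squared standard errors; the variable names are invented for illustration. Results should agree with the SPSS table up to rounding in the last digit.

```python
import math

# Values copied from the SPSS tables in the text.
se_group1 = 0.80246       # Std. Error Mean, one group
se_group2 = 0.72256       # Std. Error Mean, other group
mean_difference = -0.0833
critical_t = 2.074        # two-tailed critical value quoted for df = 22

# Variance sum law: add the squared standard errors, then take the root.
se_difference = math.sqrt(se_group1 ** 2 + se_group2 ** 2)
t_value = mean_difference / se_difference

# 95% confidence band around the observed mean difference.
margin = critical_t * se_difference
lower = mean_difference - margin
upper = mean_difference + margin

print(f"SE difference = {se_difference:.3f}")
print(f"t = {t_value:.3f}")
print(f"95% CI = ({lower:.3f}, {upper:.3f})")
print("0 inside the band:", lower <= 0 <= upper)
```

Because 0 lies inside the band, the confidence interval leads to the same conclusion as comparing t with the critical value.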

If the results had indicated a significant difference, we would compute an effect size using Cohen's d.
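The lecture does not give a formula for Cohen's d. A common definition, assumed here, divides the difference between the two sample means by the pooled standard deviation. The sketch below applies it to the television example, where the null hypothesis was rejected; the function name is hypothetical.

```python
import math


def cohens_d(n1, mean1, sd1, n2, mean2, sd2):
    """Cohen's d: mean difference divided by the pooled standard deviation.

    Assumption: the pooled-SD form of d, which is one common convention.
    """
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    return (mean1 - mean2) / math.sqrt(pooled_var)


# Television example (New York vs California), where H0 was rejected.
d = cohens_d(40, 28.50, 4.0, 35, 23.25, 5.0)
print(f"Cohen's d = {d:.2f}")
```

By Cohen's conventional benchmarks (0.2 small, 0.5 medium, 0.8 large), a d of about 1.17 would count as a large effect.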

Presenting results using APA format
You then present your results:
Students who were exposed to the innovative strategy had a mean statistics anxiety level of 5.50 (s = 2.78), while that of students exposed to traditional lectures was 5.58 (s = 2.50). Statistics anxiety levels did not differ significantly between students exposed to traditional teaching methods and those exposed to innovative methods within this study sample, t(22) = -.077, p > .05.
Or: No significant difference was found between students exposed to traditional teaching methods and those exposed to an innovative strategy, t(22) = -.077, p = .939.

In-class Exercise
Best and Kahn, page 444, nos. 8 and 9