

Dr AGR McClelland Division of Psychology and Language Sciences, UCL


1104 Statistics: Worksheet 6

Answers

1.

There are two approaches to estimating the magnitude of an experimental effect. The first is to express the variability accounted for by the full model as a proportion of the total variability by computing $R^2$. The second is to express the difference between the sample means in standard deviation units (Cohen's d). According to Cohen (1988), a large effect would have d = 0.8, a medium effect d = 0.5, and a small effect d = 0.2. In addition to providing an index of effect size, d can also be used to estimate the power of a statistical test, and to decide upon the sample size needed in a particular study.
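Cohen's benchmarks can be turned into a small lookup. The sketch below is illustrative only: the function name `classify_effect_size` is hypothetical, and treating 0.2, 0.5 and 0.8 as lower bounds of ranges (with anything smaller called "negligible") is an assumption, since Cohen offered the values as rough guides rather than strict cut-offs.

```python
def classify_effect_size(d: float) -> str:
    """Label |d| against Cohen's (1988) benchmarks of 0.2, 0.5 and 0.8.

    Assumption: each benchmark is treated as the lower bound of a range.
    """
    size = abs(d)
    if size >= 0.8:
        return "large"
    if size >= 0.5:
        return "medium"
    if size >= 0.2:
        return "small"
    return "negligible"


# Effect sizes of the kind that appear in these answers
for d in (0.2, 0.55, 1.42):
    print(d, classify_effect_size(d))
```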

2.

Paired T-Test and Confidence Interval

Paired T for Placebo - Drug

             N    Mean  StDev  SE Mean
Placebo     12   7.417  2.503    0.723
Drug        12   8.250  3.166    0.914
Difference  12  -0.833  1.528    0.441

95% CI for mean difference: (-1.804, 0.137)
T-Test of mean difference = 0 (vs not = 0): T-Value = -1.89  P-Value = 0.085

a.

Power and Sample Size

Paired t Test

Testing mean paired difference = 0 (versus not = 0)
Calculating power for mean paired difference = difference
Alpha = 0.05  Assumed standard deviation of paired differences = 1

            Sample
Difference    Size     Power
      0.55      12  0.412720

You will also get a power curve plot, which I have not reproduced here. The estimated power of the test is approximately 41%. As we would normally like the power to be 80% or greater, this is a low-power test, and this fact might explain the non-significant result on the two-sided test.

b.

Power and Sample Size

Paired t Test

Testing mean paired difference = 0 (versus not = 0)
Calculating power for mean paired difference = difference
Alpha = 0.05  Assumed standard deviation of paired differences = 1

            Sample  Target
Difference    Size   Power  Actual Power
      0.55      28    0.80      0.801083
      0.55      32    0.85      0.853969
      0.55      37    0.90      0.902350
      0.55      45    0.95      0.950312

To achieve power of 80% with d = .55, the experimenter would have needed to run 28 children in the experiment. To achieve power of 95%, 45 children would have been needed.
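For readers without Minitab, the paired-test power and sample-size figures can be approximated from first principles. The sketch below is not Minitab's algorithm: it assumes the standard noncentral t formulation (noncentrality d times the square root of n, with n - 1 degrees of freedom) and evaluates the distribution by plain numerical integration using only the standard library. All function names are hypothetical.

```python
import math


def _norm_cdf(x: float) -> float:
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def nct_cdf(t: float, df: int, delta: float = 0.0, steps: int = 1000) -> float:
    """P(T <= t) for a noncentral t variable; assumes df >= 2.

    Writes T = (Z + delta) / S with S = sqrt(chi2_df / df) and integrates
    Phi(t*s - delta) against the density of S using Simpson's rule.
    """
    log_const = (math.log(2.0 * df) - (df / 2.0) * math.log(2.0)
                 - math.lgamma(df / 2.0))

    def integrand(s: float) -> float:
        if s == 0.0:
            return 0.0  # the density of S vanishes at 0 when df >= 2
        log_density = (log_const + math.log(s)
                       + (df / 2.0 - 1.0) * math.log(df * s * s)
                       - df * s * s / 2.0)
        return _norm_cdf(t * s - delta) * math.exp(log_density)

    upper = 6.0  # S concentrates near 1; mass beyond 6 is negligible
    h = upper / steps
    total = integrand(0.0) + integrand(upper)
    for i in range(1, steps):
        total += (4.0 if i % 2 else 2.0) * integrand(i * h)
    return total * h / 3.0


def t_critical(df: int, alpha: float = 0.05) -> float:
    """Two-sided critical value, found by bisection on the central t CDF."""
    lo, hi = 0.0, 50.0
    for _ in range(40):
        mid = (lo + hi) / 2.0
        if nct_cdf(mid, df) < 1.0 - alpha / 2.0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def paired_power(d: float, n: int, alpha: float = 0.05) -> float:
    """Power of a two-sided paired t test for standardized difference d."""
    df = n - 1
    delta = d * math.sqrt(n)  # noncentrality parameter
    crit = t_critical(df, alpha)
    return 1.0 - nct_cdf(crit, df, delta) + nct_cdf(-crit, df, delta)


def required_pairs(d: float, target: float, alpha: float = 0.05) -> int:
    """Smallest n whose power reaches the target (simple linear search)."""
    n = 3
    while paired_power(d, n, alpha) < target:
        n += 1
    return n


print(round(paired_power(0.55, 12), 3))
print(required_pairs(0.55, 0.80))
```

If the integration is accurate enough, the two printed values should sit close to the Minitab output above: power of roughly 0.41 with 12 pairs, and about 28 pairs for 80% power. The linear search is slow but keeps the logic transparent.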


3.

Paired T-Test and Confidence Interval

Paired T for First - Third

             N    Mean  StDev  SE Mean
First        8   4.500  0.756    0.267
Third        8   7.125  1.458    0.515
Difference   8  -2.625  1.847    0.653

95% CI for mean difference: (-4.170, -1.080)
T-Test of mean difference = 0 (vs not = 0): T-Value = -4.02  P-Value = 0.005

MTB > let k4=2.625/1.847
MTB > print k4

Data Display

K4 1.42122

Thus d = 1.42, which, by Cohen’s criteria, is a very large effect size.
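The same calculation can be checked outside Minitab. The sketch below (the function name `paired_summary` is hypothetical) computes d from the difference-score summary and also recovers the t statistic, which for a paired design equals d multiplied by the square root of n, with the sign of the mean difference.

```python
import math


def paired_summary(mean_diff: float, sd_diff: float, n: int) -> tuple:
    """Return (d, t) for a paired design from the difference-score summary."""
    d = abs(mean_diff) / sd_diff              # effect size in SD units
    t = mean_diff / (sd_diff / math.sqrt(n))  # mean difference over its SE
    return d, t


# Values taken from the First - Third output above
d, t = paired_summary(-2.625, 1.847, 8)
print(round(d, 2), round(t, 2))  # 1.42 -4.02
```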

4.

a. A quasi-experimental design. Participants have been sampled from existing categories (crowded-room phobics and snake and spider phobics). This design is a type of correlational design, although it is analyzed in the same manner as a true experimental design where individuals have been randomly assigned to conditions.

b.

Two Sample T-Test and Confidence Interval

Two sample T for Room vs Snakes

         N   Mean  StDev  SE Mean
Room     7  42.43   5.44      2.1
Snakes   7  36.29   5.85      2.2

95% CI for mu Room - mu Snakes: (-0.4, 12.7)
T-Test mu Room = mu Snakes (vs not =): T = 2.03  P = 0.065  DF = 12
Both use Pooled StDev = 5.65

There is no significant evidence to suggest that the two groups differ with respect to their degree of depression, t(12) = 2.03, p = .065.

This has to be a 2-sided test, as the experimenter did not make a directional prediction. The result is not significant, as the associated probability (0.065) is greater than the significance level (0.050). If she had predicted (in advance!) that crowded-room phobics would be more depressed than snake and spider phobics, and had decided to conduct a 1-sided test, the result would have been significant (the associated probability would have been 0.033, which is less than the significance level).
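The pooled standard deviation and t statistic in the output can be reproduced from the group summaries alone. This is a minimal sketch of the usual equal-variance formulas; the function name `pooled_two_sample` is hypothetical, and small discrepancies from Minitab are possible because the inputs here are already rounded.

```python
import math


def pooled_two_sample(m1, s1, n1, m2, s2, n2):
    """Return (pooled SD, t, df) for an equal-variance two-sample t test."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / df
    pooled_sd = math.sqrt(pooled_var)
    se_diff = pooled_sd * math.sqrt(1.0 / n1 + 1.0 / n2)
    return pooled_sd, (m1 - m2) / se_diff, df


# Room vs Snakes summaries from the output above
sd, t, df = pooled_two_sample(42.43, 5.44, 7, 36.29, 5.85, 7)
print(round(sd, 2), round(t, 2), df)  # 5.65 2.03 12
```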

c.

MTB > let k2=(42.43-36.29)/5.65
MTB > print k2

Data Display

K2 1.08673

Using Cohen’s criteria, d = 1.087 would be considered a large effect. The failure to declare the result significant is almost certainly due to low power (see below).

d.

Power and Sample Size

2-Sample t Test

Testing mean 1 = mean 2 (versus not =)
Calculating power for mean 1 = mean 2 + difference
Alpha = 0.05  Assumed standard deviation = 1


            Sample
Difference    Size     Power
     1.087       7  0.464390

The sample size is for each group.

This is again a low-power test: the estimated power is 46%.

Power and Sample Size

2-Sample t Test

Testing mean 1 = mean 2 (versus not =)
Calculating power for mean 1 = mean 2 + difference
Alpha = 0.05  Assumed standard deviation = 1

            Sample  Target
Difference    Size   Power  Actual Power
     1.087      15    0.80      0.819494
     1.087      17    0.85      0.867089
     1.087      19    0.90      0.903159
     1.087      24    0.95      0.957808

The sample size is for each group.

To achieve power = 80%, the psychologist would have had to run 15 participants in each group (i.e., 30 in total). To achieve power = 95%, there would have needed to be 24 participants in each group (48 in total).
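For reference, the quantity that drives these two-sample power figures can be written out. This is a sketch under the usual equal-variance, equal-group-size assumptions; the symbols $\delta$ (noncentrality) and $\nu$ (degrees of freedom) are notation introduced here, not in the worksheet, and n is the size of each group.

```latex
\delta = d\sqrt{\frac{n}{2}}, \qquad \nu = 2n - 2, \qquad
\text{power} = P\bigl(|T_{\nu,\delta}| > t_{\nu,\,1-\alpha/2}\bigr)
```

With d = 1.087 and n = 7 per group this gives $\delta \approx 1.087 \times \sqrt{3.5} \approx 2.03$ and $\nu = 12$, so the noncentrality sits just below the critical value of about 2.18, which is consistent with power falling a little under 50%.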

5.

a. A true experiment. Participants have been randomly assigned to one of two conditions.
b. Type of therapy (2 levels: Hypnosis vs. Standard)
c. Degree of anxiety as measured by the anxiety questionnaire
d. Null: The two therapies do not differ in their efficacy with respect to the treatment of anxiety.

Alternative: The Hypnosis treatment is more effective in the treatment of anxiety than the Standard treatment.

e.

$H_0: \mu_{\text{Hypnosis}} = \mu_{\text{Standard}}$

$H_A: \mu_{\text{Hypnosis}} < \mu_{\text{Standard}}$

f.

Two Sample T-Test and Confidence Interval

Two sample T for Hypnosis vs Standard

           N  Mean  StDev  SE Mean
Hypnosis  11  32.4   10.1      3.0
Standard  11  43.8   13.0      3.9

95% CI for mu Hypnosis - mu Standard: (-21.8, -1.1)
T-Test mu Hypnosis = mu Standard (vs not =): T = -2.31  P = 0.032  DF = 20
Both use Pooled StDev = 11.7

There is significant evidence to suggest that the Hypnosis treatment is more effective in treating anxiety than the Standard treatment, t(20) = -2.31, p = .032. (If the test had been one-sided, the associated probability would have been 0.016.)

g. Cohen’s d = 0.97. Again, using Cohen’s criteria, this is a large effect.
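The working for this value is not shown. Assuming the same approach as in question 4c (the difference between the group means divided by the pooled standard deviation from the output), it would be:

```latex
d = \frac{|\bar{x}_{\text{Hypnosis}} - \bar{x}_{\text{Standard}}|}{s_{\text{pooled}}}
  = \frac{|32.4 - 43.8|}{11.7}
  = \frac{11.4}{11.7}
  \approx 0.97
```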

6.

a. Predictor Variable: Teaching Method (2 Levels: Imitation and Physical Guidance)
b. Response Variable: Level of assistance required (on a rating scale)


c.

[Figure: Boxplot of Imitation, Physical Guidance. Vertical axis: Data, 0 to 20.]

No evidence of outliers in either condition

d.

                   Mean    SD
Imitation          9.80  5.86
Physical Guidance  7.93  4.82

e.

Paired T-Test and Confidence Interval

Paired T for Imitation - Phys. Guide

             N   Mean  StDev  SE Mean
Imitatio    15   9.80   5.86     1.51
Phys. Gu    15   7.93   4.82     1.24
Difference  15  1.867  3.248    0.839

95% CI for mean difference: (0.068, 3.666)
T-Test of mean difference = 0 (vs not = 0): T-Value = 2.23  P-Value = 0.043

There is significant evidence to suggest that the amount of assistance required went down in the second period, t(14) = 2.23, p = .043.

f. Cohen’s d = 0.58. This is a medium-sized effect.

7.

a. This is a fully randomized (true) experiment.

b.

Two Sample T-Test and Confidence Interval

Two sample T for Low vs High

       N  Mean  StDev  SE Mean
Low   15  3.35   1.64     0.42
High  15  5.40   2.22     0.57

95% CI for mu Low - mu High: (-3.51, -0.58)
T-Test mu Low = mu High (vs not =): T = -2.87  P = 0.0078  DF = 28
Both use Pooled StDev = 1.95

There is significant evidence to suggest that there is a difference in approach behaviour as a function of perceived arousal, t(28) = -2.87, p = .008. The High Arousal group stopped at a greater mean distance from the snake than the Low Arousal group.

c. Cohen’s d = 1.05. Using Cohen’s criteria, this is a large effect.