if you think you made a lot of mistakes in the survey project…. think of how much you accomplished...

If you think you made a lot of mistakes in the survey project…

Think of how much you accomplished and the mistakes you did not make…

• Went from not knowing much about surveys to having designed, deployed, and completed one in 1 ½ months

• Actually got people to respond!

• Did not end up with 100 open-ended responses that you had to content analyze!

One-Tailed and Two-Tailed Tests

One-tailed tests: based on a unidirectional hypothesis
  Example: effect of training on problems using PowerPoint (PP)
  Population figures for usability of PP are known
  Hypothesis: training will decrease the number of problems with PP

Two-tailed tests: based on a bidirectional hypothesis
  Hypothesis: training will change the number of problems with PP

If we know the population mean

[Figure: Sampling distribution, population for usability of PowerPoint. Histogram of Mean Usability Index (x-axis, 3.75 to 7.25) against Frequency (y-axis, 0 to 1400). Mean = 5.65, Std. Dev = .45, N = 10,000.]

Unidirectional hypothesis: .05 level

Bidirectional hypothesis: .05 level

Identify region

• What does it mean if our significance level is .05?
  • For a unidirectional hypothesis
  • For a bidirectional hypothesis

PowerPoint example:

• Unidirectional: if we set the significance level at .05,
  • 5% of the time we will find a higher mean by chance when training has no real effect
  • 95% of the time chance alone will not produce a significantly higher mean

• Bidirectional: if we set the significance level at .05,
  • 2.5% of the time we will find a higher mean by chance
  • 2.5% of the time we will find a lower mean by chance
  • 95% of the time chance alone will not produce a significant difference
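The rejection regions for the two kinds of hypothesis can be located numerically. The sketch below takes the mean (5.65) and standard deviation (.45) from the sampling-distribution figure above and assumes, for illustration only, that the distribution is normal.

```python
from statistics import NormalDist

# Values read from the sampling-distribution figure.
MEAN = 5.65    # mean usability index
SD = 0.45      # standard deviation of the sampling distribution
ALPHA = 0.05   # significance level

# Assumption: the sampling distribution is treated as normal.
sampling_dist = NormalDist(mu=MEAN, sigma=SD)

# Unidirectional hypothesis: the whole 5% sits in the upper tail.
one_tailed_cutoff = sampling_dist.inv_cdf(1 - ALPHA)

# Bidirectional hypothesis: 2.5% sits in each tail.
lower_cutoff = sampling_dist.inv_cdf(ALPHA / 2)
upper_cutoff = sampling_dist.inv_cdf(1 - ALPHA / 2)

print(f"one-tailed: reject if mean > {one_tailed_cutoff:.2f}")
print(f"two-tailed: reject if mean < {lower_cutoff:.2f} or mean > {upper_cutoff:.2f}")
```

Under these assumptions the one-tailed cutoff sits closer to the mean than the upper two-tailed cutoff, which is why a one-tailed test rejects more easily in the predicted direction.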

Changing significance levels

• What happens if we relax our significance level from .01 to .05?
  Probability of finding differences that don’t exist goes up (criterion becomes more lenient)

• What happens if we tighten our significance level from .01 to .001?
  Probability of not finding differences that exist goes up (criterion becomes more conservative)

• PowerPoint example:
  If we set the significance level at .05,
  • 5% of the time we will find a difference by chance when none exists
  • 95% of the time chance alone will not produce a significant difference
  If we set the significance level at .01,
  • 1% of the time we will find a difference by chance when none exists
  • 99% of the time chance alone will not produce a significant difference

• For usability, if you set out to find problems, lenient criteria might work better (you will identify more problems)

• Effect of relaxing the significance level from .01 to .05
  Probability of finding differences that don’t exist goes up (criterion becomes more lenient)
  Also called Type I error (alpha)

• Effect of tightening the significance level from .01 to .001
  Probability of not finding differences that exist goes up (criterion becomes more conservative)
  Also called Type II error (beta)
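The trade-off between the two error types can be made concrete with a small calculation. This is a minimal sketch for a two-tailed z test; the true effect size (`TRUE_EFFECT`, in standard-error units) is a made-up value chosen for illustration and does not come from the study.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal distribution

# Hypothetical true effect, in standard-error units (illustrative assumption).
TRUE_EFFECT = 2.5

def type_ii_rate(alpha, effect=TRUE_EFFECT):
    """Probability of missing a real effect (beta) in a two-tailed z test."""
    z_crit = Z.inv_cdf(1 - alpha / 2)
    # Power: chance that the test statistic lands in either rejection region.
    power = (1 - Z.cdf(z_crit - effect)) + Z.cdf(-z_crit - effect)
    return 1 - power

alphas = [0.05, 0.01, 0.001]
betas = [type_ii_rate(a) for a in alphas]

for a, b in zip(alphas, betas):
    print(f"alpha = {a:<5}  beta = {b:.3f}")
```

With the effect held fixed, each tightening of alpha raises beta: fewer false alarms are bought at the price of more missed differences.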

Degrees of Freedom

• The number of independent pieces of information remaining after estimating one or more parameters

• Example: List = 1, 2, 3, 4; Average = 2.5

• For the average to remain the same, three of the numbers can be anything you want; the fourth is fixed

• New List = 1, 5, 2.5, __; Average = 2.5
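The list example can be checked directly. In this short sketch the three freely chosen values are the ones given above, and the fourth is whatever value keeps the average at 2.5; the variable names are illustrative.

```python
original = [1, 2, 3, 4]
target_mean = sum(original) / len(original)  # 2.5

# Three values can be chosen freely...
free_choices = [1, 5, 2.5]

# ...but the fourth is forced: it must bring the total back to n * mean.
fixed_value = target_mean * len(original) - sum(free_choices)

new_list = free_choices + [fixed_value]
print(new_list, sum(new_list) / len(new_list))
```

Once the mean is fixed, only n - 1 = 3 of the 4 values carry independent information, which is the sense in which one degree of freedom is used up.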

Major Points

• t tests: are differences significant?

• One-sample t tests: comparing one mean to the population

• Within-subjects test: comparing mean in condition 1 to mean in condition 2 (same participants in both conditions)

• Between-subjects test: comparing mean in condition 1 to mean in condition 2 (independent groups)

Effect of training on PowerPoint use

• Does training lead to fewer problems with PP?

• 9 subjects were trained on the use of PP.

• They then designed a presentation with PP.
  The number of problems they had was the dependent variable (DV)

PowerPoint study data

Number of problems per subject: 21, 24, 21, 26, 32, 27, 21, 25, 18

• Mean = 23.89

• SD = 4.20
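The summary statistics can be reproduced from the nine scores. This sketch uses Python's standard `statistics` module, whose `stdev` divides by n - 1 (the sample standard deviation).

```python
from statistics import mean, stdev

# Number of problems for each of the 9 trained subjects.
problems = [21, 24, 21, 26, 32, 27, 21, 25, 18]

sample_mean = mean(problems)
sample_sd = stdev(problems)  # sample SD, n - 1 in the denominator

print(f"mean = {sample_mean:.2f}, SD = {sample_sd:.2f}")
```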

Results of PowerPoint study

• Results
  Mean number of problems = 23.89

• Assume we know that without training the mean would be 30, but not the standard deviation
  Population mean = 30

• Is 23.89 enough smaller than 30 to conclude that training affected results?

One sample t test cont.

• Assume mean of population known, but standard deviation (SD) not known

• Substitute sample SD for population SD when computing the standard error

• This gives you the t statistic

• Compare t to tabled values, which show critical values of t

t Test for One Mean

• Get mean difference between sample and population mean

• Use sample SD as variance metric = 4.20

$$t = \frac{\bar{X} - \mu}{s / \sqrt{n}} = \frac{23.89 - 30}{4.20 / \sqrt{9}} = \frac{-6.11}{1.40} = -4.36$$
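As a cross-check, the one-sample t statistic can be computed from the nine raw scores and the assumed population mean of 30. This sketch works at full precision with the sample SD of the data (about 4.20), so its second decimal can differ slightly from a hand calculation that rounds intermediate values.

```python
from math import sqrt
from statistics import mean, stdev

problems = [21, 24, 21, 26, 32, 27, 21, 25, 18]
POPULATION_MEAN = 30  # assumed mean number of problems without training

n = len(problems)
standard_error = stdev(problems) / sqrt(n)  # s / sqrt(n)
t_stat = (mean(problems) - POPULATION_MEAN) / standard_error
df = n - 1

print(f"t({df}) = {t_stat:.2f}")
```

The negative sign simply reflects that the trained sample had fewer problems than the population mean.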

Degrees of Freedom

• Skewness of sampling distribution of variance decreases as n increases

• t will differ from z less as sample size increases

• Therefore need to adjust t accordingly

• df = n - 1

• t based on df

Looking up critical t (Table E.6)

Two-Tailed Significance Level

df    .10     .05     .02     .01
4     2.132   2.776   3.747   4.604
5     2.015   2.571   3.365   4.032
6     1.943   2.447   3.143   3.707
7     1.895   2.365   2.998   3.499
8     1.860   2.306   2.896   3.355
9     1.833   2.262   2.821   3.250

Conclusions

• Critical t: n = 9, df = 8, t.05 = 2.306 (two-tailed significance)

• If |t| > 2.306, reject H0

• Conclude that training leads to fewer problems
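The decision rule can be written as a small lookup. This is a sketch only: the critical values are standard two-tailed t-table entries for df = 8, the function name `reject_null` is made up for illustration, and the obtained t is computed from the summary figures given earlier (sample mean 23.89, SD 4.20, n = 9, population mean 30).

```python
from math import sqrt

# Standard two-tailed critical values of t for df = 8.
CRITICAL_T_DF8 = {0.10: 1.860, 0.05: 2.306, 0.02: 2.896, 0.01: 3.355}

def reject_null(t_stat, alpha=0.05):
    """Two-tailed decision: reject H0 when |t| exceeds the critical value."""
    return abs(t_stat) > CRITICAL_T_DF8[alpha]

# Obtained t from the summary statistics of the training study.
t_obtained = (23.89 - 30) / (4.20 / sqrt(9))

print(f"t = {t_obtained:.2f}, reject H0 at .05: {reject_null(t_obtained)}")
```

For a one-tailed test at .05, the cutoff is the entry in the two-tailed .10 column (1.860), which is one reason the choice of tails affects the decision.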

Factors Affecting t

• Difference between sample and population means

• Magnitude of sample variance

• Sample size

Factors Affecting Decision

• Significance level

• One-tailed versus two-tailed test

Sampling Distribution of the Mean

• We need to know what kinds of sample means to expect if training has no effect.
  i.e., what kinds of sample means if population mean = 23.89

Recall the sampling distribution of the mean.

Sampling Distribution of the Mean--cont.

• The sampling distribution of the mean depends on
  Mean of sampled population
  St. dev. of sampled population
  Size of sample

[Figure: Number of problems with PowerPoint use. Histogram of Mean Number of Problems (x-axis) against Frequency (y-axis, 0 to 1400). Mean = 23.89, Std. Dev = .45, N = 10,000.]

Sampling Distribution of the Mean--cont.

• Shape of the sampled population
  Sampling distribution approaches normal
  Rate of approach depends on sample size
  Also depends on the shape of the population distribution

Implications of the Central Limit Theorem

• Given a population with mean = μ and standard deviation = σ, the sampling distribution of the mean (the distribution of sample means) has a mean = μ and a standard deviation = σ/√n.

• The distribution approaches normal as n, the sample size, increases.
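The σ/√n rule is easy to evaluate. As a sketch, the snippet below plugs in the standard deviation of the skewed parent population used in the demonstration that follows (σ = 2.43) for the two sample sizes shown there; the helper name `standard_error` is illustrative.

```python
from math import sqrt

POP_SD = 2.43  # SD of the skewed parent population in the demonstration

def standard_error(sigma, n):
    """Standard deviation of the sampling distribution of the mean."""
    return sigma / sqrt(n)

se_by_n = {n: standard_error(POP_SD, n) for n in (3, 10)}

for n, se in se_by_n.items():
    print(f"n = {n:>2}: sigma / sqrt(n) = {se:.2f}")
```

These predictions (about 1.40 and .77) agree with the standard deviations reported on the two sampling-distribution histograms in the demonstration.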

Demonstration

• Let population be very skewed

• Draw samples of 3 and calculate means

• Draw samples of 10 and calculate means

• Plot means

• Note changes in means, standard deviations, and shapes

Parent Population

[Figure: Skewed population. Histogram of X (x-axis, 0 to 20) against Frequency (y-axis, 0 to 3000). Mean = 3.0, Std. Dev = 2.43, N = 10,000.]

Sampling Distribution n = 3

[Figure: Sample size = n = 3. Histogram of Sample Mean (x-axis, 0 to 13) against Frequency (y-axis, 0 to 2000). Mean = 2.99, Std. Dev = 1.40, N = 10,000.]

Sampling Distribution n = 10

[Figure: Sample size = n = 10. Histogram of Sample Mean (x-axis, 1.00 to 6.50) against Frequency (y-axis, 0 to 1600). Mean = 2.99, Std. Dev = .77, N = 10,000.]

Demonstration--cont.

• Means have stayed at 3.00 throughout, except for minor sampling error

• Standard deviations have decreased appropriately

• Shapes have become more normal; see superimposed normal distribution for reference
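The demonstration can be rerun as a simulation. The slides do not say which skewed population was used, so this sketch assumes a chi-square distribution with 3 degrees of freedom (a gamma distribution with shape 1.5 and scale 2), which has mean 3 and SD of about 2.45, close to the parent population shown. The seed and function names are illustrative.

```python
import random
from statistics import mean, stdev

random.seed(0)  # fixed seed so the run is repeatable

def draw_skewed():
    # Assumed parent population: chi-square with 3 df (mean 3, SD about 2.45).
    return random.gammavariate(1.5, 2.0)

def sample_means(n, n_samples=10_000):
    """Draw n_samples samples of size n and return the mean of each."""
    return [sum(draw_skewed() for _ in range(n)) / n for _ in range(n_samples)]

results = {n: sample_means(n) for n in (3, 10)}

for n, means in results.items():
    print(f"n = {n:>2}: mean of means = {mean(means):.2f}, SD of means = {stdev(means):.2f}")
```

Plotting each list of means as a histogram should show the same pattern as the slides: the centre stays near 3, the spread shrinks as n grows, and the shape becomes less skewed.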

Within subjects t tests

• Related samples

• Difference scores

• t tests on difference scores

• Advantages and disadvantages

Related Samples

• The same participants give us data on two measures
  e.g., before and after treatment
  Usability problems before training on PP and after training

• With related samples, someone high on one measure is probably high on the other (individual variability).

Related Samples--cont.

• Correlation between before and after scores
  Causes a change in the statistic we can use

• Sometimes called matched samples or repeated measures

Difference Scores

• Calculate difference between first and second score
  e.g., Difference = Before - After

• Base subsequent analysis on difference scores
  Ignoring Before and After data

Effect of training

            Before   After   Diff.
            21       15      6
            24       15      9
            21       17      4
            26       20      6
            32       17      15
            27       20      7
            21       8       13
            25       19      6
            18       10      8
Mean        23.89    15.67   8.22
St. Dev.    4.20     4.24    3.60

Results

• The training decreased the number of problems with PowerPoint

• Was this enough of a change to be significant?

• Before and After scores are not independent.
  See raw data
  r = .64

Results--cont.

• If no change, mean of differences should be zero
  So, test the obtained mean of difference scores against μ = 0.
  Use same test as in one sample test

t test

$$t = \frac{\bar{D}}{s_D / \sqrt{n}} = \frac{8.22}{3.6 / \sqrt{9}} = \frac{8.22}{1.2} = 6.85$$

$\bar{D}$ and $s_D$ = mean and standard deviation of differences.

df = n - 1 = 9 - 1 = 8

t test--cont.

• With 8 df, t.025 = ±2.306 (Table E.6)

• We calculated t = 6.85

• Since 6.85 > 2.306, reject H0

• Conclude that the mean number of problems after training was less than the mean number before training
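The within-subjects analysis can be reproduced from the Before and After columns. This sketch forms the difference scores, runs the one-sample t test on them against 0, and applies the critical value quoted above; full precision gives a t of about 6.86, against 6.85 from the rounded hand calculation.

```python
from math import sqrt
from statistics import mean, stdev

before = [21, 24, 21, 26, 32, 27, 21, 25, 18]
after = [15, 15, 17, 20, 17, 20, 8, 19, 10]

# Difference = Before - After, one score per participant.
diffs = [b - a for b, a in zip(before, after)]

n = len(diffs)
mean_diff = mean(diffs)
sd_diff = stdev(diffs)
t_stat = mean_diff / (sd_diff / sqrt(n))  # test mean difference against 0
df = n - 1

CRITICAL_T = 2.306  # two-tailed .05 critical value for 8 df
reject_h0 = abs(t_stat) > CRITICAL_T

print(f"mean diff = {mean_diff:.2f}, SD = {sd_diff:.2f}, t({df}) = {t_stat:.2f}")
print("reject H0" if reject_h0 else "fail to reject H0")
```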

Advantages of Related Samples

• Eliminate subject-to-subject variability

• Control for extraneous variables

• Need fewer subjects

Disadvantages of Related Samples

• Order effects

• Carry-over effects

• Subjects no longer naïve

• Change may just be a function of time

• Sometimes not logically possible

Between subjects t test

• Distribution of differences between means

• Heterogeneity of Variance

• Nonnormality

PowerPoint training again

• Effect of training on problems using PowerPoint
  Same study as before--almost

• Now we have two independent groups
  Trained versus untrained users
  We want to compare mean number of problems between groups

Effect of training

            Before   After   Diff.
            21       15      6
            24       15      9
            21       17      4
            26       20      6
            32       17      15
            27       20      7
            21       8       13
            25       19      6
            18       10      8
Mean        23.89    15.67   8.22
St. Dev.    4.20     4.24    3.60

Differences from within subjects test

Cannot compute pairwise differences, since we cannot compare two random people

We want to test differences between the two sample means (not between a sample and population)

Analysis

• How are sample means distributed if H0 is true?

• Need sampling distribution of differences between means
  Same idea as before, except statistic is $(\bar{X}_1 - \bar{X}_2)$ (mean 1 - mean 2)

Sampling Distribution of Mean Differences

• Mean of sampling distribution = μ1 - μ2

• Standard deviation of sampling distribution (standard error of mean differences) =

$$s_{\bar{X}_1 - \bar{X}_2} = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$

Sampling Distribution--cont.

• Distribution approaches normal as n increases.

• Later we will modify this to “pool” variances.

Analysis--cont.

• Same basic formula as before, but with accommodation to 2 groups.

• Note parallels with earlier t

$$t = \frac{\bar{X}_1 - \bar{X}_2}{s_{\bar{X}_1 - \bar{X}_2}} = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$$

Degrees of Freedom

• Each group has 9 subjects.

• Each group has n - 1 = 9 - 1 = 8 df

• Total df = n1 - 1 + n2 - 1 = n1 + n2 - 2
  9 + 9 - 2 = 16 df

• t.025(16) = ±2.12 (approx.)

Conclusions

• t = 4.13

• Critical t = 2.12

• Since 4.13 > 2.12, reject H0.

• Conclude that those who get training have fewer problems than those without training
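The between-subjects result can be reproduced by treating the two columns of scores as independent groups. In this sketch the group names `untrained` and `trained` are labels chosen for illustration; the standard error uses the unpooled formula s1²/n1 + s2²/n2.

```python
from math import sqrt
from statistics import mean, variance

untrained = [21, 24, 21, 26, 32, 27, 21, 25, 18]
trained = [15, 15, 17, 20, 17, 20, 8, 19, 10]

n1, n2 = len(untrained), len(trained)

# Standard error of the difference between two independent means.
se_diff = sqrt(variance(untrained) / n1 + variance(trained) / n2)
t_stat = (mean(untrained) - mean(trained)) / se_diff
df = n1 + n2 - 2

CRITICAL_T = 2.12  # approximate two-tailed .05 critical value for 16 df
reject_h0 = abs(t_stat) > CRITICAL_T

print(f"t({df}) = {t_stat:.2f}, reject H0: {reject_h0}")
```

The mean difference is the same 8.22 as in the within-subjects analysis, but t drops from 6.85 to 4.13 because subject-to-subject variability is no longer removed from the error term.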

Assumptions

• Two major assumptions
  Both groups are sampled from populations with the same variance
  • “homogeneity of variance”
  Both groups are sampled from normal populations
  • Assumption of normality
  Frequently violated with little harm.

Heterogeneous Variances

• Refers to case of unequal population variances.

• We don’t pool the sample variances.

• We adjust df and look t up in tables for adjusted df.

• Minimum df = smaller n - 1.
  Most software calculates optimal df.
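The slides do not name the adjustment that software applies. One widely used choice is the Welch-Satterthwaite approximation, sketched below on the training data as an assumption rather than as the method the slides intend; the function name `welch_df` is illustrative.

```python
from statistics import variance

def welch_df(sample1, sample2):
    """Welch-Satterthwaite approximate df for two samples with unpooled variances."""
    n1, n2 = len(sample1), len(sample2)
    a = variance(sample1) / n1  # s1^2 / n1
    b = variance(sample2) / n2  # s2^2 / n2
    return (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))

untrained = [21, 24, 21, 26, 32, 27, 21, 25, 18]
trained = [15, 15, 17, 20, 17, 20, 8, 19, 10]

adjusted_df = welch_df(untrained, trained)
minimum_df = min(len(untrained), len(trained)) - 1

print(f"adjusted df = {adjusted_df:.2f} (minimum {minimum_df}, pooled 16)")
```

Here the two sample variances are nearly equal, so the adjusted df stays just under the pooled value of 16; the more the variances differ, the further it falls toward the minimum of smaller n - 1.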
