if you think you made a lot of mistakes in the survey project…. think of how much you accomplished...
Post on 20-Dec-2015
TRANSCRIPT
If you think you made a lot of mistakes in the survey project….
Think of how much you accomplished and the mistakes you did not make…
• Went from not knowing much about surveys to having designed, deployed, and completed one in 1 ½ months
• Actually got people to respond!
• Did not end up with 100 open ended responses which you had to content analyze!
One Tailed and Two Tailed tests
One tailed tests: Based on a uni-directional hypothesis
  - Example: Effect of training on problems using PowerPoint
  - Population figures for usability of PP are known
  - Hypothesis: Training will decrease number of problems with PP
Two tailed tests: Based on a bi-directional hypothesis
  - Hypothesis: Training will change the number of problems with PP
If we know the population mean
[Figure: Sampling Distribution, population for usability of Powerpoint. X axis: Mean Usability Index; Y axis: Frequency. Mean = 5.65, Std. Dev = .45, N = 10000]
Unidirectional hypothesis: .05 level
Bidirectional hypothesis: .05 level
Identify region
• What does it mean if our significance level is .05?
  - For a uni-directional hypothesis
  - For a bi-directional hypothesis
PowerPoint example:
• Unidirectional: if we set the significance level at .05,
  - when training has no real effect, 5% of the time we will still find a higher mean by chance
  - the other 95% of the time, chance alone will not produce a mean that high
• Bidirectional: if we set the significance level at .05,
  - when training has no real effect, 2.5% of the time we will find a higher mean by chance
  - 2.5% of the time we will find a lower mean by chance
  - the other 95% of the time, chance alone will not produce a difference that large
Changing significance levels
• What happens if we relax our significance level from .01 to .05?
  - Probability of finding differences that don't exist goes up (criterion becomes more lenient)
• What happens if we tighten our significance level from .01 to .001?
  - Probability of not finding differences that exist goes up (criterion becomes more conservative)
• PowerPoint example: if we set the significance level at .05,
  - when there is no real difference, 5% of the time we will still find a difference by chance
  - the other 95% of the time, chance alone will not produce a difference that large
• If we set the significance level at .01,
  - when there is no real difference, 1% of the time we will still find a difference by chance
  - the other 99% of the time, chance alone will not produce a difference that large
• For usability, if you set out to find problems, setting lenient criteria might work better (you will identify more problems)
• Effect of relaxing the significance level from .01 to .05
  - Probability of finding differences that don't exist goes up (criterion becomes more lenient)
  - Also called Type I error (Alpha)
• Effect of tightening the significance level from .01 to .001
  - Probability of not finding differences that exist goes up (criterion becomes more conservative)
  - Also called Type II error (Beta)
Degrees of Freedom
• The number of independent pieces of information remaining after estimating one or more parameters
• Example: List = 1, 2, 3, 4 Average = 2.5
• For the average to remain the same, three of the numbers can be anything you want; the fourth is fixed
• New List = 1, 5, 2.5, __ Average = 2.5
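The "fourth is fixed" idea can be checked with a few lines of arithmetic. This is a minimal sketch; `fixed_last_value` is a hypothetical helper written for this example, not something from the slides.

```python
def fixed_last_value(free_values, target_mean):
    """Return the one remaining value that forces the list to have target_mean."""
    n = len(free_values) + 1            # total list length, including the fixed value
    required_total = target_mean * n    # the sum the full list must reach
    return required_total - sum(free_values)

# Original list: once 1, 2, 3 are chosen, the last value must be 4.
print(fixed_last_value([1, 2, 3], 2.5))

# New list: 1, 5, 2.5 are chosen freely; the blank is then determined.
last = fixed_last_value([1, 5, 2.5], 2.5)
print(last)
```

Three values vary freely and one does not, which is why a list of four numbers with a known mean has 4 - 1 = 3 degrees of freedom.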
Major Points
• T tests: are differences significant?
• One sample t tests: comparing one mean to population
• Within subjects test: Comparing mean in condition 1 to mean in condition 2 (same participants in both conditions)
• Between Subjects test: Comparing mean in condition 1 to mean in condition 2 (different participants in each condition)
Effect of training on Powerpoint use
• Does training lead to fewer problems with PP?
• 9 subjects were trained on the use of PP.
• They then designed a presentation with PP.
  - The number of problems they had was the DV
Powerpoint study data
Number of problems per subject: 21, 24, 21, 26, 32, 27, 21, 25, 18
• Mean = 23.89
• SD = 4.20
Results of Powerpoint study.
• Results
  - Mean number of problems = 23.89
• Assume we know that without training the mean would be 30, but not the standard deviation
  - Population mean = 30
• Is 23.89 enough smaller than 30 to conclude that training affected results?
One sample t test cont.
• Assume mean of population known, but standard deviation (SD) not known
• Substitute sample SD for population SD (standard error)
• Gives you the t statistic
• Compare t to tabled values which show critical values of t
t Test for One Mean
• Get mean difference between sample and population mean
• Use sample SD as variance metric = 4.20

$$ t = \frac{\bar{X} - \mu}{s / \sqrt{n}} = \frac{23.89 - 30}{4.20 / \sqrt{9}} = \frac{-6.11}{1.40} = -4.36 $$
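The one-sample t statistic can be recomputed from the nine raw scores. This is a sketch using only the Python standard library; the variable names are illustrative. Because it works from unrounded values, the result can differ in the second decimal from a hand calculation with rounded figures.

```python
import math
import statistics

# Number of problems for the 9 trained subjects (from the study data).
problems = [21, 24, 21, 26, 32, 27, 21, 25, 18]
population_mean = 30  # assumed known mean without training

n = len(problems)
sample_mean = statistics.mean(problems)
sample_sd = statistics.stdev(problems)       # sample SD, n - 1 in the denominator
standard_error = sample_sd / math.sqrt(n)    # estimated SD of the sample mean

t = (sample_mean - population_mean) / standard_error
df = n - 1

print(f"mean = {sample_mean:.2f}, SD = {sample_sd:.2f}, SE = {standard_error:.2f}")
print(f"t({df}) = {t:.2f}")
```

The sign of t is negative because the sample mean lies below the population mean; it is the size of t that is compared with the critical value.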
Degrees of Freedom
• Skewness of sampling distribution of variance decreases as n increases
• t will differ from z less as sample size increases
• Therefore need to adjust t accordingly
• df = n - 1
• t based on df
Looking up critical t (Table E.6)
Two-Tailed Significance Level
df      .10     .05     .02     .01
8       1.860   2.306   2.896   3.355
10      1.812   2.228   2.764   3.169
15      1.753   2.131   2.602   2.947
20      1.725   2.086   2.528   2.845
25      1.708   2.060   2.485   2.787
30      1.697   2.042   2.457   2.750
100     1.660   1.984   2.364   2.626
Conclusions
• n = 9, so df = 8 and critical t.05 = 2.306 (two-tailed significance)
• If |t| > 2.306, reject H0
• Conclude that training leads to fewer problems
Factors Affecting t
• Difference between sample and population means
• Magnitude of sample variance
• Sample size
Factors Affecting Decision
• Significance level
• One-tailed versus two-tailed test
Sampling Distribution of the Mean
• We need to know what kinds of sample means to expect if training has no effect.
  - i.e. What kinds of sample means if population mean = 23.89
  - Recall the sampling distribution of the mean.
Sampling Distribution of the Mean--cont.
• The sampling distribution of the mean depends on
  - Mean of sampled population
  - St. dev. of sampled population
  - Size of sample
[Figure: Sampling Distribution, number of problems with Powerpoint use. X axis: Mean Number of problems; Y axis: Frequency. Mean = 23.89, Std. Dev = .45, N = 10000]
Sampling Distribution of the mean--cont.
• Shape of the sampled population
  - The sampling distribution approaches normal
  - Rate of approach depends on sample size
  - Also depends on the shape of the population distribution
Implications of the Central Limit Theorem
• Given a population with mean = μ and standard deviation = σ, the sampling distribution of the mean (the distribution of sample means) has a mean = μ, and a standard deviation = σ/√n.
• The distribution approaches normal as n, the sample size, increases.
Demonstration
• Let population be very skewed
• Draw samples of 3 and calculate means
• Draw samples of 10 and calculate means
• Plot means
• Note changes in means, standard deviations, and shapes
[Figure: Parent Population (skewed). X axis: X; Y axis: Frequency. Mean = 3.0, Std. Dev = 2.43, N = 10000]
[Figure: Sampling Distribution, sample size n = 3. X axis: Sample Mean; Y axis: Frequency. Mean = 2.99, Std. Dev = 1.40, N = 10000]
[Figure: Sampling Distribution, sample size n = 10. X axis: Sample Mean; Y axis: Frequency. Mean = 2.99, Std. Dev = .77, N = 10000]
Demonstration--cont.
• Means have stayed at 3.00 throughout, except for minor sampling error
• Standard deviations have decreased appropriately
• Shapes have become more normal (see superimposed normal distribution for reference)
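A demonstration of this kind can be re-created with a short simulation. The sketch below is not the population behind the slides: it assumes an exponential population with mean 3 (so its SD is also 3, not 2.43), and the seed and function name are arbitrary choices for this example.

```python
import random
import statistics

random.seed(0)  # arbitrary seed so repeated runs give the same numbers

def sample_means(sample_size, n_samples=10_000):
    """Draw n_samples samples of the given size and return their means."""
    # Assumption: a skewed (exponential) population with mean 3 and SD 3.
    return [
        statistics.mean([random.expovariate(1 / 3) for _ in range(sample_size)])
        for _ in range(n_samples)
    ]

results = {}
for size in (3, 10):
    means = sample_means(size)
    results[size] = (statistics.mean(means), statistics.stdev(means))
    print(f"n = {size:2d}: mean of means = {results[size][0]:.2f}, "
          f"SD of means = {results[size][1]:.2f}, "
          f"sigma / sqrt(n) = {3 / size ** 0.5:.2f}")
```

The mean of the sample means stays near the population mean for both sample sizes, while their standard deviation shrinks toward σ/√n as n grows. A histogram of `means` would show the shape moving toward normal.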
Within subjects t tests
• Related samples
• Difference scores
• t tests on difference scores
• Advantages and disadvantages
Related Samples
• The same participants give us data on two measures
  - e.g. Before and After treatment
  - Usability problems before training on PP and after training
• With related samples, someone high on one measure is probably high on the other (individual variability).
Related Samples--cont.
• Correlation between before and after scores
  - Causes a change in the statistic we can use
• Sometimes called matched samples or repeated measures
Difference Scores
• Calculate difference between first and second score
  - e.g. Difference = Before - After
• Base subsequent analysis on difference scores
  - Ignoring Before and After data
Effect of training
            Before   After   Diff.
            21       15      6
            24       15      9
            21       17      4
            26       20      6
            32       17      15
            27       20      7
            21       8       13
            25       19      6
            18       10      8
Mean        23.89    15.67   8.22
St. Dev.    4.20     4.24    3.60
Results
• The training decreased the number of problems with Powerpoint
• Was this enough of a change to be significant?
• Before and After scores are not independent.
  - See raw data
  - r = .64
Results--cont.
• If no change, mean of differences should be zero
  - So, test the obtained mean of difference scores against μ = 0.
  - Use same test as in one sample test
t test

$$ t = \frac{\bar{D} - 0}{s_D / \sqrt{n}} = \frac{8.22}{3.6 / \sqrt{9}} = \frac{8.22}{1.2} = 6.85 $$

\bar{D} and s_D = mean and standard deviation of differences.
df = n - 1 = 9 - 1 = 8
t test--cont.
• With 8 df, t.025 = ±2.306 (Table E.6)
• We calculated t = 6.85
• Since 6.85 > 2.306, reject H0
• Conclude that the mean number of problems after training was lower than the mean number before training
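The within subjects test can be reproduced from the Before and After scores. This is a sketch with the standard library only; the variable names are illustrative, and the critical value is the tabled one quoted for 8 df. Working from unrounded values may shift the second decimal of t.

```python
import math
import statistics

# Paired scores for the same 9 participants (number of problems).
before = [21, 24, 21, 26, 32, 27, 21, 25, 18]
after = [15, 15, 17, 20, 17, 20, 8, 19, 10]

# Difference = Before - After, one score per participant.
differences = [b - a for b, a in zip(before, after)]

n = len(differences)
mean_diff = statistics.mean(differences)
sd_diff = statistics.stdev(differences)

# One-sample t test of the mean difference against 0.
t_paired = mean_diff / (sd_diff / math.sqrt(n))
df_paired = n - 1
critical_t = 2.306  # two-tailed .05 critical value for 8 df (tabled value)

print(f"mean difference = {mean_diff:.2f}, SD = {sd_diff:.2f}")
print(f"t({df_paired}) = {t_paired:.2f}, reject H0: {abs(t_paired) > critical_t}")
```

Only the difference scores enter the test, so stable person-to-person variation in the Before and After columns drops out.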
Advantages of Related Samples
• Eliminate subject-to-subject variability
• Control for extraneous variables
• Need fewer subjects
Disadvantages of Related Samples
• Order effects
• Carry-over effects
• Subjects no longer naïve
• Change may just be a function of time
• Sometimes not logically possible
Between subjects t test
• Distribution of differences between means
• Heterogeneity of Variance
• Nonnormality
Powerpoint training again
• Effect of training on problems using Powerpoint
  - Same study as before --almost
• Now we have two independent groups
  - Trained versus untrained users
  - We want to compare mean number of problems between groups
Effect of training
            Before   After   Diff.
            21       15      6
            24       15      9
            21       17      4
            26       20      6
            32       17      15
            27       20      7
            21       8       13
            25       19      6
            18       10      8
Mean        23.89    15.67   8.22
St. Dev.    4.20     4.24    3.60
Differences from within subjects test
Cannot compute pairwise differences, since we cannot compare two random people
We want to test differences between the two sample means (not between a sample and population)
Analysis
• How are sample means distributed if H0 is true?
• Need sampling distribution of differences between means
  - Same idea as before, except statistic is (X̄1 - X̄2) (mean 1 - mean 2)
Sampling Distribution of Mean Differences
• Mean of sampling distribution = μ1 - μ2
• Standard deviation of sampling distribution (standard error of mean differences) =

$$ s_{\bar{X}_1 - \bar{X}_2} = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} $$
Sampling Distribution--cont.
• Distribution approaches normal as n increases.
• Later we will modify this to "pool" variances.
Analysis--cont.
• Same basic formula as before, but with accommodation to 2 groups.
• Note parallels with earlier t

$$ t = \frac{\bar{X}_1 - \bar{X}_2}{s_{\bar{X}_1 - \bar{X}_2}} = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}} $$
Degrees of Freedom
• Each group has 9 subjects.
• Each group has n - 1 = 9 - 1 = 8 df
• Total df = n1 - 1 + n2 - 1 = n1 + n2 - 2
  - 9 + 9 - 2 = 16 df
• t.025(16) = ±2.12 (approx.)
Conclusions
• t = 4.13
• Critical t = 2.12
• Since 4.13 > 2.12, reject H0.
• Conclude that those who get training have fewer problems than those without training
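The between subjects t can be recomputed from raw scores. The sketch below assumes the two groups' scores are the Before column (untrained) and the After column (trained) of the data table, reused as if they came from different people; the slides only say the study is "almost" the same, so that mapping is an assumption, as are the variable names.

```python
import math
import statistics

# Assumption: untrained group = Before column, trained group = After column.
untrained = [21, 24, 21, 26, 32, 27, 21, 25, 18]
trained = [15, 15, 17, 20, 17, 20, 8, 19, 10]

n1, n2 = len(untrained), len(trained)
mean1, mean2 = statistics.mean(untrained), statistics.mean(trained)
var1, var2 = statistics.variance(untrained), statistics.variance(trained)

# Standard error of the difference between two independent means.
se_diff = math.sqrt(var1 / n1 + var2 / n2)

t_between = (mean1 - mean2) / se_diff
df_between = n1 + n2 - 2
critical_t = 2.12  # approximate two-tailed .05 critical value for 16 df

print(f"t({df_between}) = {t_between:.2f}, reject H0: {abs(t_between) > critical_t}")
```

The same scores give a smaller t here than in the within subjects analysis, because subject-to-subject variability is no longer removed by differencing.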
Assumptions
• Two major assumptions
  - Both groups are sampled from populations with the same variance ("homogeneity of variance")
  - Both groups are sampled from normal populations (assumption of normality)
• Frequently violated with little harm.
Heterogeneous Variances
• Refers to case of unequal population variances.
• We don't pool the sample variances.
• We adjust df and look t up in tables for adjusted df.
• Minimum df = smaller n - 1.
  - Most software calculates optimal df.
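The slides do not say how the adjusted df is calculated. One widely used choice is the Welch-Satterthwaite approximation; treating it as the adjustment meant here is an assumption. The sketch below applies it to the two groups of scores used earlier, with `welch_df` as a hypothetical helper name.

```python
import statistics

def welch_df(group1, group2):
    """Welch-Satterthwaite approximate df for two independent samples."""
    n1, n2 = len(group1), len(group2)
    # Each group's variance of the mean: s^2 / n.
    v1 = statistics.variance(group1) / n1
    v2 = statistics.variance(group2) / n2
    return (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))

# Assumption: the same two groups of scores as in the between subjects example.
untrained = [21, 24, 21, 26, 32, 27, 21, 25, 18]
trained = [15, 15, 17, 20, 17, 20, 8, 19, 10]

adjusted_df = welch_df(untrained, trained)
print(f"adjusted df = {adjusted_df:.2f}")
```

The result always lies between the smaller n - 1 and n1 + n2 - 2. For these two groups the sample variances are nearly equal, so the adjustment barely moves df away from 16.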