statistical methodology t-test, chi-squared, mean, median, mode
STATISTICAL METHODOLOGY: T-Test, Chi-Squared, Mean, Median, Mode
2 Types of Statistics
• 2 types of analysis techniques:
• 1. Descriptive statistics: techniques that help summarize large amounts of info. Include measures of variability and measures of correlation (describe the data)
– Population, Bag of M&Ms
• 2. Inferential statistics: techniques that help researchers make generalizations about a finding, based on a limited number of subjects
– Sample, Handful of M&Ms
M&M Sampling
• 13% brown, red• 14% yellow• 24% blue• 20% orange• 16% green• What was yours?
Descriptive Statistics
– Frequency distribution - organizational technique that shows the number of times each score occurs, so that the scores can be interpreted• Graph depictions– frequency polygon - curve– frequency histogram - bars
Descriptive Statistics
–Central Tendency - a number that represents the entire group or sample – Tend to hover towards the center• Average IQ score, around 100• 2 genius parents tend to have average IQ child• Politicians (Dem or Rep) dance in the center for max. votes• Weight distribution
Descriptive Statistics
–The Bell Curve–Grades, IQ, Poverty – Link between intelligence and salary• When did a C become an F?• Is a C acceptable? C=average• Does everyone get a trophy, ribbon?• Can everyone get an A?
Descriptive Statistics • mean - the arithmetic average • median - middle score when arranged lowest to highest• mode - the most frequent score in a distribution–unimodal - one high point–bimodal - two high points
Set: 2, 2, 3, 5, 8
• Median: 3
• Mode: 2
• Mean: add up (20), divide by 5 = 4
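The three measures for the example set can be checked with a minimal Python sketch. It relies only on the standard `statistics` module; the variable names are illustrative and not part of the slides.

```python
from statistics import mean, median, multimode

scores = [2, 2, 3, 5, 8]

avg = mean(scores)         # arithmetic average: sum of scores / number of scores
mid = median(scores)       # middle score once the scores are sorted
modes = multimode(scores)  # list of the most frequent score(s)

print("mean:", avg)
print("median:", mid)
print("mode(s):", modes)   # two entries here would indicate a bimodal set
```

`multimode` returns a list, so a bimodal set shows up as a list with two values instead of one.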
Descriptive Statistics
– bimodal - two high points
– The more the two arches of a bimodal curve overlap, the stronger the link between the two groups of scores
– The less they overlap, the weaker the connection
– Bimodal or unimodal?
A
B
Descriptive Measures
• 2 ways we measure:
• 1. Range: highest score minus the lowest score; tells how far apart the scores are
– Simplest measure of variability to calculate
– Weakness of range: it can easily be influenced by one extreme score (Savant IQ of 220)
• Set: 2, 2, 3, 5, 8• Range: 8 - 2 = 6
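A short sketch of the range calculation, and of its weakness when one extreme score is added. The helper name `score_range` and the extra score of 220 are illustrative assumptions, used only to show the effect of an outlier.

```python
def score_range(scores):
    """Range = highest score minus lowest score."""
    return max(scores) - min(scores)

scores = [2, 2, 3, 5, 8]
base_range = score_range(scores)

# Hypothetical: one extreme score is added to the same set.
with_outlier = scores + [220]
outlier_range = score_range(with_outlier)

print("range:", base_range)
print("range with one extreme score:", outlier_range)
```

A single added score changes the range from 6 to 218, even though the other five scores are unchanged.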
Ex: Age range 15-17, difference 2; age range 7-17, difference 10
Child prodigies: Doogie Howser, chess, sci, art, music
Descriptive Measures• The other way to measure is:• 2. Standard Deviation: measure of variability that describes
how scores are distributed around the mean. – (1 SD, 2 SD, -1, -2)
– Central Tendency: tend to hover near the center.
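[Figure: bell curve of IQ scores marked from -3 SD to +3 SD. Labels on the curve: Savant, 220 (1 in 30 million); Einstein 160, Bill Gates, Stephen Hawking; Obama 145-148; Hillary Clinton, Madonna 140; Clinton 137; Bush, rumor: 85, actual: 125]
Standard Deviation
[Figure: normal curve divided into SD bands. 34% of scores fall in each band next to the mean, 13.5% in each band between 1 and 2 SD, and 2% in each band between 2 and 3 SD. 68% fall within ±1 SD, 95% within ±2 SD, 99% within ±3 SD. The remaining 1% are outliers (Savant, 220, 1 in 30 million)]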
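The percentages on the curve can be reproduced from the standard normal distribution. This is a sketch assuming a perfectly normal curve; `share_within` is an illustrative helper name. The exact values are slightly more precise than the rounded figures on the slide.

```python
from statistics import NormalDist

# Standard normal curve: mean 0, standard deviation 1.
std_normal = NormalDist(mu=0, sigma=1)

def share_within(k):
    """Proportion of scores within k standard deviations of the mean."""
    return std_normal.cdf(k) - std_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within +/-{k} SD: {share_within(k):.1%}")
```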
Case Study: Marilyn vos Savant– Born Aug 11, 1946 (63) Missouri– American magazine columnist “ask Marilyn”, Parade
Magazine (logic, math puzzles), books• Guinness Book World records, Highest IQ (220+)– Age 10 (1957) scored 167-218 (1 in 30 million)
Case Study: Rain Man• 1988 comedy-drama (Tom Cruise)• Dustin Hoffman portrays Raymond Babbitt • Autistic Savant– Based on 2 real people (Kim Peek)
Video clip: Rain Man
Standard Deviation
To calculate standard deviation (SD):
• 1. find the mean of the distribution: 4
• 2. subtract each score from the mean to get the “deviations”: 4-2, 4-2, 4-3, 4-5, 4-8 = 2, 2, 1, -1, -4
• 3. square each deviation: 4, 4, 1, 1, 16
• 4. add the squared deviations: 4 + 4 + 1 + 1 + 16 = 26
• 5. divide by the number of scores minus 1 (n - 1); this result is called the variance (V): 26 / (5 - 1) = 26 / 4 = 6.5
• 6. find the square root of the variance; this is the standard deviation (SD): 2.55
• 7. dividing by n (5) gives a biased estimate: it tends to understate the variability of the population being tested
• 8. dividing by (n - 1) (4) gives the unbiased estimate of the population variance
• 9. now you can compare distributions with different means and standard deviations (ex: 3 different class scores, 78, 80, 92)
Set: 2, 2, 3, 5, 8
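The calculation steps can be followed one at a time in a short Python sketch. The variable names are illustrative; `statistics.stdev` is used at the end only as a cross-check, since it also divides by n - 1.

```python
from math import sqrt
from statistics import stdev

scores = [2, 2, 3, 5, 8]

m = sum(scores) / len(scores)                    # step 1: the mean
squared_devs = [(x - m) ** 2 for x in scores]    # steps 2-3: squared deviations
sum_sq = sum(squared_devs)                       # step 4: add them up
variance = sum_sq / (len(scores) - 1)            # step 5: divide by n - 1
sd = sqrt(variance)                              # step 6: square root

print("sum of squared deviations:", sum_sq)
print("variance:", variance)
print("standard deviation:", round(sd, 2))
print("library check:", round(stdev(scores), 2))
```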
Sigma: σ and Σ
• σ: the symbol for standard deviation (SD)
– Greek letter “sigma” (lower case form); s is used for the SD of a sample
• Σ: upper case letter (other Greek “sigma”)
– Standard meaning in mathematics, “add up a list of numbers.”
– Represents Sum, i.e. add together
Z-Score
Z-scores: a way of expressing a score’s distance from the mean in terms of the standard deviation (SD)
• to find a Z-score for a number in a distribution, subtract the mean from that number, and divide the result by the standard deviation: (8 - 4 (M)) / 2.55 (SD) = 4 / 2.55 = 1.57
• a positive Z-score shows that the number is higher than the mean (You’re OK, IQ, health average or higher)
• a negative Z-score shows that the number is lower than the mean (Below average, health, psych concerns)
• Z-scores allow psychologists to compare distributions with different means and standard deviations
• Sometimes Z-scores are necessary to explain standard deviation in an experiment’s results/discussion
[Figure: normal curve with the POS Z and NEG Z sides of the mean marked]
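A minimal sketch of the Z-score calculation for the set 2, 2, 3, 5, 8. The function name `z_score` is illustrative; it uses the sample standard deviation (n - 1), matching the SD of 2.55 worked out earlier.

```python
from statistics import mean, stdev

def z_score(x, scores):
    """Distance of x from the mean, in standard deviation units."""
    return (x - mean(scores)) / stdev(scores)

scores = [2, 2, 3, 5, 8]

z_high = z_score(8, scores)  # a score above the mean
z_low = z_score(2, scores)   # a score below the mean

print("Z for 8:", round(z_high, 2))
print("Z for 2:", round(z_low, 2))
```

The sign of the result tells which side of the mean the score falls on, and the size tells how many SDs away it is.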
Skewed Results
• When there are more scores at the high or low end of a distribution it is said to be skewed
– The tail signifies the extreme scores
– Single tailed = extreme scores on one side (either side)
– Which direction are the “outliers?”
– Called Right/Left Skew
– Also Pos./Neg. Skew
[Figure: two skewed curves, each with the majority of scores labeled at the peak and the outliers in the tail. Outliers: fringe, oddball, genius, bad egg]
A Skewed Distribution
Are the results positively or negatively skewed?
Positive Skew or Skewed Right
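The direction of skew can also be checked numerically. This sketch assumes the moment-based skewness coefficient, which the slides do not name; a positive value means the tail points right, a negative value means it points left. The function name and the mirrored data set are illustrative.

```python
def skewness(scores):
    """Moment-based skewness: third central moment / (second central moment ** 1.5)."""
    n = len(scores)
    m = sum(scores) / n
    m2 = sum((x - m) ** 2 for x in scores) / n
    m3 = sum((x - m) ** 3 for x in scores) / n
    return m3 / m2 ** 1.5

right_tail = [2, 2, 3, 5, 8]             # most scores low, a few high
left_tail = [-x for x in right_tail]     # hypothetical mirror image of the same set

print("right-tailed set:", round(skewness(right_tail), 2))
print("left-tailed set:", round(skewness(left_tail), 2))
```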
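Statistical Significance
• Statistics & Data
• T-test, CHI-square, Z-score
• Psychometrics
• Statistical Significance
– “I want to show that my independent variable really affects my dependent variable, and that the result is not a fluke”
– 95% confidence for the result to count
– Probability: p < .05 (5%), meaning that if only chance, randomness, chaos theory were at work, a result this strong would show up less than 5% of the time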
Inferential Statistics
Tests of Significance - used for determining whether the difference in scores between the experimental and control groups is really due to the effects of the independent variable or just due to random chance
• If p < .05, the outcome (or the difference between experimental and control groups) would occur by random chance alone fewer than 5 times per 100
– Researchers conclude the effect of the independent variable is significant (real).
Confused about Significance?Tests of Significance – • You want brain surgery to work (at least) 95% of the
time.• Your car?• Guns in military?• Prescription drugs?• Cancer?
• Dr. House: knows what the results of a test/disease SHOULD be 95%, move on to the next test.
There is a 5% chance for random results (chaos theory)
Inferential Statistics• Statistically Significant – • It is concluded that the independent variable
made a real difference between the experimental group and the control group
– Ritalin really DOES help ADHD– Raising serotonin levels DOES
help Depression (yoga)
Null Hypothesis
• Null Hypothesis: the hypothesis that the independent variable has no effect and any difference is due to chance; it is what stands if yours is wrong!
• Significance tests are used to accept (strictly, fail to reject) or reject the null hypothesis.
– If the probability of observing your result by chance is < .05, reject the null hypothesis
• Meaning that your original hypothesis is supported (chance, random, chaos is an unlikely explanation)
– If the probability of observing your result by chance is > .05, accept the null.
• Meaning that your original hypothesis is not supported (too much left to chance, random events)
• You need a backup, alternative hypothesis
Year 2 IB Psych Only
Null Hypothesis Practice
• Accept or Reject the Null?
• My hypothesis: Drug X will stop sleep walking.
• Do the testing. Do the data.
• The result for Drug X has a 37% probability of occurring by chance (p = .37).
• Is it greater than or less than 5% chance? <>.05?
• Do you accept the Null or reject the Null Hypothesis?
• ACCEPT the NULL! My theory was not supported!
• 37% chance, error, random
– Maybe it’s the patients I chose?
– Maybe too much caffeine before bed?
– Maybe drug was contaminated in the lab?
– Start over, new test, new drug, new data
Null Hypothesis Practice
• Accept or Reject the Null?
• My hypothesis: Stress causes mice to gain weight.
• Do the testing. Do the data.
• The “stressed mice” gained weight.
• The “control group” of mice showed no weight gain.
• The difference has a 3% probability of occurring by chance (p = .03).
• Is it greater than or less than 5% chance? <>.05?
• Do you accept the Null or reject the Null Hypothesis?
• REJECT the NULL! My theory was supported!
• 3% chance, error, random
– Good Job! Bonus and a raise!
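The decision in both practice problems comes down to comparing a p-value with the .05 cutoff. A minimal sketch, assuming p = .37 for the Drug X example and p = .03 for the stressed-mice example; the function name `decide` and the return strings are illustrative. What the slides call accepting the null is worded here as failing to reject it.

```python
ALPHA = 0.05  # conventional significance cutoff

def decide(p_value, alpha=ALPHA):
    """Compare a p-value with the cutoff and report the decision."""
    if p_value < alpha:
        return "reject the null"
    return "fail to reject the null"

drug_x = decide(0.37)         # 37% chance: too much left to chance
stressed_mice = decide(0.03)  # 3% chance: unlikely to be a fluke

print("Drug X:", drug_x)
print("Stressed mice:", stressed_mice)
```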
Types of Tests
• 1. T-Test
• 2. Chi-Square Test
• 3. Mann-Whitney U
• 4. Sign Test
• 5. Wilcoxon Matched-Pairs Signed-Rank Test
Which letters belong together?
• A G H O L P E W Q M C
ANSWER: AHLEWM and GOPQC
When to Use the T-Test?
• T-Test – when 1 variable is used in 2 situations
-- Ex: Ritalin effects in either ADHD males or ADHD females
-- Ex: subject has to pick out a letter in a round list or a square list
• Common situation in psychology:
• Randomly assign people to an “experimental” group or a “control” group to study the effect
– In this situation, we are interested in the mean difference between the 2 conditions.
– The significance test used in this kind of scenario is called a t-test.
• Used to determine whether the observed mean difference is within the range that would be expected if the null hypothesis were true; if it falls outside that range (p less than .05), the difference is significant.
How to Use the T-Test?
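• T-Test
• 1. Subtract mean from each score
• 2. Rank items
• 3. Sum of Positive Ranks
• 4. Sum of Negative Ranks
• 5. Smallest score = T
• If t > 1.96 or < -1.96, then p < .05 (Test is Valid)
– ±1.96 is the cutoff for large samples; small samples need a larger cutoff from a t table
– These steps are a simplified shortcut: the standard t statistic is the difference between the two group means divided by the standard error of that difference

GIRLS  BOYS
7      4
5      2
2      3
1      4
4      1
5      3
6      4
30     21   (totals)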
How to Use the T-Test?
• T-Test (BOYS: 4, 2, 3, 4, 1, 3, 4; total 21; 7 scores)
• 1. Subtract mean from each score
Mean = 21 divided by 7 = 3
4-3, 2-3, 3-3, 4-3, 1-3, 3-3, 4-3 = 1, -1, 0, 1, -2, 0, 1
• 2. Rank items
1, 1, 1, 0, 0, -1, -2
• 3. Sum of Positive Ranks
1 + 1 + 1 + 0 = 3
• 4. Sum of Negative Ranks
-1 + -2 = -3
• 5. Smallest score = t (-3)
• If t > 1.96 or < -1.96, then p < .05
– Yes: Test is VALID
– No: We have to redo our hypothesis??? That bites
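For comparison, here is a sketch of the standard independent-samples t statistic applied to the GIRLS and BOYS scores from the slide. It assumes a pooled-variance t-test, which is not the ranking shortcut the slides walk through. The cutoff of 2.179 is the two-tailed .05 critical value for 12 degrees of freedom from a standard t table, not a number from the slides. The function name `pooled_t` is illustrative.

```python
from math import sqrt
from statistics import mean, variance

girls = [7, 5, 2, 1, 4, 5, 6]
boys = [4, 2, 3, 4, 1, 3, 4]

def pooled_t(a, b):
    """Difference between group means divided by its standard error."""
    n_a, n_b = len(a), len(b)
    df = n_a + n_b - 2
    # Pooled variance: weighted average of the two sample variances.
    pooled_var = ((n_a - 1) * variance(a) + (n_b - 1) * variance(b)) / df
    se = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(a) - mean(b)) / se, df

t, df = pooled_t(girls, boys)

# Assumption: two-tailed .05 critical value for df = 12, from a standard t table.
T_CRITICAL_DF12 = 2.179
significant = abs(t) > T_CRITICAL_DF12

print("t =", round(t, 2), "with df =", df)
print("significant at .05:", significant)
```

With only 7 scores per group, the cutoff is noticeably larger than the large-sample value of 1.96.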
Awesome Calculators!
• www.graphpad.com/quickcalcs/index.cfm
• T-Test• Chi-Square
When do I Use Chi-Square?
• A common situation in psychology is when a researcher is interested in the relationship between 2 nominal or categorical variables.
• The significance test used in this kind of situation is called a chi-square (χ²).
• Ex: We are interested in whether single men vs. women are more likely to own cats vs. dogs.
• Notice that both variables are categorical.
– Kind of pet: cat or dog
– Gender: male or female
Chai-squared
Example Data: Observed (Actual Data)
• Males are more likely to have dogs as opposed to cats
• Females are more likely to have cats than dogs
        Cat  Dog  Total
Male    20   30   50
Female  30   20   50
Total   50   50   100
NHST (Null Hypothesis Significance Testing)
Question: Are these differences best accounted for by the null hypothesis?
Is there a real relationship between gender and pet ownership?
Example Data: Observed (Actual Data)
• Are females more emotional?
        Emotional  Not Emotional  Total
Female  20         30             50
Male    30         20             50
Total   50         50             100
Chi-Square Test – when there are 2 variables
– The closer your results (Experimental and Control), the harder to prove if indep. variable (IV) really worked.
– Further apart, you can see definite difference.
Example Data: Expected Data
• To find expected value for a cell of the table, multiply the corresponding row total by the column total, and divide by the grand total
• For the first cell (and all other cells): (50 x 50)/100 = 25
• Thus, if the two variables are unrelated, we would expect to observe 25 people in each cell
        Cat  Dog  Total
Male    25   25   50
Female  25   25   50
Total   50   50   100
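The expected-value rule (row total times column total, divided by the grand total) can be sketched for the pet-ownership table as follows. The dictionary layout and the function name `expected_counts` are illustrative choices, not part of the slides.

```python
observed = {
    "Male": {"Cat": 20, "Dog": 30},
    "Female": {"Cat": 30, "Dog": 20},
}

def expected_counts(table):
    """Expected count per cell = row total * column total / grand total."""
    row_totals = {row: sum(cols.values()) for row, cols in table.items()}
    col_totals = {}
    for cols in table.values():
        for col, count in cols.items():
            col_totals[col] = col_totals.get(col, 0) + count
    grand_total = sum(row_totals.values())
    return {
        row: {col: row_totals[row] * col_totals[col] / grand_total for col in cols}
        for row, cols in table.items()
    }

expected = expected_counts(observed)
for row, cols in expected.items():
    print(row, cols)
```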
Example Data: Expected vs. Observed
• The differences between these (E) expected values (25) and the (O) observed values (see boxes) are aggregated according to the Chi-square formula:
        Cat  Dog  Total
Male    20   30   50
Female  30   20   50
Total   50   50   100
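$$\chi^2 = \sum \frac{(O - E)^2}{E}$$

$$\chi^2 = \frac{(20 - 25)^2}{25} + \frac{(30 - 25)^2}{25} + \frac{(30 - 25)^2}{25} + \frac{(20 - 25)^2}{25} = \frac{25}{25} + \frac{25}{25} + \frac{25}{25} + \frac{25}{25} = 1 + 1 + 1 + 1 = 4$$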
• Once you have the chi-square statistic, it can be evaluated against a chi-square sampling distribution
• The sampling distribution characterizes the range of chi-square values we might observe if the null hypothesis is true, but sampling error is giving rise to deviations from the expected values.
• In our example the chi-square was 4.0; with 1 degree of freedom (a 2 x 2 table) this exceeds the critical value of 3.84, so the associated p-value was < .05 (about .046)
• Reject the Null Hypothesis: the data support a real relationship between gender and pet ownership
Do We Accept or Reject the Null?
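The chi-square statistic and its p-value for the observed pet-ownership table can be worked through in a short sketch. It assumes a 2 x 2 table (1 degree of freedom) with no continuity correction. For 1 degree of freedom the p-value can be computed with the standard library's `erfc`. The function names are illustrative.

```python
from math import erfc, sqrt

observed = [
    [20, 30],  # Male:   Cat, Dog
    [30, 20],  # Female: Cat, Dog
]

def chi_square(table):
    """Sum of (observed - expected)^2 / expected over all cells."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            exp = row_totals[i] * col_totals[j] / grand_total
            stat += (obs - exp) ** 2 / exp
    return stat

def p_value_df1(stat):
    """Upper-tail probability of a chi-square value with 1 degree of freedom."""
    return erfc(sqrt(stat / 2))

chi2 = chi_square(observed)
p = p_value_df1(chi2)

print("chi-square:", chi2)
print("p-value:", round(p, 4))
print("reject the null at .05:", p < 0.05)
```

For these counts the statistic is 4.0 and the p-value is about .046, just under the .05 cutoff.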
Prisoner’s Dilemma, Social Trap Game Matrix, Non-zero-Sum-Game, Game Theory (Nash) ALL CHI SQUARED
Mann-Whitney U Test
• Skewed results? Are the two groups from the same distribution?
– Use to determine if there were problems with sampling, population, contamination
– Use for 2 groups (samples)
– Substitute for the T-Test
– Ex: Experimental & Control
How To Use Mann-Whitney U Test
– Ex: Experimental & Control
– Lay out all of your scores (in both groups)
– Rank them together: Rank 1 (lowest) to Rank 15 (highest)

Experimental Group     Control Group
Time (min)  Rank       Time (min)  Rank
140         4          130         1
147         6          135         2
153         8          138         3
160         10         144         5
165         11         148         7
170         13         155         9
171         14         168         12
193         15
How To Use Mann-Whitney U Test
• Add up the ranks in each group
• Experimental Group: 4 + 6 + 8 + 10 + 11 + 13 + 14 + 15, so R1 = 81, N1 = 8
• Control Group: 1 + 2 + 3 + 5 + 7 + 9 + 12, so R2 = 39, N2 = 7
How To Use Mann-Whitney U Test
• Experimental Group: R1 = 81, N1 = 8
• Control Group: R2 = 39, N2 = 7
• Formula to find U (Hypothetical Data Statistics)

$$U = N_1 N_2 + \frac{N_1 (N_1 + 1)}{2} - R_1$$

$$U = (8)(7) + \frac{8(9)}{2} - 81 = 56 + 36 - 81 = 11$$
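The ranking, rank sums, and U formula can be reproduced in a short sketch using the times from the Experimental and Control groups. It assumes there are no tied scores, which holds for this data. The function name `mann_whitney_u` is illustrative, and the second value U2 (the same formula applied to the control group) is an addition not shown on the slides.

```python
experimental = [140, 147, 153, 160, 165, 170, 171, 193]
control = [130, 135, 138, 144, 148, 155, 168]

def mann_whitney_u(a, b):
    """Rank all scores together, sum the ranks per group, then apply the U formula.

    Assumes no tied scores, so each score maps to exactly one rank.
    """
    combined = sorted(a + b)
    rank = {score: position + 1 for position, score in enumerate(combined)}
    r1 = sum(rank[x] for x in a)
    r2 = sum(rank[x] for x in b)
    n1, n2 = len(a), len(b)
    u1 = n1 * n2 + n1 * (n1 + 1) // 2 - r1
    u2 = n1 * n2 + n2 * (n2 + 1) // 2 - r2
    return r1, r2, u1, u2

r1, r2, u1, u2 = mann_whitney_u(experimental, control)

print("R1 =", r1, "R2 =", r2)
print("U (experimental) =", u1)
print("U (control) =", u2)
```

The two U values always add up to N1 x N2 (56 here), which is a quick way to check the arithmetic.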
How To Use Mann-Whitney U Test
• Experimental Group: R1 = 81, N1 = 8
• Control Group: R2 = 39, N2 = 7, U = 11
• Go to the Mann-Whitney Chart (Table 1)
• Is 11 in between the range of #s on the chart for N1 = 8, N2 = 7? (6-50) YES
[Table 1: Mann-Whitney chart with N1 = 2 to 8 across and N2 = 2 to 7 down; the entry for N1 = 8, N2 = 7 is 6/50]
How To Use Mann-Whitney U Test
• Experimental Group: R1 = 81, N1 = 8
• Control Group: R2 = 39, N2 = 7, U = 11
• Is 11 in between the range of #s on the chart for N1 = 8, N2 = 7? (6-50)
• If YES (U falls inside the range), do not reject the Null Hypothesis: the two groups could come from the same distribution
• Your distribution and population is acceptable; even though a skew has occurred, you are within the acceptable range
• If NO (U falls outside the range), reject the Null Hypothesis: the two groups differ by more than chance would explain
• Either the independent variable had an effect, or something has contaminated your population or your data
Video Clips
• Stossel, Media Scare (Statistics)• ThePsychFiles, Faces, Metacafe, 14 min• Eddie Izzard, Part 9, Tea & Cake or Death?