Inferential statistics
Summary from last week
The normal distribution
Hypothesis testing
Type I and II errors
Statistical power
Exercises: exercises on SD etc.; descriptive data analysis in SPSS
We covered the following:
Populations and samples
Frequency distributions
Mode, median, mean
Standard deviation
Confidence intervals
We are often interested in answering questions about populations, i.e. in going beyond our samples.
Samples are used to make an informed guess about the results we would get if we could measure the entire population.
One of the first operations we perform, having obtained new data from a sample of people, is to summarize the data.
This is done to figure out the general patterns within the data.
Two choices:
Calculate a summary statistic, which tells us something about the scores collected
Draw a graph, for the same purpose
The simplest graph summarizes how many times each collected score occurs: a frequency distribution (or histogram).
The histogram shows that most people made 6 or more errors.
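A tally like this can be computed in a few lines; here is a minimal Python sketch using made-up error counts (the data are illustrative, not the slide's):

```python
from collections import Counter

# Hypothetical error counts from 15 participants (illustrative data only)
errors = [6, 7, 6, 8, 5, 6, 9, 7, 6, 4, 8, 7, 6, 9, 7]

freq = Counter(errors)  # maps each score -> how many times it occurs

# Print a text histogram, lowest score first
for score in sorted(freq):
    print(f"{score:2d} | {'#' * freq[score]}")
```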
Different ways of summarizing data: the mean.
Add all the scores together and divide by the total number of scores:

X̄ = ΣX / N

e.g. (3 + 4 + 4 + 5 + 6) / 5 = 22 / 5 = 4.4
The standard deviation shows how accurately the mean represents the data: the spread of scores around the mean.
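As a quick sketch, the slide's example scores can be summarized with Python's standard library (note that `statistics.stdev` uses the sample formula, with N − 1 in the denominator):

```python
import statistics

scores = [3, 4, 4, 5, 6]

mean = sum(scores) / len(scores)  # X-bar = sum of scores / N
sd = statistics.stdev(scores)     # sample SD (N - 1 denominator)

print(mean)  # 4.4
print(round(sd, 2))
```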
Often we can draw more than one sample from a population.
This permits the calculation of different sample means, whose values will vary, giving us a sampling distribution.

[Figure: nine samples drawn from a population with mean 10 give sample means of 8, 9, 9, 10, 10, 10, 11, 11 and 12; plotted as a frequency distribution, these sample means have mean = 10 and SD = 1.22 – the sampling distribution.]
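The sampling distribution is easy to simulate; here is a sketch assuming a hypothetical normally distributed population with mean 10 and SD 3:

```python
import random
import statistics

random.seed(42)

# Hypothetical population: normally distributed scores, mean 10, SD 3
population = [random.gauss(10, 3) for _ in range(100_000)]

# Draw many samples of size 30 and record each sample's mean
sample_means = [
    statistics.mean(random.sample(population, 30))
    for _ in range(1_000)
]

# The sample means cluster around the population mean,
# with far less spread than the raw scores
print(round(statistics.mean(sample_means), 2))
print(round(statistics.stdev(sample_means), 2))
```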
The distribution of sample means is normally distributed...
... no matter what the shape of the original distribution of raw scores in the population.
This is due to the Central Limit Theorem.
The approximation holds well for sample sizes of about 30 and greater.
In practice, this means the odds of different sample means being similar to one another are very high.
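The Central Limit Theorem can be illustrated by starting from a clearly non-normal population; this sketch uses a hypothetical right-skewed (exponential) population:

```python
import random
import statistics

random.seed(1)

# A clearly non-normal (right-skewed) population: exponential scores, mean ~1
population = [random.expovariate(1.0) for _ in range(50_000)]

# Despite the skew, means of samples of size 30 pile up
# around the population mean (Central Limit Theorem)
means = [statistics.mean(random.sample(population, 30)) for _ in range(2_000)]

print(round(statistics.mean(means), 2))   # close to the population mean of ~1
print(round(statistics.stdev(means), 2))  # close to sigma / sqrt(30), i.e. ~0.18
```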
If we obtain a sample mean that is much higher or lower than the population mean, there are two possible reasons:
(1) Our sample mean is a rare "fluke" (a quirk of sampling variation);
(2) Our sample has not come from the population we thought it did, but from some other, different, population.
The greater the difference between the sample and population means, the more plausible (2) becomes.
Sample means from the same population tend to be similar. If not, there are two explanations:
(1) The sample is a fluke: by chance our random samples contained people with very different properties.
(2) The sample does not come from the population we thought it did.
We can decide between these alternatives as follows:
The differences between any two sample means from the same population are normally distributed, around a mean difference of zero.
Most differences will be relatively small, since the Central Limit Theorem tells us that most samples will have similar means to the population mean (similar means to each other).
If we obtain a very large difference between our sample means, it could have occurred by chance, but this is very unlikely - it is more likely that the two samples come from different populations.
The term "population" does not necessarily refer to a set of individuals or items (e.g. cars). Rather, it refers to a state of individuals or items.
Example: After a major earthquake in a city (in which no one died) the actual set of individuals remains the same. But the anxiety level, for example, may change. The anxiety level of the individuals before and after the quake defines them as two populations.
The Normal curve is a mathematical abstraction which conveniently describes ("models") many frequency distributions of scores in real-life.
Examples of roughly normal distributions:
length of pickled gherkins
length of time before someone looks away in a staring contest
Francis Galton (1876) 'On the height and weight of boys aged 14, in town and country public schools.' Journal of the Anthropological Institute, 5, 174-180:
[Figure: frequency (%) of heights of 14-year-old children, in 2-inch bins from 51–52 to 69–70 inches, plotted separately for country and town schools; both distributions are roughly normal.]
Properties of the Normal Distribution:
1. It is bell-shaped and asymptotic at the extremes.
2. It's symmetrical around the mean.
3. The mean, median and mode all have the same value.
4. It can be specified completely, once mean and s.d. are known.
5. The area under the curve is directly proportional to the relative frequency of observations.
e.g. here, 50% of scores fall below the mean, as does 50% of the area under the curve.
e.g. here, 85% of scores fall below score X, corresponding to 85% of the area under the curve.
Relationship between the normal curve and the standard deviation:
All normal curves share this property: the s.d. cuts off a constant proportion of the distribution of scores:

[Figure: normal curve over a frequency axis, x-axis marked −3, −2, −1, mean, +1, +2, +3 standard deviations; 68% of the area lies within ±1 SD, 95% within ±2 SD, and 99.7% within ±3 SD.]
About 68% of scores will fall in the range of the mean plus and minus 1 s.d.;
95% in the range of the mean +/- 2 s.d.'s;
99.7% in the range of the mean +/- 3 s.d.'s.
e.g.: I.Q. is normally distributed, with a mean of 100 and s.d. of 15.
Therefore, 68% of people have I.Q.'s between 85 and 115 (100 +/- 15);
95% have I.Q.'s between 70 and 130 (100 +/- 2*15);
99.7% have I.Q.'s between 55 and 145 (100 +/- 3*15).
[Figure: normal curve with the region between 85 (mean − 1 s.d.) and 115 (mean + 1 s.d.) shaded, covering 68% of the area.]
Just by knowing the mean, SD, and that scores are normally distributed, we can tell a lot about a population.
If we encounter someone with a particular score, we can assess how they stand in relation to the rest of their group.
e.g.: someone with an I.Q. of 145 is quite unusual: this is 3 SDs above the mean. I.Q.'s 3 SDs or more above the mean occur in only 0.15% of the population [(100 - 99.7) / 2]. Note: we divide by 2 because the normal distribution has two tails!
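The same figure can be computed exactly from the normal curve; here is a small sketch using only the standard library (`erf` gives the exact tail area, slightly smaller than the 0.15% from the rounded 99.7% rule):

```python
from math import erf, sqrt

def upper_tail(score, mean, sd):
    """P(X >= score) for a normal distribution, via the error function."""
    z = (score - mean) / sd
    return 0.5 * (1 - erf(z / sqrt(2)))

# I.Q. ~ Normal(100, 15); how rare is a score of 145 (z = +3)?
p = upper_tail(145, 100, 15)
print(f"{p:.4%}")  # roughly 0.13% -- close to the 0.15% rule-of-thumb figure
```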
Conclusions:
Many psychological/biological properties are normally distributed.
This is very important for statistical inference (extrapolating from samples to populations)
Scientists are interested in testing hypotheses, i.e. testing the scientific question they are interested in.
With most experimental work, we have a prediction that our manipulation of the IV will lead to a result – this is the experimental hypothesis.
The opposite of the experimental hypothesis is called the null hypothesis – this specifies that our prediction was wrong and that the manipulation did not have an effect.
Example: alcohol makes you fall over.
The experimental hypothesis (H) is that those who drink alcohol will fall over more than those who do not.
The null hypothesis (H0) is that people will fall over the same amount regardless of how much alcohol they have drunk.
Inferential statistics are used to discover whether the experimental hypothesis is likely to be true.
We can never be 100% sure – so we deal with probabilities.
We calculate the probability that the result we have obtained is a chance result – the threshold is typically 5% (0.05).
As this probability decreases, we become more confident that our experimental hypothesis is correct (and the null hypothesis can be rejected).
Working with humans, we normally work with a 95% threshold for confidence.
Example: two groups of dinosaurs. We hit one of the groups over the head with meteors, measuring how many hits it takes before they get a headache.
Before the manipulation, we would expect the means of the two groups to be similar – i.e. to require similar numbers of meteors.
The means would only differ if by random chance we got dinosaurs from the extremes of the population – unlikely, given the normal distribution.
We would expect our manipulation to have an effect on the mean of the experimental group.
We manipulate our experimental group and measure the mean of the DV. If the mean is different from the control group's, there are two possible explanations:
▪ The manipulation changed the thing we are measuring – we now have samples from 2 different populations
▪ The manipulation did not have an effect, but by random chance we got two samples of people who are very different – the observed difference is a fluke of sampling
The bigger the difference between our sample means, the more likely it is that our experiment had an effect!
When the means of the control and experimental groups are similar after the experiment, we are less confident that our manipulation had an effect.
These ideas form the basis for hypothesis testing
We calculate the probability that two samples come from the same population
When this is high, we conclude our experiment had no effect (null hypothesis is true)
When it is low, we conclude the experiment had an effect (experimental hypothesis is true)
If the probability of the two samples being from the same population is 5% or less, we accept that the experimental manipulation was successful.
How do we calculate the probability that samples are from the same population? This depends on the experimental design and the test used.
Some general principles: we know there are two types of variation in an experiment:
▪ Systematic: variation due to the experimental manipulation of the IV
▪ Unsystematic: variation due to natural differences between people
We compare the amount of variation created by the experimental manipulation with the amount of variance due to random factors – this is because we expect our experiment to have a greater effect than the random factors alone.
When trying to find out if the experimentally caused variance is bigger than the random variance, we calculate a test statistic.
A test statistic is a statistic that has a known frequency distribution – knowing this, we can work out the probability of obtaining a particular value
▪ E.g.: a 2% chance of getting the value "34"
In general: test statistic = systematic variance / unsystematic variance
This ratio should be greater than 1!
Once we have calculated a test statistic, we can use its frequency distribution to tell us how probable it was that we got this value. E.g.: "A test statistic value of 34 has a 2% (0.02) chance of occurring."
The bigger the test statistic, the less likely it is to occur by chance.
When the probability of a test statistic of this size falls below 0.05 (5%), we have enough confidence to assume the test statistic is as large as it is because of our experimental manipulation – and we accept our experimental hypothesis.
Which test statistic to use? This depends on the experimental design and the test we are using – more on this next week!
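As a preview of such tests, here is a sketch of an independent-samples t-test, assuming SciPy is available and using made-up group scores (the t statistic is one concrete example of a systematic-to-unsystematic variance ratio with a known distribution):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Hypothetical scores: control group vs experimental group with a real effect
control = rng.normal(loc=10, scale=2, size=50)
experimental = rng.normal(loc=12, scale=2, size=50)

# The t statistic's known frequency distribution gives the probability
# of obtaining a value this large when there is no real effect
t, p = stats.ttest_ind(control, experimental)

print(round(float(t), 2), round(float(p), 4))
if p < 0.05:
    print("Reject H0: the manipulation appears to have had an effect")
```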
Two types of errors can occur when testing hypotheses:
Type 1 error: reject H0 when it is true. We think our experimental manipulation has had an effect, when in fact it has not (also known as α, "alpha", error).
Type 2 error: retain H0 when it is false. We think our experimental manipulation has not had an effect, when in fact it has (also known as β, "beta", error).
Any observed difference between two sample means could in principle be either "real" or due to chance - we can never tell for certain
But:
Large differences between samples from the same population are unlikely to arise by chance.
Small differences between samples are likely to have arisen by chance.
Problem: Reducing the chances of making a Type 1 error increases the chances of making a Type 2 error, and vice versa.
We therefore compromise between the chances of making a Type 1 error, and the chances of making a Type 2 error:
We (generally) set the probability of making a Type 1 error at 0.05 (5%)
When we do an experiment, we accept a difference between two samples as "real", if a difference of that size would be likely to occur, by chance, 5% of the time, i.e. five times in every hundred experiments performed
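This 5% convention can be checked by simulation: if H0 is true and we test at alpha = 0.05, about 5% of experiments should come out "significant" purely by chance. A sketch assuming SciPy is available:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)

# Simulate 1,000 experiments where H0 is TRUE: both groups come from
# the same population, so any "significant" result is a Type 1 error
false_positives = 0
for _ in range(1_000):
    a = rng.normal(loc=100, scale=15, size=30)
    b = rng.normal(loc=100, scale=15, size=30)
    _, p = stats.ttest_ind(a, b)
    if p < 0.05:
        false_positives += 1

# With alpha = 0.05 we expect roughly 5% false positives
print(false_positives / 1_000)
```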
Non-directional (two-tailed) hypothesis: merely predicts that the sample mean (X̄) will be significantly different from the population mean (µ).

[Figure: distribution of possible differences between X̄ and µ, centred on X̄ = µ, with both tails shaded. X̄ > µ: differences this extreme (or more) occur by chance with p = 0.025. X̄ < µ: differences this extreme (or more) occur by chance with p = 0.025.]
Directional (one-tailed) hypothesis: more precise - predicts the direction of the difference (i.e., either predicts X̄ is bigger than µ, or predicts X̄ is smaller than µ).

[Figure: distribution of possible differences between X̄ and µ, centred on X̄ = µ, with only one tail shaded. X̄ > µ: differences this extreme (or more) occur by chance with p = 0.05.]
So: whether a test is one-tailed or two-tailed is related to whether our hypothesis is directional or not:
Men and women eat different amounts of chocolate -> non-directional
Men eat more chocolate than women -> directional
Some statistical tests are run differently depending on whether the hypothesis is directional or non-directional
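Recent SciPy versions expose this choice directly via the `alternative` argument of `ttest_ind`; here is a sketch with made-up chocolate-consumption data (the group means and sizes are illustrative assumptions):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)

# Hypothetical chocolate consumption (grams/week); men sampled with a higher mean
women = rng.normal(loc=100, scale=30, size=40)
men = rng.normal(loc=130, scale=30, size=40)

# Non-directional hypothesis: the groups differ (two-tailed)
_, p_two = stats.ttest_ind(men, women, alternative="two-sided")

# Directional hypothesis: men eat MORE than women (one-tailed)
_, p_one = stats.ttest_ind(men, women, alternative="greater")

# When the observed difference lies in the predicted direction,
# the one-tailed p-value is half the two-tailed one
print(round(float(p_two), 4), round(float(p_one), 4))
```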