Parametric statistics
![Page 1: Parametric statistics](https://reader035.vdocuments.us/reader035/viewer/2022062221/56813fd1550346895daab341/html5/thumbnails/1.jpg)
Parametric statistics
922
![Page 2: Parametric statistics](https://reader035.vdocuments.us/reader035/viewer/2022062221/56813fd1550346895daab341/html5/thumbnails/2.jpg)
Outline
Measuring the accuracy of the mean
Practical notes for practice
Inferential statistics
t-test
ANOVA
![Page 3: Parametric statistics](https://reader035.vdocuments.us/reader035/viewer/2022062221/56813fd1550346895daab341/html5/thumbnails/3.jpg)
Measuring the accuracy of the mean
The mean is the simplest statistical model that we use.
This statistic predicts the likely score of a person.
The mean is a summary statistic.
![Page 4: Parametric statistics](https://reader035.vdocuments.us/reader035/viewer/2022062221/56813fd1550346895daab341/html5/thumbnails/4.jpg)
Measuring the accuracy of the mean
The model we choose (mean / median / mode) should represent the state of the real world.
Does the model represent the world precisely?
The mean is a perfect representation only if all the scores we collect are the same as the mean.
![Page 5: Parametric statistics](https://reader035.vdocuments.us/reader035/viewer/2022062221/56813fd1550346895daab341/html5/thumbnails/5.jpg)
Mean
When the mean is a perfect fit: there is no difference between the mean and each data point
| Child | Score |
|---|---|
| 1 | 10 |
| 2 | 10 |
| 3 | 10 |
| 4 | 10 |
| 5 | 10 |
| 6 | 10 |
| 7 | 10 |
| 8 | 10 |
| Mean | 80/8 = 10 |
![Page 6: Parametric statistics](https://reader035.vdocuments.us/reader035/viewer/2022062221/56813fd1550346895daab341/html5/thumbnails/6.jpg)
Mean
Usually, there are differences between the mean and the raw scores.
If the mean is representative of the data, these differences are small.
| Child | Score |
|---|---|
| 1 | 10 |
| 2 | 9 |
| 3 | 8 |
| 4 | 12 |
| 5 | 8 |
| 6 | 11 |
| 7 | 10 |
| 8 | 12 |
| Mean | 80/8 = 10 |
![Page 7: Parametric statistics](https://reader035.vdocuments.us/reader035/viewer/2022062221/56813fd1550346895daab341/html5/thumbnails/7.jpg)
Deviation
The difference between the model's prediction (the mean) and each raw score is the deviation
| Child | Score | Mean | Deviation |
|---|---|---|---|
| 1 | 10 | 10 | |
| 2 | 9 | 10 | |
| 3 | 8 | 10 | |
| 4 | 12 | 10 | |
| 5 | 8 | 10 | |
| 6 | 11 | 10 | |
| 7 | 10 | 10 | |
| 8 | 12 | 10 | |
| Mean | | 10 | |
![Page 8: Parametric statistics](https://reader035.vdocuments.us/reader035/viewer/2022062221/56813fd1550346895daab341/html5/thumbnails/8.jpg)
Deviation
Compute the deviation of each score from the mean
Measure the overall deviation (sum)
| Child | Score | Mean | Deviation |
|---|---|---|---|
| 1 | 10 | 10 | 0 |
| 2 | 9 | 10 | 1 |
| 3 | 8 | 10 | 2 |
| 4 | 12 | 10 | -2 |
| 5 | 8 | 10 | 2 |
| 6 | 11 | 10 | -1 |
| 7 | 10 | 10 | 0 |
| 8 | 12 | 10 | -2 |
| Mean | | 10 | Sum: 0 |
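The deviation table above can be reproduced with a short script (a minimal sketch in plain Python, using the eight scores from the slide):

```python
scores = [10, 9, 8, 12, 8, 11, 10, 12]

mean = sum(scores) / len(scores)            # 80 / 8 = 10
deviations = [mean - s for s in scores]     # deviation of each score from the mean

print(deviations)       # [0.0, 1.0, 2.0, -2.0, 2.0, -1.0, 0.0, -2.0]
print(sum(deviations))  # the deviations always sum to 0
```

This is why the raw sum of deviations cannot measure accuracy: the positive and negative deviations cancel out, which motivates squaring them on the next slide.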
![Page 9: Parametric statistics](https://reader035.vdocuments.us/reader035/viewer/2022062221/56813fd1550346895daab341/html5/thumbnails/9.jpg)
Deviation
| Raw score | Mean | Deviation | Squared dev. |
|---|---|---|---|
| 10 | 10 | 0 | 0 |
| 9 | 10 | 1 | 1 |
| 8 | 10 | 2 | 4 |
| 12 | 10 | -2 | 4 |
| 8 | 10 | 2 | 4 |
| 11 | 10 | -1 | 1 |
| 10 | 10 | 0 | 0 |
| 12 | 10 | -2 | 4 |
| Sum | | 0 | 18 |
![Page 10: Parametric statistics](https://reader035.vdocuments.us/reader035/viewer/2022062221/56813fd1550346895daab341/html5/thumbnails/10.jpg)
Deviation
| Deviation | Squared dev. |
|---|---|
| 0 | 0 |
| 1 | 1 |
| 2 | 4 |
| -2 | 4 |
| 2 | 4 |
| -1 | 1 |
| 0 | 0 |
| -2 | 4 |
| Sum: 0 | Sum: 18 |

The sum of squared deviations (also called the sum of squared errors) is a good measure of the accuracy of the mean, except that it grows as the number of scores grows.
![Page 11: Parametric statistics](https://reader035.vdocuments.us/reader035/viewer/2022062221/56813fd1550346895daab341/html5/thumbnails/11.jpg)
Variance
Divide the sum of squared deviations by the number of scores minus 1. This gives the variance.
We can compare variance across samples. The square root of the variance is the standard deviation.

| Sum of squared deviations | Number of scores (N) | N − 1 | Variance | Standard deviation |
|---|---|---|---|---|
| 18 | 8 | 7 | 18/7 = 2.57 | √2.57 ≈ 1.6 |
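The same computation can be sketched in Python (using the eight scores from the running example):

```python
import math

scores = [10, 9, 8, 12, 8, 11, 10, 12]
n = len(scores)
mean = sum(scores) / n

ss = sum((s - mean) ** 2 for s in scores)  # sum of squared deviations: 18
variance = ss / (n - 1)                    # divide by N - 1: 18 / 7 ≈ 2.57
sd = math.sqrt(variance)                   # standard deviation ≈ 1.6
```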
![Page 12: Parametric statistics](https://reader035.vdocuments.us/reader035/viewer/2022062221/56813fd1550346895daab341/html5/thumbnails/12.jpg)
Accuracy of the mean
The sum of squared deviations (sum of squared errors), the variance, and the standard deviation all measure the same thing: the variability of the data
The standard deviation (SD) measures how well the mean represents the data: a small SD indicates that the data points are close to the mean
![Page 13: Parametric statistics](https://reader035.vdocuments.us/reader035/viewer/2022062221/56813fd1550346895daab341/html5/thumbnails/13.jpg)
Standard Deviation (SD)
Small SD: data points are close to the mean

| | Mean | SD |
|---|---|---|
| Sentence 1 | 7 | 0.2 |
| Sentence 2 | 5 | 0.5 |
| Sentence 3 | 2 | 0.1 |
| Sentence 4 | 2 | 0.5 |
| Sentence 5 | 5 | 0.2 |
| Sentence 6 | 6 | 0.2 |
| Sentence 7 | 4 | 0.1 |
| Sentence 8 | 6 | 0.2 |
![Page 14: Parametric statistics](https://reader035.vdocuments.us/reader035/viewer/2022062221/56813fd1550346895daab341/html5/thumbnails/14.jpg)
Standard Deviation (SD)
Large SD: data points are far from the mean

| | Mean | SD |
|---|---|---|
| Sentence 1 | 7 | 1.5 |
| Sentence 2 | 5 | 2.5 |
| Sentence 3 | 2 | 1.5 |
| Sentence 4 | 2 | 2.5 |
| Sentence 5 | 5 | 3 |
| Sentence 6 | 6 | 1.5 |
| Sentence 7 | 4 | 2.5 |
| Sentence 8 | 6 | 3.5 |
![Page 15: Parametric statistics](https://reader035.vdocuments.us/reader035/viewer/2022062221/56813fd1550346895daab341/html5/thumbnails/15.jpg)
Why use number of scores minus 1?
We are using a sample to estimate the variance in the population
Population?
![Page 16: Parametric statistics](https://reader035.vdocuments.us/reader035/viewer/2022062221/56813fd1550346895daab341/html5/thumbnails/16.jpg)
Sample-population
The intended population of psycholinguistic research can be all people / all children aged 3 / etc.
Actually, we collect data only from a sample of the population we are interested in.
We use the sample to make a guess about the linguistic behavior of the relevant population.
![Page 17: Parametric statistics](https://reader035.vdocuments.us/reader035/viewer/2022062221/56813fd1550346895daab341/html5/thumbnails/17.jpg)
Sample - population
Size of the sample
The mean as a model is resistant to sampling variation: different samples from the same population usually have a similar mean.
![Page 18: Parametric statistics](https://reader035.vdocuments.us/reader035/viewer/2022062221/56813fd1550346895daab341/html5/thumbnails/18.jpg)
Why use number of scores minus 1?
We are using a sample to estimate the variance in the population
Population?
![Page 19: Parametric statistics](https://reader035.vdocuments.us/reader035/viewer/2022062221/56813fd1550346895daab341/html5/thumbnails/19.jpg)
Why use number of scores minus 1?
We are using a sample to estimate the variance in the population
Variance in the sample: observations can vary, e.g. (5, 6, 2, 9, 3), mean = 5
But if we assume that the sample mean is the same as the population mean (mean = 5)...
![Page 20: Parametric statistics](https://reader035.vdocuments.us/reader035/viewer/2022062221/56813fd1550346895daab341/html5/thumbnails/20.jpg)
Why use number of scores minus 1?
For the next sample, not all observations are free to vary.
For a sample of (5, 7, 1, 8, ?), once we assume that the mean is 5, the last observation is no longer free: ? = 4.
![Page 21: Parametric statistics](https://reader035.vdocuments.us/reader035/viewer/2022062221/56813fd1550346895daab341/html5/thumbnails/21.jpg)
Why use number of scores minus 1?
This does not mean we fix the value of the observation; it means that for various statistics we have to count the number of observations that are free to vary.
This number is called the degrees of freedom, and here it is one less than the sample size (N − 1).
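The idea can be illustrated with a tiny sketch: once the mean of a five-score sample is taken as fixed, only four observations are free to vary and the fifth is determined.

```python
free_scores = [5, 7, 1, 8]   # four observations free to vary
assumed_mean = 5
n = 5                        # intended sample size

# The last observation is forced by the assumed mean:
last = assumed_mean * n - sum(free_scores)
print(last)  # 4
```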
![Page 22: Parametric statistics](https://reader035.vdocuments.us/reader035/viewer/2022062221/56813fd1550346895daab341/html5/thumbnails/22.jpg)
To summarize
The mean represents the sample
The sample represents the population
![Page 23: Parametric statistics](https://reader035.vdocuments.us/reader035/viewer/2022062221/56813fd1550346895daab341/html5/thumbnails/23.jpg)
Many samples - population
Theoretically, if we take several samples from the same population, each sample will have its own mean and SD.
If the samples are taken from the same population, they are expected to be reasonably similar.
![Page 24: Parametric statistics](https://reader035.vdocuments.us/reader035/viewer/2022062221/56813fd1550346895daab341/html5/thumbnails/24.jpg)
Many samples
| | Mean |
|---|---|
| Population | 10 |
| Sample 1 | 9 |
| Sample 2 | 11 |
| Sample 3 | 10 |
| Sample 4 | 12 |
| Sample 5 | 9 |
| Sample 6 | 10 |
| Sample 7 | 11 |
| Sample 8 | 8 |

| Mean | Frequency |
|---|---|
| 8 | 1 |
| 9 | 2 |
| 10 | 3 |
| 11 | 2 |
| 12 | 1 |
![Page 25: Parametric statistics](https://reader035.vdocuments.us/reader035/viewer/2022062221/56813fd1550346895daab341/html5/thumbnails/25.jpg)
Sampling distribution
| Mean | Frequency |
|---|---|
| 8 | 1 |
| 9 | 2 |
| 10 | 3 |
| 11 | 2 |
| 12 | 1 |

The average of all the sample means gives the value of the population mean.
![Page 26: Parametric statistics](https://reader035.vdocuments.us/reader035/viewer/2022062221/56813fd1550346895daab341/html5/thumbnails/26.jpg)
Sampling distribution
How accurate is a sample likely to be?
Calculate the SD of the sampling distribution
This is called the standard error of the mean (SE)
![Page 27: Parametric statistics](https://reader035.vdocuments.us/reader035/viewer/2022062221/56813fd1550346895daab341/html5/thumbnails/27.jpg)
Standard Error (SE)
We do not collect many samples; instead, we compute the SE from a single sample:

SE = SD / √N

| | Mean | SD |
|---|---|---|
| Sentence 1 | 7 | 0.2 |
| Sentence 2 | 5 | 0.5 |
| Sentence 3 | 2 | 0.1 |
| Sentence 4 | 2 | 0.5 |
| Sentence 5 | 5 | 0.2 |
| Sentence 6 | 6 | 0.2 |
| Sentence 7 | 4 | 0.1 |
| Sentence 8 | 6 | 0.2 |
![Page 28: Parametric statistics](https://reader035.vdocuments.us/reader035/viewer/2022062221/56813fd1550346895daab341/html5/thumbnails/28.jpg)
Standard Error (SE)
| Mean | SD | SE |
|---|---|---|
| 7 | 0.2 | 0.07 |
| 5 | 0.5 | 0.18 |
| 2 | 0.1 | 0.04 |
| 2 | 0.5 | 0.18 |
| 5 | 0.2 | 0.07 |
| 6 | 0.2 | 0.07 |
| 4 | 0.1 | 0.04 |
| 6 | 0.2 | 0.07 |
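For instance, the first row of the table (SD = 0.2) gives SE ≈ 0.07, assuming each mean is based on N = 8 scores as in the running example:

```python
import math

sd = 0.2
n = 8                   # number of scores behind the mean (assumed here)
se = sd / math.sqrt(n)  # standard error of the mean
print(round(se, 2))     # 0.07
```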
![Page 29: Parametric statistics](https://reader035.vdocuments.us/reader035/viewer/2022062221/56813fd1550346895daab341/html5/thumbnails/29.jpg)
Why are the samples different?
Sources of the variance:
A different population, or
Sampling error (a random effect, which can be calculated)

Can we take results from the sample to make generalizations about the population?
![Page 30: Parametric statistics](https://reader035.vdocuments.us/reader035/viewer/2022062221/56813fd1550346895daab341/html5/thumbnails/30.jpg)
Accuracy of sample means
Calculate the boundaries within which most sample means will fall.
Looking at the means of 100 samples, the lowest mean is 2 and the highest mean is 7.
The mean of any additional sample is likely to fall within these limits.
![Page 31: Parametric statistics](https://reader035.vdocuments.us/reader035/viewer/2022062221/56813fd1550346895daab341/html5/thumbnails/31.jpg)
Confidence Interval
The limits within which a certain percentage (typically 95%) of sample means will fall.
If we collect 100 samples, about 95 of them will have a mean within the confidence interval.
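A common sketch of a 95% confidence interval uses the normal approximation, mean ± 1.96 × SE (shown here with the eight-score sample from earlier; for samples this small a t-based critical value would be more appropriate):

```python
import math

scores = [10, 9, 8, 12, 8, 11, 10, 12]
n = len(scores)
mean = sum(scores) / n
sd = math.sqrt(sum((s - mean) ** 2 for s in scores) / (n - 1))
se = sd / math.sqrt(n)

# 95% confidence interval under the normal approximation
lower, upper = mean - 1.96 * se, mean + 1.96 * se
```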
![Page 32: Parametric statistics](https://reader035.vdocuments.us/reader035/viewer/2022062221/56813fd1550346895daab341/html5/thumbnails/32.jpg)
PRACTICE
![Page 33: Parametric statistics](https://reader035.vdocuments.us/reader035/viewer/2022062221/56813fd1550346895daab341/html5/thumbnails/33.jpg)
Experiment: compare children with specific language impairment (SLI) and children who are typically developing (TD).
Hypothesis: an effect of word order, SVO vs. VSO
![Page 34: Parametric statistics](https://reader035.vdocuments.us/reader035/viewer/2022062221/56813fd1550346895daab341/html5/thumbnails/34.jpg)
Task : repeat a sentence
30 children, each presented with 10 sentences (5 SVO, 5 VSO). Each repetition is scored as correct or not:

| Condition | Correct? |
|---|---|
| SVO | yes / no |
| VSO | yes / no |
![Page 35: Parametric statistics](https://reader035.vdocuments.us/reader035/viewer/2022062221/56813fd1550346895daab341/html5/thumbnails/35.jpg)
| Child | SLI | Age | Gender | SVO1 | SVO2 | VSO1 | VSO2 |
|---|---|---|---|---|---|---|---|
| Child 1 | 1 | 4;02 | 1 | 1 | 0 | 1 | 0 |
| Child 2 | 0 | 3;04 | 1 | 0 | 1 | 1 | 0 |
| Child 3 | 1 | 4;06 | 1 | 0 | 0 | 1 | 0 |
| Child 4 | 0 | 3;11 | 0 | 1 | 1 | 1 | 0 |

Compute: Mean? Frequency?
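One way to compute the requested means (a sketch in plain Python; the rows are transcribed from the four children shown above, coded 1 = correct, 0 = incorrect):

```python
# (child, SLI, SVO1, SVO2, VSO1, VSO2); gender and age omitted for brevity
rows = [
    ("Child 1", 1, 1, 0, 1, 0),
    ("Child 2", 0, 0, 1, 1, 0),
    ("Child 3", 1, 0, 0, 1, 0),
    ("Child 4", 0, 1, 1, 1, 0),
]

# Pool the scores by group (SLI vs. TD) and condition (SVO vs. VSO)
groups = {}
for child, sli, svo1, svo2, vso1, vso2 in rows:
    g = "SLI" if sli == 1 else "TD"
    groups.setdefault((g, "SVO"), []).extend([svo1, svo2])
    groups.setdefault((g, "VSO"), []).extend([vso1, vso2])

means = {k: sum(v) / len(v) for k, v in groups.items()}
print(means[("SLI", "SVO")])  # 0.25
print(means[("TD", "SVO")])   # 0.75
```

These per-group, per-condition means are exactly the values needed to fill the summary table on the next slide.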
![Page 36: Parametric statistics](https://reader035.vdocuments.us/reader035/viewer/2022062221/56813fd1550346895daab341/html5/thumbnails/36.jpg)
| Group | Condition | Mean | SD |
|---|---|---|---|
| SLI | SVO | | |
| SLI | VSO | | |
| TD | SVO | | |
| TD | VSO | | |
![Page 37: Parametric statistics](https://reader035.vdocuments.us/reader035/viewer/2022062221/56813fd1550346895daab341/html5/thumbnails/37.jpg)
Basic analysis with Excel
Descriptive statistics: Sum, Average, Percentage
Drawing graphs
Parametric statistics: Mean, Standard Deviation, t-test
Smart sheets: COUNTIF
![Page 38: Parametric statistics](https://reader035.vdocuments.us/reader035/viewer/2022062221/56813fd1550346895daab341/html5/thumbnails/38.jpg)
INFERENTIAL STATISTICS
![Page 39: Parametric statistics](https://reader035.vdocuments.us/reader035/viewer/2022062221/56813fd1550346895daab341/html5/thumbnails/39.jpg)
Statistical hypothesis
Hypothesis for the effect of a linguistic phenomenon
Findings from a sample
Do the findings support the hypothesis? Do they show a linguistic effect?
To answer this, we consider a null hypothesis.
![Page 40: Parametric statistics](https://reader035.vdocuments.us/reader035/viewer/2022062221/56813fd1550346895daab341/html5/thumbnails/40.jpg)
The null hypothesis (H0)
H0 = the experiment has no effect
The purpose of statistical inference is to reject this hypothesis
H1 = the mean of the population affected by the experiment is different from that of the general population
Rejecting the null hypothesis
Compare the mean of the sample to two populations (under H1 or under H0).
We cannot show that the sample belongs to the population under H1.
All we can do is compare the sample to the population under H0 and consider the likelihood that it belongs to it.
![Page 42: Parametric statistics](https://reader035.vdocuments.us/reader035/viewer/2022062221/56813fd1550346895daab341/html5/thumbnails/42.jpg)
![Page 43: Parametric statistics](https://reader035.vdocuments.us/reader035/viewer/2022062221/56813fd1550346895daab341/html5/thumbnails/43.jpg)
Rejecting the null hypothesis
Check if our sample belongs to the population under H1 or under H0
Consider the confidence interval and SE
Compare means
Compare variances
![Page 44: Parametric statistics](https://reader035.vdocuments.us/reader035/viewer/2022062221/56813fd1550346895daab341/html5/thumbnails/44.jpg)
Level of significance (alpha)
Is the difference between the sample and the population big enough to reject H0?
Determine a critical value (alpha) as the criterion for including the sample in the population
Typically alpha = 0.05: reject H0 when p < 0.05
![Page 45: Parametric statistics](https://reader035.vdocuments.us/reader035/viewer/2022062221/56813fd1550346895daab341/html5/thumbnails/45.jpg)
![Page 46: Parametric statistics](https://reader035.vdocuments.us/reader035/viewer/2022062221/56813fd1550346895daab341/html5/thumbnails/46.jpg)
![Page 47: Parametric statistics](https://reader035.vdocuments.us/reader035/viewer/2022062221/56813fd1550346895daab341/html5/thumbnails/47.jpg)
![Page 48: Parametric statistics](https://reader035.vdocuments.us/reader035/viewer/2022062221/56813fd1550346895daab341/html5/thumbnails/48.jpg)
Parametric statistics
Variables are on at least an interval scale
Compute means of raw scores (several items per condition)
t-tests, ANOVA, ANCOVA
![Page 49: Parametric statistics](https://reader035.vdocuments.us/reader035/viewer/2022062221/56813fd1550346895daab341/html5/thumbnails/49.jpg)
t-Tests
t-tests are used to compare two samples and decide whether they are significantly different or not.
The t-test represents the difference between the means of the two samples, taking into consideration the degree to which these means could differ by chance.
![Page 50: Parametric statistics](https://reader035.vdocuments.us/reader035/viewer/2022062221/56813fd1550346895daab341/html5/thumbnails/50.jpg)
t-Tests
The degree to which the means could differ by chance is the standard error (SE).
We do not calculate the t-value by hand, but we use it to determine the effect of the experiment on the sample.
How do we know if the t-value is significant (p<0.05)?
![Page 51: Parametric statistics](https://reader035.vdocuments.us/reader035/viewer/2022062221/56813fd1550346895daab341/html5/thumbnails/51.jpg)
t-Tests
Every sample belongs to a different t-curve, depending on the degrees of freedom (df = N − 1).
Check the table of critical values for Student's t-distribution, which is organized by df. We mark the df on t:
t(32) = 1.15
![Page 52: Parametric statistics](https://reader035.vdocuments.us/reader035/viewer/2022062221/56813fd1550346895daab341/html5/thumbnails/52.jpg)
Types of t-tests:
Matched/Paired/Dependent t-test - compares two sets of scores from the same sample, or scores from matched samples.
Independent (two sample) t-test - compares two different samples (on the same test). Samples can be of equal or unequal size, and of equal or unequal variance. The df for an independent t-test is Nx − 1 + Ny − 1.
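The independent t-statistic described above can be sketched by hand (a minimal implementation assuming equal variances; the scores are invented proportion-correct values, not data from the slides):

```python
import math

def independent_t(x, y):
    """Two-sample t-statistic, equal variances assumed; df = Nx - 1 + Ny - 1."""
    nx, ny = len(x), len(y)
    mx, my = sum(x) / nx, sum(y) / ny
    ssx = sum((v - mx) ** 2 for v in x)  # sum of squared deviations, sample x
    ssy = sum((v - my) ** 2 for v in y)  # sum of squared deviations, sample y
    pooled_var = (ssx + ssy) / (nx - 1 + ny - 1)
    se = math.sqrt(pooled_var * (1 / nx + 1 / ny))  # SE of the mean difference
    return (mx - my) / se

svo = [0.9, 0.8, 1.0, 0.7, 0.9]  # hypothetical group scores
vso = [0.5, 0.4, 0.6, 0.5, 0.3]
t = independent_t(svo, vso)      # compare against the t-table with df = 8
```

The returned value is then checked against the critical t-value for the relevant df, exactly as described for t(32) above.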
![Page 53: Parametric statistics](https://reader035.vdocuments.us/reader035/viewer/2022062221/56813fd1550346895daab341/html5/thumbnails/53.jpg)
ANOVA - Comparing means of more than two samples
ANOVA (Analysis of Variance) considers within-group variability as well as random and non-random between-group variability.
The type of ANOVA depends on the research design - the number of independent and dependent variables.
![Page 54: Parametric statistics](https://reader035.vdocuments.us/reader035/viewer/2022062221/56813fd1550346895daab341/html5/thumbnails/54.jpg)
One-way ANOVA - one independent variable with more than two values, one dependent variable
Two-way independent ANOVA - two independent variables, with different participants in all groups (each person contributes one score)
Two-way repeated-measures ANOVA - everything comes from the same participants
Two-way mixed ANOVA - one independent variable is tested on the same participants, the other on different participants
![Page 55: Parametric statistics](https://reader035.vdocuments.us/reader035/viewer/2022062221/56813fd1550346895daab341/html5/thumbnails/55.jpg)
The F-score is the result of dividing the between-group variance (which takes into consideration random and non-random variance) by the within-group variance.
For every F-score we can determine the significance based on the dfs (between groups and within groups).
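A one-way F-score can be computed directly from this definition (a sketch with made-up groups, not data from the slides):

```python
def one_way_f(groups):
    """F = between-group variance / within-group variance (one-way ANOVA)."""
    scores = [s for g in groups for s in g]
    grand_mean = sum(scores) / len(scores)
    group_means = [sum(g) / len(g) for g in groups]

    # Between-group and within-group sums of squares
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    ss_within = sum(sum((s - m) ** 2 for s in g)
                    for g, m in zip(groups, group_means))

    df_between = len(groups) - 1
    df_within = len(scores) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)

f = one_way_f([[1, 2, 3], [2, 3, 4], [5, 6, 7]])  # 13.0
```

The two dfs used here (between groups and within groups) are the ones looked up in the F-table to decide significance.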
![Page 56: Parametric statistics](https://reader035.vdocuments.us/reader035/viewer/2022062221/56813fd1550346895daab341/html5/thumbnails/56.jpg)
Post hoc comparisons
Post-hoc comparisons are used to find out where the significant differences come from.
Tukey test - used when the sample sizes are the same.
Scheffé test - used with unequal sample sizes, but can also be used with equal sample sizes.
Bonferroni correction - when there are multiple comparisons, the level of significance is divided by the number of tests to avoid family-wise errors.
These tests can also be used to test unplanned comparisons.
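The Bonferroni correction is simple enough to show directly (e.g. three pairwise comparisons at an overall alpha of 0.05):

```python
alpha = 0.05
n_tests = 3                        # number of pairwise comparisons
corrected_alpha = alpha / n_tests  # each test must now reach p < 0.0167
print(round(corrected_alpha, 4))   # 0.0167
```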
![Page 57: Parametric statistics](https://reader035.vdocuments.us/reader035/viewer/2022062221/56813fd1550346895daab341/html5/thumbnails/57.jpg)
ANCOVA - Analysis of covariance
Allows introducing covariates (factors other than the experimental design which might influence the results) into the ANOVA.