probability
DESCRIPTION
Probability. Principles of probability calculations. Probability values range from 0 to 1. The probabilities of all outcomes in the sample space sum to 1. The probability that an event A will not occur is 1 minus the probability of A.
![Page 1: Probability](https://reader036.vdocuments.us/reader036/viewer/2022062410/56815895550346895dc5f786/html5/thumbnails/1.jpg)
Probability
![Page 2: Probability](https://reader036.vdocuments.us/reader036/viewer/2022062410/56815895550346895dc5f786/html5/thumbnails/2.jpg)
Probability values range from 0 to 1.
The probabilities of all outcomes in the sample space sum to 1.
The probability that an event A will not occur is 1 minus the probability of A.
If two events are mutually exclusive, the probability that one or the other event occurs is the sum of their individual probabilities.
Principles of probability calculations
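These rules can be checked on a small example. The sketch below assumes a fair six-sided die; the die and all variable names are illustrative and not part of the slides.

```python
from fractions import Fraction

# Assumed example: a fair die, each face has probability 1/6.
p = {face: Fraction(1, 6) for face in range(1, 7)}

# All probabilities in the sample space sum to 1.
total = sum(p.values())

# Complement rule: P(not 6) = 1 - P(6).
p_not_six = 1 - p[6]

# Addition rule for mutually exclusive events: P(1 or 2) = P(1) + P(2).
p_one_or_two = p[1] + p[2]

print(total, p_not_six, p_one_or_two)  # 1 5/6 1/3
```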
![Page 3: Probability](https://reader036.vdocuments.us/reader036/viewer/2022062410/56815895550346895dc5f786/html5/thumbnails/3.jpg)
Simple probability
Sample space: {1, 2, 3, 4, 5, 6}
P(A) = 1/6 ≈ 0.1667
![Page 4: Probability](https://reader036.vdocuments.us/reader036/viewer/2022062410/56815895550346895dc5f786/html5/thumbnails/4.jpg)
Joint probability
P(A,B) = P(A) × P(B) (for independent events A and B)
P(5,6) = P(5) × P(6) = 0.1667 × 0.1667 ≈ 0.0278
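The product rule can be cross-checked by counting outcomes. This is a minimal sketch that reads P(5,6) as "a 5 on the first roll and a 6 on the second roll of a fair die"; that reading and the variable names are assumptions.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes of rolling a fair die twice.
outcomes = list(product(range(1, 7), repeat=2))

# Simple probability of one particular face on one roll.
p_single = Fraction(1, 6)

# Joint probability by counting: first roll is 5 and second roll is 6.
matches = sum(1 for outcome in outcomes if outcome == (5, 6))
p_joint_counted = Fraction(matches, len(outcomes))

# Joint probability by the product rule for independent events.
p_joint_product = p_single * p_single

print(p_joint_counted, p_joint_product)  # 1/36 1/36
```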
![Page 5: Probability](https://reader036.vdocuments.us/reader036/viewer/2022062410/56815895550346895dc5f786/html5/thumbnails/5.jpg)
Joint probability
(1) keep the dogs on the beach
VP → V NP PP
VP → V [NP PP]
![Page 6: Probability](https://reader036.vdocuments.us/reader036/viewer/2022062410/56815895550346895dc5f786/html5/thumbnails/6.jpg)
[VP [V keep] [NP the dogs] [PP on the beach]]
VP → V NP XP [.15]
keep: V NP XP [.81]
.15 × .81 ≈ .12
Conditional probability
![Page 7: Probability](https://reader036.vdocuments.us/reader036/viewer/2022062410/56815895550346895dc5f786/html5/thumbnails/7.jpg)
[VP [V keep] [NP [NP the dogs] [PP on the beach]]]
VP → V NP XP [.15]
keep: V NP [.19]
NP → NP PP [.14]
.19 × .39 × .14 ≈ .01
Conditional probability
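The two products on the slides above can be recomputed directly. The sketch below only multiplies the numbers that appear in the slides' own calculations; the list names are illustrative.

```python
from math import prod

# Factors copied from the calculations on the two slides.
first_parse = [0.15, 0.81]         # keep [the dogs] [on the beach]
second_parse = [0.19, 0.39, 0.14]  # keep [the dogs on the beach]

p_first = prod(first_parse)
p_second = prod(second_parse)

print(round(p_first, 2))   # 0.12
print(round(p_second, 2))  # 0.01
```

With these numbers the first analysis comes out as roughly twelve times more probable than the second.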
![Page 8: Probability](https://reader036.vdocuments.us/reader036/viewer/2022062410/56815895550346895dc5f786/html5/thumbnails/8.jpg)
Conditional probability
![Page 9: Probability](https://reader036.vdocuments.us/reader036/viewer/2022062410/56815895550346895dc5f786/html5/thumbnails/9.jpg)
Conditional probability
In a corpus including 12,000 nouns and 3,500 adjectives, 2,000 adjectives precede a noun. What is the probability that a noun is preceded by an adjective?
P(ADJ|N) = 2000 / 12000 ≈ 0.1667
![Page 10: Probability](https://reader036.vdocuments.us/reader036/viewer/2022062410/56815895550346895dc5f786/html5/thumbnails/10.jpg)
Conditional probability
What is the probability that an adjective is followed by a noun?
P(N|ADJ) = 2000 / 3500 ≈ 0.5714
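Both conditional probabilities are ratios of corpus counts, which the following sketch recomputes. The counts come from the slides; the variable names are illustrative.

```python
# Corpus counts from the slides.
n_nouns = 12_000
n_adjectives = 3_500
n_adj_before_noun = 2_000  # adjective immediately followed by a noun

# P(ADJ | N): share of nouns that are preceded by an adjective.
p_adj_given_n = n_adj_before_noun / n_nouns

# P(N | ADJ): share of adjectives that are followed by a noun.
p_n_given_adj = n_adj_before_noun / n_adjectives

print(round(p_adj_given_n, 4))  # 0.1667
print(round(p_n_given_adj, 4))  # 0.5714
```

The same joint count appears in both numerators; only the conditioning count in the denominator changes.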
![Page 11: Probability](https://reader036.vdocuments.us/reader036/viewer/2022062410/56815895550346895dc5f786/html5/thumbnails/11.jpg)
Probability distribution
![Page 12: Probability](https://reader036.vdocuments.us/reader036/viewer/2022062410/56815895550346895dc5f786/html5/thumbnails/12.jpg)
Discrete probability distribution
Continuous probability distribution
Types of probability distributions
![Page 13: Probability](https://reader036.vdocuments.us/reader036/viewer/2022062410/56815895550346895dc5f786/html5/thumbnails/13.jpg)
Binomial distribution
![Page 14: Probability](https://reader036.vdocuments.us/reader036/viewer/2022062410/56815895550346895dc5f786/html5/thumbnails/14.jpg)
Bernoulli trial:
two possible outcomes on each trial
the outcomes are independent of each other
the probability of each outcome is constant across trials
Binomial distribution
![Page 15: Probability](https://reader036.vdocuments.us/reader036/viewer/2022062410/56815895550346895dc5f786/html5/thumbnails/15.jpg)
One toss: H, T
Two tosses: HH, HT, TH, TT
Binomial distribution
![Page 16: Probability](https://reader036.vdocuments.us/reader036/viewer/2022062410/56815895550346895dc5f786/html5/thumbnails/16.jpg)
0 heads = TT
1 head = HT + TH
2 heads = HH
Binomial distribution
![Page 17: Probability](https://reader036.vdocuments.us/reader036/viewer/2022062410/56815895550346895dc5f786/html5/thumbnails/17.jpg)
Sample space → Random variable (number of heads)
HH → 2
HT → 1
TH → 1
TT → 0
Binomial distribution
![Page 18: Probability](https://reader036.vdocuments.us/reader036/viewer/2022062410/56815895550346895dc5f786/html5/thumbnails/18.jpg)
Cumulative outcome   Frequency   Probability
0                    1           0.25
1                    2           0.50
2                    1           0.25
Σ P(x) = 1
Binomial distribution
![Page 19: Probability](https://reader036.vdocuments.us/reader036/viewer/2022062410/56815895550346895dc5f786/html5/thumbnails/19.jpg)
One toss: H, T
Two tosses: HH, HT, TH, TT
Three tosses: HHH, HHT, HTH, HTT, THH, THT, TTH, TTT
![Page 20: Probability](https://reader036.vdocuments.us/reader036/viewer/2022062410/56815895550346895dc5f786/html5/thumbnails/20.jpg)
Sample space: HHH, HHT, HTH, THH, HTT, THT, TTH, TTT
Random variable: 0 heads, 1 head, 2 heads, 3 heads
0 heads: 1 / 8 = 0.125
1 head: 3 / 8 = 0.375
2 heads: 3 / 8 = 0.375
3 heads: 1 / 8 = 0.125
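The coin-toss tables can be reproduced by enumerating the sample space. In the sketch below, the function names are illustrative, and the closed-form binomial formula is not in the transcript; it is included only as a cross-check for a fair coin.

```python
from itertools import product
from math import comb

def heads_distribution(n_tosses):
    """Probability of each number of heads, by enumerating all outcomes."""
    space = list(product("HT", repeat=n_tosses))
    counts = [0] * (n_tosses + 1)
    for outcome in space:
        counts[outcome.count("H")] += 1
    return [count / len(space) for count in counts]

def binomial_pmf(k, n, p=0.5):
    """Closed-form probability of k successes in n Bernoulli trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

two_tosses = heads_distribution(2)
three_tosses = heads_distribution(3)

print(two_tosses)    # [0.25, 0.5, 0.25]
print(three_tosses)  # [0.125, 0.375, 0.375, 0.125]
```

Each list is indexed by the number of heads, so `three_tosses[1]` is the probability of exactly one head in three tosses.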
![Page 21: Probability](https://reader036.vdocuments.us/reader036/viewer/2022062410/56815895550346895dc5f786/html5/thumbnails/21.jpg)
Binomial distribution
![Page 22: Probability](https://reader036.vdocuments.us/reader036/viewer/2022062410/56815895550346895dc5f786/html5/thumbnails/22.jpg)
Poisson distribution
![Page 23: Probability](https://reader036.vdocuments.us/reader036/viewer/2022062410/56815895550346895dc5f786/html5/thumbnails/23.jpg)
Normal distribution
![Page 24: Probability](https://reader036.vdocuments.us/reader036/viewer/2022062410/56815895550346895dc5f786/html5/thumbnails/24.jpg)
The center of the curve represents the mean, median, and mode.
The curve is symmetrical around the mean.
The tails approach the x-axis but never touch it.
The curve is bell-shaped.
The total area under the curve is equal to 1 (by definition).
Normal distribution
![Page 25: Probability](https://reader036.vdocuments.us/reader036/viewer/2022062410/56815895550346895dc5f786/html5/thumbnails/25.jpg)
Normal distribution
![Page 26: Probability](https://reader036.vdocuments.us/reader036/viewer/2022062410/56815895550346895dc5f786/html5/thumbnails/26.jpg)
Standard normal distribution
1.96
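The value 1.96 is the z-score that bounds the central 95% of the standard normal distribution. A minimal check, assuming Python 3.8 or later for `statistics.NormalDist`; the variable names are illustrative.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0, sigma=1)

# Area under the curve between z = -1.96 and z = +1.96.
central_area = standard_normal.cdf(1.96) - standard_normal.cdf(-1.96)

# Going the other way: the z-score that leaves 2.5% in the upper tail.
z_for_95 = standard_normal.inv_cdf(0.975)

print(round(central_area, 3))  # 0.95
print(round(z_for_95, 2))      # 1.96
```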
![Page 27: Probability](https://reader036.vdocuments.us/reader036/viewer/2022062410/56815895550346895dc5f786/html5/thumbnails/27.jpg)
$$z = \frac{x_1 - \bar{x}}{SD}$$
z-scores
![Page 28: Probability](https://reader036.vdocuments.us/reader036/viewer/2022062410/56815895550346895dc5f786/html5/thumbnails/28.jpg)
z-scores
Two candidates took two different language tests. Candidate A scored 121 points, candidate B scored 177 points. In the first test (taken by candidate A), the mean was 92 and the standard deviation was 14; in the second test (taken by candidate B), the mean was 143 and the standard deviation was 21. Which of the two candidates performed better (relative to all other candidates)?
ZA = (121 – 92) / 14 = 2.07
ZB = (177 – 143) / 21 = 1.62
Candidate A has the higher z-score and therefore performed better relative to the other candidates.
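The comparison of the two candidates can be sketched as a small function. The function and variable names are illustrative; the numbers are those of the exercise.

```python
def z_score(value, mean, sd):
    """Distance of a value from the mean, in standard deviations."""
    return (value - mean) / sd

z_a = z_score(121, mean=92, sd=14)
z_b = z_score(177, mean=143, sd=21)

print(round(z_a, 2), round(z_b, 2))  # 2.07 1.62
```

Although candidate B has the higher raw score, candidate A lies further above the mean of their own test.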
![Page 29: Probability](https://reader036.vdocuments.us/reader036/viewer/2022062410/56815895550346895dc5f786/html5/thumbnails/29.jpg)
Central limit theorem
![Page 30: Probability](https://reader036.vdocuments.us/reader036/viewer/2022062410/56815895550346895dc5f786/html5/thumbnails/30.jpg)
Central limit theorem
6, 2, 5, 6, 2, 3, 1, 6, 1, 1, 4, 6, 6, 2, 2, 1, 1, 5, 1, 3
Mean = 64 / 20 = 3.2
![Page 35: Probability](https://reader036.vdocuments.us/reader036/viewer/2022062410/56815895550346895dc5f786/html5/thumbnails/35.jpg)
           X1   X2   X3   X4   M
Sample 1   6    2    5    6    4.75
Sample 2   2    3    1    6    3
Sample 3   1    1    4    6    3
Sample 4   6    2    2    1    2.75
Sample 5   1    5    1    3    2.5
Central limit theorem
![Page 36: Probability](https://reader036.vdocuments.us/reader036/viewer/2022062410/56815895550346895dc5f786/html5/thumbnails/36.jpg)
(4.75 + 3.0 + 3.0 + 2.75 + 2.5) / 5 = 3.2
Mean of sample means
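The sample means and their mean can be recomputed from the 20 values on the slides. A minimal sketch; the variable names are illustrative.

```python
from statistics import mean

# The 20 observations, split into five samples of four as on the slides.
samples = [
    [6, 2, 5, 6],
    [2, 3, 1, 6],
    [1, 1, 4, 6],
    [6, 2, 2, 1],
    [1, 5, 1, 3],
]

sample_means = [mean(sample) for sample in samples]
mean_of_means = mean(sample_means)

# Mean of all 20 observations taken together.
all_values = [value for sample in samples for value in sample]
grand_mean = mean(all_values)

print(mean_of_means, grand_mean)
```

Because all five samples have the same size, the mean of the sample means equals the mean of all 20 observations.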
![Page 37: Probability](https://reader036.vdocuments.us/reader036/viewer/2022062410/56815895550346895dc5f786/html5/thumbnails/37.jpg)
The sample means are approximately normally distributed (even if the phenomenon in the parent population is not normally distributed).
[Figure: histogram of sample means. x-axis: case (2.50 to 5.00); y-axis: frequency (0 to 12). Mean = 3.352, Std. Dev. = 0.44802, N = 25]
Central limit theorem
![Page 38: Probability](https://reader036.vdocuments.us/reader036/viewer/2022062410/56815895550346895dc5f786/html5/thumbnails/38.jpg)
The mean of the individual sample means approaches the mean of the true population.
The sample means are approximately normally distributed, even if the phenomenon under study is not normally distributed in the true population.
All parametric tests exploit the fact that the sample means are normally distributed (given a sufficient number of samples).
Central limit theorem
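A small simulation can illustrate these statements. The sketch below assumes die rolls as the parent population (a flat, non-normal distribution), 30 observations per sample, 2,000 samples, and a fixed random seed; all of these choices are illustrative and not taken from the slides.

```python
import random
from math import sqrt
from statistics import mean, stdev

rng = random.Random(0)  # fixed seed so the run is repeatable (arbitrary choice)

N_OBSERVATIONS = 30  # observations per sample (assumed)
N_SAMPLES = 2000     # number of samples drawn (assumed)

def draw_sample_mean():
    """Mean of one sample of fair die rolls."""
    rolls = [rng.randint(1, 6) for _ in range(N_OBSERVATIONS)]
    return mean(rolls)

sample_means = [draw_sample_mean() for _ in range(N_SAMPLES)]

# Population values for a fair die: mean 3.5, variance 35/12.
population_mean = 3.5
expected_spread = sqrt(35 / 12) / sqrt(N_OBSERVATIONS)

print(mean(sample_means), population_mean)
print(stdev(sample_means), expected_spread)
```

The mean of the sample means should land close to 3.5, and their spread close to the population standard deviation divided by the square root of the sample size.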
![Page 42: Probability](https://reader036.vdocuments.us/reader036/viewer/2022062410/56815895550346895dc5f786/html5/thumbnails/42.jpg)
population
sample
mean of this sample
distribution of many sample means
![Page 43: Probability](https://reader036.vdocuments.us/reader036/viewer/2022062410/56815895550346895dc5f786/html5/thumbnails/43.jpg)
How many samples do you need to assume that the sample means are normally distributed?
Are your data normally distributed?
![Page 44: Probability](https://reader036.vdocuments.us/reader036/viewer/2022062410/56815895550346895dc5f786/html5/thumbnails/44.jpg)
The distribution in the parent population (normal, slightly skewed, heavily skewed).
The number of observations in the individual sample.
The total number of individual samples.
Are your data normally distributed?
![Page 45: Probability](https://reader036.vdocuments.us/reader036/viewer/2022062410/56815895550346895dc5f786/html5/thumbnails/45.jpg)
Confidence intervals
![Page 46: Probability](https://reader036.vdocuments.us/reader036/viewer/2022062410/56815895550346895dc5f786/html5/thumbnails/46.jpg)
Confidence intervals indicate a range within which the mean (or other parameters) of the true population is located given the values of your sample and assuming a particular degree of certainty.
Confidence intervals
![Page 47: Probability](https://reader036.vdocuments.us/reader036/viewer/2022062410/56815895550346895dc5f786/html5/thumbnails/47.jpg)
The mean of the sample means
The SD of the sample means, i.e. the standard error
The degree of certainty with which you want to state the estimate
Confidence intervals
![Page 48: Probability](https://reader036.vdocuments.us/reader036/viewer/2022062410/56815895550346895dc5f786/html5/thumbnails/48.jpg)
$$SD = \sqrt{\frac{\sum (x_n - \bar{x})^2}{N - 1}}$$
Standard deviation
![Page 53: Probability](https://reader036.vdocuments.us/reader036/viewer/2022062410/56815895550346895dc5f786/html5/thumbnails/53.jpg)
Sample   Mean   Individual mean – mean of means   Squared
1        1.5    1.5 – 1.66 = –0.16                0.0256
2        1.8    1.8 – 1.66 = 0.14                 0.0196
3        1.3    1.3 – 1.66 = –0.36                0.1296
4        2.0    2.0 – 1.66 = 0.34                 0.1156
5        1.7    1.7 – 1.66 = 0.04                 0.0016
Mean of means: 8.3 / 5 = 1.66
Sum of squared differences: 0.292
Standard error
![Page 54: Probability](https://reader036.vdocuments.us/reader036/viewer/2022062410/56815895550346895dc5f786/html5/thumbnails/54.jpg)
$$\sqrt{\frac{0.292}{5 - 1}} \approx 0.270$$
Standard error
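The standard error worked out on these slides is the standard deviation of the five sample means, with N – 1 in the denominator. A minimal sketch using the slide values; the variable names are illustrative.

```python
from statistics import mean, stdev

# The five sample means from the slides.
sample_means = [1.5, 1.8, 1.3, 2.0, 1.7]

mean_of_means = mean(sample_means)

# Sum of squared differences from the mean of means.
sum_of_squares = sum((m - mean_of_means) ** 2 for m in sample_means)

# stdev() divides by N - 1, matching the formula on the slides.
standard_error = stdev(sample_means)

print(round(mean_of_means, 2))   # 1.66
print(round(sum_of_squares, 3))  # 0.292
print(round(standard_error, 2))  # 0.27
```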
![Page 55: Probability](https://reader036.vdocuments.us/reader036/viewer/2022062410/56815895550346895dc5f786/html5/thumbnails/55.jpg)
[degree of certainty] × [standard error] = x
[sample mean] ± x = confidence interval
Confidence intervals
![Page 56: Probability](https://reader036.vdocuments.us/reader036/viewer/2022062410/56815895550346895dc5f786/html5/thumbnails/56.jpg)
95% degree of certainty = 1.96 [z-score]
Confidence interval of the first sample (mean = 1.5):
1.96 × 0.2701 ≈ 0.53
1.5 ± 0.53 = 0.97–2.03
We can be 95% certain that the population mean is located in the range between 0.97 and 2.03.
Confidence intervals
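The interval for the first sample can be recomputed as follows. The function name and its default argument are illustrative; the inputs 1.5, 0.2701 and 1.96 are the values given on the slides.

```python
def confidence_interval(center, standard_error, z=1.96):
    """Return (lower, upper) bounds: center plus/minus z standard errors."""
    margin = z * standard_error
    return center - margin, center + margin

lower, upper = confidence_interval(1.5, 0.2701)

print(round(lower, 2), round(upper, 2))  # 0.97 2.03
```

A higher degree of certainty means a larger z-score and therefore a wider interval.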
![Page 57: Probability](https://reader036.vdocuments.us/reader036/viewer/2022062410/56815895550346895dc5f786/html5/thumbnails/57.jpg)
$$SE = \frac{SD}{\sqrt{N}}$$
Confidence intervals
![Page 58: Probability](https://reader036.vdocuments.us/reader036/viewer/2022062410/56815895550346895dc5f786/html5/thumbnails/58.jpg)
What is the 95% confidence interval of the following sample: 2, 5, 6, 7, 10, 12?
Mean: 7
$$SD = \sqrt{\frac{(2-7)^2 + (5-7)^2 + (6-7)^2 + (7-7)^2 + (10-7)^2 + (12-7)^2}{6 - 1}} = 3.58$$
Standard error: 3.58 / √6 = 1.46
Confidence interval: 1.46 × 1.96 = 2.86
7 ± 2.86 = 4.14–9.86
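The whole exercise can be sketched in a few lines. The code follows the steps on the slide (sample SD with N – 1, standard error as SD divided by the square root of N, z = 1.96); the variable names are illustrative.

```python
from math import sqrt
from statistics import mean, stdev

data = [2, 5, 6, 7, 10, 12]

sample_mean = mean(data)
sd = stdev(data)                        # sample SD, N - 1 in the denominator
standard_error = sd / sqrt(len(data))   # SD divided by the square root of N
margin = 1.96 * standard_error          # 95% degree of certainty

lower = sample_mean - margin
upper = sample_mean + margin

print(round(sd, 2), round(standard_error, 2))  # 3.58 1.46
print(round(lower, 2), round(upper, 2))        # 4.14 9.86
```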