random variables and probability distributions modified from a presentation by carlos j....
TRANSCRIPT
![Page 1: Random Variables and Probability Distributions Modified from a presentation by Carlos J. Rosas-Anderson](https://reader031.vdocuments.us/reader031/viewer/2022032802/56649de85503460f94ae2420/html5/thumbnails/1.jpg)
Random Variables and Probability Distributions
Modified from a presentation by Carlos J. Rosas-Anderson
Fundamentals of Probability
The probability P that an outcome occurs is:
$$P = \frac{\text{number of outcomes}}{\text{number of trials}}$$
The sample space is the set of all possible outcomes of an event. Example: Visit = {(Capture), (Escape)}
Axioms of Probability
1. The sum of all the probabilities of outcomes within a single sample space equals one:
$$\sum_{i=1}^{n} P(A_i) = 1.0$$
2. The probability of a complex event equals the sum of the probabilities of the outcomes making up the event:
$$P(A \text{ or } B) = P(A) + P(B)$$
3. The probability of 2 independent events equals the product of their individual probabilities:
$$P(A \text{ and } B) = P(A) \times P(B)$$
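The three axioms can be checked on a small sample space. The sketch below assumes a fair six-sided die, which is an illustrative choice and not an example from the slides, and uses exact fractions so the sums come out without rounding.

```python
from fractions import Fraction

# Assumed example: sample space for one roll of a fair six-sided die.
die = {face: Fraction(1, 6) for face in range(1, 7)}

# Axiom 1: probabilities over the whole sample space sum to one.
total = sum(die.values())

# Axiom 2: a complex event ("1 or 2") is the sum of the outcomes making it up.
p_one_or_two = die[1] + die[2]

# Axiom 3: independent events multiply, e.g. a 1 on each of two separate dice.
p_double_one = die[1] * die[1]

print(total, p_one_or_two, p_double_one)
```

Note that the addition rule is applied here to outcomes that cannot occur together on a single roll.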
Probability distributions
We use probability distributions because they fit many types of data in the living world
[Figure: histogram of Ht (cm), 1996; Mean = 35.3, Std. Dev = 14.76, N = 713]
Ex. Height (cm) of Hypericum cumulicola at Archbold Biological Station
Probability distributions
Most people are familiar with the Normal Distribution, BUT…
…many variables relevant to biological and ecological studies are not normally distributed! For example, many variables are discrete (presence/absence, # of seeds or offspring, # of prey consumed, etc.)
Because normal distributions apply only to continuous variables, we need other types of distributions to model discrete variables.
Random variable
The mathematical rule (or function) that assigns a given numerical value to each possible outcome of an experiment in the sample space of interest.
2 Types: discrete random variables and continuous random variables
The Binomial Distribution: Bernoulli Random Variables
Imagine a simple trial with only two possible outcomes: Success (S) or Failure (F)
Examples: toss of a coin (heads or tails); sex of a newborn (male or female); survival of an organism in a region (live or die)
Jacob Bernoulli (1654-1705)
The Binomial Distribution: Overview
Suppose that the probability of success is p
What is the probability of failure? q = 1 – p
Examples:
Toss of a coin (S = head): p = 0.5, q = 0.5
Roll of a die (S = 1): p = 0.1667, q = 0.8333
Fertility of a chicken egg (S = fertile): p = 0.8, q = 0.2
The Binomial Distribution: Overview
Imagine that a trial is repeated n times
Examples: a coin is tossed 5 times; a die is rolled 25 times; 50 chicken eggs are examined
ASSUMPTIONS:
1) p is constant from trial to trial
2) the trials are statistically independent of each other
The Binomial Distribution: Overview
What is the probability of obtaining X successes in n trials?
Example: What is the probability of obtaining 2 heads from a coin that was tossed 5 times?
$$P(HHTTT) = (1/2)^5 = 1/32$$
The Binomial Distribution: Overview
But there are more possibilities:
HHTTT HTHTT HTTHT HTTTH THHTT THTHT THTTH TTHHT TTHTH TTTHH
P(2 heads) = 10 × 1/32 = 10/32
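The count of 10 orderings can be verified by brute force. The sketch below enumerates all 2^5 head/tail sequences and keeps those with exactly two heads; the function name is a hypothetical helper introduced for illustration.

```python
from itertools import product

def sequences_with_heads(n_tosses, n_heads):
    """Return every H/T sequence of length n_tosses with exactly n_heads heads."""
    return ["".join(seq) for seq in product("HT", repeat=n_tosses)
            if seq.count("H") == n_heads]

two_heads = sequences_with_heads(5, 2)

# Each individual sequence has probability (1/2)^5 = 1/32 for a fair coin.
p_two_heads = len(two_heads) / 2 ** 5

print(two_heads)
print(len(two_heads), p_two_heads)
```

Enumeration is only practical for small n; the factorial formula for the number of orderings avoids listing the sequences.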
The Binomial Distribution: Overview
In general, if n trials result in a series of successes and failures,
FFSFFFFSFSFSSFFFFFSF…
then the probability of X successes in that order is
$$P(X) = q \cdot q \cdot p \cdot q \cdots = p^X q^{n-X}$$
The Binomial Distribution: Overview
However, if order is not important, then
$$P(X) = \frac{n!}{X!(n-X)!} \, p^X q^{n-X}$$
where $\frac{n!}{X!(n-X)!}$ is the number of ways to obtain X successes in n trials, and $n! = n \times (n-1) \times (n-2) \times \cdots \times 2 \times 1$
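As a minimal sketch, the binomial formula translates directly into a few lines of Python. The function name `binomial_pmf` is an assumption made for this example; `math.comb` supplies n!/(X!(n – X)!).

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(X = x) for x successes in n independent trials with success probability p."""
    q = 1 - p
    return comb(n, x) * p ** x * q ** (n - x)

# The coin example: 2 heads in 5 tosses of a fair coin.
p_two_heads = binomial_pmf(2, 5, 0.5)
print(p_two_heads)

# Full distributions for n = 5 and several values of p.
for p in (0.1, 0.3, 0.5, 0.7, 0.9):
    probs = [round(binomial_pmf(x, 5, p), 4) for x in range(6)]
    print(f"Bin({p}, 5):", probs)
```

The loop prints one row of six probabilities (X = 0 to 5) for each value of p, which makes it easy to see how the distribution shifts from right-skewed to left-skewed as p grows.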
The Binomial Distribution: Overview
[Figure: five bar charts of binomial probabilities for X = 0 to 5, with panels Bin(0.1, 5), Bin(0.3, 5), Bin(0.5, 5), Bin(0.7, 5), and Bin(0.9, 5)]
The Poisson Distribution: Overview
When there are a large number of trials but a small probability of success, binomial calculations become impractical
Example: Number of deaths from horse kicks in the Prussian Army in different years
The mean number of successes from n trials is λ = np
Example: 64 deaths in 20 years out of thousands of soldiers
Simeon D. Poisson (1781-1840)
The Poisson Distribution: Overview
If we substitute λ/n for p, and let n approach infinity, the binomial distribution becomes the Poisson distribution:
$$P(x) = \frac{e^{-\lambda} \lambda^x}{x!}$$
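The limit can be illustrated numerically. This sketch holds λ fixed, sets p = λ/n, and compares the binomial probability with the Poisson probability as n grows. The values λ = 2 and x = 3 are arbitrary choices for illustration, not values from the slides.

```python
from math import comb, exp, factorial

def binomial_pmf(x, n, p):
    """Binomial probability of x successes in n trials."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

def poisson_pmf(x, lam):
    """Poisson probability of x events when the mean count is lam."""
    return exp(-lam) * lam ** x / factorial(x)

lam = 2.0  # assumed mean number of successes
x = 3      # assumed number of successes to evaluate

for n in (10, 100, 10_000):
    # Substitute p = lam / n, so that n * p stays equal to lam.
    print(n, binomial_pmf(x, n, lam / n), poisson_pmf(x, lam))
```

The binomial column should move toward the constant Poisson column as n increases.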
The Poisson Distribution: Overview
The Poisson distribution is applied when random events are expected to occur in a fixed area or a fixed interval of time
Deviation from a Poisson distribution may indicate some degree of non-randomness in the events under study
See Hurlbert (1990) for some caveats and suggestions for analyzing random spatial distributions using Poisson distributions
The Poisson Distribution, Example: Emission of α-particles
Rutherford, Geiger, and Bateman (1910) counted the number of α-particles emitted by a film of polonium in 2608 successive intervals of one-eighth of a minute
What is n? What is p?
Do their data follow a Poisson distribution?
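The Poisson Distribution: Emission of α-particles

| No. α-particles | Observed |
|---|---|
| 0 | 57 |
| 1 | 203 |
| 2 | 383 |
| 3 | 525 |
| 4 | 532 |
| 5 | 408 |
| 6 | 273 |
| 7 | 139 |
| 8 | 45 |
| 9 | 27 |
| 10 | 10 |
| 11 | 4 |
| 12 | 0 |
| 13 | 1 |
| 14 | 1 |
| Over 14 | 0 |
| Total | 2608 |

Calculation of λ:
λ = No. of particles per interval = 10097/2608 = 3.87
Expected values:
$$2608 \times P(x) = 2608 \times \frac{e^{-3.87} (3.87)^x}{x!}$$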
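The Poisson Distribution: Emission of α-particles

| No. α-particles | Observed | Expected |
|---|---|---|
| 0 | 57 | 54 |
| 1 | 203 | 210 |
| 2 | 383 | 407 |
| 3 | 525 | 525 |
| 4 | 532 | 508 |
| 5 | 408 | 394 |
| 6 | 273 | 255 |
| 7 | 139 | 140 |
| 8 | 45 | 68 |
| 9 | 27 | 29 |
| 10 | 10 | 11 |
| 11 | 4 | 4 |
| 12 | 0 | 1 |
| 13 | 1 | 1 |
| 14 | 1 | 1 |
| Over 14 | 0 | 0 |
| Total | 2608 | 2608 |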
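The expected counts can be recomputed from the observed counts. The sketch below estimates λ as the mean number of particles per interval and multiplies each Poisson probability by the 2608 intervals. The variable names are chosen for this example, and the results may differ by a count or so from the rounded values on the slide.

```python
from math import exp, factorial

# Observed number of intervals containing x alpha-particles (from the slide).
observed = {0: 57, 1: 203, 2: 383, 3: 525, 4: 532, 5: 408, 6: 273, 7: 139,
            8: 45, 9: 27, 10: 10, 11: 4, 12: 0, 13: 1, 14: 1}

n_intervals = sum(observed.values())
n_particles = sum(x * count for x, count in observed.items())
lam = n_particles / n_intervals  # mean number of particles per interval

def poisson_pmf(x, lam):
    """Poisson probability of x events when the mean count is lam."""
    return exp(-lam) * lam ** x / factorial(x)

# Expected number of intervals with x particles under a Poisson model.
expected = {x: n_intervals * poisson_pmf(x, lam) for x in observed}

print("lambda =", round(lam, 2))
for x in observed:
    print(x, observed[x], round(expected[x], 1))
```

Comparing the two columns is an informal check; a formal goodness-of-fit test would be a separate step.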
The Poisson Distribution: Emission of α-particles
[Figure: three panels labeled "Random events", "Regular events", and "Clumped events"]
The Poisson Distribution
[Figure: five bar charts with panels labeled 0.1, 0.5, 1, 2, and 6 (presumably values of λ); each vertical axis runs from 0 to 1]
Review of Discrete Probability Distributions
If X is a discrete random variable,
What does X ~ Bin(p, n) mean?
What does X ~ Poisson(λ) mean?
The Expected Value of a Discrete Random Variable
$$E(X) = \sum_{i=1}^{n} a_i p_i = a_1 p_1 + a_2 p_2 + \cdots + a_n p_n$$
The Variance of a Discrete Random Variable
$$\sigma^2(X) = E\left[(X - E(X))^2\right] = \sum_{i=1}^{n} p_i \left(a_i - \sum_{j=1}^{n} a_j p_j\right)^2$$
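A short sketch of both definitions, where `values` holds the outcomes a_i and `probs` the matching probabilities p_i. The function names and the fair-die example are assumptions made for illustration.

```python
def expected_value(values, probs):
    """E(X): the sum of a_i * p_i over all outcomes."""
    return sum(a * p for a, p in zip(values, probs))

def variance(values, probs):
    """Variance of X: the sum of p_i * (a_i - E(X))^2 over all outcomes."""
    mean = expected_value(values, probs)
    return sum(p * (a - mean) ** 2 for a, p in zip(values, probs))

# Assumed example: one roll of a fair six-sided die.
faces = [1, 2, 3, 4, 5, 6]
probs = [1 / 6] * 6

print(expected_value(faces, probs), variance(faces, probs))
```

For the fair die the mean is 3.5 and the variance is 35/12.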
Continuous Random Variables
If X is a continuous random variable, then X has an infinitely large sample space
Consequently, the probability of any particular outcome within a continuous sample space is 0
To calculate the probabilities associated with a continuous random variable, we focus on events that occur within particular subintervals of X, which we will denote as Δx
Continuous Random Variables
The probability density function (PDF):
$$P(X = x_i) = f(x_i) \, \Delta x$$
$$E(X) = \sum_{i=1}^{n} x_i f(x_i) \, \Delta x$$
To calculate E(X), we let Δx get infinitely small:
$$E(X) = \int x f(x) \, dx$$
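The passage from the sum to the integral can be seen numerically. The sketch below evaluates the sum of x_i f(x_i) Δx with left endpoints for shrinking Δx. The density used as a test case, a uniform density on [0, 10], is an assumption chosen because its expected value (5) is easy to check.

```python
def riemann_expectation(pdf, lower, upper, n_steps):
    """Approximate E(X) by summing x_i * f(x_i) * dx over n_steps subintervals."""
    dx = (upper - lower) / n_steps
    total = 0.0
    for i in range(n_steps):
        x_i = lower + i * dx  # left endpoint of the i-th subinterval
        total += x_i * pdf(x_i) * dx
    return total

def uniform_pdf(x):
    """Assumed test density: uniform on the closed interval [0, 10]."""
    return 0.1 if 0 <= x <= 10 else 0.0

for n_steps in (10, 100, 10_000):
    print(n_steps, riemann_expectation(uniform_pdf, 0, 10, n_steps))
```

With 10 steps the sum is 4.5; it approaches 5 as the subintervals shrink.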
Uniform Random Variables
Defined for a closed interval (for example, [0,10], which contains all numbers between 0 and 10, including the two end points 0 and 10).
The probability density function (PDF):
$$f(x) = \begin{cases} 1/10, & 0 \le x \le 10 \\ 0, & \text{otherwise} \end{cases}$$
[Figure: plot of P(X) against X for X from 0 to 10, with Subinterval [3,4] and Subinterval [5,6] marked]
Uniform Random Variables
For a uniform random variable X, where f(x) is defined on the interval [a,b] and where a<b:
$$f(x) = \begin{cases} 1/(b-a), & a \le x \le b \\ 0, & \text{otherwise} \end{cases}$$
$$E(X) = (b + a)/2$$
$$\sigma^2(X) = \frac{(b - a)^2}{12}$$
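A minimal sketch of these uniform formulas, checked on the [0, 10] example. All function names are hypothetical helpers; the interval probability is simply the density multiplied by the length of the subinterval that falls inside [a, b].

```python
def uniform_pdf(x, a, b):
    """Density of a uniform random variable on [a, b]."""
    return 1 / (b - a) if a <= x <= b else 0.0

def uniform_interval_probability(x1, x2, a, b):
    """P(x1 <= X <= x2): density times the overlap of [x1, x2] with [a, b]."""
    low, high = max(x1, a), min(x2, b)
    return max(high - low, 0) / (b - a)

def uniform_mean(a, b):
    return (b + a) / 2

def uniform_variance(a, b):
    return (b - a) ** 2 / 12

# Subintervals of equal width have equal probability.
print(uniform_interval_probability(3, 4, 0, 10))
print(uniform_interval_probability(5, 6, 0, 10))
print(uniform_mean(0, 10), uniform_variance(0, 10))
```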
The Normal Distribution: Overview
Discovered in 1733 by de Moivre as an approximation to the binomial distribution when the number of trials is large
Derived in 1809 by Gauss
Importance lies in the Central Limit Theorem, which states that the sum of a large number of independent random variables (binomial, Poisson, etc.) will approximate a normal distribution
Example: Human height is determined by a large number of factors, both genetic and environmental, which are additive in their effects. Thus, it follows a normal distribution.
Karl F. Gauss (1777-1855)
Abraham de Moivre (1667-1754)
The Normal Distribution: Overview
A continuous random variable is said to be normally distributed with mean μ and variance σ² if its probability density function is
$$f(x) = \frac{1}{\sigma \sqrt{2\pi}} \, e^{-(x - \mu)^2 / (2\sigma^2)}$$
f(x) is not the same as P(x)
P(x) would be 0 for every x because the normal distribution is continuous
However, $P(x_1 < X \le x_2) = \int_{x_1}^{x_2} f(x) \, dx$
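To make the distinction between f(x) and a probability concrete, the sketch below codes the density and integrates it numerically between x1 and x2 using a midpoint rule. The function names and the choice of a standard normal on [-1, 1] are assumptions for illustration; `statistics.NormalDist` from the Python standard library is used only as a cross-check.

```python
from math import exp, pi, sqrt
from statistics import NormalDist

def normal_pdf(x, mu, sigma):
    """Normal density f(x) with mean mu and standard deviation sigma."""
    return exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sigma * sqrt(2 * pi))

def interval_probability(x1, x2, mu, sigma, n_steps=10_000):
    """Approximate the integral of f(x) from x1 to x2 with the midpoint rule."""
    dx = (x2 - x1) / n_steps
    heights = (normal_pdf(x1 + (i + 0.5) * dx, mu, sigma) for i in range(n_steps))
    return sum(heights) * dx

approx = interval_probability(-1, 1, 0, 1)
exact = NormalDist(0, 1).cdf(1) - NormalDist(0, 1).cdf(-1)
print(approx, exact)
```

The density at a single point, such as `normal_pdf(0, 0, 1)`, is a height of the curve and not a probability; only areas under the curve are probabilities.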
The Normal Distribution: Overview
[Figure: plot of f(x) against x for x from -3 to 3; vertical axis from 0.00 to 0.45]
The Normal Distribution: Overview
[Figure: plot of f(x) against x for x from -3 to 3; vertical axis from 0.00 to 0.45]
The Normal Distribution: Overview
[Figure: two panels labeled "Mean changes" and "Variance changes"]
The Normal Distribution: Length of Fish
A sample of rock cod in Monterey Bay suggests that the mean length of these fish is μ = 30 in. and σ² = 4 in.²
Assume that the length of rock cod is a normal random variable X ~ N(μ = 30, σ = 2)
If we catch one of these fish in Monterey Bay,
What is the probability that it will be at least 31 in. long?
That it will be no more than 32 in. long?
That its length will be between 26 and 29 inches?
The Normal Distribution: Length of Fish
What is the probability that it will be at least 31 in. long?
[Figure: normal density curve over Fish length (in.) from 25 to 35; vertical axis from 0.00 to 0.25]
The Normal Distribution: Length of Fish
That it will be no more than 32 in. long?
[Figure: normal density curve over Fish length (in.) from 25 to 35; vertical axis from 0.00 to 0.25]
The Normal Distribution: Length of Fish
That its length will be between 26 and 29 inches?
[Figure: normal density curve over Fish length (in.) from 25 to 35; vertical axis from 0.00 to 0.25]
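The three rock cod questions can be answered from the cumulative distribution function. This sketch uses `statistics.NormalDist` from the Python standard library with μ = 30 and σ = 2 as given in the example. The variable names are chosen for illustration, and the slides themselves do not state the numerical answers.

```python
from statistics import NormalDist

# Fish length model from the example: X ~ N(mu = 30, sigma = 2), in inches.
length = NormalDist(mu=30, sigma=2)

# At least 31 in. long: the area to the right of 31.
p_at_least_31 = 1 - length.cdf(31)

# No more than 32 in. long: the area to the left of 32.
p_at_most_32 = length.cdf(32)

# Between 26 and 29 in.: the area between the two values.
p_between_26_29 = length.cdf(29) - length.cdf(26)

print(round(p_at_least_31, 4), round(p_at_most_32, 4), round(p_between_26_29, 4))
```

Under these assumptions the three probabilities come out to roughly 0.31, 0.84, and 0.29.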
Standard Normal Distribution
μ = 0 and σ² = 1
[Figure: histogram with horizontal axis from -6 to 4 and vertical axis from 0 to 5000]
Useful properties of the normal distribution
The normal distribution has useful properties:
Can be added: E(X+Y) = E(X) + E(Y) and, for independent X and Y, σ²(X+Y) = σ²(X) + σ²(Y)
Can be transformed with shift and change of scale operations
Consider two random variables X and Y
Let X~N(μ,σ) and let Y=aX+b where a and b are constants
Change of scale is the operation of multiplying X by a constant a because one unit of X becomes “a” units of Y.
Shift is the operation of adding a constant b to X because we simply move our random variable X “b” units along the x-axis.
If X is a normal random variable, then the new random variable Y created by these operations on X is also a normal random variable.
For X~N(μ,σ) and Y=aX+b
E(Y) = aμ + b
σ²(Y) = a²σ²
A special case of a change of scale and shift operation in which a = 1/σ and b = -1(μ/σ):
Y = (1/σ)X - (μ/σ) = (X - μ)/σ
This gives E(Y) = 0 and σ²(Y) = 1
Thus, any normal random variable can be transformed to a standard normal random variable.
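The standardizing transformation can be sketched as follows. The helper `linear_transform` is a hypothetical name, and the rock cod parameters (μ = 30, σ = 2) are reused as the example. The last lines check that a probability computed for X matches the one computed for the standardized variable.

```python
from statistics import NormalDist

def linear_transform(mu, sigma, a, b):
    """Mean and standard deviation of Y = aX + b when X ~ N(mu, sigma)."""
    return a * mu + b, abs(a) * sigma

mu, sigma = 30, 2

# Special case a = 1/sigma, b = -mu/sigma, i.e. Y = (X - mu) / sigma.
a, b = 1 / sigma, -mu / sigma
mean_y, sd_y = linear_transform(mu, sigma, a, b)
print(mean_y, sd_y)

# P(X >= 31) equals P(Z >= z) for the standardized value z = (31 - mu) / sigma.
z = (31 - mu) / sigma
p_x = 1 - NormalDist(mu, sigma).cdf(31)
p_z = 1 - NormalDist(0, 1).cdf(z)
print(z, p_x, p_z)
```

This is why a single table (or function) for the standard normal distribution is enough to answer questions about any normal random variable.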
The Central Limit Theorem
Asserts that standardizing any random variable that itself is a sum or average of a set of independent random variables results in a new random variable that is “nearly the same as” a standard normal one.
So what? The C.L.T allows us to use statistical tools that require our sample observations to be drawn from normal distributions, even though the underlying data themselves may not be normally distributed!
The only caveats are that the sample size must be “large enough” and that the observations themselves must be independent and all drawn from a distribution with common expectation and variance.
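One way to see the theorem at work without simulation is to take a sum of n independent Bernoulli trials (a binomial count), standardize it, and measure how far its cumulative distribution is from the standard normal one. The sketch below does this exactly; the success probability p = 0.1 and the sample sizes are assumptions chosen to show a skewed starting distribution.

```python
from math import comb, sqrt
from statistics import NormalDist

def max_cdf_gap(n, p):
    """Largest gap, over the attainable counts k = 0..n, between the CDF of the
    standardized binomial sum and the standard normal CDF."""
    mean, sd = n * p, sqrt(n * p * (1 - p))
    std_normal = NormalDist(0, 1)
    cumulative, gap = 0.0, 0.0
    for k in range(n + 1):
        cumulative += comb(n, k) * p ** k * (1 - p) ** (n - k)
        z = (k - mean) / sd  # standardized value of the count k
        gap = max(gap, abs(cumulative - std_normal.cdf(z)))
    return gap

for n in (25, 100, 400):
    print(n, round(max_cdf_gap(n, 0.1), 4))
```

The gap shrinks as n grows, which is the sense in which the standardized sum is "nearly the same as" a standard normal variable once the sample is "large enough".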
Log-normal Distribution
X is a log-normal random variable if its natural logarithm, ln(X), is a normal random variable [NOTE: ln(X) is the same as log_e(X)]
Original values of X give a right-skewed distribution (A), but plotting on a logarithmic scale gives a normal distribution (B).
Many ecologically important variables are log-normally distributed.
[Figure A: histogram of rep 1994; Mean = 127.5, Std. Dev = 183.79, N = 765]
[Figure B: histogram of LOGREP94; Mean = 4.00, Std. Dev = 1.44, N = 765]
SOURCE: Quintana-Ascencio et al. 2006; Hypericum data from Archbold Biological Station
Log-normal Distribution
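If ln(X) has mean μ and variance σ², then
$$\text{mean} = e^{\mu + \sigma^2/2}$$
$$\text{variance} = \left[e^{\mu + \sigma^2/2}\right]^2 \left[e^{\sigma^2} - 1\right]$$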
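A small sketch of the log-normal mean and variance, where `mu` and `sigma` are taken to be the mean and standard deviation of ln(X). The function names are hypothetical, and the variance is written in the algebraically equivalent form (e^(σ²) - 1) e^(2μ + σ²).

```python
from math import exp

def lognormal_mean(mu, sigma):
    """Mean of X when ln(X) ~ N(mu, sigma)."""
    return exp(mu + sigma ** 2 / 2)

def lognormal_variance(mu, sigma):
    """Variance of X when ln(X) ~ N(mu, sigma)."""
    return (exp(sigma ** 2) - 1) * exp(2 * mu + sigma ** 2)

# Assumed example: ln(X) is standard normal.
print(lognormal_mean(0, 1), lognormal_variance(0, 1))
```

Because of the σ²/2 term, the mean of X is larger than e^μ, which reflects the long right tail of the untransformed data.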
Exercise
During class we will perform an exercise in R allowing you and us to work with some of these probability distributions!