Probability theory 2 Tron Anders Moger September 13th 2006


Page 1:

Probability theory 2

Tron Anders Moger

September 13th 2006

Page 2:

The Binomial distribution

• Bernoulli distribution: One experiment with two possible outcomes, probability of success P.

• If the experiment is repeated n times

• The probability P is constant in all experiments

• The experiments are independent

• Then the number of successes follows a binomial distribution

Page 3:

The Binomial distribution

If X has a Binomial distribution, its PDF is defined as:

P(X = x) = \frac{n!}{x!\,(n-x)!}\, P^x (1-P)^{n-x}

E(X) = nP

Var(X) = nP(1-P)
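As a minimal numerical sketch (the values n = 10, P = 0.3 and x = 3 below are illustrative only, not from the lecture), these quantities can be evaluated with scipy.stats:

```python
# Minimal sketch: binomial PMF, mean and variance for illustrative values.
from scipy.stats import binom

n, P = 10, 0.3   # illustrative parameters
x = 3

print(binom.pmf(x, n, P))   # P(X = 3) = C(10,3) * 0.3^3 * 0.7^7 ≈ 0.267
print(binom.mean(n, P))     # nP = 3.0
print(binom.var(n, P))      # nP(1-P) = 2.1
```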

Page 4:

Example

• Since the early 50s, 10,000 UFOs have been reported in the U.S.

• Assume P(real observation)=1/100000

• Binomial experiments, n=10000, p=1/100000

• X counts the number of real observations

P(\text{At least one observation is real}) = P(X \geq 1) = 1 - P(X = 0)

= 1 - \binom{10000}{0}\left(\frac{1}{100000}\right)^0 \left(1 - \frac{1}{100000}\right)^{10000} \approx 1 - 0.905 = 0.095 = 9.5\%
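The 9.5% figure can be verified directly, for instance with scipy.stats (a minimal sketch using the numbers from the slide):

```python
from scipy.stats import binom

n, p = 10000, 1 / 100000
# P(at least one real observation) = 1 - P(X = 0)
print(1 - binom.pmf(0, n, p))   # ≈ 0.095, i.e. about 9.5%
```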

Page 5:

The Hypergeometric distribution

• Randomly sample n objects from a group of N objects, S of which are successes. The number of successes, X, in the sample then follows a hypergeometric distribution:

P(X = x) = \frac{\binom{S}{x}\binom{N-S}{n-x}}{\binom{N}{n}} = \frac{\dfrac{S!}{x!\,(S-x)!} \cdot \dfrac{(N-S)!}{(n-x)!\,(N-S-n+x)!}}{\dfrac{N!}{n!\,(N-n)!}}
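A minimal sketch of evaluating this probability with scipy.stats.hypergeom; the numbers N = 20, S = 8, n = 5 and x = 2 are illustrative only (note that scipy's argument order is population size, successes in the population, sample size):

```python
from scipy.stats import hypergeom

N, S, n = 20, 8, 5   # population size, successes in population, sample size (illustrative)
x = 2
print(hypergeom.pmf(x, N, S, n))   # C(8,2)*C(12,3)/C(20,5) ≈ 0.397
```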

Page 6:

Example

• What is the probability of winning the lottery, that is, getting all 7 numbers on your coupon correct out of the total 34?

P(X = 7) = \frac{\binom{7}{7}\binom{34-7}{7-7}}{\binom{34}{7}} = \frac{\dfrac{7!}{7!\,(7-7)!} \cdot \dfrac{(34-7)!}{(7-7)!\,(34-7-7+7)!}}{\dfrac{34!}{7!\,(34-7)!}} \approx 1.86 \cdot 10^{-7}
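A quick check of this number with Python's math.comb (the probability reduces to one over the number of possible coupons):

```python
import math

# All 7 winning numbers must be among the 7 on the coupon
print(1 / math.comb(34, 7))   # ≈ 1.86e-07
```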

Page 7:

The distribution of rare events: The Poisson distribution

• Assume successes happen independently, at a rate λ per time unit. The probability of x successes during a time unit is given by the Poisson distribution:

P(x) = \frac{\lambda^x e^{-\lambda}}{x!}

E(X) = \lambda

Var(X) = \lambda
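A minimal sketch of the Poisson PMF, mean and variance with scipy.stats; the rate λ = 2 is an arbitrary illustration value:

```python
from scipy.stats import poisson

lam = 2.0   # illustrative rate
for x in range(5):
    print(x, poisson.pmf(x, lam))           # lambda^x * exp(-lambda) / x!
print(poisson.mean(lam), poisson.var(lam))  # both equal lambda
```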

Page 8:

Example: AIDS cases in 1991 (47 weeks)

• Cases per week:

1 1 0 1 2 1 3 0 0 0 0 0 0 2 1 2 2 1 3 0 1 0 0 0

1 1 1 1 1 0 2 1 0 2 0 2 1 6 1 0 0 1 0 2 0 0 0

• Mean number of cases per week:

λ=44/47=0.936

• Can model the data as a Poisson process with rate λ=0.936

Page 9:

Example cont’d:

No. of cases   No. observed   Expected no. (from Poisson dist.)
0              20             18.4
1              16             17.2
2               8              8.1
3               2              2.5
4               0              0.6
5               0              0.11
6               1              0.017

• Calculation: P(X=2) = 0.936^2 · e^{-0.936} / 2! = 0.17

• Multiply by the number of weeks: 0.17*47=8.1

• Poisson distribution fits data fairly well!
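The expected counts in the table above can be reproduced with a few lines of Python (using λ = 44/47 and 47 weeks, as on the slide):

```python
from scipy.stats import poisson

lam, weeks = 44 / 47, 47
for x in range(7):
    print(x, round(weeks * poisson.pmf(x, lam), 3))  # ≈ 18.4, 17.2, 8.1, 2.5, 0.6, 0.11, 0.017
```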

Page 10:

The Poisson and the Binomial

• Assume X is Bin(n, P), E(X) = nP

• Probability of 0 successes: P(X = 0) = (1-P)^n

• Can write λ = nP, hence P(X = 0) = (1 - λ/n)^n

• If n is large and P is small, this converges to e^{-λ}, the probability of 0 successes in a Poisson distribution!

• Can show that this also applies for other probabilities. Hence, the Poisson approximates the Binomial when n is large and P is small (n>5, P<0.05).
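For instance, with the illustrative values n = 1000 and P = 0.002 (so λ = nP = 2), the two distributions give nearly identical probabilities:

```python
from scipy.stats import binom, poisson

n, P = 1000, 0.002   # large n, small P (illustrative values)
lam = n * P
for x in range(5):
    print(x, binom.pmf(x, n, P), poisson.pmf(x, lam))  # the two columns are nearly identical
```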

Page 11:

Bivariate distributions

• If X and Y are a pair of discrete random variables, their joint probability function expresses the probability that they simultaneously take specific values:

P(x, y) = P(X = x \cap Y = y)

– Marginal probability: P(x) = \sum_y P(x, y)

– Conditional probability: P(x \mid y) = \frac{P(x, y)}{P(y)}

– X and Y are independent if for all x and y: P(x, y) = P(x)\,P(y)

Page 12:

Example

• The probabilities for

– A: Rain tomorrow

– B: Wind tomorrow

are given in the following table:

             No wind   Some wind   Strong wind   Storm
No rain       0.10       0.20        0.05        0.01
Light rain    0.05       0.10        0.15        0.04
Heavy rain    0.05       0.10        0.10        0.05

Page 13:

Example cont’d:

• Marginal probability of no rain: 0.1+0.2+0.05+0.01 = 0.36

• Similarly, the marginal probabilities of light and heavy rain are 0.34 and 0.30. Hence the marginal dist. of rain is a PDF!

• Conditional probability of no rain given storm: 0.01/(0.01+0.04+0.05) = 0.1

• Similarly, the conditional probabilities of light and heavy rain given storm are 0.4 and 0.5. Hence the conditional dist. of rain given storm is a PDF!

• Are rain and wind independent? Marginal prob. of no wind: 0.1+0.05+0.05 = 0.2, and P(no rain)·P(no wind) = 0.36·0.2 = 0.072 ≠ 0.1 = P(no rain, no wind), so they are not independent.
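These marginal and conditional calculations can also be reproduced from the joint table, e.g. with numpy (a minimal sketch using the table from the previous slide):

```python
import numpy as np

# Joint probabilities: rows = no/light/heavy rain, columns = no/some/strong wind/storm
joint = np.array([[0.10, 0.20, 0.05, 0.01],
                  [0.05, 0.10, 0.15, 0.04],
                  [0.05, 0.10, 0.10, 0.05]])

p_rain = joint.sum(axis=1)        # marginal dist. of rain: [0.36, 0.34, 0.30]
p_wind = joint.sum(axis=0)        # marginal dist. of wind: [0.20, 0.40, 0.30, 0.10]
print(joint[:, 3] / p_wind[3])    # conditional dist. of rain given storm: [0.1, 0.4, 0.5]

# Independence would require the joint table to equal the outer product of the marginals
print(np.allclose(joint, np.outer(p_rain, p_wind)))   # False -> not independent
```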

Page 14:

Covariance and correlation

• Covariance measures how two variables vary together:

Cov(X, Y) = E[(X - E(X))(Y - E(Y))] = E(XY) - E(X)E(Y)

• Correlation is always between -1 and 1:

Corr(X, Y) = \frac{Cov(X, Y)}{\sigma_X \sigma_Y} = \frac{Cov(X, Y)}{\sqrt{Var(X)\,Var(Y)}}

• If X, Y independent, then E(XY) = E(X)E(Y)

• If X, Y independent, then Cov(X, Y) = 0

• If Cov(X, Y) = 0, then Var(X + Y) = Var(X) + Var(Y)
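A small numerical sketch of Cov(X, Y) = E(XY) - E(X)E(Y) for a discrete joint distribution; the probabilities below are illustrative only:

```python
import numpy as np

x_vals = np.array([0, 1])
y_vals = np.array([0, 1, 2])
joint = np.array([[0.10, 0.20, 0.10],    # illustrative joint PMF, rows = x, columns = y
                  [0.20, 0.25, 0.15]])

E_x = (x_vals * joint.sum(axis=1)).sum()
E_y = (y_vals * joint.sum(axis=0)).sum()
E_xy = (np.outer(x_vals, y_vals) * joint).sum()

print(E_xy - E_x * E_y)   # Cov(X, Y) = E(XY) - E(X)E(Y)
```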

Page 15:

Continuous random variables

• Used when the outcomes can take any value (including decimals) on a continuous scale

• Probabilities are assigned to intervals of numbers; individual numbers generally have probability zero

• Area under a curve: Integrals

Page 16:

Cdf for continuous random variables

• As before, the cumulative distribution function F(x) is equal to the probability of all outcomes less than or equal to x.

• Thus we get

P(a < X \leq b) = F(b) - F(a)

• The probability density function f(x) is however now defined so that

P(a < X \leq b) = \int_a^b f(x)\,dx

• We get that

F(x) = \int_{x_0}^{x} f(u)\,du

where x_0 denotes the smallest possible value of X.

Page 17:

Expected values

• The expectation of a continuous random variable X is defined as

• The variance, standard deviation, covariance, and correlation are defined exactly as before, in terms of the expectation, and thus have the same properties

E(X) = \int_{-\infty}^{\infty} x f(x)\,dx

Page 18:

Example: The uniform distribution on the interval [0,1]

• f(x) = 1 for 0 ≤ x ≤ 1

• F(x) = x for 0 ≤ x ≤ 1

E(X) = \int_0^1 x f(x)\,dx = \int_0^1 x\,dx = \left[\tfrac{1}{2}x^2\right]_0^1 = \tfrac{1}{2}

Var(X) = E(X^2) - (E(X))^2 = \int_0^1 x^2\,dx - 0.5^2 = \tfrac{1}{3} - \tfrac{1}{4} = \tfrac{1}{12}
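A simple Monte Carlo check of these two results (a sketch; one million draws is an arbitrary choice):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(0.0, 1.0, size=1_000_000)
print(x.mean(), x.var())   # close to 1/2 and 1/12 ≈ 0.0833
```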

Page 19:

The normal distribution

• The most used continuous probability distribution:

– Many observations tend to approximately follow this distribution

– It is easy and nice to do computations with

– BUT: Using it can result in wrong conclusions when it is not appropriate

Page 20:

Histogram of weight with normal curve displayed

[Figure: Distribution of weight among 95 students; x-axis Weight (kg), about 40–95; y-axis Number of students (0–25); a fitted normal curve is overlaid.]

Page 21:

The normal distribution

• The probability density function is

f(x) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-(x-\mu)^2 / (2\sigma^2)}

• where E(X) = \mu and Var(X) = \sigma^2

• Notation: X \sim N(\mu, \sigma^2)

• Standard normal distribution: N(0, 1)

• Using the normal density is often OK unless the actual distribution is very skewed

• Also: µ±σ covers ca 68% of the distribution

• µ±2σ covers ca 95% of the distribution
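These coverage figures can be checked with the standard normal cdf, for example (a minimal sketch):

```python
from scipy.stats import norm

for k in (1, 2):
    # P(mu - k*sigma < X < mu + k*sigma) for any normal distribution
    print(k, norm.cdf(k) - norm.cdf(-k))   # ≈ 0.683 and ≈ 0.954
```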

Page 22:

The normal distribution with small and large standard deviation σ

[Figure: two normal density curves with small and large σ, plotted over x = 2 to 20.]

Page 23:

Simple method for checking if data are well approximated by a normal distribution: Explore

• As before, choose Analyze->Descriptive Statistics->Explore in SPSS.

• Move the variable to Dependent List (e.g. weight).

• Under Plots, check Normality Plots with tests.

Page 24:

Histogram of lung function for the students

[Figure: histogram of Average PEF value measured in a sitting position; x-axis about 300–800, y-axis Number of students (0–20). Std. Dev = 120.12, Mean = 503, N = 95.]

Page 25:

Q-Q plot for lung function

[Figure: Normal Q-Q Plot of PEFSITTM; Observed Value 200–800 on the x-axis, Expected Normal quantiles -3 to 3 on the y-axis.]

Page 26:

Age – not normal

[Figure: histogram of Age; x-axis 20.0–35.0, y-axis Frequency (0–50). Std. Dev = 3.11, Mean = 22.4, N = 95.]

Page 27:

Q-Q plot of age

[Figure: Normal Q-Q Plot of AGE; Observed Value 10–40 on the x-axis, Expected Normal quantiles -2 to 3 on the y-axis.]

Page 28:

[Figure: histogram of the right-skewed variable SKEWED; y-axis frequency 0–40. Std. Dev = 1.71, Mean = 1.50, N = 106.]

Skewed distribution, with e.g. the observations 0.40, 0.96, 11.0

A trick for data that are skewed to the right: Log-transformation!

Page 29:

Log-transformed data

[Figure: histogram of LNSKEWD, the log-transformed data; y-axis frequency 0–14. Std. Dev = 1.05, Mean = -.12, N = 106.]

ln(0.40) = -0.91, ln(0.96) = -0.04, ln(11) = 2.40

Do the analysis on log-transformed data

SPSS: Transform->Compute
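Outside SPSS, the same transformation is one line in, for example, numpy (a minimal sketch reproducing the values above):

```python
import numpy as np

print(np.log([0.40, 0.96, 11.0]))   # ≈ [-0.916, -0.041, 2.398]
```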

Page 30:

OK, the data follows a normal distribution, so what?

• First lecture, pairs of terms:

– Sample – population

– Histogram – distribution

– Mean – Expected value

• In statistics we would like the results from analyzing a small sample to apply to the population

• One has to collect a sample that is representative w.r.t. age, gender, place of residence, etc.

Page 31:

New way of reading tables and histograms:

• Histograms show that data can be described by a normal distribution

• Want to conclude that data in the population are normally distributed

• Mean calculated from the sample is an estimate of the expected value µ of the population normal distribution

• Standard deviation in the sample is an estimate of σ in the population normal distribution

• Mean±2*(standard deviation) as estimated from the sample (hopefully) covers 95% of the population normal distribution

Page 32:

In addition:

• Most standard methods for analyzing continuous data assume a normal distribution.

• When n is large and P is not too close to 0 or 1, the Binomial distribution can be approximated by the normal distribution

• A similar phenomenon is true for the Poisson distribution

• This is a phenomenon that happens for all distributions that can be seen as a sum of independent observations.

• Means that the normal distribution appears whenever you want to do statistics

Page 33:

The Exponential distribution

• The exponential distribution is a distribution for positive numbers (parameter λ):

• It can be used to model the time until an event, when events arrive randomly at a constant rate

f(t) = \lambda e^{-\lambda t}

E(T) = 1/\lambda, \quad Var(T) = 1/\lambda^2
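A minimal sketch with scipy.stats.expon; note that scipy parametrizes the exponential by scale = 1/λ, and λ = 0.5 below is an illustrative value:

```python
from scipy.stats import expon

lam = 0.5                 # illustrative rate
T = expon(scale=1/lam)    # scipy uses scale = 1/lambda
print(T.mean(), T.var())  # 1/lam = 2.0 and 1/lam^2 = 4.0
print(T.pdf(1.0))         # lam * exp(-lam * 1.0) ≈ 0.303
```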

Page 34:

Next time:

• Sampling and estimation

• Will talk much more in depth about the topics mentioned in the last few slides today