STATISTIC & INFORMATION THEORY (CSNB134) MODULE 7C PROBABILITY DISTRIBUTIONS FOR RANDOM VARIABLES (NORMAL DISTRIBUTION)

Upload: geoffrey-thomas

Post on 03-Jan-2016

Overview

In Module 7, we will learn three types of distributions for random variables:
- Binomial distribution - Module 7A
- Poisson distribution - Module 7B
- Normal distribution - Module 7C

This is Sub-Module 7C, which contains the lecture slides on the Normal Distribution.

Recaps of Module 7A

In Module 7A, we have learned the Binomial Probability Distribution, where

$$P(x = k) = C^n_k \, p^k q^{n-k} = \frac{n!}{k!\,(n-k)!} \, p^k q^{n-k} \quad \text{for } k = 0, 1, 2, \ldots, n.$$

Recall

$$C^n_k = \frac{n!}{k!\,(n-k)!}$$

and

Mean: $\mu = np$

Variance: $\sigma^2 = npq$

Standard deviation: $\sigma = \sqrt{npq}$

Recaps of Module 7B

In Module 7B, we have learned the Poisson Probability Distribution, where

$$P(x = k) = \frac{\mu^k e^{-\mu}}{k!}$$

For values of k = 0, 1, 2, … The mean and standard deviation of the Poisson random variable are

Mean: $\mu$

Standard deviation: $\sigma = \sqrt{\mu}$

Note: $e \approx 2.7183$

Recaps of Module 7A & Module 7B

In both Module 7A and Module 7B, we learned formulas for the probability that x is exactly equal to k, that is P(x = k), for both types of distribution.

This is possible because both distributions are for DISCRETE random variables.

'Discrete' means that the variable takes distinct, separate values that can be measured at an exact point (e.g. x = 1, x = 50, x = 100). Digital data are an example of discrete data.

However, not all random variables are discrete. For example, analogue data are not discrete; analogue data are CONTINUOUS.

Continuous Random Variables

Continuous random variables can assume the infinitely many values corresponding to points on a line interval.

Examples:
- Heights, weights
- Chemical substance temperatures
- Speed of a plane

Continuous Random Variables (cont.)

A smooth curve describes the probability distribution of a continuous random variable.

The depth or density of the probability, which varies with x, may be described by a mathematical formula f(x), called the probability distribution or probability density function for the random variable x.

Continuous Probability Distributions

The area under the curve (i.e. the total probability) is equal to 1.

P(a < x < b) = area under the curve between a and b.

There is no probability attached to any single value of x. That is, P(x = a) = 0, unlike the Binomial and Poisson distributions for discrete random variables.

One example of a continuous random variable distribution is the Normal Distribution.
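The three properties above can be checked numerically. The following is a minimal sketch, not part of the original slides: it assumes a standard normal curve as a stand-in for a generic density f(x), and the helper name `area_under_curve` is hypothetical. It approximates areas with a simple midpoint rule.

```python
from statistics import NormalDist


def area_under_curve(pdf, a, b, steps=20_000):
    """Approximate the area under pdf between a and b with the midpoint rule."""
    width = (b - a) / steps
    return sum(pdf(a + (i + 0.5) * width) for i in range(steps)) * width


# Assumption: the standard normal density plays the role of f(x).
curve = NormalDist(mu=0.0, sigma=1.0)

total_area = area_under_curve(curve.pdf, -10.0, 10.0)  # effectively the whole line
p_between = area_under_curve(curve.pdf, -1.0, 2.0)     # P(-1 < x < 2)
p_single = area_under_curve(curve.pdf, 1.0, 1.0)       # P(x = 1): zero-width interval

print(round(total_area, 4))  # very close to 1
print(round(p_between, 4))   # area between a = -1 and b = 2
print(p_single)              # 0.0, a single point carries no probability
```

The zero-width interval makes the last result exactly zero, which is the numerical counterpart of P(x = a) = 0.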

The Normal Distribution

The formula that generates the normal probability distribution is:

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2} \quad \text{for } -\infty < x < \infty$$

$e \approx 2.7183$, $\pi \approx 3.1416$; $\mu$ and $\sigma$ are the population mean and standard deviation.

The shape and location of the normal curve change as the mean and standard deviation change.
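As an illustration, the normal density formula can be written out directly and compared against Python's standard-library `statistics.NormalDist`. This is a sketch under assumptions: the function name `normal_pdf` and the values mu = 5, sigma = 2 are chosen only for demonstration.

```python
import math
from statistics import NormalDist


def normal_pdf(x, mu, sigma):
    """f(x) = 1 / (sigma * sqrt(2*pi)) * exp(-0.5 * ((x - mu) / sigma) ** 2)."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


# Illustrative parameters (assumed for this sketch).
mu, sigma = 5.0, 2.0
reference = NormalDist(mu, sigma)

for x in (1.0, 5.0, 9.0):
    # The hand-written formula and the library value should agree.
    print(x, normal_pdf(x, mu, sigma), reference.pdf(x))
```

The curve peaks at x = mu and takes the same height at points equally far on either side of the mean, which is why the values at x = 1 and x = 9 match.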

The Normal Distribution

To find P(a < x < b), we need to find the area under the appropriate normal curve.

To simplify the tabulation of these areas, we standardize each value of x by expressing it as a z-score, the number of standard deviations σ it lies from the mean µ.

Recaps z-score formula:

$$z = \frac{x - \mu}{\sigma}$$

Thus,
- if x > µ, z is positive
- if x < µ, z is negative
- if x = µ, z is 0
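The z-score formula z = (x - µ)/σ and its sign behaviour can be sketched in a few lines. The function name `z_score` and the values µ = 50, σ = 10 are hypothetical, chosen only to give round numbers.

```python
def z_score(x, mu, sigma):
    """Number of standard deviations that x lies from the mean."""
    return (x - mu) / sigma


# Hypothetical population parameters, for illustration only.
mu, sigma = 50.0, 10.0

print(z_score(65.0, mu, sigma))  # 1.5  (x > mu, so z is positive)
print(z_score(40.0, mu, sigma))  # -1.0 (x < mu, so z is negative)
print(z_score(50.0, mu, sigma))  # 0.0  (x = mu, so z is 0)
```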

The Standard Normal (z) Distribution

- Mean = 0; Standard deviation = 1
- When x = µ, z = 0
- Symmetric about z = 0
- Total area under the curve = 1
- Values of z to the left of center are negative
- Values of z to the right of center are positive

Finding the Probability from z-Table

Please refer to Table 3 of Mendenhall, Beaver & Beaver. The four-digit probability in a particular row and column of Table 3 gives the area under the z curve to the left of that particular value of z.

Figure: area for z = 1.36

Finding the Probability from z-Table (cont.)

To find an area to the left of a z-value, find the area directly from the table.

To find an area to the right of a z-value, find the area in Table 3 and subtract from 1.

To find the area between two values of z, find the two areas in Table 3, and subtract one from the other.

Exercise 1

Use Table 3 to calculate these probabilities:

- P(z ≤ 1.36)
- P(z > 1.36)
- P(-1.20 ≤ z ≤ 1.36)

Exercise 1 (cont.)

- P(z ≤ 1.36) = .9131
- P(z > 1.36) = 1 - .9131 = .0869
- P(-1.20 ≤ z ≤ 1.36) = .9131 - .1151 = .7980
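The three table rules (area to the left, area to the right, area between two z-values) and the Exercise 1 answers can be cross-checked in software. This sketch uses `statistics.NormalDist` from the Python standard library in place of Table 3; the helper names `area_left`, `area_right` and `area_between` are hypothetical.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1


def area_left(z):
    """Area under the z curve to the left of z (what the table gives directly)."""
    return Z.cdf(z)


def area_right(z):
    """Area to the right of z: subtract the left area from 1."""
    return 1.0 - Z.cdf(z)


def area_between(z_low, z_high):
    """Area between two z-values: subtract one left area from the other."""
    return Z.cdf(z_high) - Z.cdf(z_low)


print(round(area_left(1.36), 4))            # 0.9131
print(round(area_right(1.36), 4))           # 0.0869
print(round(area_between(-1.20, 1.36), 4))  # 0.798, i.e. .7980
```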

Relationship with Empirical Rule

Recaps: Empirical Rule can be used to describe mound-shaped distributions

- The interval µ ± σ contains approximately 68% of the measurements.
- The interval µ ± 2σ contains approximately 95% of the measurements.
- The interval µ ± 3σ contains approximately 99.7% of the measurements.

Recaps: Standard Normal (z) Distribution

- Mean = 0 (µ = 0)
- Standard deviation = 1 (σ = 1)
- Symmetric about z = 0 (mound shape)

$$z = \frac{x - \mu}{\sigma}$$

Relationship with Empirical Rule (cont.)

P(-2 ≤ z ≤ 2) = .9772 - .0228 = .9544

Remember the Empirical Rule: Approximately 95% of the measurements lie within 2 standard deviations of the mean.

Relationship with Empirical Rule (cont.)

P(-3 ≤ z ≤ 3) = .9987 - .0013 = .9974

Remember the Empirical Rule: Approximately 99.7% of the measurements lie within 3 standard deviations of the mean.
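The link between the Empirical Rule and the standard normal curve can be sketched by computing the area within 1, 2 and 3 standard deviations of the mean. This is an illustration using `statistics.NormalDist` rather than Table 3; the dictionary name `within` is an assumption of this sketch.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1

# Area between -k and +k for k = 1, 2, 3 standard deviations.
within = {k: Z.cdf(k) - Z.cdf(-k) for k in (1, 2, 3)}

for k, area in within.items():
    print(f"P(-{k} <= z <= {k}) = {area:.4f}")
```

The results land near 68%, 95% and 99.7%. They can differ from the table-based answers in the last digit, because each four-digit table entry is rounded before the subtraction.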

Working Backwards

Find the value of z that has area .25 to its left.

1. Look for the four-digit area closest to .2500 in Table 3.
2. What row and column does this value correspond to?
3. z = -.67
4. What percentile does this value represent? 25th percentile, or 1st quartile (Q1)

Working Backwards (cont.)

Find the value of z that has area .05 to its right.

1. The area to its left will be 1 - .05 = .95
2. Look for the four-digit area closest to .9500 in Table 3.
3. Since the value .9500 is halfway between .9495 and .9505, we choose z halfway between 1.64 and 1.65.
4. z = 1.645
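Working backwards from an area to a z-value corresponds to the inverse of the cumulative distribution function. As a sketch, `NormalDist.inv_cdf` from the Python standard library can stand in for the reverse table lookup; the variable names `q1` and `z_05` are hypothetical.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1

# z with area .25 to its left (25th percentile, Q1).
q1 = Z.inv_cdf(0.25)

# z with area .05 to its right, so the area to its left is 1 - .05 = .95.
z_05 = Z.inv_cdf(1 - 0.05)

print(round(q1, 2))    # -0.67
print(round(z_05, 3))  # 1.645
```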

The General Normal Distribution

Previously we have looked at the Standard Normal (z) Distribution with the following characteristics:
- Mean = 0
- Standard deviation = 1
- Symmetric about z = 0

To find a probability such as P(z ≤ z0) for a given value z0, we just need to find the area under the curve by looking at the z-table.

However, how do we find probabilities for other normal distributions (known as the General Normal Distribution), where
- the mean is not necessarily 0
- the standard deviation is not necessarily 1?

Finding Probabilities for the General Normal Random Variable

To find an area for a normal random variable x with mean µ and standard deviation σ, standardize or rescale the interval in terms of z.

Find the appropriate area using the z-table.

Example: x has a normal distribution with µ = 5 and σ = 2. Find P(x > 7).

$$P(x > 7) = P\left(z > \frac{7 - 5}{2}\right) = P(z > 1) = 1 - .8413 = .1587$$
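The standardize-then-look-up procedure for the example P(x > 7) with µ = 5 and σ = 2 can be sketched in two ways: by converting to z first, and by working with the general normal distribution directly. Variable names are hypothetical, and `statistics.NormalDist` replaces the z-table.

```python
from statistics import NormalDist

mu, sigma = 5.0, 2.0  # values from the example

# Route 1: standardize, then use the standard normal curve.
z = (7.0 - mu) / sigma
p_via_z = 1.0 - NormalDist().cdf(z)

# Route 2: use the general normal distribution directly.
p_direct = 1.0 - NormalDist(mu, sigma).cdf(7.0)

print(z)                    # 1.0
print(round(p_via_z, 4))    # 0.1587
print(round(p_direct, 4))   # same area as route 1
```

Both routes give the same area, which is the point of standardizing: one table for z covers every normal distribution.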

Exercise 2

The weights of packages of ground beef are normally distributed with mean 1 pound and standard deviation .10. What is the probability that a randomly selected package weighs between 0.80 and 0.85 pounds?

$$P(.80 < x < .85) = P(-2 < z < -1.5) = .0668 - .0228 = .0440$$

Exercise 2 (cont.)

What is the weight of a package such that only 1% of all packages exceed this weight?

$$P(x > ?) = .01$$

$$P\left(z > \frac{? - 1}{.1}\right) = .01$$

From Table 3,

$$\frac{? - 1}{.1} = 2.33$$

$$? = 2.33(.1) + 1 = 1.233$$
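Both parts of Exercise 2 can be cross-checked in software. This is a sketch that uses `statistics.NormalDist` with mean 1 and standard deviation .10 in place of Table 3; the names `weight`, `p_between` and `cutoff` are hypothetical.

```python
from statistics import NormalDist

# Package weights: mean 1 pound, standard deviation .10 (from the exercise).
weight = NormalDist(mu=1.0, sigma=0.10)

# Part 1: P(.80 < x < .85), the area between the two weights.
p_between = weight.cdf(0.85) - weight.cdf(0.80)

# Part 2: the weight exceeded by only 1% of packages,
# i.e. the value with area .99 to its left.
cutoff = weight.inv_cdf(0.99)

print(round(p_between, 4))  # about .044
print(round(cutoff, 3))     # 1.233
```

The software values can differ from the table-based answers in the last digit, because the table rounds each area to four digits and z to two decimals (2.33).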

STATISTIC & INFORMATION THEORY

(CSNB134)

PROBABILITY DISTRIBUTIONS FOR RANDOM VARIABLES (NORMAL DISTRIBUTION)

--END--