New York Times, Feb. 24th, 2009

A NASA satellite to track carbon dioxide in the Earth’s atmosphere failed to reach its orbit during launching Tuesday morning, scuttling the $278 million mission. Andrew Lee/U.S. Air Force, via Associated Press The Orbiting Carbon Observatory lifted off from Vandenberg Air Force Base in California aboard a four-stage Taurus XL rocket on Tuesday morning but failed to reach orbit and fell back to Earth, landing in the ocean just short of Antarctica. New York Times-Feb, 24 th 2009

Upload: giza

Post on 24-Feb-2016


TRANSCRIPT

Page 1: New York Times-Feb, 24 th  2009

A NASA satellite to track carbon dioxide in the Earth’s atmosphere failed to reach its orbit during launching Tuesday morning, scuttling the $278 million mission.

Andrew Lee/U.S. Air Force, via Associated Press

The Orbiting Carbon Observatory lifted off from Vandenberg Air Force Base in California aboard a four-stage Taurus XL rocket on Tuesday morning but failed to reach orbit and fell back to Earth, landing in the ocean just short of Antarctica.

New York Times-Feb, 24th 2009

Page 2: New York Times-Feb, 24 th  2009

What do we know

We have in our tool bag many of the probability basics and some ready-made special distributions. These are:

• Bernoulli
• Binomial
• Hypergeometric
• Geometric, and
• Negative Binomial

All, in one way or another, are based on the Bernoulli experiment structure.

Page 3: New York Times-Feb, 24 th  2009

We can get to the Poisson model in two ways:

1. As an approximation of the Binomial distribution

2. As a model describing the Poisson process

The Poisson distribution Section 3.6

An important probability model that occurs when we are interested in counting the number of successes (S) regardless of the number of failures (F).

Page 4: New York Times-Feb, 24 th  2009

The Poisson distribution Section 3.6

1. Approximating the Binomial distribution

Rules for approximation:

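The math ones are: If $n \to \infty$, $p \to 0$, and $np \to \lambda > 0$, then $b(x; n, p) \to p(x; \lambda)$, where $b(x; n, p)$ is the binomial pmf and $p(x; \lambda)$ is the Poisson pmf.

In practice:

If n is large (> 50) and p is small such that np < 5, then we can approximate $b(x; n, p)$ with $p(x; \lambda)$, where $\lambda = np$.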
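The limit behind the mathematical rule can be sketched as follows. This is the standard textbook argument rather than something shown on the slide; it assumes $\lambda = np$ is held fixed while $n$ grows, and it uses $b(x; n, p)$ for the binomial pmf.

```latex
\begin{aligned}
b(x; n, p) &= \binom{n}{x} p^{x} (1-p)^{n-x}, \qquad p = \frac{\lambda}{n} \\
&= \frac{n(n-1)\cdots(n-x+1)}{n^{x}} \cdot \frac{\lambda^{x}}{x!}
   \cdot \left(1 - \frac{\lambda}{n}\right)^{n} \left(1 - \frac{\lambda}{n}\right)^{-x} \\
&\longrightarrow 1 \cdot \frac{\lambda^{x}}{x!} \cdot e^{-\lambda} \cdot 1
   = \frac{e^{-\lambda} \lambda^{x}}{x!} \qquad \text{as } n \to \infty .
\end{aligned}
```

For a fixed count $x$, the first factor tends to 1 because it is a product of $x$ terms each tending to 1, and the third factor is the usual limit defining $e^{-\lambda}$.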

Page 5: New York Times-Feb, 24 th  2009

The Poisson distribution Section 3.6

1. Approximating the Binomial distribution

1) Identify the experiment of interest and understand it well (including the associated population)

A binomial experiment with large n and small p that conforms to the rules above.

Page 6: New York Times-Feb, 24 th  2009

2) Identify the sample space (all possible outcomes)

Interested in counting the number of successes (S), so we can go directly to:

S = {0, 1, 2, 3, …}

The Poisson distribution Section 3.6

Avoiding the tediousness of listing all successes and failures.

1. Approximating the Binomial distribution

Page 7: New York Times-Feb, 24 th  2009

3) Identify an appropriate random variable that reflects what you are studying.

It is a one-to-one mapping of the above S!

Snew = S = {0, 1, 2, 3, …}

The Poisson distribution Section 3.6

1. Approximating the Binomial distribution

Page 8: New York Times-Feb, 24 th  2009

4) Construct the probability distribution associated with the simple events based on the random variable

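pmf: $p(x; \lambda) = \dfrac{e^{-\lambda} \lambda^{x}}{x!}$, for $x = 0, 1, 2, \ldots$

Notation for the Poisson

Poisson random variable X = the number of successes (S).

We say X is distributed Poisson with parameter $\lambda$, written $X \sim \text{Poisson}(\lambda)$, where $\lambda > 0$.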

The Poisson distribution Section 3.6

1. Approximating the Binomial distribution

Page 9: New York Times-Feb, 24 th  2009

The Poisson distribution Section 3.6

Using the command dpois(x, lambda = 8) in R, the pmf looks like this:

[Figure: pmf plot of P(X = x) against x, for x from 0 to 25, with the vertical axis running from 0.00 to 0.14]

1. Approximating the Binomial distribution
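As a minimal sketch of what `dpois` returns, the Poisson pmf can also be computed by hand from $e^{-\lambda} \lambda^{x} / x!$ and compared with the built-in function. The helper name `poisson_pmf` is hypothetical, and the range 0 to 25 is taken from the plot's horizontal axis.

```r
# Poisson pmf computed directly from the formula e^(-lambda) * lambda^x / x!
# (poisson_pmf is a hypothetical helper written for this illustration)
poisson_pmf <- function(x, lambda) {
  exp(-lambda) * lambda^x / factorial(x)
}

x <- 0:25          # same range as the horizontal axis of the plot
lambda <- 8        # same parameter as in the dpois() call on the slide

by_hand  <- poisson_pmf(x, lambda)
built_in <- dpois(x, lambda = lambda)

# The two columns should agree up to floating-point error
print(data.frame(x = x,
                 by_hand  = round(by_hand, 4),
                 built_in = round(built_in, 4)))

# Vertical-line plot of the pmf, similar in spirit to the slide's figure
plot(x, built_in, type = "h", xlab = "x", ylab = "P(X = x)")
```

In practice there is no reason to prefer the hand-written version; it is shown only to make the formula concrete.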

Page 10: New York Times-Feb, 24 th  2009

4) Construct the probability distribution associated with the simple events based on the random variable

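CDF: $F(x; \lambda) = P(X \le x) = \sum_{y=0}^{x} \dfrac{e^{-\lambda} \lambda^{y}}{y!}$

Tabulated in Table A.2, page 667

Mean: $E(X) = \lambda$

Variance: $V(X) = \lambda$

Standard deviation: $\sigma_X = \sqrt{\lambda}$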

The Poisson distribution Section 3.6
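The CDF, mean, and variance of a Poisson random variable can be checked numerically. The sketch below assumes an illustrative parameter value of 8 and an arbitrary cut-off of 200 for the infinite sums; neither choice comes from this slide.

```r
lambda <- 8                      # illustrative value, an assumption for this sketch

# CDF: P(X <= 5) as a running sum of the pmf, and via the built-in ppois()
cdf_by_sum <- sum(dpois(0:5, lambda))
cdf_direct <- ppois(5, lambda)

# Mean and variance from the pmf, truncating the infinite sum at x = 200
x <- 0:200
p <- dpois(x, lambda)
mean_x <- sum(x * p)
var_x  <- sum((x - mean_x)^2 * p)
sd_x   <- sqrt(var_x)

print(c(cdf_by_sum = cdf_by_sum, cdf_direct = cdf_direct))
print(c(mean = mean_x, variance = var_x, sd = sd_x))
```

Both the mean and the variance come out equal to the parameter, which is the defining feature of the Poisson distribution.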

Page 11: New York Times-Feb, 24 th  2009

The Poisson distribution Section 3.6

Example: Forensics in evolution!

Say we have two virus DNA sequences. We have no idea where these sequences came from, though we know that they represent the same gene. The length of this gene is 500 bp (base pairs). We also know, from observing the evolutionary process, that if the two sequences come from the same viral species, the chance that any given base pair differs between them (due to mutation) is 0.004. (This value was chosen so that we can use the tables; in reality it is usually about 0.00001.)

AACTTTTGTTAAACCCTTTT… DNA Sequence 1
AACTTTTGTTAAACCCTGTT… DNA Sequence 2

Page 12: New York Times-Feb, 24 th  2009

1) Identify the experiment of interest and understand it well (including the associated population)

The Poisson distribution Section 3.6

One can think of this experiment as obtaining a set of matched base pairs of length n (= 500 in this case) out of a large set representing the whole genome (usually N > 6000 bp for viruses).

The mutation rate (mutation probability p = 0.004) is determined from an entire population of viruses and is independent of this particular genome.

So we can justify the use of the Binomial as a model with n = 500 and p = 0.004.

Page 13: New York Times-Feb, 24 th  2009

1) Identify the experiment of interest and understand it well (including the associated population)

The Poisson distribution Section 3.6

But n is large (n > 50) and p is small, with np < 5, so we can simplify life and approximate using the Poisson with $\lambda = np = 500 \times 0.004 = 2$.
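To see how close the approximation is for this example, one can compare the exact binomial probabilities with the Poisson ones. This is only a sketch: the range of counts (0 to 8) is an illustrative choice that does not appear on the slides.

```r
n <- 500          # sequence length in base pairs (from the example)
p <- 0.004        # per-base-pair probability of a difference (from the example)
lambda <- n * p   # Poisson parameter, equal to 2

x <- 0:8          # illustrative range of counts; an assumption, not from the slides
binom_exact    <- dbinom(x, size = n, prob = p)
poisson_approx <- dpois(x, lambda = lambda)

print(data.frame(x        = x,
                 binomial = round(binom_exact, 4),
                 poisson  = round(poisson_approx, 4),
                 abs_diff = signif(abs(binom_exact - poisson_approx), 2)))
```

The two columns should agree to about three decimal places, which is why the tabulated Poisson values are an acceptable stand-in for the binomial here.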

Page 14: New York Times-Feb, 24 th  2009

2) Identify the sample space (all possible outcomes)

3) Identify an appropriate random variable that reflects what you are studying.

4) Construct the probability distribution associated with the simple events based on the random variable

pmf: $p(x; 2) = \dfrac{e^{-2}\, 2^{x}}{x!}$, for $x = 0, 1, 2, \ldots$

The Poisson distribution Section 3.6

Page 15: New York Times-Feb, 24 th  2009

The Poisson distribution Section 3.6

Page 16: New York Times-Feb, 24 th  2009

2. As a model describing the Poisson process

The Poisson distribution Section 3.6

This is a process of counting events, usually over time.

Assumptions of this process:

a. There exists a parameter $\alpha > 0$ such that, for any short time interval of length $\Delta t$, the probability of observing exactly one event is $\alpha \Delta t + o(\Delta t)$,

b. There is a very small chance, $o(\Delta t)$, that 2 or more events will occur in $\Delta t$,

c. The number of events observed in $\Delta t$ is independent of the number occurring in any other period.

Page 17: New York Times-Feb, 24 th  2009

2. As a model describing the Poisson process

The Poisson distribution Section 3.6

[Figure: timeline marking a time period of length t]

Page 18: New York Times-Feb, 24 th  2009

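2. As a model describing the Poisson process

The Poisson distribution Section 3.6

$o(\Delta t)$ is a very small value, relative to the interval length $\Delta t$, such that $o(\Delta t)/\Delta t \to 0$ very fast as $\Delta t \to 0$.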
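The assumptions of the Poisson process can be illustrated with a small simulation: cut the time period into many short slices, let each slice hold at most one event, and count the events. The rate, period length, slice width, and number of repetitions below are all hypothetical values chosen for the sketch.

```r
set.seed(1)              # fixed seed so the run is repeatable

alpha <- 5               # assumed rate of events per unit time
t_len <- 2               # assumed length of the observation period
dt    <- 0.001           # assumed width of each small slice of time
n_slices <- round(t_len / dt)

# Each slice holds one event with probability alpha * dt and none otherwise
# (the chance of two or more is ignored), and slices are independent.
# The total over n_slices independent slices is therefore a binomial count.
reps   <- 10000
counts <- rbinom(reps, size = n_slices, prob = alpha * dt)

# Compare simulated relative frequencies with the Poisson(alpha * t_len) pmf
x <- 0:25
simulated <- as.numeric(table(factor(counts, levels = x))) / reps
print(data.frame(x         = x,
                 simulated = round(simulated, 4),
                 poisson   = round(dpois(x, alpha * t_len), 4)))
print(c(mean = mean(counts), variance = var(counts)))
```

The simulated frequencies should track the Poisson pmf closely, and the sample mean and variance should both land near the rate multiplied by the period length. This ties the second route to the Poisson model back to the first: many slices, each with a small success probability.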

Page 19: New York Times-Feb, 24 th  2009

The Poisson distribution Section 3.6

1) Identify the experiment of interest and understand it well (including the associated population)

A Poisson process where we are counting the number of successes (S) over a time period t.

2. As a model describing the Poisson process

Rate of success per unit time is $\alpha$

Page 20: New York Times-Feb, 24 th  2009

2) Identify the sample space (all possible outcomes)

Interested in counting the number of successes (S) within a time interval t, so we can go directly to:

S = {0, 1, 2, 3, …}

The Poisson distribution Section 3.6

Avoiding the tediousness of listing all successes and failures.

2. As a model describing the Poisson process

Page 21: New York Times-Feb, 24 th  2009

3) Identify an appropriate random variable that reflects what you are studying.

It is a one-to-one mapping of the above S!

Snew = S = {0, 1, 2, 3, …}

The Poisson distribution Section 3.6

Within time period t

2. As a model describing the Poisson process

Page 22: New York Times-Feb, 24 th  2009

4) Construct the probability distribution associated with the simple events based on the random variable

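pmf: $p(x; \alpha t) = \dfrac{e^{-\alpha t} (\alpha t)^{x}}{x!}$, for $x = 0, 1, 2, \ldots$

Poisson random variable X = the number of successes (S) within time period t.

We say X is distributed Poisson with parameter $\alpha t$, written $X \sim \text{Poisson}(\alpha t)$.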

The Poisson distribution Section 3.6

2. As a model describing the Poisson process

Page 23: New York Times-Feb, 24 th  2009

The Poisson distribution Section 3.6

Using the command dpois(x, lambda = 8) in R, that is, with $\alpha t = 8$, the pmf looks like this:

[Figure: pmf plot of P(X = x) against x, for x from 0 to 25, with the vertical axis running from 0.00 to 0.14]

Page 24: New York Times-Feb, 24 th  2009

4) Construct the probability distribution associated with the simple events based on the random variable

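CDF: $F(x; \alpha t) = P(X \le x) = \sum_{y=0}^{x} \dfrac{e^{-\alpha t} (\alpha t)^{y}}{y!}$

Tabulated in Table A.2, page 667

Mean: $E(X) = \alpha t$

Variance: $V(X) = \alpha t$

Standard deviation: $\sigma_X = \sqrt{\alpha t}$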

The Poisson distribution Section 3.6

Page 25: New York Times-Feb, 24 th  2009

Example: The mean number of cars passing the Sixth and Mountain View intersection, close to the edge of Moscow, is 5 per hour.

The Poisson distribution Section 3.6

Find the probability of observing more than 15 cars pass by that intersection in 2 hours.

Find the chance of observing fewer than 6 cars pass through in 3 hours.

What is the mean number of cars you expect to observe pass through in 4 hours? What is the standard deviation?
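One way to work these questions in R is sketched below. It assumes the cars arrive according to a Poisson process, so the count over t hours is Poisson with parameter 5t; the variable names are illustrative.

```r
alpha <- 5                      # mean number of cars per hour (from the example)

# More than 15 cars in 2 hours: X ~ Poisson(alpha * 2), P(X > 15) = 1 - P(X <= 15)
p_more_than_15 <- 1 - ppois(15, lambda = alpha * 2)

# Fewer than 6 cars in 3 hours: X ~ Poisson(alpha * 3), P(X < 6) = P(X <= 5)
p_fewer_than_6 <- ppois(5, lambda = alpha * 3)

# Mean and standard deviation of the count over 4 hours
mean_4h <- alpha * 4
sd_4h   <- sqrt(alpha * 4)

print(c(p_more_than_15 = p_more_than_15, p_fewer_than_6 = p_fewer_than_6))
print(c(mean_4h = mean_4h, sd_4h = sd_4h))
```

The key step in each part is rescaling the parameter to the length of the period before looking up the probability. For the last part the mean is 20 cars and the standard deviation is the square root of 20, about 4.47.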

Page 26: New York Times-Feb, 24 th  2009

We can get to the Poisson model in two ways:

1. As an approximation of the Binomial distribution

2. As a model describing the Poisson process

The Poisson distribution Section 3.6

3. From a data perspective: plot the data; if it is count data whose variation increases as the count increases, then it can be modeled using a Poisson distribution. We’ll keep this in mind until later.
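The pattern described in the third point, variation growing with the count, can be seen in simulated data. The sketch below uses hypothetical mean counts and a hypothetical sample size; it illustrates the pattern and is not a formal test for whether real data are Poisson.

```r
set.seed(42)                       # fixed seed so the run is repeatable

means <- c(1, 5, 20, 50)           # hypothetical mean counts, chosen for illustration
n_obs <- 5000                      # hypothetical number of observations per group

# For each mean, simulate Poisson counts and record the sample mean and variance
summary_rows <- lapply(means, function(m) {
  y <- rpois(n_obs, lambda = m)
  data.frame(true_mean = m, sample_mean = mean(y), sample_var = var(y))
})
result <- do.call(rbind, summary_rows)
print(result)
```

In each group the sample variance should sit close to the sample mean, so groups with larger counts show proportionally larger spread.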