bus005week3 ho (1)

7/27/2019 Bus005Week3 HO (1)


1/22/2013


BUS 005: Quantitative Research Methods for Business

Lecture 3: The random variable and discrete probability distributions

Sanghamitra Bandyopadhyay
School of Business and Management

Quantitative Research Methods

Describe the data
1. Displaying graphs
2. Descriptive statistics

3. Probability

4. Sampling

Inference
5. Confidence intervals
6. Test of hypothesis

7. Modelling

Random Variables: used to describe the outcomes of an experiment

A random variable takes a value for each possible event of an experiment.
• It is random, indicating the uncertainty of its value, which we don’t know until the experiment has taken place.
• It is usually represented by an uppercase letter: X, Y, Z.

Example. Experiment: flip a coin {heads, tails}; random variable: X = {1, 0}. Or {head, tail}.

Example. Experiment: flip a coin repeatedly, 10 times. The events are now “the number of heads when flipping a coin 10 times”. Random variable: X = {0, 1, 2, 3, …, 10}


Random Variable types

• Discrete random variables produce outcomes that come from a counting process (e.g. number of classes you are taking).

• Continuous random variables produce outcomes that come from a measurement (e.g. annual salary, weight). They can take any value in a range of values.

Analogy: Integers are Discrete, while Real Numbers are Continuous

Discrete Random Variables

Examples

Experiment                                Random Variable      Possible Values
Count cars at toll between 11:00 & 1:00   # Cars arriving      0, 1, 2, ..., ∞
Make 100 sales calls                      # Sales              0, 1, 2, ..., 100
Inspect 70 radios                         # Defective          0, 1, 2, ..., 70
Answer 33 questions                       # Correct            0, 1, 2, ..., 33

Continuous Random Variables

Examples

Experiment                                Random Variable      Possible Values
Measure time between arrivals             Inter-arrival time   0, 1.3, 2.78, ...
Weigh 100 people                          Weight               45.1, 78, ...
Measure part life                         Hours                900, 875.9, ...
Amount spent on food                      £ amount             54.12, 42, ...


How to describe an experiment and its possible results?

Describing an experiment

[Diagram: Probability Distributions, split into Discrete Probability Distributions and Continuous Probability Distributions]

Probability Distributions

A probability distribution consists of the values of a random variable and the probability associated with these values.

Corresponding to the two types of random variables (discrete or continuous), we have two types of probability distributions:

– Discrete Probability Distribution
– Continuous Probability Distribution

Probability Notation…

A random variable is written with an upper-case letter (X). When we use its lower-case counterpart (x), we are representing a particular value of the random variable.

The probability that the random variable X will equal x is: P(X = x) or just P(x)

E.g.: P(Achieving a 2B) = P(X = 2B) = P(2) = P(49 < mark < 59)


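Example 3. TV per household

Probability distributions can be estimated from relative frequencies.

E.g. P(X = 0) = P(0) = 1,218 ÷ 101,501 = 0.012

E.g. P(X = 4) = P(4) = 0.076 = 7.6%

[Figure: bar chart of P(x) against x = 0, 1, 2, 3, 4, 5; vertical axis marked from 0.1 to 0.4]

Discrete Probability Distributions… Remember the rules of probability:

$0 \le P(x_i) \le 1$ for all $x_i$, and $\sum_i P(x_i) = 1$

E.g. what is the probability that there is at least one television but no more than three in any given household? These events are mutually exclusive, so their probabilities add:

$P(1 \le X \le 3) = P(1) + P(2) + P(3)$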
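As a rough illustration of estimating a distribution from relative frequencies, the sketch below uses the household counts listed in the TV example (taking 19,387 households for three TVs, the figure consistent with the stated total of 101,501). The variable names are illustrative only.

```python
# Household counts by number of TVs (total 101,501 households).
counts = {0: 1218, 1: 32379, 2: 37961, 3: 19387, 4: 7714, 5: 2842}
total = sum(counts.values())

# Relative frequencies serve as estimates of P(x).
P = {x: n / total for x, n in counts.items()}

# X = 1, X = 2 and X = 3 are mutually exclusive, so their probabilities add.
p_1_to_3 = P[1] + P[2] + P[3]

for x, p in P.items():
    print(f"P({x}) = {p:.3f}")
print(f"P(1 <= X <= 3) = {p_1_to_3:.3f}")
```

The same dictionary can be reused to check the two rules of probability: every value lies between 0 and 1, and the values sum to 1.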

Population/Probability Distribution…

When we calculate proportions using sample data we do not call them probabilities but just relative frequencies.

Population features are described by computing parameters, e.g. the population mean and population variance.


Population Mean (Expected Value)

The population mean is the weighted average of all of its values. The weights are the probabilities.

This parameter is also called the expected value of X and is represented by E(X).

Example 3b: How many TVs in a typical US household?

Value x     Cases
0           1,218
1           32,379
2           37,961
3           19,387
4           7,714
5           2,842
Total (N)   101,501

$$\mu = \frac{x_1 + x_2 + \dots + x_N}{N} = \frac{0 + \dots + 0 + 1 + \dots + 1 + 2 + \dots + 2 + \dots + 5 + \dots + 5}{101501}$$

$$= \frac{0 \cdot 1218 + 1 \cdot 32379 + 2 \cdot 37961 + 3 \cdot 19387 + 4 \cdot 7714 + 5 \cdot 2842}{101501}$$

$$= 0 \cdot \frac{1218}{101501} + 1 \cdot \frac{32379}{101501} + 2 \cdot \frac{37961}{101501} + 3 \cdot \frac{19387}{101501} + 4 \cdot \frac{7714}{101501} + 5 \cdot \frac{2842}{101501}$$

$$= 0 \cdot P(0) + 1 \cdot P(1) + 2 \cdot P(2) + 3 \cdot P(3) + 4 \cdot P(4) + 5 \cdot P(5)$$
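The equivalence between averaging the raw observations and weighting each value by its probability can be checked numerically. This is a minimal sketch, assuming the household counts of the TV example (with 19,387 households for three TVs, which matches the total of 101,501); the names are illustrative.

```python
# Household counts by number of TVs.
counts = {0: 1218, 1: 32379, 2: 37961, 3: 19387, 4: 7714, 5: 2842}
N = sum(counts.values())

# Route 1: add up every observation and divide by N.
mean_from_counts = sum(x * n for x, n in counts.items()) / N

# Route 2: weight each value by its probability P(x) = n / N.
P = {x: n / N for x, n in counts.items()}
mean_from_probs = sum(x * p for x, p in P.items())

print(f"mean from counts        = {mean_from_counts:.3f}")
print(f"mean from probabilities = {mean_from_probs:.3f}")
```

Both routes give the same expected value, a little over two TVs per household.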


Population Variance…

The population variance is calculated similarly. It is the weighted average of the squared deviations from the mean:

$$V(X) = \sigma^2 = \sum_x (x - \mu)^2 P(x)$$

The standard deviation is the same as before:

$$\sigma = \sqrt{\sigma^2}$$

Example 2b. Discrete Random Variable Probability Distribution

Experiment: Toss 2 coins. Let X = # heads.

4 possible outcomes: TT, TH, HT, HH
3 possible events: X = 0, X = 1, X = 2

Probability Distribution

X Value   Probability
0         1/4 = 0.25
1         2/4 = 0.50
2         1/4 = 0.25

[Figure: bar chart of probability against X = 0, 1, 2, with bars of height 0.25, 0.50 and 0.25]

Example 2b: Summary Measures Calculation Table

x        p(x)     x p(x)    x – μ    (x – μ)²    (x – μ)² p(x)
0        0.25     0         -1       1           0.25
1        0.50     0.50      0        0           0
2        0.25     0.50      1        1           0.25
Total             1                              0.50

mean: μ = Σ x p(x) = 1
variance: σ² = Σ (x – μ)² p(x) = 0.50
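The two-coin table can be reproduced by enumerating the four equally likely outcomes. The sketch below is one way to do that, using exact fractions; the helper names are illustrative and not part of the lecture.

```python
from fractions import Fraction
from itertools import product

# Enumerate the 4 equally likely outcomes of tossing 2 coins.
outcomes = list(product("HT", repeat=2))

# X = number of heads; accumulate P(x) over the outcomes.
dist = {}
for outcome in outcomes:
    x = outcome.count("H")
    dist[x] = dist.get(x, Fraction(0)) + Fraction(1, len(outcomes))

# Mean: sum of x * p(x). Variance: sum of (x - mean)^2 * p(x).
mean = sum(x * p for x, p in dist.items())
variance = sum((x - mean) ** 2 * p for x, p in dist.items())

for x in sorted(dist):
    print(x, dist[x], x * dist[x], x - mean, (x - mean) ** 2 * dist[x])
print("mean =", mean, "variance =", variance)
```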


Laws of Expected Value…

E(c) = c

The expected value of a constant (c) is just the value of the constant.

E(X + c) = E(X) + c
E(b . X) = b . E(X)

We can “pull” a constant out of the expected value expression (either as part of a sum with a random variable X or as a coefficient of random variable X).

Example 5a
Monthly sales in a shop have a mean of £25,000 and a standard deviation of £4,000. Variable costs represent 70% of sales; fixed monthly costs are £6,000.

Find the mean monthly profit.

1) Describe the problem statement in algebraic terms:

Sales have a mean of £25,000: E(Sales) = 25,000

Profit = Sales – variable costs – fixed costs
Profit = Sales – 0.70 Sales – 6,000
Profit (= Y) = 0.30(Sales) – 6,000

2) Apply the laws of expected value: if Y = b . X + c, then E(Y) = b . E(X) + c. Note that here c = –6,000 is a negative number.

E(Profit) = E[0.30 . (Sales) – 6,000]
= 0.30 . E(Sales) – 6,000
= 0.30 . (25,000) – 6,000 = 1,500

Thus, the mean monthly profit is £1,500.


Laws of Variance: linear transformation

V(c) = 0

The variance of a constant (c) is zero.

V(X + c) = V(X)
V(b . X) = b² . V(X)

Example 5b. Find the standard deviation of monthly profits.

E(Sales) = £25,000; standard deviation = £4,000. Profits are calculated as in Example 5a.

1) Describe the problem statement in algebraic terms:

Sales have a standard deviation of £4,000. Remember: $\sigma = \sqrt{V(X)}$; then

V(Sales) = 4,000² = 16,000,000

Profits are calculated by: Profit = 0.30(Sales) – 6,000

2) The variance of profit is V(Profit) = V[0.30(Sales) – 6,000]

If Y = b . X + c, then V(Y) = b² . V(X)

V(Profit) = V[0.30 . (Sales) – 6,000]
= 0.30² . V(Sales)
= 0.09 . (16,000,000) = 1,440,000

Again, the standard deviation is the square root of the variance, so

$\sigma_{Profit} = \sqrt{V(Profit)} = (1,440,000)^{1/2}$ = £1,200
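The laws for a linear transformation Y = b . X + c can be wrapped in a small helper and applied to the monthly profit example. This is only a sketch; the function name `linear_transform_moments` is hypothetical and introduced here for illustration.

```python
import math

def linear_transform_moments(mean_x, sd_x, b, c):
    """Return (mean, standard deviation) of Y = b*X + c.

    Uses E(Y) = b*E(X) + c and V(Y) = b**2 * V(X); the constant c
    shifts the mean but does not affect the variance.
    """
    mean_y = b * mean_x + c
    var_y = b ** 2 * sd_x ** 2
    return mean_y, math.sqrt(var_y)

# Profit = 0.30 * Sales - 6,000, with E(Sales) = 25,000 and SD(Sales) = 4,000.
mean_profit, sd_profit = linear_transform_moments(25_000, 4_000, 0.30, -6_000)

print(f"mean profit = {mean_profit:,.0f}")
print(f"sd of profit = {sd_profit:,.0f}")
```

Note that the coefficient is squared before multiplying the variance: 0.30² = 0.09, and 0.09 × 16,000,000 = 1,440,000.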



Bivariate Distributions…

Up to now, we have looked at univariate distributions, i.e. probability distributions in one variable.

As you might guess, bivariate distributions are probabilities of combinations of two variables. They are also called joint probability distributions.

A joint probability distribution of X and Y is a table or formula that lists the joint probabilities for all pairs of values x and y, and is denoted P(x,y).

P(x,y) = P(X=x and Y=y)

Discrete Bivariate Distribution…

As you might expect, the requirements for a bivariate distribution are similar to those for a univariate distribution, with only minor changes to the notation:

$0 \le P(x,y) \le 1$ for all pairs (x,y), and $\sum_x \sum_y P(x,y) = 1$


Bivariate distributions

Example 6

Xavier and Yvette are real estate agents.
X = number of houses sold by Xavier in a month;
Y = number of houses sold by Yvette in a month.
An analysis of their past monthly performances gives the following joint probabilities (bivariate probability distribution):

P(x,y)    x = 0    x = 1    x = 2
y = 0     .12      .42      .06
y = 1     .21      .06      .03
y = 2     .07      .02      .01

Marginal Probabilities…

We calculate the marginal probabilities by summing across rows and down columns to determine the probabilities of X and Y individually:

P(x,y)    x = 0    x = 1    x = 2    P(Y=y)
y = 0     .12      .42      .06      .60
y = 1     .21      .06      .03      .30
y = 2     .07      .02      .01      .10
P(X=x)    .40      .50      .10      1.00

E.g. the probability that Xavier sells 1 house = P(X=1) = 0.50

Describing the Bivariate Distribution…

We can describe the mean, variance, and standard deviation of each variable in a bivariate distribution by working with the marginal probabilities (same formulae as for univariate distributions):

x    P(x)    x . P(x)
0    0.4     0
1    0.5     0.5
2    0.1     0.2

E(X) = 0.7


Covariance…

Definition. Covariance of two discrete variables:

$$COV(X,Y) = \sigma_{XY} = \sum_x \sum_y (x - \mu_X)(y - \mu_Y) P(x,y)$$

• The covariance measures the strength of the linear relationship between two discrete random variables X and Y.

• A positive covariance indicates a positive relationship.

• A negative covariance indicates a negative relationship.

It depends on the values and units of measure of X and Y, and it is not constrained to be between -1 and 1.

Coefficient of Correlation (rho)

$$\rho = \frac{\sigma_{XY}}{\sigma_X \sigma_Y}, \qquad -1 \le \rho \le 1$$

Example 6b (cont.)
Compute the covariance and the coefficient of correlation between the numbers of houses sold by Xavier and Yvette.

COV(X,Y) = (0 – .7)(0 – .5)(.12) + (1 – .7)(0 – .5)(.42) + (2 – .7)(0 – .5)(.06)
+ (0 – .7)(1 – .5)(.21) + (1 – .7)(1 – .5)(.06) + (2 – .7)(1 – .5)(.03)
+ (0 – .7)(2 – .5)(.07) + (1 – .7)(2 – .5)(.02) + (2 – .7)(2 – .5)(.01)
= –.15


$\sigma_{XY} = -0.15$

$\rho = \sigma_{XY} / (\sigma_X \sigma_Y)$ = –0.15 ÷ [(.64)(.67)] = –.35

There is a weak, negative relationship between the two variables.
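The marginal probabilities, covariance and correlation for Xavier and Yvette can be recomputed from the joint probabilities that appear in the covariance calculation. The following is a minimal sketch under that assumption; the helper functions are illustrative names, not part of the lecture.

```python
import math

# Joint probabilities P(x, y), keyed by (x, y), as used in the covariance sum.
joint = {
    (0, 0): 0.12, (1, 0): 0.42, (2, 0): 0.06,
    (0, 1): 0.21, (1, 1): 0.06, (2, 1): 0.03,
    (0, 2): 0.07, (1, 2): 0.02, (2, 2): 0.01,
}

def marginal(joint, axis):
    """Sum the joint probabilities over the other variable."""
    out = {}
    for pair, p in joint.items():
        out[pair[axis]] = out.get(pair[axis], 0.0) + p
    return out

def mean(dist):
    return sum(v * p for v, p in dist.items())

def variance(dist):
    mu = mean(dist)
    return sum((v - mu) ** 2 * p for v, p in dist.items())

px, py = marginal(joint, 0), marginal(joint, 1)
mu_x, mu_y = mean(px), mean(py)

# Covariance: probability-weighted product of deviations from the means.
cov_xy = sum((x - mu_x) * (y - mu_y) * p for (x, y), p in joint.items())
# Correlation: covariance scaled by the two standard deviations.
rho = cov_xy / math.sqrt(variance(px) * variance(py))

print(f"E(X) = {mu_x:.2f}, E(Y) = {mu_y:.2f}")
print(f"V(X) = {variance(px):.2f}, V(Y) = {variance(py):.2f}")
print(f"COV(X,Y) = {cov_xy:.2f}, rho = {rho:.2f}")
```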

Probability Distribution of the Sum of Two Variables…

The bivariate distribution allows us to develop the probability distribution of any combination of the two variables. Of particular interest is the sum of two variables (z = x + y = total houses sold).

z       0      1      2      3      4
P(z)    .12    .63    .19    .05    .01

“What is the probability that three houses are sold?”
P(X+Y=3) = P(2,1) + P(1,2) = 0.03 + 0.02 = 0.05


Likewise, we can compute the expected value, variance, and standard deviation of X+Y in the usual way…

E(X + Y) = 0(.12) + 1(.63) + 2(.19) + 3(.05) + 4(.01) = 1.2

V(X + Y) = (0 – 1.2)²(.12) + … + (4 – 1.2)²(.01) = .56

$\sigma_{X+Y} = \sqrt{V(X + Y)} = \sqrt{.56} = .75$
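The distribution of the total Z = X + Y can be built by adding up the joint probabilities of all pairs with the same sum. This sketch assumes the joint probabilities used in the covariance calculation; variable names are illustrative.

```python
import math
from collections import defaultdict

# Joint probabilities P(x, y), keyed by (x, y).
joint = {
    (0, 0): 0.12, (1, 0): 0.42, (2, 0): 0.06,
    (0, 1): 0.21, (1, 1): 0.06, (2, 1): 0.03,
    (0, 2): 0.07, (1, 2): 0.02, (2, 2): 0.01,
}

# P(z): sum P(x, y) over all pairs with x + y = z.
pz = defaultdict(float)
for (x, y), p in joint.items():
    pz[x + y] += p

mean_z = sum(z * p for z, p in pz.items())
var_z = sum((z - mean_z) ** 2 * p for z, p in pz.items())
sd_z = math.sqrt(var_z)

for z in sorted(pz):
    print(f"P(Z = {z}) = {pz[z]:.2f}")
print(f"E(Z) = {mean_z:.2f}, V(Z) = {var_z:.2f}, SD(Z) = {sd_z:.2f}")
```

The variance obtained this way agrees with V(X) + V(Y) + 2COV(X, Y) = 0.41 + 0.45 + 2(-0.15).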

• A probability distribution is an equation that
1. associates a particular probability of occurrence with each outcome in the sample space.
2. measures outcomes and assigns values of X to the simple events.
3. assigns a value to the variability in the sample space.
4. assigns a value to the center of the sample space.


• The covariance
1. must be between -1 and +1.
2. must be positive.
3. can be positive or negative.
4. must be less than +1.

Laws of expectation and variance of the sum

We can derive laws of expected value and variance for the sum of two variables as follows…

E(X + Y) = E(X) + E(Y) = 0.7 + 0.5 = 1.2

V(X + Y) = V(X) + V(Y) + 2COV(X, Y) = 0.41 + 0.45 + 2(-0.15) = 0.56

If X and Y are independent, COV(X, Y) = 0, and then V(X + Y) = V(X) + V(Y):

$$\sigma_{X+Y}^2 = \sigma_X^2 + \sigma_Y^2$$

Generalization: Laws of expectation and variance of the linear combination

We can derive laws of expected value and variance for a linear combination of two variables as follows…

E(aX + bY) = a . E(X) + b . E(Y)

V(aX + bY) = a² . V(X) + b² . V(Y) + 2 a b COV(X, Y)

If X and Y are independent: $\sigma_{aX+bY}^2 = \sigma_{aX}^2 + \sigma_{bY}^2 = a^2 \sigma_X^2 + b^2 \sigma_Y^2$


Portfolio Expected Return and Portfolio Risk

• Two assets X and Y, with a proportion w invested in X.

• Portfolio expected return (weighted average return):

$$E(P) = w\,E(X) + (1 - w)\,E(Y)$$

• Portfolio risk (weighted variability):

$$\sigma_P = \sqrt{w^2 \sigma_X^2 + (1 - w)^2 \sigma_Y^2 + 2w(1 - w)\sigma_{XY}}$$

where w = proportion of portfolio value in asset X
(1 - w) = proportion of portfolio value in asset Y

Portfolio Example

Investment X: μX = 50, σX = 43.30
Investment Y: μY = 95, σY = 193.71
σXY = 8250

Suppose 40% of the portfolio is in Investment X and 60% is in Investment Y:

$$E(P) = 0.4(50) + 0.6(95) = 77$$

$$\sigma_P = \sqrt{(0.4)^2 (43.30)^2 + (0.6)^2 (193.71)^2 + 2(0.4)(0.6)(8250)} = 133.30$$
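The portfolio formulas are a special case of the linear combination laws with a = w and b = 1 - w. The sketch below applies them to the example, assuming σY = 193.71, the value that appears in the slide's own risk calculation; the function name `portfolio_moments` is hypothetical.

```python
import math

def portfolio_moments(w, mu_x, mu_y, sd_x, sd_y, cov_xy):
    """Expected return and risk of a two-asset portfolio.

    w is the proportion invested in asset X; (1 - w) goes to asset Y.
    """
    expected = w * mu_x + (1 - w) * mu_y
    var = (w ** 2 * sd_x ** 2
           + (1 - w) ** 2 * sd_y ** 2
           + 2 * w * (1 - w) * cov_xy)
    return expected, math.sqrt(var)

# 40% in Investment X, 60% in Investment Y.
e_p, sd_p = portfolio_moments(0.4, 50, 95, 43.30, 193.71, 8250)

print(f"E(P) = {e_p:.2f}")
print(f"sigma_P = {sd_p:.2f}")
```

Trying other values of w shows how the expected return and the risk move together as the mix changes.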

Probability Distributions

• Discrete Probability Distributions: Binomial, Poisson
• Continuous Probability Distributions: Normal, Uniform, Exponential

Lec. 5   Lec. 6


Mathematical Models of Probability Distributions

• Discrete Probability Distributions: Binomial, Poisson
• Continuous Probability Distributions: Normal, Uniform, Exponential

Lec. 5   Lec. 6

Binomial Distribution Properties

The binomial distribution is the probability distribution that results from doing a “binomial experiment”. Binomial experiments have the following properties:

1. Two different sampling methods
• Infinite population without replacement
• Finite population with replacement

2. Sequence of n identical trials

3. Each trial has 2 outcomes
• ‘Success’ (any of the two) or ‘Failure’

4. Constant trial probability of success = π; P(failure) = 1 – π

5. Trials are independent: the outcome of one trial does not affect the outcomes of any other trials.

Binomial: Possible Applications

• A manufacturing plant labels items as either defective or acceptable

• A firm bidding for contracts will either get a contract or not

• A marketing research firm receives survey responses of “yes I will buy” or “no I will not”

• New job applicants either accept the offer or reject it


Binomial Random Variable…

The binomial random variable counts the number of successes (X) in n trials of the binomial experiment. It can take on the values 0, 1, 2, …, n. Thus, it is a discrete random variable.

To calculate the probability associated with each value we use this formula:

$$P(x) = \frac{n!}{x!\,(n - x)!}\, \pi^x (1 - \pi)^{n - x}$$

for x = 0, 1, 2, …, n

where n! = 1 x 2 x 3 x … x n
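The binomial probability formula translates directly into a few lines of code. This is a minimal sketch using Python's standard `math.comb` for the number of combinations n!/(x!(n-x)!); the function name `binomial_pmf` is chosen here for illustration.

```python
import math

def binomial_pmf(x, n, pi):
    """P(X = x) for a binomial random variable with n trials
    and success probability pi on each trial."""
    return math.comb(n, x) * pi ** x * (1 - pi) ** (n - x)

# Example: 10 trials with success probability 0.20.
n, pi = 10, 0.20
for x in range(n + 1):
    print(f"P({x}) = {binomial_pmf(x, n, pi):.4f}")
```

Summing the probabilities over x = 0, 1, …, n gives 1, as any probability distribution must.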

Example: Don Qi

Don Qi’s exam strategy is to rely on luck for the next quiz. The quiz consists of 10 multiple-choice questions. Each question has five possible answers, only one of which is correct. Don plans to merely guess the answer to each question.
Algebraically then: n = 10, and P(success) = 1/5 = 0.20

Is this a binomial experiment? Check the conditions:
• There is a fixed finite number of trials (n = 10).
• An answer can be either correct or incorrect.
• The probability of a correct answer (P(success) = .20) does not change from question to question.
• Each answer is independent of the others.

Find out:
1. The probability that Don gets no answers correct.
2. The probability that Don gets at least two answers correct.
3. The probability that Don fails the quiz, which demands a minimum of 5 answers correct.


1. Probability that Don gets no answers correct?

n = 10, and P(success) = .20

i.e. # successes, x, = 0; hence we want to know P(x = 0):

$$P(0) = \frac{10!}{0!\,10!}\,(.20)^0 (.80)^{10} = .1074$$

Don has about an 11% chance of getting no answers correct using the guessing strategy.

In Excel

=BINOMDIST(0, 10, 0.20, FALSE)
=BINOMDIST(X, n, P, cumulative)

X = # successes
n = # trials
P = P(success)
cumulative (i.e. P(X ≤ x)?): TRUE = cumulative probability, P(X ≤ x); FALSE = probability distribution, P(X = x)

2. What is the probability that Don gets at least 2 answers correct?

n = 10, and P(success) = .20

i.e. # successes, x, ≥ 2; that is: P(x ≥ 2) = 1 – P(0) – P(1) = 1 – .1074 – .2684 = 0.6242

or: P(x ≥ 2) = 1 – P(x ≤ 1) = 1 – BINOMDIST(1, 10, 0.20, TRUE)


3. Probability that Don fails the quiz: x min = 5

Cumulative Probability…

Don fails if he gets at most 4 answers correct. This requires a cumulative probability, that is,

P(X at most 4) = P(X ≤ 4) = F(4) = P(0) + P(1) + P(2) + P(3) + P(4)

We already know P(0) = .1074. Using the binomial formula to calculate: P(1) = .2684, P(2) = .3020, P(3) = .2013, and P(4) = .0881

P(X ≤ 4) = .1074 + .2684 + … + .0881 = .9672

Thus, it’s about 97% probable that Don will fail the test using the luck strategy and guessing at answers…

[Figure: Don’s density function, probability P(x) against x = 0, 1, …, 10]

[Figure: Don’s cumulative distribution function, probability P(X ≤ x) against x = 0, 1, …, 10, rising to 1]

Binomial cdf

The binomial cdf gives cumulative probabilities P(X ≤ k) but, as we’ve seen in the last example,

P(X = k) = P(X ≤ k) – P(X ≤ [k–1])

Likewise, for probabilities given as P(X ≥ k), we have:
P(X ≥ k) = 1 – P(X ≤ [k–1])

3b. Probability that Don gets at least 5 answers correct?

P(X ≥ 5) = 1 – P(X ≤ 4) = 1 – .9672 = .0328
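The three questions about Don's quiz can be answered with a probability function and a cumulative function, in the spirit of the FALSE/TRUE switch in the Excel function. This is a sketch only; `binomial_pmf` and `binomial_cdf` are illustrative names.

```python
import math

def binomial_pmf(x, n, pi):
    """P(X = x) for n trials with success probability pi."""
    return math.comb(n, x) * pi ** x * (1 - pi) ** (n - x)

def binomial_cdf(k, n, pi):
    """P(X <= k): add up P(0), P(1), ..., P(k)."""
    return sum(binomial_pmf(x, n, pi) for x in range(k + 1))

n, pi = 10, 0.20  # 10 questions, 1 chance in 5 of guessing correctly

p_none = binomial_pmf(0, n, pi)            # 1. no answers correct
p_at_least_2 = 1 - binomial_cdf(1, n, pi)  # 2. P(X >= 2) = 1 - P(X <= 1)
p_fail = binomial_cdf(4, n, pi)            # 3. fewer than 5 correct
p_pass = 1 - p_fail                        # 3b. P(X >= 5) = 1 - P(X <= 4)

print(f"P(X = 0)  = {p_none:.4f}")
print(f"P(X >= 2) = {p_at_least_2:.4f}")
print(f"P(X <= 4) = {p_fail:.4f}")
print(f"P(X >= 5) = {p_pass:.4f}")
```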


DNA fingerprinting

• 1985: Prof. Alec Jeffreys (Leicester) proposes a procedure to produce pictures of an individual’s DNA, which may be compared across individuals. Only twins may have similar profiles.

• DNA fingerprints are now used in courts for forensic and paternity tests.

• Blood samples are treated with enzymes and exposed to an electric field to produce unique fragment sequences.

• A child’s bands match those of one of the parents, except for rare mutations, which occur with probability 1/300.

• Example: # of bands = 30; X = # of mutations. X ~ B(n=30, p=1/300)

The questions to ask are:
1. How many bands do not come from the mother?
2. How many of these are different from those of the alleged father?
3. What is the probability that those not coming from the alleged father are mutations? If this probability is too low (say below 5%), then we should reject that the alleged father is the father.

With X ~ B(n=30, p=1/300):
P(X ≥ 2) = 1 – P(0) – P(1) = 0.0045; P(X ≥ 3) = 0.0001.
So if there are two bands or more with no match we may reject that the alleged father is the father.

Some drawbacks
1. It’s not always straightforward to identify band matches.
2. The New York Times has reported gross discrepancies between different laboratories.
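The tail probabilities for the mutation count can be checked with the binomial formula. The sketch below assumes X ~ B(n=30, p=1/300) as stated above; the function and variable names are illustrative.

```python
import math

def binomial_pmf(x, n, p):
    """P(X = x) for n trials with success probability p."""
    return math.comb(n, x) * p ** x * (1 - p) ** (n - x)

n, p = 30, 1 / 300  # 30 bands, each mutating with probability 1/300

p0 = binomial_pmf(0, n, p)
p1 = binomial_pmf(1, n, p)
p2 = binomial_pmf(2, n, p)

# Probability that 2 or more (3 or more) unmatched bands are all mutations.
p_at_least_2 = 1 - p0 - p1
p_at_least_3 = 1 - p0 - p1 - p2

print(f"P(X >= 2) = {p_at_least_2:.4f}")
print(f"P(X >= 3) = {p_at_least_3:.4f}")
```

Both probabilities fall well below the 5% threshold, which is why two or more unmatched bands are taken as evidence against paternity.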


Mathematical Models of Probability Distributions

• Discrete Probability Distributions: Binomial, Poisson
• Continuous Probability Distributions: Normal, Uniform, Exponential

Poisson Distribution

1. Number of events that occur in an interval (or ‘area of opportunity’)
• events per unit
  – Time, Length, Area, Space

2. Examples
• Number of customers arriving in 20 minutes
• Number of strikes per year in the U.S.
• Number of defects per lot (group) of DVDs
• Number of exits per mile on a motorway

Note: the difference from the binomial is that X is now defined not as the number of successes in n trials, but as the number of successes in an ‘area of opportunity’.

Poisson Process

1. Constant event probability
• An average of 60/hr is 1/min for each of the 60 1-minute intervals

2. One event per interval
• Events don’t arrive together

3. Independent events
• The arrival of 1 person does not affect another’s arrival


Poisson Distribution…

The Poisson random variable is the number of successes that occur in a period of time or an interval of space in a Poisson experiment.

E.g. On average, 96 trucks (successes) arrive at a border crossing every hour (time period).

E.g. The number of typographic errors (successes?!) in a new textbook edition averages 1.5 per 100 pages (interval).

Poisson Distribution Example

Customers arrive at a rate of 72 per hour. What is the probability of 4 customers arriving in 3 minutes?

Poisson Distribution Solution

First, lambda and x must be defined for the same time period. x is defined per 3 minutes, so:

72 per hr. = 1.2 per min. = 3.6 per 3 min. interval

$$P(x) = \frac{e^{-\lambda} \lambda^x}{x!}$$

$$P(4) = \frac{e^{-3.6} (3.6)^4}{4!} = .1912$$
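The Poisson calculation can be written out step by step, starting with the conversion of the hourly rate to the 3-minute interval. This is a minimal sketch; `poisson_pmf` is an illustrative name.

```python
import math

def poisson_pmf(x, lam):
    """P(X = x) when events occur at an average of lam per interval."""
    return math.exp(-lam) * lam ** x / math.factorial(x)

# Express the rate in the same interval as x: 72 per hour -> per 3 minutes.
rate_per_hour = 72
lam = rate_per_hour / 60 * 3  # 1.2 per minute, 3.6 per 3-minute interval

p_4 = poisson_pmf(4, lam)
print(f"lambda = {lam:.1f}")
print(f"P(X = 4) = {p_4:.4f}")
```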


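Using Excel For The Poisson Distribution

How to calculate the probability of exactly 4 events in a period of time?

=POISSON(4, 3.6, FALSE)
=POISSON(X, λ, cumulative)

X = # successes
λ = mean/expected number of events
cumulative (i.e. P(X ≤ x)?): TRUE = cumulative probability, P(X ≤ x); FALSE = probability distribution, P(X = x)

For the probability of at most 4 events, P(X ≤ 4), set the last argument to TRUE.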
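For readers without Excel, the behaviour of the cumulative switch can be imitated in a few lines. The function below is a hypothetical stand-in written for illustration, not Excel's implementation: with `cumulative=False` it returns P(X = x), and with `cumulative=True` it returns P(X ≤ x).

```python
import math

def poisson(x, mean, cumulative):
    """Poisson probability for x events given the expected number `mean`.

    cumulative=False -> P(X = x); cumulative=True -> P(X <= x).
    """
    def pmf(k):
        return math.exp(-mean) * mean ** k / math.factorial(k)

    if cumulative:
        return sum(pmf(k) for k in range(x + 1))
    return pmf(x)

p_exactly_4 = poisson(4, 3.6, False)
p_at_most_4 = poisson(4, 3.6, True)

print(f"P(X = 4)  = {p_exactly_4:.4f}")
print(f"P(X <= 4) = {p_at_most_4:.4f}")
```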

Recommended readings

• Berenson, Levine, Krehbiel (or the Pearson custom textbook): Chapter 5, except for section 5.6.
