Introduction to Econometrics

Econ 184: Introduction to Econometrics (Fall 2015), Alexandre Poirier, Department of Economics, University of Iowa, 8/25/2015




Why Study Econometrics?

- Econometrics: the use of statistical methods and economic theory to study economic problems (using data).

- Study causal relationships between economic variables.

  - Economics is a quantitative and predictive science.
  - Economists often want to determine whether a change in one variable causes a change in another.
  - For example, does another year of schooling increase earnings?
  - Other examples:
    - Does patent protection help foster innovation?
    - Does a minimum wage lower employment?
    - Will "universal coverage" lower the quality of health care?


Why Study Econometrics?

More specifically, we might want to

- Study whether the predictions of an economic model hold true in reality.

  - Does demand slope downward? Is the stock market efficient?

- Quantify the effect of an economic or social program on an outcome of interest (poverty, inequality, wages, fertility, educational achievements, innovation).

- Forecast economic variables of interest.
  - Next quarter's inflation, etc.


Why Study Econometrics?

This course is geared toward providing you with an introduction to the methods and tools of econometrics.


Four main examples used in the textbook

1. Does reducing class size improve elementary school education?

2. Is there racial discrimination in the market for home loans?

3. How much do cigarette taxes reduce smoking?

4. Will raising the beer tax reduce traffic fatalities?

- We will analyze many more examples in class, sections, and problem sets.


Causation vs. Correlation

- Does a change in X really cause a change in Y? Or do they just co-vary?

- To evaluate policies and test theories, we need to establish causation.

- But in the real world, correlation and causation are often very difficult to separate.

  - Does drinking red wine reduce the risk of a heart attack?
  - Does watching Oprah cause stress?
  - Does smoking cigarettes cause cancer?


Causation vs. Correlation: More Examples

- We observe a positive relationship between crime and the number of police officers.

  - Is it because police officers create crime?
  - Or (more likely!) is it because more police officers are assigned to more troublesome neighborhoods?

- We observe that unemployed people who attend a job training program wait for shorter periods before finding a job.

  - Is it because the program helped them, or is it because those who joined the program are the most skilled/motivated ones, so that they would have waited less anyway?

- So how do we establish causation?


Causation vs. Correlation

- Ideally, we would like to have experimental data: for example, two identical plots of land, where the same crop is cultivated, using the same techniques, but where different fertilizers are used.

- Then if the outcome of interest (yield per acre, for example) is different between the two plots, we can safely infer that using a different fertilizer causes differences in the average yield.

- Classic example: clinical trials (compliance?).

- BUT, in most cases, all we have is observational data (you can't have the same person with and without college, or the same economy with and without a tax cut...).

- In general, to find answers, we need econometric skills & creativity.


Summary: Learning Goals

- Conduct statistical analysis.

- Understand the role of empirical evidence in evaluating economic problems.

- Understand the role of assumptions for the underlying models.


Probability Review: Introduction

- Let's begin by introducing the notions of probability, randomness, and random variables.

- The use of probability to measure uncertainty and variability began hundreds of years ago with the "study" of gambling.

- Generally speaking, probability is the chance that something (an event) will happen.

- The probability of an event or outcome is the proportion of the time it occurs in the long run. This is called the frequency interpretation of probability.
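The frequency interpretation can be illustrated with a short simulation. The sketch below is not part of the original slides: it assumes a fair coin (probability 0.5 of heads), and the function name `proportion_of_heads` and the seed are arbitrary choices made for illustration.

```python
import random


def proportion_of_heads(n_tosses, seed=0):
    """Simulate n_tosses tosses of a fair coin and return the share of heads."""
    rng = random.Random(seed)  # fixed seed so that a run can be reproduced
    heads = sum(rng.random() < 0.5 for _ in range(n_tosses))
    return heads / n_tosses


if __name__ == "__main__":
    # The share of heads should settle near 0.5 as the number of tosses grows.
    for n in (10, 100, 10_000, 1_000_000):
        print(n, proportion_of_heads(n))
```

With few tosses the proportion can be far from 0.5; with many tosses it should be close, which is what "in the long run" refers to.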


Probability Review: Definitions

- Sample space and events: The set of all possible outcomes is called the sample space. An event is a subset of the sample space.

  - We typically use Ω to denote the sample space.
  - Example: Coin tossing. Ω = {Heads, Tails}.
  - Example: Number of times a computer will crash. Ω = {0, 1, ...}, and event A = {0, 1} (i.e., the computer will crash no more than once).

- Random variable: a numerical summary of a random outcome.

  - Formally: A random variable is a real-valued function, defined on the set of possible outcomes (i.e., the sample space Ω), that assigns a real number to every possible outcome.

  - Example: Coin tossing. Ω = {Heads, Tails}. A random variable could be X ∈ {0, 1} such that X = 0 if Heads occurs and X = 1 if Tails occurs.


Probability Review: Definitions

- There are two major classes of random variables:

  - Discrete and continuous

- A discrete random variable takes on only a discrete set of values.

  - Number of phone calls you will receive today.

- A continuous random variable takes on a continuum of values.

  - Amount of time you will spend on the phone.


Probability Review: Definitions

- Probability distribution: assigns to each event a number between 0 and 1 that quantifies how likely the event is to occur. That is, for an event A, the number Pr(A) indicates the probability that A will occur.


Probability Review: Discrete Random Variables

- We characterize or describe a discrete random variable X with a probability function (pf).

- A pf lists the probability for each possible discrete outcome.

- The pf of a discrete random variable is defined as the function f such that for every real number x,

  $$f(x) = \Pr(X = x)$$

  where X represents a random variable and x represents a realization of that random variable.

- The following slide contains a specific example. Note that the same information is displayed in the table and graph.


Probability Review: Probability Function


Probability Review: Cumulative Distribution Function (CDF)

- Another way of characterizing the distribution of a random variable is with a cumulative distribution function (cdf).

- The cdf lists the probability that a random variable is less than or equal to a specific value:

  $$F(x) = \Pr(X \le x)$$

      x            0      1      2      3      4
      Pr(X ≤ x)    0.8    0.9    0.96   0.99   1.0

- F has the following properties:
  - F(x) ∈ [0, 1] for all x.
  - F is non-decreasing.
  - F is right-continuous.
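The pf and the cdf carry the same information: differencing the cdf gives the pf, and cumulating the pf gives the cdf back. The sketch below illustrates this with the values of Pr(X ≤ x) from the table above; the variable names are illustrative, and exact fractions are used only to avoid floating-point rounding.

```python
from fractions import Fraction
from itertools import accumulate

# Values of x and Pr(X <= x), copied from the cdf table above.
xs = [0, 1, 2, 3, 4]
cdf = [Fraction(s) for s in ("0.8", "0.9", "0.96", "0.99", "1.0")]

# pf by differencing: f(x) = F(x) - F(previous x), with F = 0 below the smallest x.
pf = [cdf[0]] + [cdf[i] - cdf[i - 1] for i in range(1, len(cdf))]

# Cumulating the pf recovers the cdf.
cdf_again = list(accumulate(pf))

if __name__ == "__main__":
    for x, f, F in zip(xs, pf, cdf_again):
        print(x, float(f), float(F))
```

The implied pf is 0.8, 0.1, 0.06, 0.03, 0.01 for x = 0, ..., 4. These probabilities sum to 1, and the non-decreasing property of F corresponds to each pf value being non-negative.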


Probability Review: Cumulative Distribution Function (CDF)

- The cumulative distribution function may also be referred to as the distribution function.

      x            0      1      2      3      4
      Pr(X ≤ x)    0.8    0.9    0.96   0.99   1.0

[Figure: plot of Pr(X ≤ x) against x; horizontal axis from 0 to 5, vertical axis from 0 to 1.2]


Probability Review: Continuous Random Variables

- A random variable Y that can take on any real value within some range is a continuous random variable.

  - Time, temperature, height...

- For continuous random variables, the probability of a particular value occurring is equal to zero:

  $$\Pr(Y = y) = 0$$

- We typically speak of interval probabilities (i.e., the probability that Y will take on some subset of values):

  $$\Pr(a \le Y \le b)$$

- Note that probability zero does not mean impossible.


Probability Review: Probability Density Function (pdf)

- The probabilities associated with a continuous random variable Y are determined by the pdf of Y.

- The pdf of Y, denoted f(y), has the following properties:

  1. f(y) ≥ 0 for all y.

  2. The probability that the uncertain quantity Y will fall in the interval (a, b) is equal to the area under f(y) between a and b:

     $$\Pr(a < Y < b) = \int_a^b f(y)\,dy.$$

  3. The total area under the entire curve of f(y) is equal to 1:

     $$\int_{-\infty}^{\infty} f(y)\,dy = 1.$$


Probability Review: Probability Density Function (pdf)

[Figure: plot of a pdf f(y), illustrating Pr(0.8 < Y < 1.6) as an area under the curve]


Probability Review: CDF of Continuous Random Variables

- The definition of a cdf for a continuous random variable is the same as that of a discrete random variable:

  $$F(y) = \Pr(Y \le y)$$

- With a continuous random variable, the cdf is a continuous function over the entire real line, so we can write down a simple formula:

  $$F(y) = \Pr(Y \le y) = \int_{-\infty}^{y} f(t)\,dt$$


Probability Review: CDF of Continuous Random Variables

- Furthermore, it follows that, at each point at which f(y) is continuous, the pdf can be calculated as

  $$F'(y) = \frac{dF(y)}{dy} = f(y)$$

- We can easily see that

  $$\Pr(Y > y) = 1 - F(y)$$

  and

  $$\Pr(y_1 < Y \le y_2) = F(y_2) - F(y_1)$$


- In practice, the CDF allows us to calculate the probability for any interval.

  - Pr(15 < CT ≤ 20) = Pr(CT ≤ 20) − Pr(CT ≤ 15) = F(20) − F(15) = 0.58

  - Pr(CT > 20) = 1 − F(20) = 1 − 0.78 = 0.22


Probability Review: Measures of Central Tendency for Distributions

- Mode: The mode is the value that occurs with the greatest probability.

  - Example: What is the modal age of students in this class?

- Median: The median is the value such that the probability of the random variable being less than or equal to that value is at least 50% and the probability of the random variable being greater than or equal to that value is at least 50%.

  - Example: What is the median age of students in this class?

  - Case 0

        Age    19    20    21
        Pr     .4    .2    .4

  - Case 1

        Age    19    20    21    22
        Pr     .4    .2    .2    .2

  - Case 2

        Age    19    20    21    22
        Pr     .4    .1    .3    .2
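The definitions of the mode and the median can be applied mechanically to the three cases above. The sketch below is one way to do that; the helper names (`modes`, `medians`, `make`) are illustrative, and `medians` only checks the listed ages, so it reports which of those ages satisfy the definition.

```python
from fractions import Fraction


def modes(dist):
    """Return every value that attains the highest probability."""
    top = max(dist.values())
    return [x for x, p in dist.items() if p == top]


def medians(dist):
    """Return the listed values m with Pr(X <= m) >= 1/2 and Pr(X >= m) >= 1/2."""
    half = Fraction(1, 2)
    result = []
    for m in sorted(dist):
        at_most = sum(p for x, p in dist.items() if x <= m)
        at_least = sum(p for x, p in dist.items() if x >= m)
        if at_most >= half and at_least >= half:
            result.append(m)
    return result


def make(ages, tenths):
    """Build a distribution from ages and probabilities given in tenths."""
    return {a: Fraction(t, 10) for a, t in zip(ages, tenths)}


case0 = make([19, 20, 21], [4, 2, 4])
case1 = make([19, 20, 21, 22], [4, 2, 2, 2])
case2 = make([19, 20, 21, 22], [4, 1, 3, 2])

if __name__ == "__main__":
    for name, dist in (("Case 0", case0), ("Case 1", case1), ("Case 2", case2)):
        print(name, "modes:", modes(dist), "medians:", medians(dist))
```

In Case 0 there are two modes (19 and 21) and the median is 20. In Case 2 both 20 and 21 satisfy the definition of the median, so under this definition neither the mode nor the median has to be unique.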


Probability Review: Mean (Expected Value)

- Mean: The mean or expected value of a random variable X is the weighted average of all its possible outcomes, weighted by the probabilities of those outcomes.

- Unlike the mode, it's unique.

- For a discrete random variable X,

  $$E(X) = x_1 \Pr(x_1) + \dots + x_k \Pr(x_k) = \sum_{i=1}^{k} x_i \Pr(x_i) = \sum_{i=1}^{k} x_i f(x_i)$$

- Example: Expected value of throwing a die:

  $$E(X) = \tfrac{1}{6} \cdot 1 + \tfrac{1}{6} \cdot 2 + \tfrac{1}{6} \cdot 3 + \tfrac{1}{6} \cdot 4 + \tfrac{1}{6} \cdot 5 + \tfrac{1}{6} \cdot 6 = \tfrac{1}{6}(1 + 2 + 3 + 4 + 5 + 6) = 3.5$$


Probability Review: Mean (Expected Value)

- However, one drawback of the mean (relative to the median) is its sensitivity to outliers.

  - When Bill Gates walks into a bar...

- For a continuous random variable X with probability density function f(x), the mean of X (assuming it exists) is defined as

  $$E(X) = \int_{-\infty}^{\infty} x f(x)\,dx$$

- The expected value or mean of X is typically denoted by E(X) or $\mu_X$.
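The sensitivity of the mean to outliers noted above can be seen with a toy calculation. The incomes below are made-up numbers chosen only for illustration, and each patron is given equal weight, so the mean and median are those of the empirical distribution of the listed values.

```python
from statistics import mean, median

# Hypothetical annual incomes, in thousands of dollars (made-up numbers).
patrons = [30, 40, 50, 60, 70]
# The same group after one extremely high earner joins (also made up).
with_outlier = patrons + [1_000_000]

mean_before, median_before = mean(patrons), median(patrons)
mean_after, median_after = mean(with_outlier), median(with_outlier)

if __name__ == "__main__":
    print("before:", mean_before, median_before)
    print("after: ", mean_after, median_after)
```

The single extreme value moves the mean from 50 to well above 100,000, while the median only moves from 50 to 55.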


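Example

f(x) = 2x for 0 < x < 1 (and 0 otherwise). Then

$$E(X) = \int_{-\infty}^{\infty} x f(x)\,dx = \int_0^1 x(2x)\,dx = \int_0^1 2x^2\,dx = \tfrac{2}{3}x^3 \Big|_0^1 = \tfrac{2}{3}$$

Question: what's F(x)?

$$F(x) = \int_{-\infty}^{x} f(t)\,dt = \int_0^x 2t\,dt = t^2 \Big|_0^x = x^2 \quad \text{for } 0 \le x \le 1$$

(with F(x) = 0 for x < 0 and F(x) = 1 for x > 1).

What about the median?

- 1/2? $F(\tfrac{1}{2}) = \tfrac{1}{4} < \tfrac{1}{2}$. Nope.
- 2/3? $F(\tfrac{2}{3}) = \tfrac{4}{9} < \tfrac{1}{2}$. Nope.
- $F(\tfrac{1}{\sqrt{2}}) = \tfrac{1}{2}$, so the median is $1/\sqrt{2} \approx 0.71$.

[Figure: plot for this example; horizontal axis from 0 to 1.4, vertical axis from 0 to 2.5]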
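The mean and the median of this example can also be checked numerically. The sketch below is a rough check rather than a general-purpose routine: it approximates integrals with the midpoint rule and finds the median by bisection on F. The function names and the number of grid points are arbitrary choices.

```python
import math


def pdf(x):
    """Density from the example: f(x) = 2x on (0, 1), zero elsewhere."""
    return 2.0 * x if 0.0 < x < 1.0 else 0.0


def integrate(g, a, b, n=20_000):
    """Approximate the integral of g over [a, b] with the midpoint rule."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))


def cdf(x):
    """F(x): integral of the density up to x (the density is zero below 0)."""
    return integrate(pdf, 0.0, x) if x > 0.0 else 0.0


def median(lo=0.0, hi=1.0, tol=1e-6):
    """Find m with F(m) = 1/2 by bisection, using that F is non-decreasing."""
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if cdf(mid) < 0.5:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


total_area = integrate(pdf, 0.0, 1.0)                # should be close to 1
mean_x = integrate(lambda x: x * pdf(x), 0.0, 1.0)   # should be close to 2/3
median_x = median()                                  # should be close to 1/sqrt(2)

if __name__ == "__main__":
    print(total_area, mean_x, median_x, 1.0 / math.sqrt(2.0))
```

The mean (2/3) is smaller than the median (about 0.71) here, because the density puts more weight near 1 and has its longer tail toward 0.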


Probability Review: Expected Value of a Function

- We know that if X is a random variable with pf (discrete case) or pdf (continuous case) f(x), we can calculate E(X) as either

  $$E(X) = \sum_{i=1}^{k} x_i f(x_i) \quad \text{or} \quad E(X) = \int_{-\infty}^{\infty} x f(x)\,dx$$

- But what if we want to calculate E(X²) or E(ln(X))?

- In general, what's the expected value of a function g(·) of X?

  - It can be shown that (assuming it exists)

    $$E(g(X)) = \sum_{i=1}^{k} g(x_i) f(x_i) \quad \text{or} \quad E(g(X)) = \int_{-\infty}^{\infty} g(x) f(x)\,dx$$

- For example, for a continuous distribution

  $$E(X^2) = \int_{-\infty}^{\infty} x^2 f(x)\,dx$$
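For a discrete variable, the formula for E(g(X)) is a probability-weighted sum of g(x). The sketch below applies it to the pf implied by the cdf table earlier in these notes (0.8, 0.1, 0.06, 0.03, 0.01 for x = 0, ..., 4, obtained by differencing Pr(X ≤ x)); the helper name `expect` is illustrative.

```python
from fractions import Fraction

# pf implied by the cdf table earlier in these notes (differences of Pr(X <= x)).
pf = {
    0: Fraction(80, 100),
    1: Fraction(10, 100),
    2: Fraction(6, 100),
    3: Fraction(3, 100),
    4: Fraction(1, 100),
}


def expect(g, pf):
    """E(g(X)): sum over the outcomes x of g(x) * f(x), for a discrete X."""
    return sum(g(x) * p for x, p in pf.items())


mean = expect(lambda x: x, pf)                 # E(X)
second_moment = expect(lambda x: x ** 2, pf)   # E(X^2)

if __name__ == "__main__":
    print("E(X)     =", float(mean))
    print("E(X^2)   =", float(second_moment))
    print("(E(X))^2 =", float(mean ** 2))
```

Here E(X) = 0.35 and E(X²) = 0.77, while (E(X))² = 0.1225. In general E(g(X)) is not the same as g(E(X)), which is why the function g has to be applied inside the sum or integral.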


Probability Review: Measures of Dispersion for Distributions

- The variance of a random variable measures the spread or dispersion of the variable around its mean:

  $$\operatorname{Var}(X) = E\left[(X - \mu_X)^2\right]$$

- In the discrete case, this is simply the weighted average of the squared deviations of X from its mean:

  $$\operatorname{Var}(X) = \sum_{i=1}^{k} [x_i - E(X)]^2 \Pr(x_i)$$

- For a continuous random variable X with probability density function f(x), the variance of X is

  $$\operatorname{Var}(X) = \int_{-\infty}^{\infty} [x - E(X)]^2 f(x)\,dx.$$


Probability Review: Variance & Moments

- Another (equivalent) formula that can be used in either case is:

  $$\operatorname{Var}(X) = E(X^2) - (E(X))^2$$

  where the second moment E(X²) is defined as

  $$E(X^2) = (x_1)^2 \Pr(x_1) + (x_2)^2 \Pr(x_2) + \dots + (x_k)^2 \Pr(x_k)$$

  or

  $$E(X^2) = \int_{-\infty}^{\infty} x^2 f(x)\,dx$$

- Other (higher order) moments are defined similarly.


Probability Review: Standard Deviation

- The variance of X is denoted Var(X) or $\sigma_X^2$.

- The standard deviation of X, denoted $\sigma_X$, is the square root of Var(X).

- A large standard deviation (and variance) means that the probability distribution is quite spread out: a large difference between the outcome and the expected value is anticipated.


Probability Review: Example

    x      Pr(x)    x Pr(x)        x² Pr(x)
    1      1/6      1/6            1/6
    2      1/6      1/3            2/3
    3      1/6      1/2            3/2
    4      1/6      2/3            8/3
    5      1/6      5/6            25/6
    6      1/6      1              6
    Sum             E(X) = 3.5     E(X²) = 15 1/6

- k = 6

- $E(X) = \sum_{i=1}^{k} x_i \Pr(x_i) = 3.5$

- $E(X^2) = \sum_{i=1}^{k} (x_i)^2 \Pr(x_i) = 15\tfrac{1}{6}$

- $\operatorname{Var}(X) = E(X^2) - (E(X))^2 = 15\tfrac{1}{6} - 3.5^2 \approx 2.92$

- $\sigma_X \approx 1.71$
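The numbers in this example can be reproduced with a few lines of code. The sketch below assumes a fair six-sided die, as in the table, and computes the variance both from the definition and from the shortcut formula; exact fractions are used so that the two results can be compared without rounding.

```python
import math
from fractions import Fraction

# Fair die: each face 1..6 has probability 1/6.
die = {x: Fraction(1, 6) for x in range(1, 7)}

mean = sum(x * p for x, p in die.items())                           # E(X)
second_moment = sum(x ** 2 * p for x, p in die.items())             # E(X^2)
var_definition = sum((x - mean) ** 2 * p for x, p in die.items())   # E[(X - mu)^2]
var_shortcut = second_moment - mean ** 2                            # E(X^2) - (E(X))^2
std_dev = math.sqrt(var_shortcut)

if __name__ == "__main__":
    print("E(X)   =", mean, "=", float(mean))
    print("E(X^2) =", second_moment, "=", float(second_moment))
    print("Var(X) =", var_shortcut, "=", round(float(var_shortcut), 2))
    print("sd(X)  =", round(std_dev, 2))
```

Both variance formulas give 35/12, which rounds to 2.92, and the standard deviation rounds to 1.71, matching the slide.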


Probability Review: Example

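f(x) = 2x for 0 < x < 1 (and 0 otherwise). We already know E(X) = 2/3. What's Var(X)?

$$\operatorname{Var}(X) = \int_{-\infty}^{\infty} [x - E(X)]^2 f(x)\,dx = \int_0^1 \left(x - \tfrac{2}{3}\right)^2 2x\,dx = \int_0^1 \left(2x^3 - \tfrac{8}{3}x^2 + \tfrac{8}{9}x\right) dx = \left[\tfrac{1}{2}x^4 - \tfrac{8}{9}x^3 + \tfrac{4}{9}x^2\right]_0^1 = \tfrac{1}{2} - \tfrac{4}{9} = \tfrac{1}{18}$$

or (using the other formula)

$$\operatorname{Var}(X) = E(X^2) - (E(X))^2 = \int_0^1 x^2 (2x)\,dx - \left(\tfrac{2}{3}\right)^2 = \int_0^1 2x^3\,dx - \tfrac{4}{9} = \left[\tfrac{1}{2}x^4\right]_0^1 - \tfrac{4}{9} = \tfrac{1}{2} - \tfrac{4}{9} = \tfrac{1}{18}$$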


Probability Review: Expected Value and Variance of a Linear Function

If Y = a + bX, then E(Y) = a + bE(X).

Example: Suppose E(X) = 5, Y = 3X − 5, Z = −3X + 15.
E(Y) = 3 · E(X) − 5 = 3 · 5 − 5 = 10
E(Z) = −3 · E(X) + 15 = −3 · 5 + 15 = 0

If Y = a + bX, then Var(Y) = b² Var(X).

Example: Var(X) = 5, Y = 3X − 5, Z = −3X + 15.
Var(Y) = 9 · Var(X) = 9 · 5 = 45
Var(Z) = 9 · Var(X) = 9 · 5 = 45

These properties can be easily proven using the definitions of expectation and variance.
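A sketch of the argument for the continuous case follows (the discrete case replaces the integrals with sums). It uses only the formula for the expected value of a function of X, the fact that the pdf integrates to 1, and the definition of the variance.

```latex
\begin{align*}
E(a + bX)
  &= \int_{-\infty}^{\infty} (a + bx)\, f(x)\,dx
   = a \int_{-\infty}^{\infty} f(x)\,dx + b \int_{-\infty}^{\infty} x f(x)\,dx
   = a + b\,E(X), \\[6pt]
\operatorname{Var}(a + bX)
  &= E\!\left[\big(a + bX - E(a + bX)\big)^2\right]
   = E\!\left[\big(bX - b\,E(X)\big)^2\right]
   = b^2\, E\!\left[\big(X - E(X)\big)^2\right]
   = b^2 \operatorname{Var}(X).
\end{align*}
```

The constant a shifts every outcome and the mean by the same amount, so it drops out of the deviations from the mean. This is why Y = 3X − 5 and Z = −3X + 15 have the same variance in the example above.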