Statistical Inference and Regression Analysis: GB.3302.30

Part 10: Advanced Topics Statistical Inference and Regression Analysis: GB.3302.30 Professor William Greene Stern School of Business IOMS Department Department of Economics

Upload: kirima

Post on 25-Feb-2016



Statistical Inference and Regression Analysis: GB.3302.30
Professor William Greene
Stern School of Business
IOMS Department
Department of Economics

Statistics and Data Analysis
Part 10: Advanced Topics

Advanced topics
Nonlinear Least Squares
Nonlinear Models - ML Estimation
Poisson Regression
Binary Choice
End of course.

Statistics and Data Analysis: Nonlinear Least Squares

Nonlinear Least Squares

Lanczos 1 Data

Nonlinear Regression

Nonlinear Least Squares

There are no explicit solutions to these equations in the form of bi = a function of (y,x).
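In standard notation, the equations in question are the first-order conditions of the least squares problem. The symbols S, f, and K below are assumptions chosen for illustration and are not taken from the slides:

```latex
S(\mathbf{b}) = \sum_{i=1}^{N} \bigl[ y_i - f(x_i, \mathbf{b}) \bigr]^2,
\qquad
\frac{\partial S}{\partial b_k}
  = -2 \sum_{i=1}^{N} \bigl[ y_i - f(x_i, \mathbf{b}) \bigr]
    \frac{\partial f(x_i, \mathbf{b})}{\partial b_k} = 0,
\quad k = 1, \dots, K.
```

When f is nonlinear in b, these K equations are nonlinear in b as well, which is why an iterative strategy is needed.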

Strategy for Nonlinear LS

NLS Strategy
Pick b.
A. Compute yi0 and xi0.
B. Regress yi0 on xi0. This obtains a new b.
Return to step A, or exit if the new b is the same as the old b.
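The strategy above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the course's own code: the model f(x, b) = b1*exp(-b2*x), the synthetic data, and the usual linearized-regression definitions xi0 = df/db at the current b and yi0 = yi - f(xi, b) + xi0'b are all assumed here, since the slide formulas are not preserved in the transcript.

```python
import math

def f(x, b):
    # Hypothetical model for illustration: f(x, b) = b1 * exp(-b2 * x)
    return b[0] * math.exp(-b[1] * x)

def pseudo_regressors(x, b):
    # x_i0: derivatives of f with respect to b1 and b2 at the current b
    e = math.exp(-b[1] * x)
    return (e, -b[0] * x * e)

def ols2(X, y):
    # Least squares with two regressors, via the 2x2 normal equations
    s11 = sum(r[0] * r[0] for r in X)
    s12 = sum(r[0] * r[1] for r in X)
    s22 = sum(r[1] * r[1] for r in X)
    t1 = sum(r[0] * v for r, v in zip(X, y))
    t2 = sum(r[1] * v for r, v in zip(X, y))
    det = s11 * s22 - s12 * s12
    return [(s22 * t1 - s12 * t2) / det, (s11 * t2 - s12 * t1) / det]

def nls(xs, ys, b, tol=1e-10, max_iter=100):
    for _ in range(max_iter):
        # Step A: compute x_i0 and y_i0 at the current b
        X0 = [pseudo_regressors(x, b) for x in xs]
        y0 = [y - f(x, b) + r[0] * b[0] + r[1] * b[1]
              for x, y, r in zip(xs, ys, X0)]
        # Step B: regress y_i0 on x_i0 to obtain a new b
        b_new = ols2(X0, y0)
        # Exit if the new b is (numerically) the same as the old b
        if max(abs(u - v) for u, v in zip(b_new, b)) < tol:
            return b_new
        b = b_new
    return b

# Noise-free synthetic data generated with b = (2.0, 1.5)
xs = [0.1 * i for i in range(30)]
ys = [2.0 * math.exp(-1.5 * x) for x in xs]
b_hat = nls(xs, ys, [1.0, 1.0])
print(b_hat)
```

Because the synthetic data contain no noise, the iterations should recover the generating parameters; with real data the result would be the least squares estimate instead.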

Lanczos 1 First Iteration

Now, repeat the iteration using this as b

This is the correct answer

Gauss-Marquardt Algorithm
Starting with b0:
A. Compute regressors xi0. Compute residuals ei0 = yi - f(xi,b0).
B. New b1 = b0 + slopes in regression of ei0 on xi0.
Return to A, or exit if estimates have converged.
This is equivalent to our earlier method.
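The equivalence claimed in the last line can be checked in one step. The sketch assumes the usual linearized-regression definition y_i0 = e_i0 + x_i0'b0 (the slide defining y_i0 is not preserved here), with X0 the matrix whose rows are x_i0':

```latex
\mathbf{b}_1
  = (\mathbf{X}_0'\mathbf{X}_0)^{-1}\mathbf{X}_0'\mathbf{y}_0
  = (\mathbf{X}_0'\mathbf{X}_0)^{-1}\mathbf{X}_0'(\mathbf{e}_0 + \mathbf{X}_0\mathbf{b}_0)
  = \mathbf{b}_0 + (\mathbf{X}_0'\mathbf{X}_0)^{-1}\mathbf{X}_0'\mathbf{e}_0 .
```

The second term is the vector of slopes in the regression of the residuals on the regressors, so both procedures produce the same sequence of estimates.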

Statistics and Data Analysis: Maximum Likelihood: Poisson

Application: Doctor Visits
German Individual Health Care data: N=27,236
Model for number of visits to the doctor: Poisson regression
Age, Health Satisfaction, Marital Status, Income, Kids

Poisson Regression

Nonlinear Least Squares

Maximum Likelihood Estimation
This defines a class of estimators based on the particular distribution assumed to have generated the observed random variable.
The main advantage of ML estimators is that among all Consistent Asymptotically Normal Estimators, MLEs have optimal asymptotic properties.

Setting up the MLE
The distribution of the observed random variable is written as a function of the parameters to be estimated.

P(yi|data,θ) = Probability density | parameters.

The likelihood function is constructed from the density.
Construction: Joint probability density function of the observed sample of data, generally the product when the data are a random sample.

Likelihood for the Poisson Regression
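For reference, the standard form of the Poisson regression likelihood is sketched here. The exponential mean function and the symbols lambda_i and beta are the textbook convention and are assumed, since the slide's own formula is not preserved in the transcript:

```latex
P(y_i \mid x_i) = \frac{\exp(-\lambda_i)\,\lambda_i^{y_i}}{y_i!},
\qquad \lambda_i = \exp(x_i'\beta),
\qquad
\log L(\beta) = \sum_{i=1}^{N} \bigl[ -\lambda_i + y_i\, x_i'\beta - \log y_i! \bigr].
```

The log likelihood is a sum because the joint density of a random sample is the product of the individual densities.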

Newton's Method
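A minimal sketch of Newton's method for a Poisson regression with a constant and one regressor follows. The tiny data set, the function name poisson_newton, and the starting values are hypothetical choices for illustration; the update used is the generic Newton step, new b = old b - H^(-1) g, with g the first derivatives and H the second derivatives of the log likelihood.

```python
import math

def poisson_newton(xs, ys, tol=1e-12, max_iter=50):
    # Assumed model: lambda_i = exp(b0 + b1 * x_i)
    # Start from the constant-only fit: b0 = log(mean of y), b1 = 0
    b0, b1 = math.log(sum(ys) / len(ys)), 0.0
    for _ in range(max_iter):
        lam = [math.exp(b0 + b1 * x) for x in xs]
        # First derivatives g = sum_i (y_i - lambda_i) * (1, x_i)
        g0 = sum(y - l for y, l in zip(ys, lam))
        g1 = sum(x * (y - l) for x, y, l in zip(xs, ys, lam))
        # Negative Hessian -H = sum_i lambda_i * (1, x_i)(1, x_i)'
        a00 = sum(lam)
        a01 = sum(l * x for x, l in zip(xs, lam))
        a11 = sum(l * x * x for x, l in zip(xs, lam))
        det = a00 * a11 - a01 * a01
        # Newton step: delta = (-H)^(-1) g, using the 2x2 inverse
        d0 = (a11 * g0 - a01 * g1) / det
        d1 = (a00 * g1 - a01 * g0) / det
        b0, b1 = b0 + d0, b1 + d1
        if max(abs(d0), abs(d1)) < tol:
            break
    return b0, b1

# Hypothetical data, for illustration only
xs = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5]
ys = [1, 0, 2, 3, 4, 7]
b0_hat, b1_hat = poisson_newton(xs, ys)
fitted = [math.exp(b0_hat + b1_hat * x) for x in xs]
print(b0_hat, b1_hat)
```

At the maximum the first derivatives are zero, so with a constant term in the model the fitted means sum to the observed counts, which gives a quick check on convergence.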


Properties of the MLE
Consistent: Not necessarily unbiased, however
Asymptotically normally distributed: Proof based on central limit theorems
Asymptotically efficient: Among the possible estimators that are consistent and asymptotically normally distributed
Invariant: The MLE of g(θ) is g(the MLE of θ)

Computing the Asymptotic Variance
We want to estimate {-E[H]}^(-1). Three ways:
(1) Just compute the negative of the actual second derivatives matrix and invert it.
(2) Insert the maximum likelihood estimates into the known expected values of the second derivatives matrix. Sometimes (1) and (2) give the same answer (for example, in the Poisson regression model).
(3) Since -E[H] is the variance of the first derivatives, estimate this with the sample variance (i.e., mean square) of the first derivatives. This will almost always be different from (1) and (2).
Since they are estimating the same thing, in large samples, all three will give the same answer.
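The three approaches can be compared on a small example. The sketch below assumes a Poisson regression with a constant and one regressor, fitted to hypothetical data; all function names are illustrative. In this model the second derivatives do not involve y, so approaches (1) and (2) coincide and are computed once, while approach (3) sums the outer products of the individual first derivatives.

```python
import math

# Hypothetical data, for illustration only
xs = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5]
ys = [1, 0, 2, 3, 4, 7]

def lambdas(b):
    return [math.exp(b[0] + b[1] * x) for x in xs]

def score(b):
    # First derivatives: sum_i (y_i - lambda_i) * (1, x_i)
    lam = lambdas(b)
    return [sum(y - l for y, l in zip(ys, lam)),
            sum(x * (y - l) for x, y, l in zip(xs, ys, lam))]

def neg_hessian(b):
    # -H = sum_i lambda_i * (1, x_i)(1, x_i)'; y does not appear
    lam = lambdas(b)
    a01 = sum(l * x for l, x in zip(lam, xs))
    return [[sum(lam), a01],
            [a01, sum(l * x * x for l, x in zip(lam, xs))]]

def outer_product(b):
    # sum_i g_i g_i' with g_i = (y_i - lambda_i) * (1, x_i)
    r2 = [(y - l) ** 2 for y, l in zip(ys, lambdas(b))]
    a01 = sum(r * x for r, x in zip(r2, xs))
    return [[sum(r2), a01],
            [a01, sum(r * x * x for r, x in zip(r2, xs))]]

def inv2(a):
    # Inverse of a 2x2 matrix
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det],
            [-a[1][0] / det, a[0][0] / det]]

# Newton iterations to obtain the MLE
b = [math.log(sum(ys) / len(ys)), 0.0]
for _ in range(50):
    v, g = inv2(neg_hessian(b)), score(b)
    b = [b[0] + v[0][0] * g[0] + v[0][1] * g[1],
         b[1] + v[1][0] * g[0] + v[1][1] * g[1]]

var_hessian = inv2(neg_hessian(b))   # approaches (1) and (2)
var_bhhh = inv2(outer_product(b))    # approach (3)
print(var_hessian)
print(var_bhhh)
```

With only six observations the two printed matrices can differ noticeably; the large-sample agreement described above is an asymptotic statement.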

Poisson Regression Iterations

MLE

NLS

Using the Model. Partial Effects

Effect of Income Depends on Age

Effect of Income | Age
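One way to see why the effect of income depends on age: assuming the Poisson regression with exponential mean used for the doctor visits application (the subscripts Inc and Age are illustrative labels), the partial effect of a variable is its coefficient times the conditional mean, and the conditional mean depends on every variable in the model.

```latex
E[y \mid x] = \lambda = \exp(x'\beta),
\qquad
\frac{\partial E[y \mid x]}{\partial x_{\mathrm{Inc}}}
  = \beta_{\mathrm{Inc}} \exp(x'\beta)
  = \beta_{\mathrm{Inc}}\,
    \exp\bigl(\beta_{\mathrm{Age}}\, x_{\mathrm{Age}} + \cdots\bigr).
```

Unlike a linear regression, where the partial effect is the constant coefficient, the effect here must be evaluated at particular values of the other variables, for example at several ages.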

Statistics and Data Analysis: Binary Choice

Case Study: Credit Modeling
1992 American Express analysis of
Application process: Acceptance or rejection; Y = 0 (reject) or 1 (accept).
Cardholder behavior
  Loan default (D = 0 or 1).
  Average monthly expenditure (E = $/month)
  General credit usage/behavior (C = number of charges)
13,444 applications in November, 1992

Proportion for Bernoulli
In the AmEx data, the true population acceptance rate is 0.7809 = θ.
Y = 1 if application accepted, 0 if not.
E[ȳ] = E[(1/N) Σi yi] = E[p_accept] = θ. The sample proportion p_accept is the estimator.


Some Evidence

[Figure legend: Homeowners]

Does the acceptance rate depend on home ownership?

A Test of Independence
In the credit card example, are Own/Rent and Accept/Reject independent?
Hypothesis: Prob(Ownership) and Prob(Acceptance) are independent.
Formal hypothesis, based only on the laws of probability: Prob(Own,Accept) = Prob(Own)Prob(Accept) (and likewise for the other three possibilities).
Rejection region: Joint frequencies that do not look like the products of the marginal frequencies.

Contingency Table Analysis
The Data: Frequencies

         Reject   Accept    Total
Rent      1,845    5,469    7,314
Own       1,100    5,030    6,130
Total     2,945   10,499   13,444

Step 1: Convert to Actual Proportions

         Reject    Accept     Total
Rent    0.13724   0.40680   0.54404
Own     0.08182   0.37414   0.45596
Total   0.21906   0.78094   1.00000

Independence Test

Step 2: Expected proportions assuming independence: If the factors are independent, then the joint proportions should equal the product of the marginal proportions.
[Rent,Reject] 0.54404 x 0.21906 = 0.11918
[Rent,Accept] 0.54404 x 0.78094 = 0.42486
[Own,Reject]  0.45596 x 0.21906 = 0.09988
[Own,Accept]  0.45596 x 0.78094 = 0.35608

Comparing Actual to Expected

It appears that the acceptance rate is dependent on home ownership.

When is the Chi Squared Large?
Critical values from chi squared table
Degrees of freedom = (R-1)(C-1).

Critical chi squared
D.F.     .05     .01
  1     3.84    6.63
  2     5.99    9.21
  3     7.81   11.34
  4     9.49   13.28
  5    11.07   15.09
  6    12.59   16.81
  7    14.07   18.48
  8    15.51   20.09
  9    16.92   21.67
 10    18.31   23.21
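The comparison of actual to expected frequencies can be sketched with the standard Pearson chi squared statistic, the sum over cells of (observed - expected)^2 / expected. That formula is assumed here as the usual choice, since the slide's own expression is not preserved; the frequencies and the critical values are the ones given above, and the function name chi_squared is illustrative.

```python
# Frequencies from the contingency table (rows: Rent, Own; columns: Reject, Accept)
observed = [[1845, 5469],
            [1100, 5030]]

def chi_squared(table):
    n = sum(sum(row) for row in table)
    row_tot = [sum(row) for row in table]
    col_tot = [sum(col) for col in zip(*table)]
    stat = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            # Expected frequency under independence: product of the marginals / N
            expected = row_tot[i] * col_tot[j] / n
            stat += (obs - expected) ** 2 / expected
    # Degrees of freedom = (R-1)(C-1)
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df

stat, df = chi_squared(observed)
critical = {1: (3.84, 6.63)}  # .05 and .01 critical values for 1 degree of freedom
print(stat, df, stat > critical[df][0], stat > critical[df][1])
```

With one degree of freedom, a statistic above 3.84 rejects independence at the .05 level and one above 6.63 rejects at the .01 level.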

Analyzing Default
Do renters default more often (at a different rate) than owners?
To investigate, we study the cardholders (only).

                 DEFAULT
OWNRENT        0        1      All
0           4854      615     5469
           46.23     5.86    52.09
1           4649      381     5030
           44.28     3.63    47.91
All         9503      996    10499
           90.51     9.49   100.00

Hypothesis Test
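One standard way to test whether the two default rates differ is a two-proportion z test with a pooled standard error. This is offered as a sketch of one option, not necessarily the test used on the slide, whose content is not preserved. It assumes OWNRENT = 0 denotes renters, which is consistent with the 5,469 accepted renters in the earlier table; the function name two_proportion_z is illustrative.

```python
import math

def two_proportion_z(x1, n1, x2, n2):
    # Sample default rates in each group
    p1, p2 = x1 / n1, x2 / n2
    # Pooled rate under the hypothesis of equal default rates
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return p1, p2, (p1 - p2) / se

# Defaults and cardholders: renters (OWNRENT = 0, assumed) and owners (OWNRENT = 1)
p_rent, p_own, z = two_proportion_z(615, 5469, 381, 5030)
print(p_rent, p_own, z)
```

A value of |z| above 1.96 rejects equal default rates at the .05 level. For a 2 x 2 table, the square of this z statistic equals the Pearson chi squared statistic, so the two approaches agree.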

More Formal Model of Acceptance and Default

Probability Models

zi

Likelihood Function
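The textbook forms of the binary choice probability models and their likelihood are sketched here for reference. The notation (F, Lambda, Phi, beta, and the index z_i = x_i'beta) is the common convention and is assumed, since the slide formulas are not preserved in the transcript:

```latex
\Pr(y_i = 1 \mid x_i) = F(z_i), \qquad z_i = x_i'\beta,
\qquad
F(z) = \Lambda(z) = \frac{e^{z}}{1 + e^{z}} \ \text{(logit)}
\quad \text{or} \quad
F(z) = \Phi(z) \ \text{(probit)},

\log L(\beta) = \sum_{i=1}^{N}
  \bigl[\, y_i \log F(z_i) + (1 - y_i) \log\bigl(1 - F(z_i)\bigr) \bigr].
```

As with the Poisson model, the log likelihood is a sum over observations and is maximized numerically.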

American Express, 1992

Logistic Model for Acceptance

Probit Default Model
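The logistic and probit models differ only in the function that maps the index z into a probability. The sketch below evaluates both and the resulting Bernoulli log likelihood; the outcomes and index values are hypothetical and are not estimates from the American Express data.

```python
import math

def logit_prob(z):
    # Logistic CDF: exp(z) / (1 + exp(z))
    return 1.0 / (1.0 + math.exp(-z))

def probit_prob(z):
    # Standard normal CDF, written with the error function
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def log_likelihood(ys, zs, cdf):
    # Sum of log F(z) for y = 1 and log(1 - F(z)) for y = 0
    return sum(math.log(cdf(z)) if y == 1 else math.log(1.0 - cdf(z))
               for y, z in zip(ys, zs))

# Hypothetical outcomes and index values, for illustration only
ys = [1, 0, 1, 1]
zs = [0.8, -0.3, 1.5, 0.1]
ll_logit = log_likelihood(ys, zs, logit_prob)
ll_probit = log_likelihood(ys, zs, probit_prob)
print(ll_logit, ll_probit)
```

In estimation, z would be replaced by x'b and the log likelihood maximized over b, for example with Newton's method.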

Think statistically
Build models

Thank you.