Greene-50240 gree50240˙FM July 10, 2002 12:51

FIFTH EDITION

ECONOMETRIC ANALYSIS

William H. Greene
New York University

Upper Saddle River, New Jersey 07458



CIP data to come

Executive Editor: Rod Banister
Editor-in-Chief: P. J. Boardman
Managing Editor: Gladys Soto
Assistant Editor: Marie McHale
Editorial Assistant: Lisa Amato
Senior Media Project Manager: Victoria Anderson
Executive Marketing Manager: Kathleen McLellan
Marketing Assistant: Christopher Bath
Managing Editor (Production): Cynthia Regan
Production Editor: Michael Reynolds
Production Assistant: Dianne Falcone
Permissions Supervisor: Suzanne Grappi
Associate Director, Manufacturing: Vinnie Scelta
Cover Designer: Kiwi Design
Cover Photo: Anthony Bannister/Corbis
Composition: Interactive Composition Corporation
Printer/Binder: Courier/Westford
Cover Printer: Coral Graphics

Credits and acknowledgments borrowed from other sources and reproduced, with permission, in this textbook appear on the appropriate page within the text (or on page XX).

Copyright 2003, 2000, 1997, 1993 by Pearson Education, Inc., Upper Saddle River, New Jersey, 07458. All rights reserved. Printed in the United States of America. This publication is protected by copyright, and permission should be obtained from the publisher prior to any prohibited reproduction, storage in a retrieval system, or transmission in any form or by any means, electronic, mechanical, photocopying, recording, or likewise. For information regarding permission(s), write to: Rights and Permissions Department.

Pearson Education Ltd.
Pearson Education Australia Pty, Limited
Pearson Education Singapore, Pte. Ltd.
Pearson Education North Asia Ltd.
Pearson Education Canada, Ltd.
Pearson Educación de México, S.A. de C.V.
Pearson Education Japan
Pearson Education Malaysia, Pte. Ltd.

10 9 8 7 6 5 4 3 2 1

ISBN 0-13-066189-9
BRIEF CONTENTS

Chapter 1   Introduction
Chapter 2   The Classical Multiple Linear Regression Model
Chapter 3   Least Squares
Chapter 4   Finite-Sample Properties of the Least Squares Estimator
Chapter 5   Large-Sample Properties of the Least Squares and Instrumental Variables Estimators
Chapter 6   Inference and Prediction
Chapter 7   Functional Form and Structural Change
Chapter 8   Specification Analysis and Model Selection
Chapter 9   Nonlinear Regression Models
Chapter 10  Nonspherical Disturbances: The Generalized Regression Model
Chapter 11  Heteroscedasticity
Chapter 12  Serial Correlation
Chapter 13  Models for Panel Data
Chapter 14  Systems of Regression Equations
Chapter 15  Simultaneous-Equations Models
Chapter 16  Estimation Frameworks in Econometrics
Chapter 17  Maximum Likelihood Estimation
Chapter 18  The Generalized Method of Moments
Chapter 19  Models with Lagged Variables
Chapter 20  Time-Series Models
Chapter 21  Models for Discrete Choice
Chapter 22  Limited Dependent Variable and Duration Models
Appendix A  Matrix Algebra
Appendix B  Probability and Distribution Theory
Appendix C  Estimation and Inference
Appendix D  Large Sample Distribution Theory
Appendix E  Computation and Optimization
Appendix F  Data Sets Used in Applications
Appendix G  Statistical Tables
References
Author Index
Subject Index
CONTENTS

CHAPTER 1  Introduction
  1.1  Econometrics
  1.2  Econometric Modeling
  1.3  Data and Methodology
  1.4  Plan of the Book

CHAPTER 2  The Classical Multiple Linear Regression Model
  2.1  Introduction
  2.2  The Linear Regression Model
  2.3  Assumptions of the Classical Linear Regression Model
    2.3.1  Linearity of the Regression Model
    2.3.2  Full Rank
    2.3.3  Regression
    2.3.4  Spherical Disturbances
    2.3.5  Data Generating Process for the Regressors
    2.3.6  Normality
  2.4  Summary and Conclusions

CHAPTER 3  Least Squares
  3.1  Introduction
  3.2  Least Squares Regression
    3.2.1  The Least Squares Coefficient Vector
    3.2.2  Application: An Investment Equation
    3.2.3  Algebraic Aspects of the Least Squares Solution
    3.2.4  Projection
  3.3  Partitioned Regression and Partial Regression
  3.4  Partial Regression and Partial Correlation Coefficients
  3.5  Goodness of Fit and the Analysis of Variance
    3.5.1  The Adjusted R-Squared and a Measure of Fit
    3.5.2  R-Squared and the Constant Term in the Model
    3.5.3  Comparing Models
  3.6  Summary and Conclusions
CHAPTER 4  Finite-Sample Properties of the Least Squares Estimator
  4.1  Introduction
  4.2  Motivating Least Squares
    4.2.1  The Population Orthogonality Conditions
    4.2.2  Minimum Mean Squared Error Predictor
    4.2.3  Minimum Variance Linear Unbiased Estimation
  4.3  Unbiased Estimation
  4.4  The Variance of the Least Squares Estimator and the Gauss-Markov Theorem
  4.5  The Implications of Stochastic Regressors
  4.6  Estimating the Variance of the Least Squares Estimator
  4.7  The Normality Assumption and Basic Statistical Inference
    4.7.1  Testing a Hypothesis About a Coefficient
    4.7.2  Confidence Intervals for Parameters
    4.7.3  Confidence Interval for a Linear Combination of Coefficients: The Oaxaca Decomposition
    4.7.4  Testing the Significance of the Regression
    4.7.5  Marginal Distributions of the Test Statistics
  4.8  Finite-Sample Properties of Least Squares
  4.9  Data Problems
    4.9.1  Multicollinearity
    4.9.2  Missing Observations
    4.9.3  Regression Diagnostics and Influential Data Points
  4.10  Summary and Conclusions

CHAPTER 5  Large-Sample Properties of the Least Squares and Instrumental Variables Estimators
  5.1  Introduction
  5.2  Asymptotic Properties of the Least Squares Estimator
    5.2.1  Consistency of the Least Squares Estimator of β
    5.2.2  Asymptotic Normality of the Least Squares Estimator
    5.2.3  Consistency of s² and the Estimator of Asy. Var[b]
    5.2.4  Asymptotic Distribution of a Function of b: The Delta Method
    5.2.5  Asymptotic Efficiency
  5.3  More General Cases
    5.3.1  Heterogeneity in the Distributions of xi
    5.3.2  Dependent Observations
  5.4  Instrumental Variable and Two Stage Least Squares Estimation
  5.5  Hausman's Specification Test and an Application to Instrumental Variable Estimation
  5.6  Measurement Error
    5.6.1  Least Squares Attenuation
    5.6.2  Instrumental Variables Estimation
    5.6.3  Proxy Variables
    5.6.4  Application: Income and Education and a Study of Twins
  5.7  Summary and Conclusions

CHAPTER 6  Inference and Prediction
  6.1  Introduction
  6.2  Restrictions and Nested Models
  6.3  Two Approaches to Testing Hypotheses
    6.3.1  The F Statistic and the Least Squares Discrepancy
    6.3.2  The Restricted Least Squares Estimator
    6.3.3  The Loss of Fit from Restricted Least Squares
  6.4  Nonnormal Disturbances and Large Sample Tests
  6.5  Testing Nonlinear Restrictions
  6.6  Prediction
  6.7  Summary and Conclusions

CHAPTER 7  Functional Form and Structural Change
  7.1  Introduction
  7.2  Using Binary Variables
    7.2.1  Binary Variables in Regression
    7.2.2  Several Categories
    7.2.3  Several Groupings
    7.2.4  Threshold Effects and Categorical Variables
    7.2.5  Spline Regression
  7.3  Nonlinearity in the Variables
    7.3.1  Functional Forms
    7.3.2  Identifying Nonlinearity
    7.3.3  Intrinsic Linearity and Identification
  7.4  Modeling and Testing for a Structural Break
    7.4.1  Different Parameter Vectors
    7.4.2  Insufficient Observations
    7.4.3  Change in a Subset of Coefficients
    7.4.4  Tests of Structural Break with Unequal Variances
  7.5  Tests of Model Stability
    7.5.1  Hansen's Test
    7.5.2  Recursive Residuals and the CUSUMS Test
    7.5.3  Predictive Test
    7.5.4  Unknown Timing of the Structural Break
  7.6  Summary and Conclusions
CHAPTER 8  Specification Analysis and Model Selection
  8.1  Introduction
  8.2  Specification Analysis and Model Building
    8.2.1  Bias Caused by Omission of Relevant Variables
    8.2.2  Pretest Estimation
    8.2.3  Inclusion of Irrelevant Variables
    8.2.4  Model Building: A General to Simple Strategy
  8.3  Choosing Between Nonnested Models
    8.3.1  Testing Nonnested Hypotheses
    8.3.2  An Encompassing Model
    8.3.3  Comprehensive Approach: The J Test
    8.3.4  The Cox Test
  8.4  Model Selection Criteria
  8.5  Summary and Conclusions

CHAPTER 9  Nonlinear Regression Models
  9.1  Introduction
  9.2  Nonlinear Regression Models
    9.2.1  Assumptions of the Nonlinear Regression Model
    9.2.2  The Orthogonality Condition and the Sum of Squares
    9.2.3  The Linearized Regression
    9.2.4  Large Sample Properties of the Nonlinear Least Squares Estimator
    9.2.5  Computing the Nonlinear Least Squares Estimator
  9.3  Applications
    9.3.1  A Nonlinear Consumption Function
    9.3.2  The Box-Cox Transformation
  9.4  Hypothesis Testing and Parametric Restrictions
    9.4.1  Significance Tests for Restrictions: F and Wald Statistics
    9.4.2  Tests Based on the LM Statistic
    9.4.3  A Specification Test for Nonlinear Regressions: The PE Test
  9.5  Alternative Estimators for Nonlinear Regression Models
    9.5.1  Nonlinear Instrumental Variables Estimation
    9.5.2  Two-Step Nonlinear Least Squares Estimation
    9.5.3  Two-Step Estimation of a Credit Scoring Model
  9.6  Summary and Conclusions

CHAPTER 10  Nonspherical Disturbances: The Generalized Regression Model
  10.1  Introduction
  10.2  Least Squares and Instrumental Variables Estimation
    10.2.1  Finite-Sample Properties of Ordinary Least Squares
    10.2.2  Asymptotic Properties of Least Squares
    10.2.3  Asymptotic Properties of Nonlinear Least Squares
    10.2.4  Asymptotic Properties of the Instrumental Variables Estimator
  10.3  Robust Estimation of Asymptotic Covariance Matrices
  10.4  Generalized Method of Moments Estimation
  10.5  Efficient Estimation by Generalized Least Squares
    10.5.1  Generalized Least Squares (GLS)
    10.5.2  Feasible Generalized Least Squares
  10.6  Maximum Likelihood Estimation
  10.7  Summary and Conclusions

CHAPTER 11  Heteroscedasticity
  11.1  Introduction
  11.2  Ordinary Least Squares Estimation
    11.2.1  Inefficiency of Least Squares
    11.2.2  The Estimated Covariance Matrix of b
    11.2.3  Estimating the Appropriate Covariance Matrix for Ordinary Least Squares
  11.3  GMM Estimation of the Heteroscedastic Regression Model
  11.4  Testing for Heteroscedasticity
    11.4.1  White's General Test
    11.4.2  The Goldfeld-Quandt Test
    11.4.3  The Breusch-Pagan/Godfrey LM Test
  11.5  Weighted Least Squares When Ω Is Known
  11.6  Estimation When Ω Contains Unknown Parameters
    11.6.1  Two-Step Estimation
    11.6.2  Maximum Likelihood Estimation
    11.6.3  Model Based Tests for Heteroscedasticity
  11.7  Applications
    11.7.1  Multiplicative Heteroscedasticity
    11.7.2  Groupwise Heteroscedasticity
  11.8  Autoregressive Conditional Heteroscedasticity
    11.8.1  The ARCH(1) Model
    11.8.2  ARCH(q), ARCH-in-Mean, and Generalized ARCH Models
    11.8.3  Maximum Likelihood Estimation of the GARCH Model
    11.8.4  Testing for GARCH Effects
    11.8.5  Pseudo-Maximum Likelihood Estimation
  11.9  Summary and Conclusions

CHAPTER 12  Serial Correlation
  12.1  Introduction
  12.2  The Analysis of Time-Series Data
  12.3  Disturbance Processes
    12.3.1  Characteristics of Disturbance Processes
    12.3.2  AR(1) Disturbances
  12.4  Some Asymptotic Results for Analyzing Time Series Data
    12.4.1  Convergence of Moments: The Ergodic Theorem
    12.4.2  Convergence to Normality: A Central Limit Theorem
  12.5  Least Squares Estimation
    12.5.1  Asymptotic Properties of Least Squares
    12.5.2  Estimating the Variance of the Least Squares Estimator
  12.6  GMM Estimation
  12.7  Testing for Autocorrelation
    12.7.1  Lagrange Multiplier Test
    12.7.2  Box and Pierce's Test and Ljung's Refinement
    12.7.3  The Durbin-Watson Test
    12.7.4  Testing in the Presence of a Lagged Dependent Variable
    12.7.5  Summary of Testing Procedures
  12.8  Efficient Estimation When Ω Is Known
  12.9  Estimation When Ω Is Unknown
    12.9.1  AR(1) Disturbances
    12.9.2  AR(2) Disturbances
    12.9.3  Application: Estimation of a Model with Autocorrelation
    12.9.4  Estimation with a Lagged Dependent Variable
  12.10  Common Factors
  12.11  Forecasting in the Presence of Autocorrelation
  12.12  Summary and Conclusions

CHAPTER 13  Models for Panel Data
  13.1  Introduction
  13.2  Panel Data Models
  13.3  Fixed Effects
    13.3.1  Testing the Significance of the Group Effects
    13.3.2  The Within- and Between-Groups Estimators
    13.3.3  Fixed Time and Group Effects
    13.3.4  Unbalanced Panels and Fixed Effects
  13.4  Random Effects
    13.4.1  Generalized Least Squares
    13.4.2  Feasible Generalized Least Squares When Σ Is Unknown
    13.4.3  Testing for Random Effects
    13.4.4  Hausman's Specification Test for the Random Effects Model
  13.5  Instrumental Variables Estimation of the Random Effects Model
  13.6  GMM Estimation of Dynamic Panel Data Models
  13.7  Nonspherical Disturbances and Robust Covariance Estimation
    13.7.1  Robust Estimation of the Fixed Effects Model
    13.7.2  Heteroscedasticity in the Random Effects Model
    13.7.3  Autocorrelation in Panel Data Models
  13.8  Random Coefficients Models
  13.9  Covariance Structures for Pooled Time-Series Cross-Sectional Data
    13.9.1  Generalized Least Squares Estimation
    13.9.2  Feasible GLS Estimation
    13.9.3  Heteroscedasticity and the Classical Model
    13.9.4  Specification Tests
    13.9.5  Autocorrelation
    13.9.6  Maximum Likelihood Estimation
    13.9.7  Application to Grunfeld's Investment Data
    13.9.8  Summary
  13.10  Summary and Conclusions

CHAPTER 14  Systems of Regression Equations
  14.1  Introduction
  14.2  The Seemingly Unrelated Regressions Model
    14.2.1  Generalized Least Squares
    14.2.2  Seemingly Unrelated Regressions with Identical Regressors
    14.2.3  Feasible Generalized Least Squares
    14.2.4  Maximum Likelihood Estimation
    14.2.5  An Application from Financial Econometrics: The Capital Asset Pricing Model
    14.2.6  Maximum Likelihood Estimation of the Seemingly Unrelated Regressions Model with a Block of Zeros in the Coefficient Matrix
    14.2.7  Autocorrelation and Heteroscedasticity
  14.3  Systems of Demand Equations: Singular Systems
    14.3.1  Cobb-Douglas Cost Function
    14.3.2  Flexible Functional Forms: The Translog Cost Function
  14.4  Nonlinear Systems and GMM Estimation
    14.4.1  GLS Estimation
    14.4.2  Maximum Likelihood Estimation
    14.4.3  GMM Estimation
  14.5  Summary and Conclusions

CHAPTER 15  Simultaneous-Equations Models
  15.1  Introduction
  15.2  Fundamental Issues in Simultaneous-Equations Models
    15.2.1  Illustrative Systems of Equations
    15.2.2  Endogeneity and Causality
    15.2.3  A General Notation for Linear Simultaneous Equations Models
  15.3  The Problem of Identification
    15.3.1  The Rank and Order Conditions for Identification
    15.3.2  Identification Through Other Nonsample Information
    15.3.3  Identification Through Covariance Restrictions: The Fully Recursive Model
  15.4  Methods of Estimation
  15.5  Single Equation: Limited Information Estimation Methods
    15.5.1  Ordinary Least Squares
    15.5.2  Estimation by Instrumental Variables
    15.5.3  Two-Stage Least Squares
    15.5.4  GMM Estimation
    15.5.5  Limited Information Maximum Likelihood and the k Class of Estimators
    15.5.6  Two-Stage Least Squares in Models That Are Nonlinear in Variables
  15.6  System Methods of Estimation
    15.6.1  Three-Stage Least Squares
    15.6.2  Full-Information Maximum Likelihood
    15.6.3  GMM Estimation
    15.6.4  Recursive Systems and Exactly Identified Equations
  15.7  Comparison of Methods: Klein's Model I
  15.8  Specification Tests
  15.9  Properties of Dynamic Models
    15.9.1  Dynamic Models and Their Multipliers
    15.9.2  Stability
    15.9.3  Adjustment to Equilibrium
  15.10  Summary and Conclusions

CHAPTER 16  Estimation Frameworks in Econometrics
  16.1  Introduction
  16.2  Parametric Estimation and Inference
    16.2.1  Classical Likelihood Based Estimation
    16.2.2  Bayesian Estimation
      16.2.2.a  Bayesian Analysis of the Classical Regression Model
      16.2.2.b  Point Estimation
      16.2.2.c  Interval Estimation
      16.2.2.d  Estimation with an Informative Prior Density
      16.2.2.e  Hypothesis Testing
    16.2.3  Using Bayes' Theorem in a Classical Estimation Problem: The Latent Class Model
    16.2.4  Hierarchical Bayes Estimation of a Random Parameters Model by Markov Chain Monte Carlo Simulation
  16.3  Semiparametric Estimation
    16.3.1  GMM Estimation in Econometrics
    16.3.2  Least Absolute Deviations Estimation
    16.3.3  Partially Linear Regression
    16.3.4  Kernel Density Methods
  16.4  Nonparametric Estimation
    16.4.1  Kernel Density Estimation
    16.4.2  Nonparametric Regression
  16.5  Properties of Estimators
    16.5.1  Statistical Properties of Estimators
    16.5.2  Extremum Estimators
    16.5.3  Assumptions for Asymptotic Properties of Extremum Estimators
    16.5.4  Asymptotic Properties of Estimators
    16.5.5  Testing Hypotheses
  16.6  Summary and Conclusions

CHAPTER 17  Maximum Likelihood Estimation
  17.1  Introduction
  17.2  The Likelihood Function and Identification of the Parameters
  17.3  Efficient Estimation: The Principle of Maximum Likelihood
  17.4  Properties of Maximum Likelihood Estimators
    17.4.1  Regularity Conditions
    17.4.2  Properties of Regular Densities
    17.4.3  The Likelihood Equation
    17.4.4  The Information Matrix Equality
    17.4.5  Asymptotic Properties of the Maximum Likelihood Estimator
      17.4.5.a  Consistency
      17.4.5.b  Asymptotic Normality
      17.4.5.c  Asymptotic Efficiency
      17.4.5.d  Invariance
      17.4.5.e  Conclusion
    17.4.6  Estimating the Asymptotic Variance of the Maximum Likelihood Estimator
    17.4.7  Conditional Likelihoods and Econometric Models
  17.5  Three Asymptotically Equivalent Test Procedures
    17.5.1  The Likelihood Ratio Test
    17.5.2  The Wald Test
    17.5.3  The Lagrange Multiplier Test
    17.5.4  An Application of the Likelihood Based Test Procedures
  17.6  Applications of Maximum Likelihood Estimation
    17.6.1  The Normal Linear Regression Model
    17.6.2  Maximum Likelihood Estimation of Nonlinear Regression Models
    17.6.3  Nonnormal Disturbances: The Stochastic Frontier Model
    17.6.4  Conditional Moment Tests of Specification
  17.7  Two-Step Maximum Likelihood Estimation
  17.8  Maximum Simulated Likelihood Estimation
  17.9  Pseudo-Maximum Likelihood Estimation and Robust Asymptotic Covariance Matrices
  17.10  Summary and Conclusions

CHAPTER 18  The Generalized Method of Moments
  18.1  Introduction
  18.2  Consistent Estimation: The Method of Moments
    18.2.1  Random Sampling and Estimating the Parameters of Distributions
    18.2.2  Asymptotic Properties of the Method of Moments Estimator
    18.2.3  Summary: The Method of Moments
  18.3  The Generalized Method of Moments (GMM) Estimator
    18.3.1  Estimation Based on Orthogonality Conditions
    18.3.2  Generalizing the Method of Moments
    18.3.3  Properties of the GMM Estimator
    18.3.4  GMM Estimation of Some Specific Econometric Models
  18.4  Testing Hypotheses in the GMM Framework
    18.4.1  Testing the Validity of the Moment Restrictions
    18.4.2  GMM Counterparts to the Wald, LM, and LR Tests
  18.5  Application: GMM Estimation of a Dynamic Panel Data Model of Local Government Expenditures
  18.6  Summary and Conclusions

CHAPTER 19  Models with Lagged Variables
  19.1  Introduction
  19.2  Dynamic Regression Models
    19.2.1  Lagged Effects in a Dynamic Model
    19.2.2  The Lag and Difference Operators
    19.2.3  Specification Search for the Lag Length
  19.3  Simple Distributed Lag Models
    19.3.1  Finite Distributed Lag Models
    19.3.2  An Infinite Lag Model: The Geometric Lag Model
  19.4  Autoregressive Distributed Lag Models
    19.4.1  Estimation of the ARDL Model
    19.4.2  Computation of the Lag Weights in the ARDL Model
    19.4.3  Stability of a Dynamic Equation
    19.4.4  Forecasting
  19.5  Methodological Issues in the Analysis of Dynamic Models
    19.5.1  An Error Correction Model
    19.5.2  Autocorrelation
    19.5.3  Specification Analysis
    19.5.4  Common Factor Restrictions
  19.6  Vector Autoregressions
    19.6.1  Model Forms
    19.6.2  Estimation
    19.6.3  Testing Procedures
    19.6.4  Exogeneity
    19.6.5  Testing for Granger Causality
    19.6.6  Impulse Response Functions
    19.6.7  Structural VARs
    19.6.8  Application: Policy Analysis with a VAR
    19.6.9  VARs in Microeconomics
  19.7  Summary and Conclusions

CHAPTER 20  Time-Series Models
  20.1  Introduction
  20.2  Stationary Stochastic Processes
    20.2.1  Autoregressive Moving-Average Processes
    20.2.2  Stationarity and Invertibility
    20.2.3  Autocorrelations of a Stationary Stochastic Process
    20.2.4  Partial Autocorrelations of a Stationary Stochastic Process
    20.2.5  Modeling Univariate Time Series
    20.2.6  Estimation of the Parameters of a Univariate Time Series
    20.2.7  The Frequency Domain
      20.2.7.a  Theoretical Results
      20.2.7.b  Empirical Counterparts
  20.3  Nonstationary Processes and Unit Roots
    20.3.1  Integrated Processes and Differencing
    20.3.2  Random Walks, Trends, and Spurious Regressions
    20.3.3  Tests for Unit Roots in Economic Data
    20.3.4  The Dickey-Fuller Tests
    20.3.5  Long Memory Models
  20.4  Cointegration
    20.4.1  Common Trends
    20.4.2  Error Correction and VAR Representations
    20.4.3  Testing for Cointegration
    20.4.4  Estimating Cointegration Relationships
    20.4.5  Application: German Money Demand
      20.4.5.a  Cointegration Analysis and a Long Run Theoretical Model
      20.4.5.b  Testing for Model Instability
  20.5  Summary and Conclusions
CHAPTER 21  Models for Discrete Choice
  21.1  Introduction
  21.2  Discrete Choice Models
  21.3  Models for Binary Choice
    21.3.1  The Regression Approach
    21.3.2  Latent Regression: Index Function Models
    21.3.3  Random Utility Models
  21.4  Estimation and Inference in Binary Choice Models
    21.4.1  Robust Covariance Matrix Estimation
    21.4.2  Marginal Effects
    21.4.3  Hypothesis Tests
    21.4.4  Specification Tests for Binary Choice Models
      21.4.4.a  Omitted Variables
      21.4.4.b  Heteroscedasticity
      21.4.4.c  A Specification Test for Nonnested Models: Testing for the Distribution
    21.4.5  Measuring Goodness of Fit
    21.4.6  Analysis of Proportions Data
  21.5  Extensions of the Binary Choice Model
    21.5.1  Random and Fixed Effects Models for Panel Data
      21.5.1.a  Random Effects Models
      21.5.1.b  Fixed Effects Models
    21.5.2  Semiparametric Analysis
    21.5.3  The Maximum Score Estimator (MSCORE)
    21.5.4  Semiparametric Estimation
    21.5.5  A Kernel Estimator for a Nonparametric Regression Function
    21.5.6  Dynamic Binary Choice Models
  21.6  Bivariate and Multivariate Probit Models
    21.6.1  Maximum Likelihood Estimation
    21.6.2  Testing for Zero Correlation
    21.6.3  Marginal Effects
    21.6.4  Sample Selection
    21.6.5  A Multivariate Probit Model
    21.6.6  Application: Gender Economics Courses in Liberal Arts Colleges
  21.7  Logit Models for Multiple Choices
    21.7.1  The Multinomial Logit Model
    21.7.2  The Conditional Logit Model
    21.7.3  The Independence from Irrelevant Alternatives
    21.7.4  Nested Logit Models
    21.7.5  A Heteroscedastic Logit Model
    21.7.6  Multinomial Models Based on the Normal Distribution
    21.7.7  A Random Parameters Model
    21.7.8  Application: Conditional Logit Model for Travel Mode Choice
  21.8  Ordered Data
  21.9  Models for Count Data
    21.9.1  Measuring Goodness of Fit
    21.9.2  Testing for Overdispersion
    21.9.3  Heterogeneity and the Negative Binomial Regression Model
    21.9.4  Application: The Poisson Regression Model
    21.9.5  Poisson Models for Panel Data
    21.9.6  Hurdle and Zero-Altered Poisson Models
  21.10  Summary and Conclusions

CHAPTER 22  Limited Dependent Variable and Duration Models
  22.1  Introduction
  22.2  Truncation
    22.2.1  Truncated Distributions
    22.2.2  Moments of Truncated Distributions
    22.2.3  The Truncated Regression Model
  22.3  Censored Data
    22.3.1  The Censored Normal Distribution
    22.3.2  The Censored Regression (Tobit) Model
    22.3.3  Estimation
    22.3.4  Some Issues in Specification
      22.3.4.a  Heteroscedasticity
      22.3.4.b  Misspecification of Prob[y* < 0]
      22.3.4.c  Nonnormality
      22.3.4.d  Conditional Moment Tests
    22.3.5  Censoring and Truncation in Models for Counts
    22.3.6  Application: Censoring in the Tobit and Poisson Regression Models
  22.4  The Sample Selection Model
    22.4.1  Incidental Truncation in a Bivariate Distribution
    22.4.2  Regression in a Model of Selection
    22.4.3  Estimation
    22.4.4  Treatment Effects
    22.4.5  The Normality Assumption
    22.4.6  Selection in Qualitative Response Models
  22.5  Models for Duration Data
    22.5.1  Duration Data
    22.5.2  A Regression-Like Approach: Parametric Models of Duration
      22.5.2.a  Theoretical Background
      22.5.2.b  Models of the Hazard Function
      22.5.2.c  Maximum Likelihood Estimation
      22.5.2.d  Exogenous Variables
      22.5.2.e  Heterogeneity
    22.5.3  Other Approaches
  22.6  Summary and Conclusions

APPENDIX A  Matrix Algebra
  A.1  Terminology
  A.2  Algebraic Manipulation of Matrices
    A.2.1  Equality of Matrices
    A.2.2  Transposition
    A.2.3  Matrix Addition
    A.2.4  Vector Multiplication
    A.2.5  A Notation for Rows and Columns of a Matrix
    A.2.6  Matrix Multiplication and Scalar Multiplication
    A.2.7  Sums of Values
    A.2.8  A Useful Idempotent Matrix
  A.3  Geometry of Matrices
    A.3.1  Vector Spaces
    A.3.2  Linear Combinations of Vectors and Basis Vectors
    A.3.3  Linear Dependence
    A.3.4  Subspaces
    A.3.5  Rank of a Matrix
    A.3.6  Determinant of a Matrix
    A.3.7  A Least Squares Problem
  A.4  Solution of a System of Linear Equations
    A.4.1  Systems of Linear Equations
    A.4.2  Inverse Matrices
    A.4.3  Nonhomogeneous Systems of Equations
    A.4.4  Solving the Least Squares Problem
  A.5  Partitioned Matrices
    A.5.1  Addition and Multiplication of Partitioned Matrices
    A.5.2  Determinants of Partitioned Matrices
    A.5.3  Inverses of Partitioned Matrices
    A.5.4  Deviations from Means
    A.5.5  Kronecker Products
  A.6  Characteristic Roots and Vectors
    A.6.1  The Characteristic Equation
    A.6.2  Characteristic Vectors
    A.6.3  General Results for Characteristic Roots and Vectors
    A.6.4  Diagonalization and Spectral Decomposition of a Matrix
    A.6.5  Rank of a Matrix
    A.6.6  Condition Number of a Matrix
    A.6.7  Trace of a Matrix
    A.6.8  Determinant of a Matrix
    A.6.9  Powers of a Matrix
    A.6.10  Idempotent Matrices
    A.6.11  Factoring a Matrix
    A.6.12  The Generalized Inverse of a Matrix
  A.7  Quadratic Forms and Definite Matrices
    A.7.1  Nonnegative Definite Matrices
    A.7.2  Idempotent Quadratic Forms
    A.7.3  Comparing Matrices
  A.8  Calculus and Matrix Algebra
    A.8.1  Differentiation and the Taylor Series
    A.8.2  Optimization
    A.8.3  Constrained Optimization
    A.8.4  Transformations

APPENDIX B  Probability and Distribution Theory
  B.1  Introduction
  B.2  Random Variables
    B.2.1  Probability Distributions
    B.2.2  Cumulative Distribution Function
  B.3  Expectations of a Random Variable
  B.4  Some Specific Probability Distributions
    B.4.1  The Normal Distribution
    B.4.2  The Chi-Squared, t, and F Distributions
    B.4.3  Distributions With Large Degrees of Freedom
    B.4.4  Size Distributions: The Lognormal Distribution
    B.4.5  The Gamma and Exponential Distributions
    B.4.6  The Beta Distribution
    B.4.7  The Logistic Distribution
    B.4.8  Discrete Random Variables
  B.5  The Distribution of a Function of a Random Variable
  B.6  Representations of a Probability Distribution
  B.7  Joint Distributions
    B.7.1  Marginal Distributions
    B.7.2  Expectations in a Joint Distribution
    B.7.3  Covariance and Correlation
    B.7.4  Distribution of a Function of Bivariate Random Variables
  B.8  Conditioning in a Bivariate Distribution
    B.8.1  Regression: The Conditional Mean
    B.8.2  Conditional Variance
    B.8.3  Relationships Among Marginal and Conditional Moments
    B.8.4  The Analysis of Variance
  B.9  The Bivariate Normal Distribution
  B.10  Multivariate Distributions
    B.10.1  Moments
    B.10.2  Sets of Linear Functions
    B.10.3  Nonlinear Functions
  B.11  The Multivariate Normal Distribution
    B.11.1  Marginal and Conditional Normal Distributions
    B.11.2  The Classical Normal Linear Regression Model
    B.11.3  Linear Functions of a Normal Vector
    B.11.4  Quadratic Forms in a Standard Normal Vector
    B.11.5  The F Distribution
    B.11.6  A Full Rank Quadratic Form
    B.11.7  Independence of a Linear and a Quadratic Form

APPENDIX C  Estimation and Inference
  C.1  Introduction
  C.2  Samples and Random Sampling
  C.3  Descriptive Statistics
  C.4  Statistics as Estimators: Sampling Distributions
  C.5  Point Estimation of Parameters
    C.5.1  Estimation in a Finite Sample
    C.5.2  Efficient Unbiased Estimation
  C.6  Interval Estimation
  C.7  Hypothesis Testing
    C.7.1  Classical Testing Procedures
    C.7.2  Tests Based on Confidence Intervals
    C.7.3  Specification Tests

APPENDIX D  Large Sample Distribution Theory
  D.1  Introduction
  D.2  Large-Sample Distribution Theory
    D.2.1  Convergence in Probability
    D.2.2  Other Forms of Convergence and Laws of Large Numbers
    D.2.3  Convergence of Functions
    D.2.4  Convergence to a Random Variable
    D.2.5  Convergence in Distribution: Limiting Distributions
    D.2.6  Central Limit Theorems
    D.2.7  The Delta Method
  D.3  Asymptotic Distributions
    D.3.1  Asymptotic Distribution of a Nonlinear Function
    D.3.2  Asymptotic Expectations
  D.4  Sequences and the Order of a Sequence

APPENDIX E  Computation and Optimization
  E.1  Introduction
  E.2  Data Input and Generation
    E.2.1  Generating Pseudo-Random Numbers
    E.2.2  Sampling from a Standard Uniform Population
    E.2.3  Sampling from Continuous Distributions
    E.2.4  Sampling from a Multivariate Normal Population
    E.2.5  Sampling from a Discrete Population
    E.2.6  The Gibbs Sampler
  E.3  Monte Carlo Studies
  E.4  Bootstrapping and the Jackknife
  E.5  Computation in Econometrics
    E.5.1  Computing Integrals
    E.5.2  The Standard Normal Cumulative Distribution Function
    E.5.3  The Gamma and Related Functions
    E.5.4  Approximating Integrals by Quadrature
    E.5.5  Monte Carlo Integration
    E.5.6  Multivariate Normal Probabilities and Simulated Moments
    E.5.7  Computing Derivatives
  E.6  Optimization
    E.6.1  Algorithms
    E.6.2  Gradient Methods
    E.6.3  Aspects of Maximum Likelihood Estimation
    E.6.4  Optimization with Constraints
    E.6.5  Some Practical Considerations
    E.6.6  Examples

APPENDIX F  Data Sets Used in Applications

APPENDIX G  Statistical Tables

References
Author Index
Subject Index

PREFACE

1. THE FIFTH EDITION OF ECONOMETRIC ANALYSIS

Econometric Analysis is intended for a one-year graduate course in econometrics for social scientists. The prerequisites for this course should include calculus, mathematical statistics, and an introduction to econometrics at the level of, say, Gujarati's Basic Econometrics (McGraw-Hill, 1995) or Wooldridge's Introductory Econometrics: A Modern Approach (South-Western, 2000). Self-contained (for our purposes) summaries of the matrix algebra, mathematical statistics, and statistical theory used later in the book are given in Appendices A through D. Appendix E contains a description of numerical methods that will be useful to practicing econometricians. The formal presentation of econometrics begins with discussion of a fundamental pillar, the linear multiple regression model, in Chapters 2 through 8.
Chapters 9 through 15 present familiar extensions of the single linear equation model, including nonlinear regression, panel data models, the generalized regression model, and systems of equations. The linear model is usually not the sole technique used in most of the contemporary literature. In view of this, the (expanding) second half of this book is devoted to topics that will extend the linear regression model in many directions. Chapters 16 through 18 present the techniques and underlying theory of estimation in econometrics, including GMM and maximum likelihood estimation methods and simulation-based techniques. We end in the last four chapters, 19 through 22, with discussions of current topics in applied econometrics, including time-series analysis and the analysis of discrete choice and limited dependent variable models.

This book has two objectives. The first is to introduce students to applied econometrics, including basic techniques in regression analysis and some of the rich variety of models that are used when the linear model proves inadequate or inappropriate. The second is to present students with sufficient theoretical background that they will recognize new variants of the models learned about here as merely natural extensions that fit within a common body of principles. Thus, I have spent what might seem to be a large amount of effort explaining the mechanics of GMM estimation, nonlinear least squares, and maximum likelihood estimation and GARCH models. To meet the second objective, this book also contains a fair amount of theoretical material, such as that on maximum likelihood estimation and on asymptotic results for regression models. Modern software has made complicated modeling very easy to do, and an understanding of the underlying theory is important.

I had several purposes in undertaking this revision. As in the past, readers continue to send me interesting ideas for my next edition. It is impossible to use them all, of
course. Because the five volumes of the Handbook of Econometrics and two of the Handbook of Applied Econometrics already run to over 4,000 pages, it is also unnecessary. Nonetheless, this revision is appropriate for several reasons. First, there are new and interesting developments in the field, particularly in the areas of microeconometrics (panel data, models for discrete choice) and, of course, in time series, which continues its rapid development. Second, I have taken the opportunity to continue fine-tuning the text as the experience and shared wisdom of my readers accumulates in my files. For this revision, that adjustment has entailed a substantial rearrangement of the material; the main purpose of that was to allow me to add the new material in a more compact and orderly way than I could have with the table of contents in the 4th edition. The literature in econometrics has continued to evolve, and my third objective is to grow with it. This purpose is inherently difficult to accomplish in a textbook. Most of the literature is written by professionals for other professionals, and this textbook is written for students who are in the early stages of their training. But I do hope to provide a bridge to that literature, both theoretical and applied.

This book is a broad survey of the field of econometrics. This field grows continually, and such an effort becomes increasingly difficult. (A partial list of journals devoted at least in part, if not completely, to econometrics now includes the Journal of Applied Econometrics, Journal of Econometrics, Econometric Theory, Econometric Reviews, Journal of Business and Economic Statistics, Empirical Economics, and Econometrica.) Still, my view has always been that the serious student of the field must start somewhere, and one can successfully seek that objective in a single textbook.
This text attempts to survey, at an entry level, enough of the fields in econometrics that a student can comfortably move from here to practice or more advanced study in one or more specialized areas. At the same time, I have tried to present the material in sufficient generality that the reader is also able to appreciate the important common foundation of all these fields and to use the tools that they all employ.

There are now quite a few recently published texts in econometrics. Several have gathered, in compact, elegant treatises, the increasingly advanced and advancing theoretical background of econometrics. Others, such as this book, focus more attention on applications of econometrics. One feature that distinguishes this work from its predecessors is its greater emphasis on nonlinear models. [Davidson and MacKinnon (1993) is a noteworthy, but more advanced, exception.] Computer software now in wide use has made estimation of nonlinear models as routine as estimation of linear ones, and the recent literature reflects that progression. My purpose is to provide a textbook treatment that is in line with current practice. The book concludes with four lengthy chapters on time-series analysis, discrete choice models, and limited dependent variable models. These nonlinear models are now the staples of the applied econometrics literature. This book also contains a fair amount of material that will extend beyond many first courses in econometrics, including, perhaps, the aforementioned chapters on limited dependent variables, the section in Chapter 22 on duration models, and some of the discussions of time series and panel data models. Once again, I have included these in the hope of providing a bridge to the professional literature in these areas.

I have had one overriding purpose that has motivated all five editions of this work.
For the vast majority of readers of books such as this, whose ambition is to use, not develop, econometrics, I believe that it is simply not sufficient to recite the theory of estimation, hypothesis testing, and econometric analysis. Understanding the often subtle background theory is extremely important. But, at the end of the day, my purpose in writing this work, and for my continuing efforts to update it in this now fifth edition, is to show readers how to do econometric analysis. I unabashedly accept the unflattering assessment of a correspondent who once likened this book to a user's guide to econometrics.

2. SOFTWARE AND DATA

There are many computer programs that are widely used for the computations described in this book. All were written by econometricians or statisticians, and in general, all are regularly updated to incorporate new developments in applied econometrics. A sampling of the most widely used packages and Internet home pages where you can find information about them are:

E-Views   www.eviews.com (QMS, Irvine, Calif.)
Gauss     www.aptech.com (Aptech Systems, Kent, Wash.)
LIMDEP    www.limdep.com (Econometric Software, Plainview, N.Y.)
RATS      www.estima.com (Estima, Evanston, Ill.)
SAS       www.sas.com (SAS, Cary, N.C.)
Shazam    shazam.econ.ubc.ca (Ken White, UBC, Vancouver, B.C.)
Stata     www.stata.com (Stata, College Station, Tex.)
TSP       www.tspintl.com (TSP International, Stanford, Calif.)

Programs vary in size, complexity, cost, the amount of programming required of the user, and so on. Journals such as The American Statistician, The Journal of Applied Econometrics, and The Journal of Economic Surveys regularly publish reviews of individual packages and comparative surveys of packages, usually with reference to particular functionality such as panel data analysis or forecasting.

With only a few exceptions, the computations described in this book can be carried out with any of these packages.
We hesitate to link this text to any of them in particular. We have placed for general access a customized version of LIMDEP, which was also written by the author, on the website for this text, http://www.stern.nyu.edu/~wgreene/Text/econometricanalysis.htm. LIMDEP programs used for many of the computations are posted on the sites as well. The data sets used in the examples are also on the website. Throughout the text, these data sets are referred to as Table Fn.m; for example, Table F4.1. The F refers to Appendix F at the back of the text, which contains descriptions of the data sets. The actual data are posted on the website with the other supplementary materials for the text. (The data sets are also replicated in the system format of most of the commonly used econometrics computer programs, including, in addition to LIMDEP, SAS, TSP, SPSS, E-Views, and Stata, so that you can easily import them into whatever program you might be using.)

I should also note, there are now thousands of interesting websites containing software, data sets, papers, and commentary on econometrics. It would be hopeless to attempt any kind of a survey here. But, I do note one which is particularly agreeably structured and well targeted for readers of this book, the data archive for the Journal of Applied Econometrics. This journal publishes many papers that are precisely at the right level for readers of this text. They have archived all the nonconfidential data sets used in their publications since 1994. This useful archive can be found at http://qed.econ.queensu.ca/jae/.

3. ACKNOWLEDGEMENTS

It is a pleasure to express my appreciation to those who have influenced this work. I am grateful to Arthur Goldberger and Arnold Zellner for their encouragement, guidance, and always interesting correspondence. Dennis Aigner and Laurits Christensen were also influential in shaping my views on econometrics.
Some collaborators to the earlier editions whose contributions remain in this one include Aline Quester, David Hensher, and Donald Waldman. The number of students and colleagues whose suggestions have helped to produce what you find here is far too large to allow me to thank them all individually. I would like to acknowledge the many reviewers of my work whose careful reading has vastly improved the book: Badi Baltagi, University of Houston; Neal Beck, University of California at San Diego; Diane Belleville, Columbia University; Anil Bera, University of Illinois; John Burkett, University of Rhode Island; Leonard Carlson, Emory University; Frank Chaloupka, City University of New York; Chris Cornwell, University of Georgia; Mitali Das, Columbia University; Craig Depken II, University of Texas at Arlington; Edward Dwyer, Clemson University; Michael Ellis, Wesleyan University; Martin Evans, New York University; Ed Greenberg, Washington University at St. Louis; Miguel Herce, University of North Carolina; K. Rao Kadiyala, Purdue University; Tong Li, Indiana University; Lubomir Litov, New York University; William Lott, University of Connecticut; Edward Mathis, Villanova University; Mary McGarvey, University of Nebraska-Lincoln; Ed Melnick, New York University; Thad Mirer, State University of New York at Albany; Paul Ruud, University of California at Berkeley; Sherrie Rhine, Chicago Federal Reserve Board; Terry G. Seaks, University of North Carolina at Greensboro; Donald Snyder, California State University at Los Angeles; Steven Stern, University of Virginia; Houston Stokes, University of Illinois at Chicago; Dimitrios Thomakos, Florida International University; Paul Wachtel, New York University; Mark Watson, Harvard University; and Kenneth West, University of Wisconsin. My numerous discussions with B. D. McCullough have improved Appendix E and at the same time increased my appreciation for numerical analysis.
I am especially grateful to Jan Kiviet of the University of Amsterdam, who subjected my third edition to a microscopic examination and provided literally scores of suggestions, virtually all of which appear herein. Chapters 19 and 20 have also benefited from previous reviews by Frank Diebold, B. D. McCullough, Mary McGarvey, and Nagesh Revankar. I would also like to thank Rod Banister, Gladys Soto, Cindy Regan, Mike Reynolds, Marie McHale, Lisa Amato, and Torie Anderson at Prentice Hall for their contributions to the completion of this book. As always, I owe the greatest debt to my wife, Lynne, and to my daughters, Lesley, Allison, Elizabeth, and Julianna.

William H. Greene

Greene-50240 book May 24, 2002 10:36

1  INTRODUCTION

1.1 ECONOMETRICS

In the first issue of Econometrica, the Econometric Society stated that its main object shall be to promote studies that aim at a unification of the theoretical-quantitative and the empirical-quantitative approach to economic problems and that are penetrated by constructive and rigorous thinking similar to that which has come to dominate the natural sciences.

But there are several aspects of the quantitative approach to economics, and no single one of these aspects taken by itself, should be confounded with econometrics. Thus, econometrics is by no means the same as economic statistics. Nor is it identical with what we call general economic theory, although a considerable portion of this theory has a definitely quantitative character. Nor should econometrics be taken as synonomous [sic] with the application of mathematics to economics. Experience has shown that each of these three viewpoints, that of statistics, economic theory, and mathematics, is a necessary, but not by itself a sufficient, condition for a real understanding of the quantitative relations in modern economic life. It is the unification of all three that is powerful. And it is this unification that constitutes econometrics.
Frisch (1933) and his society responded to an unprecedented accumulation of statistical information. They saw a need to establish a body of principles that could organize what would otherwise become a bewildering mass of data. Neither the pillars nor the objectives of econometrics have changed in the years since this editorial appeared. Econometrics is the field of economics that concerns itself with the application of mathematical statistics and the tools of statistical inference to the empirical measurement of relationships postulated by economic theory.

1.2 ECONOMETRIC MODELING

Econometric analysis will usually begin with a statement of a theoretical proposition. Consider, for example, a canonical application:

Example 1.1  Keynes's Consumption Function
From Keynes's (1936) General Theory of Employment, Interest and Money:

We shall therefore define what we shall call the propensity to consume as the functional relationship f between X, a given level of income, and C, the expenditure on consumption out of the level of income, so that C = f(X).

The amount that the community spends on consumption depends (i) partly on the amount of its income, (ii) partly on other objective attendant circumstances, and (iii) partly on the subjective needs and the psychological propensities and habits of the individuals composing it.

The fundamental psychological law upon which we are entitled to depend with great confidence, both a priori from our knowledge of human nature and from the detailed facts of experience, is that men are disposed, as a rule and on the average, to increase their consumption as their income increases, but not by as much as the increase in their income.¹ That is, . . . dC/dX is positive and less than unity.

But, apart from short period changes in the level of income, it is also obvious that a higher absolute level of income will tend as a rule to widen the gap between income and consumption. . . .
These reasons will lead, as a rule, to a greater proportion of income being saved as real income increases.

The theory asserts a relationship between consumption and income, C = f(X), and claims in the third paragraph that the marginal propensity to consume (MPC), dC/dX, is between 0 and 1. The final paragraph asserts that the average propensity to consume (APC), C/X, falls as income rises, or d(C/X)/dX = (MPC − APC)/X < 0. It follows that MPC < APC. The most common formulation of the consumption function is a linear relationship, C = α + βX, that satisfies Keynes's laws if β lies between zero and one and if α is greater than zero.

These theoretical propositions provide the basis for an econometric study. Given an appropriate data set, we could investigate whether the theory appears to be consistent with the observed facts. For example, we could see whether the linear specification appears to be a satisfactory description of the relationship between consumption and income, and, if so, whether α is positive and β is between zero and one. Some issues that might be studied are (1) whether this relationship is stable through time or whether the parameters of the relationship change from one generation to the next (a change in the average propensity to save, 1 − APC, might represent a fundamental change in the behavior of consumers in the economy); (2) whether there are systematic differences in the relationship across different countries, and, if so, what explains these differences; and (3) whether there are other factors that would improve the ability of the model to explain the relationship between consumption and income. For example, Figure 1.1 presents aggregate consumption and personal income in constant dollars for the U.S. for the 10 years of 1970–1979. (See Appendix Table F1.1.) Apparently, at least superficially, the data (the facts) are consistent with the theory.
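A check of this sort can be sketched in a few lines of code. The income and consumption figures below are invented for illustration (they are not the values in Appendix Table F1.1); the usual bivariate least-squares formulas recover the intercept and slope, which are then compared with Keynes's restrictions:

```python
# Hypothetical aggregate data (NOT the Table F1.1 values): income X and
# consumption C in constant dollars, chosen only to illustrate the check.
X = [720.0, 750.0, 780.0, 810.0, 840.0, 870.0, 900.0, 930.0, 960.0, 1000.0]
C = [655.0, 680.0, 710.0, 735.0, 760.0, 790.0, 815.0, 845.0, 870.0, 905.0]

n = len(X)
xbar = sum(X) / n
cbar = sum(C) / n

# Least-squares slope and intercept for the line C = alpha + beta*X.
beta = sum((x - xbar) * (c - cbar) for x, c in zip(X, C)) / \
       sum((x - xbar) ** 2 for x in X)
alpha = cbar - beta * xbar

# Keynes's propositions: 0 < MPC (= beta) < 1 and a positive intercept,
# which together imply APC = C/X > MPC and an APC that falls as X rises.
print(f"alpha = {alpha:.2f}, beta (MPC) = {beta:.3f}")
print("0 < MPC < 1:", 0.0 < beta < 1.0)
print("intercept > 0:", alpha > 0.0)
```

With these made-up figures the fitted slope is about 0.9 and the intercept is small but positive, the qualitative pattern the text describes for the actual 1970–1979 data.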
The relationship appears to be linear, albeit only approximately; the intercept of a line that lies close to most of the points is positive, and the slope is less than one, although not by much.

Economic theories such as Keynes's are typically crisp and unambiguous. Models of demand, production, and aggregate consumption all specify precise, deterministic relationships. Dependent and independent variables are identified, a functional form is specified, and in most cases, at least a qualitative statement is made about the directions of effects that occur when independent variables in the model change. Of course, the model is only a simplification of reality. It will include the salient features of the relationship of interest, but will leave unaccounted for influences that might well be present but are regarded as unimportant. No model could hope to encompass the myriad essentially random aspects of economic life. It is thus also necessary to incorporate stochastic elements. As a consequence, observations on a dependent variable will display variation attributable not only to differences in variables that are explicitly accounted for, but also to the randomness of human behavior and the interaction of countless minor influences that are not. It is understood that the introduction of a random disturbance into a deterministic model is not intended merely to paper over its inadequacies. It is essential to examine the results of the study, in a sort of postmortem, to ensure that the allegedly random, unexplained factor is truly unexplainable. If it is not, the model is, in fact, inadequate.

¹Modern economists are rarely this confident about their theories. More contemporary applications generally begin from first principles and behavioral axioms, rather than simple observation.

[FIGURE 1.1  Consumption Data, 1970–1979. Consumption, C, plotted against income, X.]
The stochastic element endows the model with its statistical properties. Observations on the variable(s) under study are thus taken to be the outcomes of a random process. With a sufficiently detailed stochastic structure and adequate data, the analysis will become a matter of deducing the properties of a probability distribution. The tools and methods of mathematical statistics will provide the operating principles.

A model (or theory) can never truly be confirmed unless it is made so broad as to include every possibility. But it may be subjected to ever more rigorous scrutiny and, in the face of contradictory evidence, refuted. A deterministic theory will be invalidated by a single contradictory observation. The introduction of stochastic elements into the model changes it from an exact statement to a probabilistic description about expected outcomes and carries with it an important implication. Only a preponderance of contradictory evidence can convincingly invalidate the probabilistic model, and what constitutes a preponderance of evidence is a matter of interpretation. Thus, the probabilistic model is less precise but at the same time, more robust.²

The process of econometric analysis departs from the specification of a theoretical relationship. We initially proceed on the optimistic assumption that we can obtain precise measurements on all the variables in a correctly specified model. If the ideal conditions are met at every step, the subsequent analysis will probably be routine. Unfortunately, they rarely are. Some of the difficulties one can expect to encounter are the following:

²See Keuzenkamp and Magnus (1995) for a lengthy symposium on testing in econometrics.

• The data may be badly measured or may correspond only vaguely to the variables in the model. The interest rate is one example.
• Some of the variables may be inherently unmeasurable. Expectations are a case in point.
• The theory may make only a rough guess as to the correct functional form, if it makes any at all, and we may be forced to choose from an embarrassingly long menu of possibilities.
• The assumed stochastic properties of the random terms in the model may be demonstrably violated, which may call into question the methods of estimation and inference procedures we have used.
• Some relevant variables may be missing from the model.

The ensuing steps of the analysis consist of coping with these problems and attempting to cull whatever information is likely to be present in such obviously imperfect data. The methodology is that of mathematical statistics and economic theory. The product is an econometric model.

1.3 DATA AND METHODOLOGY

The connection between underlying behavioral models and the modern practice of econometrics is increasingly strong. Practitioners rely heavily on the theoretical tools of microeconomics including utility maximization, profit maximization, and market equilibrium. Macroeconomic model builders rely on the interactions between economic agents and policy makers. The analyses are directed at subtle, difficult questions that often require intricate, complicated formulations. A few applications:

• What are the likely effects on labor supply behavior of proposed negative income taxes? [Ashenfelter and Heckman (1974).]
• Does a monetary policy regime that is strongly oriented toward controlling inflation impose a real cost in terms of lost output on the U.S. economy? [Cecchetti and Rich (2001).]
• Did 2001's largest federal tax cut in U.S. history contribute to or dampen the concurrent recession? Or was it irrelevant? (Still to be analyzed.)
• Does attending an elite college bring an expected payoff in lifetime expected income sufficient to justify the higher tuition? [Krueger and Dale (2001) and Krueger (2002).]
• Does a voluntary training program produce tangible benefits? Can these benefits be accurately measured? [Angrist (2001).]
Each of these analyses would depart from a formal model of the process underlying the observed data.

The field of econometrics is large and rapidly growing. In one dimension, we can distinguish between theoretical and applied econometrics. Theorists develop new techniques and analyze the consequences of applying particular methods when the assumptions that justify them are not met. Applied econometricians are the users of these techniques and the analysts of data (real world and simulated). Of course, the distinction is far from clean; practitioners routinely develop new analytical tools for the purposes of the study that they are involved in. This book contains a heavy dose of theory, but it is directed toward applied econometrics. I have attempted to survey techniques, admittedly some quite elaborate and intricate, that have seen wide use in the field.

Another loose distinction can be made between microeconometrics and macroeconometrics. The former is characterized largely by its analysis of cross section and panel data and by its focus on individual consumers, firms, and micro-level decision makers. Macroeconometrics is generally involved in the analysis of time series data, usually of broad aggregates such as price levels, the money supply, exchange rates, output, and so on. Once again, the boundaries are not sharp. The very large field of financial econometrics is concerned with long time-series data and occasionally vast panel data sets, but with a very focused orientation toward models of individual behavior. The analysis of market returns and exchange rate behavior is neither macro- nor microeconometric in nature, or perhaps it is some of both. Another application that we will examine in this text concerns spending patterns of municipalities, which, again, rests somewhere between the two fields.
Applied econometric methods will be used for estimation of important quantities, analysis of economic outcomes, markets or individual behavior, testing theories, and for forecasting. The last of these is an art and science in itself, and (fortunately) the subject of a vast library of sources. Though we will briefly discuss some aspects of forecasting, our interest in this text will be on estimation and analysis of models. The presentation, where there is a distinction to be made, will contain a blend of microeconometric and macroeconometric techniques and applications. The first 18 chapters of the book are largely devoted to results that form the platform of both areas. Chapters 19 and 20 focus on time series modeling while Chapters 21 and 22 are devoted to methods more suited to cross sections and panels, and used more frequently in microeconometrics. Save for some brief applications, we will not be spending much time on financial econometrics. For those with an interest in this field, I would recommend the celebrated work by Campbell, Lo, and MacKinlay (1997). It is also necessary to distinguish between time series analysis (which is not our focus) and methods that primarily use time series data. The former is, like forecasting, a growth industry served by its own literature in many fields. While we will employ some of the techniques of time series analysis, we will spend relatively little time developing first principles.

The techniques used in econometrics have been employed in a widening variety of fields, including political methodology, sociology [see, e.g., Long (1997)], health economics, medical research (how do we handle attrition from medical treatment studies?), environmental economics, transportation engineering, and numerous others. Practitioners in these fields and many more are all heavy users of the techniques described in this text.

1.4 PLAN OF THE BOOK

The remainder of this book is organized into five parts:

1.
Chapters 2 through 9 present the classical linear and nonlinear regression models. We will discuss specification, estimation, and statistical inference.
2. Chapters 10 through 15 describe the generalized regression model, panel data applications, and systems of equations.
3. Chapters 16 through 18 present general results on different methods of estimation including maximum likelihood, GMM, and simulation methods. Various estimation frameworks, including non- and semiparametric and Bayesian estimation, are presented in Chapters 16 and 18.
4. Chapters 19 through 22 present topics in applied econometrics. Chapters 19 and 20 are devoted to topics in time series modeling while Chapters 21 and 22 are about microeconometrics, discrete choice modeling, and limited dependent variables.
5. Appendices A through D present background material on tools used in econometrics including matrix algebra, probability and distribution theory, estimation, and asymptotic distribution theory. Appendix E presents results on computation.

Appendices A through D are chapter-length surveys of the tools used in econometrics. Since it is assumed that the reader has some previous training in each of these topics, these summaries are included primarily for those who desire a refresher or a convenient reference. We do not anticipate that these appendices can substitute for a course in any of these subjects. The intent of these chapters is to provide a reasonably concise summary of the results, nearly all of which are explicitly used elsewhere in the book. The data sets used in the numerical examples are described in Appendix F. The actual data sets and other supplementary materials can be downloaded from the website for the text, www.prenhall.com/greene.
Greene-50240 book May 24, 2002 13:34

2  THE CLASSICAL MULTIPLE LINEAR REGRESSION MODEL

2.1 INTRODUCTION

An econometric study begins with a set of propositions about some aspect of the economy. The theory specifies a set of precise, deterministic relationships among variables. Familiar examples are demand equations, production functions, and macroeconomic models. The empirical investigation provides estimates of unknown parameters in the model, such as elasticities or the effects of monetary policy, and usually attempts to measure the validity of the theory against the behavior of observable data. Once suitably constructed, the model might then be used for prediction or analysis of behavior. This book will develop a large number of models and techniques used in this framework.

The linear regression model is the single most useful tool in the econometrician's kit. Though to an increasing degree in the contemporary literature, it is often only the departure point for the full analysis, it remains the device used to begin almost all empirical research. This chapter will develop the model. The next several chapters will discuss more elaborate specifications and complications that arise in the application of techniques that are based on the simple models presented here.

2.2 THE LINEAR REGRESSION MODEL

The multiple linear regression model is used to study the relationship between a dependent variable and one or more independent variables. The generic form of the linear regression model is

    y = f(x1, x2, . . . , xK) + ε
      = x1β1 + x2β2 + · · · + xKβK + ε    (2-1)

where y is the dependent or explained variable and x1, . . . , xK are the independent or explanatory variables. One's theory will specify f(x1, x2, . . . , xK). This function is commonly called the population regression equation of y on x1, . . . , xK. In this setting, y is the regressand and xk, k = 1, . . . , K, are the regressors or covariates.
The underlying theory will specify the dependent and independent variables in the model. It is not always obvious which is appropriately defined as each of these. For example, a demand equation, quantity = β1 + price × β2 + income × β3 + ε, and an inverse demand equation, price = γ1 + quantity × γ2 + income × γ3 + u, are equally valid representations of a market. For modeling purposes, it will often prove useful to think in terms of autonomous variation. One can conceive of movement of the independent variables outside the relationships defined by the model while movement of the dependent variable is considered in response to some independent or exogenous stimulus.¹

The term ε is a random disturbance, so named because it disturbs an otherwise stable relationship. The disturbance arises for several reasons, primarily because we cannot hope to capture every influence on an economic variable in a model, no matter how elaborate. The net effect, which can be positive or negative, of these omitted factors is captured in the disturbance. There are many other contributors to the disturbance in an empirical model. Probably the most significant is errors of measurement. It is easy to theorize about the relationships among precisely defined variables; it is quite another to obtain accurate measures of these variables. For example, the difficulty of obtaining reasonable measures of profits, interest rates, capital stocks, or, worse yet, flows of services from capital stocks is a recurrent theme in the empirical literature. At the extreme, there may be no observable counterpart to the theoretical variable. The literature on the permanent income model of consumption [e.g., Friedman (1957)] provides an interesting example.

We assume that each observation in a sample (yi, xi1, xi2, . . . , xiK), i = 1, . . . , n, is generated by an underlying process described by

    yi = xi1β1 + xi2β2 + · · · + xiKβK + εi.
The observed value of yi is the sum of two parts, a deterministic part and the random part, εi. Our objective is to estimate the unknown parameters of the model, use the data to study the validity of the theoretical propositions, and perhaps use the model to predict the variable y. How we proceed from here depends crucially on what we assume about the stochastic process that has led to our observations of the data in hand.

Example 2.1 Keynes's Consumption Function
Example 1.1 discussed a model of consumption proposed by Keynes in his General Theory (1936). The theory that consumption, C, and income, X, are related certainly seems consistent with the observed "facts" in Figures 1.1 and 2.1. (These data are in Data Table F2.1.) Of course, the linear function is only approximate. Even ignoring the anomalous wartime years, consumption and income cannot be connected by any simple deterministic relationship. The linear model, C = α + βX, is intended only to represent the salient features of this part of the economy. It is hopeless to attempt to capture every influence in the relationship. The next step is to incorporate the inherent randomness in its real-world counterpart. Thus, we write C = f(X, ε), where ε is a stochastic element. It is important not to view ε as a catchall for the inadequacies of the model. The model including ε appears adequate for the data not including the war years, but for 1942-1945, something systematic clearly seems to be missing. Consumption in these years could not rise to rates historically consistent with these levels of income because of wartime rationing. A model meant to describe consumption in this period would have to accommodate this influence.

It remains to establish how the stochastic element will be incorporated in the equation. The most frequent approach is to assume that it is additive. Thus, we recast the equation in stochastic terms: C = α + βX + ε. This equation is an empirical counterpart to Keynes's theoretical model.
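To make the additive-disturbance idea concrete, here is a minimal simulation and least squares fit of C = α + βX + ε. All numbers below are invented for illustration (this is not the Data Table F2.1 series), and the fitting step anticipates the next chapter:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 11                                   # eleven hypothetical years
X = np.linspace(225, 375, n)             # invented income series
alpha, beta = 10.0, 0.9                  # invented "true" parameters
C = alpha + beta * X + rng.normal(0, 5, n)   # C = alpha + beta*X + eps

# Least squares fit of the two-parameter linear model
A = np.column_stack([np.ones(n), X])
a_hat, b_hat = np.linalg.lstsq(A, C, rcond=None)[0]
print(a_hat, b_hat)                      # slope estimate close to 0.9
```

With only eleven observations the intercept is estimated imprecisely, but the slope recovers the invented marginal propensity to consume quite closely.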
But what of those anomalous years of rationing? If we were to ignore our intuition and attempt to fit a line to all these data (the next chapter will discuss at length how we should do that), we might arrive at the dotted line in the figure as our best guess. This line, however, is obviously being distorted by the rationing. A more appropriate specification for these data that accommodates both the stochastic nature of the data and the special circumstances of the years 1942-1945 might be one that shifts straight down in the war years, C = α + βX + dwaryears δw + ε, where the new variable, dwaryears, equals one in 1942-1945 and zero in other years, and δw < 0.

FIGURE 2.1 Consumption Data, 1940-1950.

1 By this definition, it would seem that in our demand relationship, only income would be an independent variable while both price and quantity would be dependent. That makes sense; in a market, price and quantity are determined at the same time, and change only when something outside the market changes. We will return to this specific case in Chapter 15.

One of the most useful aspects of the multiple regression model is its ability to identify the independent effects of a set of variables on a dependent variable. Example 2.2 describes a common application.

Example 2.2 Earnings and Education
A number of recent studies have analyzed the relationship between earnings and education. We would expect, on average, higher levels of education to be associated with higher incomes. The simple regression model, earnings = β1 + β2 education + ε, however, neglects the fact that most people have higher incomes when they are older than when they are young, regardless of their education. Thus, β2 will overstate the marginal impact of education.
If age and education are positively correlated, then the regression model will associate all the observed increases in income with increases in education. A better specification would account for the effect of age, as in earnings = β1 + β2 education + β3 age + ε. It is often observed that income tends to rise less rapidly in the later earning years than in the early ones. To accommodate this possibility, we might extend the model to

earnings = β1 + β2 education + β3 age + β4 age² + ε.

We would expect β3 to be positive and β4 to be negative. The crucial feature of this model is that it allows us to carry out a conceptual experiment that might not be observed in the actual data. In the example, we might like to (and could) compare the earnings of two individuals of the same age with different amounts of education even if the data set does not actually contain two such individuals. How education should be measured in this setting is a difficult problem. The study of the earnings of twins by Ashenfelter and Krueger (1994), which uses precisely this specification of the earnings equation, presents an interesting approach. We will examine this study in some detail in Section 5.6.4.

A large literature has been devoted to an intriguing question on this subject. Education is not truly "independent" in this setting. Highly motivated individuals will choose to pursue more education (for example, by going to college or graduate school) than others. By the same token, highly motivated individuals may do things that, on average, lead them to have higher incomes. If so, does a positive β2 that suggests an association between income and education really measure the effect of education on income, or does it reflect the effect of some underlying effect on both variables that we have not included in our regression model? We will revisit the issue in Section 22.4.
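The overstatement of β2 when age is omitted can be seen in a simulation. Every number below is invented; the point is only the mechanics of omitted-variable bias in an earnings equation of this form:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5000
age = rng.uniform(20, 65, n)
# Hypothetical positive link between age and schooling, for illustration only
educ = 8 + 0.1 * age + rng.normal(0, 2, n)
# Invented "truth": beta2 = 1.0, age enters positively, age^2 negatively
earn = 5 + 1.0 * educ + 0.8 * age - 0.008 * age**2 + rng.normal(0, 3, n)

def ols(Z, y):
    return np.linalg.lstsq(Z, y, rcond=None)[0]

short = ols(np.column_stack([np.ones(n), educ]), earn)            # age omitted
full = ols(np.column_stack([np.ones(n), educ, age, age**2]), earn)
print(short[1], full[1])   # the short regression overstates beta2 = 1.0
```

Because education and age are positively correlated here, the short regression credits education with part of the age effect, while the full regression recovers the invented coefficient.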
2.3 ASSUMPTIONS OF THE CLASSICAL LINEAR REGRESSION MODEL

The classical linear regression model consists of a set of assumptions about how a data set will be produced by an underlying "data-generating process." The theory will specify a deterministic relationship between the dependent variable and the independent variables. The assumptions that describe the form of the model and relationships among its parts and imply appropriate estimation and inference procedures are listed in Table 2.1.

2.3.1 LINEARITY OF THE REGRESSION MODEL

Let the column vector xk be the n observations on variable xk, k = 1, . . . , K, and assemble these data in an n × K data matrix X. In most contexts, the first column of X is assumed to be a column of 1s so that β1 is the constant term in the model. Let y be the n observations, y1, . . . , yn, and let ε be the column vector containing the n disturbances.

TABLE 2.1 Assumptions of the Classical Linear Regression Model
A1. Linearity: yi = xi1β1 + xi2β2 + · · · + xiKβK + εi. The model specifies a linear relationship between y and x1, . . . , xK.
A2. Full rank: There is no exact linear relationship among any of the independent variables in the model. This assumption will be necessary for estimation of the parameters of the model.
A3. Exogeneity of the independent variables: E[εi | xj1, xj2, . . . , xjK] = 0. This states that the expected value of the disturbance at observation i in the sample is not a function of the independent variables observed at any observation, including this one. This means that the independent variables will not carry useful information for prediction of εi.
A4. Homoscedasticity and nonautocorrelation: Each disturbance, εi, has the same finite variance, σ², and is uncorrelated with every other disturbance, εj. This assumption limits the generality of the model, and we will want to examine how to relax it in the chapters to follow.
A5. Exogenously generated data: The data in (xj1, xj2, . . .
, xjK) may be any mixture of constants and random variables. The process generating the data operates outside the assumptions of the model, that is, independently of the process that generates εi. Note that this extends A3. Analysis is done conditionally on the observed X.
A6. Normal distribution: The disturbances are normally distributed. Once again, this is a convenience that we will dispense with after some analysis of its implications.

The model in (2-1) as it applies to all n observations can now be written

y = x1β1 + · · · + xKβK + ε, (2-2)

or in the form of Assumption 1,

ASSUMPTION: y = Xβ + ε. (2-3)

A NOTATIONAL CONVENTION. Henceforth, to avoid a possibly confusing and cumbersome notation, we will use a boldface x to denote a column or a row of X. Which applies will be clear from the context. In (2-2), xk is the kth column of X. Subscripts j and k will be used to denote columns (variables). It will often be convenient to refer to a single observation in (2-3), which we would write

yi = xi′β + εi. (2-4)

Subscripts i and t will generally be used to denote rows (observations) of X. In (2-4), xi is a column vector that is the transpose of the ith 1 × K row of X.

Our primary interest is in estimation and inference about the parameter vector β. Note that the simple regression model in Example 2.1 is a special case in which X has only two columns, the first of which is a column of 1s. The assumption of linearity of the regression model includes the additive disturbance. For the regression to be linear in the sense described here, it must be of the form in (2-1) either in the original variables or after some suitable transformation. For example, the model y = Ax^β e^ε is linear (after taking logs on both sides of the equation), whereas y = Ax^β + ε is not. The observed dependent variable is thus the sum of two components, a deterministic element α + βx and a random variable ε.
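The notation in (2-2) through (2-4) maps directly onto array operations. A small sketch with invented data, in which the first column of X is the column of 1s:

```python
import numpy as np

n, K = 6, 3
rng = np.random.default_rng(2)
x2 = rng.normal(size=n)
x3 = rng.normal(size=n)

# First column of X is a column of ones, so beta_1 is the constant term
X = np.column_stack([np.ones(n), x2, x3])        # the n x K data matrix
beta = np.array([1.0, 2.0, -0.5])                # invented parameter vector
eps = rng.normal(0, 0.1, n)                      # disturbance vector
y = X @ beta + eps                               # y = X beta + eps, eq. (2-3)

x_i = X[0]                  # the ith row of X, stored as a vector
y_0 = x_i @ beta + eps[0]   # y_i = x_i' beta + eps_i, eq. (2-4)
print(np.isclose(y_0, y[0]))
```

Row i of the matrix product reproduces the single-observation form exactly, which is all that the notational convention asserts.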
It is worth emphasizing that neither of the two parts is directly observed because α and β are unknown.

The linearity assumption is not so narrow as it might first appear. In the regression context, linearity refers to the manner in which the parameters and the disturbance enter the equation, not necessarily to the relationship among the variables. For example, the equations y = α + βx + ε, y = α + β cos(x) + ε, y = α + β/x + ε, and y = α + β ln x + ε are all linear in some function of x by the definition we have used here. In the examples, only x has been transformed, but y could have been as well, as in y = Ax^β e^ε, which is a linear relationship in the logs of x and y; ln y = α + β ln x + ε. The variety of functions is unlimited. This aspect of the model is used in a number of commonly used functional forms. For example, the loglinear model is

ln y = β1 + β2 ln x2 + β3 ln x3 + · · · + βK ln xK + ε.

This equation is also known as the constant elasticity form, as in this equation the elasticity of y with respect to changes in xk is ∂ln y/∂ln xk = βk, which does not vary with xk. The loglinear form is often used in models of demand and production. Different values of β produce widely varying functions.

Example 2.3 The U.S. Gasoline Market
Data on the U.S. gasoline market for the years 1960-1995 are given in Table F2.2 in Appendix F. We will use these data to obtain, among other things, estimates of the income, own price, and cross-price elasticities of demand in this market. These data also present an interesting question on the issue of holding "all other things constant" that was suggested in Example 2.2. In particular, consider a somewhat abbreviated model of per capita gasoline consumption:

ln(G/pop) = β1 + β2 ln income + β3 ln priceG + β4 ln Pnewcars + β5 ln Pusedcars + ε.
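In a loglinear specification like the gasoline equation above, each slope coefficient is an elasticity that is constant by construction. A quick numerical check on a hypothetical two-variable version (all coefficient values invented):

```python
import numpy as np

# Invented loglinear model: ln y = b1 + b2 ln x, so d ln y / d ln x = b2
b1, b2 = 0.5, -1.2

def lny(lnx):
    return b1 + b2 * lnx

# Numerical elasticity at two very different points: identical by construction
h = 1e-6
e1 = (lny(np.log(2.0) + h) - lny(np.log(2.0))) / h
e2 = (lny(np.log(50.0) + h) - lny(np.log(50.0))) / h
print(e1, e2)   # both equal b2 = -1.2 up to rounding
```

In a linear-in-levels model, by contrast, the elasticity β x/y changes with the point of evaluation, which is why the loglinear form is called the constant elasticity form.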
This model will provide estimates of the income and price elasticities of demand for gasoline and an estimate of the elasticity of demand with respect to the prices of new and used cars. What should we expect for the sign of β4? Cars and gasoline are complementary goods, so if the prices of new cars rise, ceteris paribus, gasoline consumption should fall. Or should it? If the prices of new cars rise, then consumers will buy fewer of them; they will keep their used cars longer. If older cars use more gasoline than newer ones, then the rise in the prices of new cars would lead to higher gasoline consumption than otherwise, not lower. We can use the multiple regression model and the gasoline data to attempt to answer the question.

A semilog model is often used to model growth rates: ln yt = xt′β + δt + εt. In this model, the autonomous (at least not explained by the model itself) proportional, per period growth rate is d ln y/dt = δ. Other variations of the general form f(yt) = g(xt′β + δt) will allow a tremendous variety of functional forms, all of which fit into our definition of a linear model.

The linear regression model is sometimes interpreted as an approximation to some unknown, underlying function. (See Section A.8.1 for discussion.) By this interpretation, however, the linear model, even with quadratic terms, is fairly limited in that such an approximation is likely to be useful only over a small range of variation of the independent variables. The translog model discussed in Example 2.4, in contrast, has proved far more effective as an approximating function.

Example 2.4 The Translog Model
Modern studies of demand and production are usually done in the context of a flexible functional form. Flexible functional forms are used in econometrics because they allow analysts to model second-order effects such as elasticities of substitution, which are functions of the second derivatives of production, cost, or utility functions.
The linear model restricts these to equal zero, whereas the loglinear model (e.g., the Cobb-Douglas model) restricts the interesting elasticities to the uninteresting values of -1 or +1. The most popular flexible functional form is the translog model, which is often interpreted as a second-order approximation to an unknown functional form. [See Berndt and Christensen (1973).] One way to derive it is as follows. We first write y = g(x1, . . . , xK). Then, ln y = ln g(·) = f(·). Since by a trivial transformation xk = exp(ln xk), we interpret the function as a function of the logarithms of the x's. Thus, ln y = f(ln x1, . . . , ln xK).

Now, expand this function in a second-order Taylor series around the point x = [1, 1, . . . , 1]′ so that at the expansion point, the log of each variable is a convenient zero. Then

ln y = f(0) + Σk=1..K [∂f(·)/∂ln xk]|ln x=0 ln xk + (1/2) Σk=1..K Σl=1..K [∂²f(·)/∂ln xk ∂ln xl]|ln x=0 ln xk ln xl + ε.

The disturbance in this model is assumed to embody the familiar factors and the error of approximation to the unknown function. Since the function and its derivatives evaluated at the fixed value 0 are constants, we interpret them as the coefficients and write

ln y = β0 + Σk=1..K βk ln xk + (1/2) Σk=1..K Σl=1..K γkl ln xk ln xl + ε.

This model is linear by our definition but can, in fact, mimic an impressive amount of curvature when it is used to approximate another function. An interesting feature of this formulation is that the loglinear model is a special case, γkl = 0. Also, there is an interesting test of the underlying theory possible because if the underlying function were assumed to be continuous and twice continuously differentiable, then by Young's theorem it must be true that γkl = γlk. We will see in Chapter 14 how this feature is studied in practice.
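Constructing the translog regressors is itself a linear-model exercise: the model is nonlinear in the variables but linear in its coefficients. A sketch for K = 2 with invented data, imposing the symmetry γ12 = γ21 by keeping a single cross-product column:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 8
x = rng.uniform(0.5, 2.0, (n, 2))        # two invented inputs
L = np.log(x)                            # ln x1, ln x2

# Translog regressors: constant, ln x_k, and the second-order log terms.
# Young's theorem (gamma_12 = gamma_21) lets us fold the two cross terms
# into one column.
Z = np.column_stack([
    np.ones(n),
    L[:, 0], L[:, 1],
    0.5 * L[:, 0]**2,
    0.5 * L[:, 1]**2,
    L[:, 0] * L[:, 1],
])
print(Z.shape)   # (8, 6): still a linear model in the coefficients
```

Setting the coefficients on the last three columns to zero recovers the Cobb-Douglas (loglinear) special case mentioned above.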
Despite its great flexibility, the linear model does not include all the situations we encounter in practice. For a simple example, there is no transformation that will reduce y = α + 1/(β1 + β2x) + ε to linearity. The methods we consider in this chapter are not appropriate for estimating the parameters of such a model. Relatively straightforward techniques have been developed for nonlinear models such as this, however. We shall treat them in detail in Chapter 9.

2.3.2 FULL RANK

Assumption 2 is that there are no exact linear relationships among the variables.

ASSUMPTION: X is an n × K matrix with rank K. (2-5)

Hence, X has full column rank; the columns of X are linearly independent and there are at least K observations. [See (A-42) and the surrounding text.] This assumption is known as an identification condition. To see the need for this assumption, consider an example.

Example 2.5 Short Rank
Suppose that a cross-section model specifies

C = β1 + β2 nonlabor income + β3 salary + β4 total income + ε,

where total income is exactly equal to salary plus nonlabor income. Clearly, there is an exact linear dependency in the model. Now let β2′ = β2 + a, β3′ = β3 + a, and β4′ = β4 - a, where a is any number. Then the exact same value appears on the right-hand side of C if we substitute β2′, β3′, and β4′ for β2, β3, and β4. Obviously, there is no way to estimate the parameters of this model.

If there are fewer than K observations, then X cannot have full rank. Hence, we make the (redundant) assumption that n is at least as large as K.

In a two-variable linear model with a constant term, the full rank assumption means that there must be variation in the regressor x. If there is no variation in x, then all our observations will lie on a vertical line. This situation does not invalidate the other assumptions of the model; presumably, it is a flaw in the data set.
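The short rank situation in Example 2.5 can be verified numerically: when total income is constructed as salary plus nonlabor income, the data matrix has rank 3, not 4 (data invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(4)
n = 20
nonlabor = rng.uniform(0, 10, n)
salary = rng.uniform(20, 80, n)
total = salary + nonlabor                # exact linear dependency

X = np.column_stack([np.ones(n), nonlabor, salary, total])
print(np.linalg.matrix_rank(X))          # 3, not 4: Assumption 2 fails
```

No amount of additional data repairs this: every new row satisfies the same identity, so the fourth column remains a linear combination of the others.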
The possibility that this suggests is that we could have drawn a sample in which there was variation in x, but in this instance, we did not. Thus, the model still applies, but we cannot learn about it from the data set in hand.

2.3.3 REGRESSION

The disturbance is assumed to have conditional expected value zero at every observation, which we write as

E[εi | X] = 0. (2-6)

For the full set of observations, we write Assumption 3 as:

ASSUMPTION: E[ε | X] = [E[ε1 | X], E[ε2 | X], . . . , E[εn | X]]′ = 0. (2-7)

There is a subtle point in this discussion that the observant reader might have noted. In (2-7), the left-hand side states, in principle, that the mean of each εi conditioned on all observations xi is zero. This conditional mean assumption states, in words, that no observations on x convey information about the expected value of the disturbance. It is conceivable, for example in a time-series setting, that although xi might provide no information about E[εi | ·], xj at some other observation, such as in the next time period, might. Our assumption at this point is that there is no information about E[εi | ·] contained in any observation xj. Later, when we extend the model, we will study the implications of dropping this assumption. [See Wooldridge (1995).] We will also assume that the disturbances convey no information about each other. That is, E[εi | ε1, . . . , εi-1, εi+1, . . . , εn] = 0. In sum, at this point, we have assumed that the disturbances are purely random draws from some population.

The zero conditional mean implies that the unconditional mean is also zero, since E[εi] = EX[E[εi | X]] = EX[0] = 0. Since, for each εi, Cov[E[εi | X], X] = Cov[εi, X], Assumption 3 implies that Cov[εi, X] = 0 for all i. (Exercise: Is the converse true?)

In most cases, the zero mean assumption is not restrictive. Consider a two-variable model and suppose that the mean of ε is μ ≠ 0. Then α + βx + ε is the same as (α + μ) + βx + (ε - μ). Letting α′ = α + μ and ε′ = ε - μ produces the original model.
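This reparameterization is easy to see in a simulation: if ε has mean μ ≠ 0 and the model contains a constant term, least squares simply estimates α + μ as the intercept while the slope is unaffected (all values invented):

```python
import numpy as np

rng = np.random.default_rng(5)
n = 10000
x = rng.normal(size=n)
mu = 2.0                                  # invented nonzero disturbance mean
eps = rng.normal(mu, 1.0, n)              # E[eps] = mu, not zero
y = 1.0 + 0.5 * x + eps                   # alpha = 1, beta = 0.5

X = np.column_stack([np.ones(n), x])
b = np.linalg.lstsq(X, y, rcond=None)[0]
print(b)   # intercept estimates alpha + mu = 3; slope still near 0.5
```

The constant term absorbs μ, which is why the zero mean assumption is harmless here but can be substantive in a model specified without a constant.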
For an application, see the discussion of frontier production functions in Section 17.6.3.

But, if the original model does not contain a constant term, then assuming E[εi] = 0 could be substantive. If E[εi] can be expressed as a linear function of xi, then, as before, a transformation of the model will produce disturbances with zero means. But, if not, then the nonzero mean of the disturbances will be a substantive part of the model structure. This does suggest that there is a potential problem in models without constant terms. As a general rule, regression models should not be specified without constant terms unless this is specifically dictated by the underlying theory.2 Arguably, if we have reason to specify that the mean of the disturbance is something other than zero, we should build it into the systematic part of the regression, leaving in the disturbance only the unknown part of ε.

Assumption 3 also implies that

E[y | X] = Xβ. (2-8)

Assumptions 1 and 3 comprise the linear regression model. The regression of y on X is the conditional mean, E[y | X], so that without Assumption 3, Xβ is not the conditional mean function. The remaining assumptions will more completely specify the characteristics of the disturbances in the model and state the conditions under which the sample observations on x are obtained.

2.3.4 SPHERICAL DISTURBANCES

The fourth assumption concerns the variances and covariances of the disturbances:

Var[εi | X] = σ², for all i = 1, . . . , n,

and

Cov[εi, εj | X] = 0, for all i ≠ j.

Constant variance is labeled homoscedasticity. Consider a model that describes the profits of firms in an industry as a function of, say, size. Even accounting for size, measured in dollar terms, the profits of large firms will exhibit greater variation than those of smaller firms. The homoscedasticity assumption would be inappropriate here.
Also, survey data on household expenditure patterns often display marked heteroscedasticity, even after accounting for income and household size.

Uncorrelatedness across observations is labeled generically nonautocorrelation. In Figure 2.1, there is some suggestion that the disturbances might not be truly independent across observations. Although the number of observations is limited, it does appear that, on average, each disturbance tends to be followed by one with the same sign. This "inertia" is precisely what is meant by autocorrelation, and it is assumed away at this point. Methods of handling autocorrelation in economic data occupy a large proportion of the literature and will be treated at length in Chapter 12. Note that nonautocorrelation does not imply that observations yi and yj are uncorrelated. The assumption is that deviations of observations from their expected values are uncorrelated.

2 Models that describe first differences of variables might well be specified without constants. Consider Δyt = yt - yt-1. If there is a constant term α on the right-hand side of the equation, then yt is a function of αt, which is an explosive regressor. Models with linear time trends merit special treatment in the time-series literature. We will return to this issue in Chapter 19.

The two assumptions imply that

E[εε′ | X] =
| E[ε1ε1 | X]  E[ε1ε2 | X]  · · ·  E[ε1εn | X] |
| E[ε2ε1 | X]  E[ε2ε2 | X]  · · ·  E[ε2εn | X] |
|      ·             ·                  ·       |
| E[εnε1 | X]  E[εnε2 | X]  · · ·  E[εnεn | X] |

=
| σ²  0   · · ·  0  |
| 0   σ²  · · ·  0  |
| ·   ·          ·  |
| 0   0   · · ·  σ² |,

which we summarize in Assumption 4:

ASSUMPTION: E[εε′ | X] = σ²I. (2-9)

By using the variance decomposition formula in (B-70), we find

Var[ε] = E[Var[ε | X]] + Var[E[ε | X]] = σ²I.

Once again, we should emphasize that this assumption describes the information about the variances and covariances among the disturbances that is provided by the independent variables. For the present, we assume that there is none.
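Assumption 4 can be illustrated by Monte Carlo: averaging εε′ over many draws of a spherical disturbance vector reproduces σ²I. A sketch with an invented σ² = 4:

```python
import numpy as np

rng = np.random.default_rng(6)
sigma2 = 4.0
n, reps = 5, 200000

# Draw many n-vectors of homoscedastic, uncorrelated disturbances
# and average the outer products eps eps'
eps = rng.normal(0.0, np.sqrt(sigma2), (reps, n))
M = eps.T @ eps / reps                   # Monte Carlo estimate of E[eps eps']
print(np.round(M, 1))                    # close to sigma^2 * I = 4 * I
```

The diagonal entries converge to the common variance σ² and the off-diagonal entries to zero, which is exactly the content of (2-9).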
We will also drop this assumption later when we enrich the regression model. We are also assuming that the disturbances themselves provide no information about the variances and covariances. Although a minor issue at this point, it will become crucial in our treatment of time series applications. Models such as Var[εt | εt-1] = σ² + αε²t-1, a GARCH model (see Section 11.8), do not violate our conditional variance assumption, but do assume that Var[εt | εt-1] ≠ Var[εt].

Disturbances that meet the twin assumptions of homoscedasticity and nonautocorrelation are sometimes called spherical disturbances.3

2.3.5 DATA GENERATING PROCESS FOR THE REGRESSORS

It is common to assume that xi is nonstochastic, as it would be in an experimental situation. Here the analyst chooses the values of the regressors and then observes yi. This process might apply, for example, in an agricultural experiment in which yi is yield and xi is fertilizer concentration and water applied. The assumption of nonstochastic regressors at this