Part 9: Model Building
Regression Models
Professor William Greene
Stern School of Business
IOMS Department
Department of Economics
Regression and Forecasting Models
Part 9 – Model Building
Multiple Regression Models
Using Binary Variables
Logs and Elasticities
Trends in Time Series Data
Using Quadratic Terms to Improve the Model
Using Dummy Variables
Dummy variable = binary variable = a variable that takes the values 0 and 1.
E.g., OECD life expectancies compared to the rest of the world:
DALE = β0 + β1 EDUC + β2 PCHexp + β3 OECD + ε
Australia, Austria, Belgium, Canada, Czech Republic, Denmark, Finland, France, Germany, Greece, Hungary, Iceland, Ireland, Italy, Japan, Korea, Luxembourg, Mexico, The Netherlands, New Zealand, Norway, Poland, Portugal, Slovak Republic, Spain, Sweden, Switzerland, Turkey, United Kingdom, United States.
OECD Life Expectancy
According to these results, after accounting for education and health expenditure differences, people in the OECD countries have a life expectancy that is 1.191 years shorter than people in other countries.
A Binary Variable in Regression
We set PCHExp to 1000, approximately the sample mean.
The regression shifts down by 1.191 years for the OECD countries
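A minimal sketch of the parallel shift produced by a dummy variable. Only the OECD coefficient (-1.191) comes from the slides; the intercept, the EDUC and PCHexp coefficients, and the function name `predict_dale` are hypothetical placeholders chosen for illustration.

```python
# Hypothetical coefficients for illustration; only B3_OECD is taken from the slide.
B0 = 30.0           # assumed intercept
B1_EDUC = 2.0       # assumed effect of one more year of education
B2_PCHEXP = 0.004   # assumed effect of one more unit of per capita health spending
B3_OECD = -1.191    # OECD dummy coefficient reported on the slide


def predict_dale(educ, pchexp, oecd):
    """Predicted DALE for given education, health expenditure and OECD dummy (0 or 1)."""
    return B0 + B1_EDUC * educ + B2_PCHEXP * pchexp + B3_OECD * oecd


# Hold PCHexp at 1000 (as on the slide) and vary education:
# the OECD line sits a constant B3_OECD below the non-OECD line.
for educ in (4, 8, 12):
    gap = predict_dale(educ, 1000, 1) - predict_dale(educ, 1000, 0)
    print(educ, round(gap, 3))
```

Whatever values the other coefficients take, the gap between the two lines is the dummy coefficient, which is why the regression "shifts" rather than tilts.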
Dummy Variable in a Log Regression
E.g., Monet’s signature equation
log Price = β0 + β1 logArea + β2 Signed
Unsigned: PriceU = exp(β0) Area^β1
Signed: PriceS = exp(β0) Area^β1 exp(β2)
Signed/Unsigned = exp(β2)
%Difference = 100%(Signed – Unsigned)/Unsigned = 100%[exp(β2) – 1]
The Signature Effect: 253%
100%[exp(1.2618) – 1] = 100%[3.532 – 1] = 253.2 %
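The conversion from a dummy coefficient in a log regression to a percentage difference can be sketched as below. The function name `dummy_pct_effect` is an assumption for illustration; the coefficient 1.2618 is the one on the slide.

```python
import math


def dummy_pct_effect(b):
    """Percent change in y when a 0/1 dummy switches on in a log-y regression."""
    return 100.0 * (math.exp(b) - 1.0)


# Coefficient on Signed from the slide.
print(round(dummy_pct_effect(1.2618), 1))

# Reading the coefficient directly as a percentage (100*b) would understate a large effect.
print(round(100 * 1.2618, 1))
```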
Monet Paintings in Millions
[Figure: scatterplot of Price (millions) vs. Square Inches, with separate markers for Signed = 0 and Signed = 1]
Predicted Price is exp(4.122 + 1.3458*logArea + 1.2618*Signed) / 1000000
Difference is about 253%
Logs in Regression
Elasticity
The coefficient on log(Area) is 1.346.
For each 1% increase in area, price goes up by 1.346%, even accounting for the signature effect.
The elasticity is +1.346.
Remarkable. Not only does price increase with area, it increases much faster than area.
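As a rough illustration of what an elasticity of 1.346 implies, the sketch below uses the power form Price ∝ Area^1.346 implied by the log-log equation. The function name `price_ratio` is hypothetical, and everything other than area is held fixed.

```python
def price_ratio(area_ratio, elasticity=1.346):
    """Predicted price multiple when area is multiplied by area_ratio, all else fixed."""
    return area_ratio ** elasticity


# A 1% larger canvas: the exact change is close to the elasticity itself.
print(round(100 * (price_ratio(1.01) - 1), 3))

# Doubling the area: the small-change approximation no longer applies, so use the power form.
print(round(price_ratio(2.0), 3))
```

The "1% in area gives 1.346% in price" reading is a small-change approximation; for large changes the power form is the safer calculation.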
Monet: By the Square Inch
[Figure: scatterplot of Price vs. Area]
Logs and Elasticities
Theory: When the variables are in logs:
change in log x ≈ proportional change in x, so 100 × (change in log x) ≈ % change in x
log y = α + β1 log x1 + β2 log x2 + … + βK log xK + ε
Elasticity of y with respect to xk = βk
Elasticities
Price elasticity = -0.02070
Income elasticity = +1.10318
A Set of Dummy Variables
A complete set of dummy variables divides the sample into groups.
Fit the regression with “group” effects.
Need to drop one (any one) of the variables to compute the regression. (Avoid the “dummy variable trap.”)
Rankings of 132 U.S. Liberal Arts Colleges
Reputation = β0 + β1 Religious + β2 GenderEcon + β3 EconFac + β4 North + β5 South + β6 Midwest + β7 West + ε
Nancy Burnett: Journal of Economic Education, 1998
Minitab does not like this model.
Too many dummy variables
If we use all four region dummies together with the constant, one of them is redundant:
Reputation = b0 + bn + … if north
Reputation = b0 + bm + … if midwest
Reputation = b0 + bs + … if south
Reputation = b0 + bw + … if west
Only three are needed, so Minitab dropped west:
Reputation = b0 + bn + … if north
Reputation = b0 + bm + … if midwest
Reputation = b0 + bs + … if south
Reputation = b0 + … if west
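The redundancy can be seen directly in the data: with all four region dummies, every row's dummies add up to 1, which is exactly the constant column. A small sketch, using a made-up list of region labels and a hypothetical helper `region_dummies`:

```python
REGIONS = ("north", "south", "midwest", "west")


def region_dummies(region, omit=None):
    """Return 0/1 dummies for one college's region, skipping the omitted category."""
    return [1 if region == r else 0 for r in REGIONS if r != omit]


# Hypothetical sample: one region label per college.
sample = ["north", "west", "south", "midwest", "west", "north"]

# With all four dummies, every row sums to 1, which duplicates the constant column.
full = [region_dummies(r) for r in sample]
print(all(sum(row) == 1 for row in full))

# Dropping west breaks the exact dependence: west rows are all zeros.
reduced = [region_dummies(r, omit="west") for r in sample]
print([sum(row) for row in reduced])
```

Because the four dummies reproduce the constant exactly, least squares has no unique solution until one column is dropped. The omitted region becomes the baseline that the remaining coefficients are measured against.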
Unordered Categorical Variables
House price data (fictitious)
Style 1 = Split level
Style 2 = Ranch
Style 3 = Colonial
Style 4 = Tudor
Use 3 dummy variables for this kind of data. (Not all 4)
Using variable STYLE in the model makes no sense. You could change the numbering scale any way you like. 1,2,3,4 are just labels.
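One way to turn the STYLE codes into three dummy variables is sketched below. The indicator names and the helper `style_to_dummies` are hypothetical, and Split level (code 1) is assumed to be the omitted base category.

```python
STYLE_NAMES = {1: "SplitLevel", 2: "Ranch", 3: "Colonial", 4: "Tudor"}


def style_to_dummies(style_code, base=1):
    """Map a STYLE code to 0/1 indicators, leaving out the base category."""
    return {
        name: int(style_code == code)
        for code, name in STYLE_NAMES.items()
        if code != base
    }


# Hypothetical STYLE column for five houses.
for code in [1, 2, 2, 4, 3]:
    print(code, style_to_dummies(code))
```

The numeric codes are used only to look up a label, so renumbering the styles would leave the indicators, and hence the regression, unchanged.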
Transform Style to Types
House Price Regression
Each of these is relative to a Split Level, since that is the omitted category. E.g., the price of a Ranch house is $74,369 less than a Split Level of the same size with the same number of bedrooms.
Better Specified House Price Model
Time Trends in Regression
y = β0 + β1x + β2t + ε
β2 is the year-to-year increase not explained by anything else.
log y = β0 + β1 log x + β2t + ε (not log t, just t)
100β2 is (approximately) the year-to-year % increase not explained by anything else.
Time Trend in Multiple Regression
After accounting for income, the price of gasoline, and the price of new cars, per capita gasoline consumption falls by 1.25% per year. I.e., if income and the prices were unchanged, consumption would still fall by 1.25% per year. This is probably the effect of improved fuel efficiency.
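A short sketch of how a time-trend coefficient in a log regression maps to a yearly percentage change. The trend coefficient -0.0125 is an assumption: it is the slide's 1.25% yearly decline read back as 100 × β2, not a number copied from the regression output. The function name is hypothetical.

```python
import math


def trend_pct_per_year(b_t):
    """Exact yearly % change in y implied by coefficient b_t on t in a log-y regression."""
    return 100.0 * (math.exp(b_t) - 1.0)


# Assumed trend coefficient: the 1.25% decline read as 100*b_t.
b_t = -0.0125
print(round(100 * b_t, 3), round(trend_pct_per_year(b_t), 3))

# Cumulative change over 10 years with income and prices held fixed.
print(round(100 * (math.exp(10 * b_t) - 1), 2))
```

For a coefficient this small the approximate and exact yearly figures nearly coincide. The gap grows as the coefficient or the horizon gets larger.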
A Quadratic Income vs. Age Regression
+----------------------------------------------------+
| LHS=HHNINC   Mean                 = .3520836       |
|              Standard deviation   = .1769083       |
| Model size   Parameters           = 3              |
|              Degrees of freedom   = 27323          |
| Residuals    Sum of squares       = 794.9667       |
|              Standard error of e  = .1705730       |
| Fit          R-squared            = .7040754E-01   |
+----------------------------------------------------+
+--------+--------------+-----------+
|Variable| Coefficient  | Mean of X |
+--------+--------------+-----------+
|Constant|  -.39266196  |           |
|AGE     |   .02458140  | 43.5256898|
|AGESQ   |  -.00027237  | 2022.85549|
|EDUC    |   .01994416  | 11.3206310|
+--------+--------------+-----------+
Note the coefficient on Age squared is negative. Age ranges from 25 to 65.
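With a negative coefficient on AGESQ, the fitted income profile rises, peaks, and then declines. The sketch below works out the slope and the peak age from the two reported coefficients; the function names are hypothetical.

```python
B_AGE = 0.02458140     # coefficient on AGE from the regression output
B_AGESQ = -0.00027237  # coefficient on AGESQ from the regression output


def age_slope(age):
    """Marginal effect of one more year of age on HHNINC: b1 + 2*b2*age."""
    return B_AGE + 2.0 * B_AGESQ * age


def peak_age():
    """Age at which the fitted quadratic in age reaches its maximum."""
    return -B_AGE / (2.0 * B_AGESQ)


print(round(peak_age(), 1))
for age in (25, 45, 65):
    print(age, round(age_slope(age), 5))
```

The peak falls inside the 25 to 65 age range of the data, so the downturn is a feature of the fitted model within the sample rather than an extrapolation.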
Implied By The Model
A Better Model?
Log Cost = α + β1 logOutput + β2 [logOutput]^2 + ε
Candidate Models for Cost
The quadratic equation is the appropriate model.
log c = a + b1 log q + b2 (log q)^2 + e
27,326 Household Head Interviews in Germany, 1984 – 1994.
Interaction Term
Education
Age*Education
logIncome = β0 + β1 Educ + β2 Age + β3 Age×Educ + ... + ε
Effect of a year of Educ depends on Age:
dlogIncome/dEduc = β1 + β3 Age
b1 = -0.022385
b3 = 0.0019006
Age = 21, elasticity = 0.017528
Age = 35, elasticity = 0.044136
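The age-dependent effect of education can be evaluated directly from the two reported coefficients. A minimal sketch, with a hypothetical function name:

```python
B_EDUC = -0.022385       # coefficient on Educ (b1 on the slide)
B_AGE_EDUC = 0.0019006   # coefficient on Age*Educ (b3 on the slide)


def educ_effect(age):
    """d logIncome / d Educ at a given age: b1 + b3 * age."""
    return B_EDUC + B_AGE_EDUC * age


for age in (21, 35, 50):
    print(age, round(educ_effect(age), 6))

# Age at which the estimated effect of education changes sign.
print(round(-B_EDUC / B_AGE_EDUC, 1))
```

The negative coefficient on Educ alone is not the effect of education. It is the effect at Age = 0, which lies far outside the data, so the effect should always be read at a relevant age.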
Case Study Using A Regression Model: A Huge Sports Contract
Alex Rodriguez hired by the Texas Rangers for something like $25 million per year in 2000.
Costs – the salary plus and minus some fine tuning of the numbers
Benefits – more fans in the stands.
How to determine if the benefits exceed the costs? Use a regression model.
PDV of the Costs
Using 8% discount rate
Accounting for all costs
Roughly $21M to $28M in each year from 2001 to 2010, then the deferred payments from 2010 to 2020
Total costs: About $165 Million in 2001 (Present discounted value)
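The mechanics of a present discounted value at 8% can be sketched as follows. The payment schedule here is hypothetical: a flat $25 million at the end of each of 10 years, standing in for the actual contract, whose year-by-year and deferred payments are not reproduced in these slides. The end-of-year timing convention and the function name are also assumptions.

```python
def present_value(payments, rate=0.08):
    """Discount a list of year-end payments (year 1, 2, ...) back to today."""
    return sum(p / (1.0 + rate) ** t for t, p in enumerate(payments, start=1))


# Hypothetical schedule: a flat $25M at the end of each of 10 years.
flat_schedule = [25.0] * 10
print(round(present_value(flat_schedule), 1))
```

The result is in millions of dollars. It only illustrates the order of magnitude; the slide's figure of about $165 million rests on the actual schedule and should be taken as the reference number.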
Benefits
More fans in the seats: gate, parking, merchandise
Increased chance at playoffs and World Series
Sponsorships
(Loss to revenue sharing)
Franchise value
How Many New Fans?
Projected 8 more wins per year.
What is the relationship between wins and attendance?
Not known precisely
Many empirical studies (The Journal of Sports Economics)
Use a regression model to find out.
Baseball Data
31 teams, 17 years (fewer years for 6 teams)
Winning percentage: Wins = 162 * percentage
Rank
Average attendance. Attendance = 81 * Average
Average team salary
Number of all stars
Manager years of experience
Percent of team that is rookies
Lineup changes
Mean player experience
Dummy variable for change in manager
Baseball Data (Panel Data – 31 Teams, 17 Years)
A Regression Model
Attendance(team, this year) = α0,team
+ γ Attendance(team, last year)
+ β1 Wins(team, this year)
+ β2 Wins(team, last year)
+ β3 All_Stars(team, this year)
+ ε(team, this year)
A Dynamic Equation
y(this year) = f[y(last year)…]
Fans(t) = b0 + b1 Wins(t) + c Fans(t-1) + ε(t) (Loyalty effect)
Suppose Fans(0) = Fans0 (Start observing in a base year)
Suppose we fix Wins(t) at some Wins* and ε at 0 (no information).
What values does Fans(t) take in a sequence of years?
Fans(1) = b0 + b1 Wins* + c Fans0
Fans(2) = b0 + b1 Wins* + c(b0 + b1 Wins* + c Fans0)
Fans(3) = b0 + b1 Wins* + c(b0 + b1 Wins* + c(b0 + b1 Wins* + c Fans0))
Fans(4) = b0 + b1 Wins* + c(b0 + b1 Wins* + c(b0 + b1 Wins* + c(b0 + b1 Wins* + c Fans0)))
etc.
Collect terms: Fans(t) = b0 (1 + c + c^2 + ... + c^(t-1)) + b1 Wins* (1 + c + c^2 + ... + c^(t-1)) + c^t Fans0
Suppose 0 < c < 1.
Fans finally settles down at
Fans* = b0/(1-c) + [b1/(1-c)] Wins*. dFans*/dWins* = b1/(1-c)
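The convergence argument can be checked numerically by iterating the equation and comparing the result with the long-run formula. The parameter values below are hypothetical, chosen only so that 0 < c < 1; the function names are illustrative.

```python
def simulate_fans(b0, b1, c, wins_star, fans0, years):
    """Iterate Fans(t) = b0 + b1*Wins* + c*Fans(t-1) with the disturbance set to 0."""
    path = [fans0]
    for _ in range(years):
        path.append(b0 + b1 * wins_star + c * path[-1])
    return path


def long_run_fans(b0, b1, c, wins_star):
    """Steady state Fans* = (b0 + b1*Wins*) / (1 - c), valid for 0 < c < 1."""
    return (b0 + b1 * wins_star) / (1.0 - c)


# Hypothetical parameter values chosen only to illustrate convergence.
b0, b1, c = 100.0, 10.0, 0.6
path = simulate_fans(b0, b1, c, wins_star=80, fans0=1000.0, years=40)
print(round(path[-1], 3), round(long_run_fans(b0, b1, c, 80), 3))
```

The starting value Fans0 matters only through the term c^t Fans0, which dies out as t grows. That is why the long-run level depends on Wins* but not on the base year.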
Marginal Value of One More Win
Our model is Fans(t) = α0 + β1 Wins(t) + β2 Wins(t-1) + β3 AllStars + γ Fans(t-1)
Using the formula for the value of Fans*:
Fans* = (α0 + β1 Wins* + β2 Wins* + β3 AllStars) / (1 - γ)
The effect of one more Win every year would be dFans*/dWins* = (β1 + β2) / (1 - γ)
The new player will definitely be an All Star, so we add this effect as well.
The effect of adding an All Star player to the team would be β3 / (1 - γ)
γ = .59414
β1 = 11093.7
β2 = 2201.2
β3 = 14593.5
Effect of 1 more win: (11093.7 + 2201.2) / (1 - .59414) = 32757
Effect of adding an All Star: 14593.5 / (1 - .59414) = 35957
Marginal Value of an A Rod
8 games * 32,757 fans + 1 All Star (= 35,957 fans) = 298,016 new fans
298,016 new fans * ($18 per ticket + $2.50 parking etc. + $1.80 stuff (hats, bobble head dolls,…))
About $6.67 Million per year !!!!! It’s not close.
(Marginal cost is at least $16.5M / year)
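The arithmetic behind the marginal value can be retraced in a few lines. This is a sketch using the coefficient values and per-fan revenue items shown on the slides; the variable names are hypothetical, and the value of γ is the one used in the slide's own division (.59414).

```python
GAMMA = 0.59414       # coefficient on last year's attendance, as used in the slide's arithmetic
B_WINS = 11093.7      # coefficient on this year's wins
B_WINS_LAG = 2201.2   # coefficient on last year's wins
B_ALLSTAR = 14593.5   # coefficient on the number of All Stars

# Long-run effects: divide the short-run coefficients by (1 - gamma).
fans_per_win = (B_WINS + B_WINS_LAG) / (1.0 - GAMMA)
fans_per_allstar = B_ALLSTAR / (1.0 - GAMMA)

# Slide assumptions: 8 extra wins per year plus one All Star.
new_fans = 8 * fans_per_win + fans_per_allstar

# Per-fan revenue items listed on the slide: ticket, parking, merchandise.
revenue_per_fan = 18.00 + 2.50 + 1.80
revenue = new_fans * revenue_per_fan

print(round(fans_per_win), round(fans_per_allstar), round(new_fans))
print(round(revenue / 1e6, 2))
```

Using only the three listed revenue items, the yearly total comes out in the mid-$6 million range, in line with the slide's "about" figure and well short of the marginal cost of at least $16.5 million per year. What else the slide's "etc." covers is not specified, so the computed total is not expected to match the slide to the last digit.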