Post on 18-Jan-2016
1
MGT 511: Hypothesis Testing and Regression
Lecture 8: Framework for Multiple Regression Analysis
K. Sudhir
Yale SOM-EMBA
2
Recall
Simple Regression
  T-test of slope coefficients, R-square
  Forecasts, Prediction and Confidence Intervals
  Transformations for nonlinearity and non-constant variance
Multiple Regression
  Partial Slopes, tradeoff between bias and precision
  ANOVA, F-test
  Dummy Variables and Interaction Variables
  Residual Analysis and Outliers
3
Framework for Multiple Regression
Use theory, knowledge to build the initial model
Residual Analysis and Refinement of model
Perform F-test; if F-test rejects null, perform t-tests
  Possible Reasons for Insignificance of Individual Slope Coefficients
  Refine the model
4
Step 1: Using knowledge, theory to specify initial model
What is the dependent variable? What are the potential predictor variables?
Should you use
  Transformations to accommodate nonlinear effects
  Normalization of the y or x variables (per-capita, constant $ etc)
  Dummy variables
  Interaction variables if slope effects can be different
Collect data, estimate the model
Are the results plausible? For example, how is the prediction at extreme values? If not, refine the model.
5
What should be the Y and X variables?
Y- Sales of personal printers in different sales districts
What are appropriate X variables?
  Knowledge suggests several segments: College students, home users, small businesses, computer network workstations
Appropriate X variables
  College freshmen, household income, small business starts, new network installations
6
Potential X variables: Tradeoffs
Omitting important variables can bias results or reduce explanatory power
Using too many variables can make all variables insignificant
Prioritize the variables, based on what you consider are most important
7
Transformations
Is the relationship nonlinear?
  Sales-Advertising relationship
  Experience Curve effect
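A minimal sketch of how a transformation can capture a nonlinear Sales-Advertising relationship. The data are simulated under an assumed diminishing-returns (logarithmic) response; the coefficients, sample size, and the helper name `r_squared` are illustrative assumptions, not values from the lecture.

```python
import numpy as np

def r_squared(x, y):
    """R-square from a simple regression of y on x (with intercept)."""
    X = np.column_stack([np.ones_like(x), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)

rng = np.random.default_rng(0)
advertising = rng.uniform(1, 100, size=200)
# Assumed (hypothetical) diminishing-returns response of sales to advertising.
sales = 50 + 20 * np.log(advertising) + rng.normal(0, 3, size=200)

r2_raw = r_squared(advertising, sales)          # sales on advertising
r2_log = r_squared(np.log(advertising), sales)  # sales on log(advertising)
print(f"R-square with raw X: {r2_raw:.3f}, with log X: {r2_log:.3f}")
```

Under these assumptions the log-transformed predictor fits better, because the transformed relationship is linear even though the original one is not.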
8
Normalization of the Variables
Normalizing the Y variable: Example
  Y- Unit Sales in different cities (Problem?)
  X- Price and Feature Advertising
  Solution?
Normalizing the X variable: Example
  Y- Total Market Value of Firm
  X- Value of Assets, Number of Employees (Problem?)
  Solution?
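One possible reading of the first example, sketched with simulated data: unit sales scale with city size, so the price effect is hard to see until sales are expressed per capita. The population range, price range, and demand coefficients below are all hypothetical assumptions.

```python
import numpy as np

def slope_and_r2(x, y):
    """Slope and R-square from a simple regression of y on x."""
    X = np.column_stack([np.ones_like(x), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    r2 = 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)
    return beta[1], r2

rng = np.random.default_rng(4)
n = 100
population = rng.uniform(50_000, 5_000_000, size=n)  # hypothetical city sizes
price = rng.uniform(1, 3, size=n)
# Assumed per-capita demand: falls by 0.1 units per $1 of price.
per_capita_demand = 0.5 - 0.1 * price + rng.normal(0, 0.02, size=n)
unit_sales = per_capita_demand * population

slope_units, r2_units = slope_and_r2(price, unit_sales)
slope_pc, r2_pc = slope_and_r2(price, unit_sales / population)
print(f"Unit sales:       R-square = {r2_units:.3f}")
print(f"Per-capita sales: slope = {slope_pc:.3f}, R-square = {r2_pc:.3f}")
```

In this sketch the per-capita regression recovers the assumed price effect, while the unit-sales regression is dominated by differences in city size.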
9
Interaction Effects
Y- Sales; X: Price, Feature
Y- Sales; X: Price, Holiday
Y- Salary; X: Gender, Experience
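A sketch of the Salary example with an interaction variable, assuming a 0/1 dummy for Gender and a product term Gender x Experience. The data and coefficients are simulated and hypothetical; the point is only that the product term lets the Experience slope differ between the two groups.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 300
experience = rng.uniform(0, 20, size=n)
gender = rng.integers(0, 2, size=n).astype(float)  # 0/1 dummy, coding is arbitrary

# Hypothetical salary (in $ thousands): the experience slope is 2.0 for
# group 0 and 2.0 + 0.5 = 2.5 for group 1.
salary = (40 + 2.0 * experience + 3.0 * gender
          + 0.5 * gender * experience + rng.normal(0, 1, size=n))

# Design matrix: intercept, experience, dummy, interaction (dummy * experience).
X = np.column_stack([np.ones(n), experience, gender, gender * experience])
b0, b1, b2, b3 = np.linalg.lstsq(X, salary, rcond=None)[0]

slope_group0 = b1       # experience slope when the dummy is 0
slope_group1 = b1 + b3  # experience slope when the dummy is 1
print(f"Experience slope, group 0: {slope_group0:.2f}; group 1: {slope_group1:.2f}")
```

The dummy alone shifts the intercept; the interaction coefficient `b3` is what measures the difference in slopes.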
10
Plausibility of Results
Will results make sense at extreme values?
  Usually alerts to nonlinearity issues
Examples:
  What will sales be at very high prices, very high advertising?
  What will cost be at high levels of experience?
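A plausibility check of this kind can be sketched as follows. The data are simulated under an assumed constant-elasticity demand curve (a hypothetical choice), and the fitted linear and log-log models are compared at a price far outside the observed range.

```python
import numpy as np

rng = np.random.default_rng(1)
price = rng.uniform(1, 5, size=200)
# Hypothetical constant-elasticity demand with multiplicative noise.
sales = 500 * price ** -1.5 * np.exp(rng.normal(0, 0.1, size=200))

# Linear model: sales = a + b * price
X_lin = np.column_stack([np.ones_like(price), price])
a, b = np.linalg.lstsq(X_lin, sales, rcond=None)[0]

# Log-log model: log(sales) = c + d * log(price)
X_log = np.column_stack([np.ones_like(price), np.log(price)])
c, d = np.linalg.lstsq(X_log, np.log(sales), rcond=None)[0]

extreme_price = 20.0  # far outside the observed range of 1 to 5
linear_pred = a + b * extreme_price
loglog_pred = np.exp(c + d * np.log(extreme_price))
print(f"Linear model prediction at price 20:  {linear_pred:.1f}")
print(f"Log-log model prediction at price 20: {loglog_pred:.1f}")
```

In this sketch the linear model predicts negative sales at the extreme price, which is the kind of implausible result that points to a nonlinearity issue; the log-log model cannot produce a negative prediction.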
11
Step 2: Residual Analysis
Check the residuals; refine model
  Accommodating Nonlinear Effects
  Accounting for non-constant variance
  Accounting for outliers
Keep refining the model, estimate the refined model until the residuals are “satisfactory”
  Remember that residuals will not perfectly follow the “rules” due to randomness; minor deviations will not affect regression results
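A minimal numeric stand-in for a residual plot, assuming simulated data in which the true relationship is curved but a straight line is fitted. Averaging the residuals over the low, middle, and high thirds of X is an illustrative diagnostic chosen here, not a procedure taken from the lecture.

```python
import numpy as np

rng = np.random.default_rng(2)
x = np.sort(rng.uniform(0, 10, size=300))
y = 2 + 0.5 * x ** 2 + rng.normal(0, 1, size=300)  # assumed curved relation

# Fit a straight line even though the true relation is quadratic.
X = np.column_stack([np.ones_like(x), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
residuals = y - X @ beta

# x is sorted, so splitting the residuals in order splits by x range.
low, mid, high = np.array_split(residuals, 3)
print(f"Mean residual, low x:  {low.mean():.2f}")
print(f"Mean residual, mid x:  {mid.mean():.2f}")
print(f"Mean residual, high x: {high.mean():.2f}")
```

A systematic pattern (positive, negative, positive) rather than values scattered around zero suggests the model should be refined, here by accommodating the nonlinear effect.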
12
Step 3: Performing F-tests and t-tests
If estimated equation and residual analysis are OK, conduct F-test for the model as a whole
If we reject the null using the F-test conduct t-tests for individual slopes
Question: What to do if one or more individual slope coefficients are insignificant?
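The two tests can be sketched with plain NumPy as below. The function name `ols_tests` and the simulated data are assumptions for illustration; the statistics use the standard formulas F = (SSR/k) / (SSE/(n-k-1)) and t = coefficient / standard error. P-values are not computed here because they require the F and t distributions.

```python
import numpy as np

def ols_tests(X, y):
    """Return coefficients, standard errors, t statistics, and the overall F statistic."""
    n, k = X.shape
    Xd = np.column_stack([np.ones(n), X])       # add intercept column
    beta, *_ = np.linalg.lstsq(Xd, y, rcond=None)
    resid = y - Xd @ beta
    sse = resid @ resid                          # error sum of squares
    sst = np.sum((y - y.mean()) ** 2)            # total sum of squares
    ssr = sst - sse                              # regression sum of squares
    mse = sse / (n - k - 1)
    f_stat = (ssr / k) / mse                     # tests H0: all slopes are zero
    se = np.sqrt(np.diag(mse * np.linalg.inv(Xd.T @ Xd)))
    t_stats = beta / se                          # tests H0: individual coefficient is zero
    return beta, se, t_stats, f_stat

rng = np.random.default_rng(6)
n = 100
X = rng.normal(size=(n, 2))
# Hypothetical model: x1 has a true effect, x2 has none.
y = 1 + 2 * X[:, 0] + 0 * X[:, 1] + rng.normal(0, 1, size=n)

beta, se, t_stats, f_stat = ols_tests(X, y)
print(f"F statistic: {f_stat:.1f}")
print("t statistics (intercept, x1, x2):", np.round(t_stats, 2))
```

Each statistic would then be compared with the appropriate critical value, the F-test first and the individual t-tests only if the F-test rejects the null.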
13
Possible Reasons for Insignificance of Individual Slope Coefficients
Omitted Variable Bias
Nonlinearity not appropriately taken care of
Multicollinearity
True effect is non-zero, but small
True effect is zero
14
Omitted Variable Bias
One or more relevant predictor variables are missing
  action: add the variables to the model
Example 1
  Y- Sales
  X- Price
  Omitted X variable – Advertising
Example 2
  Y- Salary
  X- Schooling
  Omitted X variable – Job Experience
15
Regression of Salary against Schooling and Experience
Schooling only:

            Coefficients  Standard Error  t Stat    P-value   Lower 95%  Upper 95%
Intercept   47334.97      3526.717        13.42182  1.84E-13  40098.75   54571.19
Schooling   311.0538      226.6091        1.372645  0.181158  -153.909   776.017

Schooling and Experience:

            Coefficients  Standard Error  t Stat    P-value   Lower 95%  Upper 95%
Intercept   -65798        29966.01        -2.19575  0.03723   -127394    -4201.91
Schooling   5793.49       1457.244        3.975647  0.000498  2798.079   8788.901
Experience  1836.442      484.1689        3.792978  0.0008    841.2179   2831.666
Explain this phenomenon
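One way such a pattern can arise is sketched below with simulated data. The key assumption, which is not stated in the slides, is that Schooling and Experience are negatively correlated in the sample (more years in school leave fewer years on the job). All coefficients are hypothetical and are not the lecture's data.

```python
import numpy as np

def fit(X, y):
    """OLS coefficients (intercept first) for y regressed on the columns of X."""
    Xd = np.column_stack([np.ones(len(y)), X])
    return np.linalg.lstsq(Xd, y, rcond=None)[0]

rng = np.random.default_rng(3)
n = 200
schooling = rng.uniform(12, 20, size=n)
# Assumption: each extra year of schooling means about two fewer years of experience.
experience = 48 - 2 * schooling + rng.normal(0, 2, size=n)
salary = (-60000 + 5800 * schooling + 2800 * experience
          + rng.normal(0, 5000, size=n))

short = fit(schooling, salary)                               # Experience omitted
full = fit(np.column_stack([schooling, experience]), salary)  # Experience included

print(f"Schooling slope, Experience omitted:  {short[1]:.0f}")
print(f"Schooling slope, Experience included: {full[1]:.0f}")
```

With Experience omitted, the Schooling slope absorbs the effect of the lower experience that accompanies more schooling (roughly 5800 + 2800 x (-2) under these assumptions), so it looks close to zero. Adding the omitted variable recovers a slope near the assumed 5800.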
16
Nonlinearity not taken care of
The X variable affects the Y variable differently than assumed in the model
  action: use a different transformation
Example: Recall HW Problem
  Y- Yield
  X- Temperature
  Solution: Add Temperature^2
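A sketch of the Temperature^2 fix, using simulated data rather than the homework data. The assumed yield curve peaks in the middle of the temperature range, so a straight line finds almost no effect while the quadratic model fits well. The helper `fit_r2` and all numbers are hypothetical.

```python
import numpy as np

def fit_r2(X, y):
    """OLS coefficients (intercept first) and R-square."""
    Xd = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(Xd, y, rcond=None)
    resid = y - Xd @ beta
    r2 = 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)
    return beta, r2

rng = np.random.default_rng(7)
temperature = rng.uniform(40, 80, size=150)
# Assumed yield curve with a maximum at temperature 60.
yield_ = -100 + 6 * temperature - 0.05 * temperature ** 2 + rng.normal(0, 1, size=150)

beta_lin, r2_lin = fit_r2(temperature, yield_)
beta_quad, r2_quad = fit_r2(
    np.column_stack([temperature, temperature ** 2]), yield_
)
print(f"Linear model:    R-square = {r2_lin:.3f}")
print(f"Quadratic model: R-square = {r2_quad:.3f}, "
      f"Temperature^2 coefficient = {beta_quad[2]:.4f}")
```

In this sketch the insignificance of Temperature in the linear model reflects the wrong functional form, not the absence of an effect.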
17
Multicollinearity
Highly Correlated X variables reduce significance of all variables
  action 1: reformulate the model (e.g. per capita; constant $)
  action 2: obtain more data
  action 3: delete this predictor variable
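The loss of significance comes from inflated standard errors, which can be illustrated with a small simulation. The two designs below use the same true slopes and noise level; only the correlation between the X variables differs. The correlation of 0.99 and the helper name `slope_se` are illustrative assumptions.

```python
import numpy as np

def slope_se(x1, x2, y):
    """Standard errors of the two slope coefficients in y ~ 1 + x1 + x2."""
    n = len(y)
    X = np.column_stack([np.ones(n), x1, x2])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    mse = (resid @ resid) / (n - 3)
    se = np.sqrt(np.diag(mse * np.linalg.inv(X.T @ X)))
    return se[1:]

rng = np.random.default_rng(8)
n = 200
x1 = rng.normal(size=n)
z = rng.normal(size=n)
noise = rng.normal(0, 1, size=n)

x2_independent = z                                   # uncorrelated with x1
x2_collinear = 0.99 * x1 + np.sqrt(1 - 0.99 ** 2) * z  # correlation about 0.99

se_independent = slope_se(x1, x2_independent, 1 + x1 + x2_independent + noise)
se_collinear = slope_se(x1, x2_collinear, 1 + x1 + x2_collinear + noise)
print("Slope standard errors, uncorrelated X:", np.round(se_independent, 3))
print("Slope standard errors, correlated X:  ", np.round(se_collinear, 3))
```

Larger standard errors mean smaller t statistics for the same true effects, which is why both variables can appear insignificant even when the F-test rejects the null.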
18
True Effect is Small or Zero
True effect of X is small, but non-zero
  action 1: obtain more data (or)
  action 2: delete this variable
True effect of X is zero
  action 2: delete this variable
19
Possible Reasons for Insignificance of Individual Slope Coefficients
Omitted Variable Bias
Nonlinearity not appropriately taken care of
Multicollinearity
True effect is non-zero, but small
True effect is zero
20
Summary
For multiple regression to provide valid and meaningful results, it is critical that the proposed model is “well done”
Before we can justify statistical inference (about the model, about slope parameters or for predictions), the plausibility of the estimated equation should be checked and the residuals should be examined
Variables should be transformed to accommodate nonlinear effects for the original variables (e.g. resulting in linear effects for the transformed variables)
There are many possible reasons for the occurrence of insignificant slope coefficients (and it is not easy to distinguish between these reasons)