
  • Chapter 8: Regression Models for Quantitative and Qualitative Predictors
    Ayona Chatterjee, Spring 2008, Math 4813/5813

  • Polynomial Regression Models
    Polynomial models are used when the true curvilinear response function is indeed a polynomial, or when the true curvilinear response function is unknown but a polynomial is a good approximation to it.

  • One Predictor Variable, Second Order
    Let us consider a polynomial model with one variable raised to the first and second powers:

    Y_i = β₀ + β₁x_i + β₁₁x_i² + ε_i,  where x_i = X_i − X̄ is the centered predictor.

    This polynomial is called a second-order model with one predictor.

  • One Predictor Variable, Second Order
    Note that the second-order regression equation in one variable represents a parabola. Here β₁ is called the linear effect coefficient and β₁₁ is called the quadratic effect coefficient.
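    A minimal sketch of fitting this second-order model, using made-up data and the centering described above (all data values and variable names are illustrative):

```python
import numpy as np
import statsmodels.api as sm

# Minimal sketch: fit Y_i = b0 + b1*x_i + b11*x_i^2 + e_i on made-up data,
# with the predictor centered as x_i = X_i - Xbar.
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, 40)
Y = 5 + 2 * X - 0.3 * X**2 + rng.normal(0, 1, 40)   # hypothetical parabolic data

x = X - X.mean()                                     # centering reduces corr(x, x^2)
design = sm.add_constant(np.column_stack([x, x**2]))
fit = sm.OLS(Y, design).fit()
print(fit.params)   # b0, linear effect b1, quadratic effect b11
```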

  • One Predictor Variable, Third Order
    The third-order model with one predictor variable is given as

    Y_i = β₀ + β₁x_i + β₁₁x_i² + β₁₁₁x_i³ + ε_i.

  • Two Predictor Variables, Second Order
    The regression model

    Y_i = β₀ + β₁x_i1 + β₂x_i2 + β₁₁x_i1² + β₂₂x_i2² + β₁₂x_i1x_i2 + ε_i

    is a second-order model with two predictor variables. The equation represents a conic section.

  • Example of a Quadratic Response Surface

  • Hierarchical Approach to Fitting
    The norm is to fit a second-order or third-order polynomial and explore whether a lower-order model is adequate. For example, if we have a third-order model in one variable, we may want to test whether β₁₁₁ = 0, or whether both β₁₁ and β₁₁₁ equal zero. We use extra sums of squares to do the tests.

  • Extra Sums of Squares
    To test whether β₁₁₁ = 0 we would use SSR(x³ | x, x²). To test whether both β₁₁ and β₁₁₁ equal zero we would use SSR(x², x³ | x).
    Note that SSR(x², x³ | x) = SSR(x² | x) + SSR(x³ | x, x²).
    If a polynomial term of a given order is retained, then all related terms of lower order are also retained. A sketch of the decomposition follows.
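    A sketch of this decomposition using sequential fits on made-up cubic data; the extra sum of squares for a term is the drop in SSE when that term enters the model:

```python
import numpy as np
import statsmodels.api as sm

# Sketch of the extra-sums-of-squares identity for a cubic model:
# SSR(x^2, x^3 | x) = SSR(x^2 | x) + SSR(x^3 | x, x^2), on hypothetical data.
rng = np.random.default_rng(1)
X = rng.uniform(-2, 2, 50)
Y = 1 + X + 0.5 * X**2 - 0.2 * X**3 + rng.normal(0, 0.5, 50)
x = X - X.mean()

def sse(*cols):
    """SSE of the OLS fit of Y on an intercept plus the given columns."""
    return sm.OLS(Y, sm.add_constant(np.column_stack(cols))).fit().ssr

# Extra SS = drop in SSE when terms are added.
ssr_x2_given_x = sse(x) - sse(x, x**2)
ssr_x3_given_x_x2 = sse(x, x**2) - sse(x, x**2, x**3)
ssr_both_given_x = sse(x) - sse(x, x**2, x**3)
print(np.isclose(ssr_both_given_x, ssr_x2_given_x + ssr_x3_given_x_x2))  # True
```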

  • Regression Function in Terms of X
    To revert to the original scale and undo the centering of the predictor variable, substitute x_i = X_i − X̄ into the fitted function and collect terms. For the second-order model this gives the transformations

    b₀′ = b₀ − b₁X̄ + b₁₁X̄²,  b₁′ = b₁ − 2b₁₁X̄,  b₁₁′ = b₁₁.
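    A quick numerical check of the back-transformation on made-up data; the fitted values agree in either scale because the two forms are the same model, merely reparameterized:

```python
import numpy as np
import statsmodels.api as sm

# Sketch: un-center a fitted quadratic.  If yhat = b0 + b1*x + b11*x^2 with
# x = X - Xbar, then in terms of X the coefficients are
#   b0' = b0 - b1*Xbar + b11*Xbar**2,  b1' = b1 - 2*b11*Xbar,  b11' = b11.
rng = np.random.default_rng(2)
X = rng.uniform(0, 10, 30)
Y = 3 - X + 0.4 * X**2 + rng.normal(0, 1, 30)        # made-up data
Xbar = X.mean()
x = X - Xbar

b0, b1, b11 = sm.OLS(Y, sm.add_constant(np.column_stack([x, x**2]))).fit().params
b0p = b0 - b1 * Xbar + b11 * Xbar**2
b1p = b1 - 2 * b11 * Xbar
# Fitted values agree in either scale (same model, different parameterization).
print(np.allclose(b0 + b1 * x + b11 * x**2, b0p + b1p * X + b11 * X**2))  # True
```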

  • Example
    A researcher studied the effects of charge rate and temperature on the life of a new type of power cell in a small-scale experiment. The charge rate (X₁) was controlled at three levels, and so was the ambient temperature (X₂). The life of the power cell was the response (Y). The researcher decided to fit a second-order polynomial regression model.

  • Data Set - Power Cells Example
    Scale the units and fit the second-order polynomial regression model. Obtain the correlation between the new variable x and the original X. Has the transformation reduced the collinearity? A quick check is sketched below.
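    A sketch of the collinearity check: the correlation of a predictor with its square, before and after centering. The levels 0.6, 1.0, 1.4 mimic three equally spaced charge rates; the replication pattern here is hypothetical.

```python
import numpy as np

# Correlation of a predictor with its square, before and after centering.
X1 = np.array([0.6, 1.0, 1.4] * 3)     # hypothetical equally spaced levels
x1 = X1 - X1.mean()

print(np.corrcoef(X1, X1**2)[0, 1])    # near 1: severe collinearity
print(np.corrcoef(x1, x1**2)[0, 1])    # 0 for symmetric centered levels
```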

  • Test of Fit
    An F test for the goodness of fit of the model to the data. Define SSPE (pure error) from the replicated observations and SSLF = SSE − SSPE; then

    F* = [SSLF/(c − p)] / [SSPE/(n − c)].

    If F* exceeds the tabled F value, the model is not a good fit. A worked computation is sketched below; the power-cell numbers are worked in the notes at the end.
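```python
import numpy as np
import statsmodels.api as sm
from scipy import stats

# Minimal lack-of-fit sketch with made-up replicated data.
# SSPE pools squared deviations around each replicate-group mean;
# SSLF = SSE - SSPE;  F* = (SSLF/(c-p)) / (SSPE/(n-c)).
X = np.array([1., 1., 1., 2., 2., 3., 3., 4., 4., 4.])
Y = np.array([2.1, 1.9, 2.3, 3.8, 4.1, 6.2, 5.9, 8.2, 7.9, 8.1])

fit = sm.OLS(Y, sm.add_constant(X)).fit()        # the model being checked
sse = fit.ssr
groups = [Y[X == v] for v in np.unique(X)]
sspe = sum(((g - g.mean()) ** 2).sum() for g in groups)
sslf = sse - sspe
n, p, c = len(Y), 2, len(groups)
F_star = (sslf / (c - p)) / (sspe / (n - c))
print(F_star, stats.f.ppf(0.95, c - p, n - c))   # reject fit if F* exceeds this
```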

  • Partial F Test
    Suppose for the given data you want to test whether a first-order model is sufficient. Here H₀: β₁₁ = β₂₂ = β₁₂ = 0, and the F statistic is

    F* = [SSR(x₁², x₂², x₁x₂ | x₁, x₂)/3] / MSE(full).

    A nested-model comparison that computes this test is sketched below.
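    A sketch of the same partial F test via nested models in statsmodels, on made-up data; anova_lm reports the F statistic for dropping all second-order terms at once:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.stats.anova import anova_lm

# Sketch of the partial F test H0: b11 = b22 = b12 = 0 (first-order adequate),
# comparing nested models on hypothetical centered data.
rng = np.random.default_rng(3)
df = pd.DataFrame({"x1": rng.normal(size=60), "x2": rng.normal(size=60)})
df["y"] = 4 - 2 * df.x1 + 3 * df.x2 + rng.normal(size=60)   # made-up response

reduced = smf.ols("y ~ x1 + x2", data=df).fit()
full = smf.ols("y ~ x1 + x2 + I(x1**2) + I(x2**2) + x1:x2", data=df).fit()
print(anova_lm(reduced, full))   # F and p-value for dropping all 2nd-order terms
```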

  • Interaction Regression Models
    A regression model with p − 1 variables contains additive effects if the response function can be written as
    E{Y} = f₁(X₁) + f₂(X₂) + … + f_{p−1}(X_{p−1}).
    Note that the functions need not be simple. If a response function cannot be written in this form, the model is not additive and interaction terms are present.

  • Interpretation of Interaction Regression Models
    In the presence of an interaction term, the regression coefficients cannot be interpreted as before. For a first-order model with an interaction term, the change in the mean response for a unit increase in X₁ with X₂ held constant is β₁ + β₃X₂, not just β₁. A small numeric illustration follows.
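    A tiny numeric illustration with made-up coefficients:

```python
# Sketch: with an interaction term, the slope in X1 depends on X2.
# E{Y} = b0 + b1*X1 + b2*X2 + b3*X1*X2, so dE{Y}/dX1 = b1 + b3*X2.
b0, b1, b2, b3 = 10.0, 2.0, 5.0, 0.5          # made-up coefficients

def slope_in_x1(x2):
    """Change in the mean response per unit increase in X1, at a fixed X2."""
    return b1 + b3 * x2

print(slope_in_x1(1), slope_in_x1(3))          # 2.5 vs 3.5: not constant
```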

  • Reinforcement Effects
    When β₁ and β₂ are positive, we say the interaction effect between the two quantitative variables is of a reinforcement or synergistic type when the slope of the response function against one of the predictor variables increases at higher levels of the other predictor variable, that is, when β₃ is positive.

  • Interference Effects
    When β₁ and β₂ are positive, we say the interaction effect between the two quantitative variables is of an interference or antagonistic type when the slope of the response function against one of the predictor variables decreases at higher levels of the other predictor variable, that is, when β₃ is negative.

  • Implementing an Interaction Model
    There are two points to keep in mind:
    High multicollinearity may exist between some predictors; centering the variables may help reduce this problem.
    If there is a large number of predictors, there is a large choice of possible interaction terms. Choose only the terms you think will influence the response.

  • Qualitative Predictors
    Example: Y is the speed at which an insurance innovation is adopted, X₁ is the size of the firm, and another predictor variable identifies the type of firm. Here let the firm types be stock or mutual company. Thus we can define the indicator X₂ = 1 if stock company and X₂ = 0 otherwise.

  • Principle
    A qualitative variable with c classes will be represented by c − 1 indicator variables, each taking on the values 0 and 1. We modify the previous example as

    Y_i = β₀ + β₁X_i1 + β₂X_i2 + ε_i.

  • Qualitative Predictor with More than Two Classes
    Suppose we regress tool wear (Y) on tool speed (X₁) and tool model, a qualitative variable with four possible models: M1, M2, M3, and M4. We need three indicator variables, say X₂, X₃, X₄, with M4 as the baseline class; see the sketch below.
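    A sketch of building the three indicator columns with pandas, on hypothetical data; dropping the M4 column makes M4 the baseline class:

```python
import pandas as pd

# Sketch: a 4-class qualitative predictor (tool model) becomes 3 indicator
# columns; the dropped class (here M4) is the baseline.  Hypothetical data.
df = pd.DataFrame({"speed": [100, 120, 140, 110],
                   "model": ["M1", "M2", "M3", "M4"]})
dummies = pd.get_dummies(df["model"], prefix="X")
design = pd.concat([df["speed"], dummies.drop(columns="X_M4")], axis=1)
print(design)   # speed, X_M1, X_M2, X_M3; M4 rows are all zeros
```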

  • Indicator Variables versus Allocated Codes
    An alternative to using indicator variables is to use allocated codes. Consider, for instance, the predictor variable "frequency of product use", which has three classes:
    Frequent user: 3
    Occasional user: 2
    Nonuser: 1
    Here Y_i = β₀ + β₁X_i1 + ε_i. This coding implies that the mean response changes by the same amount when going from nonuser to occasional user as when going from occasional user to frequent user.

  • Why Indicator Variables?
    Indicator variables make no assumptions about the spacing of the classes; they rely on the data to show the differential effects.
    Alternative model: Y_i = β₀ + β₁X_i1 + β₂X_i2 + ε_i, where X₁ = 1 for a frequent user, X₂ = 1 for an occasional user, and both are zero in all other cases. The contrast with allocated codes is sketched below.
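    A small sketch contrasting the class means implied by the two codings, with made-up coefficients:

```python
# Allocated codes (3/2/1) force equal spacing of the class means:
b0, b1 = 1.0, 2.0                # made-up coefficients
means_allocated = {c: b0 + b1 * code
                   for c, code in [("nonuser", 1), ("occasional", 2), ("frequent", 3)]}
print(means_allocated)           # both gaps are exactly b1

# Indicator coding lets the data set each gap separately:
g0, g1, g2 = 3.0, 4.5, 1.2       # baseline, frequent effect, occasional effect
means_indicator = {"nonuser": g0, "occasional": g0 + g2, "frequent": g0 + g1}
print(means_indicator)           # unequal gaps are now possible
```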

  • Quantitative to Qualitative
    Sometimes we may convert quantitative data to qualitative data; for example, ages can be grouped and indicator variables used to denote the age groups. An alternative coding is to use 1 and −1 for the two levels of a qualitative factor.

  • Comparison of Two or More Regression Functions - Example
    We can compare regression functions using hypothesis testing to see whether two functions represent the same response function or not. Examples follow.

  • Comparison of Two or More Regression Functions - Example
    A company operates two production lines for making soap bars. For each line, the relation between the speed of the line and the amount of scrap for the day was studied. A scatter plot of the data for the two production lines suggests that the regression relation between production line speed and amount of scrap is linear but not the same for the two lines. The slopes appear to be the same, but the heights of the regression lines differ. A formal test is desired to determine whether the two regression lines are identical.

  • Soap Production Line - Example
    First fit separate regression models for the two production lines. Next combine all the data and, using an indicator variable, fit a first-order regression model with interaction. Identity of the regression functions for the two production lines is tested by considering the alternatives H₀: β₂ = β₃ = 0 (identical lines) and H₀: β₃ = 0 (equal slopes). A sketch of the test follows.
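    A sketch of the identity test on made-up data shaped like the soap example; the coefficients used to generate the data below are hypothetical, chosen to roughly mimic the fitted lines reported in the notes:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.stats.anova import anova_lm

# Sketch of the identity test for two regression lines: code line membership
# as an indicator X2 and fit Y = b0 + b1*X1 + b2*X2 + b3*X1*X2;
# H0: b2 = b3 = 0 means a single line fits both.
rng = np.random.default_rng(4)
speed = rng.uniform(100, 300, 40)
line = np.repeat([0, 1], 20)
scrap = 90 * line + 1.2 * speed + rng.normal(0, 10, 40)   # made-up relation
df = pd.DataFrame({"y": scrap, "x1": speed, "x2": line})

one_line = smf.ols("y ~ x1", data=df).fit()
two_lines = smf.ols("y ~ x1 + x2 + x1:x2", data=df).fit()
print(anova_lm(one_line, two_lines))   # F test of H0: b2 = b3 = 0
```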

    Notes. When the true nature of the response function is unknown, a polynomial is often used as an approximation. A main danger of polynomial functions is that extrapolation beyond the data can be very wrong.

    In the second-order model a single predictor appears in the first and second powers. The predictor variable is centered, x_i = X_i − X̄, to reduce the correlation between X and X². Recall that multicollinearity causes problems in computing the inverse of X′X; centering the variables reduces multicollinearity. The model is Y_i = β₀ + β₁x_i + β₁₁x_i² + ε_i. This is a parabolic regression function: the β coefficients change the slope and hence the shape of the function, and β₀ is the mean response of Y when x = 0, i.e., when X = X̄.

    Polynomial models with predictors raised higher than the third power should be employed with caution: the coefficients are harder to interpret, and so are extrapolation and interpolation. A fitted polynomial of order n − 1 will pass through all n observed values, so do not build a model just to fit the data; we want a model that explains the relation between X and Y.

    The two-predictor second-order model contains separate linear and quadratic terms along with a cross-product term that represents the interaction between x₁ and x₂; β₁₂ is often called the interaction effect coefficient. Various combinations of the two predictor variables can yield the same response. Similarly, the second-order model with three predictor variables is Y_i = β₀ + β₁x_i1 + β₂x_i2 + β₃x_i3 + β₁₁x_i1² + β₂₂x_i2² + β₃₃x_i3² + β₁₂x_i1x_i2 + β₁₃x_i1x_i3 + β₂₃x_i2x_i3 + ε_i, where all predictors are centered. Recall from Chapter 6 that polynomial models are special cases of the general linear model.

    Third-order model in one variable: Y_i = β₀ + β₁x_i + β₁₁x_i² + β₁₁₁x_i³ + ε_i (page 299). One would not drop the squared term while retaining the cubic term: since the quadratic is of lower order, we assume it provides more basic information about the shape of the response function, while the cubic term provides refinements. The fitted values and residuals for the models in x and in X are the same.

    Power cells example: Y_i = β₀ + β₁x_i1 + β₂x_i2 + β₁₁x_i1² + β₂₂x_i2² + β₁₂x_i1x_i2 + ε_i, with the transformations x_i1 = (X_i1 − X̄₁)/0.4 = (X_i1 − 1)/0.4 and x_i2 = (X_i2 − 20)/10. The fitted regression equation is ŷ = 162.84 − 55.83x₁ + 75.50x₂ + 27.39x₁² − 10.61x₂² + 11.50x₁x₂. The correlation between x₁ and x₁² is 0, whereas between X₁ and X₁² it is 0.99, so the centering gives a huge reduction in collinearity.

    Here at x₁ = x₂ = 0 we have 3 replications, so SSPE = (157 − 157.33)² + (131 − 157.33)² + (184 − 157.33)² = 1404.67 and SSLF = 5240.44 − 1404.67 = 3835.77. With n = 11, p = 6, c = 9, F* = 1.82 while F(0.95, 3, 2) = 19.2. We do not reject H₀, so the second-order polynomial is a good fit.

    SSR(x₁) = 18704 and SSR(x₂ | x₁) = 34202. Thus SSR(x₁², x₂², x₁x₂ | x₁, x₂) = SSR(x₁² | x₁, x₂) + SSR(x₂² | x₁, x₂, x₁²) + SSR(x₁x₂ | x₁, x₂, x₁², x₂²) = 1646 + 284.9 + 529 = 2459.9. Then F* = (2459.9/3)/1048.1 = 0.78 while F(0.95, 3, 5) = 5.41, so we do not reject H₀ and a first-order model is adequate. The fitted first-order model is ŷ = 172.0 − 55.83x₁ + 75.50x₂.

    Additive example: E{Y} = β₀ + β₁X₁ + β₁₁X₁² + β₂X₂, where the first three terms form f₁(X₁) and the last is f₂(X₂). Example of a non-additive function: E{Y} = β₀ + β₁X₁ + β₂X₂ + β₃X₁X₂. Here the model is Y_i = β₀ + β₁x_i1 + β₂x_i2 + β₃x_i1x_i2 + ε_i. Similarly, for a unit change in X₂ the change in the mean response is β₂ + β₃X₁.

    If there is no interaction term, the response functions at different levels of X₂ are parallel lines. For E{Y} = 10 + 2X₁ + 5X₂, setting X₂ = 1 and X₂ = 3 gives two parallel lines. For E{Y} = 10 + 2X₁ + 5X₂ + 0.5X₁X₂, plot the lines for X₂ = 1 and X₂ = 3: they do not intersect in the region of interest. For E{Y} = 10 + 2X₁ + 5X₂ − 0.5X₁X₂, the lines for X₂ = 1 and X₂ = 3 intersect. When both β₁ and β₂ are negative, a positive β₃ gives an interference effect while a negative β₃ gives a reinforcement effect.

    Example: with 8 predictors there are 28 = (8 choose 2) possible two-way interaction terms. Remember we can always test whether an interaction term can be dropped from the model.

    Example: blood pressure regressed on age and sex, where age is quantitative and sex is qualitative. A first-order model with one indicator per class, Y_i = β₀ + β₁X_i1 + β₂X_i2 + β₃X_i3 + ε_i (see page 314 for the X matrix), fails because X′X has no inverse: the columns of X are linearly dependent. The remedy is to drop X₃. Indicator variables are also called dummy variables or binary variables.

    Tool example: the response functions are E{Y} = β₀ + β₁X₁ for model M4, E{Y} = (β₀ + β₂) + β₁X₁ for M1, and E{Y} = (β₀ + β₃) + β₁X₁ for M2. So we have linear models with the same slope for all four cases. Thus β₂, β₃, and β₄ measure the differential effects of the qualitative variable classes on the height of the response function at any given level of X₁, always compared with the baseline class for which X₂ = X₃ = X₄ = 0.

    For the allocated codes, a frequent user has E{Y} = β₀ + 3β₁ and an occasional user has E{Y} = β₀ + 2β₁, so E{Y | frequent} − E{Y | occasional} = β₁ = E{Y | occasional} − E{Y | nonuser}. This may not be true: the effect of going from nonuser to occasional user may differ from the effect of going from occasional to frequent user. With indicator variables, β₁ measures the difference between frequent users and nonusers, and β₂ the difference between occasional users and nonusers.

    Soap example: the combined fit is ŷ = 7.57 + 1.322X₁ + 90.39X₂ − 0.1767X₁X₂. Production line 1: ŷ = 97.965 + 1.145X₁. Production line 2: ŷ = 7.574 + 1.322X₁. For the identity test, F* = [SSR(X₂, X₁X₂ | X₁)/2] / [SSE(X₁, X₂, X₁X₂)/(n − 4)] = 22.65 and F(0.99, 2, 23) = 5.67, so we reject H₀: the two production lines are not identical.