Regression Analysis

Upload: asad-ali

Post on 22-Jan-2015

DESCRIPTION

In statistics, regression analysis is a statistical process for estimating the relationships among variables. It includes many techniques for modeling and analyzing several variables, when the focus is on the relationship between a dependent variable and one or more independent variables. More specifically, regression analysis helps one understand how the typical value of the dependent variable (or 'Criterion Variable') changes when any one of the independent variables is varied, while the other independent variables are held fixed. Most commonly, regression analysis estimates the conditional expectation of the dependent variable given the independent variables – that is, the average value of the dependent variable when the independent variables are fixed. Less commonly, the focus is on a quantile, or other location parameter of the conditional distribution of the dependent variable given the independent variables. In all cases, the estimation target is a function of the independent variables called the regression function. In regression analysis, it is also of interest to characterize the variation of the dependent variable around the regression function which can be described by a probability distribution.
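As a small illustration of estimating a conditional expectation, the sketch below fits a least-squares straight line to the seven sales and advertising-expenditure pairs that appear in the transcript's estimation slides, then evaluates the fitted regression function at one value of X. The function names `fit_simple_ols` and `predict` and the query value 5.0 are assumptions made for this example; they are not part of the original slides.

```python
# Sales (thousands of units) and advertising expenditure (million Rs.),
# read from the transcript's estimation table.
sales = [37, 48, 45, 36, 25, 55, 63]
adver = [4.5, 6.5, 3.5, 3.0, 2.5, 8.5, 7.5]


def fit_simple_ols(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b*x.

    Hypothetical helper name, introduced only for this sketch.
    """
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sums of products and squares of deviations from the means.
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return intercept, slope


intercept, slope = fit_simple_ols(adver, sales)


def predict(x_value):
    """Estimated conditional mean of sales at a given advertising level."""
    return intercept + slope * x_value


print(round(intercept, 3), round(slope, 3))  # prints: 19.882 4.717
# Illustrative query value (an assumption, not from the slides).
print(predict(5.0))
```

The fitted line is the estimated regression function: for any advertising level it returns the estimated average sales, which is what "conditional expectation of the dependent variable given the independent variables" means in the simple linear case.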

TRANSCRIPT

  • 1. Regression Analysis. Regression analysis is a mathematical measure of the average relationship between two or more variables, expressed in the original units of the data. Types of regression: (i) simple regression (two variables at a time); (ii) multiple regression (more than two variables at a time). Linear regression: if the regression curve is a straight line, the regression between the variables is linear. Non-linear (curvilinear) regression: if the regression curve is not a straight line, the regression between the variables is non-linear.

  • 2. Simple Linear Regression Model & its Estimation. A simple linear regression model is based on a single independent variable, and its general form is

    $$Y_t = \alpha + \beta X_t + \varepsilon_t$$

    Here $\alpha$ = intercept; $\beta$ = slope (regression coefficient); $Y_t$ = dependent variable or regressand; $X_t$ = independent variable or regressor; $\varepsilon_t$ = random error or disturbance term.

    Importance of the error term $\varepsilon_t$: (i) It captures the effect on the dependent variable of all variables not included in the model. (ii) It captures any specification error related to the assumed linear functional form. (iii) It captures the effects of unpredictable random components present in the dependent variable.

  • 3. Estimation of the Model. $Y_t$ = sales (thousands of units); $X_t$ = advertising expenditure (million Rs.); $y_t = Y_t - \bar{Y}$; $x_t = X_t - \bar{X}$.

    Y_t    X_t    y_t        x_t        x_t y_t    x_t^2
    37     4.5    -7.14286   -0.64286    4.591837   0.413265
    48     6.5     3.857143   1.357143   5.234694   1.841837
    45     3.5     0.857143  -1.64286   -1.40816    2.69898
    36     3      -8.14286   -2.14286   17.44898    4.591837
    25     2.5    -19.1429   -2.64286   50.59184    6.984694
    55     8.5    10.85714    3.357143  36.44898   11.27041
    63     7.5    18.85714    2.357143  44.44898    5.556122
    Total: 309    36                    157.357    33.357

    $\bar{Y} = 309/7 = 44.1428$, $\bar{X} = 36/7 = 5.1428$, $\sum x_t y_t = 157.357$, $\sum x_t^2 = 33.357$.

  • 4. Estimation of the Model.

    $$\hat{\beta} = \frac{\sum x_t y_t}{\sum x_t^2} = \frac{157.357}{33.357} = 4.717$$

    $$\hat{\alpha} = \bar{Y} - \hat{\beta}\bar{X} = 44.143 - (4.717)(5.143) = 19.882$$

    Then the estimated simple linear regression model is $\hat{Y}_t = 19.882 + 4.717 X_t$.

  • 5. $\hat{\beta} = 157.357 / 33.357 = 4.717$, $\hat{\alpha} = 44.143 - (4.717)(5.143) = 19.882$, $\hat{Y}_t = 19.882 + 4.717 X_t$.

  • 6. General Formula for First Order Coefficients

    $$r_{YX.W} = \frac{r_{XY} - r_{XW}\, r_{YW}}{\sqrt{(1 - r_{XW}^2)(1 - r_{YW}^2)}}$$

    General Formula for Second Order Coefficients

    $$r_{YX.WO} = \frac{r_{XY.O} - r_{XW.O}\, r_{YW.O}}{\sqrt{(1 - r_{XW.O}^2)(1 - r_{YW.O}^2)}}$$

  • 7. Partial Correlation. Remarks: 1. Partial correlation coefficients lie between -1 and 1. 2. Partial correlation coefficients are calculated on the basis of zero-order coefficients, that is, simple correlations in which no variable is held constant. Limitations: 1. The calculation of partial correlation coefficients presumes a linear relation between the variables. In real situations this condition is not always met. 2. The reliability of partial correlation coefficients decreases as their order goes up: second-order partial coefficients are not as dependable as first-order ones. It is therefore necessary that the number of items in the gross correlation be large. 3. It involves a lot of calculation work and its analysis is not easy.

  • 8. Partial Correlation. Example: from the following data calculate $r_{12.3}$.

    x1:  4  0  1  1  1  3
    x2:  2  0  2  4  2  3
    x3:  1  4  2  2  3  0

    Solution: X116 2 2, X216 2 2 and X316 2 2413040

  • 9. Partial Correlation

  • 10. Multiple Correlation. The fluctuations in a given series usually do not depend on a single factor or cause. For example, wheat yield depends not only on rain but also on the fertilizer used, sunshine, etc. The association between such a series and the several variables causing these fluctuations is known as multiple correlation. It is also defined as the correlation between several variables. Coefficient of multiple correlation: let there be three variables $X_1$, $X_2$ and $X_3$, with $X_1$ the dependent variable, depending on the independent variables $X_2$ and $X_3$. The multiple correlation coefficients are defined as follows:
    $R_{1.23}$ = multiple correlation with $X_1$ as dependent variable and $X_2$ and $X_3$ as independent variables;
    $R_{2.13}$ = multiple correlation with $X_2$ as dependent variable and $X_1$ and $X_3$ as independent variables;
    $R_{3.12}$ = multiple correlation with $X_3$ as dependent variable and $X_1$ and $X_2$ as independent variables.

  • 11. Calculation of Multiple Correlation Coefficient. General Formula. For example

  • 12. Remarks. The multiple correlation coefficient is a non-negative coefficient. Its value ranges between 0 and 1; it cannot assume a negative value. If $R_{1.23} = 0$, then $r_{12} = 0$ and $r_{13} = 0$. $R_{1.23} \ge r_{12}$ and $R_{1.23} \ge r_{13}$. $R_{1.23}$ is the same as $R_{1.32}$. $(R_{1.23})^2$ = coefficient of multiple determination. If there are three independent variables and one dependent variable, the formula for the multiple correlation is

    $$R_{1.234} = \sqrt{1 - (1 - r_{14}^2)(1 - r_{13.4}^2)(1 - r_{12.34}^2)}$$

  • 13. Limitation

  • 14. Advantages of Multiple Correlation

  • 15. Example. Given the following data:

    X1:   3   5   6   8  12  14
    X2:  16  10   7   4   3   2
    X3:  90  72  54  42  30  12

    Compute the coefficient of correlation of $X_3$ on $X_1$ and $X_2$.

  • 16. Example

  • 17. Example

  • 18. Types of Correlation. $r_{12.3}$ is the correlation between variables 1 and 2 with variable 3 removed from both variables. To illustrate this, run separate regressions using $X_3$ as the independent variable and $X_1$ and $X_2$ as dependent variables. Next, compute residuals for regression...X
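The partial and multiple correlation definitions in the transcript can be checked numerically. The sketch below computes a first-order partial correlation in two ways: from the zero-order coefficients (the first-order formula of slide 6) and from the residuals of two regressions on the controlled variable (the procedure slide 18 starts to describe). It then computes a three-variable multiple correlation. Several things here are assumptions: all function names are hypothetical; the slide 8 series are used exactly as they appear in the transcript and may be incomplete, so the value need not match the slide's own answer; and the multiple correlation uses the standard textbook formula $R_{1.23}^2 = (r_{12}^2 + r_{13}^2 - 2 r_{12} r_{13} r_{23}) / (1 - r_{23}^2)$, which the transcript itself does not show.

```python
from math import sqrt


def pearson(a, b):
    """Zero-order (simple) correlation coefficient between two series."""
    n = len(a)
    a_bar, b_bar = sum(a) / n, sum(b) / n
    s_ab = sum((x - a_bar) * (y - b_bar) for x, y in zip(a, b))
    s_aa = sum((x - a_bar) ** 2 for x in a)
    s_bb = sum((y - b_bar) ** 2 for y in b)
    return s_ab / sqrt(s_aa * s_bb)


def partial_corr(y, x, w):
    """First-order partial correlation r_{YX.W} from zero-order coefficients."""
    r_xy, r_xw, r_yw = pearson(x, y), pearson(x, w), pearson(y, w)
    return (r_xy - r_xw * r_yw) / sqrt((1 - r_xw ** 2) * (1 - r_yw ** 2))


def residuals(dep, indep):
    """Residuals from the least-squares regression of dep on indep."""
    n = len(dep)
    d_bar, i_bar = sum(dep) / n, sum(indep) / n
    slope = (sum((i - i_bar) * (d - d_bar) for i, d in zip(indep, dep))
             / sum((i - i_bar) ** 2 for i in indep))
    intercept = d_bar - slope * i_bar
    return [d - (intercept + slope * i) for d, i in zip(dep, indep)]


def partial_corr_by_residuals(y, x, w):
    """Same quantity: correlate what is left of y and x after removing w."""
    return pearson(residuals(y, w), residuals(x, w))


def multiple_corr(dep, a, b):
    """Multiple correlation of dep on a and b (standard formula, assumed)."""
    r_da, r_db, r_ab = pearson(dep, a), pearson(dep, b), pearson(a, b)
    r_squared = ((r_da ** 2 + r_db ** 2 - 2 * r_da * r_db * r_ab)
                 / (1 - r_ab ** 2))
    return sqrt(r_squared)


# Slide 8 series, as extracted (possibly truncated in the transcript).
x1 = [4, 0, 1, 1, 1, 3]
x2 = [2, 0, 2, 4, 2, 3]
x3 = [1, 4, 2, 2, 3, 0]
r12_3 = partial_corr(x1, x2, x3)
r12_3_resid = partial_corr_by_residuals(x1, x2, x3)

# Slide 15 series.
z1 = [3, 5, 6, 8, 12, 14]
z2 = [16, 10, 7, 4, 3, 2]
z3 = [90, 72, 54, 42, 30, 12]
R3_12 = multiple_corr(z3, z1, z2)

print("r12.3 (formula):  ", r12_3)
print("r12.3 (residuals):", r12_3_resid)
print("R3.12:            ", R3_12)
```

The two partial-correlation routes agree up to floating-point error, which is the sense in which $r_{12.3}$ is the correlation between variables 1 and 2 "with variable 3 removed from both". The multiple correlation stays between 0 and 1 and is at least as large in magnitude as either simple correlation with the dependent variable, in line with the remarks on slide 12.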