Basics of Regression Analysis
Presented By Mahak Vijay
01/05/2023
Outline
• What is Regression Analysis?
• Population Regression Line
• Why do we use Regression Analysis?
• What are the types of Regression?
• Simple Linear Regression Model
• Least Squares Estimation for Parameters
• Least Squares for Linear Regression
• References
What is Regression Analysis?
Regression analysis is a predictive modelling technique that investigates the relationship between a dependent (target) variable and one or more independent (predictor) variables.
It is used for forecasting, time series modelling, and studying cause-and-effect relationships between variables.
For example, the relationship between rash driving and the number of road accidents a driver has can be studied through regression.
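Example
[Figure: scatter plot of the dependent variable (y) against the independent variable (x), showing the population regression line, the actual values, the estimated values, and the errors.]
[Figure: estimated grades plotted against study time.]
Population regression function: $\hat{y} = \alpha + \beta x$
where $\hat{y}$ = estimated grades, $x$ = study time, $\alpha$ = intercept, and $\beta$ = slope.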
Regression Line
[Figure: population regression line, with $\alpha$ = intercept and $\beta$ = slope.]
Why do we need Regression Analysis?
Typically, a regression analysis is used for these purposes:
(1) Prediction of the target variable (forecasting).
(2) Modelling the relationships between the dependent variable and the explanatory variables.
(3) Testing of hypotheses.
Benefits
1. It indicates the strength of the impact of multiple independent variables on a dependent variable.
2. It indicates the significant relationships between the dependent variable and the independent variables.
These benefits help market researchers, data analysts, and data scientists evaluate candidate variables, eliminate the weak ones, and select the best set of variables for building predictive models.
Types of Regression Analysis
Regression analysis is generally classified into two kinds: simple and multiple.
Simple regression involves only two variables: one dependent variable and one explanatory (independent) variable.
Multiple regression involves two or more explanatory variables.
A regression analysis may involve a linear model or a nonlinear model. The term "linear" can be interpreted in two different ways:
1. Linear in the variable
2. Linear in the parameters
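To make the two senses of "linear" concrete, here is a small set of example models. They are chosen only for illustration and are not taken from the slides; $\alpha$ and $\beta$ denote parameters, $x$ the explanatory variable, and $\varepsilon$ an error term.

```latex
\begin{align*}
y &= \alpha + \beta x + \varepsilon
  && \text{linear in the variable and in the parameters} \\
y &= \alpha + \beta x^{2} + \varepsilon
  && \text{nonlinear in the variable, linear in the parameters} \\
y &= \alpha + \beta^{2} x + \varepsilon
  && \text{linear in the variable, nonlinear in the parameter } \beta \\
y &= \alpha + e^{\beta x} + \varepsilon
  && \text{nonlinear in the variable and in the parameter } \beta
\end{align*}
```

Least squares for linear regression relies on linearity in the parameters, so the second model can still be fitted as a linear regression on $x^{2}$.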
Simple Linear Regression Model
The simple linear regression model is a model with a single regressor $x$ that has a linear relationship with a response $y$:
$y = \alpha + \beta x + \varepsilon$
where $y$ is the response variable, $x$ is the regressor variable, $\alpha$ is the intercept, $\beta$ is the slope, and $\varepsilon$ is the random error component.
In this technique, the dependent variable is a continuous random variable; the independent variable can be continuous or discrete but is not a random variable; and the regression line is linear.
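Some basic assumptions of the model:
Simple linear regression model: $y_i = \alpha + \beta x_i + \varepsilon_i$ for $i = 1, 2, \dots, n$
1. $\varepsilon_i$ is a random variable with zero mean and variance $\sigma^2$, i.e. $E(\varepsilon_i) = 0$ and $V(\varepsilon_i) = \sigma^2$.
2. $\varepsilon_i$ and $\varepsilon_j$ are uncorrelated for $i \neq j$, i.e. $\mathrm{cov}(\varepsilon_i, \varepsilon_j) = 0$.
3. $\varepsilon_i$ is a normally distributed random variable with mean zero and variance $\sigma^2$, i.e. $\varepsilon_i \sim N(0, \sigma^2)$.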
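$y_i = \alpha + \beta x_i + \varepsilon_i$ for $i = 1, 2, \dots, n$
Since $E(\varepsilon_i) = 0$:
$E(y_i) = E(\alpha + \beta x_i + \varepsilon_i) = \alpha + \beta x_i$
$V(y_i) = V(\alpha + \beta x_i + \varepsilon_i) = V(\varepsilon_i) = \sigma^2$
$\varepsilon_i \sim N(0, \sigma^2) \Rightarrow y_i \sim N(\alpha + \beta x_i, \sigma^2)$
NOTE: The dataset should satisfy these basic assumptions.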
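The mean and variance of the response under these assumptions can be checked numerically. The following is a minimal Python sketch, not part of the original slides: the parameter values, the fixed x, the sample size, and the helper name `simulate_responses` are all hypothetical choices made for illustration.

```python
import random
import statistics


def simulate_responses(alpha, beta, x, sigma, n, seed=0):
    """Draw n responses y = alpha + beta * x + eps at a fixed x, with eps ~ N(0, sigma^2)."""
    rng = random.Random(seed)  # seeded so the run is repeatable
    return [alpha + beta * x + rng.gauss(0.0, sigma) for _ in range(n)]


# Hypothetical values, chosen only for illustration.
alpha, beta, sigma, x = 2.0, 0.5, 1.5, 4.0
ys = simulate_responses(alpha, beta, x, sigma, n=50_000)

sample_mean = statistics.fmean(ys)
sample_var = statistics.pvariance(ys)

# Theory: E(y) = alpha + beta * x and V(y) = sigma^2.
print(f"E(y): theory {alpha + beta * x:.3f}, sample {sample_mean:.3f}")
print(f"V(y): theory {sigma ** 2:.3f}, sample {sample_var:.3f}")
```

With a large sample, the sample mean should sit close to alpha + beta * x and the sample variance close to sigma squared; the agreement is approximate because the draws are random.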
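Least Squares Estimation for Parameters
The parameters $\alpha$ and $\beta$ are unknown and must be estimated using sample data: $(x_1, y_1), (x_2, y_2), \dots, (x_n, y_n)$
[Figure: scatter plots of y against x with the line $y = \alpha + \beta x + \varepsilon$; each observation satisfies $y_i = \alpha + \beta x_i + \varepsilon_i$.]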
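The line fitted by least squares is the one that makes the sum of squares of all vertical discrepancies as small as possible.
We estimate the parameters so that the sum of squares of all the vertical differences between the observations and the fitted line is a minimum:
$S = \sum_{i=1}^{n} \varepsilon_i^2 = \sum_{i=1}^{n} (y_i - \alpha - \beta x_i)^2$
[Figure: scatter plot of y against x with the fitted line for $y_i = \alpha + \beta x_i + \varepsilon_i$; the observed point $(x_1, y_1)$ lies a vertical distance $y_1 - \hat{y}_1 = \varepsilon_1$ from the fitted point $(x_1, \hat{y}_1)$.]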
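The criterion can be sketched in Python as a function of the two parameters. This is an illustrative sketch, not the author's code: the function name `sum_of_squares` is hypothetical, the five observations are borrowed from the standard-error example table in this deck, and two of the three candidate lines are arbitrary alternatives picked for comparison.

```python
def sum_of_squares(alpha, beta, xs, ys):
    """S(alpha, beta): sum of squared vertical differences y_i - (alpha + beta * x_i)."""
    return sum((y - (alpha + beta * x)) ** 2 for x, y in zip(xs, ys))


# Observations (X, Y) from the standard-error example table in this deck.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [1.0, 2.0, 1.3, 3.75, 2.25]

# (0.785, 0.425) reproduces the fitted values Y' in that table;
# the other two candidates are arbitrary lines used only for comparison.
candidates = [(0.785, 0.425), (0.0, 0.7), (2.06, 0.0)]
scores = [sum_of_squares(a, b, xs, ys) for a, b in candidates]

for (a, b), s in zip(candidates, scores):
    print(f"alpha={a:.3f}  beta={b:.3f}  S={s:.3f}")
```

The first candidate should give the smallest S of the three, which is what "least squares" asks for.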
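Minimizing the function $S$ requires calculating the first-order conditions with respect to $\alpha$ and $\beta$ and setting them to zero:
I: $\dfrac{\partial S}{\partial \alpha} = -2 \sum_{i=1}^{n} (y_i - \alpha - \beta x_i) = 0$
II: $\dfrac{\partial S}{\partial \beta} = -2 \sum_{i=1}^{n} x_i (y_i - \alpha - \beta x_i) = 0$
We can solve for $\alpha$ mathematically:
I: $-2 \sum_{i=1}^{n} (y_i - \alpha - \beta x_i) = 0$
$\sum_{i=1}^{n} y_i = n\alpha + \beta \sum_{i=1}^{n} x_i$
$\hat{\alpha} = \bar{y} - \hat{\beta}\bar{x}$
where $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$, $\bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i$, and
$S = \sum_{i=1}^{n} (y_i - \alpha - \beta x_i)^2$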
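II: $\dfrac{\partial S}{\partial \beta} = -2 \sum_{i=1}^{n} x_i (y_i - \alpha - \beta x_i) = 0$
Substituting $\hat{\alpha} = \bar{y} - \hat{\beta}\bar{x}$ from condition I:
$\sum_{i=1}^{n} x_i (y_i - \bar{y}) = \hat{\beta} \sum_{i=1}^{n} x_i (x_i - \bar{x})$
$\hat{\beta} = \dfrac{\sum_{i=1}^{n} x_i (y_i - \bar{y})}{\sum_{i=1}^{n} x_i (x_i - \bar{x})} = \dfrac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2}$
Proof:
$\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y}) = \sum_{i=1}^{n} x_i (y_i - \bar{y}) - \bar{x} \sum_{i=1}^{n} (y_i - \bar{y}) = \sum_{i=1}^{n} x_i (y_i - \bar{y})$
because $\sum_{i=1}^{n} (y_i - \bar{y}) = \sum_{i=1}^{n} y_i - n\bar{y} = 0$; the same argument applies to the denominator.
$\hat{\alpha} = \bar{y} - \hat{\beta}\bar{x}$ ; $\hat{\beta} = \dfrac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2}$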
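The two estimates, slope = S_xy / S_xx and intercept = y_bar - slope * x_bar, can be computed directly. Below is a minimal Python sketch under stated assumptions: the function name `fit_simple_linear` is hypothetical, and the data are the five (X, Y) pairs from the standard-error example table in this deck, used here only as a convenient check.

```python
def fit_simple_linear(xs, ys):
    """Return (alpha_hat, beta_hat) from the closed-form least-squares formulas."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # S_xy = sum (x_i - x_bar)(y_i - y_bar), S_xx = sum (x_i - x_bar)^2
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    if s_xx == 0:
        raise ValueError("all x values are identical; the slope is undefined")
    beta_hat = s_xy / s_xx
    alpha_hat = y_bar - beta_hat * x_bar
    return alpha_hat, beta_hat


# Observations (X, Y) from the standard-error example table in this deck.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [1.0, 2.0, 1.3, 3.75, 2.25]

alpha_hat, beta_hat = fit_simple_linear(xs, ys)
fitted = [alpha_hat + beta_hat * x for x in xs]
print(f"intercept = {alpha_hat:.3f}, slope = {beta_hat:.3f}")
print("fitted values:", [round(v, 3) for v in fitted])
```

The fitted values should agree with the Y' column of that table, which is a useful sanity check on the formulas.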
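Example
$\hat{\beta} = \dfrac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2}$
$\hat{\alpha} = \bar{y} - \hat{\beta}\bar{x}$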
Calculating R² Using Regression Analysis
R-squared is a statistical measure of how close the data are to the fitted regression line, that is, of the goodness of fit. It is also known as the coefficient of determination.
First, calculate the distance between each actual value and the mean, and the distance between each estimated value and the mean. Then compare the two: R² is the sum of squared distances of the estimated values from the mean divided by the sum of squared distances of the actual values from the mean.
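The comparison of the two sets of distances can be sketched in Python as a ratio of sums of squares. This is an illustrative sketch rather than the author's implementation: the function name `r_squared` is hypothetical, the line is fitted with the usual closed-form least-squares formulas, and the data are the five (X, Y) pairs from the standard-error example table in this deck.

```python
def r_squared(xs, ys):
    """R^2 = sum (y_hat_i - y_bar)^2 / sum (y_i - y_bar)^2 for a least-squares line."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Fit the least-squares line (closed-form slope and intercept).
    beta = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sum(
        (x - x_bar) ** 2 for x in xs
    )
    alpha = y_bar - beta * x_bar
    estimated = [alpha + beta * x for x in xs]
    # Distances of the actual values from the mean (total sum of squares).
    ss_total = sum((y - y_bar) ** 2 for y in ys)
    # Distances of the estimated values from the mean (regression sum of squares).
    ss_regression = sum((y_hat - y_bar) ** 2 for y_hat in estimated)
    if ss_total == 0:
        raise ValueError("all y values are identical; R^2 is undefined")
    return ss_regression / ss_total


# Observations (X, Y) from the standard-error example table in this deck.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [1.0, 2.0, 1.3, 3.75, 2.25]

r2 = r_squared(xs, ys)
print(f"R^2 = {r2:.4f}")
```

A value near 1 means the estimated values vary almost as much as the actual values do, so the line explains most of the variation; a value near 0 means it explains little.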
Example
Performance of Model
Standard Error of the Estimate (Root Mean Square Error)
The standard error of the estimate is a measure of the accuracy of predictions.
Note: The regression line is the line that minimizes the sum of squared deviations of prediction (also called the sum of squares error).
The standard error of the estimate is closely related to this quantity and is defined below:
$\sigma_{est} = \sqrt{\dfrac{\sum (Y - Y')^2}{N}}$
where $Y$ = actual value, $Y'$ = estimated value, and $N$ = number of observations.
Example
X      Y      Y'      Y-Y'     (Y-Y')²
1.00   1.00   1.210   -0.210   0.044
2.00   2.00   1.635    0.365   0.133
3.00   1.30   2.060   -0.760   0.578
4.00   3.75   2.485    1.265   1.600
5.00   2.25   2.910   -0.660   0.436
Sum
15.00  10.30  10.30    0.000   2.791
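The table's last column can be turned into a standard error of the estimate with a few lines of Python. This sketch assumes the definition sqrt(sum (Y - Y')^2 / N), which matches the symbols the slide lists (Y, Y', N); some texts divide by N - 2 instead when working with a sample. The function name `standard_error_of_estimate` is hypothetical.

```python
import math


def standard_error_of_estimate(actual, estimated):
    """sqrt(sum (Y - Y')^2 / N); the divisor N is an assumption, see the note above."""
    n = len(actual)
    sse = sum((y - y_est) ** 2 for y, y_est in zip(actual, estimated))
    return math.sqrt(sse / n)


# Columns Y and Y' from the example table.
actual = [1.00, 2.00, 1.30, 3.75, 2.25]
estimated = [1.210, 1.635, 2.060, 2.485, 2.910]

# Sum of squared errors, comparable with the 2.791 total in the table.
sse = sum((y - y_est) ** 2 for y, y_est in zip(actual, estimated))
se = standard_error_of_estimate(actual, estimated)
print(f"sum of squared errors = {sse:.3f}")
print(f"standard error of the estimate = {se:.3f}")
```

The sum of squared errors agrees with the table's 2.791 up to rounding, since the table rounds each squared term to three decimals.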
Difference
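Least Squares for Linear Regression
Solve: $Ax = b$
The columns of $A$ span a vector space, range($A$).
$Ax = x_1 a_1 + x_2 a_2$ is an arbitrary vector in range($A$).
If $b$ is a vector in $\mathbb{R}^n$ that also lies in the column space of $A$, then $Ax = b$ has a solution.
[Figure: vectors $a_1$ and $a_2$ spanning range($A$), with $b$ lying in that plane.]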
If $b$ is a vector in $\mathbb{R}^n$ but not in the column space of $A$, then $Ax = b$ has no solution.
Instead, try to find the $\hat{x}$ that makes $A\hat{x}$ as close to $b$ as possible; this is called the least squares solution of the problem.
[Figure: vectors $a_1$ and $a_2$ spanning range($A$), with $b$ lying outside that plane and the residual $b - A\hat{x}$.]
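[Figure: $b$, its projection $A\hat{x}$ onto the plane spanned by $a_1$ and $a_2$, and the residual $b - A\hat{x}$.]
$A\hat{x}$ is the orthogonal projection of $b$ onto range($A$):
$A^T (b - A\hat{x}) = 0 \iff A^T A \hat{x} = A^T b$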
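For simple linear regression the normal equations form a 2x2 system, which can be solved by hand. The following is a minimal Python sketch, not the deck's Matlab implementation: it assumes A has a column of ones and a column of x values, takes b as the y values, and solves A^T A x_hat = A^T b with Cramer's rule. The function name `solve_normal_equations` is hypothetical, and the data come from the standard-error example table in this deck.

```python
def solve_normal_equations(xs, ys):
    """Solve A^T A x_hat = A^T b for A = [1, x_i] and b = y_i (2x2 system)."""
    n = len(xs)
    sx = sum(xs)
    sxx = sum(x * x for x in xs)
    sy = sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    # A^T A = [[n, sx], [sx, sxx]] and A^T b = [sy, sxy]
    det = n * sxx - sx * sx
    if det == 0:
        raise ValueError("A^T A is singular; the columns of A are linearly dependent")
    intercept = (sy * sxx - sx * sxy) / det
    slope = (n * sxy - sx * sy) / det
    return intercept, slope


# Observations (X, Y) from the standard-error example table in this deck.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [1.0, 2.0, 1.3, 3.75, 2.25]

intercept, slope = solve_normal_equations(xs, ys)

# Residual b - A x_hat; it should be orthogonal to both columns of A.
residual = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
dot_with_ones = sum(residual)
dot_with_x = sum(x * r for x, r in zip(xs, residual))

print(f"x_hat = ({intercept:.3f}, {slope:.3f})")
print(f"A^T (b - A x_hat) = ({dot_with_ones:.2e}, {dot_with_x:.2e})")
```

Both entries of A^T (b - A x_hat) should be zero up to floating-point rounding, which is the orthogonality condition stated above. For larger or ill-conditioned problems, numerical libraries usually prefer QR or SVD over forming A^T A explicitly.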
Matlab Implementation (Linear_Regression3.m)
References
[1] Sykes, Alan O. "An introduction to regression analysis." (1993).
[2] Chatterjee, Samprit, and Ali S. Hadi. Regression analysis by example. John Wiley & Sons, 2015.
[3] Draper, Norman Richard, Harry Smith, and Elizabeth Pownell. Applied regression analysis. Vol. 3. New York: Wiley, 1966.
[4] Montgomery, Douglas C., Elizabeth A. Peck, and G. Geoffrey Vining. Introduction to linear regression analysis. John Wiley & Sons, 2015.
[5] Seber, George AF, and Alan J. Lee. Linear regression analysis. Vol. 936. John Wiley & Sons, 2012.
THANK YOU