Correlation Analysis
A measure of association between two or more numerical variables.
Examples:
height & weight relationship
price and demand relationship
Independent and Dependent Variables
Independent variable: The variable that is the basis of estimation is called the independent variable.
Dependent variable: The variable whose value is to be estimated is called the dependent variable. Dependent variables depend on independent variables.
Example
Student   Hours studied   % Marks
1         6               82
2         2               63
3         1               57
4         5               88
5         3               68
6         2               75
Uses
1. With the help of correlation analysis we can measure, in one figure, the degree of relationship existing between the variables.
2. Correlation analysis contributes to the understanding of economic behavior, aids in locating the critically important variables on which others depend, and can reveal to the economist the connections through which disturbances spread and the paths through which stabilizing forces become effective.
3. In business, correlation analysis enables the executive to estimate costs, sales, prices and other variables on the basis of some other series with which these costs, sales or prices may be functionally related.
Example
The independent variable in this example is the number of hours studied.
The mark the student obtains is the dependent variable.
The mark a student obtains depends on the number of hours he or she studies.
Are these two variables related?
Types of correlation
Correlation
Simple, partial and multiple
Positive and negative
Positive correlation
A positive relationship exists when both variables increase or decrease at the same time.
Examples: height and weight; income and expenditure; training and performance.
Negative correlation
A negative relationship exists when one variable increases while the other decreases, or vice versa.
Examples: strength and age; demand and price.
Simple, partial and multiple correlation
Association between only two variables is simple correlation.
(e.g. Height & Weight)
Association among more than two variables is multiple correlation.
(e.g. Capital, Production cost, Advertisement cost & Profit)
In the case of multiple correlation, the association between two variables when the effects of the other variables are held constant is called partial correlation.
(e.g. correlation between Capital & Profit when the effects of Production cost & Advertisement cost remain unchanged.)
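As a rough illustration of partial correlation, the sketch below uses the standard first-order formula, which holds a single third variable constant. The three simple correlations are hypothetical values invented only to show the arithmetic, and the function name `partial_corr` is an assumption. Holding two variables constant, as in the Capital & Profit example, would mean applying the same formula recursively.

```python
import math


def partial_corr(r_xy, r_xz, r_yz):
    """Correlation of x and y with the linear effect of z held constant."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz**2) * (1 - r_yz**2))


# Hypothetical simple correlations, invented only to show the arithmetic.
r_capital_profit = 0.80
r_capital_adcost = 0.60
r_profit_adcost = 0.70

r_partial = partial_corr(r_capital_profit, r_capital_adcost, r_profit_adcost)
print(f"partial r (capital, profit | advertisement cost) = {r_partial:.3f}")
```

When the third variable is uncorrelated with both of the others, the partial correlation equals the simple correlation.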
Scatter plots
A scatter plot is a chart that shows the relationship between two quantitative variables measured on the same observations.
In a scatter plot, one of the variables (usually the independent variable) is plotted along the horizontal or X axis and the other is plotted along the vertical or Y axis.
Specific example
For seven random summer days, a person recorded the temperature and their water consumption during a three-hour period spent outside.
Temperature (F)   Water consumption (ounces)
75                16
83                20
85                25
85                27
93                32
97                48
99                48
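One way to look at these seven observations without a plotting library is a character-grid scatter plot. The sketch below is only an illustration: the function name `text_scatter` and the grid size are assumptions, temperature runs left to right and water consumption bottom to top, and it assumes the values on each axis are not all equal.

```python
temps = [75, 83, 85, 85, 93, 97, 99]   # temperature (F), horizontal axis
water = [16, 20, 25, 27, 32, 48, 48]   # water consumption (ounces), vertical axis


def text_scatter(xs, ys, width=40, height=12):
    """Return a character-grid scatter plot as a multi-line string."""
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)
    grid = [[" "] * width for _ in range(height)]
    for x, y in zip(xs, ys):
        # Scale each value to a column / row index inside the grid.
        col = round((x - x_min) / (x_max - x_min) * (width - 1))
        row = round((y - y_min) / (y_max - y_min) * (height - 1))
        grid[height - 1 - row][col] = "*"   # row 0 of the grid is the top
    return "\n".join("".join(line).rstrip() for line in grid)


plot = text_scatter(temps, water)
print(plot)
```

The points climb from the lower left to the upper right, which is the pattern of a positive relationship.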
How would you describe the graph?
Types of correlations
[Figure: four scatter plots of Y against X showing perfect positive, perfect negative, strong positive, and strong negative correlation]
No linear correlation
[Figure: scatter plot of y = IQ (axis from 80 to 160) against x = height (axis from 60 to 80) showing no linear correlation]
Correlation Coefficient
A quantity that measures the direction and the strength of the linear association between two paired numerical variables is called the correlation coefficient.
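Pearson’s correlation coefficient or product moment correlation
\[ r = \frac{\sum (X_i - \bar{X})(Y_i - \bar{Y})}{\sqrt{\sum (X_i - \bar{X})^2 \sum (Y_i - \bar{Y})^2}} \]
\[ r = \frac{\sum X_i Y_i - n\bar{X}\bar{Y}}{\sqrt{\left(\sum X_i^2 - n\bar{X}^2\right)\left(\sum Y_i^2 - n\bar{Y}^2\right)}} \]
\[ r = \frac{n\sum X_i Y_i - \sum X_i \sum Y_i}{\sqrt{\left[n\sum X_i^2 - \left(\sum X_i\right)^2\right]\left[n\sum Y_i^2 - \left(\sum Y_i\right)^2\right]}} \]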
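The deviation form and the raw-sums form of Pearson's r are algebraically the same quantity. A minimal sketch, using the hours-studied and marks data from the earlier example, checks that both give the same value; the function names are assumptions made for illustration.

```python
import math


def pearson_deviation(xs, ys):
    """r from deviations about the means."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def pearson_raw_sums(xs, ys):
    """r from the raw sums: n, sum X, sum Y, sum XY, sum X^2, sum Y^2."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    syy = sum(y * y for y in ys)
    return (n * sxy - sx * sy) / math.sqrt((n * sxx - sx**2) * (n * syy - sy**2))


hours = [6, 2, 1, 5, 3, 2]
marks = [82, 63, 57, 88, 68, 75]

print(f"deviation form: {pearson_deviation(hours, marks):.4f}")
print(f"raw-sums form:  {pearson_raw_sums(hours, marks):.4f}")
```

The raw-sums form is the one usually preferred for hand calculation, because it needs only the column totals of a table.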
Example 1
A company has brought out an annual report in which the capital investment and profits were given for the past few years.
Capital investment (crores):   10   16   18   24   36   48   57
Profits (lakh):                12   14   13   18   26   38   62
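Calculation
X = Capital investment, Y = Profits
\[ r = \frac{n\sum X_i Y_i - \sum X_i \sum Y_i}{\sqrt{\left[n\sum X_i^2 - \left(\sum X_i\right)^2\right]\left[n\sum Y_i^2 - \left(\sum Y_i\right)^2\right]}} \]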
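Calculation (continued)
X     Y     XY     X^2    Y^2
10    12    120    100    144
16    14    224    256    196
18    13    234    324    169
24    18    432    576    324
36    26    936    1296   676
48    38    1824   2304   1444
57    62    3534   3249   3844
Totals: sum of X = 209, sum of Y = 183, sum of XY = 7304, sum of X^2 = 8105, sum of Y^2 = 6797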
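The column totals and the resulting r for Example 1 can be checked with a short script. This is a sketch of the raw-sums calculation only; the variable names are assumptions.

```python
import math

capital = [10, 16, 18, 24, 36, 48, 57]  # X, capital investment
profit = [12, 14, 13, 18, 26, 38, 62]   # Y, profits
n = len(capital)

# Column totals of the calculation table.
sum_x = sum(capital)
sum_y = sum(profit)
sum_xy = sum(x * y for x, y in zip(capital, profit))
sum_x2 = sum(x * x for x in capital)
sum_y2 = sum(y * y for y in profit)

numerator = n * sum_xy - sum_x * sum_y
denominator = math.sqrt((n * sum_x2 - sum_x**2) * (n * sum_y2 - sum_y**2))
r = numerator / denominator

print(f"n={n}  sum X={sum_x}  sum Y={sum_y}  sum XY={sum_xy}  "
      f"sum X^2={sum_x2}  sum Y^2={sum_y2}")
print(f"r = {r:.3f}")
```

The value of r comes out close to +1, so capital investment and profits show a very strong positive linear relationship in this sample.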
Example 2
A department store has the following sales statistics for the last year for 8 salesmen who have varying years of experience.
Salesman   Years of experience   Annual sales (tk)
1          1                     80
2          3                     97
3          4                     92
4          4                     102
5          6                     103
6          8                     111
7          10                    119
8          11                    117
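Calculation
X = Years of experience, Y = Annual sales
\[ r = \frac{n\sum X_i Y_i - \sum X_i \sum Y_i}{\sqrt{\left[n\sum X_i^2 - \left(\sum X_i\right)^2\right]\left[n\sum Y_i^2 - \left(\sum Y_i\right)^2\right]}} \]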
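Calculation (continued)
Salesman   X     Y      XY     X^2    Y^2
1          1     80     80     1      6400
2          3     97     291    9      9409
3          4     92     368    16     8464
4          4     102    408    16     10404
5          6     103    618    36     10609
6          8     111    888    64     12321
7          10    119    1190   100    14161
8          11    117    1287   121    13689
Totals: sum of X = 47, sum of Y = 821, sum of XY = 5130, sum of X^2 = 363, sum of Y^2 = 85457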
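For Example 2, summing the columns computed from the data gives the totals below, and substituting them into the raw-sums formula yields r. This is a worked sketch of the arithmetic, rounded to three decimals.

```latex
\begin{aligned}
n &= 8,\quad \sum X_i = 47,\quad \sum Y_i = 821,\quad \sum X_i Y_i = 5130,\quad
\sum X_i^2 = 363,\quad \sum Y_i^2 = 85457 \\
r &= \frac{8(5130) - (47)(821)}{\sqrt{\left[8(363) - 47^2\right]\left[8(85457) - 821^2\right]}}
   = \frac{2453}{\sqrt{(695)(9615)}} \approx 0.949
\end{aligned}
```

Years of experience and annual sales therefore show a very strong positive linear relationship for these 8 salesmen.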
Properties of r
r lies between -1 and +1, i.e., \( -1 \le r \le 1 \).
The correlation coefficient is a symmetric measure: the correlation of X with Y equals the correlation of Y with X.
r is negative or positive according to whether the numerator of the formula is negative or positive.
The correlation coefficient is a dimensionless quantity, implying that it is not expressed in any unit of measurement.
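These properties can be checked numerically. The sketch below reuses the salesmen data from Example 2; the helper `pearson_r` and the conversion of years to months are assumptions made for illustration.

```python
import math


def pearson_r(xs, ys):
    """Pearson's r from the raw sums."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    syy = sum(y * y for y in ys)
    return (n * sxy - sx * sy) / math.sqrt((n * sxx - sx**2) * (n * syy - sy**2))


years = [1, 3, 4, 4, 6, 8, 10, 11]
sales = [80, 97, 92, 102, 103, 111, 119, 117]

r_xy = pearson_r(years, sales)
r_yx = pearson_r(sales, years)                    # symmetry: swap X and Y
months = [12 * x for x in years]                  # same experience in another unit
r_rescaled = pearson_r(months, sales)             # dimensionless: unit change
r_flipped = pearson_r([-x for x in years], sales) # sign follows the numerator

print(f"r(X, Y) = {r_xy:.4f}, r(Y, X) = {r_yx:.4f}")
print(f"r with experience in months = {r_rescaled:.4f}")
print(f"r with X negated = {r_flipped:.4f}")
```

Swapping the variables or changing the unit of measurement leaves r unchanged, while reversing the direction of one variable only flips its sign.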
Interpretation
r = 1 indicates a perfect positive correlation or relationship. In this case, all the points in a scatter diagram lie on a straight line that has an upward direction.
r = -1 indicates a perfect negative correlation or relationship. In this case, all the points in a scatter diagram lie on a straight line that has a downward direction.
r = 0 indicates that the variables are not linearly related, i.e. there is no linear correlation.
Interpretation (continued)
A value of r close to 1 indicates a strong positive correlation or strong positive linear relationship.
A value of r close to -1 indicates a strong negative correlation or strong negative linear relationship.
A positive value of r close to 0 indicates a weak positive correlation or weak positive linear relationship.
A negative value of r close to 0 indicates a weak negative correlation or weak negative linear relationship.
[Figure: number line for r marked at -1, -0.5, 0, 0.5 and 1, labelled from left to right: perfect negative, strong negative, moderate negative, weak negative, zero, weak positive, moderate positive, strong positive, perfect positive correlation]
Correlation Coefficient Interpretation
Coefficient range (absolute value of r)   Strength of relationship
0.01 - 0.20                               Very weak
0.21 - 0.40                               Weak
0.41 - 0.60                               Moderate
0.61 - 0.80                               Strong
0.81 - 0.99                               Very strong
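The table can be turned into a small lookup. This is a sketch under stated assumptions: the function name `describe_strength` is hypothetical, the ranges are applied to the absolute value of r, values between the printed bands are assigned to the lower band, and r = 0 and r = ±1 are handled separately because the table does not list them.

```python
def describe_strength(r):
    """Label r using the ranges in the table, applied to |r|."""
    if not -1 <= r <= 1:
        raise ValueError("r must lie between -1 and +1")
    if r == 0:
        return "no linear correlation"
    direction = "positive" if r > 0 else "negative"
    size = abs(r)
    if size == 1:
        return f"perfect {direction}"
    # Upper bound of each band, taken from the table.
    bands = [(0.20, "very weak"), (0.40, "weak"),
             (0.60, "moderate"), (0.80, "strong")]
    for upper, label in bands:
        if size <= upper:
            return f"{label} {direction}"
    return f"very strong {direction}"


for value in (0.35, -0.70, 0.85):
    print(value, "->", describe_strength(value))
```

Such cut-offs are a convention for describing r, not a statistical test, so the labels are best read as rough guidance.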
Interpret the following
i. r = -0.098
ii. r = 1
iii. r = 0.5
iv. r = -1
v. r = 0
vi. r = 0.92
Types of correlations (continued)
[Figure: four scatter plots of Y against X, labelled r = 1, r = -1, r close to +1, and r close to -1]
Type of correlation
[Figure: scatter plot of Y against X, labelled r close to zero]
Interpret the following
[Figures 1 to 3: three scatter plots of Y against X]
For each plot: r = ? Why?
Characteristics of Correlation
Correlation does not tell us anything about causation.
To calculate correlation, both variables must be quantitative (not categorical).
A positive value of r indicates a positive association between x and y. A negative value of r indicates a negative association between x and y.
Regression Analysis
Regression analysis is a technique for studying the relationship of one dependent variable with one or more independent variables, with a view to estimating or predicting the average value of the dependent variable in terms of the known or fixed values of the independent variables.
Objectives of regression
Estimate the relationship that exists between the dependent variable and the independent variable.
Determine the effect of each of the independent variables on the dependent variable.
Predict the value of the dependent variable for a given value of the independent variable.
Regression vs. Correlation
Correlation measures the STRENGTH of the linear association between paired variables, say X and Y. Regression, on the other hand, gives the FORM of the linear association that best predicts Y from the values of X.
Correlation never measures a cause-and-effect relationship. Regression treats X as the explanatory variable and Y as the response, but a fitted regression does not by itself prove that X causes Y.
Regression vs. Correlation (continued)
Linear regression is not symmetric in X and Y: interchanging X and Y gives a different regression equation. In contrast, interchanging X and Y in the calculation of the correlation coefficient gives the same value of the correlation coefficient.
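The asymmetry is easy to see numerically. The sketch below fits the least squares slope in both directions for the hours-studied and marks data from the earlier example; the function name `slope` is an assumption. It also uses a standard identity: the product of the two slopes equals the square of r.

```python
def slope(xs, ys):
    """Least squares slope for predicting ys from xs."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return (n * sxy - sx * sy) / (n * sxx - sx**2)


hours = [6, 2, 1, 5, 3, 2]
marks = [82, 63, 57, 88, 68, 75]

b_marks_on_hours = slope(hours, marks)   # change in marks per extra hour
b_hours_on_marks = slope(marks, hours)   # change in hours per extra mark
r_squared = b_marks_on_hours * b_hours_on_marks

print(f"slope of marks on hours: {b_marks_on_hours:.4f}")
print(f"slope of hours on marks: {b_hours_on_marks:.4f}")
print(f"product of the two slopes (r squared): {r_squared:.4f}")
```

The two slopes differ because each regression minimizes errors in a different variable, while r is the same whichever variable is called X.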
Types of regression
Simple linear regression: shows the relationship between one dependent variable and one independent variable.
Multiple regression: shows the relationship between one dependent variable and two or more independent variables.
Regression model (equation)
A model is a mathematical equation that describes the relationship between a dependent variable and a set of independent variables.
\[ Y_i = a + bX_i \]
Here Y_i is the dependent variable, X_i is the independent variable, a is the intercept term, and b is the slope term.
Interpretation
\[ Y_i = a + bX_i \]
Y is the dependent variable.
X is the independent variable.
a is the intercept term, also the expected value of Y for X = 0.
b is the slope term, also known as the regression coefficient. It represents the amount of change in Y for each unit change in X.
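Estimates of a and b
The least squares principle is used to estimate a and b. The equations to determine a and b are
\[ \hat{b} = \frac{\sum (X_i - \bar{X})(Y_i - \bar{Y})}{\sum (X_i - \bar{X})^2} = \frac{n\sum X_i Y_i - \sum X_i \sum Y_i}{n\sum X_i^2 - \left(\sum X_i\right)^2} \]
\[ \hat{a} = \bar{Y} - \hat{b}\bar{X} \]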
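The least squares principle says that the estimates of a and b are the values that make the sum of squared errors as small as possible. A minimal sketch, using the hours-studied and marks data from the earlier example, computes the estimates and checks that nudging either one makes the fit worse; the names `fit_line` and `sse` and the size of the nudges are assumptions.

```python
def fit_line(xs, ys):
    """Return the least squares estimates (a, b) for Y = a + bX."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    b = (n * sxy - sx * sy) / (n * sxx - sx**2)
    a = sy / n - b * sx / n
    return a, b


def sse(a, b, xs, ys):
    """Sum of squared errors of the line Y = a + bX."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))


hours = [6, 2, 1, 5, 3, 2]
marks = [82, 63, 57, 88, 68, 75]

a_hat, b_hat = fit_line(hours, marks)
best = sse(a_hat, b_hat, hours, marks)

# Move the intercept or the slope slightly away from the estimates.
nudges = [(0.5, 0.0), (-0.5, 0.0), (0.0, 0.1), (0.0, -0.1)]
nudged = [sse(a_hat + da, b_hat + db, hours, marks) for da, db in nudges]

print(f"a = {a_hat:.3f}, b = {b_hat:.3f}, SSE = {best:.3f}")
print("SSE after nudging:", [round(v, 3) for v in nudged])
```

Every nudged line has a larger sum of squared errors than the fitted one, which is what "least squares" means.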
Properties of b
It lies between minus infinity and plus infinity, i.e., \( -\infty < b < \infty \).
A negative value of b indicates that the relationship between the two variables is negative.
A positive value of b indicates that the relationship between the two variables is positive.
b = 0 indicates that there is no linear relationship between the two variables.
Example 1
Age of truck (years):                        5   4   3   2   7
Repair expense last year (hundreds of $):    7   7   6   4   10
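\[ \hat{Y} = \hat{a} + \hat{b}X \]
\[ \hat{b} = \frac{n\sum X_i Y_i - \sum X_i \sum Y_i}{n\sum X_i^2 - \left(\sum X_i\right)^2} \]
\[ \hat{a} = \bar{Y} - \hat{b}\bar{X} \]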
Calculation
X     Y     XY    X^2
5     7     35    25
4     7     28    16
3     6     18    9
2     4     8     4
7     10    70    49
Totals: sum of X = 21, sum of Y = 34, sum of XY = 159, sum of X^2 = 103
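Using the rows of the calculation table (X = 5, 4, 3, 2, 7 and Y = 7, 7, 6, 4, 10), the column totals and the estimates work out as follows. This is a worked sketch of the arithmetic, rounded to three decimals.

```latex
\begin{aligned}
n &= 5,\quad \sum X_i = 21,\quad \sum Y_i = 34,\quad \sum X_i Y_i = 159,\quad \sum X_i^2 = 103 \\
\hat{b} &= \frac{5(159) - (21)(34)}{5(103) - 21^2} = \frac{81}{74} \approx 1.095 \\
\hat{a} &= \bar{Y} - \hat{b}\bar{X} = 6.8 - \frac{81}{74}(4.2) \approx 2.203 \\
\hat{Y} &\approx 2.203 + 1.095X
\end{aligned}
```

On these figures, each additional year of age is associated with roughly 1.095 hundred dollars more in annual repair expense.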
Example 2
Mr. A, president of a financial services firm, believes that there is a relationship between the number of client contacts and the dollar amount of sales. To document this assertion, Mr. A gathered the following sample information.
No. of contacts:       14   12   20   16   46   23
Sales ($ thousand):    24   14   28   30   80   30
Calculation
1. Find the regression coefficient and interpret it.
2. Find the regression equation that expresses the relationship between these two variables.
3. Determine the amount of sales if 40 contacts are made.
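One possible way to carry out these three steps is sketched below, using the raw-sums formulas for the slope and intercept. The variable names and the helper `predict_sales` are assumptions; sales are in thousands of dollars, as in the data.

```python
contacts = [14, 12, 20, 16, 46, 23]   # X, number of client contacts
sales = [24, 14, 28, 30, 80, 30]      # Y, sales in $ thousand
n = len(contacts)

sum_x = sum(contacts)
sum_y = sum(sales)
sum_xy = sum(x * y for x, y in zip(contacts, sales))
sum_x2 = sum(x * x for x in contacts)

# Regression coefficient (slope) and intercept by least squares.
b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x**2)
a = sum_y / n - b * sum_x / n


def predict_sales(num_contacts):
    """Estimated sales ($ thousand) from the fitted line."""
    return a + b * num_contacts


predicted = predict_sales(40)
print(f"regression coefficient b = {b:.3f}")
print(f"regression equation: Y = {a:.3f} + {b:.3f} X")
print(f"estimated sales for 40 contacts = {predicted:.2f} ($ thousand)")
```

The slope is read as the estimated change in sales, in thousands of dollars, for each additional client contact.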
Problem
Given data
X    8   12    5    8   15   10
Y   10    6   12   10    7    9
1. Draw a scatter diagram.
2. Calculate the correlation coefficient and interpret it.
3. Find the regression coefficient and interpret it.
4. Determine the value of Y when X = 11 and when X = 18.
Coefficient of Determination
The coefficient of determination (\( r^2 \)) is the proportion of the total variation in the dependent variable (Y) that is explained or accounted for by the variation in the independent variable (X).
It is the square of the coefficient of correlation.
It ranges from 0 to 1.
It does not give any information on the direction of the relationship between the variables.
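The "proportion of variation explained" reading can be checked against the square of r. The sketch below uses the contacts and sales data from the Mr. A example and compares 1 minus the ratio of unexplained to total variation with r squared; the variable names are assumptions.

```python
import math

contacts = [14, 12, 20, 16, 46, 23]   # X
sales = [24, 14, 28, 30, 80, 30]      # Y, $ thousand
n = len(contacts)

x_bar = sum(contacts) / n
y_bar = sum(sales) / n
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(contacts, sales))
sxx = sum((x - x_bar) ** 2 for x in contacts)
syy = sum((y - y_bar) ** 2 for y in sales)

b = sxy / sxx                 # least squares slope
a = y_bar - b * x_bar         # least squares intercept
r = sxy / math.sqrt(sxx * syy)

total_variation = syy
unexplained = sum((y - (a + b * x)) ** 2 for x, y in zip(contacts, sales))
explained_share = 1 - unexplained / total_variation

print(f"r = {r:.4f}, r squared = {r**2:.4f}")
print(f"share of variation in Y explained by X = {explained_share:.4f}")
```

The two numbers agree, and because squaring removes the sign, the result says nothing about whether the relationship is positive or negative.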