measures of association relationship
Post on 13-Jul-2015
CONTENTS
BIVARIATE ANALYSIS
CONCEPT OF RELATIONSHIP
CROSS TABULATION & PERCENTAGE DIFFERENCE
PROPORTIONAL ERROR REDUCTION
MULTIVARIATE ANALYSIS
BIVARIATE ANALYSIS
Bivariate analysis means analysis of
two variables simultaneously.
Brand preference for LEVIS by age
         YOUTH       ADULT
YES      140 (70%)   20 (10%)
NO       60 (30%)    180 (90%)
TOTAL    200         200
CONCEPT OF RELATIONSHIP
Association means the relationship
between two variables under study.
It states how far certain categories of
variable x go together with certain
categories of variable y.
This is called the principle of covariation,
and it is the basic notion of
association.
E.g., demand and price, crop yield and
fertilizer input, etc.
Questions to be considered:
Is there any relationship between the
variables under study?
If so, what are the direction and
degree of the relationship?
Is the relationship a causal one?
Is the relationship statistically
significant?
Measures of association
The most important measures of association are:
1. Association coefficients
2. Cross tabulation and percentage difference
3. Correlation coefficient
4. Regression analysis
Association coefficients
1. Lambda
2. Goodman and Kruskal's tau
3. Gamma
4. Kendall's tau
5. Somers’ d
Cross tabulation and percentage difference
It is used for measuring association
between nominal level variables.
Nominal level variables are purely
qualitative and can only be
categorized.
A. PREFERENCE OF BRAND X AND AGE
BRAND PREFERENCE    AGE
                    YOUTH       ADULT       TOTAL
YES                 120 (60%)   120 (60%)   240
NO                  80 (40%)    80 (40%)    160
A two-way table is prepared (table A).
The values of one variable are put
along one side of the table and the
values of the other variable are put
along the other side. Each
variable is divided into two or
more categories, and the values are cross
tabulated for those sub categories.
In table A the variables brand
preference and age are independent,
i.e., not associated, because
proportionately as many youths as
adults prefer brand X (60% in each group).
B. PREFERENCE OF BRAND Y & AGE
BRAND PREFERENCE    AGE
                    YOUTH       ADULT       TOTAL
YES                 140 (70%)   20 (10%)    160
NO                  60 (30%)    180 (90%)   240
TOTAL               200         200         400
In table B we find that the
variables are strongly associated,
because a much higher proportion of youths
than adults prefer brand Y (70% against 10%).
Therefore we can say that the variables
age and brand preference are
associated.
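The comparison of tables A and B can be sketched in a few lines of Python. This is only an illustration: the function names are made up for this example, and the percentage difference is taken on the YES row between the YOUTH and ADULT columns.

```python
def column_percentages(table):
    """Return each cell as a percentage of its column total."""
    col_totals = [sum(col) for col in zip(*table)]
    return [[100.0 * cell / total for cell, total in zip(row, col_totals)]
            for row in table]


def percentage_difference(table):
    """Difference between the two columns in the first row's percentage."""
    pct = column_percentages(table)
    return pct[0][0] - pct[0][1]


# Rows: YES, NO. Columns: YOUTH, ADULT. Counts are taken from tables A and B.
table_a = [[120, 120], [80, 80]]
table_b = [[140, 20], [60, 180]]

diff_a = percentage_difference(table_a)  # no difference: variables independent
diff_b = percentage_difference(table_b)  # large difference: variables associated
print(diff_a, diff_b)
```

A difference of zero percentage points corresponds to table A (no association), and a difference of 60 points corresponds to table B.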
Direction of relationship
The direction of association may be of two
types:
1. Positive relationship
2. Negative relationship
When one variable increases and the other variable also increases, a direct or positive relationship exists between the variables.
E.g., children's age and weight, fertilizer input and crop yield.
When one variable increases and the other variable decreases, an indirect or negative relationship exists.
E.g., price and demand, socio-economic status and family size.
Strength of relationship
The strength of a relationship is determined
by the pattern of differences between
the values of the variables.
If there are marked percentage
differences between the
categories of the variables, the
relationship between them is strong.
If the difference is slight, the
relationship is weak.
Table B: strong relationship (table A shows no relationship).
Statistical significance
Statistical significance is determined
by using an appropriate test of
significance.
For a given sample size, the stronger the
relationship, the more likely it is to be significant.
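The slides do not name a particular test. As one possible choice for a 2 x 2 cross tabulation, the sketch below computes Pearson's chi-square statistic (without continuity correction) for tables A and B and compares it with 3.841, the 5% critical value for one degree of freedom. The function name is an assumption made for this example.

```python
def chi_square_2x2(table):
    """Pearson chi-square statistic for a 2 x 2 table of counts."""
    (a, b), (c, d) = table
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator


# 5% critical value of the chi-square distribution with 1 degree of freedom.
CRITICAL_5_PERCENT_DF1 = 3.841

# Rows: YES, NO. Columns: YOUTH, ADULT.
chi_a = chi_square_2x2([[120, 120], [80, 80]])   # table A
chi_b = chi_square_2x2([[140, 20], [60, 180]])   # table B

for name, chi in (("A", chi_a), ("B", chi_b)):
    verdict = "significant" if chi > CRITICAL_5_PERCENT_DF1 else "not significant"
    print(f"table {name}: chi-square = {chi:.1f} ({verdict} at the 5% level)")
```

Under these assumptions table A gives a statistic of 0 and table B gives 150, so only the association in table B is significant.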
PREDICTION OR PROPORTIONAL REDUCTION OF ERROR (PRE)
Loan status        Number
Delinquent         30
Non delinquent     20
Total              50
The best prediction that can be made is based on the mode.
Here the mode is delinquent.
By assuming that all are delinquent,
our prediction will result in 20 errors.
Error rate = 20/50 x 100 = 40%
Loan repayment (x)   Membership (y)
                     Member   Non member   Total
Delinquent           22       8            30
Non delinquent       4        16           20
Total                26       24           50
Here we use the within-category
modes of the independent variable (membership).
Within the member category the
mode is delinquent; if we guess that all
members are delinquent, then:
Error rate = 4/26 x 100 = 15%
Within the non-member category the mode
is non delinquent; assuming all
non-members are non delinquent:
Error rate = 8/24 x 100 = 33%
Proportional error reduction:
Total errors = 4 + 8 = 12
Error rate = (4+8)/(26+24) = 24%
We have reduced the error rate from 40% to 24%; the proportion of the original error removed is known as PRE.
PRE = (E1 - E2) / E1 = (20 - 12) / 20 = 0.40
E1 - original number of errors, before employing the independent variable
E2 - new number of errors, after employing the independent variable as predictor
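The loan example can be reproduced with a short sketch. The function name and the table layout (rows for the dependent variable, columns for the independent variable) are choices made for this illustration. Computed with modal predictions in this way, the PRE measure corresponds to the lambda coefficient listed earlier.

```python
def proportional_reduction_of_error(table):
    """Return (E1, E2, PRE) for a table of counts.

    Rows are categories of the dependent variable,
    columns are categories of the independent variable.
    """
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    # E1: errors when every case is predicted to be in the overall mode.
    e1 = n - max(row_totals)
    # E2: errors when each column is predicted by its own mode.
    e2 = sum(sum(col) - max(col) for col in zip(*table))
    return e1, e2, (e1 - e2) / e1


# Rows: delinquent, non delinquent. Columns: member, non member.
loan_table = [[22, 8], [4, 16]]

e1, e2, pre = proportional_reduction_of_error(loan_table)
print(e1, e2, pre)
```

With the counts from the loan table this gives E1 = 20, E2 = 12 and PRE = 0.4, i.e. knowing membership removes 40% of the prediction errors.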
Rules of PRE:
PR1 - to predict the dependent variable,
use its own mode.
PR2 - to predict the dependent
variable, use the within-category modes of
the independent variable.
MULTIVARIATE ANALYSIS
Analysis of multiple variables of a
phenomenon is called multivariate
analysis.
It involves simultaneous analysis of
more than two variables.
It provides more complete explanations
of complex phenomena and permits
assessing causal relationships through
statistical control.
WHY MULTIVARIATE?
Bivariate measures of
relationship have only the limited function
of establishing covariation and its
direction.
Often the relationship between two
variables is affected by a third,
unrevealed variable.
Sometimes the phenomenon under
study cannot be explained through
bivariate analysis.
THE CONCEPT OF CONTROL
A correlation between an independent and a dependent variable is not a sufficient basis for inferring a causal relationship between them.
The relationship may be produced by a third variable that is the cause of both the independent and the dependent variable.
Only by eliminating such effects can the original bivariate association be validated. This can be achieved through CONTROL.
CONTROL: In social science research this can be
achieved through:
1. CROSS TABULATION
2. PARTIAL CORRELATION
3. MULTIPLE CORRELATION
4. MULTIPLE REGRESSION
5. FACTOR ANALYSIS
CROSS TABULATION
First we establish the relationship between the independent and dependent variables through bivariate analysis.
We select a third variable that is associated with the independent variable and use it as a control variable.
Then the sample is subdivided into sub groups according to the categories of the control variable.
We may find an entirely different result:
1. Disappearance / weakening of the original relationship.
2. A new relationship.
3. A strong relationship under one condition and not under another.
Example:
A. POLITICAL AWARENESS AND PLACE OF LIVING
POLITICAL AWARENESS    LOCATION
                       URBAN        RURAL
HIGH                   200 (50%)    140 (28%)
LOW                    200 (50%)    360 (72%)
TOTAL                  400 (100%)   500 (100%)
B. PLACE OF LIVING BY EDUCATIONAL LEVEL
PLACE OF LIVING        EDUCATIONAL LEVEL
                       HIGH         LOW
URBAN                  300 (75%)    100 (20%)
RURAL                  100 (25%)    400 (80%)
TOTAL                  400 (100%)   500 (100%)
C. POLITICAL AWARENESS BY EDUCATIONAL LEVEL
POLITICAL AWARENESS    EDUCATIONAL LEVEL
                       HIGH         LOW
HIGH                   240 (60%)    100 (20%)
LOW                    160 (40%)    400 (80%)
TOTAL                  400 (100%)   500 (100%)
D. POLITICAL AWARENESS BY PLACE OF LIVING, CONTROLLING FOR EDUCATION LEVEL
POLITICAL    HIGH EDUCATION             LOW EDUCATION              TOTAL
AWARENESS    URBAN        RURAL         URBAN        RURAL
HIGH         180 (60%)    60 (60%)      20 (20%)     80 (20%)      340
LOW          120 (40%)    40 (40%)      80 (80%)     320 (80%)     560
TOTAL        300 (100%)   100 (100%)    100 (100%)   400 (100%)    900
Findings:
Educational level determines both
political awareness and place of living:
people who are educated tend
to live in urban areas and are
more politically aware.
Within each education level, urban and rural
residents are equally aware (60% against 60%, 20% against 20%).
There is no inherent link between
political awareness and place of living,
and the relationship between them is
spurious.
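The effect of the control variable can be checked numerically. The sketch below uses the counts from table D; the dictionary layout and function names are assumptions made for this illustration. It compares the urban-rural difference in the percentage of highly aware respondents before and after splitting the sample by education level.

```python
# Counts from table D: (high awareness, low awareness) for each location,
# split by the control variable (education level).
table_d = {
    "high education": {"urban": (180, 120), "rural": (60, 40)},
    "low education": {"urban": (20, 80), "rural": (80, 320)},
}


def percent_high(counts):
    """Percentage of respondents with high political awareness."""
    high, low = counts
    return 100.0 * high / (high + low)


def urban_rural_difference(groups):
    """Urban minus rural percentage of highly aware respondents."""
    return percent_high(groups["urban"]) - percent_high(groups["rural"])


# Pool the two education levels to recover the original bivariate table A.
pooled = {
    location: tuple(
        sum(table_d[level][location][i] for level in table_d) for i in range(2)
    )
    for location in ("urban", "rural")
}

original_difference = urban_rural_difference(pooled)
controlled_differences = {
    level: urban_rural_difference(groups) for level, groups in table_d.items()
}
print(original_difference, controlled_differences)
```

The pooled table shows a 22-point gap between urban and rural respondents, while the gap is zero inside each education group, which is the pattern the findings describe as a spurious relationship.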
EVALUATION:
It can be used at all levels of measurement.
It is a tedious process.
It necessitates subdividing the sample
into sub categories.
Its validity and reliability are questionable.
PARTIAL CORRELATION
It is a statistical method designed to
measure the relationship between an
independent variable and a dependent
variable while holding all other variables
constant.
It cancels out the effect of the control
variable on the dependent and
independent variables and thus shows
the unmarred direct association
between them.
A partial correlation with one
control variable is known as
first order correlation, and with
two it is known as second
order correlation, and so on.
Interpretation
It ranges from -1.00 to +1.00.
If the value of the partial correlation does not
differ much from the zero-order correlation
between the independent and dependent
variables, the original association between
them may be real. If, on the other hand,
the partial correlation is far below the
original zero-order correlation, the
original association is likely to be
spurious.
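As an illustration, the first-order partial correlation can be computed from the three zero-order correlations with the standard formula r_xy.z = (r_xy - r_xz * r_yz) / sqrt((1 - r_xz^2)(1 - r_yz^2)). The sketch below applies it to the political awareness example, using the phi coefficient (the Pearson correlation of two dichotomous variables) as the zero-order correlation for each 2 x 2 table. Using phi here is an assumption of this example, not something the slides specify, and the function names are illustrative.

```python
from math import sqrt


def phi(table):
    """Phi coefficient (Pearson r for two dichotomies) of a 2 x 2 table."""
    (a, b), (c, d) = table
    return (a * d - b * c) / sqrt((a + b) * (c + d) * (a + c) * (b + d))


def partial_correlation(r_xy, r_xz, r_yz):
    """First-order partial correlation of x and y, controlling for z."""
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# x = place of living, y = political awareness, z = educational level.
r_xy = phi([[200, 140], [200, 360]])  # awareness by location (table A)
r_xz = phi([[300, 100], [100, 400]])  # location by education (table B)
r_yz = phi([[240, 100], [160, 400]])  # awareness by education (table C)

r_xy_z = partial_correlation(r_xy, r_xz, r_yz)
print(round(r_xy, 3), round(r_xz, 3), round(r_yz, 3), round(r_xy_z, 3))
```

Here the zero-order correlation between awareness and location is about 0.23, but the partial correlation controlling for education is essentially zero, matching the interpretation rule for a spurious association.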
MULTIPLE CORRELATION
Multiple correlation (R)
shows the combined effect of
two or more independent
variables on the dependent
variable.
The value of R² is termed the
coefficient of multiple
determination.
Interpretation
R² ranges from 0 to +1.00.
The value 1.00 shows that the
independent variables perfectly predict
the dependent variable.
The value zero indicates that
there is no linear relationship.
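For the special case of two independent variables, R² can be obtained from the three pairwise correlations with the standard formula R² = (r_y1² + r_y2² - 2 r_y1 r_y2 r_12) / (1 - r_12²). The sketch below is a minimal illustration; the function name and the correlation values are hypothetical and are not taken from the slides.

```python
from math import sqrt


def multiple_r_squared(r_y1, r_y2, r_12):
    """Coefficient of multiple determination for two predictors.

    r_y1, r_y2: correlations of each predictor with the dependent variable.
    r_12: correlation between the two predictors (must not be +1 or -1).
    """
    return (r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12 ** 2)


# Hypothetical values: two uncorrelated predictors, each correlating 0.5
# with the dependent variable.
r_squared = multiple_r_squared(0.5, 0.5, 0.0)
multiple_r = sqrt(r_squared)
print(r_squared, round(multiple_r, 3))
```

When the predictors are uncorrelated, as in this hypothetical case, R² is simply the sum of the two squared correlations (0.25 + 0.25 = 0.5).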