Measurement, Reliability and Validity
TRANSCRIPT
© 2011 Pearson Prentice Hall, Salkind.
Explain why measurement is important to the research process.
Discuss the four levels of measurement and provide an example of each.
Explain the concept of reliability in terms of observed score, true score, and error.
Describe the two elements that can make up an error score.
List methods for increasing reliability.
Discuss four ways in which reliability can be examined.
Provide a conceptual definition of validity.
List the three traditional types of validity.
Explain the relationship between reliability and validity.
The Measurement Process
Levels of Measurement
Reliability and Validity: Why They Are Very, Very Important
Validity
The Relationship Between Reliability and Validity
Closing (and Very Important) Thoughts
Two definitions
◦ Stevens: “assignment of numerals to objects or events according to rules.”
◦ “…the assignment of values to outcomes.”
Chapter foci
◦ Levels of measurement
◦ Reliability and validity
Variables are measured at one of these four levels.
Qualities of one level are characteristic of the next level up.
The more precise (higher) the level of measurement, the more accurate the measurement process.
Level of Measurement | For Example | Quality of Level
Ratio | Rachael is 5’ 10” and Gregory is 5’ 5” | Absolute zero
Interval | Rachael is 5” taller than Gregory | An inch is an inch is an inch
Ordinal | Rachael is taller than Gregory | Greater than
Nominal | Rachael is tall and Gregory is short | Different from
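The table above can be illustrated with a short sketch. This is purely illustrative (the heights are hypothetical): the same pair of measurements supports different statements at each level of measurement.

```python
# Illustrative sketch: what each level of measurement lets you claim.
rachael, gregory = 70, 65  # heights in inches; inches form a ratio scale

# Ratio: ratios are meaningful because zero inches means "no height".
ratio = rachael / gregory            # about 1.08
# Interval: differences are meaningful ("an inch is an inch is an inch").
difference = rachael - gregory       # 5 -- Rachael is 5 inches taller
# Ordinal: only order is claimed.
taller = rachael > gregory           # True -- Rachael is taller
# Nominal: only category labels ("different from"); the 68-inch cutoff is arbitrary.
labels = ("tall" if rachael >= 68 else "short",
          "tall" if gregory >= 68 else "short")
print(ratio, difference, taller, labels)
```

Each statement uses strictly less information than the one above it, which is why qualities of one level carry up to the next.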
Nominal level
◦ Quality: Assignment of labels
◦ Examples: Gender (male or female); Preference (like or dislike); Voting record (for or against)
◦ What you can say: Each observation belongs in its own category
◦ What you can’t say: An observation represents “more” or “less” than another observation
Ordinal level
◦ Quality: Assignment of values along some underlying dimension
◦ Examples: Rank in college; Order of finishing a race
◦ What you can say: One observation is ranked above or below another
◦ What you can’t say: The amount that one observation is more or less than another
Interval level
◦ Quality: Equal distances between points
◦ Examples: Number of words spelled correctly; Intelligence test scores; Temperature
◦ What you can say: One score differs from another on some measure that has equal-appearing intervals
◦ What you can’t say: The amount of difference is an exact representation of differences in the variable being studied
Ratio level
◦ Quality: A meaningful and non-arbitrary zero
◦ Examples: Age; Weight; Time
◦ What you can say: One value is twice as much as another, or that no quantity of the variable can exist
◦ What you can’t say: Not much!
Continuous variables
◦ Values can range along a continuum
◦ E.g., height
Discrete (categorical) variables
◦ Values are defined by category boundaries
◦ E.g., gender
Measurement should be as precise as possible.
In psychology, most variables are probably measured at the nominal or ordinal level.
But how a variable is measured can determine the level of precision.
Reliability: the tool is consistent.
Validity: the tool measures what it should.
Good assessment tools allow
◦ Rejection of null hypotheses, OR
◦ Acceptance of research hypotheses
Observed Score = True Score + Error Score
Error Score = Method Error + Trait Error
Observed score
◦ The score actually observed
◦ Consists of two components: a true score and an error score
True score
◦ A perfect reflection of the true value for an individual
◦ A theoretical score
Error score
◦ The difference between the observed score and the true score
Method error is due to characteristics of the test or testing situation.
Trait error is due to characteristics of the individual.
Conceptually,

Reliability = True Score / (True Score + Error Score)

Reliability of the observed score becomes higher as error is reduced!
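The observed = true + error model above can be simulated to show where the reliability ratio comes from. This is a minimal sketch with assumed, hypothetical score distributions (true scores with SD 15, error with SD 5), not data from the text.

```python
import random

# Classical test theory sketch: observed = true + error.
# Reliability is the share of observed-score variance due to true-score variance.
random.seed(1)
n = 10_000
true_scores = [random.gauss(100, 15) for _ in range(n)]  # hypothetical true abilities
errors = [random.gauss(0, 5) for _ in range(n)]          # method + trait error
observed = [t + e for t, e in zip(true_scores, errors)]

def variance(xs):
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

# Reliability = true-score variance / (true-score variance + error variance)
reliability = variance(true_scores) / (variance(true_scores) + variance(errors))
print(round(reliability, 2))  # in expectation 15**2 / (15**2 + 5**2) = 0.9
```

Shrinking the error SD in the sketch pushes the ratio toward 1.0, which is exactly the sense in which reducing error raises reliability.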
Ways to increase reliability:
◦ Increase sample size
◦ Eliminate unclear questions
◦ Standardize testing conditions
◦ Moderate the degree of difficulty of the tests
◦ Minimize the effects of external events
◦ Standardize instructions
◦ Maintain consistent scoring procedures
Reliability is measured using a correlation coefficient: rtest1•test2
Reliability coefficients
◦ Indicate how scores on one test change relative to scores on a second test
◦ Can range from -1.00 to +1.00
◦ +1.00 = perfect reliability; 0.00 = no reliability
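The reliability coefficient is simply a Pearson correlation between two sets of scores. The sketch below computes it from scratch for two hypothetical administrations of the same test to the same group (the score lists are invented for illustration).

```python
import math

# Hypothetical scores: same eight participants tested twice.
test1 = [88, 92, 75, 60, 84, 70, 95, 66]
test2 = [85, 90, 78, 62, 80, 72, 96, 64]

def pearson_r(x, y):
    """Pearson correlation coefficient: covariance over product of SDs."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

r = pearson_r(test1, test2)
print(round(r, 2))  # close to +1.00, so this test would be highly reliable
```

Because scores rise and fall together across the two administrations, r lands near +1.00; unrelated score lists would give a coefficient near 0.00.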
Type of Reliability | What It Is | How You Do It | What the Reliability Coefficient Looks Like
Test-Retest | A measure of stability | Administer the same test/measure at two different times to the same group of participants | rtest1•test2
Parallel Forms | A measure of equivalence | Administer two different forms of the same test to the same group of participants | rform1•form2
Inter-Rater | A measure of agreement | Have two raters rate behaviors and then determine the amount of agreement between them | Percentage of agreements
Internal Consistency | A measure of how consistently each item measures the same underlying construct | Correlate performance on each item with overall performance across participants | Cronbach’s alpha; Kuder-Richardson
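Of the coefficients in the table, Cronbach’s alpha is the least obvious to compute, so here is a from-scratch sketch using invented item-response data (rows are participants, columns are items on a 1–5 scale).

```python
# Cronbach's alpha for internal consistency (hypothetical data).
scores = [
    [4, 5, 4, 5],
    [3, 3, 4, 3],
    [5, 5, 5, 4],
    [2, 2, 3, 2],
    [4, 4, 4, 5],
]

def variance(xs):
    """Sample variance (n - 1 denominator)."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

k = len(scores[0])                                    # number of items
item_vars = [variance([row[i] for row in scores]) for i in range(k)]
total_var = variance([sum(row) for row in scores])    # variance of total scores
alpha = (k / (k - 1)) * (1 - sum(item_vars) / total_var)
print(round(alpha, 2))  # 0.93 for this data
```

When items track the same underlying construct, total-score variance greatly exceeds the sum of item variances and alpha approaches 1.0; here it comes out at 0.93, which would usually be read as high internal consistency.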
A valid test does what it was designed to do; it measures what it was designed to measure.
Validity refers to the test’s results, not to the test itself
Validity ranges from low to high; it is not “either/or.”
Validity must be interpreted within the testing context
Type of Validity | What Is It? | How Do You Establish It?
Content | A measure of how well the items represent the entire universe of items | Ask an expert whether the items assess what you want them to
Criterion (Concurrent) | A measure of how well a test estimates a criterion | Select a criterion and correlate scores on the test with scores on the criterion in the present
Criterion (Predictive) | A measure of how well a test predicts a criterion | Select a criterion and correlate scores on the test with scores on the criterion in the future
Construct | A measure of how well a test assesses some underlying construct | Assess the underlying construct on which the test is based and correlate these scores with the test scores
Ways to establish construct validity:
◦ Correlate the new test with an established test
◦ Show that people with and without certain traits score differently
◦ Determine whether the tasks required on the test are consistent with the theory guiding test development
Convergent validity: different methods of measuring the same trait yield similar results.
Discriminant validity: measures of different traits yield different results.
[Multitrait-multimethod matrix: Trait 1 (Impulsivity) and Trait 2 (Activity Level), each measured by Method 1 (Paper and Pencil) and Method 2 (Activity Level Monitor). Correlations between different methods measuring the same trait are moderate; correlations between different traits are low.]
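The convergent/discriminant pattern in the matrix can be checked mechanically. The sketch below uses invented correlation values (labels and numbers are hypothetical, chosen to match the slide’s moderate/low pattern).

```python
# Hypothetical multitrait-multimethod correlations: two traits
# (impulsivity, activity) each measured by two methods (paper, monitor).
corr = {
    ("impulsivity_paper", "impulsivity_monitor"): 0.55,  # same trait, different methods
    ("activity_paper",    "activity_monitor"):    0.60,  # same trait, different methods
    ("impulsivity_paper", "activity_paper"):      0.15,  # different traits
    ("impulsivity_paper", "activity_monitor"):    0.10,  # different traits
}

# Split each key on the trait prefix (the part before "_").
convergent = [r for (a, b), r in corr.items() if a.split("_")[0] == b.split("_")[0]]
discriminant = [r for (a, b), r in corr.items() if a.split("_")[0] != b.split("_")[0]]

# Evidence for construct validity: same-trait correlations should exceed
# different-trait correlations.
print(min(convergent) > max(discriminant))  # True for this illustrative matrix
```

If a same-trait correlation dipped below a different-trait one, the test would show weak convergent or weak discriminant validity for that trait.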
A valid test must be reliable, but a reliable test need not be valid.
You must define a reliable and valid dependent variable; otherwise you cannot know whether a finding of no difference between groups is real!
Use a test with established and acceptable levels of reliability and validity.
If you cannot do this, develop such a test for your thesis or dissertation (and do no more than that) OR change what you are measuring.
Explain why measurement is important to the research process.
Discuss the four levels of measurement and provide an example of each.
Explain the concept of reliability in terms of observed score, true score, and error.
Describe the two elements that can make up an error score.
List methods for increasing reliability.
Discuss four ways in which reliability can be examined.
Provide a conceptual definition of validity.
List the three traditional types of validity.
Explain the relationship between reliability and validity.