d8 and d9 personality test development 10 2007-posting
Personality Test Development
Introduction to Clinical Psychology
Discussion Section #8 and #9
Personality Test Construction
Goals:
Gain an increased understanding of the concepts of reliability and validity as they pertain to tests
Gain an increased understanding of test development methods
Test Construction Procedure
1. Identify a need for a new test
2. Assemble an item pool (decide on scale and item formats)
3. Pilot item pool
4. Select “good” items
5. Examine test’s psychometric properties (reliability and validity)
1. Identify Need for a New Test
What is the objective of the new test? Is there really a need for it?
How will the test be administered?
What is the ideal item format for this test?
Should more than one form be developed?
What special training will be required of test users in terms of administering or interpreting the test?
2. Assemble Item Pool
Two decisions: content and format
Content
Develop a pool of items that fully measure the construct
Example: Depression
What items should be included in the pool?
Format
Dichotomous (true/false)
Polychotomous (multiple choice)
Likert scales (degree of agreement)
…many others
3. Pilot Item Pool
Try the pool of items out on people for whom the test is being developed
The pilot should be administered under conditions similar to those under which the finished test will be administered (e.g. same instructions, time frame, time limits)
4. Select “Good” Items
Selecting “good” items involves complex statistical analysis of the pilot results, called item analysis. The specific analysis varies according to the purpose of the test.
However, in tests of attitudes or personality characteristics one consideration is whether individuals endorse the full range of the scale provided.
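One way to check the "full range" consideration is to list, for each item, the scale points that no pilot respondent chose. The sketch below is only an illustration: the item names, the 1 to 5 Likert scale, and the responses are all made up, and a real item analysis would look at much more than this.

```python
# Hypothetical pilot data: each item maps to the responses it received
# on a 1-5 Likert scale (1 = strongly disagree, 5 = strongly agree).
SCALE_POINTS = range(1, 6)

pilot_responses = {
    "item_1": [1, 2, 3, 4, 5, 3, 2, 4],
    "item_2": [4, 5, 5, 4, 5, 5, 4, 5],  # everyone agrees: little spread
    "item_3": [1, 1, 2, 5, 4, 3, 3, 2],
}


def unused_scale_points(responses, scale_points=SCALE_POINTS):
    """Return the scale points that no respondent chose for an item."""
    used = set(responses)
    return [point for point in scale_points if point not in used]


# Items with unused scale points are candidates for a closer look.
flagged = {}
for item, responses in pilot_responses.items():
    missing = unused_scale_points(responses)
    if missing:
        flagged[item] = missing

for item, missing in flagged.items():
    print(f"{item}: no one chose {missing}")
```

Here only item_2 is flagged, because every respondent chose 4 or 5. Whether such an item should be dropped or reworded is a judgment call that depends on the purpose of the test.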
5. Examine Test’s Psychometric Properties
Does the test yield consistent results (reliability)?
Do the test items measure the intended construct (validity)?
Test Construction Exercise: Part 1
Develop a test that distinguishes first and later born children
Test Construction Exercise: Procedure
Divide into groups of 4 to 5 students
In class:
As a group, develop an item to distinguish first born from later born children
Note: use a personality construct and not a physical characteristic (e.g. avoid items like “I have no older siblings”)
Develop two responses for the item
Once your item is ready, tell Sara or Eunyoe so they can write it on the board (so others won’t give the same item)
Administer Test
Item | % First Born Agree | % Later Born Agree
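The table above can be filled in by computing, for each item, the percentage of first born and later born students who agreed. The following is a minimal sketch with invented class data; the record layout and item names are assumptions made for illustration.

```python
# Hypothetical class data: one record per student, with birth order and
# whether the student agreed (True) or disagreed (False) with each item.
students = [
    {"first_born": True,  "answers": {"item_1": True,  "item_2": True}},
    {"first_born": True,  "answers": {"item_1": True,  "item_2": False}},
    {"first_born": True,  "answers": {"item_1": True,  "item_2": True}},
    {"first_born": True,  "answers": {"item_1": False, "item_2": False}},
    {"first_born": False, "answers": {"item_1": False, "item_2": True}},
    {"first_born": False, "answers": {"item_1": False, "item_2": False}},
    {"first_born": False, "answers": {"item_1": True,  "item_2": True}},
    {"first_born": False, "answers": {"item_1": False, "item_2": False}},
]


def percent_agree(records, item):
    """Percentage of the given records that agreed with the item."""
    agreed = sum(1 for record in records if record["answers"][item])
    return 100 * agreed / len(records)


first_born = [s for s in students if s["first_born"]]
later_born = [s for s in students if not s["first_born"]]

# For each item: (% first born agree, % later born agree, difference).
table = {}
for item in ["item_1", "item_2"]:
    first = percent_agree(first_born, item)
    later = percent_agree(later_born, item)
    table[item] = (first, later, first - later)
    print(f"{item}: first born {first:.0f}%, later born {later:.0f}%, "
          f"difference {first - later:+.0f} points")
```

In this made-up data item_1 separates the two groups (75% versus 25%), while item_2 does not (50% in both), so item_1 would be the more promising candidate for the final test.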
Administer Final Test and Score!
Psychometric Properties of Tests
Reliability and Validity
Reliability
Consistency of the observations or measurements
Reliability is inversely related to the degree of error in the instrument.
High measurement error translates to low reliability
Low measurement error translates to high reliability
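One common way to formalize this inverse relationship comes from classical test theory. It is not stated on the slide, so treat it as supplementary: an observed score $X$ is modeled as a true score $T$ plus error $E$, with the error assumed to be uncorrelated with the true score.

```latex
X = T + E, \qquad
\sigma_X^2 = \sigma_T^2 + \sigma_E^2, \qquad
\text{reliability} = \frac{\sigma_T^2}{\sigma_X^2} = 1 - \frac{\sigma_E^2}{\sigma_X^2}
```

As the error variance $\sigma_E^2$ grows relative to the observed variance, reliability falls toward 0. As it shrinks, reliability approaches 1.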
What does this mean!?
High measurement error translates to low reliability
Low measurement error translates to high reliability
Easy Example: A broken scale
There will be high measurement error on a broken scale, correct?
How consistent are the weights likely to be on a broken scale?
Is a broken or working scale going to have more error?
Is the broken or working scale going to be more reliable?
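The scale example can be simulated. The sketch below weighs the same hypothetical person 20 times on each scale and compares how much the readings vary. The true weight and the two error sizes are arbitrary assumptions chosen for illustration, and "broken" is modeled simply as large random error.

```python
import random
import statistics

TRUE_WEIGHT_KG = 70.0  # hypothetical true weight of the same person

rng = random.Random(0)  # fixed seed so the sketch is repeatable


def weigh(times, error_sd):
    """Simulate repeated weighings with random error of the given size."""
    return [TRUE_WEIGHT_KG + rng.gauss(0, error_sd) for _ in range(times)]


working_scale = weigh(times=20, error_sd=0.2)  # small measurement error
broken_scale = weigh(times=20, error_sd=8.0)   # large measurement error

# The standard deviation of repeated readings shows how inconsistent
# each scale is: more spread means less reliable.
working_spread = statistics.stdev(working_scale)
broken_spread = statistics.stdev(broken_scale)

print(f"working scale spread: {working_spread:.2f} kg")
print(f"broken scale spread:  {broken_spread:.2f} kg")
```

The broken scale's readings spread out far more than the working scale's, which is what low reliability looks like in practice.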
Types of Measurement Error
Random
Factors unpredictably influence measurements.
Examples:
Mood, environmental distractions, hunger or motivation interfere with the responses.
Systematic
A persistent bias in the test or in the interpretations made by examiner.
Systematic errors, because they are made consistently, will not affect reliability, but they will affect validity
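The claim about systematic error can be illustrated with a constant bias. In this sketch (all scores are invented, and the bias of 5 points is an arbitrary assumption) the same bias is added to two administrations of a test. The correlation between the administrations, a common index of consistency, does not change, but every score moves away from the true value.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of scores."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def mean(values):
    return sum(values) / len(values)


# Hypothetical true scores and two administrations with small random error.
true_scores = [10, 14, 18, 22, 26]
time_1 = [11, 13, 19, 21, 26]
time_2 = [10, 15, 17, 23, 25]

BIAS = 5  # systematic error: every score is inflated by the same amount
biased_time_1 = [score + BIAS for score in time_1]
biased_time_2 = [score + BIAS for score in time_2]

# Consistency between administrations is unchanged by the constant bias...
r_unbiased = pearson_r(time_1, time_2)
r_biased = pearson_r(biased_time_1, biased_time_2)

# ...but the biased scores are, on average, 5 points away from the truth.
mean_error_unbiased = mean(time_1) - mean(true_scores)
mean_error_biased = mean(biased_time_1) - mean(true_scores)

print(f"r without bias: {r_unbiased:.3f}, r with bias: {r_biased:.3f}")
print(f"mean error without bias: {mean_error_unbiased:.1f}, "
      f"with bias: {mean_error_biased:.1f}")
```

A constant shift is the simplest form of systematic error. Other forms, such as an examiner who is biased only toward certain test takers, would need a different model.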
Types of Reliability
Inter-rater reliability (relevant to observational systems and psychological assessments requiring ratings or judgment)
Test-retest reliability
Split-half reliability
Note: Each form of reliability is not equally important for every assessment method
Inter-rater Reliability
Degree of correspondence between two raters
Inter-rater reliability of diagnoses based on DSM criteria improved with DSM-III and the development of operational criteria for most of the mental disorders
Note: We will learn how to calculate it next week!
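The slides do not say which index will be used for the calculation. As an assumption, the sketch below shows two widely used ones: simple percent agreement, and Cohen's kappa, which corrects agreement for the amount expected by chance. The ratings are invented ("D" = depressed, "N" = not depressed).

```python
from collections import Counter

# Hypothetical diagnoses given by two raters to the same ten cases.
rater_a = ["D", "D", "D", "D", "D", "N", "N", "N", "N", "N"]
rater_b = ["D", "D", "D", "D", "N", "N", "N", "N", "N", "D"]


def percent_agreement(a, b):
    """Proportion of cases on which the two raters gave the same rating."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def cohens_kappa(a, b):
    """Cohen's kappa: agreement corrected for chance agreement."""
    n = len(a)
    observed = percent_agreement(a, b)
    counts_a, counts_b = Counter(a), Counter(b)
    # Chance agreement: probability both raters pick the same label
    # if each rated independently at their own base rates.
    expected = sum(
        (counts_a[label] / n) * (counts_b[label] / n)
        for label in set(a) | set(b)
    )
    return (observed - expected) / (1 - expected)


agreement = percent_agreement(rater_a, rater_b)
kappa = cohens_kappa(rater_a, rater_b)

print(f"percent agreement: {agreement:.2f}")
print(f"Cohen's kappa:     {kappa:.2f}")
```

The raters agree on 8 of 10 cases, but because half of that agreement would be expected by chance with these base rates, kappa is lower than the raw agreement.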
Test-Retest Reliability
The consistency of results over periods of time.
The consistency of the results for a test given at two different time periods
The correlation of test result scores
Quantifying Test-Retest Reliability
Reliability is expressed as a correlation coefficient
Values range from 0 (not at all consistent or reliable) to 1 (perfectly consistent and reliable).
The value for adequate reliability is about .80 or greater
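As a worked illustration, test-retest reliability can be computed as the Pearson correlation between scores from two administrations and compared with the .80 rule of thumb. The scores and the two-week interval below are invented for the example.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of scores."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical scores for the same six people, tested two weeks apart.
time_1 = [10, 12, 15, 18, 20, 22]
time_2 = [11, 13, 14, 19, 19, 23]

ADEQUATE = 0.80  # rule-of-thumb threshold for adequate reliability

test_retest_r = pearson_r(time_1, time_2)
is_adequate = test_retest_r >= ADEQUATE

print(f"test-retest r = {test_retest_r:.2f}, adequate: {is_adequate}")
```

People who scored high the first time also scored high the second time, so the correlation is well above .80.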
Factors Affecting Test-Retest Reliability Estimates
Length of the intervening interval
Stability of the measured trait
For example:
In characteristics that are stable, like intelligence, the interval of time between the two tests should not affect the stability of the results.
In contrast, in characteristics that are not stable, like depressed mood, the longer the interval between tests, the less reliable or consistent the scores (which is not necessarily bad).
Split Half Reliability
The consistency of scores on two halves of the test
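A minimal sketch of a split-half calculation follows. It splits a made-up six-item true/false test into odd and even items and correlates the two half scores. The odd/even split and the Spearman-Brown step at the end are common conventions but are assumptions here, since the slide does not specify how the halves are formed or adjusted.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of scores."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical data: one row per respondent, one column per item
# (1 = endorsed, 0 = not endorsed).
responses = [
    [1, 1, 1, 1, 1, 0],
    [1, 0, 1, 1, 0, 0],
    [0, 0, 1, 0, 0, 0],
    [1, 1, 1, 1, 1, 1],
    [0, 0, 0, 0, 0, 0],
    [1, 1, 0, 1, 1, 1],
]

# Score each half separately: items 1, 3, 5 versus items 2, 4, 6.
odd_half = [sum(row[0::2]) for row in responses]
even_half = [sum(row[1::2]) for row in responses]

half_r = pearson_r(odd_half, even_half)

# Spearman-Brown correction: estimates reliability of the full-length
# test from the correlation between its two halves.
full_length_estimate = 2 * half_r / (1 + half_r)

print(f"correlation between halves: {half_r:.2f}")
print(f"Spearman-Brown estimate:    {full_length_estimate:.2f}")
```

The correction is applied because each half is only half as long as the real test, and shorter tests tend to be less reliable.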
Validity
A test can be reliable (consistently give the same results) but not valid.
Why?
If the test does not measure the correct construct, then it is not useful even if the results are consistent.
Validity
The degree to which a test measures what it is designed or intended to measure.
Types of Validity
Face validity
Content validity
Criterion validity (predictive and concurrent)
Discriminant validity
Construct validity
Face Validity
A judgment about the relevance of test items
A type of validity that is more from the perspective of the test taker as opposed to the test user
Example: Personality tests
An Introversion-Extroversion test will be perceived as a highly (face) valid measure of personality functioning
The inkblot test may not be perceived as a (face) valid measure of personality functioning
Content Validity
Degree to which the measure covers the full range of the (personality) construct
and
Degree to which the measure excludes factors that are not representative of the construct
Criterion Validity
The degree to which the test results (from your measure) are correlated with another related construct.
WHAT!?
For example: the degree to which scores on an intelligence test are correlated with school performance or achievement.
Types of Criterion Validity
Concurrent: the two constructs are assessed at the same time
Predictive: one construct may be measured at a later date
For example:
Concurrent: the correlation of SAT score with G.P.A. at the time of taking the SAT in high school.
Predictive: the correlation of SAT score taken in high school with final G.P.A. upon graduating from college.
Discriminant Validity
The degree to which the score on a measure of a personality trait does not correlate with scores on measures of traits that are unrelated to the trait under investigation.
For example (from text):
Trait being measured: phobia
Unrelated trait: intelligence
You would not expect the score on your phobia scale to be correlated with the score on an intelligence test
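The phobia example can be sketched numerically. Below, a made-up phobia scale is correlated with an intelligence test (where a correlation near zero is expected) and, for contrast, with a hypothetical clinician rating of avoidance behavior (a related criterion, where a substantial correlation is expected). All scores and the avoidance measure are invented for illustration.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of scores."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical scores for the same six people on three measures.
phobia_scale = [2, 8, 5, 9, 3, 6]
intelligence_test = [105, 98, 110, 104, 97, 100]  # unrelated trait
avoidance_rating = [1, 7, 4, 9, 3, 5]             # related criterion

# Discriminant validity: expect little or no correlation.
discriminant_r = pearson_r(phobia_scale, intelligence_test)

# Criterion validity: expect a substantial correlation.
criterion_r = pearson_r(phobia_scale, avoidance_rating)

print(f"phobia vs intelligence: r = {discriminant_r:.2f}")
print(f"phobia vs avoidance:    r = {criterion_r:.2f}")
```

With a sample this small the exact values mean little. The point is the pattern: a strong correlation with the related measure and a weak one with the unrelated measure.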
Construct Validity
The degree to which the measure reflects the structure and features of the hypothetical construct that is being measured
Measured by combining all these aspects of validity.
Exercise: Reliability and Validity applied to the Edinburgh Postnatal Depression Scale (EPDS)
Let’s consider reliability and validity in the context of a real measure: the EPDS
What is the Edinburgh Postnatal Depression Scale (EPDS)?
John Cox, Jenifer Holden & Ruth Sagovsky
10 item depression screening tool (reliable and valid)
Simple to complete
Acceptable to mothers and health workers
What is the Edinburgh Postnatal Depression Scale (EPDS)?
Psychometric Characteristics
10 item scale
Assesses mood aspects of depression, not confounding somatic symptoms
Acceptable to women
Validated
Translated into many languages
Stems of all 10 EPDS Items
I have been able to laugh and see the funny side of things.
I have looked forward with enjoyment to things.
I have blamed myself unnecessarily when things went wrong.
I have been anxious or worried for no good reason.
Things have been getting on top of me.
Stems of all 10 EPDS Items (cont)
I have felt scared or panicky for no very good reason.
I have been so unhappy that I have had difficulty sleeping.
I have felt sad or miserable.
I have been so unhappy that I have been crying.
The thought of harming myself has occurred to me.
Psychometric Evaluation of the EPDS: An Exercise
Is the EPDS a good measure of depression?
Psychometrically, what does it mean to ask if the EPDS is a “good” measure of depression?
Note: Follow the questions on the handout
Handout Questions
Test Construction Exercise: Part 2: Evaluating Developed Tests
1. Regroup into your “test groups”
2. Evaluate items in terms of content validity and adequacy of scales
3. Select final items for test
4. Propose methods for evaluating reliability and validity of new measure