test reliability & development using irt - jonathan … reliability & development using irt...
Test Reliability & Development Using IRT
University of Kansas
Item Response Theory
Stats Camp ‘07
Overview
• Reliability with IRT
  – Item and Test Information Functions
    • Concepts
    • Equations
    • Uses and Examples
• Optimal Test Design
Reliability with IRT
• We all know that reliability (precision) is a desirable property for an assessment.
• The more reliable a test is, the more precisely we can measure the construct.
• For any scaling procedure (IRT or CTT), as reliability goes up, the standard error of measurement goes down.
Reliability with IRT
• In CTT, reliability is a one-number summary of test precision, and there is a corresponding single standard error of measurement that is used for any test score.
• In IRT, test precision is conceptualized as something called Information, which is conditional on the trait level being measured.
  – Some tests could measure certain trait levels very well but measure others poorly…
Reliability with IRT
• A further advantage of IRT with respect to evaluating reliability is that we can consider the amount of Information an item and/or a test provides.
• In CTT, measures of item quality exist, but these are only indirectly related to what the reliability of the test will be.
Item Information Function
• “Item Information” indicates an item’s usefulness for assessing ability.
• By “usefulness” we basically mean how good an item is at distinguishing examinees with lower ability levels from those with higher ability levels.
• More Information means more Precision
[Figure: an item characteristic curve, P(u = 1 | θ), and the corresponding item information function, Info(θ), each plotted against Ability (θ) from -3 to 3.]
Item Information Function
• Items are basically more informative where the slope of the ICC is steepest, which happens when…
  – bj is relatively close to θi,
  – aj is relatively high, and
  – cj is relatively low
• If cj = 0, an item provides its maximum information when θi = bj
[Figures: ICCs, P(u = 1 | θ), and item information functions, Info(θ), plotted against Ability (θ) for a = 1.0, c = 0.0, and b = 1.0 or 2.0.]
[Figures: ICCs and item information functions plotted against Ability (θ) for b = -1.0, c = 0.2, and a = 1.0 or 0.5.]
[Figures: ICCs and item information functions plotted against Ability (θ) for a = 1.0, b = 0.0, and c = 0.0 or 0.2.]
Item Information Function
• IMPORTANT: information is a function of θ, which means that an item could be very informative for some ability levels and relatively uninformative for others.
• Example: difficult items are informative for higher ability levels, but don’t tell us much about lower ability levels (because lower-ability examinees get nearly all of those items wrong!)
[Figures: ICCs, P(u = 1 | θ), and item information functions, Info(θ), plotted against Ability (θ) for c = 0.0, a = 1.2 or 0.8, and b = 1.0 or 0.0.]
Item Information Function for the 3-PL
$$I_j(\theta) = \frac{[P_j'(\theta)]^2}{P_j(\theta)\,Q_j(\theta)} = \frac{D^2 a_j^2 (1 - c_j)}{\left[c_j + e^{D a_j(\theta - b_j)}\right]\left[1 + e^{-D a_j(\theta - b_j)}\right]^2}$$
Notes on IIF
• The roles of aj and cj are easy to see
  – as aj increases, information increases
  – as cj increases, information decreases
• As ability moves away from bj (+ or -) the denominator increases, so information approaches zero.
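The notes above can be checked numerically. The following is a minimal sketch, assuming the scaling constant D = 1.7 (the slides only write "D") and hypothetical item parameters; the function names `p_3pl` and `info_3pl` are illustrative, not from the slides.

```python
import math

D = 1.7  # assumed scaling constant; the slides only write "D"

def p_3pl(theta, a, b, c):
    """3-PL probability of a correct response (the ICC)."""
    return c + (1.0 - c) / (1.0 + math.exp(-D * a * (theta - b)))

def info_3pl(theta, a, b, c):
    """3-PL item information, using the closed-form expression."""
    z = D * a * (theta - b)
    return (D**2 * a**2 * (1.0 - c)) / ((c + math.exp(z)) * (1.0 + math.exp(-z)) ** 2)

# Hypothetical items, all evaluated relative to b = 0.
base = info_3pl(0.0, a=1.0, b=0.0, c=0.0)      # reference item at theta = b
steeper = info_3pl(0.0, a=1.5, b=0.0, c=0.0)   # higher a: more information
guessing = info_3pl(0.0, a=1.0, b=0.0, c=0.2)  # higher c: less information
far_away = info_3pl(3.0, a=1.0, b=0.0, c=0.0)  # theta far from b: near zero

print(base, steeper, guessing, far_away)
```

Changing one parameter at a time, as here, mirrors the paired ICC and information plots shown earlier.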
Maximum Information
If cj = 0, then Information is maximized at bj
If cj > 0, then Information is maximized at an ability level slightly greater than bj
$$\theta_{\max} = b_j + \frac{1}{D a_j}\ln\left[0.5\left(1 + \sqrt{1 + 8c_j}\right)\right]$$
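As a sanity check on the location of maximum information, the sketch below compares the formula for θmax with a brute-force grid search over the 3-PL information function. D = 1.7 and the item parameters are assumptions for illustration; `theta_max` and `info_3pl` are hypothetical helper names.

```python
import math

D = 1.7  # assumed scaling constant; the slides only write "D"

def info_3pl(theta, a, b, c):
    """3-PL item information, using the closed-form expression."""
    z = D * a * (theta - b)
    return (D**2 * a**2 * (1.0 - c)) / ((c + math.exp(z)) * (1.0 + math.exp(-z)) ** 2)

def theta_max(a, b, c):
    """Ability level at which a 3-PL item is most informative."""
    return b + math.log(0.5 * (1.0 + math.sqrt(1.0 + 8.0 * c))) / (D * a)

# Hypothetical item with some guessing.
a, b, c = 1.0, 0.0, 0.2

# Brute-force search on a grid from -3 to 3 in steps of 0.001.
grid = [i / 1000.0 for i in range(-3000, 3001)]
grid_best = max(grid, key=lambda t: info_3pl(t, a, b, c))

print(theta_max(a, b, c), grid_best)
```

With cj = 0 the logarithm is ln(1) = 0, so the formula reduces to θmax = bj, matching the first statement above.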
Test Information Function
• Just like we add up ICCs to get a TCC, we add up IIFs to get a TIF.
• Information will continue to increase as we add test items, therefore increasing precision.
• All things equal, longer tests provide increased measurement precision.
Test Information Function
• Defined for a set of items at each point along the ability (θ) scale
• Test information is influenced by the ‘quality’ and the number of test items
$$I(\theta) = \sum_{j=1}^{n} I_j(\theta)$$
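The two sums (ICCs into a TCC, IIFs into a TIF) can be sketched as follows. This assumes the 3-PL model with D = 1.7 and a small set of made-up (a, b, c) triples; these are not the items behind the slides' figures, and the function names are illustrative.

```python
import math

D = 1.7  # assumed scaling constant; the slides only write "D"

def p_3pl(theta, a, b, c):
    """3-PL probability of a correct response (the ICC)."""
    return c + (1.0 - c) / (1.0 + math.exp(-D * a * (theta - b)))

def info_3pl(theta, a, b, c):
    """3-PL item information (the IIF)."""
    z = D * a * (theta - b)
    return (D**2 * a**2 * (1.0 - c)) / ((c + math.exp(z)) * (1.0 + math.exp(-z)) ** 2)

def tcc(theta, items):
    """Expected number-correct score: the sum of the ICCs."""
    return sum(p_3pl(theta, a, b, c) for a, b, c in items)

def tif(theta, items):
    """Test information: the sum of the IIFs."""
    return sum(info_3pl(theta, a, b, c) for a, b, c in items)

# Hypothetical (a, b, c) triples for a three-item test and a longer version.
items = [(1.0, -1.0, 0.2), (0.8, 0.0, 0.0), (1.2, 1.0, 0.0)]
longer = items + [(1.0, 0.0, 0.0)]

for theta in (-2.0, 0.0, 2.0):
    print(theta, round(tcc(theta, items), 3),
          round(tif(theta, items), 3), round(tif(theta, longer), 3))
```

Because every IIF is positive, the longer form has more information than the shorter one at every θ.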
[Figures: item characteristic curves, P(u = 1 | θ); the test characteristic curve, E(X | θ); the item information functions, Info(θ); and the resulting test information function, Info(θ), each plotted against Ability (θ) from -3 to 3.]
Conditional Error for Maximum Likelihood Estimates
• One of the great benefits of IRT scaling is that measurement precision and error can now be considered conditional on θ.
Conditional Error for Maximum Likelihood Estimates
• Standard error of an MLE is determined by:
$$SE(\hat{\theta}) = \frac{1}{\sqrt{I(\hat{\theta})}}$$
Conditional Standard Error
• The imprecision of ability estimation is therefore inversely related to the amount of Information with respect to ability that is available.
• Since Information increases with the quality and number of items, the SE conversely decreases…which hopefully makes some sense!
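One consequence of the square root in the SE formula is that information must quadruple for the SE to halve. A minimal sketch, with a hypothetical helper name and made-up information values:

```python
import math

def conditional_se(information):
    """Standard error of the ability estimate at a given information value."""
    if information <= 0:
        raise ValueError("information must be positive")
    return 1.0 / math.sqrt(information)

# Made-up information values: each step multiplies information by four.
for info in (1.0, 4.0, 16.0):
    print(info, conditional_se(info))
```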
[Figure: 8-item Test Information Function. Info(θ) and SE(θ) plotted against Ability (θ).]
[Figure: Info(θ) and SE(θ) plotted against Ability (θ). Information may be spread across a relatively wide range…]
[Figure: Info(θ) and SE(θ) plotted against Ability (θ). …or maximized around an ability level of interest (e.g., a cutscore).]
Info and SE Example
$$\text{At } \theta = 1.0,\ I(\theta = 1) = 9: \qquad SE(\hat{\theta}) = \frac{1}{\sqrt{I(\hat{\theta})}} = \frac{1}{\sqrt{9}} = 0.33$$
$$\text{If } \hat{\theta} = 1.0,\ SE(\hat{\theta}) = 0.33$$
Info and SE Example
$$\text{At } \theta = 0.0,\ I(\theta = 0) = 3: \qquad SE(\hat{\theta}) = \frac{1}{\sqrt{I(\hat{\theta})}} = \frac{1}{\sqrt{3}} = 0.58$$
$$\text{If } \hat{\theta} = 0.0,\ SE(\hat{\theta}) = 0.58$$
Info and SE Example
$$\text{At } \theta = -1.0,\ I(\theta = -1) = 1: \qquad SE(\hat{\theta}) = \frac{1}{\sqrt{I(\hat{\theta})}} = \frac{1}{\sqrt{1}} = 1.0$$
$$\text{If } \hat{\theta} = -1.0,\ SE(\hat{\theta}) = 1.0$$
95% Confidence Interval
• Because MLEs are asymptotically normally distributed, we create a 95% confidence interval around a point estimate of ability by adding and subtracting 1.96 standard errors:
• Estimate ± 1.96 SE (recall critical values from a standard normal distribution)
[Figure: Standard Normal Distribution, with probability 0.95 in the central region and 0.025 in each tail.]
95% Confidence Interval
• For θ = 1, SE = 0.33 → 1.0 ± 0.65
  – 95% chance that examinee’s true ability is in between 0.35 and 1.65
• For θ = 0, SE = 0.58 → 0.0 ± 1.14
  – 95% chance that examinee’s true ability is in between -1.14 and 1.14
• For θ = -1, SE = 1.0 → -1.0 ± 1.96
  – 95% chance that examinee’s true ability is in between -2.96 and 0.96
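The three intervals above can be reproduced with a few lines. This sketch takes the rounded SE values from the slides as inputs (working from the unrounded 1/√I would shift the second interval slightly); `ci95` is a hypothetical helper name.

```python
def ci95(theta_hat, se, z=1.96):
    """Lower and upper bounds of a 95% confidence interval around an ability estimate."""
    half_width = z * se
    return theta_hat - half_width, theta_hat + half_width

# Ability estimates and rounded standard errors taken from the examples above.
for theta_hat, se in ((1.0, 0.33), (0.0, 0.58), (-1.0, 1.0)):
    low, high = ci95(theta_hat, se)
    print(theta_hat, se, round(low, 2), round(high, 2))
```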
95% Confidence Interval
• As information increases…
  – SE decreases
  – CI becomes narrower
  – Increased trust in ability estimate
• As information decreases…
  – SE increases
  – CI becomes wider
  – Decreased trust in ability estimate
Notes on IIF and TIF
• Note that the contribution of Ij(θ) to I(θ) does not depend on the particular combination of test items.
  – Each item contributes independently
• This is a very big advantage of IRT over CTT: reliability can be described conditionally (as information), and it does not depend on the particular set of items.
Mini-CTT lesson
• In CTT, item discrimination (quality) is the item-total correlation
• This will depend on the item itself, but is also influenced by the other test items.
• Adding items changes the total score, thus changing the correlation.
• Therefore, it’s difficult to anticipate the reliability of a test when creating a form from a bank of previously piloted items, unless those items all appeared together.
CTT versus IRT
• In IRT, item quality is Information, which is affected by aj, bj, cj, and θ.
• An item’s information function will be independent of the other items on the test, as will its contribution to the TIF.
• Adding more and/or better items will increase TIF, but won’t impact any IIF.
• Therefore, it’s easy to anticipate the reliability of a test when creating a form from a bank of previously piloted items.
Excel Spreadsheet Demo
• Show Excel Spreadsheet containing eight items, their ICCs, TCC, IIFs, TIF and SE.
• Specify different item parameters and determine how changes affect the resulting graphs.
Uses of Item and Test Information Functions
1) Providing conditional SE of trait
2) Building a test to meet desired statistical specifications
3) Revising an existing test
4) Comparing tests
Conditional SE
• As previously stated, the precision (reliability) and imprecision (error) of a test scaled with IRT is conditional on θ.
• Tests may be better or worse for measuring certain trait levels
Test Development
• From a pool of previously piloted test items, IRT makes it relatively easy to switch items in and out and determine what the resulting Information function will be.
• This tells the test maker what the conditional standard errors will be, too.
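Switching items in and out might look like the sketch below. It assumes the 3-PL model with D = 1.7 and a small, entirely hypothetical item pool; the item names, parameters, and cutscore are made up for illustration.

```python
import math

D = 1.7  # assumed scaling constant; the slides only write "D"

def info_3pl(theta, a, b, c):
    """3-PL item information (the IIF)."""
    z = D * a * (theta - b)
    return (D**2 * a**2 * (1.0 - c)) / ((c + math.exp(z)) * (1.0 + math.exp(-z)) ** 2)

# Hypothetical pool of previously piloted items: name -> (a, b, c).
pool = {
    "item1": (1.0, -1.0, 0.2),
    "item2": (0.8, 0.0, 0.0),
    "item3": (1.2, 1.0, 0.0),
    "item4": (1.5, 1.0, 0.1),
}

def form_info(theta, names):
    """Test information for a form, given the names of its items."""
    return sum(info_3pl(theta, *pool[name]) for name in names)

cutscore = 1.0                          # hypothetical ability level of interest
original = ["item1", "item2", "item3"]
revised = ["item2", "item3", "item4"]   # item1 switched out, item4 switched in

info_before = form_info(cutscore, original)
info_after = form_info(cutscore, revised)
se_before = 1.0 / math.sqrt(info_before)
se_after = 1.0 / math.sqrt(info_after)

print(round(info_before, 3), round(se_before, 3))
print(round(info_after, 3), round(se_after, 3))
```

Because each item contributes independently, the revised information equals the original minus the removed item's IIF plus the added item's IIF; no re-piloting of the form is needed to predict it.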
Test Development
• Another benefit to test development is that multiple forms may be built to the same statistical specifications.
• This process is often referred to as “Pre-equating.”
• Building strictly parallel forms is always difficult, but these procedures can help.
Test Revision
• Likewise, test items may be removed from previously existing forms (e.g., to create a “short form” of a test).
• Test items may also need to be added if the previous form is found to be unreliable.
• Estimating the new reliability of the test is straightforward with IRT
Test Revision
• In CTT, such test revisions require the assumption that the deleted or added items are of comparable statistical quality to those already on the test.
  – Spearman-Brown prophecy formula
  – This may or may not be true!
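For reference, the Spearman-Brown prophecy formula predicts the reliability of a test whose length is changed by a factor k as kρ / (1 + (k − 1)ρ), where ρ is the current reliability. A minimal sketch with made-up reliabilities; the function name is illustrative.

```python
def spearman_brown(reliability, length_factor):
    """Predicted reliability after changing test length by length_factor.

    Assumes the added or removed items are comparable in quality to the rest.
    """
    return (length_factor * reliability) / (1.0 + (length_factor - 1.0) * reliability)

# Made-up example: a test with reliability 0.80, doubled and then halved.
print(spearman_brown(0.80, 2.0))
print(spearman_brown(0.80, 0.5))
```

The formula takes no item-level input at all, which is exactly the assumption questioned above.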
Comparing Tests
• When comparing the reliability (i.e., precision) of two test forms, it’s useful to determine the ratio of their information with respect to θ.
• This ratio is known as the relative efficiency of a test: RE(θ).
• Consider two previous example TIFs
[Figure: Info(θ) and SE(θ) plotted against Ability (θ). Information targeted around a cutscore. We’ll call this “Form X”.]
[Figure: Info(θ) and SE(θ) plotted against Ability (θ). Information spread across a wide range. We’ll call this “Form Y”.]
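$$RE(\theta) = \frac{I_X(\theta)}{I_Y(\theta)} \;\rightarrow\; \frac{\text{info for form X at } \theta}{\text{info for form Y at } \theta}$$
$$\text{Suppose at } \theta = 1: \quad I_X(\theta) = 9.0, \quad I_Y(\theta) = 3.6$$
$$\text{Then, } RE(\theta = 1) = \frac{9}{3.6} = 2.5$$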
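The same calculation in code, as a small sketch using the information values from the example. Since SE = 1/√I, the square root of RE gives the ratio of the two forms' standard errors at that θ; the function name is hypothetical.

```python
import math

def relative_efficiency(info_x, info_y):
    """Ratio of Form X information to Form Y information at one ability level."""
    if info_y <= 0:
        raise ValueError("information for Form Y must be positive")
    return info_x / info_y

# Information values at theta = 1 from the example above.
re_at_1 = relative_efficiency(9.0, 3.6)

# SE_Y / SE_X at the same theta, since SE = 1 / sqrt(I).
se_ratio = math.sqrt(re_at_1)

print(round(re_at_1, 2), round(se_ratio, 2))
```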
[Figure: Info(θ) plotted against Ability (θ). In the region θ = 1, Form X is 2.5 times more efficient than Form Y.]
[Figure: Info(θ) plotted against Ability (θ). In the region θ ≈ 0.10, Form X is just as efficient as Form Y.]
[Figure: Info(θ) plotted against Ability (θ). In the region θ = -1, Form X is LESS efficient than Form Y: RE(θ) = 0.23.]
[Figure: RE(θ) plotted against Ability (θ). Form X is more efficient than Form Y above the point θ ≈ 0.1.]
[Figure: RE(θ) plotted against Ability (θ). Form Y is more efficient than Form X below the point θ ≈ 0.1.]
Next…
• Test Score Equating using IRT