Inference & Culture, Slide 1, October 21, 2004
Cognitive Diagnosis as Evidentiary Argument
Robert J. Mislevy
Department of Measurement, Statistics, & Evaluation
University of Maryland, College Park, MD
October 21, 2004
Presented at the Fourth Spearman Conference, Philadelphia, PA, Oct. 21-23, 2004.
Thanks to Russell Almond, Charles Davis, Chun-Wei Huang, Sandip Sinharay, Linda Steinberg, Kikumi Tatsuoka, David Williamson, and Duanli Yan.
Introduction
An assessment is a particular kind of evidentiary argument.
Parsing a particular assessment in terms of the elements of an argument provides insights into more visible features such as tasks and statistical models.
Will look at cognitive diagnosis from this perspective.
Toulmin's (1958) structure for arguments
Reasoning flows from data (D) to claim (C) by justification of a warrant (W), which in turn is supported by backing (B). The inference may need to be qualified by alternative explanations (A), which may have rebuttal evidence (R) to support them.
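[Figure: Toulmin diagram. Data (D) lead to the claim (C): D, "so" C. The step holds "since" the warrant (W), which holds "on account of" backing (B). The claim is qualified by "unless" an alternative explanation (A), which rebuttal evidence (R) "supports".]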
Specialization to assessment
The role of psychological theory:
» Nature of claims & data
» Warrant connecting claims and data: “If student were x, would probably do y”
The role of probability-based inference: “Student does y; what is support for x’s?”
Will look first at assessment under behavioral perspective, then see how cognitive diagnosis extends the ideas.
Behaviorist Perspective
The evaluation of the success of instruction and of the student’s learning becomes a matter of placing the student in a sample of situations in which the different learned behaviors may appropriately occur and noting the frequency and accuracy with which they do occur.
D.R. Krathwohl & D.A. Payne, 1971, pp. 17-18.
The claim addresses the expected value of performance of the targeted kind in the targeted situations.
[Figure: Toulmin diagram for the behaviorist argument]
C: Sue's probability of correctly answering a 2-digit subtraction problem with borrowing is p.
since W: Sampling theory machinery for reasoning from the true proportion of correct responses in n targeted situations to observed counts.
unless A: [e.g., observational errors, data errors, misclassification of responses or performance situations, distractions, etc.]
so, from D1j: Sue's answer to Item j, and D2j: structure and contents of Item j (one pair of data nodes per item).
The student data address the salient features of the responses.
The task data address the salient features of the stimulus situations (i.e., tasks).
The warrant encompasses definitions of the class of stimulus situations, response classifications, and sampling theory.
Statistical Modeling of Assessment Data
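[Figure: graphical model. A student-model variable $\theta$ with distribution $p(\theta)$ is the parent of observables $X_1$, $X_2$, $X_3$, with conditional distributions $p(X_1 \mid \theta)$, $p(X_2 \mid \theta)$, $p(X_3 \mid \theta)$.]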
Claims in terms of values of unobservable variables in student model (SM)--characterize student knowledge.
Data modeled as depending probabilistically on SM vars.
Estimate conditional distributions of data given SM vars.
Bayes theorem to infer SM variables given data.
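The chain from conditional distributions to a posterior over student-model variables can be sketched numerically. This is a minimal illustration only: it assumes a single binary student-model variable (labeled `master` / `nonmaster` here, names chosen for the example) and three binary observables, and every probability below is a made-up number rather than an estimate from any data.

```python
# Hypothetical numbers for illustration only.
prior = {"master": 0.5, "nonmaster": 0.5}      # p(theta)
p_correct = {                                   # p(X_k = 1 | theta), k = 1..3
    "master":    [0.9, 0.8, 0.85],
    "nonmaster": [0.2, 0.3, 0.25],
}

def posterior(responses):
    """Bayes theorem: p(theta | x) is proportional to p(theta) * prod_k p(x_k | theta)."""
    unnorm = {}
    for theta, p_theta in prior.items():
        like = 1.0
        for x, p in zip(responses, p_correct[theta]):
            like *= p if x == 1 else 1.0 - p    # Bernoulli term for one observable
        unnorm[theta] = p_theta * like
    total = sum(unnorm.values())                # normalizing constant p(x)
    return {theta: v / total for theta, v in unnorm.items()}

post = posterior([1, 1, 0])                     # right, right, wrong
```

With these assumed numbers, two correct responses outweigh one incorrect response, so the posterior favors `master`; different conditional probabilities would of course give a different answer.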
Specialization to cognitive diagnosis
Information-processing perspective foregrounded in cognitive diagnosis
Student model contains variables in terms of, e.g.,
» Production rules at some grain-size
» Components / organization of knowledge
» Possibly strategy availability / usage
Importance of purpose
Responses consistent with the "subtract smaller from larger" bug
821 - 285 = 664
885 - 221 = 664
63 - 15 = 52
17 - 9 = 12
“Buggy arithmetic”: Brown & Burton (1978); VanLehn (1990)
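The bug can be written as a one-rule procedure. The sketch below is an illustration of the rule as named on the slide, not code from Brown & Burton or VanLehn; the function name is hypothetical, and it assumes the minuend has at least as many digits as the subtrahend.

```python
def smaller_from_larger(minuend: int, subtrahend: int) -> int:
    """Work column by column, always taking the smaller digit from the larger.

    Assumes minuend has at least as many digits as subtrahend.
    """
    top = str(minuend)
    bottom = str(subtrahend).rjust(len(top), "0")   # right-align the columns
    digits = [str(abs(int(t) - int(b))) for t, b in zip(top, bottom)]
    return int("".join(digits))

for a, b in [(821, 285), (885, 221), (63, 15), (17, 9)]:
    print(f"{a} - {b} = {smaller_from_larger(a, b)}")
```

Run on the four problems above, this reproduces the listed responses. Since 885 - 221 needs no borrowing, the buggy rule gives the correct answer there, which is why a single item cannot by itself reveal the bug.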
Some Illustrative Student Models in Cognitive Diagnosis
Whole number subtraction:
» ~200 production rules (VanLehn, 1990)
» Can model at level of bugs (Brown & Burton) or at the level of impasses (VanLehn)
John Anderson’s ITSs in algebra, LISP
» ~1000 production rules
» 1-10 in play at a given time
Reverse-engineered large-scale tests
» ~10-15 skills
Mixed number subtraction (Tatsuoka)
» ~5-15 production rules / skills
Mixed number subtraction
Based on example from Prof. Kikumi Tatsuoka (1982).
» Cognitive analysis & task design
» Methods A & B
» Overlapping sets of skills under methods
Bayes nets described in Mislevy (1994):
» Five “skills” required under Method B.
» Conjunctive combination of skills
» DINA stochastic model
Skill 1: Basic fraction subtraction
Skill 2: Simplify/Reduce
Skill 3: Separate whole number from fraction
Skill 4: Borrow from whole number
Skill 5: Convert whole number to fractions
[Figure: two-level Toulmin diagram for cognitive diagnosis]
Lower level, one behaviorist-style argument for each item class 1, ..., n:
C: Sue's probability of answering a Class n subtraction problem with borrowing is pn.
since W: Sampling theory for items with feature set defining Class n.
so, from D1nj: Sue's answer to Item j, Class n, and D2nj: structure and contents of Item j, Class n.
Upper level, with the lower-level claims serving as data:
C: Sue's configuration of production rules for operating in the domain (knowledge and skill) is K.
since W0: Theory about how persons with configurations {K1,...,Km} would be likely to respond to items with different salient features.
Like behaviorist inference at level of behavior in classes of structurally similar tasks.
Structural patterns among behaviorist claims are data for inferences about unobservable production rules that govern behavior.
• This level distinguishes cognitive diagnosis from subscores.
• A typical (but not necessary) difference is that cognitive diagnosis has a many-to-many relationship between observable variables and student-model variables. As partitions, subscores have 1-1 relationships between scores and inferential targets.
Structural and stochastic aspects of inferential models
Structural model relates student model variables ($\theta$s) to observable variables (xs)
» Conjunctive, disjunctive, mixture
» Complete vs incomplete (e.g., fusion model)
» The Q matrix (next slide)
Stochastic model addresses uncertainty
» Rule based; logical with noise
» Probability-based inference (discrete Bayes nets, extended IRT models)
» Hybrid (e.g., Rule Space)
The Q-matrix (Fischer, Tatsuoka)

        Features
Items   1  2  3  4
1       1  1  0  0
2       0  1  0  0
3       1  0  0  1
4       0  0  1  1
5       0  0  1  1

qjk is the extent to which Feature k pertains to Item j.
Special case: 0/1 entries and a 1-1 relationship between features and student-model variables.
Conjunctive structural relationship
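Person i: $\theta_i = (\theta_{i1}, \theta_{i2}, \ldots, \theta_{iK})$
» Each $\theta_{ik} = 1$ if person i possesses “skill” k, 0 if not.
Task j: $q_j = (q_{j1}, q_{j2}, \ldots, q_{jK})$
» Each $q_{jk} = 1$ if item j “requires skill k”, 0 if not.
$I_{ij} = 1$ if ($q_{jk} = 1 \Rightarrow \theta_{ik} = 1$) for all k; $I_{ij} = 0$ if ($q_{jk} = 1$ but $\theta_{ik} = 0$) for any k.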
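The conjunctive rule is easy to state in code. The sketch below is illustrative: it uses the five-item Q-matrix from the Q-matrix slide, assumes the special case of a 1-1 relationship between features and skills, and applies it to a hypothetical person who has skills 1 and 2 only. The function name is chosen for the example.

```python
def ideal_response(theta, q):
    """I_ij = 1 if the person has every skill the item requires, else 0."""
    return int(all(t == 1 for t, required in zip(theta, q) if required == 1))

# Rows of the Q-matrix slide: items 1-5 by features 1-4.
Q = [
    [1, 1, 0, 0],
    [0, 1, 0, 0],
    [1, 0, 0, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]

theta = [1, 1, 0, 0]    # hypothetical person: has skills 1 and 2 only
pattern = [ideal_response(theta, q) for q in Q]
print(pattern)          # [1, 1, 0, 0, 0]
```

Items 1 and 2 require only skills this person has, so their ideal responses are 1; items 3, 4, and 5 each require a missing skill, so theirs are 0.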
Conjunctive structural relationship: No stochastic model
$\Pr(x_{ij} = 1 \mid \theta_i, q_j) = I_{ij}$
No uncertainty about x given $\theta$.
There is uncertainty about $\theta$ given x, even with no stochastic part, due to competing explanations (Falmagne):
$x_{ij} \in \{0, 1\}$ just partitions the $\theta$s into those that cover all of $q_j$ vs. those that miss with respect to at least one skill.
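The partitioning can be made concrete by enumerating skill profiles. This is a sketch under stated assumptions: four binary skills, a hypothetical item requiring skills 1 and 2, and deterministic (noise-free) responses; the helper names are invented for the example.

```python
from itertools import product

def ideal_response(theta, q):
    """Conjunctive rule: 1 if every required skill is present, else 0."""
    return int(all(t == 1 for t, required in zip(theta, q) if required == 1))

def consistent_profiles(q_rows, responses, n_skills):
    """All skill profiles whose ideal responses match every observed response."""
    return [
        theta
        for theta in product([0, 1], repeat=n_skills)
        if all(ideal_response(theta, q) == x for q, x in zip(q_rows, responses))
    ]

q = (1, 1, 0, 0)                              # hypothetical item: needs skills 1 and 2
right = consistent_profiles([q], [1], 4)      # profiles that cover q
wrong = consistent_profiles([q], [0], 4)      # profiles that miss at least one skill
print(len(right), len(wrong))                 # 4 12
```

A correct response leaves 4 of the 16 profiles standing and an incorrect response leaves the other 12, so even without noise a single response cannot say which required skill is missing.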
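Conjunctive structural relationship: DINA stochastic model
Now there is uncertainty about x given $\theta$:
$\Pr(x_{ij} = 1 \mid I_{ij} = 0) = \pi_{j0}$ -- False positive
$\Pr(x_{ij} = 1 \mid I_{ij} = 1) = \pi_{j1}$ -- True positive
Likelihood over n items:
$$p(x_i \mid \theta_i, q) = \prod_{j=1}^{n} \pi_{jI_{ij}}^{x_{ij}} \left(1 - \pi_{jI_{ij}}\right)^{1 - x_{ij}}$$
Posterior:
$$p(\theta_i \mid x_i, q) \propto p(x_i \mid \theta_i, q)\, p(\theta_i)$$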
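A small numerical sketch of the DINA likelihood and posterior follows. It is illustrative only: the Q-matrix is the five-item example from the Q-matrix slide, the false-positive and true-positive probabilities (0.1 and 0.9 for every item) and the uniform prior over profiles are assumptions made for the example, and the function names are hypothetical.

```python
from itertools import product

def ideal_response(theta, q):
    """Conjunctive rule: 1 if every required skill is present, else 0."""
    return int(all(t == 1 for t, r in zip(theta, q) if r == 1))

def dina_likelihood(x, theta, Q, pi0, pi1):
    """Product over items of pi^x (1 - pi)^(1 - x), with pi chosen by I_ij."""
    like = 1.0
    for j, q in enumerate(Q):
        p = pi1[j] if ideal_response(theta, q) else pi0[j]
        like *= p if x[j] == 1 else 1.0 - p
    return like

def dina_posterior(x, Q, pi0, pi1, prior=None):
    """Posterior over all 2^K skill profiles; uniform prior unless one is given."""
    profiles = list(product([0, 1], repeat=len(Q[0])))
    if prior is None:
        prior = {th: 1.0 / len(profiles) for th in profiles}   # assumption
    unnorm = {th: prior[th] * dina_likelihood(x, th, Q, pi0, pi1) for th in profiles}
    total = sum(unnorm.values())
    return {th: v / total for th, v in unnorm.items()}

Q = [(1, 1, 0, 0), (0, 1, 0, 0), (1, 0, 0, 1), (0, 0, 1, 1), (0, 0, 1, 1)]
pi0 = [0.1] * 5     # assumed false-positive probabilities
pi1 = [0.9] * 5     # assumed true-positive probabilities
post = dina_posterior([1, 1, 0, 0, 0], Q, pi0, pi1)
```

For this response pattern the profiles (1, 1, 0, 0) and (1, 1, 1, 0) tie for the highest posterior, because no item in this small Q-matrix requires skill 3 without also requiring skill 4. The tie is an instance of competing explanations that the test design, not the statistics, would have to resolve.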
The particular challenge of competing explanations
Triangulation
» Different combinations of data fail to support some alternative explanations of responses, and reinforce others.
» Why was an item requiring Skills 1 & 2 wrong?
– Missing Skill 1? Missing Skill 2? A slip?
– Try items requiring 1 & 3, 2 & 4, 1 & 2 again.
Degree to which design supports inferences
» Test design as experimental design
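Triangulation can be illustrated with the follow-up items suggested above. The sketch assumes a DINA-style model with four skills, a uniform prior over profiles, and made-up error rates (`slip` and `guess`, both 0.1, hypothetical parameter names standing for 1 minus the true-positive probability and the false-positive probability); the follow-up responses are invented for the example.

```python
from itertools import product

def ideal(theta, q):
    """Conjunctive rule: True if every required skill is present."""
    return all(t for t, r in zip(theta, q) if r)

def skill_marginals(items, responses, slip=0.1, guess=0.1, n_skills=4):
    """Posterior P(skill k is possessed) for each k, under a uniform prior."""
    weights = {}
    for theta in product([0, 1], repeat=n_skills):
        w = 1.0
        for q, x in zip(items, responses):
            p = 1.0 - slip if ideal(theta, q) else guess   # P(correct | theta)
            w *= p if x else 1.0 - p
        weights[theta] = w
    total = sum(weights.values())
    return [
        sum(w for th, w in weights.items() if th[k]) / total
        for k in range(n_skills)
    ]

q12, q13, q24 = (1, 1, 0, 0), (1, 0, 1, 0), (0, 1, 0, 1)

# One wrong answer on an item requiring Skills 1 & 2: the two skills are symmetric.
before = skill_marginals([q12], [0])

# Hypothetical follow-ups: 1 & 3 right, 2 & 4 wrong, 1 & 2 wrong again.
after = skill_marginals([q12, q13, q24, q12], [0, 1, 0, 0])
```

Before the follow-ups, Skills 1 and 2 are equally suspect. Afterwards the correct response on the 1 & 3 item supports Skill 1, while the wrong responses on 2 & 4 and on the repeated 1 & 2 item point at Skill 2, so the two explanations separate.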
Bayes net for mixed number subtraction (Method B)
[Figure: Bayes net. Skill nodes: Basic fraction subtraction (Skill 1); Simplify/reduce (Skill 2); Separate whole number from fraction (Skill 3); Borrow from whole number (Skill 4); Convert whole number to fraction (Skill 5); Mixed number skills. Conjunction nodes: Skills 1 & 2; Skills 1 & 3; Skills 1, 3, & 4; Skills 1, 2, 3, & 4; Skills 1, 3, 4, & 5; Skills 1, 2, 3, 4, & 5. Observable nodes: Items 4, 6-12, and 14-20. Item stems shown: 6/7 - 4/7; 2/3 - 2/3; 3 7/8 - 2; 3 4/5 - 3 2/5; 4 5/7 - 1 4/7; 3 1/2 - 2 3/2; 4 4/12 - 2 7/12; 4 1/3 - 2 4/3; 4 1/10 - 2 8/10; 4 - 3 4/3; 4 1/3 - 1 5/3; 2 - 1/3; 7 3/5 - 4/5; 3 - 2 1/5; 11/8 - 1/8.]
Structural aspects: The logical conjunctive relationships among skills, and which sets of skills an item requires. The latter is determined by its qj vector.
Stochastic aspects, Part 1: Empirical relationships among skills in population (red).
Stochastic aspects, Part 2: Measurement errors for each item (yellow).
Bayes net for mixed number subtraction (Method B): Probabilities before observations
[Figure 10: Inference Network for Method B, Initial Status. Nodes: Skill1 through Skill5; MixedNumbers; Skills1&2; Skills1&3; Skills134; Skills1234; Skills1345; Skills12345; Items 4, 6-12, and 14-20. Skill nodes show yes/no bars; item nodes show 1/0 bars.]
Note: Bars represent probabilities, summing to one for all the possible values of a variable.
Bayes net for mixed number subtraction: Probabilities after observations
[Figure 11: Inference Network for Method B, After Observing Item Responses. Same nodes as the initial-status network.]
Note: Bars represent probabilities, summing to one for all the possible values of a variable. A shaded bar extending the full width of a node represents certainty, due to having observed the value of that variable.
Bayes net for mixed number subtraction: For mixture of strategies across people
[Figure 12: Directed Acyclic Graph for Both Methods. Nodes: Method; MixedNumbers; Skill1 through Skill7; Skills1&2; Skills1&3; Skills1&6; Skills125; Skills126; Skills134; Skills156; Skills1234; Skills1267; Skills1345; Skills12345; Skills12567; Items 4, 6-12, and 14-20.]
Extensions (1)
More general …
» Student models (continuous vars, uses)
» Observable variables (richer, times, multiple)
» Structural relationships (e.g., disjuncts)
» Stochastic relationships (e.g., NIDA, fusion)
» Model-tracing temporary structures (VanLehn)
Extensions (2)
Strategy use
» Single strategy (as discussed above)
» Mixture across people (Rost, Mislevy)
» Mixtures within people (Huang: MV Rasch)
Huang’s example of the last of these follows…
What are the forces at the instant of impact?
[Figure: a truck and a car colliding; speed labels 20 mph and 20 mph]
A. The truck exerts the same amount of force on the car as the car exerts on the truck.
B. The car exerts more force on the truck than the truck exerts on the car.
C. The truck exerts more force on the car than the car exerts on the truck.
D. There’s no force because they both stop.

What are the forces at the instant of impact?
[Figure: a truck and a car colliding; speed labels 10 mph and 20 mph]
A. The truck exerts the same amount of force on the car as the car exerts on the truck.
B. The car exerts more force on the truck than the truck exerts on the car.
C. The truck exerts more force on the car than the car exerts on the truck.
D. There’s no force because they both stop.

What are the forces at the instant of impact?
[Figure: a truck and a fly colliding; speed labels 10 mph and 1 mph]
A. The truck exerts the same amount of force on the fly as the fly exerts on the truck.
B. The fly exerts more force on the truck than the truck exerts on the fly.
C. The truck exerts more force on the fly than the fly exerts on the truck.
D. There’s no force because they both stop.
The Andersen/Rasch Multidimensional Model for m strategy categories
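$$P(X_{ij} = p) = \frac{\exp(\theta_{ip} - \beta_{jp})}{\sum_{q=1}^{m} \exp(\theta_{iq} - \beta_{jq})}$$
where
p is an integer between 1 and m;
$\theta_{ip}$ is the pth element in person i’s vector-valued parameter;
$x_{ij}$, the observed value of $X_{ij}$, is the strategy person i uses for item j;
$\beta_{jp}$ is the pth element in item j’s vector-valued parameter.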
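The model assigns each of the m strategy categories a probability through a multinomial-logit (softmax) form. The sketch below is an illustration under assumptions: it writes the person and item parameters as `theta_i` and `beta_j`, combines them as a difference (the sign convention is assumed here, not confirmed by the slide), and uses made-up parameter values for m = 3 strategies.

```python
import math

def strategy_probs(theta_i, beta_j):
    """P(X_ij = p) for p = 1..m as a softmax of theta_ip - beta_jp (assumed sign)."""
    logits = [t - b for t, b in zip(theta_i, beta_j)]
    shift = max(logits)                          # subtract the max for numerical stability
    expo = [math.exp(v - shift) for v in logits]
    total = sum(expo)
    return [e / total for e in expo]

theta_i = [1.0, 0.0, -1.0]   # hypothetical person propensities toward 3 strategies
beta_j = [0.0, 0.0, 0.0]     # hypothetical item parameters
probs = strategy_probs(theta_i, beta_j)

# Adding the same constant to every element of theta_i leaves the probabilities unchanged.
shifted = strategy_probs([t + 5.0 for t in theta_i], beta_j)
```

Because only differences among the elements matter, the parameters are identified only up to an additive constant, so an estimation routine would need a constraint such as fixing one element or centering each vector.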
Conclusion: The Importance of Coordination…
Among psychological model, task design, and analytic model
» (KWSK “assessment triangle”)
» Tatsuoka’s work is exemplary in this respect:
– Grounded in psychological analyses
– Grainsize & character tuned to learning model
– Test design tuned to instructional options
Conclusion: The Importance of Coordination…
With purpose, constraints, resources
» Lower expectations for retrofitting existing tests designed for different purposes, under different perspectives & warrants.
» Information & Communication Technology (ICT) project at ETS
– Simulation-based tasks
– Large scale
– Forward design