The Andes Intelligent Tutoring System: Five years of evaluations
Kurt VanLehn
Pittsburgh Science of Learning Center (PSLC)
University of Pittsburgh
The physics LearnLab course committee
Andes development
– Anders Weinstein
– Brett van de Sande
– Kurt VanLehn (co-chair)
U.S. Naval Academy
– Don Treacy (co-chair)
– Bob Shelby
– Mary Wintersgill
– Kay Schulze
Experimenters
– Scotty Craig
– Sandy Katz
– Bob Hausmann
– Michael Ringenberg
Meet weekly
– Thursdays, 3:30
Funding
The U.S. Office of Naval Research, Cognitive Science Program
The U.S. National Science Foundation, Pittsburgh Science of Learning Center
Research question
Given
– Whole semester of instruction
– No change to content of course
– No change to lectures, labs, assignments
– Standard exams (not designed by experimenters)
Can a homework helper increase learning?
Prior work with answer-only tutoring steps
Web-based homework grading systems
– E.g., WebAssign, CAPA, Mastering Physics
– Provide feedback & hints on the answer only
Compared to ordinary paper-based homework
– Positive benefits
When paper-based homework is collected & graded
– No benefits (Pascarella, 2002; Dufresne, Mestre & Rath, 2002)
Interpretation
– Motivating students to do their homework provides benefits, but the answer-only tutoring system provides no additional benefits
Prior work with tutoring systems that give feedback & hints on steps
Lisp Tutor (Corbett, 2001) and many others
– Same homework problems & text
– Experimenter’s exams only
– But not a whole semester (only 5 lessons)
Pump curriculum + Pat tutor (Koedinger et al.)
– Whole year of high-school algebra
– Both experimenter’s exams & standard exams
– But content confounded with tutoring system
Earlier evaluations of Andes
– First half-semester only
– Experimenter’s exams only
Why does it matter?
Ideally, an intelligent homework helper…
– can increase learning without changing the course, and
– the increase is strong enough to show in final exam
» The diligent always do well & slackers always do poorly
» Cramming
If not…
– still useful if it facilitates content upgrades, and
– the upgrades cause robust increases in learning
Outline
Andes
Evaluation
Discussion
What kind of physics?
US university introductory physics courses
US high school advanced physics courses
A typical problem:
A 2000 kg car at the top of a 20 m long driveway inclined at 20 degrees slips its parking brake and rolls down. Ignoring friction and drag, what is the magnitude of the velocity of the car when it hits the garage door?
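One way to check the typical problem above is conservation of energy. The sketch below is not taken from the slides or from Andes. It assumes the car starts from rest, travels the full 20 m along the incline, and that g = 9.8 m/s²; the function name `speed_at_bottom` is made up for illustration. With m g h = (1/2) m v² and h = L sin(θ), the mass cancels.

```python
import math

def speed_at_bottom(length_m, incline_deg, g=9.8):
    """Speed after rolling from rest down a frictionless incline.

    Assumes energy conservation: m*g*h = 0.5*m*v**2, with h = L*sin(theta).
    The mass cancels, so the 2000 kg figure is not needed.
    """
    height = length_m * math.sin(math.radians(incline_deg))
    return math.sqrt(2 * g * height)

v = speed_at_bottom(20.0, 20.0)
print(f"{v:.1f} m/s")  # prints "11.6 m/s"
```

The mass given in the problem statement is a distractor under these assumptions, which is typical of introductory physics problems.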
Andes user interface
Read a physics problem
Type in equations
Draw vectors
Type in answer
Andes feedback and hints
“What should I do next?”
Green means correct
Red means incorrect
“What’s wrong with that?”
Dialogue & hints
Major challenges
Dealing with equations
– Giving red/green feedback
– Undoing algebraic combination
» For “what should I do next?”
– Analyzing errors in equations
Scale-up
– 13 chapters, 500 textbook pages
– 350+ problems
– 300+ principles
Outline
Andes
Evaluation
– Method
– Main results
– Which students benefited?
– Which knowledge benefited?
– Interpretation of results
Discussion
Evaluations of Andes at the US Naval Academy
Fall semesters 2000, 2001, 2002 & 2003
Only the homework modality was varied: Andes vs. paper-based
– Same textbook
– Similar lectures, labs, recitations
– Similar homework problems
– Same exams
Students were motivated to do paper-based homework
– Either collected and graded
– Or 1 homework problem on each quiz
Exams
Midterm exam
– 1 hour, 4 problems
– Scored on derivation & answer
» Drawings (30%)
» Variable definitions (20%)
» Equations (40%)
» Answers (10%)
Final exam
– 3 hours, 50 problems
– Multiple choice
Checking prior competence of Andes and control students
Grade-point averages equal
Distribution of majors equal
– Engineering majors vs.
– Science majors vs.
– Other majors
Midterm exam results (all differences reliable, p < .01)
[Figure: bar chart of midterm exam scores (axis range 50 to 75), Control vs. Andes, for 2000, 2001, 2002 and 2003]
How to calculate effect size?
Calculating effect size over 4 different midterm exams
Normalize each score
z_score(student) = [raw_score(student) - mean(exam)] / standard_deviation(exam)
For each condition, pool z-scores across years
Effect size = 0.61
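The normalize-then-pool procedure above can be sketched as follows. This is an illustration under stated assumptions, not the authors' analysis code. It takes the effect size to be the difference between the pooled mean z-scores of the two conditions, and it uses the population standard deviation of each exam; the slides do not spell out either choice. The names `z_scores` and `pooled_effect_size` and the toy scores are hypothetical.

```python
from statistics import mean, pstdev

def z_scores(raw_scores):
    """Normalize one exam: (raw - exam mean) / exam standard deviation."""
    m = mean(raw_scores)
    sd = pstdev(raw_scores)  # assumption: population SD over all students on the exam
    return [(s - m) / sd for s in raw_scores]

def pooled_effect_size(exams):
    """exams: one (scores, conditions) pair per year.

    Each exam is normalized separately, z-scores are pooled per condition,
    and the difference of the pooled means is returned (assumed definition).
    """
    pooled = {"andes": [], "control": []}
    for scores, conditions in exams:
        for z, cond in zip(z_scores(scores), conditions):
            pooled[cond].append(z)
    return mean(pooled["andes"]) - mean(pooled["control"])

# Toy data only; these are not the Naval Academy scores.
toy_exams = [
    ([55, 60, 70, 75], ["control", "control", "andes", "andes"]),
    ([50, 65, 60, 80], ["control", "andes", "control", "andes"]),
]
print(round(pooled_effect_size(toy_exams), 2))
```

Normalizing within each exam first keeps a harder or easier exam in one year from dominating the pooled comparison.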
Final exam
Exam covers 100% of course, but Andes didn’t
– Does now
Use 2003 exam only; Andes covered 70%
– 89 Andes students
– 823 non-Andes students
Prior competence not equal
Majors not equally distributed
– Andes group had more engineering majors
GPAs not equally distributed
– Andes group had marginally higher GPAs
Factor out prior competence statistically
– For each major, regress final exam score on GPA
– residual_score(student) = raw_score(student) - predicted_score(student’s major, student’s GPA)
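The residual-score adjustment above can be sketched as below. This is a minimal illustration, assuming an ordinary least-squares line of exam score against GPA fitted separately for each major; the slides do not give the exact regression model. The function names, the dictionary keys and the toy records are hypothetical.

```python
from collections import defaultdict
from statistics import mean

def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx  # needs at least two distinct GPAs per major
    return slope, my - slope * mx

def residual_scores(students):
    """students: dicts with keys 'major', 'gpa', 'exam'.

    Returns raw score minus the score predicted from the student's
    major and GPA, in the same order as the input.
    """
    by_major = defaultdict(list)
    for s in students:
        by_major[s["major"]].append(s)
    fits = {
        major: fit_line([s["gpa"] for s in group], [s["exam"] for s in group])
        for major, group in by_major.items()
    }
    residuals = []
    for s in students:
        slope, intercept = fits[s["major"]]
        residuals.append(s["exam"] - (slope * s["gpa"] + intercept))
    return residuals

# Toy records only; not data from the study.
toy_students = [
    {"major": "engineering", "gpa": 2.0, "exam": 50},
    {"major": "engineering", "gpa": 3.0, "exam": 65},
    {"major": "engineering", "gpa": 4.0, "exam": 70},
    {"major": "other", "gpa": 2.5, "exam": 40},
    {"major": "other", "gpa": 3.5, "exam": 60},
]
print([round(r, 2) for r in residual_scores(toy_students)])
```

Comparing residuals rather than raw scores means a condition is credited only for scores above or below what major and GPA would already predict.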
Final exam results
[Figure: bar chart of final exam results, Control vs. Andes; axis range -1 to 2.5]
Difference is reliable (p = 0.028)
Effect size = 0.25
Outline
Andes
Evaluation
– Method
– Main results
– Which students benefited?
– Which knowledge benefited?
– Interpretation of results
Discussion
[Figure: scatter plot of z-score on exam (y-axis, -3 to 3) against GPA (x-axis, 1 to 4), with a linear fit for each condition]
Andes: y = 0.9473x - 2.4138, R² = 0.2882
Controls: y = 0.7956x - 2.5202, R² = 0.2048
Benefits same regardless of GPA
Benefits varied by major on final exam but not on midterm exam
Midterm exam results
[Figure: bar chart, Control vs. Andes, for Engineers, Scientists and Others; axis range -0.5 to 0.3]
Final exam results
[Figure: bar chart, Non-Andes vs. Andes, for Engineers, Scientists and Others; axis range -1 to 4.5]
Outline
Andes
Evaluation
– Method
– Main results
– Which students benefited?
– What knowledge benefited?
– Interpretation of results
Discussion
Effect sizes for subscores of midterm exam
[Figure: bar chart of effect sizes (axis range -1 to 1.5) for Drawings, Variables, Equations and Answers]
Interpretation of results
Engineering & science majors learned the red path and prefer it
– Andes does not increase their final exam scores
They use blue path on the midterm
– Andes increases their midterm exam scores
Other majors do not have red path, so they use the blue path on both exams
– Andes increases both exams’ scores
On midterm exams, subscores measure components of blue path separately
– Biggest benefit for diagrams & variables
– Smaller on equations; none on answer
[Diagram: Problem, Diagram & variables, Equations, Answer; labels: Andes, Andes, Prior physics, Prior math & physics]
Summary of results
Main result: Andes provides benefits
– Midterm exam effect size: 0.61
– Final exam effect size: 0.25
Andes helps students learn conceptual skills
– Effect sizes on conceptual subscores: 1.21 & 0.69
– Effect sizes on calculational subscores: 0.11 & -0.08
Some students appear to have a non-conceptual method for solving problems
– Competes with the conceptual method taught by Andes
– They use it on the (answer-only) final exam
– This dilutes the benefit of Andes on final exam
Outline
Andes
Evaluation
Discussion
– Andes compared to others
– Why is Andes effective?
Effect sizes on experimenter’s & standard exams of 3 tutoring systems
[Figure: bar chart of effect sizes (axis range 0 to 1.4) for Lisp, Pump+Pat and Andes on three exam types: Experimenter's 1, Experimenter's 2, Standard]
Interpretation of the comparison with other tutors
Andes is about the same as other tutoring systems that give feedback and hints on steps
Perhaps the Pump+Pat benefits are due solely to the tutoring system and not the content upgrade
Summary: Studies of homework helpers when content is controlled
Ordinary paper-based homework
» Large benefits
Motivated paper-based homework
» No benefits
Feedback & hints on answer only
» Large benefits
Feedback & hints on steps
Outline
Andes
Evaluation
Discussion
– Andes compared to others
– Why is feedback & hints on steps so effective?
Hypothesis: Andes increases the number of successful knowledge events
Without feedback & hints on steps, students skip them
– Guess
– Copy similar example’s step & edit
– Copy & edit a higher goal’s outcome
Doing a step correctly requires
– Figuring out how the first time (sense-making)
– Figuring out why the second & third times (refinement)
– Recalling why & how the other times (fluency building)
This increases number of successful knowledge events
– Wherein a student constructs or applies a knowledge component
Thanks for your attention!
At www.andes.pitt.edu
– Download stand-alone version of Andes
– Try OLI version of Andes
– Download papers on Andes
Sorry, but Andes only runs on Windows