I wish I could believe you: the frustrating unreliability of some assessment research
Tim Hunt & Sally Jordan, The Open University
@tim_hunt @SallyJordan9
Are these two things related?
Trick question (of course)
From a great web site: http://www.tylervigen.com/spurious-correlations
Correlation & Causation
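To make the point concrete, here is a minimal sketch of how two unrelated quantities can correlate strongly simply because both drift upwards over time. The two series are invented for illustration (they are not taken from the web site above), and `pearson_r` is a hypothetical helper written for this example.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Made-up yearly figures for two unrelated quantities (NOT real data).
# Each is a steady upward trend plus a little hand-picked noise.
years = range(10)
noise_a = [0.3, -0.2, 0.1, -0.4, 0.2, 0.0, -0.1, 0.3, -0.3, 0.1]
noise_b = [2, -3, 1, 0, -2, 3, -1, 2, -2, 0]
series_a = [5 + 1.0 * t + d for t, d in zip(years, noise_a)]
series_b = [200 + 7.0 * t + d for t, d in zip(years, noise_b)]

r = pearson_r(series_a, series_b)
print(f"r = {r:.3f}")  # close to 1, although neither series influences the other
```

The shared time trend is the only thing linking the two series, which is exactly why a high correlation on its own says nothing about cause.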
Sly (1999)
Practice tests as formative assessment improve student performance on computer-managed learning assessment
A computerised assessment was quite exciting in itself back in 1999!
Questions picked at random from a bank.
P01 & S01 used the same test bank. S02 was different, with no practice.
Students (of 614)   P01      S01      S02      Students (of 609)
417                 62.18%   72.72%   66.88%   415
197                 –        67.56%   62.24%   194
All standard deviations 15–17%
+5.38%  +5.16%  +4.64%
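The three differences can be reproduced from the mean scores with simple arithmetic. The sketch below assumes that the row with a P01 score is the group who took the practice test and that the row showing "–" is the group who did not. Which pairs of cells each difference connects is also an assumption, chosen because it reproduces the figures. The value 16 for the standard deviation is just the midpoint of the quoted 15–17% range.

```python
# Mean scores from the slide, in percent.
practice_group = {"P01": 62.18, "S01": 72.72, "S02": 66.88}  # 417 students (assumed: took P01)
no_practice_group = {"S01": 67.56, "S02": 62.24}             # 197 students (assumed: skipped P01)

# Gap between the groups on each summative test.
diff_s01 = round(practice_group["S01"] - no_practice_group["S01"], 2)
diff_s02 = round(practice_group["S02"] - no_practice_group["S02"], 2)

# Gap between each group's first attempt at the shared test bank.
diff_first_attempt = round(no_practice_group["S01"] - practice_group["P01"], 2)

# Rough size of the gaps in standard-deviation units (SD midpoint is an assumption).
ROUGH_SD = 16.0
gap_s01_in_sd = diff_s01 / ROUGH_SD
gap_s02_in_sd = diff_s02 / ROUGH_SD

print(diff_first_attempt, diff_s01, diff_s02)
print(f"{gap_s01_in_sd:.2f} SD on S01, {gap_s02_in_sd:.2f} SD on S02")
```

Under these assumptions the gap on S02, where there was no practice test, is nearly as large as the gap on S01. That is one reason to be cautious about attributing the S01 gap to the practice test alone.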
OU level 3 physics (SM358)
An investigation into factors affecting physics’ students engagement with online assessment (Bolton & Jordan)
OU level 3 physics (SM358)
The assessment strategy
0 TMAs 1 TMA 2 TMAs 3 TMAs 4 TMAs
0 iCMAs
1 iCMA
2 iCMAs
3 iCMAs
4 iCMAs
5 iCMAs
6 iCMAs
OU level 3 physics (SM358)
Proportion of students
0 TMAs 1 TMA 2 TMAs 3 TMAs 4 TMAs
0 iCMAs 11.6% 3.4% 1.5% 0.5% 0.5%
1 iCMA 1.5% 1.0%
2 iCMAs 1.5% 2.4% 1.5%
3 iCMAs 1.5%
4 iCMAs 5.3% 2.4%
5 iCMAs 0.5% 3.9% 5.8% 8.2%
6 iCMAs 0.5% 0.5% 5.8% 5.8% 34.3%
OU level 3 physics (SM358)
Exam mark
0 TMAs 1 TMA 2 TMAs 3 TMAs 4 TMAs
0 iCMAs 6.0
1 iCMA
2 iCMAs 17.0 24.0
3 iCMAs 60.0
4 iCMAs 43.7 62.0
5 iCMAs 23.0 46.0 62.6 69.5
6 iCMAs 35.3 60.8 77.5
OU level 3 physics (SM358)
Exam mark compared to predictive model
0 TMAs 1 TMA 2 TMAs 3 TMAs 4 TMAs
0 iCMAs −20.8
1 iCMA
2 iCMAs −43.9 −27.5
3 iCMAs −9.0
4 iCMAs −15.6 +1.8
5 iCMAs −3.8 −11.1 +1.4 +2.4
6 iCMAs −17.1 +3.4 +4.6
Confounding variables
Berkeley gender bias case (1973)
             Men                      Women
             Applicants   Admitted    Applicants   Admitted
Total        8442         44%         4321         35%
https://en.wikipedia.org/wiki/Simpson%27s_paradox#Berkeley_gender_bias_case
Berkeley gender bias case (1973)
             Men                      Women
Department   Applicants   Admitted    Applicants   Admitted
A            825          62%         108          82%
B            560          63%         25           68%
C            325          37%         593          34%
D            417          33%         375          35%
E            191          28%         393          24%
F            272          6%          341          7%
https://en.wikipedia.org/wiki/Simpson%27s_paradox#Berkeley_gender_bias_case
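The paradox can be checked directly from the department table. The sketch below pools the six listed departments. Because the admission rates are rounded percentages, the admitted counts are estimates, and because only six departments are listed, the pooled rates will not match the university-wide 44% and 35%.

```python
# (department, men applicants, men admitted %, women applicants, women admitted %)
departments = [
    ("A", 825, 62, 108, 82),
    ("B", 560, 63, 25, 68),
    ("C", 325, 37, 593, 34),
    ("D", 417, 33, 375, 35),
    ("E", 191, 28, 393, 24),
    ("F", 272, 6, 341, 7),
]

# Departments where women were admitted at a higher rate than men.
women_higher = [dept for dept, _, m_pct, _, w_pct in departments if w_pct > m_pct]

def pooled_rate(rows):
    """Overall admission rate from (applicants, admitted %) pairs."""
    applicants = sum(n for n, _ in rows)
    admitted = sum(n * pct / 100 for n, pct in rows)  # estimate: percentages are rounded
    return admitted / applicants

men_rate = pooled_rate([(m_n, m_pct) for _, m_n, m_pct, _, _ in departments])
women_rate = pooled_rate([(w_n, w_pct) for _, _, _, w_n, w_pct in departments])

print("Women admitted at a higher rate in:", women_higher)
print(f"Pooled over the six departments: men {men_rate:.1%}, women {women_rate:.1%}")
```

Women do better in most departments, yet worse when the departments are pooled, because they applied in larger numbers to the departments with low admission rates. Department is the confounding variable.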
Real Experiments
What is an experiment?
Split participants into two equal groups.
Split randomly, so if there are confounding variables, they are probably equally split between groups.
Give different ‘treatments’ to each group, trying to keep everything else the same.
Blind the treatment, if possible, to reduce all sorts of biases.
But blinding is not normally possible in education. (You probably know if you just sat an exam!)
[Pick your favourite research methods book]
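The first two steps, splitting participants randomly into two equal groups, can be sketched in a few lines. This is only an illustration: `random_split`, the `seed` parameter and the student identifiers are hypothetical names invented for the example.

```python
import random

def random_split(participants, seed=None):
    """Shuffle the participants and cut the list in half.

    Every participant is equally likely to land in either group, so any
    confounding variable should be roughly balanced between the groups,
    provided the groups are reasonably large.
    """
    rng = random.Random(seed)   # a fixed seed makes the allocation repeatable
    shuffled = list(participants)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

# Hypothetical class of 40 students.
students = [f"student_{i:02d}" for i in range(40)]
control, treatment = random_split(students, seed=2015)

print(len(control), len(treatment))
```

Recording the seed gives a repeatable record of who was allocated to which group, which helps when the analysis is checked later.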
Karpicke & Blunt (2011) + many more
Retrieval practice produces more learning than elaborative studying with concept mapping
… but! Wooldridge et al. (2014)
The testing effect with authentic educational materials: A cautionary note
“Based on [the testing effect], … some textbooks are now accompanied by quizzing ancillaries … The quizzes are designed with the assumption that answering factual and application questions will promote a more integrated mental model that incorporates the target knowledge.”
Typically, the quizzes and test banks sample items from similar sub-sections in the textbook but not necessarily the same information.
How reliable is student opinion?
Background
Our own work with interactive computer-marked assignments (iCMAs)
Findings from a questionnaire
Answering iCMA questions helps me to learn
    Definitely agree or mostly agree: 129 (85%)
    Neutral: 7 (5%)
    Mostly or definitely disagree: 12 (8%)
If I get the answer to an iCMA question wrong, the computer-generated feedback is useful
    Definitely agree or mostly agree: 128 (85%)
    Neutral: 11 (7%)
    Mostly or definitely disagree: 8 (5%)
Responses received from 151 students (response rate 20%) (Jordan, 2011)
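As a quick arithmetic check, the percentages follow from the counts and the 151 responses. The sketch also estimates how many students were invited, which is only approximate because the 20% response rate is itself a rounded figure. The variable names are made up for the example.

```python
respondents = 151
response_rate = 0.20  # as quoted; a rounded figure

# (agree, neutral, disagree) counts for the two statements.
counts = {
    "helps me to learn": (129, 7, 12),
    "feedback is useful": (128, 11, 8),
}

# Percentages of respondents, rounded to whole numbers as on the slide.
percentages = {
    statement: tuple(round(100 * n / respondents) for n in row)
    for statement, row in counts.items()
}

# Approximate number of students invited to respond.
invited = round(respondents / response_rate)

# Share of ALL invited students who are known to agree with the first statement.
known_agree_share = counts["helps me to learn"][0] / invited

print(percentages)
print(invited, f"{known_agree_share:.0%}")
```

So 85% of respondents agreed, but that is under a fifth of the students who were asked. Nothing is known about the views of the other four fifths.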
Watching students in a usability lab
Six students observed answering questions (Jordan, 2009)
Data analysis
Much more data presented in Jordan (2014)
Reflection
Weaver (2006, p. 386) reports that 90% of students agreed with the statement “Positive comments have boosted my confidence.”
Marriott (2009, p. 243) reports that 93% of students agreed with the statement “I find the immediate reporting of my test result valuable.”
It is almost certainly the case that more students report that they find feedback useful than actually make good use of it. This is in line with the bias in self-reported behaviour that is observed in medicine and business (Jordan, 2014, p. 69).
But: student opinion is important (Dermo, 2009).
We need to consider student opinion, but we also need to consider students’ actual actions.
Ethics
Is it ethical to only give a helpful intervention to half the class?
Are we allowed to do experiments in Education?
Look at evidence-based medicine
How do you know it’s effective if you have not done the experiment?
If you don’t know whether it is effective, is it ethical to use it?
(They have been doing this for a while)
NICE
Academic researchers
Drug companies
Doctors
Meta analysis
The literature
Medical schools
The end
References
Bolton, J., Jordan, R. & Jordan, S. (2015). An investigation into factors affecting physics' students engagement with online assessment. Manuscript in preparation.
Cohen, L., Manion, L. & Morrison, K. (2011). Research methods in education, 7th Edition, Routledge.
Dermo, J. (2009). e-Assessment and the student learning experience: A survey of student perceptions of e‐assessment. British Journal of Educational Technology, 40(2), 203–214.
Goldacre, B. (2008). Bad Science, Fourth Estate.
Goldacre, B. (2012). Bad Pharma, Fourth Estate.
Jordan, S. (2009). Assessment for learning: pushing the boundaries of computer-based assessment. Practitioner Research in Higher Education, 3(1), 11–19.
Jordan, S. (2011). Using interactive computer-based assessment to support beginning distance learners of science, Open Learning, 26(2), 147–164.
Jordan, S. (2014). E-assessment for learning? Exploring the potential of computer-marked assessment and computer-generated feedback, from short-answer questions to assessment analytics. PhD thesis. The Open University. At http://oro.open.ac.uk/41115/.
Karpicke, J. & Blunt, J. (2011). Retrieval practice produces more learning than elaborative studying with concept mapping, Science, 331(6018), 772–775.
Marriott, P. (2009). Students' evaluation of the use of online summative assessment on an undergraduate financial accounting module. British Journal of Educational Technology, 40(2), 237–254.
Sly, L. (1999). Practice tests as formative assessment improve student performance on computer‐managed learning assessments, Assessment & Evaluation in Higher Education, 24(3), 339–343.
Vigen, T. (2014). Spurious Correlations, at http://www.tylervigen.com/spurious-correlations.
Weaver, M. R. (2006). Do students value feedback? Student perceptions of tutors’ written responses. Assessment & Evaluation in Higher Education, 31(3), 379–394.
Wikipedia (2015). Simpson's paradox, at https://en.wikipedia.org/wiki/Simpson%27s_paradox.
Wooldridge, C., Bugg, J., McDaniel, M. & Liu, Y. (2014). The testing effect with authentic educational materials: A cautionary note, Journal of Applied Research in Memory and Cognition, 3(3), 214–221.
Summary
Correlation vs causation
Confounding variables
Experiments – designed to minimise confounding variables
Don't abstract your experiment so much that the results aren't relevant
Student opinion and attitudes are important but different from actions or effectiveness
Ethical issues are real, but should be overcome
@tim_hunt [email protected]
@SallyJordan9 [email protected]