CAL'09 - Brighton - Patrick Wessa
Designing Statistical Learning Environments with Educational
Compendium Technology

Outline
● Technology
● Reproducible Computing
● Compendium
● Compendium Platform
● Applications
● Design of SLE
● Empirical Findings
● Building Guidelines
● Educational Research
● Educational Quality Control

Claerbout's principle*
● An article about computational science in a scientific publication is not the scholarship itself, it is merely advertising of the scholarship. The actual scholarship is the complete software development environment and that complete set of instructions that generated the figures.
*Source: Jan de Leeuw

My question
● If academic statisticians find it hard (if not impossible) to verify or review the results in empirical papers, how could we possibly expect students to learn from statistical results without the proper tools to easily review, verify, or challenge them?

Reproducible Computing
● http://www.freestatistics.org
● http://www.wessa.net
● Wessa, P., “A framework for statistical software development, maintenance, and publishing within an open-access business model”, Computational Statistics, Springer Verlag, 2008
● Wessa, P., “Reproducible Computing: a new Technology for Statistics Education and Educational Research”, IAENG Transactions on Engineering Technologies, Volume II, American Institute of Physics, 2009, forthcoming

Setting up the course

A framework for statistical software development, maintenance, and publishing within an open-access business model, 2008, Computational Statistics, Springer

Computations are “blogged”

Error messages

Weekly assignments
Learning Statistics based on the Compendium and Reproducible Computing, Proceedings of the World Congress on Engineering and Computer Science 2008, ISBN: 978-988-98671-0-2, UC Berkeley, San Francisco, USA

Snapshot of “Blogged” Computation
Reproduce or Reuse at wessa.net
Cite the computation as follows

Social Interaction, Collaboration, Networking, ...

Social Networks (“co-opetition”)

Fraud detection

Feedback (Peer Review)
Submitting Peer Review (feedback) is a good learning activity – not a good grading procedure

Lectures
● 13 weeks (semester)
● Week 1: Introduction (explanation) + workshop assignment
● Week 2-12: Workshops + Peer Assessments
● Week 13: Final Exam (multiple choice)
● Grades received from Peers do NOT count => there is no penalty for making mistakes!!
● The quality of feedback messages is graded by the educator
[Timeline diagram]
Weeks: Week 1, Week 2, Week 3, Week 4, ...
Lectures: L1, L2, L3, L4, L5
Workshops: WS1, WS2, WS3, WS4, WS5
Reviews: Rev 1, Rev 2, Rev 3, Rev 4, ...
Final event: Exam

Problem: separate threads of discussion
[Timeline diagram]
Weeks: Week 1, Week 2, Week 3, Week 4, ...
Lectures: L1, L2, L3, L4, L5
Workshops: WS1, WS2, WS3, WS4, WS5
Reviews: Rev 1, Rev 2, Rev 3, Rev 4, ...
Final event: Exam

Connected threads of discussion
[Diagram linking Computation 1, Computation 2, Computation 3, Computation 4, Computation 5, and Computation 6]

4 cohorts, 2 years
Cohort   Gender   Bachelor   Prep. Progr.
Year 0   Female   58         53
Year 0   Male     53         76
Year 1   Female   41         45
Year 1   Male     42         74
(time axis: Year 0, then Year 1)
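The cohort sizes can be totalled per year as a quick consistency check. The sketch below is illustrative only: the counts are typed in from the slide, and the dictionary layout and the function name `total_by_year` are assumptions, not part of the original study.

```python
# Cohort counts copied from the "4 cohorts, 2 years" slide.
# The data layout below is a hypothetical choice made for this sketch.
cohorts = {
    ("Year 0", "Female"): {"Bachelor": 58, "Prep. Progr.": 53},
    ("Year 0", "Male"): {"Bachelor": 53, "Prep. Progr.": 76},
    ("Year 1", "Female"): {"Bachelor": 41, "Prep. Progr.": 45},
    ("Year 1", "Male"): {"Bachelor": 42, "Prep. Progr.": 74},
}


def total_by_year(table):
    """Sum all students per year across gender and programme."""
    totals = {}
    for (year, _gender), counts in table.items():
        totals[year] = totals.get(year, 0) + sum(counts.values())
    return totals


totals = total_by_year(cohorts)
for year, n in totals.items():
    print(f"{year}: {n} students")
```

The Year 1 total comes to 202, which matches the 202 classified instances in the confusion matrices on the overfitting slide (153 + 49 and 74 + 128).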

Double Hierarchical Structure

Integrated Design
● Statistical Computation = Core Object of Study
● Statistical Computation = Core IT Object
=>
● Communication (peer review) should be a function of the Computation
● Hierarchical Parent-Child relationships between computations are maintained and can be browsed
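One way to picture this design is a tree of computations in which peer feedback is stored on the computation itself. The Python sketch below is purely illustrative: the class `Computation`, its fields, and the methods `reproduce`, `add_review`, `ancestry`, and `thread` are hypothetical names invented here, not the Compendium Platform's actual data model or API.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Computation:
    """Hypothetical record for one archived computation."""
    title: str
    # The parent is excluded from repr/eq to avoid infinite recursion.
    parent: Optional["Computation"] = field(default=None, repr=False, compare=False)
    children: List["Computation"] = field(default_factory=list)
    reviews: List[str] = field(default_factory=list)  # feedback attached to the computation

    def reproduce(self, title: str) -> "Computation":
        """Create a child computation derived from this one."""
        child = Computation(title=title, parent=self)
        self.children.append(child)
        return child

    def add_review(self, message: str) -> None:
        """Attach a peer review message to this computation."""
        self.reviews.append(message)

    def ancestry(self) -> List[str]:
        """Titles from the root computation down to this one."""
        chain = []
        node: Optional["Computation"] = self
        while node is not None:
            chain.append(node.title)
            node = node.parent
        return list(reversed(chain))

    def thread(self) -> List[str]:
        """All reviews in this computation's subtree, depth-first."""
        messages = list(self.reviews)
        for child in self.children:
            messages.extend(child.thread())
        return messages


root = Computation("Computation 1")
second = root.reproduce("Computation 2")
third = second.reproduce("Computation 3")
root.add_review("Check the outlier in the histogram.")
third.add_review("The revised model fits better.")

print(third.ancestry())
print(root.thread())
```

Because reviews hang off computations rather than off weekly assignments, browsing from any computation up to its ancestors or down to its descendants yields one connected discussion thread.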

Predictive Performance
Measure                      Y0 (U)    Y0 (C)    Y1 (U)    Y1 (C)    Y1*(C)
Correctly Classified (RT)    75.0 %    82.9 %    75.7 %    88.6 %    90.1 %
Correctly Classified (CV)    44.6 %    72.9 %    36.6 %    75.2 %    80.2 %
Kappa Statistic (RT)         0.6015    0.5914    0.6259    0.7183    0.7345
Kappa Statistic (CV)         0.1382    0.386     0.0201    0.3863    0.4757
Number of leaves             29        13        36        11        7
Size of tree                 57        25        71        21        13
Peer Review Moodle Compendium Platform
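The difference between the RT and CV accuracy rows gives a rough indication of how much each tree overfits. A minimal sketch, assuming RT is the in-sample figure and CV the out-of-sample (cross-validated) figure, as the "In-sample" and "Out-of-sample" labels on the overfitting slide suggest. The values are typed in from the table and the function name `accuracy_gap` is hypothetical.

```python
# Percent correctly classified, copied from the Predictive Performance table.
# Column labels are kept exactly as they appear on the slide.
accuracy_rt = {"Y0 (U)": 75.0, "Y0 (C)": 82.9, "Y1 (U)": 75.7, "Y1 (C)": 88.6, "Y1*(C)": 90.1}
accuracy_cv = {"Y0 (U)": 44.6, "Y0 (C)": 72.9, "Y1 (U)": 36.6, "Y1 (C)": 75.2, "Y1*(C)": 80.2}


def accuracy_gap(rt, cv):
    """Return RT minus CV accuracy in percentage points, rounded to one decimal."""
    return {model: round(rt[model] - cv[model], 1) for model in rt}


gaps = accuracy_gap(accuracy_rt, accuracy_cv)
for model, gap in gaps.items():
    print(f"{model}: {gap:.1f} percentage points")
```

On these figures, the (U) models lose 30 to 39 percentage points out of sample, while the (C) models lose roughly 10 to 13, and their trees are also much smaller.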

Overfitting problems
● In-sample
=== Confusion Matrix ===
  a  b  c  d   <-- classified as
 13  1  4  0 |  a = Excellent
  1 73  8  0 |  b = Fail
  2 18 57  0 |  c = Guess
  4  7  4 10 |  d = Pass
Correctly Classified Instances     153    75.7426 %
Incorrectly Classified Instances    49    24.2574 %
Kappa statistic                      0.6259
● Out-of-sample
=== Confusion Matrix ===
  a  b  c  d   <-- classified as
  1 14  2  1 |  a = Excellent
  9 41 26  6 |  b = Fail
  4 38 30  5 |  c = Guess
  2 14  7  2 |  d = Pass
Correctly Classified Instances      74    36.6337 %
Incorrectly Classified Instances   128    63.3663 %
Kappa statistic                      0.0201
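The accuracy and kappa figures can be recomputed directly from the two confusion matrices. The sketch below uses the standard definition of Cohen's kappa, (observed agreement minus chance agreement) divided by (1 minus chance agreement). The function name `accuracy_and_kappa` is an assumption made for illustration, and the matrices are typed in from the slide. Rows are taken to be actual classes and columns predicted classes, although both statistics come out the same if the matrix is transposed.

```python
def accuracy_and_kappa(matrix):
    """Return (accuracy, Cohen's kappa) for a square confusion matrix."""
    total = sum(sum(row) for row in matrix)
    # Observed agreement: share of instances on the diagonal.
    observed = sum(matrix[i][i] for i in range(len(matrix))) / total
    # Chance agreement: based on the products of matching row and column totals.
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(col) for col in zip(*matrix)]
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total**2
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# Class order: Excellent, Fail, Guess, Pass (as on the slide).
in_sample = [
    [13, 1, 4, 0],
    [1, 73, 8, 0],
    [2, 18, 57, 0],
    [4, 7, 4, 10],
]
out_of_sample = [
    [1, 14, 2, 1],
    [9, 41, 26, 6],
    [4, 38, 30, 5],
    [2, 14, 7, 2],
]

for label, matrix in [("In-sample", in_sample), ("Out-of-sample", out_of_sample)]:
    acc, kappa = accuracy_and_kappa(matrix)
    print(f"{label}: accuracy = {acc:.4%}, kappa = {kappa:.4f}")
```

Run on the slide's matrices, this gives 75.7426 % with kappa 0.6259 in-sample and 36.6337 % with kappa 0.0201 out-of-sample, in line with the reported values. The out-of-sample kappa near zero means the tree does barely better than chance on unseen data.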

Year 0 (Corrected)
[Decision tree figure; node labels: active, non-active; male, female; bachelor, prep. progr.]

Year 1 (Corrected)
[Decision tree figure; node labels: active, non-active, drop-out; female, male; prep. progr., bachelor]