CAL'09 - Brighton - Patrick Wessa Designing Statistical Learning Environments with Educational Compendium Technology

Page 1: Designing Statistical Learning Environments with

CAL'09 - Brighton - Patrick Wessa

Designing Statistical Learning Environments with Educational Compendium Technology

Page 2: Designing Statistical Learning Environments with

Outline

● Technology
● Reproducible Computing
● Compendium
● Compendium Platform
● Applications
● Design of SLE
● Empirical Findings
● Building Guidelines
● Educational Research
● Educational Quality Control

Page 3: Designing Statistical Learning Environments with

Claerbout's principle*

● An article about computational science in a scientific publication is not the scholarship itself, it is merely advertising of the scholarship. The actual scholarship is the complete software development environment and the complete set of instructions that generated the figures.

*Source: Jan de Leeuw

Page 4: Designing Statistical Learning Environments with

My question

● If academic statisticians find it hard (if not impossible) to verify or review the results in empirical papers, how could we possibly expect students to learn from statistical results without the proper tools to easily review, verify, or challenge them?

Page 5: Designing Statistical Learning Environments with

Reproducible Computing

Page 6: Designing Statistical Learning Environments with

Reproducible Computing

● http://www.freestatistics.org

● http://www.wessa.net

● Wessa, P., “A framework for statistical software development, maintenance, and publishing within an open-access business model”, Computational Statistics, Springer Verlag, 2008

● Wessa, P., “Reproducible Computing: a new Technology for Statistics Education and Educational Research”, IAENG Transactions on Engineering Technologies, Volume II, American Institute of Physics, 2009, forthcoming

Page 7: Designing Statistical Learning Environments with

Setting up the course

Page 8: Designing Statistical Learning Environments with

A framework for statistical software development, maintenance, and publishing within an open-access business model, 2008, Computational Statistics, Springer

Page 9: Designing Statistical Learning Environments with

Computations are “blogged”

Page 10: Designing Statistical Learning Environments with

Error messages

Page 11: Designing Statistical Learning Environments with

Error messages

Page 12: Designing Statistical Learning Environments with

Weekly assignments

Learning Statistics based on the Compendium and Reproducible Computing, Proceedings of the World Congress on Engineering and Computer Science 2008, ISBN: 978-988-98671-0-2, UC Berkeley, San Francisco, USA

Page 13: Designing Statistical Learning Environments with

Snapshot of “Blogged” Computation

Reproduce or Reuse at wessa.net

Cite the computation as follows

Page 14: Designing Statistical Learning Environments with

Social Interaction, Collaboration, Networking, ...

Page 15: Designing Statistical Learning Environments with

Social Networks (“co-opetition”)

Page 16: Designing Statistical Learning Environments with

Fraud detection

Page 17: Designing Statistical Learning Environments with

Feedback (Peer Review)

Submitting Peer Review (feedback) is a good learning activity – not a good grading procedure

Page 18: Designing Statistical Learning Environments with

Lectures

● 13 weeks (semester)

● Week 1: Introduction (explanation) + workshop assignment

● Week 2-12: Workshops + Peer Assessments

● Week 13: Final Exam (multiple choice)

● Grades received from Peers do NOT count => there is no penalty for making mistakes!!

● The quality of feedback messages is graded by the educator

[Timeline diagram: Week 1, Week 2, Week 3, Week 4, ... up to the Exam; rows of lectures (L1 to L5), workshops (WS1 to WS5), and reviews (Rev 1 to Rev 4, ...)]

Page 19: Designing Statistical Learning Environments with

Problem: separate threads of discussion

[Timeline diagram: Week 1, Week 2, Week 3, Week 4, ... up to the Exam; rows of lectures (L1 to L5), workshops (WS1 to WS5), and reviews (Rev 1 to Rev 4, ...)]

Page 20: Designing Statistical Learning Environments with

Connected threads of discussion

[Diagram: Computation 1, Computation 2, Computation 3, Computation 4, Computation 5, Computation 6]

Page 21: Designing Statistical Learning Environments with

4 cohorts, 2 years

Year 0    Bachelor    Prep. Progr.
Female    58          53
Male      53          76

Year 1    Bachelor    Prep. Progr.
Female    41          45
Male      42          74

[Axis label: time]

Page 22: Designing Statistical Learning Environments with

[Figure; axis label: time]

Page 23: Designing Statistical Learning Environments with

Double Hierarchical Structure

Page 24: Designing Statistical Learning Environments with

Integrated Design

● Statistical Computation = Core Object of Study
● Statistical Computation = Core IT Object

=>

● Communication (peer review) should be a function of the Computation
● Hierarchical Parent-Child relationships between computations are maintained & can be browsed
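The integrated design can be illustrated with a small data-structure sketch. This is an assumption-laden illustration only, not the Compendium Platform's actual schema: the class name `Computation` and the methods `derive`, `add_review`, `ancestors`, and `thread` are hypothetical. It shows peer-review messages stored on the computation itself, with parent-child links that can be browsed in both directions.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional


@dataclass(eq=False)
class Computation:
    """One archived computation (hypothetical model, not the platform's schema)."""
    title: str
    parent: Optional["Computation"] = field(default=None, repr=False)
    children: List["Computation"] = field(default_factory=list)
    reviews: List[str] = field(default_factory=list)

    def derive(self, title: str) -> "Computation":
        """Create a child computation that reuses this one as its parent."""
        child = Computation(title=title, parent=self)
        self.children.append(child)
        return child

    def add_review(self, message: str) -> None:
        """Attach a peer-review message directly to this computation."""
        self.reviews.append(message)

    def ancestors(self) -> Iterator["Computation"]:
        """Walk upwards from the parent to the root computation."""
        node = self.parent
        while node is not None:
            yield node
            node = node.parent

    def thread(self) -> List[str]:
        """Collect reviews of this computation and all descendants, depth first."""
        messages = [f"{self.title}: {m}" for m in self.reviews]
        for child in self.children:
            messages.extend(child.thread())
        return messages


root = Computation("Computation 1")
second = root.derive("Computation 2")
third = second.derive("Computation 3")
root.add_review("Check the outlier.")
third.add_review("Histogram bins look too wide.")

print([c.title for c in third.ancestors()])
print(root.thread())
```

Because reviews hang off the computation rather than a separate forum, browsing from the root yields one connected thread of discussion.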

Page 25: Designing Statistical Learning Environments with

Predictive Performance

                             Y0 (U)    Y0 (C)    Y1 (U)    Y1 (C)    Y1*(C)
Correctly Classified (RT)    75.0 %    82.9 %    75.7 %    88.6 %    90.1 %
Correctly Classified (CV)    44.6 %    72.9 %    36.6 %    75.2 %    80.2 %
Kappa Statistic (RT)         0.6015    0.5914    0.6259    0.7183    0.7345
Kappa Statistic (CV)         0.1382    0.386     0.0201    0.3863    0.4757
Number of leaves             29        13        36        11        7
Size of tree                 57        25        71        21        13

Peer Review    Moodle    Compendium Platform
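One way to read the table is to look at the gap between the RT and CV rows. The sketch below assumes that RT is the in-sample figure and CV the out-of-sample figure, which the confusion matrices on the next slide appear to support. The variable and function names are hypothetical. The numbers are copied from the "Correctly Classified" rows.

```python
# Percent correctly classified, copied from the table (RT and CV rows).
in_sample = {"Y0 (U)": 75.0, "Y0 (C)": 82.9, "Y1 (U)": 75.7, "Y1 (C)": 88.6, "Y1*(C)": 90.1}
out_of_sample = {"Y0 (U)": 44.6, "Y0 (C)": 72.9, "Y1 (U)": 36.6, "Y1 (C)": 75.2, "Y1*(C)": 80.2}


def accuracy_gap(rt, cv):
    """Return the RT minus CV difference, in percentage points, per model."""
    return {name: round(rt[name] - cv[name], 1) for name in rt}


gaps = accuracy_gap(in_sample, out_of_sample)
for name, gap in gaps.items():
    print(f"{name}: {gap:.1f} percentage points")
```

A large gap suggests that the tree fits the training data much better than unseen data.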

Page 26: Designing Statistical Learning Environments with

Overfitting problems

● In-sample

=== Confusion Matrix ===

  a  b  c  d   <-- classified as
 13  1  4  0 |  a = Excellent
  1 73  8  0 |  b = Fail
  2 18 57  0 |  c = Guess
  4  7  4 10 |  d = Pass

Correctly Classified Instances         153               75.7426 %
Incorrectly Classified Instances        49               24.2574 %
Kappa statistic                          0.6259

● Out-of-sample

=== Confusion Matrix ===

  a  b  c  d   <-- classified as
  1 14  2  1 |  a = Excellent
  9 41 26  6 |  b = Fail
  4 38 30  5 |  c = Guess
  2 14  7  2 |  d = Pass

Correctly Classified Instances          74               36.6337 %
Incorrectly Classified Instances       128               63.3663 %
Kappa statistic                          0.0201
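The summary statistics under each confusion matrix can be recomputed from the matrix cells. The sketch below is a minimal illustration of the standard Cohen's kappa formula, not the code that produced the slide. The function name `accuracy_and_kappa` is hypothetical, and rows are assumed to be the actual classes and columns the predicted classes.

```python
from typing import List, Tuple


def accuracy_and_kappa(matrix: List[List[int]]) -> Tuple[float, float]:
    """Return (observed accuracy, Cohen's kappa) for a square confusion matrix."""
    n = sum(sum(row) for row in matrix)
    # Observed agreement: share of instances on the diagonal.
    observed = sum(matrix[i][i] for i in range(len(matrix))) / n
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(col) for col in zip(*matrix)]
    # Agreement expected by chance, from the row and column marginals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / (n * n)
    return observed, (observed - expected) / (1 - expected)


# Cell counts copied from the two matrices on the slide.
in_sample = [[13, 1, 4, 0], [1, 73, 8, 0], [2, 18, 57, 0], [4, 7, 4, 10]]
out_of_sample = [[1, 14, 2, 1], [9, 41, 26, 6], [4, 38, 30, 5], [2, 14, 7, 2]]

for name, m in (("in-sample", in_sample), ("out-of-sample", out_of_sample)):
    acc, kappa = accuracy_and_kappa(m)
    print(f"{name}: accuracy = {acc:.4%}, kappa = {kappa:.4f}")
```

Run on the slide's counts, this should reproduce the accuracy and kappa values printed under each matrix. An out-of-sample kappa near zero means agreement barely better than chance.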

Page 27: Designing Statistical Learning Environments with

Year 0 (Corrected)

Page 28: Designing Statistical Learning Environments with

Year 0 (Corrected)

[Figure labels: active, non-active, male, male, female, bachelor, prep. progr.]

Page 29: Designing Statistical Learning Environments with

Year 1 (Corrected)

Page 30: Designing Statistical Learning Environments with

Year 1 (Corrected)

[Figure labels: active, non-active, drop-out, female, male, prep. progr., prep. progr., bachelor, bachelor]