The Panini-Test · people.inf.ethz.ch/jaggim/meetup/4/slides/ml-meetup-4-stekhoven.pdf
TRANSCRIPT
The Panini-Test
Daniel J. Stekhoven
CEO Quantik AG
Statistician
JT Rodgers et al. Nature 000, 1-4 (2014) doi:10.1038/nature13255
Activation of mTORC1 is necessary and sufficient for the alert
phenotype.
16 days ‘til…
Copyright ©1994 - 2014 FIFA.
And the most important thing before the world cup starts…
© 1997-2014 Panini SpA
410 stickers
1 blister = 5 stickers
1 box = 50 blisters = 250 stickers
?
20 Minuten – Panini-Box – 21 March 2014
Gut feeling and hypotheses
Complete box: not so many doubles
Single blisters bought at different stores: many doubles
«Null hypothesis»:
Stickers are filled randomly into the boxes
Alternative hypothesis:
Stickers are filled systematically into the boxes, such that not many doubles are present
«Null», because there is no system behind the filling of the boxes
How can we decide between these two hypotheses?
Hypothesis test
I bought a box with 250 stickers and could stick 242 of them into an empty album (410 possible pictures).
If we assume that the null hypothesis is true:
Is it plausible that I could stick 242 pictures into the album?
Do the null hypothesis «randomly filled boxes» and the event «242 stickers at once» fit together?
Problem: What is «normal»?
• If I was able to put many more stickers than «normal» into the album, then the boxes were probably not filled at random
• If we assume that the null hypothesis is true – how many stickers can we put into an album normally?
• Level of significance: How «abnormal» does an observation have to be before we no longer believe the null hypothesis? – e.g. 1/1’000’000: we reject the null hypothesis if we observe something that is less probable than 1/1’000’000
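As a rough yardstick for «normal», the expected number of different stickers in one box can be worked out by hand. This derivation is not on the slides; it assumes the model stated in the summary (250 independent draws with replacement, all 410 stickers equally likely).

```latex
% Probability that one particular sticker does not appear among the 250 draws
\[
P(\text{sticker } i \text{ is missing}) = \left(\frac{409}{410}\right)^{250} \approx 0.543
\]
% Expected number of different stickers in the box (linearity of expectation)
\[
E[\text{different stickers}] = 410 \left(1 - \left(\frac{409}{410}\right)^{250}\right) \approx 187
\]
```

Under this assumption a randomly filled box yields about 187 usable stickers on average, so 242 is far above what would be expected.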
Solution: computer simulation
Simulated album   Number of stickers
1                 186
2                 192
1 mio             193
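A simulation of this kind can be sketched in a few lines of Python. This is a minimal sketch under the null hypothesis, not the code used for the talk: the names `simulate_album` and `simulate_albums` and the seed are hypothetical, and each box is modelled as 250 independent draws with replacement from 410 equally likely stickers.

```python
import random

N_STICKERS = 410  # different stickers in the album (from the slides)
BOX_SIZE = 250    # stickers in one box: 50 blisters of 5 (from the slides)


def simulate_album(rng, n_stickers=N_STICKERS, box_size=BOX_SIZE):
    """Return how many different stickers one randomly filled box contains."""
    draws = rng.choices(range(n_stickers), k=box_size)  # draws with replacement
    return len(set(draws))  # doubles collapse, only different stickers count


def simulate_albums(n_albums, seed=0):
    """Simulate n_albums boxes and return the sticker count for each album."""
    rng = random.Random(seed)  # fixed seed (an assumption) for reproducibility
    return [simulate_album(rng) for _ in range(n_albums)]


if __name__ == "__main__":
    # The talk used 1 mio albums; fewer are used here to keep the run short.
    counts = simulate_albums(10_000)
    print("largest simulated count:", max(counts))
    print("albums with at least 242 stickers:", sum(c >= 242 for c in counts))
```

Raising `n_albums` to 1’000’000 mirrors the setup on the slides, at the cost of a noticeably longer run in pure Python.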
Result of the computer simulation
[Histogram: number of albums (y-axis) against number of stickers (x-axis)]
How «abnormal» is our observation?
[Histogram: number of albums (y-axis) against number of stickers (x-axis)]
Conclusion
• If we assume that the stickers are filled into the boxes at random:
– The probability of being able to put 242 stickers from a single box into a new album is less than 1/1’000’000!
Our observation and the simulation (the null hypothesis world) do not fit together!
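A simulation with 1 mio runs can only show that the event did not occur in those runs. As a cross-check that is not part of the talk, the distribution of the number of different stickers can also be computed exactly, draw by draw. The sketch below assumes the same model (250 draws with replacement from 410 equally likely stickers); the function name `distinct_sticker_distribution` is hypothetical.

```python
def distinct_sticker_distribution(n_stickers=410, box_size=250):
    """Return probs, where probs[d] = P(exactly d different stickers in the box)."""
    probs = [0.0] * (box_size + 1)
    probs[0] = 1.0  # before the first draw there are no stickers yet
    for _ in range(box_size):
        nxt = [0.0] * (box_size + 1)
        for d, p in enumerate(probs):
            if p == 0.0:
                continue  # state not reachable (or negligible) at this point
            nxt[d] += p * d / n_stickers                     # drew a double
            nxt[d + 1] += p * (n_stickers - d) / n_stickers  # drew a new sticker
        probs = nxt
    return probs


if __name__ == "__main__":
    probs = distinct_sticker_distribution()
    tail = sum(probs[242:])  # P(at least 242 different stickers in one box)
    print("P(at least 242 different stickers):", tail)
    print("below 1/1'000'000:", tail < 1e-6)
```

The tail probability from this calculation can be compared directly with the level of significance, without relying on the number of simulation runs.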
Stickers are not filled randomly into the boxes
20 Minuten – Panini-Box – 21 March 2014
Summary
1. Model: Draw 250 stickers with replacement from 410 possible stickers
2. Null hypothesis: «stickers are randomly filled into the boxes». Alternative: «stickers are filled in systematically, such that fewer doubles appear»
3. Test statistic: Number of stickers put into a new album when we buy a box of 250 stickers. Distribution of the test statistic under the null hypothesis: computer simulation
4. Level of significance: α = 1/1’000’000
5. Critical region of the test statistic: The computer did not observe more than 211 stickers in one album in 1 mio iterations, so the critical region is K = {212, 213, …, 250}
6. Test decision: The observed value (242) lies within the critical region. The null hypothesis is therefore rejected at the significance level of 1/1’000’000
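Steps 5 and 6 can be written down as a small sketch. The function names `critical_region` and `reject_null` are hypothetical; the numbers 211, 242 and 250 are the ones given in the summary.

```python
def critical_region(max_simulated, box_size=250):
    """Values never reached in the simulation: {max_simulated + 1, ..., box_size}."""
    return range(max_simulated + 1, box_size + 1)


def reject_null(observed, max_simulated, box_size=250):
    """Reject the null hypothesis if the observed value lies in the critical region."""
    return observed in critical_region(max_simulated, box_size)


if __name__ == "__main__":
    region = critical_region(211)     # largest value seen in 1 mio simulated albums
    print(region[0], region[-1])      # 212 250
    print(reject_null(242, 211))      # True
```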
Acknowledgment
• Original idea by Markus Kalisch
• …it’s all about collecting data