Open Data talk at the World Bank
TRANSCRIPT
![Page 1: Open Data talk at the World Bank](https://reader035.vdocuments.us/reader035/viewer/2022062313/55bdb069bb61eb3c3e8b46b9/html5/thumbnails/1.jpg)
Anthony Goldbloom, Kaggle
Making data science a sport
Photo by mikebaird, www.flickr.com/photos/mikebaird
![Page 2: Open Data talk at the World Bank](https://reader035.vdocuments.us/reader035/viewer/2022062313/55bdb069bb61eb3c3e8b46b9/html5/thumbnails/2.jpg)
![Page 3: Open Data talk at the World Bank](https://reader035.vdocuments.us/reader035/viewer/2022062313/55bdb069bb61eb3c3e8b46b9/html5/thumbnails/3.jpg)
![Page 4: Open Data talk at the World Bank](https://reader035.vdocuments.us/reader035/viewer/2022062313/55bdb069bb61eb3c3e8b46b9/html5/thumbnails/4.jpg)
Competitions are judged on objective criteria
Competition Mechanics
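A minimal sketch of how judging on an objective criterion could work: each submission is scored against held-out ground truth and teams are ranked by that score. The metric (RMSE), the function names, and the sample data are all illustrative assumptions; the slides do not say which metric any particular competition uses.

```python
import math


def rmse(predicted, actual):
    """Root mean squared error between two equal-length lists of numbers."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    squared_errors = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared_errors) / len(actual))


def leaderboard(submissions, ground_truth):
    """Score every submission and return (team, score) pairs, best first.

    Lower RMSE is better, so the list is sorted in ascending order.
    """
    scores = {team: rmse(preds, ground_truth) for team, preds in submissions.items()}
    return sorted(scores.items(), key=lambda item: item[1])


# Hypothetical held-out answers and two hypothetical submissions.
ground_truth = [3.0, 5.0, 2.0, 8.0]
submissions = {
    "team_a": [2.5, 5.5, 2.0, 7.0],
    "team_b": [3.0, 5.0, 2.5, 8.0],
}

ranking = leaderboard(submissions, ground_truth)
for team, score in ranking:
    print(f"{team}: {score:.3f}")
```

Because the score depends only on the predictions and the hidden answers, the ranking needs no human judgment, which is the point of the slide.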
![Page 5: Open Data talk at the World Bank](https://reader035.vdocuments.us/reader035/viewer/2022062313/55bdb069bb61eb3c3e8b46b9/html5/thumbnails/5.jpg)
![Page 6: Open Data talk at the World Bank](https://reader035.vdocuments.us/reader035/viewer/2022062313/55bdb069bb61eb3c3e8b46b9/html5/thumbnails/6.jpg)
![Page 7: Open Data talk at the World Bank](https://reader035.vdocuments.us/reader035/viewer/2022062313/55bdb069bb61eb3c3e8b46b9/html5/thumbnails/7.jpg)
![Page 8: Open Data talk at the World Bank](https://reader035.vdocuments.us/reader035/viewer/2022062313/55bdb069bb61eb3c3e8b46b9/html5/thumbnails/8.jpg)
“In less than a week, Martin O’Leary, a PhD student in glaciology, outperformed the state-of-the-art algorithms”
“The world’s brightest physicists have been working for decades on solving one of the great unifying problems of our universe”
Kaggle’s Dark Matter Competition on the White House blog
![Page 9: Open Data talk at the World Bank](https://reader035.vdocuments.us/reader035/viewer/2022062313/55bdb069bb61eb3c3e8b46b9/html5/thumbnails/9.jpg)
![Page 10: Open Data talk at the World Bank](https://reader035.vdocuments.us/reader035/viewer/2022062313/55bdb069bb61eb3c3e8b46b9/html5/thumbnails/10.jpg)
User base: 60,000 data scientists
![Page 11: Open Data talk at the World Bank](https://reader035.vdocuments.us/reader035/viewer/2022062313/55bdb069bb61eb3c3e8b46b9/html5/thumbnails/11.jpg)
Our User Base
![Page 12: Open Data talk at the World Bank](https://reader035.vdocuments.us/reader035/viewer/2022062313/55bdb069bb61eb3c3e8b46b9/html5/thumbnails/12.jpg)
• neural networks
• logistic regression
• support vector machine
• decision trees
• ensemble methods
• AdaBoost
• Bayesian networks
• genetic algorithms
• random forest
• Monte Carlo methods
• principal component analysis
• Kalman filter
• evolutionary fuzzy modeling
Users apply different techniques
![Page 13: Open Data talk at the World Bank](https://reader035.vdocuments.us/reader035/viewer/2022062313/55bdb069bb61eb3c3e8b46b9/html5/thumbnails/13.jpg)
EXAMPLE ESSAY QUESTION —
We all understand the benefits of laughter. For example, someone once said, “Laughter is the shortest distance between two people.”
Many other people believe that laughter is an important part of any relationship. Tell a true story in which laughter was one element or part.
![Page 14: Open Data talk at the World Bank](https://reader035.vdocuments.us/reader035/viewer/2022062313/55bdb069bb61eb3c3e8b46b9/html5/thumbnails/14.jpg)
“Have you ever experienced a time with your friends or family where you laughed so hard your stomach hurt, and your eyes were filled with tears? Laughing is something every person needs.
A great laugh can make a persons day and put a smile on their face. If no one laughed the world would be a terribly sad place. My friends and I are always laughing, to the point where were rolling on the ground, clutching our stomachs laughing.”
Automated results by the winning algorithm are as reliable as manual assessment by teachers.
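One way to quantify how closely automated scores match teacher scores is an agreement statistic. The sketch below uses quadratic weighted kappa, a common measure for ordinal ratings. The slide does not say how reliability was measured, so the choice of metric, the function name, and the sample scores are assumptions for illustration only.

```python
def quadratic_weighted_kappa(rater_a, rater_b, min_score, max_score):
    """Agreement between two lists of integer scores.

    Returns 1.0 for perfect agreement and about 0.0 for chance-level
    agreement. Disagreements are penalised by the squared score distance.
    """
    n = max_score - min_score + 1
    total = len(rater_a)

    # observed[i][j] counts essays scored i by rater A and j by rater B.
    observed = [[0] * n for _ in range(n)]
    for a, b in zip(rater_a, rater_b):
        observed[a - min_score][b - min_score] += 1

    hist_a = [sum(row) for row in observed]
    hist_b = [sum(observed[i][j] for i in range(n)) for j in range(n)]

    numerator = 0.0
    denominator = 0.0
    for i in range(n):
        for j in range(n):
            weight = (i - j) ** 2 / (n - 1) ** 2 if n > 1 else 0.0
            expected = hist_a[i] * hist_b[j] / total
            numerator += weight * observed[i][j]
            denominator += weight * expected

    # If no disagreement is possible by chance, treat agreement as perfect.
    if denominator == 0:
        return 1.0
    return 1.0 - numerator / denominator


# Hypothetical scores on a 1-4 scale for four essays.
teacher_scores = [1, 2, 3, 4]
model_scores = [1, 2, 3, 3]

kappa = quadratic_weighted_kappa(teacher_scores, model_scores, 1, 4)
print(f"agreement: {kappa:.3f}")
```

A claim such as "as reliable as manual assessment" would typically rest on comparing model-teacher agreement with teacher-teacher agreement on the same essays.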
![Page 15: Open Data talk at the World Bank](https://reader035.vdocuments.us/reader035/viewer/2022062313/55bdb069bb61eb3c3e8b46b9/html5/thumbnails/15.jpg)
![Page 16: Open Data talk at the World Bank](https://reader035.vdocuments.us/reader035/viewer/2022062313/55bdb069bb61eb3c3e8b46b9/html5/thumbnails/16.jpg)
![Page 17: Open Data talk at the World Bank](https://reader035.vdocuments.us/reader035/viewer/2022062313/55bdb069bb61eb3c3e8b46b9/html5/thumbnails/17.jpg)
Probability of going to hospital in the next six months
Chart categories: Diabetes; & Obesity; & Hypertension; & High Cholesterol
![Page 18: Open Data talk at the World Bank](https://reader035.vdocuments.us/reader035/viewer/2022062313/55bdb069bb61eb3c3e8b46b9/html5/thumbnails/18.jpg)
RTA Competition: Travel Time Prediction
![Page 19: Open Data talk at the World Bank](https://reader035.vdocuments.us/reader035/viewer/2022062313/55bdb069bb61eb3c3e8b46b9/html5/thumbnails/19.jpg)
Boehringer Ingelheim Competition: Data
| Mutates | Molecule | +1700 fields |
|---|---|---|
| True | Molecule2 | |
| False | Molecule3 | |
| True | Molecule4 | |
| True | Molecule5 | |
…
True 0
![Page 20: Open Data talk at the World Bank](https://reader035.vdocuments.us/reader035/viewer/2022062313/55bdb069bb61eb3c3e8b46b9/html5/thumbnails/20.jpg)
Is it a lemon?