Data Science
Svet Ivantchev, eFaberUniEE, March 7, 2012
The sexy job of the decade?
“I keep saying the sexy job in the next ten years will be statisticians. People think I'm joking, but who would've guessed that computer engineers would've been the sexy job of the 1990s?”
Hal Varian, The McKinsey Quarterly, January 2009
http://www.dataists.com/2010/09/the-data-science-venn-diagram/
http://www.mymodernmet.com/profiles/blogs/stephen-wildish-clever-venn-diagrams
http://www.mymodernmet.com/profiles/blogs/stephen-wildish-clever-venn-diagrams
We will talk about:
• Data presentation
• Machine learning
• Statistics
• Big Data
Data presentation
Four data “sets” with the same summary “measures”
Anscombe, F. J. (1973), Graphs in Statistical Analysis, The American Statistician, 27(1), pp. 17-21.
The same averages presented graphically
http://en.wikipedia.org/wiki/Anscombe%27s_quartet
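The point of the quartet can be checked numerically. The sketch below is a minimal illustration, not part of the original slides: it assumes the quartet values as they are commonly reproduced from Anscombe's paper, and the helper names `pearson` and `summarize` are made up for this example. It computes the usual summary measures for each set; they should come out nearly identical even though the four plots look nothing alike.

```python
from statistics import mean, variance

# Anscombe's quartet, as commonly reproduced from the 1973 paper.
# Sets I-III share the same x values.
x123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
quartet = {
    "I": (x123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (x123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (x123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def pearson(xs, ys):
    """Pearson correlation coefficient, written out by hand (hypothetical helper)."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)


def summarize(xs, ys):
    """Summary measures that turn out to be (almost) the same for all four sets."""
    return {
        "mean_x": mean(xs),
        "mean_y": mean(ys),
        "var_x": variance(xs),  # sample variance
        "var_y": variance(ys),
        "r": pearson(xs, ys),
    }


summaries = {name: summarize(xs, ys) for name, (xs, ys) in quartet.items()}
for name, s in summaries.items():
    print(name, {k: round(v, 2) for k, v in s.items()})
```

Plotting each set as a scatter plot is what reveals the differences that these numbers hide.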
E.g.: Cholera epidemic in London
• year 1854
• August 19 to September 29: 616 dead
• it takes 2.5 weeks to discover the cause
Meaningless comparisons
Solar radiation and the stock market
The context
NYT and the debt of countries
http://www.nytimes.com/interactive/2011/10/23/sunday-review/an-overview-of-the-euro-crisis.html
Statistics 101
Machine Learning
Development of algorithms and methods that allow computers to “evolve” based on empirical data
Topics and examples
• Classification
• Recommendations
• Clustering (with zip :-) ?)
• Example and relation to compression
• In real life I: data vs algorithms
• In real life II: experience vs methodology
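One way to read the "clustering with zip" item is through the normalized compression distance (NCD): two texts that share structure compress better together than separately. The slides do not say which method was shown, so the sketch below is an assumption. It uses `zlib` from the standard library in place of zip, and the function names and sample texts are made up for illustration.

```python
import zlib


def compressed_size(data: bytes) -> int:
    """Size in bytes of the zlib-compressed input (level 9)."""
    return len(zlib.compress(data, 9))


def ncd(a: bytes, b: bytes) -> float:
    """Normalized compression distance: small when a and b share structure."""
    ca, cb = compressed_size(a), compressed_size(b)
    cab = compressed_size(a + b)
    return (cab - min(ca, cb)) / max(ca, cb)


# Illustrative sample texts (not from the original slides).
english = (b"Statistics is the study of the collection, analysis, "
           b"interpretation and presentation of data. Machine learning "
           b"builds methods that learn from data to improve performance.")
spanish = (b"La estadistica estudia la recogida, el analisis, la "
           b"interpretacion y la presentacion de los datos. El aprendizaje "
           b"automatico desarrolla metodos que aprenden de los datos.")

print("english vs english:", round(ncd(english, english), 3))
print("english vs spanish:", round(ncd(english, spanish), 3))
```

A matrix of pairwise NCD values can then be fed to any standard clustering method. On short texts the distances are noisy, so this works best with longer documents.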
Related: search
• Idea of TF-IDF, tf(t, d) * idf(t, D)
• Idea of PageRank
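Both ideas fit in a few lines. The sketch below is an illustration under stated assumptions, not the version shown in the talk. For TF-IDF it takes tf as the term count divided by document length and idf as log(N / df), which is one common variant among several. For PageRank it uses plain power iteration with a damping factor of 0.85 and assumes every link target also appears as a key of the `links` dictionary. All names and sample data are hypothetical.

```python
import math
from collections import Counter


def tf_idf(term, doc, corpus):
    """tf(t, d) * idf(t, D), with tf = count / len(d) and idf = log(N / df)."""
    tf = Counter(doc)[term] / len(doc)
    df = sum(1 for d in corpus if term in d)  # documents containing the term
    if df == 0:
        return 0.0
    return tf * math.log(len(corpus) / df)


def pagerank(links, damping=0.85, iterations=50):
    """Power iteration; links maps each node to the list of nodes it links to."""
    nodes = list(links)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Every node receives the "random jump" share first.
        new = {node: (1.0 - damping) / n for node in nodes}
        for node, outs in links.items():
            if outs:
                share = damping * rank[node] / len(outs)
                for target in outs:
                    new[target] += share
            else:
                # Dangling node: spread its rank evenly over all nodes.
                for target in nodes:
                    new[target] += damping * rank[node] / n
        rank = new
    return rank


# Illustrative data (not from the original slides).
corpus = [
    ["data", "science", "is", "fun"],
    ["big", "data", "tools"],
    ["data", "visualization"],
]
print(tf_idf("data", corpus[0], corpus))     # term present in every document
print(tf_idf("science", corpus[0], corpus))  # term present in one document

links = {"a": ["b"], "b": ["c"], "c": ["a"], "d": ["a"]}
ranks = pagerank(links)
print({k: round(v, 3) for k, v in ranks.items()})
```

A term that appears in every document gets weight zero, which is exactly the intent of the idf factor. In the link graph, the page nobody links to ends up with the lowest rank.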
Danger: using without understanding
Another (better) example
Your own data
• With an Android phone
• From the car
With an Android phone
OBD II
GPS speed
Engine RPM
Acc pedal pos
Fuel flow
CO2
Recap
• Visualization
• Managing large amounts of data
• Mathematical methods and statistics