Page 1: Introduction to Data Science and Data Visualization July ...disi.unal.edu.co/~gjhernandezp/datascience/talks/Introductionto... · Introduction to Data Science and Data Visualization

Introduction to Data Science and Data Visualization

German Hernandez

July 5th to 29th of 2016


Agenda

1. What is data science?

2. From data visualization to data analytics

3. The beauty of data visualization

4. Big Data University: Introduction to Data Analysis with Demos

5. Embracing the uncertainty: the New Machine Intelligence - Classical artificial intelligence vs learning from data (machine learning – statistical learning)


1. What is data science?


Brad Schumitsch, former Data Scientist at Twitch, in https://blog.mixpanel.com/2016/03/30/this-is-the-difference-between-statistics-and-data-science (Justin Megahan)


“I think data-scientist is a sexed up term for a statistician.” Nate Silver, at the 2013 Joint Statistical Meetings.

"Harvard Business Review recently called data science 'The Sexiest Job of the 21st Century.'" Murtaza Haider, Big Data University.

There is certainly no lack of demand for data scientists. A few months ago, Glassdoor named it the top job of 2016, with more than 1,700 job openings and an average salary of $116k.

“Don’t get me started on data scientists,” “99% of the applicants are not actually data scientists,” “They can’t do what we need.” CTO


Today’s data scientists are an eclectic mix of economists, physicists, and mathematicians. Oddballs who by some series of events and education happen to be both skilled engineers and number crunchers.


2. From data visualization to data analytics


• From Data Visualization to Data Analytics, J. E. Ramirez-Marquez, Stevens (pdf)

• Visualize This https://www.youtube.com/watch?v=mkEXx7sDXAI

• Pizza place geography http://flowingdata.com/2013/10/14/pizza-place-geography/

• Milestones in the history of thematic cartography, statistical graphics, and data visualization. Michael Friendly, 2009. http://www.math.yorku.ca/SCS/Gallery/milestone/milestone.pdf

• Charles Minard's 1869 chart showing the number of men in Napoleon’s 1812 Russian campaign army, their movements, and the temperature they encountered on the return path. https://en.wikipedia.org/wiki/File:Minard.png

• The Visual Display of Quantitative Information, Edward R. Tufte, 2001. https://www.amazon.com/Visual-Display-Quantitative-Information/dp/0961392142

• Rock 'N' Roll is Here to Pay: The History and Politics of the Music Industry, Steve Chapple and Reebee Garofalo, 1977. http://followtheyellowbricks.com/wp-content/uploads/2014/05/0002N6-2903.gif


3. The beauty of data visualization


• Top 10 TED Talks for the Data Scientists http://www.kdnuggets.com/2016/02/top-10-tedtalks-data-scientists.html

• TED The beauty of data visualization David McCandless turns complex data sets (like worldwide military spending, media buzz, Facebook status updates) into beautiful, simple diagrams that tease out unseen patterns and connections. Good design, he suggests, is the best way to navigate information glut — and it may just change the way we see the world. http://www.ted.com/talks/david_mccandless_the_beauty_of_data_visualization


4. Big Data University: Introduction to Data Analysis with Demos


• Watson (computer) Wikipedia

• This is Jeopardy! on 12/05/2013 YouTube

• IBM's Watson Supercomputer Destroys Humans in Jeopardy YouTube

• IBM Watson is a technology platform that uses natural language processing and machine learning to reveal insights from large amounts of unstructured data. http://www.ibm.com/watson/what-is-watson.html

• IBM Watson Analytics: Analytics made easy Predictive analytics and data visualization built for you. Analyze your data in minutes on your own without downloading software. https://watson.analytics.ibmcloud.com.

• Beauty and the Labor Market, Daniel S. Hamermesh and Jeff E. Biddle. The American Economic Review, Vol. 84, No. 5, (Dec., 1994), pp. 1174-1194 https://wiwi.uni-paderborn.de/fileadmin/dep1ls6/Research/Beauty_and_the_Labor_Market_Hamermesh_Biddle.pdf


• Homework Assignment 4, Carlos M. Carvalho, Statistics – Texas MBA, McCombs School of Business, Problem 1: Beauty Pays! http://faculty.mccombs.utexas.edu/carlos.carvalho/teaching/HW4Core_Solutions.pdf

• BeautyData.csv from Daniel Hamermesh - UT’s economics department - research about the impact of beauty in labor income at http://www.rob-mcculloch.org/data/index.html

• Data Scientist Workbench: Open Data Science Made Easy! https://datascientistworkbench.com

• Big Data University: An IBM community initiative to provide education on big data, data science and analytic technologies from experts using hands-on exercises and interactive videos. It’s completely free. http://bigdatauniversity.com//

• Big Data University: Introduction to Data Analysis with Demos http://bigdatauniversity.com/moodle/course/view.php?id=882

• Big Data University: Getting Started with Data Science (BETA) http://bigdatauniversity.com/moodle/mod/page/view.p


5. Embracing the uncertainty: the New Machine Intelligence - Classical artificial intelligence vs learning from data (machine learning – statistical learning)




Homework

Due next Monday

1. Obtain the Certificate of Big Data University - Introduction to Data Analysis with Demos.

2. Complete the tutorial in Watson Analytics. To reach the tutorial, click on ? in the top menu and follow Getting Started.

3. Perform the analysis presented in class of the BeautyDataModified.csv data in Watson Analytics; name and save the Discovery Set as Beauty Analysis.

4. Perform an analysis of the ExtraMaritalAffairs.csv data in Watson Analytics; name and save the Discovery Set as ExtraMaritalAnalysis.

5. Extra credit: Obtain the Certificate of Big Data University - Watson Analytics Fundamentals