Google Confidential and Proprietary
Tools for open data
Filip Hráček
BigClean, Nov 2012
Big data is hard.
1. Data gathering – hard
2. Data refinement – hard
3. Data analysis – super hard
4. Data sharing – hard
Data gathering
Public Data Explorer
http://www.google.com/publicdata/directory
Fusion Tables (Public)
http://research.google.com/tables
Google Trends
http://www.google.cz/trends
Google Trends Correlate
http://www.google.com/trends/correlate
Google Ngram Viewer
http://books.google.com/ngrams
Data refinement
Open Refine
http://code.google.com/p/google-refine/
Data analysis
Big data.
1. Text editor
2. Excel
3. Local database
Google BigQuery
https://bigquery.cloud.google.com/
Google Fusion Tables
http://www.google.com/fusiontables
http://research.google.com/tables
Data sharing
Google Fusion Tables
http://www.google.com/fusiontables
Google Fusion Tables – in the wild
http://www.guardian.co.uk/news/datablog/interactive/2011/aug/16/riots-poverty-map
Google Docs – in the wild
http://data.blog.ihned.cz/c1-57386250-hledejte-s-nami-fakta-v-projevu-davida-ratha
Big thank you! (for your attention)
Filip Hráček, Google Czech Republic
Links
● Public Data Explorer
  ○ Import public data form
● Google Correlate
  ○ Explanation (Comics)
● Ngram Viewer
  ○ Advanced use