Skillshare - Let's Talk About R in Data Journalism
Post on 19-Aug-2015
Let’s Talk About R
An introduction for Data-driven Journalism
Presented by David Selassie Opoku (@sdopoku)
11 August 2015
Outline
1. TaRget audience
2. About R: What is R?
3. Example Use Case & Best PRactices
4. Setup & RStudio
5. Resources
Target Audience
R is a great tool for anyone who works with data
● Data journalists
● School of Data fellows
● Open Data enthusiasts
● People curious about or new to R
● Statisticians
About R
What is R?
1. Open source
2. A programming language and environment for statistical computing & graphics
3. More than just statistics and graphics
4. Wealth of functionality through add-on packages
5. RStudio: a powerful integrated development environment (IDE)
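To make the "statistical computing" point concrete, here is a minimal sketch that uses only functions shipped with base R. The page-view numbers are invented for illustration and are not from the talk.

```r
# Hypothetical data: daily page views for one week (invented values)
views <- c(120, 135, 98, 160, 142, 175, 110)

avg_views <- mean(views)    # arithmetic mean
med_views <- median(views)  # middle value
spread    <- sd(views)      # sample standard deviation

# Fit a simple linear trend of views against day number
day   <- seq_along(views)
trend <- lm(views ~ day)
slope <- unname(coef(trend)["day"])  # average change in views per day

# Five-number summary plus the mean
print(summary(views))
```

Typing these lines into the R console (or the RStudio editor) runs them one at a time, which is how most exploratory work in R starts.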
R vs. Spreadsheet-like software
1. More powerful data manipulation capabilities
2. It reads many data formats (e.g. CSV, Excel, SAS, SPSS, databases, JSON, XML)
3. Easier automation & faster computation
4. It supports larger data sets
5. Advanced statistics capabilities
6. State-of-the-art graphics with packages such as ggplot2
7. It runs on many platforms
8. Anyone can contribute packages to improve its functionality
See: 14 Reasons Why R is Better Than Excel
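As a rough sketch of points 1 and 3 (data manipulation and automation), the base R snippet below totals a small table by group and wraps a percentage-change calculation in a reusable function. The budget figures, the column names and the helper `pct_change` are all hypothetical, chosen only for this example.

```r
# Hypothetical data: budget lines by region and year (invented values)
budget <- data.frame(
  region = c("North", "North", "South", "South", "East", "East"),
  year   = c(2014, 2015, 2014, 2015, 2014, 2015),
  amount = c(100, 120, 80, 60, 50, 75)
)

# Total per region: one line instead of a hand-built pivot table
totals <- aggregate(amount ~ region, data = budget, FUN = sum)

# Hypothetical helper: percentage change per region between two years.
# Once written, it can be rerun on next year's file without any manual steps.
pct_change <- function(df, from, to) {
  a <- df[df$year == from, c("region", "amount")]
  b <- df[df$year == to, c("region", "amount")]
  m <- merge(a, b, by = "region", suffixes = c("_from", "_to"))
  m$change <- round(100 * (m$amount_to - m$amount_from) / m$amount_from, 1)
  m[, c("region", "change")]
}

changes <- pct_change(budget, 2014, 2015)
print(totals)
print(changes)
```

The same script works unchanged on six rows or six hundred thousand, which is the practical meaning of "easier automation" here.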
Setup R & RStudio
Live Demo of R & RStudio Installation; RStudio Environment
R in the Data pipeline
Popular R Packages in the Data Pipeline
❖ Find & Obtain
➢ Quandl (finance & economics) | foreign (SAS, SPSS) | RODBC, RMySQL, RPostgreSQL, RSQLite (databases) | XLConnect, xlsx (Excel)
➢ Maps: sp, maptools, maps, ggmap
➢ Web: XML, jsonlite, httr
❖ Clean & Verify
➢ dplyr, tidyr (data manipulation) | stringr (regular expressions & strings) | lubridate (dates and times)
❖ Analyze
➢ car, randomForest, glmnet, caret
❖ Visualise
➢ ggplot2, ggvis, rgl, leaflet, htmlwidgets, shiny, googleVis
❖ Report
➢ shiny, R Markdown, xtable, knitr
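The pipeline stages can be sketched end to end in a few lines. To keep the example runnable without installing anything, it uses base R stand-ins rather than the packages listed above (which generally make each step more concise). The CSV text, district names and case counts are invented for illustration.

```r
# Find & Obtain: read a small CSV (inline text stands in for a downloaded file)
raw_csv <- "date,district,cases
2015-01-05, Central ,12
2015-01-19,Central,8
2015-02-02,Coastal,NA
2015-02-16, Coastal,15
2015-02-23,Central,7"
records <- read.csv(text = raw_csv, stringsAsFactors = FALSE)

# Clean & Verify: trim stray spaces, parse dates, drop rows with missing counts
records$district <- trimws(records$district)
records$date     <- as.Date(records$date)
records          <- records[!is.na(records$cases), ]

# Analyze: total cases per district
by_district <- aggregate(cases ~ district, data = records, FUN = sum)

# Visualise: a bar chart written to a temporary PDF file
chart_file <- tempfile(fileext = ".pdf")
pdf(chart_file)
barplot(by_district$cases, names.arg = by_district$district,
        main = "Cases per district")
dev.off()

# Report: a one-sentence summary built from the cleaned data
report <- sprintf("%d cases across %d districts between %s and %s.",
                  sum(records$cases), nrow(by_district),
                  format(min(records$date)), format(max(records$date)))
print(report)
```

In a real project the inline text would be replaced by a file path or URL, and the final step would more likely be an R Markdown document than a printed sentence.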
Example Use Case
Resources
Resources - Individuals & Organisations
1. R Project
2. RStudio
3. Datacamp
4. Hadley Wickham - @hadleywickham
5. R-bloggers
6. Nathan Yau’s Flowing Data Tutorials
Resources - Tutorials, Articles & Books
Article: Data Analysts Captivated by R’s Power
Tutorials & Webinars
1. http://www.r-tutor.com/r-introduction
2. Code School’s Try R
3. 5 data visualizations in 5 minutes: each in 5 lines or less of R
4. RStudio Webinars
Cheatsheets: https://www.rstudio.com/resources/cheatsheets/
Books
1. R Cookbook (O'Reilly Cookbooks) by Paul Teetor
2. R Graphics Cookbook by Winston Chang
3. RStudio List of Training Books
References
1. What is R?
2. Beginner's guide to R: Introduction
3. How SAS, R & SPSS compare [infographic]
4. Comparison of R, Matlab, SciPy, Excel, SAS, SPSS, Stata
5. Garrett Grolemund’s Quick list of useful R packages
6. 14 reasons why R is better than Excel
7. An overview of RStudio Features