Skillshare - Let's talk about R in Data Journalism

Upload: school-of-data

Post on 19-Aug-2015

Page 1: Skillshare - Let's talk about R in Data Journalism

Let’s Talk About R

PRESENTED BY

DAVID SELASSIE OPOKU

@sdopoku

11 August 2015

An introduction for Data-driven Journalism

Page 2: Skillshare - Let's talk about R in Data Journalism

Outline

1. TaRget audience

2. About R: What is R?

3. Example Use Case & Best PRactices

4. Setup & RStudio

5. Resources

Page 3: Skillshare - Let's talk about R in Data Journalism

Target Audience

Page 4: Skillshare - Let's talk about R in Data Journalism

R is a great tool for anyone who works with data

● Data journalists

● School of Data fellows

● Open Data enthusiasts

● People curious about or new to R

● Statisticians

Page 5: Skillshare - Let's talk about R in Data Journalism

About R

Page 6: Skillshare - Let's talk about R in Data Journalism

What is R?

1. Open source

2. Statistical computing & graphics programming language and environment

3. More than just statistics and graphics

4. Wealth of functionality, i.e., packages

5. RStudio: a powerful integrated development environment (IDE)

Page 7: Skillshare - Let's talk about R in Data Journalism

R vs. Spreadsheet-like software

1. More powerful data manipulation capabilities

2. It reads any type of data

3. Easier automation & faster computation

4. It supports larger data sets

5. Advanced statistical capabilities

6. State-of-the-art graphics with packages such as ggplot2

7. It runs on many platforms

8. Anyone can contribute packages to improve its functionality

See: 14 Reasons Why R is Better Than Excel
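
Points 1 and 3 above (data manipulation and automation) can be illustrated with a minimal base R sketch. The data frame `budget` and its values are made up for illustration and do not come from the talk. The grouped total below plays the role of a spreadsheet pivot table, and because it is a script, it can be rerun unchanged whenever the data is updated.

```r
# Hypothetical illustration data: spending per region and year (made up).
budget <- data.frame(
  region = c("North", "North", "South", "South", "East"),
  year   = c(2014, 2015, 2014, 2015, 2015),
  amount = c(120, 150, 90, 110, 60)
)

# Total spending per region, the equivalent of a spreadsheet pivot table.
totals <- aggregate(amount ~ region, data = budget, FUN = sum)

# Sort regions from highest to lowest total.
totals <- totals[order(-totals$amount), ]
print(totals)
```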

Page 8: Skillshare - Let's talk about R in Data Journalism

Setup R & RStudio

Page 9: Skillshare - Let's talk about R in Data Journalism

Live Demo of R & RStudio Installation; RStudio Environment
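
After installation, a quick way to confirm that the setup works is to run a few lines in the RStudio console. The sketch below is not from the live demo. The helper `missing_packages` is a hypothetical name introduced here, and the package list in `wanted` is only an example. It reports the R version and lists which of the wanted packages are not yet installed.

```r
# Report the running R version.
cat(R.version.string, "\n")

# Hypothetical helper: return the names in `pkgs` that are not installed.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

# Example package list (an assumption; adjust to your own needs).
wanted <- c("dplyr", "tidyr", "ggplot2")
to_install <- missing_packages(wanted)

# Uncomment to install anything that is missing (needs internet access).
# if (length(to_install) > 0) install.packages(to_install)
print(to_install)
```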

Page 10: Skillshare - Let's talk about R in Data Journalism

R in the Data pipeline

Page 11: Skillshare - Let's talk about R in Data Journalism

Popular R Packages In The Data pipeline

❖ Find & Obtain

➢ Quandl (finance & economics) | foreign (SAS, SPSS) | RODBC, RMySQL, RPostgreSQL, RSQLite (Databases) | XLConnect, xlsx (Excel)

➢ Maps: sp, maptools, maps, ggmap

➢ Web: XML, jsonlite, httr

❖ Clean & Verify

➢ dplyr, tidyr (data manipulation) | stringr (regular expressions & strings) | lubridate (dates and times)

❖ Analyze

➢ car, randomForest, glmnet, caret

❖ Visualise

➢ ggplot2, ggvis, rgl, leaflet, htmlwidgets, shiny, googleVis

❖ Report

➢ shiny, R Markdown, xtable, knitr
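
The pipeline stages can be sketched end to end in a few lines. This is a minimal illustration under stated assumptions, not the talk's own example. It uses base R only, so it runs without installing anything, where the packages listed for each stage would normally be used. The inline data (cities, dates, case counts) is made up.

```r
# 1. Find & Obtain: read CSV data (inline here; normally a file path or URL).
# The values below are made-up illustration data.
csv_lines <- c(
  "date,city,cases",
  "2015-01-05, Accra ,12",
  "2015-01-19,Accra,18",
  "2015-02-02,Kumasi,7",
  "2015-02-16,kumasi,NA",
  "2015-02-23,Accra,9"
)
raw <- read.csv(text = csv_lines, stringsAsFactors = FALSE)

# 2. Clean & Verify: trim whitespace, normalise capitalisation,
#    parse dates, and drop rows with missing counts.
clean <- raw
clean$city <- trimws(clean$city)
clean$city <- paste0(toupper(substring(clean$city, 1, 1)),
                     tolower(substring(clean$city, 2)))
clean$date <- as.Date(clean$date)
clean <- clean[!is.na(clean$cases), ]

# 3. Analyze: total cases per city.
by_city <- aggregate(cases ~ city, data = clean, FUN = sum)

# 4. Visualise: write a bar chart to a temporary PDF file.
plot_file <- tempfile(fileext = ".pdf")
pdf(plot_file)
barplot(by_city$cases, names.arg = by_city$city, main = "Cases by city")
invisible(dev.off())

# 5. Report: print the summary table.
print(by_city)
```

In practice, each step would map onto the packages listed for that stage, for example dplyr for the cleaning and aggregation and ggplot2 for the chart.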

Page 12: Skillshare - Let's talk about R in Data Journalism

Example Use Case

Page 13: Skillshare - Let's talk about R in Data Journalism
Page 14: Skillshare - Let's talk about R in Data Journalism

Resources