Skillshare - Let's Talk About R in Data Journalism

Post on 19-Aug-2015


Let’s Talk About R

PRESENTED BY

DAVID SELASSIE OPOKU

@sdopoku

11 August 2015

An introduction for Data-driven Journalism

Outline

1. TaRget audience

2. About R: What is R?

3. Example Use Case & Best PRactices

4. Setup & RStudio

5. Resources

Target Audience

R is a great tool for anyone who works with data

● Data journalists

● School of Data fellows

● Open Data enthusiasts

● People curious about or new to R

● Statisticians

About R

What is R?

1. Open source

2. Statistical computing & graphics programming language and environment

3. More than just statistics and graphics

4. Wealth of functionality, i.e. packages

5. RStudio: a powerful integrated development environment (IDE)
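
To make "statistical computing" concrete, here is a minimal sketch that uses only base R and the `mtcars` example data set that ships with R. The choice of data set and of the variables `mpg` and `wt` is an assumption made for illustration; it is not taken from the slides.

```r
# Built-in example data: 32 cars with fuel economy and weight
data(mtcars)

# Descriptive statistics for miles per gallon
mpg_mean <- mean(mtcars$mpg)
mpg_median <- median(mtcars$mpg)

# Simple linear regression: how does weight relate to fuel economy?
fit <- lm(mpg ~ wt, data = mtcars)
slope <- coef(fit)[["wt"]]

cat("Mean mpg:", round(mpg_mean, 2), "\n")
cat("Median mpg:", mpg_median, "\n")
cat("Change in mpg per 1000 lbs of weight:", round(slope, 2), "\n")
```

Typing `summary(fit)` or `plot(mtcars$wt, mtcars$mpg)` at the console afterwards gives the full model summary and a scatter plot.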

R vs. Spreadsheet-like software

1. More powerful data manipulation capabilities

2. It reads any type of data

3. Easier automation & faster computation

4. It supports larger data sets

5. Advanced Statistics capabilities

6. State-of-the-art graphics with packages such as ggplot2

7. It runs on many platforms

8. Anyone can contribute packages to improve its functionality

See: 14 Reasons Why R is Better Than Excel
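
As a rough illustration of points 1 and 3 (data manipulation and automation), the sketch below uses base R only. The `spending` table, its column names and the helper `pct_change` are hypothetical and exist purely for this example.

```r
# Hypothetical data for illustration only
spending <- data.frame(
  region = c("North", "North", "South", "South", "East"),
  year   = c(2014, 2015, 2014, 2015, 2015),
  amount = c(120, 150, 80, 95, 60)
)

# Data manipulation: total per region in one line,
# instead of building a pivot table by hand
totals <- aggregate(amount ~ region, data = spending, FUN = sum)

# Automation: wrap a repeated calculation in a function ...
pct_change <- function(old, new) (new - old) / old * 100

# ... and apply it to every region that has both years
by_region <- split(spending, spending$region)
changes <- sapply(by_region, function(d) {
  if (all(c(2014, 2015) %in% d$year)) {
    pct_change(d$amount[d$year == 2014], d$amount[d$year == 2015])
  } else {
    NA_real_  # not enough data to compare
  }
})

print(totals)
print(round(changes, 1))
```

Because the steps live in a script, rerunning the analysis on next year's figures means changing the input, not repeating clicks.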

Setup R & RStudio

Live Demo of R & RStudio Installation; RStudio Environment

R in the Data pipeline

Popular R Packages In The Data pipeline

❖ Find & Obtain

➢ Quandl (finance & economics) | foreign (SAS, SPSS) | RODBC, RMySQL, RPostgreSQL, RSQLite (databases) | XLConnect, xlsx (Excel)

➢ Maps: sp, maptools, maps, ggmap

➢ Web: XML, jsonlite, httr

❖ Clean & Verify

➢ dplyr, tidyr (data manipulation) | stringr (regular expressions & strings) | lubridate (dates and times)

❖ Analyze

➢ car, randomForest, glmnet, caret

❖ Visualise

➢ ggplot2, ggvis, rgl, leaflet, htmlwidgets, shiny, googleVis

❖ Report

➢ shiny, R Markdown, xtable, knitr
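
The pipeline stages can be sketched end to end as follows. This is a minimal sketch that deliberately sticks to base R so it runs without installing anything. The inline CSV, the district names and the case counts are made up for illustration. In practice the packages listed for each stage would typically replace the base functions used here.

```r
# 1. Find & Obtain: read CSV data (inline text here; normally a file or URL)
raw_csv <- "date,district,cases
2015-01-05, District A ,12
2015-01-12,District A,15
2015-02-02,District B,n/a
2015-02-09, District B,9
2015-02-16,District A,7"
records <- read.csv(text = raw_csv, stringsAsFactors = FALSE)

# 2. Clean & Verify: trim stray spaces, parse dates, drop unusable values
records$district <- trimws(records$district)
records$date <- as.Date(records$date)
records$cases <- suppressWarnings(as.numeric(records$cases))  # "n/a" becomes NA
clean <- records[!is.na(records$cases), ]

# 3. Analyze: total cases per district
totals <- aggregate(cases ~ district, data = clean, FUN = sum)

# 4. Visualise: bar chart written to a temporary PDF file
chart_file <- tempfile(fileext = ".pdf")
pdf(chart_file)
barplot(totals$cases, names.arg = totals$district,
        main = "Cases per district (made-up data)")
dev.off()

# 5. Report: export the summary table for sharing
report_file <- tempfile(fileext = ".csv")
write.csv(totals, report_file, row.names = FALSE)

print(totals)
```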

Example Use Case

Resources
