hacking data visualisations

Hacking Data Visualisations MELINDA SECKINGTON @MSECKINGTON

Upload: melinda-seckington

Post on 05-Jun-2015


Category:

Technology



DESCRIPTION

A quick look at how/why we process data visualization, a brief history of visualizations, and an intro to R.

TRANSCRIPT

Page 1: Hacking data visualisations

Hacking Data Visualisations

MELINDA SECKINGTON @MSECKINGTON

Page 2: Hacking data visualisations

@mseckington

Page 3: Hacking data visualisations
Page 4: Hacking data visualisations
Page 5: Hacking data visualisations
Page 6: Hacking data visualisations

Hacking data visualisations


Page 7: Hacking data visualisations

Why?

Page 8: Hacking data visualisations

https://www.flickr.com/photos/laurenmanning/6632168961/

Page 9: Hacking data visualisations

https://www.flickr.com/photos/jamjar/5491205608

Page 10: Hacking data visualisations

“I feel that every day, all of us now are being blasted by information design. It's being poured into our eyes through the Web, and we're all visualizers now; we're all demanding a visual aspect to our information. There's something almost quite magical about visual information. It's effortless, it literally pours in. And if you're navigating a dense information jungle, coming across a beautiful graphic or a lovely data visualization, it's a relief, it's like coming across a clearing in the jungle.”

DAVID MCCANDLESS - THE BEAUTY OF DATA VISUALIZATION


Page 11: Hacking data visualisations

TOR NORRETRANDERS - THE BANDWIDTH OF OUR SENSES

Page 12: Hacking data visualisations
Page 13: Hacking data visualisations
Page 14: Hacking data visualisations
Page 15: Hacking data visualisations
Page 16: Hacking data visualisations

A brief history of data visualisations

Page 17: Hacking data visualisations

Theatrum Orbis Terrarum, May 20, 1570

The first modern atlas, compiled by Abraham Ortelius. It was a first attempt to gather all the maps known at the time and bind them together.

A BRIEF HISTORY OF DATA VISUALISATION

Page 18: Hacking data visualisations

https://www.flickr.com/photos/smailtronic/2361594300

Page 19: Hacking data visualisations


Bills of Mortality

From 1603, London parish clerks collected health-related population data in order to monitor plague deaths, publishing the London Bills of Mortality on a weekly basis.

John Graunt amalgamated 50 years of information from the bills, producing the first known tables of public health data.

BEAUTIFUL SCIENCE AT THE BRITISH LIBRARY - THE GUARDIAN

Page 20: Hacking data visualisations

1644: First known graph of statistical data

MICHAEL VAN LANGREN - ESTIMATES OF DISTANCE IN LONGITUDE BETWEEN TOLEDO AND ROME

Page 21: Hacking data visualisations


Page 22: Hacking data visualisations

1786: First bar chart, William Playfair

Exports and imports of Scotland to and from different parts for one Year from Christmas 1780 to Christmas 1781

Page 23: Hacking data visualisations

Street map of cholera deaths in Soho, 1854, John Snow

Snow's 'ghost map' shows deaths from cholera around Broad Street between 19 August and 30 September 1854. Snow simplified the street layout, highlighting the 13 water pumps serving the area and representing each death as a black bar. His map demonstrates how cholera was spreading, not by a 'miasma' rising from the Thames, but in water contaminated by human waste.

BEAUTIFUL SCIENCE AT THE BRITISH LIBRARY - THE GUARDIAN

Page 24: Hacking data visualisations

Diagram of the Causes of Mortality in the Army in the East, 1858, Florence Nightingale

In her seminal ‘rose diagram’, Nightingale demonstrated that far more soldiers died from preventable epidemic diseases (blue) than from wounds inflicted on the battlefield (red) or other causes (black) during the Crimean War (1853-56).

BEAUTIFUL SCIENCE AT THE BRITISH LIBRARY - THE GUARDIAN

Page 25: Hacking data visualisations

How?

Page 26: Hacking data visualisations

HOW?

https://www.flickr.com/photos/jdhancock/8031897271

Page 27: Hacking data visualisations

https://www.flickr.com/photos/laurenmanning/5658951917/

Page 28: Hacking data visualisations

Page 29: Hacking data visualisations
Page 30: Hacking data visualisations
Page 31: Hacking data visualisations
Page 32: Hacking data visualisations

Page 33: Hacking data visualisations

A quick intro to R

Page 34: Hacking data visualisations

A QUICK INTRO TO R

What is R?

Page 35: Hacking data visualisations

R is a free programming language and environment for statistical computing and graphics.

Page 36: Hacking data visualisations

Created by statisticians, for statisticians.

Page 37: Hacking data visualisations

Comes with extensive facilities for data manipulation, calculation, data analysis and graphical display.

Page 38: Hacking data visualisations

Highly and easily extensible.

Page 39: Hacking data visualisations


Page 40: Hacking data visualisations
Page 41: Hacking data visualisations

> data()
# lists all available datasets

Page 42: Hacking data visualisations

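> library(ggplot2movies)
> data(movies)
# loads the movies data frame into the workspace under the name movies;
# data() itself returns only the dataset's name, so its result is not assigned
# (movies ships in the ggplot2movies package; before ggplot2 2.0.0 it was part of ggplot2)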

Page 43: Hacking data visualisations

> dim(movies)
[1] 58788    24

Page 44: Hacking data visualisations

> names(movies)
 [1] "title"       "year"        "length"      "budget"      "rating"      "votes"
 [7] "r1"          "r2"          "r3"          "r4"          "r5"          "r6"
[13] "r7"          "r8"          "r9"          "r10"         "mpaa"        "Action"
[19] "Animation"   "Comedy"      "Drama"       "Documentary" "Romance"     "Short"

Page 45: Hacking data visualisations

> movies[7079,]
                    title year length   budget rating votes
7079 Bourne Identity, The 2002    119 75000000    7.3 29871
      r1  r2  r3  r4  r5   r6   r7   r8   r9 r10  mpaa
     4.5 4.5 4.5 4.5 4.5 14.5 24.5 34.5 14.5 4.5 PG-13
     Action Animation Comedy Drama Documentary Romance Short
          1         0      0     1           0       0     0
# returns one row: all the data for one movie

Page 46: Hacking data visualisations

> movies[1:10,]
. . .
# returns rows 1 to 10

Page 47: Hacking data visualisations

> movies[,1]
. . .
# returns one column: the titles of all movies

Page 48: Hacking data visualisations

> movies$title
. . .
# the same titles as movies[,1], selected by the column label 'title'

Page 49: Hacking data visualisations

> movies[,1:10]
. . .
# returns columns 1 to 10
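The indexing forms on the slides above can be tried without downloading anything. The following is a minimal sketch in base R; the `films` data frame and its three rows are made up for illustration and merely stand in for the much larger movies data.

```r
# Hypothetical stand-in for the movies data: three made-up rows.
films <- data.frame(
  title  = c("Film A", "Film B", "Film C"),
  year   = c(1999, 2002, 2010),
  rating = c(6.1, 7.3, 8.0),
  stringsAsFactors = FALSE
)

one_row  <- films[2, ]     # row 2, every column
two_rows <- films[1:2, ]   # rows 1 to 2, every column
titles   <- films[, 1]     # column 1, returned as a plain vector
by_name  <- films$title    # the same column, selected by name
two_cols <- films[, 1:2]   # columns 1 to 2, still a data frame

print(dim(films))                  # number of rows, then number of columns
print(identical(titles, by_name))  # both selections give the same vector here
```

The pattern is always `[rows, columns]`: leaving one side of the comma empty means "all of them".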

Page 50: Hacking data visualisations

> hist(movies$year)

Page 51: Hacking data visualisations

[Figure: histogram of movies$year. x-axis: movies$year, 1900 to 2000; y-axis: Frequency, 0 to 8000]

Page 52: Hacking data visualisations

> hist(movies$rating)

Page 53: Hacking data visualisations

[Figure: histogram of movies$rating. x-axis: movies$rating, 2 to 10; y-axis: Frequency, 0 to 8000]

Page 54: Hacking data visualisations

> library(ggplot2)

Page 55: Hacking data visualisations

> qplot(rating,
        data=movies,
        geom="histogram")

Page 56: Hacking data visualisations

> qplot(rating,
        data=movies,
        geom="histogram",
        binwidth=1)
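To see what a bin width actually changes, it can help to look at the bin counts rather than the picture. This is a small sketch using base R's `hist()` with `plot = FALSE`; the `ratings` vector is invented for illustration and is not taken from the movies data.

```r
# Made-up ratings on a 1-10 scale, for illustration only.
ratings <- c(2.1, 3.4, 5.0, 5.5, 5.9, 6.2, 6.8, 7.3, 7.4, 8.9)

# plot = FALSE returns the binning instead of drawing it.
wide   <- hist(ratings, breaks = seq(0, 10, by = 1),   plot = FALSE)
narrow <- hist(ratings, breaks = seq(0, 10, by = 0.5), plot = FALSE)

print(length(wide$counts))    # number of bins when each bin is 1 wide
print(length(narrow$counts))  # number of bins when each bin is 0.5 wide
print(sum(wide$counts) == length(ratings))  # every rating lands in exactly one bin
```

Halving the bin width doubles the number of bars while the total count stays the same, which is why narrower bins look more detailed but also noisier.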

Page 57: Hacking data visualisations

> m = ggplot(movies, aes(rating))
> m + geom_histogram()

Page 58: Hacking data visualisations

> m + geom_histogram(
        aes(fill = ..count..))

Page 59: Hacking data visualisations

> m + geom_histogram(
        colour = "darkgreen",
        fill = "white",
        binwidth = 0.5)

Page 60: Hacking data visualisations

> x = m + geom_histogram(
        binwidth = 0.5)
> x + facet_grid(Action ~ Comedy)

Page 61: Hacking data visualisations

> library(twitteR)
> setup_twitter_oauth(
        "API key", "API secret", "Access token", "Access secret")

Page 62: Hacking data visualisations

FUTURELEARN STATS

Page 63: Hacking data visualisations

> fl = read.csv(
        "futurelearn_dataset.csv",
        header=TRUE)

Page 64: Hacking data visualisations

> source_table = table(fl$age)
> pie(source_table)

Page 65: Hacking data visualisations

> pie(source_table,
        radius=0.6,
        col=rainbow(8))
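The step from a column of answers to a pie chart goes through `table()`, which counts how often each distinct value occurs. A rough sketch follows; the `age` vector is made up for illustration, since the contents of futurelearn_dataset.csv are not shown on the slides.

```r
# Hypothetical age bands; the real futurelearn_dataset.csv is not reproduced here.
age <- c("18-25", "26-35", "26-35", "36-45", "18-25", "26-35")

age_table <- table(age)          # counts per distinct value
shares <- prop.table(age_table)  # each count as a fraction of the total

print(age_table)
print(round(shares, 2))
```

`pie()` sizes each slice in proportion to its count, so the fractions in `shares` are what the slices represent.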

Page 66: Hacking data visualisations
Page 67: Hacking data visualisations

> tweets <- searchTwitter('futurelearn', n=100)

Page 68: Hacking data visualisations
Page 69: Hacking data visualisations

> library("tm")
> tweet_text <- sapply(tweets, function(x) x$getText())
> tweet_corpus <- Corpus(VectorSource(tweet_text))

Page 70: Hacking data visualisations

> tweet_corpus <- tm_map(tweet_corpus,
        content_transformer(tolower))
> tweet_corpus <- tm_map(tweet_corpus, removePunctuation)
> tweet_corpus <- tm_map(tweet_corpus, removeWords, stopwords())

Page 71: Hacking data visualisations
Page 72: Hacking data visualisations

> library(wordcloud)
> wordcloud(tweet_corpus)
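The cleaning steps above (lowercase, strip punctuation, drop stopwords) and the word frequencies that a word cloud is sized by can be sketched in base R alone. This is not what tm or wordcloud do internally; the tweets and the short `stop_words` list are invented for illustration.

```r
# Made-up tweets and a tiny hand-picked stopword list (both hypothetical).
tweets <- c("Loving the new FutureLearn course!",
            "The course on data is great, the best.",
            "Data, data, data: a FutureLearn course")
stop_words <- c("the", "a", "on", "is", "new")

clean <- tolower(tweets)                          # lowercase everything
clean <- gsub("[[:punct:]]", "", clean)           # strip punctuation
words <- unlist(strsplit(clean, "[[:space:]]+"))  # split into single words
words <- words[!(words %in% stop_words)]          # drop the stopwords

freq <- sort(table(words), decreasing = TRUE)     # word frequencies, largest first
print(head(freq, 3))
```

In a word cloud, the words at the top of `freq` would be drawn largest.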

Page 73: Hacking data visualisations


Page 74: Hacking data visualisations

What next?

Page 75: Hacking data visualisations


Page 76: Hacking data visualisations


Page 77: Hacking data visualisations
Page 78: Hacking data visualisations

WHAT NEXT?


Page 79: Hacking data visualisations

https://www.flickr.com/photos/jamjar/5491205608

Page 80: Hacking data visualisations

@mseckington

Page 81: Hacking data visualisations

Recap

Page 82: Hacking data visualisations

Data visualisations are awesome


Page 83: Hacking data visualisations

R is awesome


Page 84: Hacking data visualisations

Any questions?

@mseckington