

  • NYU Politics Data Lab Workshop: Data Visualization with R and ggplot2

    Pablo Barberá

    Department of Politics, New York University

    email: [email protected]

    twitter: @p_barbera

    October 15, 2013

  • Introduction Grammar of graphics Scales, axes, legends Applications Beyond ggplot2

    Data Visualization with R and ggplot2

    Purpose of workshop: to introduce tools to generate elegant and effective plots for our academic research.

    Why R?

    Mature, widely used, open-source, easily extensible (5K packages on the CRAN repository)

    Object-oriented programming language.

    Many built-in basic and advanced statistical tools.

    Why ggplot2?

    Based on “Grammar of Graphics” (Wilkinson, 2005)

    → powerful, consistent, modular.

    Sensible defaults, but also easy to customize

    Excellent online resources (and easy to google)
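The "powerful, consistent, modular" claim is easiest to see in a small plot built one layer at a time. The following is a minimal sketch, assuming the built-in `mtcars` dataset (not part of the workshop materials) and the standard ggplot2 functions `ggplot()`, `geom_point()`, `geom_smooth()`, and `labs()`:

```r
library(ggplot2)

# Start with the data and the aesthetic mappings; nothing is drawn yet.
# Assumption: mtcars is used purely for illustration.
p <- ggplot(mtcars, aes(x = wt, y = mpg))

# Add layers one at a time: points colored by cylinder count,
# then a single fitted linear trend across all points.
p <- p +
  geom_point(aes(color = factor(cyl))) +
  geom_smooth(method = "lm", se = FALSE) +
  labs(x = "Weight (1000 lbs)", y = "Miles per gallon", color = "Cylinders")

print(p)
```

Each `+` adds an independent component, so a layer can be swapped or removed without touching the rest of the plot, and the default theme, scales, and legend are filled in automatically.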

    Pablo Barberá Data Visualization with R and ggplot2 October 15, 2013 2/97

  • Outline

    1 Mastering the grammar of graphics

      Building up a plot layer by layer
      Scales, axes, legends
      Themes and other options

    2 Applications

      Annotated line plots
      Regression coefficient plots
      Network visualization
      Maps and spatial analysis
      Animated plots

    3 Beyond ggplot2


  • Unemployment Rate in the United States, 1948–2013

    [Figure: annotated line plot of the unemployment rate (y-axis, 2% to 10%) by year (x-axis, 1950 to 2010). Presidential terms are labeled from Truman, Eisenhower, Kennedy, Johnson, Nixon, Ford, Carter, Reagan, Bush I, Clinton, and Bush II to Obama. Legend: Party of President (Democratic, Republican).]


  • What makes Facebook posts about politics popular?

    [Figure: regression coefficient plot. Rows are grouped into Format (post is a photo, a status update, a video), Content (mentions "boehner", "furlough*", "obamacare"), Language (post in English), Time (posted in morning, afternoon, evening), and Author (author is male, author is page). x-axis: % increase in popularity metric, from −25% to 50%. Legend: Effect on... likes count, comments count, shares count.]

    Data: 65K public Facebook posts about govt. shutdown
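A coefficient plot of this kind can be sketched with ggplot2 as follows. This is an illustration only: the Facebook data is not included here, so the example assumes a toy linear model fitted to the built-in `mtcars` dataset, and the variable names `fit`, `ci`, and `coefs` are hypothetical.

```r
library(ggplot2)

# Assumption: a toy linear model on mtcars stands in for the
# workshop's models of Facebook post popularity.
fit <- lm(mpg ~ wt + hp + am, data = mtcars)

# Collect point estimates and 95% confidence intervals.
ci <- confint(fit)
coefs <- data.frame(
  term      = names(coef(fit)),
  estimate  = unname(coef(fit)),
  lower     = ci[, 1],
  upper     = ci[, 2],
  row.names = NULL
)

# Drop the intercept, which is rarely of interest in these plots.
coefs <- coefs[coefs$term != "(Intercept)", ]

# One point per coefficient with its interval; the dashed line marks zero.
p <- ggplot(coefs, aes(x = term, y = estimate, ymin = lower, ymax = upper)) +
  geom_hline(yintercept = 0, linetype = "dashed") +
  geom_pointrange() +
  coord_flip() +
  labs(x = NULL, y = "Estimate (95% CI)")

print(p)
```

Mapping a grouping variable to `color` and stacking the estimates from several models into one data frame would give the multi-outcome layout shown on the slide.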


  • Visualizing your Facebook network

    [Figure: network graph of Facebook friends, with nodes colored by cluster. Legend: NYU, EUI, UPF, MA, Caixa, NYC, Others.]


  • Geolocated tweets, colored by language

    [Figure: geolocated tweets plotted as points, colored by language.]