![Page 1: Enter the TidyverseWhat is the “tidyverse”? A collection of R packages largely developed by Hadley Wickham and others at Rstudio Have emerged as staples of modern-day data science](https://reader036.vdocuments.us/reader036/viewer/2022081622/613c06adf8f21c0c82695650/html5/thumbnails/1.jpg)
Enter the Tidyverse
BIO5312 FALL 2017
STEPHANIE J. SPIELMAN, PHD

What is the “tidyverse”?
A collection of R packages largely developed by Hadley Wickham and others at RStudio
These packages have emerged as staples of modern-day data science in the past 5 to 10 years
We will focus on:
• Visualization/plotting with ggplot2
• Data management and “wrangling” with dplyr and tidyr
• Document presentation with RMarkdown

Focus is on tidy data frames
• Each variable forms a column.
• Each observation forms a row.
• Each type of observational unit forms a table.
Tidy data provides a consistent approach to data management that greatly facilitates downstream analysis and viz
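
As a rough illustration of the three tidy-data rules, the sketch below builds a small untidy table and reshapes it with tidyr. The data frame `wide` and its column names are made up for this example, and `pivot_longer()` assumes tidyr 1.0.0 or newer.

```r
library(tidyr)

# Hypothetical untidy data: one row per plant, one column per measurement day
wide <- data.frame(
  plant = c("A", "B"),
  day1  = c(2.1, 1.8),
  day2  = c(2.6, 2.4)
)

# Tidy version: each row is a single observation (one plant on one day)
long <- pivot_longer(wide, cols = c(day1, day2),
                     names_to = "day", values_to = "height")
print(long)
```

In the tidy form, `day` is an ordinary variable, so it can be filtered, grouped, or mapped to a plot aesthetic like any other column.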

Working with tidy data
The package dplyr can manipulate and manage tidy data
The package tidyr can rearrange data to convert to/from tidy data
The package ggplot2 is used for visualization/plotting

The fundamental verbs of dplyr
filter()      select rows
select()      select columns
mutate()      create new columns
group_by()    establish a data grouping
tally()       count observations in a grouping
summarize()   calculate summary statistic
arrange()     arrange rows
There are more functions but these ones are key!
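
A minimal sketch of how these verbs might be chained on the built-in iris data. The cutoff of 5, the `ratio` column, and the result names are arbitrary choices for illustration, not part of the course demo.

```r
library(dplyr)

# Chain the key verbs on the built-in iris data
species.summary <- iris %>%
  filter(Sepal.Length > 5) %>%                     # keep rows
  select(Species, Sepal.Length, Sepal.Width) %>%   # keep columns
  mutate(ratio = Sepal.Length / Sepal.Width) %>%   # create a new column
  group_by(Species) %>%                            # group rows by species
  summarize(mean.ratio = mean(ratio)) %>%          # one summary row per group
  arrange(desc(mean.ratio))                        # sort rows, largest first

# tally() counts the observations in each group
species.counts <- iris %>% group_by(Species) %>% tally()

print(species.summary)
print(species.counts)
```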

The pipe operator %>%
“Pipes” output from one function/operation as input to the next
## Start simple: display data
head(iris)
## Using %>%
iris %>% head()
## Find the mean of iris sepal lengths
mean.sepal <- mean(iris$Sepal.Length)
## Using %>%
mean.sepal <- iris$Sepal.Length %>% mean()
## The “forward assignment” operator -> follows the logical flow of piping
iris$Sepal.Length %>% mean() -> mean.sepal
## Incorrect: this pipes the whole data frame into mean(), i.e. mean(iris, Sepal.Length)
iris %>% mean(Sepal.Length) -> mean.sepal
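
Piping a whole data frame straight into mean() does not average a column, because the pipe supplies the data frame itself as the first argument. A minimal sketch of two ways to get the column mean when starting from the data frame (the variable names are illustrative):

```r
library(dplyr)

# Option 1: extract the column as a vector, then pipe it into mean()
mean.sepal.1 <- iris %>% pull(Sepal.Length) %>% mean()

# Option 2: stay in a data frame and summarize the column
mean.sepal.2 <- iris %>% summarize(mean.sepal = mean(Sepal.Length))

print(mean.sepal.1)
print(mean.sepal.2)
```

Option 1 returns a plain number, while option 2 returns a one-row data frame, which is convenient when more verbs follow.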

dplyr demo
Commands in demo are on sjspielman.org/bio5312_fall2017/day2_tidyverse1

Visualizing with ggplot2
The package ggplot2 is a graphics package that implements a grammar of graphics
◦ Operates on data frames, not vectors like base R
◦ Explicitly differentiates between the data and the representation of the data

The ggplot2 grammar
Grammar element*   What it is
Data               The data frame being plotted
Geometries         The geometric shape that will represent the data
                   • Point, boxplot, histogram, violin, bar, etc.
Aesthetics         The aesthetics of the geometric object
                   • Color, size, shape, etc.
*Table is a tiny subset of what ggplot2 has to offer
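
One way to see the separation between data and representation is to hold the data and aesthetics fixed and vary only the geometry. The sketch below is illustrative; the object names are made up, and `layer_data()` is used only to peek at what each layer would draw.

```r
library(ggplot2)

# Data + aesthetics, with no geometry yet
base.plot <- ggplot(iris, aes(x = Species, y = Sepal.Length))

# Same data and mappings, three different geometric representations
as.points  <- base.plot + geom_point()
as.boxplot <- base.plot + geom_boxplot()
as.violin  <- base.plot + geom_violin()

# layer_data() returns the data a layer actually draws
nrow(layer_data(as.points))   # one row per observation
nrow(layer_data(as.boxplot))  # one row per box
```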

Example: scatterplot
> ggplot(iris, aes(x = Sepal.Length, y = Petal.Length)) + geom_point()
[Figure: scatterplot of the iris data with Sepal.Length on the x axis and Petal.Length on the y axis]
• Pass in the data frame as your first argument
• Aesthetics map the data onto plot characteristics, here x and y axes
• Display the data geometrically as points

Example: scatterplot with color
> ggplot(iris, aes(x = Sepal.Length, y = Petal.Length)) + geom_point(color = "red")
[Figure: scatterplot of Petal.Length versus Sepal.Length with all points drawn in red]

Example: scatterplot with aes color
> ggplot(iris, aes(x = Sepal.Length, y = Petal.Length, color = Species)) + geom_point()
[Figure: scatterplot of Petal.Length versus Sepal.Length with points colored by Species; legend "Species": setosa, versicolor, virginica]
• Placing color inside the aesthetic maps it to the data.

Example: scatterplot with aes color, shape
> ggplot(iris, aes(x = Sepal.Length, y = Petal.Length, color = Species, shape = Species)) + geom_point()
[Figure: scatterplot of Petal.Length versus Sepal.Length with point color and shape both set by Species; legend "Species": setosa, versicolor, virginica]

Aesthetics may be placed inside the relevant geom
> ggplot(iris, aes(x = Sepal.Length, y = Petal.Length)) + geom_point(aes(color = Species, shape = Species))
[Figure: scatterplot of Petal.Length versus Sepal.Length with point color and shape both set by Species; legend "Species": setosa, versicolor, virginica]
> ## Remember dplyr!
> iris %>% ggplot(aes(x = Sepal.Length, y = Petal.Length)) + geom_point(aes(color = Species, shape = Species))
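
Because ggplot() takes the data frame as its first argument, a dplyr pipeline can feed directly into it. A small sketch, assuming we only want two of the three species (the filter condition and object name are arbitrary):

```r
library(dplyr)
library(ggplot2)

# Filter with dplyr, then hand the result to ggplot() as its data argument
two.species.plot <- iris %>%
  filter(Species != "setosa") %>%
  ggplot(aes(x = Sepal.Length, y = Petal.Length)) +
  geom_point(aes(color = Species, shape = Species))

print(two.species.plot)
```

Note the switch in operators: dplyr steps are joined with %>%, while ggplot2 layers are added with +.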

Aesthetics are for mapping only
> ### Color all points blue?
> ggplot(iris, aes(x = Sepal.Length, y = Petal.Length, color = "blue")) + geom_point()
[Figure: scatterplot of Petal.Length versus Sepal.Length with a legend titled "colour" containing the single entry "blue"]
> ### Correctly color all points blue
> ggplot(iris, aes(x = Sepal.Length, y = Petal.Length)) + geom_point(color = "blue")
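
The difference between mapping and setting a color can be checked by looking at the colors each layer ends up with. This is a sketch for illustration; `layer_data()` is used here only as an inspection tool, and the object names are made up.

```r
library(ggplot2)

# "blue" inside aes() is treated as data: a one-level variable that gets a default color
mapped <- ggplot(iris, aes(x = Sepal.Length, y = Petal.Length, color = "blue")) +
  geom_point()

# "blue" outside aes() is a fixed setting applied to every point
fixed <- ggplot(iris, aes(x = Sepal.Length, y = Petal.Length)) +
  geom_point(color = "blue")

unique(layer_data(mapped)$colour)  # a default palette color, not "blue"
unique(layer_data(fixed)$colour)   # "blue"
```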

Example: multiple geoms
> ### Use some fake data:
> fake.data <- data.frame(t = 1:10, y = runif(10, 1, 100))
> ggplot(fake.data, aes(x = t, y = y)) + geom_point() + geom_line()
[Figure: y versus t for fake.data, drawn as points connected by a line]

Make sure aesthetic mappings are properly applied
> ggplot(fake.data, aes(x = t, y = y, size = y)) + geom_point() + geom_line()
> ggplot(fake.data, aes(x = t, y = y)) + geom_point(aes(size = y)) + geom_line()
[Figure: y versus t for fake.data as points and a line, with a size legend titled "y" (25, 50, 75)]
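
Aesthetics given in ggplot() are inherited by every layer, while aesthetics given inside a geom apply to that layer only. A sketch of the two variants side by side; the seed is an arbitrary addition so the fake data are reproducible, and how a line layer handles an inherited `size` may vary with the ggplot2 version.

```r
library(ggplot2)

set.seed(1)  # arbitrary seed so the fake data are reproducible
fake.data <- data.frame(t = 1:10, y = runif(10, 1, 100))

# size in ggplot(): inherited by every layer, including the line
global.size <- ggplot(fake.data, aes(x = t, y = y, size = y)) +
  geom_point() + geom_line()

# size in geom_point(): applies to the points only, the line keeps a constant width
point.size <- ggplot(fake.data, aes(x = t, y = y)) +
  geom_point(aes(size = y)) + geom_line()

print(point.size)
```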

Histograms
> ggplot(iris, aes(x = Sepal.Length)) + geom_histogram()
[Figure: histogram of Sepal.Length (x axis) with count on the y axis]

Histograms
> ggplot(iris, aes(x = Sepal.Length)) + geom_histogram(fill = "orange")
[Figure: histogram of Sepal.Length with bars filled orange]

Histograms
> ggplot(iris, aes(x = Sepal.Length)) + geom_histogram(fill = "orange", color = "brown")
[Figure: histogram of Sepal.Length with bars filled orange and outlined in brown]

Histograms
> ggplot(iris, aes(x = Sepal.Length)) + geom_histogram(fill = "orange", color = "brown") +
    xlab("Sepal Length") + ylab("Count") + ggtitle("Histogram of iris sepal lengths")
[Figure: the same histogram titled "Histogram of iris sepal lengths", with x axis "Sepal Length" and y axis "Count"]
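
Two optional refinements, shown as a sketch: geom_histogram() uses 30 bins unless told otherwise, so an explicit `binwidth` is often worth setting (0.25 here is an arbitrary choice), and labs() can set the axis labels and title in a single call.

```r
library(ggplot2)

hist.plot <- ggplot(iris, aes(x = Sepal.Length)) +
  geom_histogram(binwidth = 0.25, fill = "orange", color = "brown") +
  labs(x = "Sepal Length", y = "Count",
       title = "Histogram of iris sepal lengths")

print(hist.plot)
```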

Boxplots
> ggplot(iris, aes(x = "", y = Sepal.Length)) + geom_boxplot()
[Figure: a single boxplot of Sepal.Length; the x axis is labeled "x"]

Boxplots
> ggplot(iris, aes(x = "", y = Sepal.Length)) + geom_boxplot(fill = "green")
[Figure: a single boxplot of Sepal.Length, filled green]

Boxplots
> ggplot(iris, aes(x = Species, y = Sepal.Length)) + geom_boxplot(fill = "green")
[Figure: boxplots of Sepal.Length for each Species (setosa, versicolor, virginica), all filled green]

Boxplots
> ggplot(iris, aes(x = Species, y = Sepal.Length, fill = Species)) + geom_boxplot()
[Figure: boxplots of Sepal.Length for each Species, with fill color set by Species; legend "Species": setosa, versicolor, virginica]

Boxplots: Customizing the fill mappings
> ggplot(iris, aes(x = Species, y = Sepal.Length, fill = Species)) + geom_boxplot() + scale_fill_manual(values=c("red", "blue", "purple"))
[Figure: boxplots of Sepal.Length for each Species with the custom fill colors; legend "Species": setosa, versicolor, virginica]

scale_fill_manual() also tweaks legend
> ### Labels are applied in factor-level order: setosa, versicolor, virginica
> ggplot(iris, aes(x = Species, y = Sepal.Length, fill = Species)) + geom_boxplot() + scale_fill_manual(values=c("red", "blue", "purple"), name = "Species name", labels=c("SETOSA", "VERSICOLOR", "VIRGINICA"))
[Figure: boxplots of Sepal.Length for each Species with the custom fill colors; the legend is titled "Species name" and uses the upper-case labels]

Changing the order
> ### Ordering depends on factor levels
> levels(iris$Species)
[1] "setosa"     "versicolor" "virginica"
> ### Change order of levels
> iris$Species <- factor(iris$Species, levels=c("virginica", "setosa", "versicolor"))
> levels(iris$Species)
[1] "virginica"  "setosa"     "versicolor"
> ### Replot
> ggplot(iris, aes(x = Species, y = Sepal.Length, fill = Species)) + geom_boxplot() + scale_fill_manual(values=c("red", "blue", "purple"))
[Figure: boxplots of Sepal.Length for each Species, now ordered virginica, setosa, versicolor on the x axis and in the legend]
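
Since manual scale values are assigned in level order, reordering the factor also changes which species gets which color. A sketch of one way to keep colors tied to species, using a named vector in scale_fill_manual(); it works on a copy (`iris.reordered`, a made-up name) so the built-in data stay unchanged.

```r
library(ggplot2)

# Work on a copy so the built-in iris data stay unchanged
iris.reordered <- iris
iris.reordered$Species <- factor(iris.reordered$Species,
                                 levels = c("virginica", "setosa", "versicolor"))

# A named vector ties each color to a level, whatever the level order
species.colors <- c(setosa = "red", versicolor = "blue", virginica = "purple")

reordered.plot <- ggplot(iris.reordered,
                         aes(x = Species, y = Sepal.Length, fill = Species)) +
  geom_boxplot() +
  scale_fill_manual(values = species.colors)

print(reordered.plot)
```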

Grouped boxplots
This will apply to violin plots as well.
> ## Create another categorical variable for grouping purposes
> iris %>%
    group_by(Species) %>%
    mutate(size = ifelse(Sepal.Width > median(Sepal.Width), "big", "small")) -> iris2
> ## ifelse() arguments: condition, value if TRUE, value if FALSE
> head(iris2)
Source: local data frame [150 x 6]
Groups: Species [3]

  Sepal.Length Sepal.Width Petal.Length Petal.Width Species  size
         <dbl>       <dbl>        <dbl>       <dbl>  <fctr> <chr>
1          5.1         3.5          1.4         0.2  setosa   big
2          4.9         3.0          1.4         0.2  setosa small
3          4.7         3.2          1.3         0.2  setosa small
4          4.6         3.1          1.5         0.2  setosa small
5          5.0         3.6          1.4         0.2  setosa   big
6          5.4         3.9          1.7         0.4  setosa   big
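
Because of the group_by(Species) step, the median inside ifelse() is computed separately for each species. The toy data frame below (made up for illustration) shows how the grouped and ungrouped results differ.

```r
library(dplyr)

toy <- data.frame(group = c("a", "a", "b", "b"),
                  width = c(1, 2, 10, 20))

# Ungrouped: every width is compared with the overall median (6)
ungrouped <- toy %>%
  mutate(size = ifelse(width > median(width), "big", "small"))

# Grouped: each width is compared with the median of its own group (1.5 and 15)
grouped <- toy %>%
  group_by(group) %>%
  mutate(size = ifelse(width > median(width), "big", "small"))

print(ungrouped$size)  # "small" "small" "big" "big"
print(grouped$size)    # "small" "big" "small" "big"
```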

Grouped boxplots
> ggplot(iris2, aes(x = Species, fill = size, y = Sepal.Width)) + geom_boxplot()
[Figure: boxplots of Sepal.Width for each Species (setosa, versicolor, virginica), split by size; legend "size": big, small]

Grouped boxplots
> ggplot(iris2, aes(x = size, fill = Species, y = Sepal.Width)) + geom_boxplot()
[Figure: boxplots of Sepal.Width for each size (big, small), split by Species; legend "Species": setosa, versicolor, virginica]

Detour: scale_color_manual() customizes color
> ggplot(iris, aes(x = Sepal.Length, y = Petal.Length)) + geom_point(aes(color = Species)) + scale_color_manual(values=c("cornflowerblue", "deepskyblue4", "lightcyan4"))
[Figure: scatterplot of Petal.Length versus Sepal.Length with points colored by Species using the custom colors; legend "Species": virginica, setosa, versicolor]

Detour round 2: scale_<fill/color>_??
There are many scales to use besides default and custom.
◦ scale_<fill/color>_brewer() uses pre-made color schemes from colorbrewer.org
◦ scale_color_gradient() takes a low and a high color and fills in the spectrum between them
See here: http://ggplot2.tidyverse.org/reference/#scales
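
A brief sketch of both kinds of scale. The palette name "Set2", the low/high colors, and the choice of Petal.Width as the continuous variable are arbitrary picks for illustration.

```r
library(ggplot2)

# Discrete fill from a ColorBrewer palette ("Set2" is one of the named palettes)
brewer.plot <- ggplot(iris, aes(x = Species, y = Sepal.Length, fill = Species)) +
  geom_boxplot() +
  scale_fill_brewer(palette = "Set2")

# Continuous color running from a low color to a high color
gradient.plot <- ggplot(iris, aes(x = Sepal.Length, y = Petal.Length,
                                  color = Petal.Width)) +
  geom_point() +
  scale_color_gradient(low = "yellow", high = "red")

print(brewer.plot)
print(gradient.plot)
```

The brewer scale suits categorical variables such as Species, while the gradient scale needs a numeric variable to interpolate along.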

Violin plot
> ggplot(iris, aes(x = Species, y = Sepal.Length, fill = Species)) + geom_violin()
[Figure: violin plots of Sepal.Length for each Species (virginica, setosa, versicolor), with fill color set by Species]

Bar plot
> ggplot(iris, aes(x = Species, fill = Species)) + geom_bar()
[Figure: bar plot of the count of observations for each Species (virginica, setosa, versicolor), with fill color set by Species; the count axis runs from 0 to 50]

Stacked/grouped bar plot
> head(iris2)
Source: local data frame [150 x 6]
Groups: Species [3]

  Sepal.Length Sepal.Width Petal.Length Petal.Width Species  size
         <dbl>       <dbl>        <dbl>       <dbl>  <fctr> <chr>
1          5.1         3.5          1.4         0.2  setosa   big
2          4.9         3.0          1.4         0.2  setosa small
3          4.7         3.2          1.3         0.2  setosa small
4          4.6         3.1          1.5         0.2  setosa small
5          5.0         3.6          1.4         0.2  setosa   big
6          5.4         3.9          1.7         0.4  setosa   big

Stacked/grouped bar plot
> ### Use iris2, which contains the size column
> ggplot(iris2, aes(x = Species, fill = size)) + geom_bar()
[Figure: stacked bar plot of counts for each Species (setosa, versicolor, virginica), with each bar divided by size; legend "size": big, small]

Stacked/grouped bar plot
> ### Use iris2, which contains the size column
> ggplot(iris2, aes(x = Species, fill = size)) + geom_bar(position = "dodge")
[Figure: grouped bar plot of counts for each Species, with the big and small bars placed side by side; legend "size": big, small]
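
geom_bar() counts rows on its own. If the counts are wanted as a table as well, one option is to compute them with tally() and plot the result with geom_col(). A sketch, rebuilding iris2 as on the earlier slide; `size.counts` is a made-up name.

```r
library(dplyr)
library(ggplot2)

# Rebuild iris2 as on the earlier slide
iris %>%
  group_by(Species) %>%
  mutate(size = ifelse(Sepal.Width > median(Sepal.Width), "big", "small")) -> iris2

# Count the observations in each Species/size combination
iris2 %>% group_by(Species, size) %>% tally() -> size.counts
print(size.counts)

# geom_col() plots the precomputed counts; geom_bar() would count rows itself
ggplot(size.counts, aes(x = Species, y = n, fill = size)) +
  geom_col(position = "dodge")
```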

Density plot
> ggplot(iris, aes(x = Sepal.Length, fill = Species)) + geom_density()
What does the tail of the setosa distribution look like?
[Figure: overlapping density curves of Sepal.Length, filled by Species; legend "Species": setosa, versicolor, virginica]

Density plot
> ggplot(iris, aes(x = Sepal.Length, fill = Species)) + geom_density(alpha = 0.5)
[Figure: the same density curves of Sepal.Length drawn semi-transparent, so overlapping regions remain visible; legend "Species": setosa, versicolor, virginica]

Themes
Gray background and grid not working for you? Me neither.
◦ Other built-in themes: http://ggplot2.tidyverse.org/reference/ggtheme.html
◦ Customize your theme: http://ggplot2.tidyverse.org/reference/theme.html
◦ Use somebody else's themes:
  ◦ https://cran.r-project.org/web/packages/ggthemes/vignettes/ggthemes.html
  ◦ https://cran.r-project.org/web/packages/cowplot/vignettes/introduction.html
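
A short sketch of the first two options, using only themes that ship with ggplot2. The choice of themes and the legend position are arbitrary examples.

```r
library(ggplot2)

density.plot <- ggplot(iris, aes(x = Sepal.Length, fill = Species)) +
  geom_density(alpha = 0.5)

# Swap the default gray theme for a built-in alternative
bw.plot      <- density.plot + theme_bw()
classic.plot <- density.plot + theme_classic()

# Fine-tune individual elements by adding theme() after a complete theme
themed.plot <- density.plot + theme_minimal() + theme(legend.position = "bottom")

print(themed.plot)
```

Order matters here: a complete theme such as theme_minimal() resets all theme elements, so individual theme() tweaks should come after it.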