7/24/2019 03 the R Language Part 2
2014 RStudio, Inc. All rights reserved. Follow @rstudioapp (http://twitter.com/rstudioapp)
All Training materials are provided "as is" and without warranty and RStudio disclaims any and all express and implied warranties including without limitation the implied warranties of title, fitness for a particular purpose, merchantability and noninfringement.
The Training Materials are licensed under the Creative Commons Attribution-Noncommercial 3.0 United States License. To view a copy of this license, visit http://creativecommons.org/licenses/by-nc/3.0/us/ or send a letter to Creative Commons, 171 Second Street, Suite 300, San Francisco, California, 94105, USA.
Garrett Grolemund
Master Instructor, RStudio
August 2014
Retrieve and use information in precise, efficient ways
The R language (2 of 2)
1. Subsetting
2. R Packages
3. Logical tests
4. Missing values
Question
How can you save just the fifth element of x to y?
How can you change the fifth element of x to a
Subsetting
Your turn
With your neighbor, run the code on the following slide IN YOUR HEADS.
vec: 6 1 3 6 10 5
df:
name    birth  instrument
John    1940   guitar
Paul    1942   bass
George  1943   guitar
Ringo   1940   drums
# Predict what the following code will do
# DON'T RUN IT!
vec[2]
vec[c(5, 6)]
vec[-c(5,6)]
vec[vec > 5]
df[c(2, 4), 3]
df[ , 1]
df[ , "instrument"]
df$instrument
Subset notation
vec[2]
vec: name of object to subset
[ ]: brackets (brackets always mean subset)
2: an index (that tells R which elements to include)
Each dimension needs its own index!
vec[?]     one dimension: 6 1 3 6 10 5
df[?,?]    two dimensions: the first index says which rows to include, the second says which columns to include
Separate dimensions with a comma.
But what should go in the indexes?
Four ways to subset
1. Integers
2. Blank spaces
3. Names
4. Logical vectors (TRUE and FALSE)
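As a rough illustration of the four index types listed above, the sketch below applies each one to a vector holding the values shown on the slides. The element names a to f are an assumption added for the example.

```r
# Values from the slides; the names a..f are assumed for illustration
vec <- c(a = 6, b = 1, c = 3, d = 6, e = 10, f = 5)

by_integer <- vec[c(2, 4)]       # 1. integers: positions 2 and 4
by_blank   <- vec[]              # 2. blank: everything
by_name    <- vec[c("a", "b")]   # 3. names: the elements named a and b
by_logical <- vec[vec > 5]       # 4. logical vector: elements where the test is TRUE

print(by_integer)
print(by_name)
print(by_logical)
```

Each index type is covered in turn on the slides that follow.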
Integers (positive)
Positive integers behave just like ij notation in linear algebra
df[2,3]
df[c(2,4),c(2,3)]
df[c(2,4),3]
1. Colons are a useful way to create vectors
1:4
# 1 2 3 4
df[1:4, 1:2]
2. Repeating input repeats output
df[c(1,1,1,2,2), 1:3]
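The two points about colons and repeated indexes can be tried on a small stand-in data frame. The df below is rebuilt from values shown on the slides and should be read as an assumption, not the instructor's exact object.

```r
# Stand-in data frame (an assumption based on the slide values)
df <- data.frame(
  name = c("John", "Paul", "George", "Ringo"),
  birth = c(1940, 1942, 1943, 1940),
  instrument = c("guitar", "bass", "guitar", "drums"),
  stringsAsFactors = FALSE
)

first_two_cols <- df[1:4, 1:2]               # 1:4 is shorthand for c(1, 2, 3, 4)
repeated_rows  <- df[c(1, 1, 1, 2, 2), 1:3]  # row 1 three times, row 2 twice

print(dim(first_two_cols))
print(repeated_rows$name)
```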
Integers (zero)
As an index, zero will return nothing from a dimension. This creates an empty object.
vec[0]
# numeric(0)
df[1:2, 0]
# data frame with 0 columns and 2 rows
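A quick way to confirm that a zero index produces an empty object is to measure the result. This is a minimal sketch; the two-row df is a hypothetical stand-in.

```r
vec <- c(6, 1, 3, 6, 10, 5)
df <- data.frame(name = c("John", "Paul"), birth = c(1940, 1942))  # hypothetical stand-in

empty_vec <- vec[0]       # keeps the type, drops every element
empty_df  <- df[1:2, 0]   # keeps two rows, drops every column

print(length(empty_vec))
print(dim(empty_df))      # rows first, then columns
```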
Integers (negative)
Negative integers return everything but the elements at the specified locations.
You cannot use both negative and positive integers in the same dimension
vec[c(5,6)]
# 10 5
vec[-c(5,6)]
# 6 1 3 6
df[c(2:4), 2:3]
df[-c(2:4), 2:3]
df[-c(2:4),-(2:3)]
Your Turn
1. Fix these poorly written subset commands
vec(1:4)
vec[-1:4]
vec[3, 4, 5]
() for functions, [] for subsetting
vec[1:4]
# 6 1 3 6
Don't mix positive and negative integers; distribute the negative sign (e.g., -1:4 = -1 0 1 2 3 4).
vec[-(1:4)]
# 10 5
Pass multiple values for the same dimension as a vector
vec[c(3, 4, 5)]
# 3 6 10
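The sign problem above comes from operator precedence: the minus is applied before the colon. The sketch below compares the two index vectors and captures the error that the mixed-sign index raises, using tryCatch so the script keeps running.

```r
vec <- c(6, 1, 3, 6, 10, 5)

wrong_index <- -1:4      # minus binds tighter than the colon: -1 0 1 2 3 4
right_index <- -(1:4)    # -1 -2 -3 -4

kept <- vec[right_index]

# A mixed-sign index is an error; record that instead of stopping
mixed <- tryCatch(vec[wrong_index], error = function(e) "error")

print(wrong_index)
print(right_index)
print(kept)
print(mixed)
```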
Your Turn
What is wrong with these subsetting commands? What will they do?
mat[2]
df[1]
How R makes a matrix
[Figure: the vector vec (1 2 3 4 5 6 7 8 9) is cut into groups of three (1 2 3, 4 5 6, 7 8 9) that become the columns of the matrix mat. A single index, as in mat[2], counts along this underlying vector.]
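To see the column-by-column layout in practice, the sketch below builds a 3 x 3 matrix from 1:9 (the values shown in the figure) and compares one-index and two-index subsetting.

```r
mat <- matrix(1:9, nrow = 3)   # filled column by column

second_element <- mat[2]       # one index: position 2 of the underlying vector
second_row     <- mat[2, ]     # two indexes: row 2, every column

print(mat)
print(second_element)
print(second_row)
```

Supplying one index per dimension, as in mat[2, ], avoids the surprise.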
How R makes a data frame
[Figure: a list of equal-length vectors, c("a","b","c","d"), c(1, 2, 3, 4), and c(T, F, T, F), is stood on end so that each vector becomes a column of a data frame. For df, the list elements are c("John","Paul","George","Ringo"), c(1940, 1942, 1943, 1940), and c("guitar","bass","guitar","drums"). A single index, as in df[2], counts along the elements of this list.]
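Because a data frame is a list of columns, a single index selects columns rather than cells. The following sketch, using a df rebuilt from the slide values (an assumption), contrasts df[2] with df[ , 2].

```r
# Stand-in data frame (an assumption based on the slide values)
df <- data.frame(
  name = c("John", "Paul", "George", "Ringo"),
  birth = c(1940, 1942, 1943, 1940),
  instrument = c("guitar", "bass", "guitar", "drums"),
  stringsAsFactors = FALSE
)

print(is.list(df))       # a data frame is a list underneath

one_col_df <- df[2]      # list-style index: a data frame with the single column birth
col_vector <- df[, 2]    # row and column index: the birth values as a plain vector

print(one_col_df)
print(col_vector)
```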
Blank spaces
Blank spaces return everything
(i.e., no subsetting occurs on that dimension)
vec[ ]
# 6 1 3 6 10 5
df[1, ]
df[ ,2]
Names
If your object has names, you can ask for elements or columns back by name.
names(vec)
vec[c("a","b","d")]
df[ ,"birth"]
df[ ,c("name","birth")]
Logical
You can subset with a logical vector of the same length as the dimension you are subsetting. Each element that corresponds to a TRUE will be returned.
vec[c(FALSE,TRUE,FALSE,TRUE,TRUE,FALSE)]
# 1 6 10
df[c(FALSE,TRUE,TRUE,FALSE), ]
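In practice the logical vector is rarely typed by hand; it usually comes from a test. The sketch below generates the index with a comparison and then uses it, on a vector and on a stand-in df rebuilt from the slide values (an assumption).

```r
vec <- c(6, 1, 3, 6, 10, 5)
keep <- vec > 5            # one TRUE or FALSE per element
big_values <- vec[keep]

# Stand-in data frame (an assumption based on the slide values)
df <- data.frame(
  name = c("John", "Paul", "George", "Ringo"),
  birth = c(1940, 1942, 1943, 1940),
  stringsAsFactors = FALSE
)
born_1940 <- df[df$birth == 1940, ]   # logical index on rows, blank on columns

print(keep)
print(big_values)
print(born_1940)
```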
Your Turn
Write down as many ways to extract the name "John" from df as you can. Make sure each works. You have two minutes.
# Answers
df[1, 1]
df[1, "name"]
df[1, -(2:3)]
df[1, c(TRUE, FALSE, FALSE)]
df[-(2:4), 1]
df[-(2:4), "name"]
df[-(2:4), -(2:3)]
df[-(2:4), c(TRUE, FALSE, FALSE)]
df[c(TRUE, FALSE, FALSE, FALSE), 1]
df[c(TRUE, FALSE, FALSE, FALSE), "name"]
df[c(TRUE, FALSE, FALSE, FALSE), -(2:3)]
df[c(T, F, F, F), c(T, F, F)]
Your Turn
Subsetting lists
lst: a list of three elements, c(1, 2), TRUE, and c("a", "b", "c")
sum(lst[1]) # Error!
# What is the difference?
lst[c(1,2)]
lst[1]
lst[[1]]
If list x is a train carrying objects, then x[[5]] is the object in car 5; x[4:6] is a train of cars 4-6.
http://twitter.com/#!/RLangTip/status/118339256388304896
lst[c(1,2)]
# [[1]]
# [1] 1 2
#
# [[2]]
# [1] TRUE
lst[1]
# [[1]]
# [1] 1 2
lst[[1]]
# c(1, 2)
# What will this return?
lst[[1]][2]
# c(1, 2)[2]
# 2
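The difference between single and double brackets can be checked directly. This sketch assumes lst holds the three elements shown in the figures and captures the sum() error with tryCatch.

```r
lst <- list(c(1, 2), TRUE, c("a", "b", "c"))   # elements as shown in the figures

sub_list <- lst[1]      # single brackets: a list of length one (a shorter train)
element  <- lst[[1]]    # double brackets: the numeric vector itself (the cargo)

total  <- sum(element)                                        # works on the vector
failed <- tryCatch(sum(sub_list), error = function(e) "error")  # fails on the list
second <- lst[[1]][2]   # extract the vector, then subset it

print(class(sub_list))
print(class(element))
print(total)
print(failed)
print(second)
```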
$
The most common syntax for subsetting lists and data frames
names(lst)
lst has the names alpha, beta, gamma
lst$alpha
lst: name of list
$
alpha: name of element (no quotes)
# c(1, 2)
Same as lst
df$birth
df: name of data frame
$
birth: name of column (no quotes)
# c(1940, 1942, 1943, 1940)
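One way to read $ is as shorthand for extracting a named element with double brackets. The sketch below checks that equivalence on a named list and on a stand-in df; both objects are rebuilt from the slide values and are assumptions.

```r
# Named list and data frame rebuilt from the slides (assumptions)
lst <- list(alpha = c(1, 2), beta = TRUE, gamma = c("a", "b", "c"))
df <- data.frame(
  name = c("John", "Paul", "George", "Ringo"),
  birth = c(1940, 1942, 1943, 1940),
  stringsAsFactors = FALSE
)

same_in_list <- identical(lst$alpha, lst[["alpha"]])
same_in_df   <- identical(df$birth, df[["birth"]])
same_as_col  <- identical(df$birth, df[, "birth"])

print(c(same_in_list, same_in_df, same_as_col))
```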
R Packages
A collection of code and functions written
for the R language.
Usually focuses on a specific task or problem.
Most of the useful R applications appear in
packages.
Start RStudio
Install your package with install.packages("ggplot2")
[Diagram: install.packages("ggplot2") copies ggplot2 from the Internet to your hard drive.]
Load your package with library(ggplot2) (EVERY TIME)
[Diagram: library(ggplot2) loads ggplot2 from your hard drive into your R session.]
qplot(1:10, 1:10)
## Error: could not find function "qplot"
You cannot use a function in a package until you load the package
library(ggplot2)
qplot(1:10, 1:10)
Package summary
1. Download the package with install.packages("name")
- You only have to do this once
- You should be connected to the internet
2. Load the package with library("name")
- You have to do this each time you start an R session.
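Since installing happens once and loading happens every session, it can help to check what is already installed before calling install.packages(). The helper is_installed below is hypothetical (it is not from the slides); it relies on base R's requireNamespace(), which returns FALSE when a package cannot be found.

```r
# Hypothetical helper: TRUE if the package is installed on this machine
is_installed <- function(pkg) {
  requireNamespace(pkg, quietly = TRUE)
}

pkgs <- c("ggplot2", "maps", "RColorBrewer")
status <- vapply(pkgs, is_installed, logical(1))
print(status)

missing_pkgs <- pkgs[!status]
# install.packages(missing_pkgs)   # uncomment to install; needs an internet connection
```

The install line is left commented out so that running the sketch never downloads anything by accident.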
Your Turn
We're going to use the ggplot2, maps, RColorBrewer, and scales packages today. Load them with
library("ggplot2")
library("maps")
library("RColorBrewer")
Note: If you have not yet installed them, you'll need to install.packages(c("ggplot2", "maps", "RColorBrewer")) first.
Diamonds data
- ~54,000 round diamonds from http://www.diamondse.info/
- comes in the ggplot2 package
- Carat, colour, clarity, cut
- Total depth, table, depth, width, height
- Price
Your turn
diamonds is huge!
Use subsetting to look at just the first six rows of diamonds
Challenge: use subsetting to look at just the last six rows
diamonds[1:6, ]
nrow(diamonds)
# 53940
diamonds[53935:53940, ]
# Same as
head(diamonds)
tail(diamonds)
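The last-six-rows answer hard-codes the row count. A slightly more general sketch computes the range from nrow(). It uses the built-in mtcars data set as a stand-in so that it runs without ggplot2; swapping in diamonds should behave the same way.

```r
n <- nrow(mtcars)                  # mtcars stands in for diamonds here

first_six <- mtcars[1:6, ]
last_six  <- mtcars[(n - 5):n, ]   # the last six row numbers, whatever n is

print(first_six)
print(last_six)
```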
View
The View function can also help you examine data: it opens a spreadsheet-like data viewer.
View(diamonds) # notice: Capital V
Help pages
You can open the help page for any R object (including functions) by typing ? followed by the object's name
?diamonds
[Figure: diamond dimensions. x, y, z in mm; depth = z / diameter; table = table width / x * 100]
qplot(x, y, data = diamonds)
What is weird about these values?
Can you remove them with subsetting?
Logical tests
Logical comparisons
What will these return?
1 < 3
1 > 3
c(1, 2, 3, 4, 5) > 3
%in%
# What does this do?
1 %in% c(1, 2, 3, 4)
1 %in% c(2, 3, 4)
c(3,4,5,6) %in% c(2, 3, 4)
%in% tests whether the object on the left is a member of the group on the right.
1 %in% c(1, 2, 3, 4)
# TRUE
1 %in% c(2, 3, 4)
# FALSE
c(3,4,5,6) %in% c(2, 3, 4)
# TRUE TRUE FALSE FALSE
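Because %in% returns one TRUE or FALSE per element on the left, its result can be used directly as a logical index. A small sketch with hypothetical values:

```r
instruments <- c("guitar", "bass", "guitar", "drums")   # hypothetical values
strings <- c("guitar", "bass")

is_string <- instruments %in% strings   # one result per element of instruments
string_instruments <- instruments[is_string]

print(is_string)
print(string_instruments)
```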
Boolean operators
You can combine logical tests with &, |, xor, !, any, and all
x > 2 & x < 9
TRUE & TRUE
TRUE
&: Are both condition 1 and condition 2 true?
|: Is either condition 1 or condition 2 true?
xor: Is either condition 1 or condition 2 true, but not both?
!: Negation
any: Is any condition TRUE?
all: Is every condition TRUE?
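The operators above work element by element on vectors, while any and all collapse a logical vector to a single value. The sketch below runs each one on a small vector of hypothetical values.

```r
x <- c(1, 5, 10)        # hypothetical values

a <- x > 2              # FALSE TRUE TRUE
b <- x < 9              # TRUE TRUE FALSE

both     <- a & b       # TRUE only where both tests are TRUE
either   <- a | b       # TRUE where at least one test is TRUE
one_only <- xor(a, b)   # TRUE where exactly one test is TRUE
not_a    <- !a          # flips each value of a

print(both)
print(either)
print(one_only)
print(not_a)
print(any(both))        # is at least one value TRUE?
print(all(both))        # is every value TRUE?
```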
Turn these sentences into logical tests in R
Is w positive?
Is x greater than 10 and less than 20?
Is object y the word February?
Is every value in z a day of the week?
# Answers
w > 0
10 < x & x < 20
y == "February"
all(z %in% c("Monday", "Tuesday", "Wednesday",
  "Thursday", "Friday", "Saturday", "Sunday"))
# Common mistakes
x > 10 & < 20
y = "February"
all(z == "Monday" | "Tuesday" | "Wednesday" ...)
x > 10 & < 20
The left side, x > 10, evaluates to TRUE or FALSE, but the right side, < 20, is not a complete test on its own.
TRUE & error!
error!
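The third mistake can be reproduced safely by wrapping it in tryCatch. In this sketch the values of z and x are hypothetical; the point is that | needs a complete logical test on each side, whereas %in% compares against the whole set at once.

```r
z <- c("Monday", "Friday", "Funday")   # hypothetical values
days <- c("Monday", "Tuesday", "Wednesday", "Thursday",
          "Friday", "Saturday", "Sunday")

# Mistake: "Tuesday" is a string, not a logical test, so | raises an error
mistake <- tryCatch(all(z == "Monday" | "Tuesday"),
                    error = function(e) "error")

# Fix: test membership in the full set of days
correct <- all(z %in% days)

# Fix for the range test: repeat x on both sides of &
x <- 15
in_range <- 10 < x & x < 20

print(mistake)
print(correct)
print(in_range)
```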
Logical subsetting
Combining logical tests with subsetting is a very powerful technique!
x_zeroes
# Prints to screen
diamonds[diamonds$x > 10, ]
# Saves to new data frame
big <- diamonds[diamonds$x > 10, ]
# Overwrites existing data frame. Dangerous!
diamonds
# Uh oh!
rm(diamonds)
str(diamonds)
# Phew!
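The recovery works because a data set that ships with a package is still available after you delete your own modified copy. The sketch below shows the same pattern with the built-in mtcars data set as a stand-in for diamonds, so it runs without ggplot2.

```r
# Overwrite: this creates a copy in your workspace that hides the original
mtcars <- mtcars[mtcars$mpg > 30, ]
rows_after_overwrite <- nrow(mtcars)

# Remove the copy; R then finds the original in the datasets package again
rm(mtcars)
rows_after_rm <- nrow(mtcars)

print(rows_after_overwrite)
print(rows_after_rm)
```

This only rescues data that comes from a package. A data frame you read in yourself is gone once it is overwritten, which is why saving to a new name is the safer habit.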
Missing values
Data errors
Typically removing the entire row because of one error is overkill. Better to selectively replace problem values with missing values.
In R, missing values are indicated by NA
NA Behavior
Missing values propagate
Use is.na() to check for missing values
a
Use the na.rm argument to remove missing values prior to computation.
b
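The three NA behaviors named on these slides can be seen on a short vector. The values below are hypothetical, chosen only to illustrate propagation, is.na(), and na.rm.

```r
vals <- c(1, NA, 3)                    # hypothetical values

propagated <- sum(vals)                # NA: one missing value makes the total unknown
compared   <- vals == NA               # all NA: == cannot test for missingness
flagged    <- is.na(vals)              # FALSE TRUE FALSE
cleaned    <- sum(vals, na.rm = TRUE)  # drops the NA before adding

print(propagated)
print(compared)
print(flagged)
print(cleaned)
```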
Change individual values within an object
summary(diamonds$x)
diamonds$x[diamonds$x == 0]
diamonds$x[diamonds$x == 0] <- NA
diamonds$y[diamonds$y == 0] <- NA
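Replacing impossible zeros with NA combines logical subsetting with assignment. The sketch below uses a tiny hypothetical data frame, dims, in place of diamonds so that it runs without ggplot2.

```r
# Hypothetical measurements; a zero length or width is a data error
dims <- data.frame(x = c(3.95, 0, 4.05), y = c(3.98, 4.07, 0))

dims$x[dims$x == 0] <- NA   # select the zeros in x, overwrite them with NA
dims$y[dims$y == 0] <- NA   # same for y

print(dims)
print(mean(dims$x, na.rm = TRUE))   # summaries can now skip the bad values
```

Only the problem cells change; the remaining values in each row stay available for analysis.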
qplot(x, y, data = diamonds)