R objects
All R entities exist as objects.
They can all be operated on as data.
We will cover:
Vectors
Factors
Lists
Data frames
Tables
Indexing
R packages and datasets
Vectors
Think of a vector as the equivalent of a single column of numbers in a spreadsheet.
You can create a vector using the c( ) function (concatenate) as follows:
x <- c( )
For example:
x <- c(1,2,4,8) creates a column of the numbers 1, 2, 4, 8
Vectors
Other ways of creating columns of numbers (vectors):
The seq function
seq(1,10,1) = 1, 2, 3, 4, 5, 6, 7, 8, 9, 10
seq(1,4,0.5) = 1, 1.5, 2, 2.5, 3, 3.5, 4
The colon operator x:y
1:10 = 1, 2, 3, 4, 5, 6, 7, 8, 9, 10
2 * 1:10 = 2, 4, 6, 8, 10, 12, 14, 16, 18, 20
The rep function
rep(2,4) = 2, 2, 2, 2
For help on these functions, type:
?seq
?rep
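A short sketch of the approaches listed above, run as a script. The variable names (by_one, by_half, and so on) are only illustrative and do not come from the slides.

```r
# Build vectors with seq(), the colon operator and rep()
by_one  <- seq(1, 10, 1)    # from 1 to 10 in steps of 1
by_half <- seq(1, 4, 0.5)   # from 1 to 4 in steps of 0.5
colon   <- 1:10             # shorthand for the integers 1 to 10
doubled <- 2 * 1:10         # ':' is evaluated before '*', so this doubles 1..10
repeated <- rep(2, 4)       # the value 2 repeated 4 times

print(by_half)
print(doubled)
print(repeated)
```

Note that 2 * 1:10 gives 2, 4, ..., 20 rather than 2:10, because the colon operator binds more tightly than multiplication.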
Indexing
Referencing (indexing) specific ‘cells’ in a column:
Example:
if x is the vector 1, 2, 5 then
x[1] = 1, x[2] = 2, x[3] = 5
and
x[1:2] = 1, 2    first two listed items in x
x[2:3] = 2, 5    2nd & 3rd listed items in x
x[x>2] = 5       items of x greater than 2 (use of the ‘>’ and ‘<’ characters)
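The indexing example above can be tried directly. This is a minimal sketch using the same vector x; the names first_two, last_two and big are hypothetical.

```r
# Index a vector by position, by range and by condition
x <- c(1, 2, 5)

x[1]                  # first item
first_two <- x[1:2]   # items 1 and 2
last_two  <- x[2:3]   # items 2 and 3
big       <- x[x > 2] # only the items greater than 2

print(first_two)
print(last_two)
print(big)
```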
Performing simple operations on vectors
In R, when you carry out simple operations (+ - * /) on vectors that have the same number of entries, R just performs the normal operations on the numbers in the vectors, entry by entry.
If the vectors don’t have the same number of entries, then R will cycle through (recycle) the vector with the smaller number of entries until it matches the longer one.
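A small sketch of both cases described above, assuming made-up vectors a, b and s (none of these names appear in the slides).

```r
# Entry-by-entry arithmetic on vectors of equal length
a <- c(1, 2, 3, 4)
b <- c(10, 20, 30, 40)
total   <- a + b   # 11 22 33 44
product <- a * b   # 10 40 90 160

# Recycling: the shorter vector s is reused as 1, 2, 1, 2
s <- c(1, 2)
recycled <- a + s  # 2 4 4 6

print(total)
print(product)
print(recycled)
```

If the length of the longer vector is not a multiple of the shorter one, R still recycles but prints a warning.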
Performing simple operations on vectors
Vectors (columns of numbers) can be assigned by putting together other vectors with the c( ) function.
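As an illustration of building a vector from other vectors, here is a minimal sketch; first, second and combined are hypothetical names.

```r
# Combine existing vectors (and a single value) into a new vector
first  <- c(1, 2, 3)
second <- c(10, 20)
combined <- c(first, second, 99)

print(combined)
print(length(combined))
```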
Functions
R functions take arguments (information that you put into the function, which goes between the brackets) and can perform a range of tasks.
In the case of the ‘help’ function, the task is to display information from the R documentation files.
A comprehensive list of R functions can be obtained from the R reference manual under the help menu.
Simple statistic functions
R comes with some useful functions:
sqrt( )    square root
mean( )    arithmetic mean
hist( )    calculating & plotting histograms
R also comes with pre-loaded datasets, which we’ll discuss later.
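A quick sketch of these three functions on a made-up vector v. Calling hist(v) on its own draws the plot; here plot = FALSE is used so that only the bin counts are calculated, which is convenient when running as a script.

```r
# Try sqrt(), mean() and hist() on a small illustrative vector
v <- c(1, 4, 9, 16, 25)

roots   <- sqrt(v)   # 1 2 3 4 5
average <- mean(v)   # 11

# plot = FALSE returns the histogram object without drawing it
h <- hist(v, plot = FALSE)

print(roots)
print(average)
print(h$counts)      # number of values falling in each bin
```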
Basic statistic functions on vectors
> X1 <- c(1.1, 4.3, 5, 2, 1, 4, 9.5)
> sum(X1)       sum = 26.9
> mean(X1)      mean = 3.842857
> median(X1)    median = 4
> var(X1)       variance = 8.762857
> sd(X1)        standard deviation = 2.960212
> summary(X1)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
  1.000   1.550   4.000   3.843   4.650   9.500
> quantile(X1)
   0%   25%   50%   75%  100%
 1.00  1.55  4.00  4.65  9.50
Mixing vectors and scalars
R has the very convenient feature of having operators that work with vectors.
It is even possible to mix vectors and scalars.
For example:
> X1 <- c(1.1, 4.3, 5, 2, 1, 4, 9.5)
> X1 + 1
[1] 2.1 5.3 6.0 3.0 2.0 5.0 10.5
> X1 * 2
[1] 2.2 8.6 10.0 4.0 2.0 8.0 19.0
Vectors to record data
> x = c(45,43,46,48,51,46,50,47,46,45)
> length(x)
[1] 10
> x = c(x,48,49,51,50,49) # append values to x
> length(x)
[1] 15
> x[16] = 41 # add to a specified index
> length(x)
[1] 16
> mean(x)
[1] 47.1875
> x[17:20] = c(40,38,35,40) # add to many specified indices
> length(x)
[1] 20
> mean(x)
[1] 45.4
Factors
A factor is a vector that encodes information about the group to which a particular observation belongs.
Categorical data is often used to classify data into various levels or factors.
Making a factor is easy, using the factor function.
Factors – smoking survey example
A survey asks people if they smoke or not. The data is:
Yes, No, No, Yes, Yes
> x=c("Yes","No","No","Yes","Yes")
> x # print out values in x
[1] "Yes" "No" "No" "Yes" "Yes"
> factor(x) # print out value in factor(x)
[1] Yes No No Yes Yes
Levels: No Yes # notice levels are printed.
Notice the difference in how R treats factors in this example: the factor’s values are printed without quotes, and its levels are listed underneath.
Factors – student height example
Suppose the recorded heights of South African and British students are as follows:
heights <- c(1.7,1.95,1.63,1.54,1.29)
You make a new vector, fac_heights, to record the nationality that each observation pertains to:
fac_heights <- factor(c("GB", "SA", "GB", "GB", "SA"))
This is useful when testing for differences between groups.
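One way to see why the factor is useful is to summarise the heights by group. The sketch below uses tapply( ), which is not mentioned in the slides and is shown only as one possible approach; group_means is a hypothetical name.

```r
# Mean height per nationality, using the factor to define the groups
heights <- c(1.7, 1.95, 1.63, 1.54, 1.29)
fac_heights <- factor(c("GB", "SA", "GB", "GB", "SA"))

# tapply applies mean() to the heights within each level of the factor
group_means <- tapply(heights, fac_heights, mean)

print(levels(fac_heights))
print(group_means)
```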
Factors – gender survey example
Consider a survey that has data on 691 females and 692 males
> gender <- c(rep("female",691), rep("male",692)) # create vector
> gender <- factor(gender) # change vector to factor
• Once stored as a factor, the space required for storage is reduced
• Values “female” and “male” are the levels of the factor
> levels(gender) # assumes gender is a factor
[1] "female" "male"
• Internally, the factor ‘gender’ is stored as 691 1’s, followed by 692 2’s. It has stored with it a table that maps each number to a level: 1 = female, 2 = male
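The internal coding can be inspected with as.integer( ). This is a small sketch of that check; the name codes is hypothetical.

```r
# Look at the integer codes behind a factor
gender <- c(rep("female", 691), rep("male", 692))
gender <- factor(gender)

codes <- as.integer(gender)   # 1 for "female", 2 for "male"

print(levels(gender))         # the lookup table: position 1 and position 2
print(table(codes))           # how many 1's and how many 2's are stored
```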
Lists
A set of objects (e.g. vectors) can be combined under a single name as a list (similar to a spreadsheet in Excel).
Example:
x <- c(1, 7, 8, 9, 10)
y <- c("red", "yellow", "blue", "green")
example_list <- list(size = x, colour = y)
Note: vectors can consist of characters (i.e. letters/words) instead of numbers, but never numbers AND characters. If you mix the two, R converts the numbers to character strings.
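A brief sketch of how the list above can be used, and of what happens when numbers and characters are mixed in one vector. The name mixed is hypothetical.

```r
# Build the list and pull elements back out by name
x <- c(1, 7, 8, 9, 10)
y <- c("red", "yellow", "blue", "green")
example_list <- list(size = x, colour = y)

print(example_list$size)             # the numeric vector
print(example_list[["colour"]][2])   # second colour: "yellow"

# Mixing numbers and characters: everything becomes character
mixed <- c(1, "red")
print(class(mixed))
```

The two elements of the list have different lengths (5 and 4), which a list allows.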
Data frames
The function data.frame( ) creates a data frame.
This is a special kind of list, in which the entries in a specific position in the elements of the list correspond to one another.
Each element of the list has the same length.
It is a rectangular table, with rows and columns.
Data frames
Example 1: Simple data frames can be created.
Enter the following information at the prompt line:
h <- c(150, 170, 168, 179, 130)
w <- c(65, 70, 72, 80, 51)
patient_data <- data.frame(weight=w, height=h)
Type in patient_data to see what’s just been created.
Access of elements in data frames
Individual elements can be accessed using a pair of square brackets “[ ]” and by specifying their index, or name.
Here are some ways to access a cell, row or column:
patient_data$height       accesses a column
patient_data[ , i]        accesses the ith column
patient_data[i, ]         accesses the ith row
patient_data$height[i]    i is the cell position in the height column
patient_data[i, j]        accesses the cell in the ith row and jth column
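The access patterns above can be tried on the patient_data example. This is a minimal sketch with concrete values in place of i and j; the names on the left-hand side are hypothetical.

```r
# Rebuild the example data frame, then access columns, rows and cells
h <- c(150, 170, 168, 179, 130)
w <- c(65, 70, 72, 80, 51)
patient_data <- data.frame(weight = w, height = h)

height_col <- patient_data$height      # a column, by name
first_col  <- patient_data[, 1]        # the 1st column (weight)
second_row <- patient_data[2, ]        # the 2nd row, as a one-row data frame
third_h    <- patient_data$height[3]   # 3rd cell of the height column
cell_2_2   <- patient_data[2, 2]       # row 2, column 2

print(second_row)
print(third_h)
print(cell_2_2)
```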
Data frames
More complex tables can be created.
Data within each column must have the same type (e.g., number, text), but different columns may have different types, like a spreadsheet.
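A small sketch of a data frame whose columns have different types. The data frame patients and its contents are invented for illustration. stringsAsFactors = FALSE is set explicitly so that the text column stays as character in older versions of R as well.

```r
# One text column, one numeric column and one logical column
patients <- data.frame(
  name   = c("Ann", "Ben", "Cat"),
  age    = c(34, 51, 28),
  smoker = c(TRUE, FALSE, FALSE),
  stringsAsFactors = FALSE
)

# Show the type of each column
col_types <- sapply(patients, class)
print(col_types)
```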
Data frames
Accessing specific cells, or data:
Note: "$" is a shortcut; the minus sign "-" means not.
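A minimal sketch of what the two notations in the note do, assuming a data frame like patient_data with a hypothetical id column added so that three columns are available.

```r
# A small data frame with an extra (hypothetical) id column
patient_data <- data.frame(
  id     = 1:5,
  weight = c(65, 70, 72, 80, 51),
  height = c(150, 170, 168, 179, 130)
)

weights   <- patient_data$weight   # "$" picks out a column by name
no_first  <- patient_data[-1, ]    # minus: every row except the 1st
no_id     <- patient_data[, -1]    # minus: every column except the 1st
tall_only <- patient_data[patient_data$height > 160, ]  # rows meeting a condition

print(no_first)
print(names(no_id))
print(tall_only)
```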
Tables
We often view categorical data with tables.
The table function allows us to look at tables.
Its simplest usage is table(x), where x is a categorical variable.
Tables
Example: smoking survey
A survey asks people if they smoke or not. The data is:
Yes, No, No, Yes, Yes
> x=c("Yes","No","No","Yes","Yes")
> table(x)
x
No Yes
2 3
The table command simply adds up the frequency of each unique value of the data
R packages and datasets
View a list of installed R packages: library()
Access datasets with the data function:
data( )               provides a list of all the datasets
data(Titanic)         loads the Titanic dataset
summary(Titanic)      provides summary information about the Titanic dataset
attributes(Titanic)   provides more information
Titanic               typing the dataset name will display the data
List all datasets in a package, e.g., data(package='datasets')
R packages and datasets
List preloaded datasets in R: data( )
Display the “women” dataset: women
Now let’s access specific data.
Access data from each column:
women$height or women[ ,1]
women$weight or women[ ,2]
Access data from individual rows:
women[1, ] or women[10, ] etc.
Try it.
Working through some examples
Now that you can access sample data, let’s work with it:
Get the mean weight and height of the women in our example.
Remember the help function: help(mean)
Also, R can show an example: example(mean)
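One possible way to do this exercise, using the built-in women dataset. The names mean_height and mean_weight are hypothetical, and colMeans( ) is shown only as an optional shortcut that is not covered in the slides.

```r
# Mean height and weight from the built-in 'women' dataset
data(women)

mean_height <- mean(women$height)
mean_weight <- mean(women$weight)

print(mean_height)
print(mean_weight)

# Optional shortcut: the mean of every column at once
print(colMeans(women))
```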
Common useful functions
print() # prints a single R object
cat() # prints multiple objects, one after the other
length() # number of elements in a vector, or of a list
mean()
median()
range()
unique() # gives the vector of distinct values
sort() # sort elements into order
order() # x[order(x)] orders elements of x
rev() # reverse the order of vector elements
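A short sketch that runs several of the functions listed above on a made-up vector x, so the results can be compared side by side.

```r
# Try the common functions on a small illustrative vector
x <- c(3, 1, 2, 3, 1)

n        <- length(x)   # 5
lo_hi    <- range(x)    # 1 3
distinct <- unique(x)   # 3 1 2
sorted   <- sort(x)     # 1 1 2 3 3
ord      <- order(x)    # 2 5 3 1 4 (positions that would sort x)
reversed <- rev(x)      # 1 3 2 1 3

print(sorted)                           # print() shows one object
cat("n =", n, "median =", median(x), "\n")  # cat() shows several, one after the other
print(x[ord])                           # same result as sort(x)
```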