Introduction to R Programming
Session 1. Getting Started with R

Dr. Chao Zheng
Mathematical Sciences & S3RI
[email protected]

Introduction to the Introduction Sessions

The introduction sessions are designed for MSc Stats & OR students. During the sessions you will:
▶ learn the basics of R
▶ get a (good) impression of R and its usefulness in statistics
▶ see some examples of statistical analysis using R

Do NOT be afraid if you:
▶ have no experience in R;
▶ do not understand some of the R commands I demonstrate;
▶ do not understand the statistical concepts I mention.

You are NOT expected to become familiar with R after these two sessions!

Week 0


Outline for Session 1.

1. R Introduction
▶ What is R
▶ Why use R
▶ Who is using R

2. First Taste of R
▶ Install R and RStudio
▶ Basic commands in R
▶ Features of RStudio
▶ Play with your first real-world dataset


1. R introduction


What is R?
▶ R is a language and a software environment for statistical computing and graphics.
▶ In that sense it is like: Matlab/SAS/Excel/Stata/SPSS/....
▶ R is an implementation of another programming language, S, combined with lexical scoping semantics inspired by Scheme. S was created by John Chambers in 1976, while at Bell Labs.
▶ R was created by Ross Ihaka and Robert Gentleman at the University of Auckland, New Zealand, and is developed by the R Core Team. The first stable version (v1.0.0) of R was released on 29 February 2000.
▶ Latest version of R at the time of these slides: v4.0.3

Figure: Ross Ihaka (left) and Robert Gentleman (right)


Why Use R? - Lingua Franca of Statistics

It is "THE" language for statisticians and data scientists.
▶ Nearly all statistical methods are implemented in R, from the elementary ones to the most state-of-the-art ones.
▶ The main features of R are:
◦ data handling and storage facilities;
◦ operators for matrix (and array) manipulation;
◦ data analysis tools;
◦ graphical facilities.


Why Use R? - Language of the Future

It is quickly becoming one of the most popular (statistics) languages.

Figure: Popularity of different programming languages. R and Python are languages with growing trends. Note that most competitors in this figure are general-purpose programming languages.


Why Use R? - Open-Source

R is an open-source programming language. This means that anyone can work with R without any need for a license or a fee. Users are able to produce their own add-on packages implementing new methods and upload them to an online repository called CRAN. Other users can download these packages.

Figure: Number of R packages created
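
As a rough sketch of how add-on packages are used in practice: install.packages() downloads a package from CRAN and library() loads an installed package. The package name "ggplot2" below is only an example, and the install line is commented out because it needs an internet connection.

```r
# Download and install a package from CRAN (needs an internet connection,
# so it is left commented out here; "ggplot2" is only an example name).
# install.packages("ggplot2")

# List the packages that are already installed on this machine.
installed <- rownames(installed.packages())
head(installed)

# Base packages such as "stats" ship with R itself.
"stats" %in% installed

# Load an installed package so that its functions become available.
library(stats)
```

A package only needs to be installed once per machine, but library() has to be called in every new R session that uses it.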


Why Use R? - Quality Graphs

R makes it easy to produce polished, visually appealing graphs, which sets it apart from many other programming languages.

Figure: A gorgeous chart created by the ggplot2 package in R, showing the most-trafficked cycle routes in London. Code is available here.


Why Use R? - Other Reasons

▶ Compatibility: R is highly compatible and can be paired with many other programming languages like C, C++, Java, and Python.

▶ Easy to get help: When you have a question about R and cannot figure it out for yourself, do not forget the resources available. Answers to your questions can often be found on websites such as StackOverflow and R-Bloggers...

▶ Eye-catching reports: With packages like Shiny and R Markdown, reporting the analysis results together with the data, plots and R scripts is extremely easy. You can choose your report format flexibly among LaTeX/Word/Web apps. For example, your MATH6166/6173 lab sheets and solutions are all created with R Markdown.


Who is Using R?

Figure: Breakdown of the use of R by industry (left). Academia comes first, as R is a language for statistics-related research. R is also the first choice in the healthcare industry, followed by government and consulting. 'Big' companies from different areas that are using R (right).


Who is Using R?

Figure: "The face you make when you create your first plot using R and proudly present it to a journalist."


Is R Difficult to learn?


2. First taste of R and RStudio


Installing R on your own machine

1. Go to www.r-project.org
2. Click on download R
3. Choose a mirror. It is recommended that you choose either the 0-Cloud mirror or one of the UK mirrors.
4. Click on Download R for … depending on your operating system and follow the instructions.
5. Install R following the instructions.
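
Once R is installed, one quick way to confirm that it works is to ask R which version is running. The following is a minimal sketch using only built-in objects and functions; the variable name r_version is just an illustrative choice.

```r
# Store and print the version of R that is currently running.
r_version <- R.version.string
print(r_version)

# Show the R version, platform, and attached packages of this session.
sessionInfo()
```

If these commands print a version string rather than an error, the installation is most likely fine.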


First Taste of R

Open R and try typing the following commands:

print("Hello World")

Math operators in R:

2+2

2 + 2

6 - 3 + 2

3 * 4

4 / 2

log(10)

exp(3)

2^3

sqrt(1.5)
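
To make the behaviour of these operators a little more concrete, here is a small sketch with the values each expression evaluates to. It also shows that log() is the natural logarithm. The variable names are illustrative only.

```r
# Operators follow the usual precedence: ^ first, then * and /, then + and -.
precedence <- 2 + 3 * 4^2      # 2 + (3 * 16) = 50
grouped    <- (2 + 3) * 4^2    # 5 * 16 = 80

# log() is the natural logarithm; other bases are available too.
natural  <- log(exp(1))        # 1
base_ten <- log10(1000)        # 3
base_two <- log(8, base = 2)   # 3

# Integer division and remainder.
quotient  <- 17 %/% 5          # 3
remainder <- 17 %% 5           # 2

print(c(precedence, grouped, natural, base_ten, base_two, quotient, remainder))
```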


Vector and Matrix

Function to create a vector: c()

c(1, 2, 3, 4, 5, 6)

c(1:6)

c(1:100)

Function to create a matrix: matrix()

matrix(1:6, nrow=2, ncol=3, byrow=TRUE)

matrix(1:6, nrow=2, ncol=3, byrow=FALSE)

To recall a previous command, press the "up" arrow key.
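
The effect of the byrow argument is easiest to see by comparing the two matrices side by side. The sketch below stores them under the illustrative names by_row and by_col and inspects their first rows.

```r
# Fill a 2 x 3 matrix with the numbers 1 to 6, row by row.
by_row <- matrix(1:6, nrow = 2, ncol = 3, byrow = TRUE)

# Fill the same shape column by column (this is the default behaviour).
by_col <- matrix(1:6, nrow = 2, ncol = 3, byrow = FALSE)

print(by_row)
print(by_col)

by_row[1, ]   # first row: 1 2 3
by_col[1, ]   # first row: 1 3 5

# Both matrices have 2 rows and 3 columns.
dim(by_row)   # 2 3
```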


Write Comments

Use # to write a comment:

# This is a comment that will not be run

matrix(1:6, nrow=2, ncol=3, byrow=TRUE) # fill the matrix by row

matrix(1:6, nrow=2, ncol=3, byrow=FALSE) # fill the matrix by column


Help

Getting help with an R function: help()

help(matrix)

help(c)

help(print)

?matrix
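
help() needs the exact function name. When you only remember part of a name, or just want to see a function's arguments, the following sketch may be useful. It uses the built-in functions apropos() and args(); the variable name norm_functions is illustrative.

```r
# Find all visible objects whose names end in "norm" (e.g. rnorm, dnorm).
norm_functions <- apropos("norm$")
print(norm_functions)

# Show the arguments of a function without opening its help page.
args(matrix)

# Search the help pages by keyword instead of by function name
# (left commented out because it can take a moment to run):
# help.search("normal distribution")
```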


Assign an Object

Use <- to assign values to an object:

a <- 4

a

a <- 2 + 2

a

2 + 2 -> a

a

A <- 14

A

You can often use the equal sign "=" to assign an object as well, but this is not recommended.
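
Two points from the commands above can be checked directly: object names are case-sensitive, and "=" inside a function call names an argument rather than creating an object. This is one possible reason for preferring <-, sketched here with a hypothetical helper function square() and a hypothetical argument name side_length.

```r
# Names are case-sensitive: a and A are two different objects.
a <- 4
A <- 14
a == A                                     # FALSE

# A hypothetical helper function, defined only for this illustration.
square <- function(side_length) side_length^2

# "=" inside a call only names the argument; no object is created.
area_named <- square(side_length = 3)
created_by_equals <- exists("side_length") # FALSE in a fresh session

# "<-" inside a call assigns first and then passes the value on.
area_assigned <- square(side_length <- 3)
created_by_arrow <- exists("side_length")  # TRUE

print(c(area_named, area_assigned))
```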


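Generate Random Numbers

Try the following commands that generate random number(s) from N(µ, σ²). Note that rnorm() is parameterised by the standard deviation σ (argument sd), not the variance σ²: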

rnorm(1)

rnorm(10)

rnorm(10, mean=2, sd=3)

help(rnorm)

Try the following commands that generate random numbers from Unif(a, b):

runif(1)

runif(10)

runif(10, min=2, max=3)
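
Because these numbers are random, every run gives different values. The sketch below uses set.seed() to make a run repeatable and checks that a large sample behaves as the distribution's parameters suggest. The seed value 2022 and the sample size are arbitrary choices.

```r
# Fix the random seed so the same "random" numbers are drawn every time.
set.seed(2022)

# A large sample from N(2, 3^2): mean 2, standard deviation 3.
x <- rnorm(10000, mean = 2, sd = 3)
mean(x)   # close to 2
sd(x)     # close to 3
var(x)    # close to 9, the variance

# A large sample from Unif(2, 3): all values lie between 2 and 3.
u <- runif(10000, min = 2, max = 3)
range(u)

# Resetting the seed reproduces exactly the same draws.
set.seed(2022)
first_draw <- rnorm(3)
set.seed(2022)
second_draw <- rnorm(3)
identical(first_draw, second_draw)   # TRUE
```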


Summary

Sometimes, when we have a big table and just want a brief look at it, we can use the functions head() and summary():

a <- rnorm(100000)

head(a)

tail(a)

head(a, n=10L)

b <- matrix(rnorm(10000), nrow=100) # 100 * 100 matrix

head(b)

summary(a)

summary(b)
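
To see what these functions return, it can help to look at the size of their output rather than the output itself. This is a small sketch that repeats the objects a and b from above so that it runs on its own.

```r
a <- rnorm(100000)
b <- matrix(rnorm(10000), nrow = 100)   # 100 x 100 matrix

# head() keeps the first 6 elements of a vector by default.
length(head(a))       # 6

# For a matrix, head() keeps the first 6 rows and all columns.
dim(head(b))          # 6 100
dim(head(b, n = 3))   # 3 100

# summary() of a numeric vector gives six named statistics.
names(summary(a))

# summary() of a matrix gives one set of statistics per column.
ncol(summary(b))      # 100
```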


Installing RStudio on your own machine

RStudio is a piece of free software that provides a more user-friendly GUI (graphical user interface) to R. It does not replace R. Statistical analyses are still run by typing code! For RStudio to work, you need to have both R and RStudio installed on your machine.

You can install RStudio as follows:
1. Go to www.rstudio.com
2. Click on Download under RStudio
3. Click on Download under RStudio Desktop
4. Download the installer appropriate for your operating system.
5. Install RStudio following the instructions.


First Taste of RStudio

Open RStudio and try the following:
1. Create an R script file: File > New File > R Script
2. Type the same commands you just tried in R
3. Select and run each of your commands
4. Clear your console/workspace
5. Save your R script file: File > Save (or Save As)
6. Customize your own RStudio layout and appearance
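
Clearing the workspace (step 4) can also be done from code. The sketch below creates two throwaway objects with made-up names, lists the workspace, and removes them again. In RStudio the console itself can be cleared with Ctrl+L.

```r
# Create two throwaway objects (the names are arbitrary).
scratch_value  <- 42
scratch_vector <- c(1, 2, 3)

# List the objects currently in the workspace.
ls()

# Remove specific objects by name.
rm(scratch_value, scratch_vector)

# To remove every object in the workspace at once, you could instead run:
# rm(list = ls())
```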


Load your first Dataset: Pokemon

pokemon <- read.csv(file="Desktop/Pokemon.csv", header=TRUE)

head(pokemon)

str(pokemon)

summary(pokemon)
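
If read.csv() cannot find the file, the path is usually the problem, since a relative path such as "Desktop/Pokemon.csv" is looked up from the current working directory. The sketch below shows the same workflow on a tiny made-up CSV file written to a temporary location. The rows and column names (Name, Type, HP) are invented for illustration and are not the course data set.

```r
# Where does R look for relative paths? In the working directory.
getwd()

# Write a tiny, made-up CSV file to a temporary location.
csv_path <- tempfile(fileext = ".csv")
writeLines(c("Name,Type,HP",
             "Alpha,Grass,45",
             "Beta,Fire,39",
             "Gamma,Water,44"),
           csv_path)

# Check that the file exists before trying to read it.
file.exists(csv_path)

# Read it in; header = TRUE uses the first line as column names.
toy <- read.csv(file = csv_path, header = TRUE)

head(toy)      # first rows
str(toy)       # column types
summary(toy)   # per-column summaries
```

The same file.exists() check can be applied to your own path before calling read.csv() on the real data.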


Tasks for You

1. Install R and RStudio on your own computer.
2. Try the commands you learned in this session.
3. Explore the Pokemon data a little using R.

To download the slides, R code and the Pokemon data for this session:
1. Go to blackboard.soton.ac.uk
2. Log in to your account
3. Select the module Statistical Computing or Statistical Computing for Data Scientists
4. Choose Course Content
5. Open the folder Introduction Session

If you are not able to open the module page on Blackboard, you can download the above materials from www.maths.lancs.ac.uk/~zhengc5/teaching.html.
