STAT 534: Statistical Computing Hari Narayanan [email protected]

Post on 14-Jan-2016

Course objectives

• Write programs in R and C tailored to the specifics of the statistics problems you want to solve
• Familiarize yourself with:
  – optimization techniques
  – Markov chain Monte Carlo (MCMC)

Logistics

• Class:
  – Tuesdays and Thursdays 12:00pm – 1:20pm
• Office hours:
  – Thursday 2:30pm – 4pm (Padelford B-301) or by appt
• Textbooks:
  – Robert & Casella: Introducing Monte Carlo Methods with R
  – Kernighan & Ritchie: C Programming Language
• Evaluation:
  – 4 assignments
  – 2 quizzes
  – Final project

Introduction to R

• R is a scripting language for statistical data manipulation and analysis
• R is the successor of S/S-PLUS
• R is a standard tool for professional statisticians
• R is free and available on major platforms (Windows, Unix, Mac)
• It is general-purpose and object-oriented
• It is an interpreted programming language

Getting R

• Main website: http://cran.r-project.org/
• ~25 standard packages come with a default download; many more contributed packages can be obtained from the main website
• Development environment/GUI:
  – RStudio http://www.rstudio.com/

First R interactive session

• Type interactive commands at the prompt
> 2+3
[1] 5
> 2==4
[1] FALSE
> 5/0
[1] Inf
> 0/0
[1] NaN
• Note that R is case sensitive
• Getting help
  – help(FALSE)
  – ?FALSE
• Ending session:
  – > q()
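A small sketch of two points from this session: case sensitivity, and the fact that Inf and NaN are ordinary numeric values that can be tested for. The variable names total and Total are made up for illustration.

```r
# Names that differ only in case refer to different objects.
total <- 2 + 3
Total <- 10
same_value <- (total == Total)   # FALSE: two distinct variables

# Inf and NaN are regular numeric values with their own predicates.
inf_is_finite <- is.finite(5 / 0)   # FALSE
nan_is_nan    <- is.nan(0 / 0)      # TRUE
inf_is_nan    <- is.nan(5 / 0)      # FALSE: Inf is not NaN
```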

R workspaces

• R creates and manipulates objects: variables, arrays of numbers, lists of characters, functions, and structures built from these components:
> a = 4
> b = 5
> objects() # list all the objects in this workspace
[1] "a" "b"
> ls() # same as objects()
[1] "a" "b"
> rm(a) # remove an object from this workspace
> ls()
[1] "b"
• If you save the workspace when quitting, objects of the current session are stored in .RData in the current folder and command history is stored in .Rhistory
  – These are reloaded every time you start R from the same directory
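The same save-and-reload mechanism can be driven by hand with save() and load(). The sketch below writes two objects to a temporary file (the path is chosen only for illustration), removes them, and restores them; save.image() does the same for the whole workspace.

```r
# Sketch: save selected objects to a file and restore them later.
a <- 4
b <- 5
ws_file <- tempfile(fileext = ".RData")   # illustrative location
save(a, b, file = ws_file)                # write a and b to disk

rm(a, b)                                  # remove them from the workspace
restored <- load(ws_file)                 # load() returns the restored names
restored                                  # "a" "b"
total <- a + b                            # objects are available again: 9
```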

Assignment

• Multiple ways to assign values (primitive values or the results of evaluating an expression) to variables:
> a = 2 + 3
> a <- 2 + 3
> 2 + 3 -> a
> assign("a", 2+3)
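One situation where assign() is useful is when the variable name is computed at run time; get() is its counterpart for lookup. A minimal sketch, with the names x1, x2, x3 invented for the example:

```r
# assign() takes the variable name as a string, so the name can be built.
for (i in 1:3) {
  assign(paste0("x", i), i^2)   # creates x1, x2, x3
}
x3                   # 9
x2_value <- get("x2")           # 4: look a variable up by its name

# The three operator forms bind the same value.
a = 2 + 3
b <- 2 + 3
2 + 3 -> d
all_same <- identical(a, b) && identical(b, d)   # TRUE
```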

Vectors

• Created using the c (concatenation) function:
> v = c(1,2)
> v
[1] 1 2
• A number by itself is considered a vector of length 1
• No nesting
> u=c(-4, v, 4)
> u
[1] -4  1  2  4
• Missing values: c(1, NA, 4)
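A short sketch of how missing values behave once they are in a vector: they propagate through most computations unless explicitly removed. It also shows that c() flattens its arguments. The variable names are illustrative.

```r
x <- c(1, NA, 4)
missing_flags <- is.na(x)              # FALSE TRUE FALSE
mean_with_na  <- mean(x)               # NA: the missing value propagates
mean_without  <- mean(x, na.rm = TRUE) # 2.5: NA dropped before averaging
n_missing     <- sum(is.na(x))         # 1

# c() flattens its arguments, so vectors never nest.
flat <- c(c(1, 2), c(3, c(4, 5)))
length(flat)                           # 5
```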

Operations on vectors

• Regular arithmetic operations apply (+, -, *, /, ^). The shorter vector is recycled to match the needed length:
> a=c(1,2) # recycled to 1 2 1 below
> b=c(1,2,3)
> r=3*a+b-1
Warning message:
In 3 * a + b :
  longer object length is not a multiple of shorter object length
> r
[1] 3 7 5
• Other math functions can be applied element-wise: sqrt, log, ...
• Other functions: max, min, length, sum, prod, mean, var, sort
> sort(c(4,3,7))
[1] 3 4 7
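Recycling only warns when the longer length is not a multiple of the shorter one. The sketch below contrasts the silent case with the warning case, capturing the warning with tryCatch so the script keeps running; variable names are illustrative.

```r
a <- c(1, 2)
b <- c(10, 20, 30, 40)
silent <- a + b          # 11 22 31 42: length 4 is a multiple of 2, no warning
scaled <- b * 2          # a single number is recycled: 20 40 60 80

# Lengths 2 and 3 do not divide evenly, so R raises a warning.
msg <- tryCatch(
  c(1, 2) + c(1, 2, 3),
  warning = function(w) conditionMessage(w)
)
msg                      # the warning text instead of the sum
```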

Logical operations

• Operators <, <=, >, >=, ==, &, |, !
> a=c(2,4)
> r1=a>3
> r1
[1] FALSE  TRUE
> r2=a>4
> r2
[1] FALSE FALSE
> r1 & r2
[1] FALSE FALSE
> r1 | r2
[1] FALSE  TRUE
> ! r1
[1]  TRUE FALSE
• Can be used in arithmetic operations: FALSE is coerced to 0, TRUE to 1
> r1 + 1
[1] 1 2
• Conversely, numbers used in logical operations are coerced: 0 to FALSE, nonzero to TRUE
> c(2,3) & c(0,1)
[1] FALSE  TRUE
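Because TRUE counts as 1 and FALSE as 0, summing or averaging a logical vector gives a count or a proportion. A minimal sketch with made-up data:

```r
x <- c(2, 4, 7, 1)
big <- x > 3              # FALSE TRUE TRUE FALSE
n_big    <- sum(big)      # 2: number of elements greater than 3
frac_big <- mean(big)     # 0.5: proportion of elements greater than 3
some_big <- any(big)      # TRUE: at least one element qualifies
all_big  <- all(big)      # FALSE: not every element qualifies
```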

Generating vectors

• : operator (high precedence in an expression)
> a=3
> 1:a
[1] 1 2 3
> 1:a+1
[1] 2 3 4
> 1:(a+1)
[1] 1 2 3 4
• seq function
> seq(from=2, to=4) # named arguments; same as seq(2,4)
[1] 2 3 4
> seq(to=4, from=2)
[1] 2 3 4
> seq(from=2, length=3)
[1] 2 3 4
• rep function
> a=c(1,2)
> rep(a, times=2)
[1] 1 2 1 2
> rep(a, each=2)
[1] 1 1 2 2
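A few further variations on these generators, as a sketch: the precedence of : in a subtraction, non-integer steps with seq, the countdown behaviour of 1:0, and per-element repeat counts. Variable names are illustrative.

```r
n <- 4
shifted <- 1:n - 1                 # 0 1 2 3: the : is applied before the -
shorter <- 1:(n - 1)               # 1 2 3

quarters <- seq(0, 1, by = 0.25)         # 0.00 0.25 0.50 0.75 1.00
thirds   <- seq(0, 1, length.out = 3)    # 0.0 0.5 1.0

# 1:0 counts down rather than producing an empty vector;
# seq_len(0) is the safe way to get "no indices".
countdown <- 1:0                   # 1 0
empty     <- seq_len(0)            # integer(0)

letters_rep <- rep(c("a", "b"), times = c(2, 1))   # "a" "a" "b"
```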

Manipulating vector data

• Simple indexing:
> a=c(2,3,8)
> a[1]
[1] 2
> a[5]
[1] NA
> a[-1]
[1] 3 8
• More complex:
> a[1:2]
[1] 2 3
> a[a>2 & a<7]
[1] 3
> a[c(1,1)]
[1] 2 2
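Index expressions can also appear on the left of an assignment, and which() converts a logical condition into positions. A minimal sketch using the same vector:

```r
a <- c(2, 3, 8)
positions <- which(a > 2)     # 2 3: positions where the condition holds
middle    <- a[-c(1, 3)]      # 3: drop the first and third elements

# A logical index on the left-hand side replaces only the selected elements.
a[a > 2] <- 0
a                             # 2 0 0
```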

Matrices

• Associating a dimension vector with a vector allows it to be treated by R as an array/matrix:
> a=c(2,3,8)
> attributes(a)
NULL
> dim(a) = c(3,1)
> a
     [,1]
[1,]    2
[2,]    3
[3,]    8
> attributes(a)
$dim
[1] 3 1

> matrix(c(1,2,3,4,5,6), nrow=2)
     [,1] [,2] [,3]
[1,]    1    3    5
[2,]    2    4    6
> matrix(c(1,2,3,4,5,6), nrow=2, byrow=TRUE)
     [,1] [,2] [,3]
[1,]    1    2    3
[2,]    4    5    6
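The two constructions above are equivalent: matrix() fills column by column by default, which is the same layout obtained by setting dim on a plain vector. A short sketch with illustrative names:

```r
v <- 1:6
m <- matrix(v, nrow = 2)       # filled column by column
m2 <- v
dim(m2) <- c(2, 3)             # same layout via the dim attribute
same_layout <- all(m == m2)    # TRUE

shape <- dim(m)                # 2 3
back  <- as.vector(m)          # 1 2 3 4 5 6: dropping dim recovers the vector

# byrow = TRUE fills row by row, which is the transpose of a column-wise
# fill into a 3 x 2 matrix.
by_row <- matrix(v, nrow = 2, byrow = TRUE)
is_transpose <- all(by_row == t(matrix(v, nrow = 3)))   # TRUE
```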

Operations on matrices

• Addition/subtraction/element-wise multiplication: +, -, *
• Matrix multiplication: %*%
• Transpose: function t(), e.g. t(matrix(c(1,2),nrow=1))
• diag function:
  – If the argument is a number, we get an identity matrix
> diag(2)
     [,1] [,2]
[1,]    1    0
[2,]    0    1
  – If the argument is a vector, we get a diagonal matrix with the elements of the vector
> diag(c(1,2))
     [,1] [,2]
[1,]    1    0
[2,]    0    2
  – If the argument is a matrix, we get the elements of its main diagonal
> m
     [,1] [,2]
[1,]    3    5
[2,]    4    6
> diag(m)
[1] 3 6
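The difference between * and %*% is easy to miss, so here is a small worked sketch. The matrix A and the name Id are chosen for illustration only.

```r
A  <- matrix(c(1, 2, 3, 4), nrow = 2)   # rows are (1, 3) and (2, 4)
Id <- diag(2)                           # 2 x 2 identity

elementwise <- A * A      # squares each entry: rows (1, 9) and (4, 16)
product     <- A %*% A    # matrix product: rows (7, 15) and (10, 22)
unchanged   <- A %*% Id   # multiplying by the identity returns A
At          <- t(A)       # transpose: rows (1, 2) and (3, 4)
```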

Indexing

• Similar to indexing vectors, except we have an indexing vector for every dimension:
> m
     [,1] [,2] [,3]
[1,]    1    3    5
[2,]    2    4    6
> m[2,2]
[1] 4
> m[c(2),c(2)] # indexing vectors
[1] 4
> m[c(1,2),c(2)] # first 2 rows and 2nd column
[1] 3 4
> m[c(1,2),c(2,3)]
     [,1] [,2]
[1,]    3    5
[2,]    4    6
> m[c(1,2),c(2,3)]=0
> m
     [,1] [,2] [,3]
[1,]    1    0    0
[2,]    2    0    0
> m[c(TRUE,FALSE),TRUE] # keep first row and all columns
[1] 1 0 0
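Note that selecting a single row or column returns a plain vector, as in the [1] 3 4 result above. A sketch of how to keep the matrix shape with drop = FALSE, plus an empty index to mean "everything" and a logical matrix as an index:

```r
m <- matrix(1:6, nrow = 2)

col2 <- m[, 2]                     # 3 4: an empty index selects all rows
is.matrix(col2)                    # FALSE: the result was dropped to a vector

col2_mat <- m[, 2, drop = FALSE]   # keep it as a 2 x 1 matrix
dim(col2_mat)                      # 2 1

row1   <- m[1, ]                   # 1 3 5
large  <- m[m > 4]                 # 5 6: a logical matrix selects elements
```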
