Intro to R Lecture 1

Welcome to CME 195 Introduction to R, Xiaotong Suo

Upload: danny-dewitt

Post on 11-Nov-2015


DESCRIPTION

First lecture of Introduction to R programming

TRANSCRIPT

  • Welcome to CME 195 Introduction to R

    Xiaotong Suo

  • Course overview

    Two parts of this short course: R basics, and R in statistics.

  • Course prerequisite

    No prior programming course is required. If you have already taken CS 106B or a course at the same level, you will probably not like this course.

    Some knowledge of statistics would help, but I will go over the basics in class.

  • Homework

    It is a short course. We will have 4 homework assignments, and you have to get at least 60% on each one to pass the course.


  • Today's Agenda

    Introduction to R; variables; functions; special values in R; a very brief introduction to vectors.

  • What is R?

    R is software for statistical computing and data analysis. It is an implementation of the S language.

    R is freely distributed software (www.r-project.org) with contributions from developers around the world. It is one of the main software packages for statistical computing.

  • Getting started

    Download R at www.r-project.org. RStudio has a nice interface, and you can get it for free at www.rstudio.com.

  • Getting started

    There are two ways to work in R:

    A conventional approach: you open a file, write a program describing what you intend to do, and run that program.

    An interactive approach: you interact with R and do whatever you want to do, one step at a time. We type in expressions, and R evaluates them and returns a value if needed.

    We combine both approaches most of the time.
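
    A minimal sketch of the two styles, assuming a throwaway script written to a temporary file (the file and the variable names total and doubled are made up for illustration):

```r
# Interactive style: type an expression; R evaluates it and prints the value.
1 + 2

# Conventional style: put the commands in a file, then run the whole file.
script_path <- tempfile(fileext = ".R")   # hypothetical script file
writeLines(c("total <- 1 + 2",
             "doubled <- total * 2"), script_path)
source(script_path)                       # runs every line of the file
doubled
```

    In practice you would write the script in an editor such as RStudio and run it line by line or all at once.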

  • Variables

    A variable in computer science is a name given to some storage location. In more practical terms, it is a binding between a symbol and a value. x
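
    As a small illustration of binding a symbol to a value (the names x, y, z and the numbers are arbitrary choices, not taken from the slide):

```r
# Bind the symbol x to the value 5 with the assignment operator <-.
x <- 5
# At the top level, = also performs assignment.
y = 10
# Using a variable looks up the value bound to its name.
z <- x + y
z
# Rebinding x does not change z, which already holds a computed value.
x <- 100
z
```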

  • Variables continued

    Both

  • Variables continued

    We can create a vector v, which holds many values, as follows: v <- c(1,2,3,4,5) (We will discuss vectors in R in more detail next lecture.) Here, c means concatenation.

  • Variables continued

    It is important to understand R's organization. As you create new variables in R, they are kept in the computer's memory. It is sometimes useful to know which variables are currently in memory and to be able to save or delete them. The commands ls() and ls.str() both list existing variables.
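
    A short sketch of listing and deleting variables. To avoid touching your own workspace, it assumes a scratch environment called demo_env (a made-up name); at the prompt you would normally just call ls() and rm() with no envir argument:

```r
# Work in a fresh environment so the example does not alter your workspace.
demo_env <- new.env()
assign("a", 1, envir = demo_env)
assign("b", "text", envir = demo_env)

ls(demo_env)              # names of the variables in demo_env
ls.str(envir = demo_env)  # names plus a short description of each value

# rm() deletes variables; here it removes "a" from demo_env.
rm("a", envir = demo_env)
remaining <- ls(demo_env)
remaining
```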

  • Variables continued

    x

  • Working directory

    At the beginning of each R session, a directory is attached to the session, called the working directory.

    To see the current working directory: getwd()

    To set the working directory: setwd()
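
    For example, the following sketch switches to R's temporary directory and back again (using tempdir() as the target is only an assumption for the demo; you would pass your own project folder):

```r
old_wd <- getwd()    # remember where we started
setwd(tempdir())     # switch to R's per-session temporary directory
new_wd <- getwd()
new_wd
setwd(old_wd)        # restore the original working directory
```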

  • Working directory continued

    Whenever you try to read or save a file without the full path, the working directory (wd) will be used. Typically the wd is the directory from which you start R.

    At the end of an R session, you can choose to save all the objects in memory. A file named .RData is then created for this purpose. The next time you start R from the same directory, this .RData file will be loaded automatically.

  • Working directory continued

    You can load the .RData from another directory with load().

    Note that only the file named exactly .RData is loaded automatically, whereas other files such as filename.RData are not. You need to load them with the function load().
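
    A minimal sketch of saving an object to a named .RData file and loading it back. The object scores and the file name scores.RData are hypothetical:

```r
scores <- c(90, 85, 77)                            # illustrative data
data_path <- file.path(tempdir(), "scores.RData")  # hypothetical file name
save(scores, file = data_path)                     # write the object to disk

rm(scores)                                         # remove it from memory
loaded_names <- load(data_path)                    # restore it; returns the object names
loaded_names
scores
```

    load() recreates the objects under their original names, so here scores is available again after the call.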

  • Working directory continued

    Another important concept to know is the search path. That is the sequence of environments in which R searches for whatever variable or function you request.

    You can see that hierarchy with search(). This hierarchy changes as you attach packages to, or detach them from, your R session.

    Type ?environment in R to find out how to get, set and create environments.
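
    A brief sketch of inspecting the search path and creating an environment of your own. The environment e and the variable answer are made-up names:

```r
search_path <- search()   # character vector of attached environments
search_path[1]            # the global environment comes first

# mean is found further along the search path, in the base package.
where_mean <- find("mean")
where_mean

# Create an environment, store a value in it, and read it back.
e <- new.env()
assign("answer", 42, envir = e)
get("answer", envir = e)
```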

  • Functions

    Besides variables, functions are the other most important concept in computer programming. A function is a piece of code that takes some input, called arguments, performs a specific task and possibly returns a value. To use a function properly, we must set up its arguments properly.

    In R we specify arguments either by name or by position.

  • Functions continued

    The function rnorm we used earlier: u
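
    To illustrate arguments by position versus by name, here is a sketch with rnorm, whose arguments are n, mean and sd. The particular values (5 draws, mean 10, sd 2) are arbitrary:

```r
set.seed(1)                       # fix the seed so both calls draw the same numbers
by_position <- rnorm(5, 10, 2)    # n, mean, sd matched by position

set.seed(1)
by_name <- rnorm(sd = 2, n = 5, mean = 10)   # named arguments can come in any order

identical(by_position, by_name)
```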

  • Functions continued

    There are a lot of built-in functions in R. Before writing your own function, I would first check whether an existing function is available.

    Sometimes it is hard to get useful results by googling "summation in R". Instead, you can google "summation in R cran".

    If you know the name of a built-in function but are not sure how to use it, open its help page, for example: ?rnorm

  • Functions continued

    We can also define a function:

    f
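
    The slide's own definition is cut off in this transcript, so the following is only an illustrative sketch of defining a function, with a made-up name square_plus and a made-up default argument offset:

```r
# Hypothetical function: square a number and add an offset.
square_plus <- function(x, offset = 0) {
  x^2 + offset    # the value of the last expression is returned
}

square_plus(3)
square_plus(3, offset = 1)
```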

  • Special values in R

    NA is used to represent missing values and stands for "not available". For example, after v <- c(1,2,3) and length(v) <- 4, R automatically fills in an NA at the end of v, since no value is provided for the fourth element.
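
    The same example as runnable code, followed by two common ways of handling missing values (is.na and the na.rm argument are additions for illustration):

```r
v <- c(1, 2, 3)
length(v) <- 4          # extend v; the new fourth slot has no value
v
is.na(v)                # flags which elements are missing
sum(v)                  # a missing value propagates through arithmetic
sum(v, na.rm = TRUE)    # drop missing values before summing
```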

  • Special values continued

    Inf and -Inf: If a computation results in a number that is too big, R will return Inf for a positive number and -Inf for a negative number. Examples: 2^1024, -2^1024, 1/0.
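
    These can be checked directly. The variable names and the call to is.finite are additions for illustration:

```r
big <- 2^1024     # exceeds the largest representable double
neg <- -2^1024
div <- 1 / 0
c(big, neg, div)

finite_flags <- is.finite(c(1, big, neg))   # FALSE for infinite values
finite_flags
.Machine$double.xmax                        # the largest finite double on this machine
```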

  • Special values continued

    NaN: Sometimes a computation will produce a result that makes little sense. In these cases, R often returns NaN, which stands for "not a number". Examples: Inf - Inf, 0/0.
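
    A quick sketch showing how NaN arises and how it differs from NA when tested (the names a and b are arbitrary):

```r
a <- Inf - Inf
b <- 0 / 0
c(a, b)

nan_flags <- is.nan(c(a, b, NA, 1))   # TRUE only for NaN
na_flags  <- is.na(c(a, b, NA, 1))    # TRUE for both NaN and NA
nan_flags
na_flags
```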

  • NULL

    NULL: A null object in R, represented by the symbol NULL. NULL is often used as an argument in functions to mean that no value was assigned to the argument. f1 = function(arg1, arg2 = NULL)
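
    The slide gives only the signature of f1. The body below is an assumption, added to show one common way to test for a NULL default with is.null:

```r
# Hypothetical body: arg2 defaults to NULL, meaning "not supplied".
f1 <- function(arg1, arg2 = NULL) {
  if (is.null(arg2)) {
    arg1            # no second argument: return arg1 unchanged
  } else {
    arg1 + arg2
  }
}

f1(1)
f1(1, 2)
length(NULL)        # NULL has length 0
```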

  • Data structures

    In order to work with a language we need to know the objects that the language offers. R offers 5 basic objects: vectors, matrices, factors, data frames and lists.
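
    As a preview, here is one small example of each of the five objects (all names and values are made up for illustration):

```r
vec <- c(1.5, 2.5, 3.5)                           # vector
mat <- matrix(1:6, nrow = 2)                      # matrix with 2 rows and 3 columns
fac <- factor(c("low", "high", "low"))            # factor: categorical data
df  <- data.frame(id = 1:3, score = vec)          # data frame: columns of equal length
lst <- list(name = "a", values = vec, ok = TRUE)  # list: elements of any type

# Report the (first) class of each object.
sapply(list(vec, mat, fac, df, lst), function(obj) class(obj)[1])
```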

  • Vectors

    A vector is a collection of objects which all have the same data type (also called mode). R supports many different modes: integer, double, logical, character and complex.
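
    The types can be inspected with typeof(). Note that mode() groups integer and double together as "numeric", so typeof() is used here to tell them apart. The vector mixed is a made-up example showing that c() coerces everything to one common type:

```r
typeof(1L)       # "integer"
typeof(1.5)      # "double"
typeof(TRUE)     # "logical"
typeof("a")      # "character"
typeof(1+2i)     # "complex"

# Mixing types in c() coerces all elements to a single type.
mixed <- c(1, "a", TRUE)
typeof(mixed)
```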

  • Vectors continued - creating a vector

    To create a vector use the function vector, or simply create a new variable. x1
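
    The slide's own examples are cut off in this transcript. As a sketch, here are a few common ways to create vectors (the variable names are illustrative):

```r
zeros  <- vector("numeric", 3)     # three zeros
blanks <- vector("character", 2)   # two empty strings
evens  <- c(2, 4, 6, 8)            # create by listing the values
counts <- 1:10                     # integer sequence 1, 2, ..., 10
steps  <- seq(0, 1, by = 0.25)     # 0, 0.25, 0.5, 0.75, 1

length(steps)
```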

  • Vectors continued

    We use [] to access the elements of a vector. Thus x[1] is the first element of x, and so on. Examples: x3[1]+2, x3[3:10], X3[1]. Note: indexing in R is 1-based!
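
    A runnable sketch of indexing. The contents of x3 are not shown in this transcript, so the values below (10, 20, ..., 100) are an assumption:

```r
x3 <- seq(10, 100, by = 10)   # assumed values: 10, 20, ..., 100
x3[1]            # first element
x3[1] + 2        # arithmetic on a single element
x3[3:10]         # elements 3 through 10
x3[-1]           # everything except the first element
x3[x3 > 50]      # elements satisfying a condition
```

    Remember that R is case-sensitive: x3 and X3 are different names.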

  • Vectors continued

    x1

  • Next time

    5 basic objects: more on vectors, matrices, factors, data frames and lists.