Upload: moris-greene

Post on 23-Dec-2015

Page 1: R Introduction to Statistical Analysis Using R Nick, Caroline, Tanya

Introduction to Statistical Analysis Using R

Nick, Caroline, Tanya

What is R?

• R is a programming language for data analysis and graphics

• Information about R can be found at http://www.R-project.org

• The R system contains two major components:

1. Base system – contains the R language software and the high-priority add-on packages listed on pg. 3

2. User-contributed add-on packages

Who uses R?

• Scientists in many fields, especially those working in developing countries

– It allows universal free access to state-of-the-art tools for statistical data analysis

– Widely used for teaching statistics to undergraduate and graduate students because students can use it free of cost

Installing R–Base System

1. Go to http://CRAN.R-project.org

2. Choose your computer from the list (Linux, MacOS X, or Windows)

3. Click on Base (the choices are Base and Contrib)

4. Click on R-2.6.1-win32.exe

5. Save the installer, then run it to install R

Getting Started

• Changing the prompt – pg. 3

• Example – using R as a pocket calculator – pg. 3

• Storing vs. printing

• R is not space sensitive, but it is case sensitive
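The points on this slide can be tried directly at the console. The following is a minimal sketch; the names `my_total`, `y`, and `My_Total` are made up for illustration and do not come from the slides.

```r
# Change the prompt from "> " to "R> " for the current session.
options(prompt = "R> ")

# R as a pocket calculator: the result is printed immediately.
(2 + 3) * 4

# Storing: assign the result to a name; nothing is printed.
my_total <- (2 + 3) * 4

# Printing: type the name (or call print) to see the stored value.
my_total
print(my_total)

# Spaces do not matter: this is the same as y <- sqrt(16).
y<-sqrt(  16 )

# Case does matter: My_Total is a different name from my_total,
# and no object with that name has been created.
exists("My_Total")
```

The prompt change lasts only for the current session; restarting R restores the default prompt.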

Getting Help in R

• The Help system is a collection of manual pages describing each function and data set that comes with R

• The help/manual page is shown when the name of the function we would like help for is supplied to the help function

– Ex. help("mean") or help(mean) or ?mean
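As a short sketch, the calls below open the manual page for `mean` in the two equivalent ways. The `apropos()` call is an addition not mentioned on the slide: it lists objects whose names match a pattern, which can help when the exact function name is not known.

```r
# Two equivalent ways to open the manual page for mean().
help("mean")
?mean

# Not on the slide: list objects whose names start with "mean".
hits <- apropos("^mean")
print(hits)
```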

Installing add-on packages

• All packages are available on: http://CRAN.R-project.org/src/contrib/PACKAGES.html

– Pick package from list and download

• To install and load an add-on package:

1. install.packages("package name")

2. library("package name")
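The two steps can be combined so that a package is downloaded only when it is missing. This is a sketch under stated assumptions: `ensure_package` is a hypothetical helper name, and the mirror URL is one common CRAN mirror, not something the slides specify.

```r
# Hypothetical helper: install a package only if it is not yet
# installed, then attach it with library().
ensure_package <- function(pkg, repos = "https://cloud.r-project.org") {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg, repos = repos)  # step 1: download and install
  }
  library(pkg, character.only = TRUE)     # step 2: attach for this session
  invisible(TRUE)
}

# "stats" ships with R, so this call downloads nothing.
ensure_package("stats")
```

Calling `ensure_package("HSAUR")` would follow the same path for the package used in the next example. Installation is needed once per machine, while `library()` must be called in every new session.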

Forbes2000 Example

• Go to http://CRAN.R-project.org/src/contrib/PACKAGES.html and select HSAUR from the list

• Choose the file that pertains to your computer, e.g. the Windows binary HSAUR 1.2-1.zip

• Save to desktop

• Find the Forbes2000 list in the rawdata folder

• Install in R:

– install.packages("HSAUR")

– library("HSAUR")

Working with Data Sets – Ex. Forbes 2000 list

• Vector – elementary structure for data handling in R; a set of simple elements, all being objects of the same class

– Ex. First 3 companies in Forbes – Forbes2000[, "name"][1:3]

• Variable names – headings

– names(Forbes2000)

• Finding the structure of a data set – useful for large data sets

– str(Forbes2000)

• Dimensions

– dim(Forbes2000)

– nrow(Forbes2000)

– ncol(Forbes2000)
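The same calls can be tried without installing anything. The sketch below uses a small made-up data frame, `companies`, as a stand-in for Forbes2000; its column names mirror the ones used on the slides, but every value is invented for illustration.

```r
# Invented stand-in for Forbes2000 (values are NOT real data).
companies <- data.frame(
  name   = c("Alpha", "Beta", "Gamma", "Delta"),
  sales  = c(10.5, 20.0, 7.25, 3.0),
  assets = c(100, 250, 80, 40),
  stringsAsFactors = FALSE
)

# Extract the "name" column as a vector, then keep elements 1 to 3.
first_three <- companies[, "name"][1:3]
print(first_three)

# Variable names (column headings).
names(companies)

# Compact overview of the structure: one line per column.
str(companies)

# Dimensions: rows and columns together, then separately.
dim(companies)
nrow(companies)
ncol(companies)
```

With HSAUR loaded, replacing `companies` with `Forbes2000` gives the calls shown on the slide.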

Simple Summary Statistics

• Mean – mean(Forbes2000[, "sales"])

• Median – median(Forbes2000[, "assets"])

• Range – range(Forbes2000[, "sales"])
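A minimal sketch of these three functions follows, again on an invented stand-in data frame (`companies`, with made-up values) so that it runs without HSAUR. The last part goes slightly beyond the slide: it assumes a column with a missing value and shows the `na.rm = TRUE` argument, which tells these functions to drop missing values before computing.

```r
# Invented stand-in for Forbes2000 (values are NOT real data).
companies <- data.frame(
  sales  = c(10.5, 20.0, 7.25, 3.0),
  assets = c(100, 250, 80, 40)
)

mean_sales    <- mean(companies[, "sales"])     # arithmetic mean
median_assets <- median(companies[, "assets"])  # middle value
range_sales   <- range(companies[, "sales"])    # smallest and largest

print(mean_sales)
print(median_assets)
print(range_sales)

# With a missing value the mean is NA unless na.rm = TRUE is given.
profits <- c(1, NA, 3)
mean(profits)
mean(profits, na.rm = TRUE)
```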

Importing Data Not Part of a Package

• When is this used?

– Most data sets are not part of a downloadable package

– Most people need to import their own data sets into R

• Example – Airport data (download to Desktop)

• In R:

– File → Change Dir → Desktop → OK

– airport <- read.table("airport.csv", header = TRUE, sep = ",", row.names = 1), where airport can be any name you choose for the imported data
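To show what each argument of `read.table` does, the sketch below first writes a tiny stand-in for airport.csv to a temporary folder and then imports it. The column names Rank, Shop, and Domestic come from later slides; the Airport column, the airport codes, and all numbers are invented for illustration.

```r
# Write a tiny stand-in CSV file (all values are invented).
csv_path <- file.path(tempdir(), "airport.csv")
writeLines(c(
  "Airport,Rank,Shop,Domestic",
  "AAA,1,30,12",
  "BBB,2,25,40",
  "CCC,3,10,8"
), csv_path)

# header = TRUE : the first line holds the column names
# sep = ","     : fields are separated by commas
# row.names = 1 : use the first column as row labels
airport <- read.table(csv_path, header = TRUE, sep = ",", row.names = 1)

print(airport)
dim(airport)
```

Because the first column becomes the row labels, the imported data frame has three columns, not four. From the console, `setwd()` does the same job as the File → Change Dir menu, and `read.csv()` is a shortcut for `read.table()` with `header = TRUE` and `sep = ","` already set.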

Making a Graph

• Graph of “Rank” of airport vs. “Shop”

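• plot(Rank ~ Shop, data = airport, pch = "O"), where airport is the name you gave the imported data; note the lowercase plot, since R is case sensitive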
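The formula `Rank ~ Shop` puts Rank on the vertical axis and Shop on the horizontal axis. Below is a minimal sketch with invented values standing in for the airport data. It writes the plot to a temporary PDF file so that it also runs outside an interactive session; the file name and the axis labels are illustrative choices.

```r
# Invented stand-in values; replace with your imported airport data.
airport <- data.frame(
  Rank = c(1, 2, 3, 4, 5),
  Shop = c(30, 25, 18, 12, 10)
)

# Send the plot to a temporary PDF file.
# Drop the pdf() and dev.off() lines to draw on screen instead.
plot_file <- file.path(tempdir(), "airport_plot.pdf")
pdf(plot_file)

# Formula interface: Rank (y-axis) against Shop (x-axis).
# pch = "O" draws each point as the letter O.
plot(Rank ~ Shop, data = airport, pch = "O",
     xlab = "Shop", ylab = "Rank", main = "Airport rank vs. Shop")

dev.off()
```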

Homework

• Change the prompt ">" to "R>"

• Import the Airport data set from Excel

• Print the data set in R

• Find the dimensions, the number of columns, and the number of rows in the data set

• Find the structure of the data set

• Find the median of the category "Shop"

• Find the mean of "Domestic"

Contact info

• Tanya – [email protected]

• Caroline – [email protected]

• Nick – [email protected]