Post on 23-Dec-2015

Introduction to Statistical Analysis Using R

Nick, Caroline, Tanya

What is R?

• R is a programming language for data analysis and graphics

• All information about R is found on http://www.R-project.org

• The R system contains two major components:

1. Base system – contains the R language software and the high-priority add-on packages listed on pg. 3

2. User-contributed add-on packages

Who uses R?

• Scientists in many fields, especially those working in developing countries

– It allows universal free access to state-of-the-art tools for statistical data analysis

– Widely used for teaching statistics to undergraduates and graduates because students can use it free of cost

Installing R – Base System

1. Go to http://CRAN.R-project.org

2. Choose your computer from the list (Linux, MacOS X, or Windows)

3. Click on Base (the choices are Base or Contrib)

4. Click on R-2.6.1-win32.exe

5. Save R

Getting Started

• Changing prompt – pg. 3

• Example – using R as a pocket calculator – pg. 3

• Storing vs. printing

• R is not space sensitive, but it is case sensitive
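
The points above can be tried in a short session. This is a minimal sketch, not the pg. 3 example from the handout; the object names x, y and X are chosen here purely for illustration.

```r
# Change the prompt from ">" to "R> " for the current session
options(prompt = "R> ")

# R as a pocket calculator: the result is printed but not kept
1 + 2 * 3

# Storing: assignment saves the value without printing it
x <- 1 + 2 * 3

# Printing: typing the name (or calling print) shows the stored value
x
print(x)

# Spacing does not matter: both lines below do the same thing
y<-x+1
y <- x + 1

# Case does matter: X and x are two different objects
X <- 100
x == X
```

The prompt change lasts only until R is closed; it has to be set again in a new session.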

Getting Help in R

• The Help system is a collection of manual pages describing each function and data set that comes with R

• The help (manual) page is shown when the name of the function we want help for is supplied to the help function

– Ex. help("mean") or help(mean) or ?mean

Installing add-on packages

• All packages are available on: http://CRAN.R-project.org/src/contrib/PACKAGES.html

– Pick package from list and download

• To install and load an add-on package:

1. install.packages("package name")

2. library("package name")

Forbes2000 Example

• Go to http://CRAN.R-project.org/src/contrib/PACKAGES.html and select HSAUR from the list

• Choose what pertains to your computer, e.g. Windows binary HSAUR 1.2-1.zip

• Save to desktop

• Find Forbes2000 list in rawdata folder

• Install in R:

– install.packages("HSAUR")

– library("HSAUR")
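
The install-and-load steps can also be run entirely from the R console. The sketch below assumes an internet connection; the mirror URL in repos is an assumption, and any CRAN mirror should work in its place.

```r
# Install HSAUR only if it is not already present (needs internet access).
# The repos URL is an assumed mirror, not one named in the slides.
if (!requireNamespace("HSAUR", quietly = TRUE)) {
  install.packages("HSAUR", repos = "https://cloud.r-project.org")
}

# Attach the package, then load the Forbes2000 data frame it ships with
library("HSAUR")
data("Forbes2000", package = "HSAUR")

# Confirm that the object is now available in the session
class(Forbes2000)
```

install.packages only needs to be run once per computer, while library has to be called in every new R session.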

Working with Data Sets – Ex. Forbes 2000 list

• Vector – elementary structure for data handling in R; a set of simple elements, all being objects of the same class

– Ex. First 3 companies in Forbes – Forbes2000[,"name"][1:3]

• Variable names – headings

– names(Forbes2000)

• Finding the structure of a data set – useful for large data sets

– str(Forbes2000)

• Dimensions

– dim(Forbes2000)

– nrow(Forbes2000)

– ncol(Forbes2000)
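
Put together, the commands above might be run as follows. This sketch assumes the HSAUR package is already installed; the object name companies is introduced here for illustration only.

```r
# Assumes the HSAUR package is already installed
data("Forbes2000", package = "HSAUR")

# A single column of a data frame is a vector
companies <- Forbes2000[, "name"]
class(companies)     # the class shared by every element
companies[1:3]       # first three company names

# Variable names (column headings)
names(Forbes2000)

# Compact overview of every column: type and first few values
str(Forbes2000)

# Dimensions: dim gives rows and columns together
dim(Forbes2000)
nrow(Forbes2000)
ncol(Forbes2000)

# Two other ways to get the same three names
Forbes2000$name[1:3]
Forbes2000[1:3, "name"]
```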

Simple Summary Statistics

• Mean – mean(Forbes2000[,"sales"])

• Median – median(Forbes2000[,"assets"])

• Range – range(Forbes2000[,"sales"])
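
A short sketch of these summaries in one place, again assuming HSAUR is installed. The na.rm argument is not covered on the slide; it is shown because mean, median and range return NA when a column contains missing values, and the code checks the profits column for them rather than assuming.

```r
# Assumes the HSAUR package is already installed
data("Forbes2000", package = "HSAUR")

sales <- Forbes2000[, "sales"]

mean(sales)      # arithmetic mean
median(sales)    # middle value
range(sales)     # smallest and largest value, in that order
summary(sales)   # minimum, quartiles, mean and maximum in one call

# Check a column for missing values before summarising it
profits <- Forbes2000[, "profits"]
anyNA(profits)

# na.rm = TRUE drops missing values before computing the statistic
mean(profits, na.rm = TRUE)
```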

Importing Data Not Part of a Package

• When is this used?

– Most data sets are not part of a downloadable package

– Most people need to import their own data sets into R

• Example – Airport data (download to Desktop)

• In R:

– File → Change Dir → Desktop → OK

– name given <- read.table("airport.csv", header = TRUE, sep = ",", row.names = 1), where "name given" is the object name you choose
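
To show what the read.table arguments do without needing the real file, the sketch below first writes a tiny stand-in airport.csv to a temporary folder. The rows and the Airport column are invented for illustration and are not the actual airport data; only the column names Rank, Shop and Domestic come from these slides. The object name airports plays the role of "name given".

```r
# Write a small stand-in file; these rows are invented for illustration
csv_path <- file.path(tempdir(), "airport.csv")
writeLines(c(
  "Airport,Rank,Shop,Domestic",
  "A,1,12,30",
  "B,2,8,45",
  "C,3,15,20"
), csv_path)

# header = TRUE : the first line holds the column names
# sep = ","     : fields are separated by commas
# row.names = 1 : the first column supplies the row labels
airports <- read.table(csv_path, header = TRUE, sep = ",", row.names = 1)

# Print the imported data set and check its shape
airports
dim(airports)
rownames(airports)
```

With the real file saved on the Desktop, setwd can replace the File → Change Dir menu step, and "airport.csv" can then be passed to read.table directly.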

Making a Graph

• Graph of “Rank” of airport vs. “Shop”

• plot(Rank ~ Shop, data = name given, pch = "O"), where "name given" is the object created when the data were imported
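
A minimal sketch of the plotting call, using an invented three-row data frame named airports in place of the imported data; the values are made up for illustration. In the formula Rank ~ Shop, the variable on the left goes on the vertical axis and the one on the right on the horizontal axis.

```r
# Invented values standing in for the imported airport data
airports <- data.frame(Rank = c(1, 2, 3), Shop = c(12, 8, 15))

# Scatter plot of Rank (vertical) against Shop (horizontal);
# pch = "O" draws each point as the letter O
plot(Rank ~ Shop, data = airports, pch = "O",
     xlab = "Shop", ylab = "Rank", main = "Rank vs. Shop")
```

Note that the function name is lowercase and the data argument takes the object itself, not its name in quotes.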

Homework

• Change the prompt “>” to “R>”

• Import the Airport data set from Excel

• Print the data set in R

• Find the dimensions, the number of columns, and the number of rows in the data set

• Find the structure of the data set

• Find the median of category “Shop”

• Find the mean of “Domestic”

Contact info

• Tanya – tgranch@luc.edu

• Caroline – cweber4@luc.edu

• Nick – ngundru@luc.edu
