Introduction to R (LECT 1)

Upload: fahad-nasir

Post on 02-Jun-2018

TRANSCRIPT

  • 8/11/2019 Introduction to R (LECT 1)

    1/15

    Dr. Sohail Akram

    INTRODUCTION TO R

    R

    R is a computer language that allows the user to program algorithms and use tools that have been programmed by others.

    R was originally written by Ross Ihaka and Robert Gentleman, at the University of Auckland.

    It is an implementation of the S language, which was principally developed by John Chambers.

    R

    R is an Open Source (and freely available) environment for statistical computing and graphics.

    The Comprehensive R Archive Network (CRAN) links provide binary downloads for Windows, for Mac OS X and for several flavours of Linux.

    Source code is also available.

    R

    R is under active development - typically two major releases per year.

    R provides data manipulation, display facilities and most statistical procedures. It can be extended with packages containing data, code and documentation.

    Currently there are more than 2400 contributed packages on CRAN.

    R HISTORY

    Statistical programming language S developed at Bell Labs in 1976 (at the same time as UNIX)

    Intended to interactively support research and data analysis projects

    Exclusively licensed to Insightful (S-Plus)

    R: Open source platform similar to S, developed by R. Gentleman and R. Ihaka (U of Auckland, NZ) during the 1990s

    Since 1997: international R-core development team

    Updated versions available every couple of months

    WHAT CAN YOU DO WITH R ?

    You can ...

    do calculations

    perform statistical analysis (using available code)

    create powerful graphics

    write your own functions
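
    The four items above can be tried in a few lines. This is only a minimal sketch: the vector x, the helper std_error, and the choice of the built-in cars dataset are illustrative assumptions, not part of the original slides.

```r
# Do calculations: R works as an interactive calculator.
x <- c(2, 4, 6, 8)        # a small numeric vector
total <- sum(x)
avg <- mean(x)

# Perform statistical analysis using available code:
# fit a linear regression on the built-in 'cars' dataset.
fit <- lm(dist ~ speed, data = cars)
slope <- coef(fit)[["speed"]]

# Create graphics: scatter plot with the fitted regression line.
plot(cars$speed, cars$dist,
     xlab = "Speed (mph)", ylab = "Stopping distance (ft)")
abline(fit)

# Write your own functions: standard error of the mean
# (std_error is a hypothetical helper name).
std_error <- function(v) sd(v) / sqrt(length(v))
se <- std_error(x)
```

    When run non-interactively (e.g. with Rscript), the plot is written to a file instead of a window.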

    WHAT R IS AND WHAT IT IS NOT

    R is

    a programming language

    a statistical package

    an interpreter

    Open Source

    R is not

    a database

    a collection of black boxes

    a spreadsheet software package

    commercially supported

    OPEN SOURCE

    Provides full access to algorithms and their implementation.

    Gives you the ability to fix bugs and extend software.

    Provides a forum allowing researchers to explore and expand the methods used to analyze data.

    Is the product of thousands of leading experts in the fields they know best.

    Ensures that scientists around the world - and not just ones in rich countries - are the co-owners of the software tools needed to carry out research.

    Promotes reproducible research by providing open and accessible tools.

    Most of R is written in R! This makes it quite easy to see what functions are actually doing.
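
    As a small illustration of the last point, the source of a function can be inspected directly from the console. The choice of sd and sum here is an assumption made for the example; functions implemented in compiled code do not show R source.

```r
# Typing a function's name without parentheses prints its definition.
sd

# deparse() returns the same source as a character vector,
# which is handy for inspecting it programmatically.
src <- deparse(sd)
cat(src, sep = "\n")

# Functions implemented in compiled code print a .Primitive() call
# instead of R source.
sum
```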

    R ADVANTAGES

    Fast and free.

    State of the art: Statistical researchers provide their methods as R packages. SPSS and SAS are years behind R!

    2nd only to MATLAB for graphics.

    Mx, WinBUGS, and other programs use or will use R.

    Active user community

    Excellent for simulation, programming, computer intensive analyses, etc.

    Forces you to think about your analysis.

    Interfaces with database storage software (SQL)

    R DISADVANTAGES

    Not user friendly at the start - steep learning curve, minimal GUI.

    No commercial support; figuring out correct methods or how to use a function on your own can be frustrating.

    Easy to make mistakes and not notice them.

    Working with large datasets is limited by RAM.

    Data prep & cleaning can be messier & more mistake prone in R vs. SPSS or SAS.
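
    The RAM limitation comes from R keeping objects in memory. A rough sketch of how to check how much memory an object takes, assuming an illustrative vector x of one million numbers:

```r
# A numeric vector of one million values (8 bytes each, plus a small header).
x <- numeric(1e6)

# object.size() estimates the memory used by a single object.
size <- object.size(x)
print(size, units = "auto")

# Removing objects that are no longer needed lets R reclaim the memory.
rm(x)
```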

    R VS COMMERCIAL PACKAGES

    R:

    Many different datasets (and other objects) available at the same time

    Datasets can be of any dimension

    Functions can be modified

    Experience is interactive - you program until you get exactly what you want

    Commercial packages:

    One dataset available at a given time

    Datasets are rectangular

    Functions are proprietary

    Experience is passive - you choose an analysis and they give you everything they think you need
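
    The R side of this comparison can be sketched in a short session. All object names below (heights, scores, cube, people, my_mean) are hypothetical and chosen only for illustration.

```r
# Several datasets and other objects can live in the workspace at once.
heights <- c(150, 160, 170)                          # a vector
scores  <- matrix(1:6, nrow = 2)                     # a 2 x 3 matrix
cube    <- array(1:24, dim = c(2, 3, 4))             # a 3-dimensional array
people  <- data.frame(name = c("A", "B"), age = c(30, 40))

# Data need not be rectangular: arrays can have any number of dimensions.
dim(cube)

# Functions are ordinary objects, so a modified version is easy to write.
# Here my_mean wraps mean() so that missing values are ignored by default.
my_mean <- function(x, ...) mean(x, na.rm = TRUE, ...)
my_mean(c(1, NA, 3))
```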

    R VS COMMERCIAL PACKAGES

    R:

    One stop shopping - almost every analytical tool you can think of is available

    R is free and will continue to exist. Nothing can make it go away, its price will never increase.

    Commercial packages:

    Tend to have limited scope, forcing you to learn additional programs; extra options cost more and/or require you to learn a different language (e.g., SPSS Macros)

    They cost money. There is no guarantee they will continue to exist, but if they do, you can bet that their prices will always increase

    INSTALLING R

    Go to http://cran.r-project.org/ and select either:

    Mac OS X

    Windows and base

    Download the latest version (2.14.1 at the time of writing).

    Install and Open.
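
    Once R is open, one quick way to confirm which version was installed is to ask R itself. This is a small sketch to type at the console; the output depends on the installation.

```r
# The version string of the running R installation.
R.version.string

# The major and minor components are available separately as well.
paste(R.version$major, R.version$minor, sep = ".")
```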

    GETTING STARTED

    The R GUI.

    R PACKAGES

    Applications of R normally use a package; i.e., a library of special functions designed for a specific problem.

    Hundreds of packages are available, mostly written by users.

    A user normally only loads a handful of packages for a particular analysis (e.g., library(MASS)).

    Standards determine how a package is structured, works well with other packages and creates new data types in an easily used manner.

    Standardization makes it easy for users to learn new packages.
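
    A typical workflow with the MASS package mentioned above might look like the following sketch. It assumes MASS is already installed (it normally ships with R as a recommended package); the Boston dataset is used only as an example of what a package provides.

```r
# One-time download from CRAN, only needed if the package is missing:
# install.packages("MASS")

# Attach the package for the current session.
library(MASS)

# After loading, the package's functions and datasets are available.
head(Boston)          # a dataset shipped with MASS
n_rows <- nrow(Boston)

# search() lists the packages currently attached.
search()
```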