rstats

Upload: anandh-kumar

Post on 02-Apr-2018

220 views

Category:

Documents


0 download

TRANSCRIPT

  • 7/27/2019 Rstats

    1/42

    Introduction To R

  • 7/27/2019 Rstats

    2/42

    What is R?

    The R statistical programming language is a freeopen source package based on the S languagedeveloped by Bell Labs.

    The language is very powerful for writing programs.Many statistical functions are already built in.Contributed packages expand the functionality tocutting edge research.Since it is a programming language, generatingcomputer code to complete tasks is required.

  • 7/27/2019 Rstats

    3/42

    Getting Started

    Where to get R?Go to www.r-project.org Downloads: CRANSet your Mirror: Anyone in the USA is fine.

    Select Windows 95 or later.Select base.Select R-2.4.1-win32.exe

    The others are if you are a developer and wish to changethe source code.

    UNT course website for R:http://www.unt.edu/rss/SPLUSclasslinks.html

    http://www.r-project.org/http://www.biometrics.mtu.edu/CRAN/bin/windows/base/R-2.4.1-win32.exehttp://www.unt.edu/rss/SPLUSclasslinks.htmlhttp://www.unt.edu/rss/SPLUSclasslinks.htmlhttp://www.biometrics.mtu.edu/CRAN/bin/windows/base/R-2.4.1-win32.exehttp://www.biometrics.mtu.edu/CRAN/bin/windows/base/R-2.4.1-win32.exehttp://www.biometrics.mtu.edu/CRAN/bin/windows/base/R-2.4.1-win32.exehttp://www.biometrics.mtu.edu/CRAN/bin/windows/base/R-2.4.1-win32.exehttp://www.biometrics.mtu.edu/CRAN/bin/windows/base/R-2.4.1-win32.exehttp://www.r-project.org/http://www.r-project.org/http://www.r-project.org/
  • 7/27/2019 Rstats

    4/42

    Getting Started

    The R GUI?

  • 7/27/2019 Rstats

    5/42

    Getting Started

    Opening a script.This gives you a script window.

  • 7/27/2019 Rstats

    6/42

    Getting Started

    Basic assignment and operations. Arithmetic Operations:

    +, -, *, /, ^ are the standard arithmetic operators.

    Matrix Arithmetic.* is element wise multiplication%*% is matrix multiplication

    AssignmentTo assign a value to a variable use < -

  • 7/27/2019 Rstats

    7/42

    Getting Started

    How to use help in R?R has a very good help system built in.If you know which function you want help withsimply use ?_______ with the function in theblank.Ex: ?hist.If you dont know which function to use, then usehelp.search(_______). Ex: help.search(histogram).

  • 7/27/2019 Rstats

    8/42

    Importing Data

    How do we get data into R?Remember we have no point and click First make sure your data is in an easy toread format such as CSV (Comma SeparatedValues).Use code:

    D

  • 7/27/2019 Rstats

    9/42

  • 7/27/2019 Rstats

    10/42

    Working with data.

    Subsetting data.Use a logical operator to do this.

    ==, >,

  • 7/27/2019 Rstats

    11/42

    Basic Graphics

    Histogramhist(D$wg)

  • 7/27/2019 Rstats

    12/42

    Basic Graphics

    Add a title The main statementwill give the plot an

    overall heading.hist(D$wg ,main=Weight Gain )

  • 7/27/2019 Rstats

    13/42

    Basic Graphics

    Adding axis labels Use xlab and ylabto label the X and Yaxes, respectively.hist(D$wg ,main=WeightGain ,xlab=WeightGain , ylab=Frequency )

  • 7/27/2019 Rstats

    14/42

    Basic Graphics

    Changing colors Use the col statement.

    ?colors will give you

    help on the colors.Common colors maysimply put in using thename.hist(D$wg,

    main=WeightGain,xlab=WeightGain, ylab=Frequency,col=blue)

  • 7/27/2019 Rstats

    15/42

    Basic Graphics Colors

  • 7/27/2019 Rstats

    16/42

  • 7/27/2019 Rstats

    17/42

    Boxplots

    Change it!boxplot(D$wg,main='Weight Gain',ylab='WeightGain (lbs)')

  • 7/27/2019 Rstats

    18/42

    Box-Plots - Groupings

    What if we want several box plots side byside to be able to compare them.First Subset the Data into separate variables.

    wg.m

  • 7/27/2019 Rstats

    19/42

    Boxplots Groupings

  • 7/27/2019 Rstats

    20/42

    Boxplots - Groupings

    boxplot(wg.m$wg, wg.f$wg, main='Weight Gain (lbs)',ylab='Weight Gain', names = c('Male','Female'))

  • 7/27/2019 Rstats

    21/42

    Boxplot Groupings

    Do it by shiftwg.7a

  • 7/27/2019 Rstats

    22/42

    Boxplots Groupings

  • 7/27/2019 Rstats

    23/42

    Scatter Plots

    Suppose we have two variables and we wishto see the relationship between them.

    A scatter plot works very well.R code:

    plot(x,y)

    Exampleplot(D$metmin,D$wg)

  • 7/27/2019 Rstats

    24/42

    Scatterplots

  • 7/27/2019 Rstats

    25/42

    Scatterplots

    plot(D$metmin,D$wg,main='Met Minutes vs. Weight Gain',xlab='Mets (min)',ylab='Weight Gain (lbs)')

  • 7/27/2019 Rstats

    26/42

    Scatterplots

    plot(D$metmin,D$wg,main='Met Minutes vs. Weight Gain',

    xlab='Mets (min)',ylab='Weight Gain (lbs)', pch=2 )

  • 7/27/2019 Rstats

    27/42

    Line Plots

    Often data comes through time.Consider Dell stock

    D2

  • 7/27/2019 Rstats

    28/42

    Line Plots

  • 7/27/2019 Rstats

    29/42

  • 7/27/2019 Rstats

    30/42

    Line Plots

    plot(t1,D2$DELL,type="l",main='Dell Closing Stock Price',xlab='Time',ylab='Price $'))

  • 7/27/2019 Rstats

    31/42

    Overlaying Plots

    Often we have more than one variablemeasured against the same predictor (X).

    plot(t1,D2$DELL,type="l",main='Dell ClosingStock Price',xlab='Time',ylab='Price $'))

    lines(t1,D2$Intel)

  • 7/27/2019 Rstats

    32/42

    Overlaying Graphs

  • 7/27/2019 Rstats

    33/42

    Overlaying Graphs

    lines(t1,D2$Intel, lty=2 )

  • 7/27/2019 Rstats

    34/42

    Overlaying Graphs

  • 7/27/2019 Rstats

    35/42

    Adding a Legend

    Adding a legend is a bit tricky in R.Syntaxlegend( x, y, names, line types)

    X

    coordinateY

    coordinate

    Names of series incolumnformat

    Correspondingline types

  • 7/27/2019 Rstats

    36/42

    Adding a Legend

    legend(60,45,c('Intel','Dell'),lty=c(1,2))

  • 7/27/2019 Rstats

    37/42

    Paneling Graphics

    Suppose we want more than one graphic ona panel.We can partition the graphics panel to give usa framework in which to panel our plots.par(mfrow = c( nrow, ncol))

    Number of rows

    Number of columns

  • 7/27/2019 Rstats

    38/42

    Paneling Graphics

    Consider the followingpar(mfrow=c(2,2))hist(D$wg, main='Histogram',xlab='Weight Gain',ylab ='Frequency', col=heat.colors(14))boxplot(wg.7a$wg, wg.8a$wg, wg.9a$wg, wg.10a$wg,wg.11a$wg, wg.12p$wg, main='Weight Gain',ylab='Weight Gain (lbs)',xlab='Shift', names =c('7am','8am','9am','10am','11am','12pm'))plot(D$metmin,D$wg,main='Met Minutes vs. WeightGain', xlab='Mets (min)',ylab='Weight Gain(lbs)',pch=2)plot(t1,D2$Intel,type="l",main='Closing StockPrices',xlab='Time',ylab='Price $')lines(t1,D2$DELL,lty=2)

  • 7/27/2019 Rstats

    39/42

    Paneling Graphics

  • 7/27/2019 Rstats

    40/42

    Quality - ExcelClosing Stock Prices

    0

    5

    10

    15

    20

    25

    30

    35

    40

    45

    50

    1 0 / 2 / 2 0 0 0

    1 / 2 / 2 0 0 1

    4 / 2 / 2 0 0 1

    7 / 2 / 2 0 0 1

    1 0 / 2 / 2 0 0 1

    1 / 2 / 2 0 0 2

    4 / 2 / 2 0 0 2

    7 / 2 / 2 0 0 2

    1 0 / 2 / 2 0 0 2

    1 / 2 / 2 0 0 3

    4 / 2 / 2 0 0 3

    7 / 2 / 2 0 0 3

    1 0 / 2 / 2 0 0 3

    1 / 2 / 2 0 0 4

    4 / 2 / 2 0 0 4

    7 / 2 / 2 0 0 4

    1 0 / 2 / 2 0 0 4

    1 / 2 / 2 0 0 5

    4 / 2 / 2 0 0 5

    7 / 2 / 2 0 0 5

    1 0 / 2 / 2 0 0 5

    1 / 2 / 2 0 0 6

    4 / 2 / 2 0 0 6

    7 / 2 / 2 0 0 6

    1 0 / 2 / 2 0 0 6

    1 / 2 / 2 0 0 7

    Time

    P r i c e

    $

    DELL

    Intel

  • 7/27/2019 Rstats

    41/42

  • 7/27/2019 Rstats

    42/42

    Summary

    All of the R code and files can be found at:http://www.cran.r-project.org/

    http://www.cran.r-project.org/http://www.cran.r-project.org/http://www.cran.r-project.org/http://www.cran.r-project.org/