    Last Modified January 26, 2007

    Basic Descriptive Statistics Using R

    In the following handout, words and symbols in bold are R functions, and words and
    symbols in italics are entries supplied by the user; underlined words and symbols are
    optional entries (all current as of version R-2.4.1). Sample text from an R session is
    highlighted with gray shading.

    Measures of Central Tendency

    mean(object) provides the mean of the object's elements

    > quarters = c(5.683, 5.620, 5.551, 5.549, 5.536,
    + 5.552, 5.548, 5.539, 5.554, 5.552, 5.684, 5.632)

    > mean(quarters)

    [1] 5.583333

    median(object) provides the median of the object's elements

    > median(quarters)

    [1] 5.552

    mode: there is no built-in function for finding an object's mode; however, the
    command table(object) creates a frequency table for the object's elements, and the
    mode is the element in this table with the greatest frequency

    > table(quarters)

    5.536 5.539 5.548 5.549 5.551 5.552 5.554  5.62 5.632 5.683 5.684
        1     1     1     1     1     2     1     1     1     1     1
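
Reading the mode off the table by eye works for a dozen values but not for a long vector. A minimal sketch of picking it out programmatically follows; the names freq and modes are illustrative and do not come from the handout.

```r
# quarters data from the handout
quarters <- c(5.683, 5.620, 5.551, 5.549, 5.536, 5.552,
              5.548, 5.539, 5.554, 5.552, 5.684, 5.632)

# Frequency table of the distinct values
freq <- table(quarters)

# Keep every value whose count equals the largest count
# (more than one value is returned if there is a tie)
modes <- as.numeric(names(freq)[freq == max(freq)])
modes
```

If every value occurs only once, this returns all of them, which is a hint that the data have no meaningful mode.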

    midrange: there is no built-in function for reporting the midrange; the command
    shown below uses the functions for an object's maximum (max) and minimum (min)
    to calculate and print the object's midrange

    > midrange = (max(quarters) + min(quarters))/2; midrange

    [1] 5.61
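
An equivalent way to get the midrange, shown here as a sketch rather than as the handout's method, uses R's range function, which returns the minimum and maximum together as a two-element vector.

```r
# quarters data from the handout
quarters <- c(5.683, 5.620, 5.551, 5.549, 5.536, 5.552,
              5.548, 5.539, 5.554, 5.552, 5.684, 5.632)

# range() returns c(minimum, maximum); the mean of those two is the midrange
midrange <- mean(range(quarters))
midrange
```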


    Measures of Spread

    var(object) provides the sample variance of the object's elements

    > var(quarters)

    [1] 0.003116606

    sd(object) provides the sample standard deviation of the object's elements

    > sd(quarters)

    [1] 0.05582657
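
Both functions report sample statistics: the sum of squared deviations from the mean is divided by n - 1, not n, and the standard deviation is the square root of that variance. The sketch below checks this against var and sd; the names var_by_hand and sd_by_hand are illustrative only.

```r
# quarters data from the handout
quarters <- c(5.683, 5.620, 5.551, 5.549, 5.536, 5.552,
              5.548, 5.539, 5.554, 5.552, 5.684, 5.632)

n <- length(quarters)

# Sample variance: squared deviations from the mean, divided by n - 1
var_by_hand <- sum((quarters - mean(quarters))^2) / (n - 1)

# Sample standard deviation: square root of the sample variance
sd_by_hand <- sqrt(var_by_hand)

# Both comparisons should report TRUE
all.equal(var_by_hand, var(quarters))
all.equal(sd_by_hand, sd(quarters))
```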

    standard error of the mean: there is no built-in function for reporting the standard
    error of the mean; the command shown below uses the functions for the object's
    standard deviation (sd) and number of elements (length), as well as the mathematical
    function for finding a square root (sqrt), to calculate and print the object's standard
    error of the mean

    > sem = sd(quarters)/sqrt(length(quarters)); sem

    [1] 0.01611574
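
When the standard error is needed for more than one object, the same calculation can be wrapped in a small function. This is a sketch only; std_error is a hypothetical name, not a function supplied by R or by the handout.

```r
# quarters data from the handout
quarters <- c(5.683, 5.620, 5.551, 5.549, 5.536, 5.552,
              5.548, 5.539, 5.554, 5.552, 5.684, 5.632)

# Hypothetical helper: sample standard deviation divided by sqrt(n)
std_error <- function(x) {
  sd(x) / sqrt(length(x))
}

sem <- std_error(quarters)
sem
```

The sketch assumes the object has no missing values; with NA entries, sd returns NA and length still counts them.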

    range: there is no built-in function that reports the range as a single value (R's
    range function returns the minimum and maximum elements instead); the command shown
    below uses the functions for an object's maximum (max) and minimum (min)
    elements to calculate and print the object's range

    > range = (max(quarters) - min(quarters)); range

    [1] 0.148
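
A sketch of an alternative that avoids reusing the name of R's own range function: diff takes the difference between successive elements, so applying it to the two values returned by range gives the maximum minus the minimum. The name spread is illustrative.

```r
# quarters data from the handout
quarters <- c(5.683, 5.620, 5.551, 5.549, 5.536, 5.552,
              5.548, 5.539, 5.554, 5.552, 5.684, 5.632)

# range() gives c(minimum, maximum); diff() subtracts the first from the second
spread <- diff(range(quarters))
spread
```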

    IQR(object) provides the object's interquartile range; note that this value may differ
    slightly from that provided by other programs because there is no single accepted
    definition for the upper and lower fourths (FU and FL)

    > IQR(quarters)

    [1] 0.07425
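
The effect of the definition can be seen within R itself. The sketch below compares the default IQR with two alternatives: the spread between Tukey's hinges from fivenum, and the spread between quartiles computed with a different quantile type. Treating the hinges as the fourths is an assumption made here for illustration, and the variable names are not from the handout.

```r
# quarters data from the handout
quarters <- c(5.683, 5.620, 5.551, 5.549, 5.536, 5.552,
              5.548, 5.539, 5.554, 5.552, 5.684, 5.632)

# Default definition used by IQR() (quantile type 7)
iqr_default <- IQR(quarters)

# Tukey's hinges: elements 2 and 4 of the five-number summary
hinges <- fivenum(quarters)[c(2, 4)]
hinge_spread <- diff(hinges)

# Quartiles under another of R's quantile definitions (type 6)
iqr_type6 <- unname(diff(quantile(quarters, c(0.25, 0.75), type = 6)))

c(default = iqr_default, hinges = hinge_spread, type6 = iqr_type6)
```

The three results differ in the second or third significant figure for these twelve values, which is the kind of small disagreement to expect between programs.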


    Quantitative and Visual Representations of a Distribution's Shape

    skew(object) provides the skewness for an object; this function is not included in R,
    but is available from the file skew&kurt.RData, which is available on the course's
    I-drive account.

    > skew(quarters)

    [1] 0.8508155

    kurt(object) provides the kurtosis for an object relative to that of a normal
    distribution; this function is not included in R, but is available from the file
    skew&kurt.RData, which is available on the course's I-drive account.

    > kurt(quarters)

    [1] -1.075001
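
The contents of skew&kurt.RData are not reproduced in this handout, so the definitions behind skew and kurt are not shown. The sketch below assumes one common moment-based form: the summed third (or fourth) powers of the deviations from the mean, divided by n times the matching power of the sample standard deviation, with 3 subtracted from the kurtosis so that a normal distribution scores zero. The names skew_sketch and kurt_sketch are hypothetical stand-ins, not the course's code, although for the quarters data they should give values close to those printed above.

```r
# quarters data from the handout
quarters <- c(5.683, 5.620, 5.551, 5.549, 5.536, 5.552,
              5.548, 5.539, 5.554, 5.552, 5.684, 5.632)

# Assumed definition of skewness (not taken from skew&kurt.RData)
skew_sketch <- function(x) {
  n <- length(x)
  sum((x - mean(x))^3) / (n * sd(x)^3)
}

# Assumed definition of kurtosis relative to a normal distribution
# (not taken from skew&kurt.RData)
kurt_sketch <- function(x) {
  n <- length(x)
  sum((x - mean(x))^4) / (n * sd(x)^4) - 3
}

skew_sketch(quarters)
kurt_sketch(quarters)
```

Other programs may use slightly different formulas (for example, dividing by n - 1 or applying a small-sample correction), so results from other software need not match.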

    hist(object) creates a histogram of the object's elements with the number of
    bins chosen by R.

    > hist(quarters)
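
When R's choice of bins is not suitable, the breaks argument of hist can override it. The sketch below shows both forms: a single number, which R treats as a suggestion, and a vector of break points, which R uses as given. The particular break points are chosen here for illustration and are not part of the handout.

```r
# quarters data from the handout
quarters <- c(5.683, 5.620, 5.551, 5.549, 5.536, 5.552,
              5.548, 5.539, 5.554, 5.552, 5.684, 5.632)

# Default: R chooses the bins
h_default <- hist(quarters)

# A single number is only a suggestion; R may adjust it to get round break points
h_more <- hist(quarters, breaks = 10)

# A vector of break points is used exactly; it must span all of the data
h_exact <- hist(quarters, breaks = seq(5.50, 5.70, by = 0.05))

# hist() also returns the break points and bin counts it used
h_exact$breaks
h_exact$counts
```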


    boxplot(object1, object2, names = names, horizontal = TRUE) creates a boxplot of the
    object's elements (for multiple objects, a boxplot is drawn for each); names is a
    vector containing the names of the objects, which adds labels on the x-axis when
    plotting more than one boxplot. Setting horizontal to TRUE (the default value is
    FALSE) creates a horizontal boxplot.

    > boxplot(quarters)
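
A sketch of the multiple-object form follows. The handout supplies only one object, so the quarters data are split into two halves purely for illustration; the object names first_six and last_six and the labels are made up for this example.

```r
# quarters data from the handout
quarters <- c(5.683, 5.620, 5.551, 5.549, 5.536, 5.552,
              5.548, 5.539, 5.554, 5.552, 5.684, 5.632)

# Illustrative split into two objects (not a grouping used in the handout)
first_six <- quarters[1:6]
last_six <- quarters[7:12]

# One box per object; names must be passed by name because it follows
# the list of objects in boxplot's argument list
bp <- boxplot(first_six, last_six,
              names = c("first six", "last six"),
              horizontal = TRUE)

# boxplot() also returns the statistics it plotted, one column per object
bp$names
bp$stats
```

With horizontal set to TRUE the boxes lie on their sides, so the group labels appear along the vertical axis instead of the x-axis.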
