8/8/2019 Basic Descriptive Statistics Using R(2)
1/4
Last Modified January 26, 2007
Basic Descriptive Statistics Using R
In the following handout, words and symbols in bold are R functions, and words and
symbols in italics are entries supplied by the user; underlined words and symbols are
optional entries (all current as of version R-2.4.1). Sample text from an R session is
highlighted with gray shading.
Measures of Central Tendency
mean(object) provides the mean of the object's elements
> quarters = c(5.683, 5.620, 5.551, 5.549, 5.536,
+ 5.552, 5.548, 5.539, 5.554, 5.552, 5.684, 5.632)
> mean(quarters)
[1] 5.583333
median(object) provides the median of the object's elements
> median(quarters)
[1] 5.552
mode: there is no built-in function for finding an object's mode; however, the
command table(object) creates a frequency table for the object's elements, and the
mode is the element in this table with the greatest frequency
> table(quarters)
5.536 5.539 5.548 5.549 5.551 5.552 5.554  5.62 5.632 5.683 5.684
    1     1     1     1     1     2     1     1     1     1     1
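Reading the mode off the table by eye works for a short vector; for longer ones the lookup can be scripted. The following is a minimal sketch, not a built-in R facility: the names counts and modes are chosen here for illustration. If several elements tie for the greatest frequency, all of them are returned.

```r
# Data from the handout
quarters <- c(5.683, 5.620, 5.551, 5.549, 5.536, 5.552,
              5.548, 5.539, 5.554, 5.552, 5.684, 5.632)

# Frequency table of the elements
counts <- table(quarters)

# The table's names are the distinct values (stored as text);
# keep those whose count equals the largest count
modes <- as.numeric(names(counts)[counts == max(counts)])
modes
```

For quarters only 5.552 occurs twice, so that single value is returned.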
midrange: there is no built-in function for reporting the midrange; the command
shown below uses the functions for an object's maximum (max) and minimum (min)
to calculate and print the object's midrange
> midrange = (max(quarters) + min(quarters))/2; midrange
[1] 5.61
Measures of Spread
var(object) provides the sample variance of the object's elements
> var(quarters)
[1] 0.003116606
sd(object) provides the sample standard deviation of the object's elements
> sd(quarters)
[1] 0.05582657
standard error of the mean: there is no built-in function for reporting the standard
error of the mean; the command shown below uses the functions for the object's
standard deviation (sd) and number of elements (length), as well as the mathematical
function for finding a square root (sqrt), to calculate and print the object's standard
error of the mean
> sem = sd(quarters)/sqrt(length(quarters)); sem
[1] 0.01611574
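If the standard error of the mean is needed for several objects, the one-line command can be wrapped in a function. This is a sketch only: the name sem_of and the choice to drop missing values (NA) before counting elements are assumptions made here, not part of R or of the original handout.

```r
# Data from the handout
quarters <- c(5.683, 5.620, 5.551, 5.549, 5.536, 5.552,
              5.548, 5.539, 5.554, 5.552, 5.684, 5.632)

# Hypothetical helper: standard error of the mean.
# Missing values are removed first so that length() counts
# only the elements actually used by sd().
sem_of <- function(x) {
  x <- x[!is.na(x)]
  sd(x) / sqrt(length(x))
}

sem_of(quarters)           # same calculation as the command above
sem_of(c(quarters, NA))    # the NA is ignored, so the result is unchanged
```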
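range: there is no built-in function for reporting the range as a single value (the
built-in function range returns the minimum and maximum elements instead); the
command shown below uses the functions for an object's maximum (max) and
minimum (min) elements to calculate and print the object's range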
> range = (max(quarters) - min(quarters)); range
[1] 0.148
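R does have a function called range, but it returns the smallest and largest elements rather than their difference. A short sketch of how the two fit together is shown below; the variable name width is chosen here for illustration and avoids reusing the name of the built-in function.

```r
# Data from the handout
quarters <- c(5.683, 5.620, 5.551, 5.549, 5.536, 5.552,
              5.548, 5.539, 5.554, 5.552, 5.684, 5.632)

# Built-in range() returns a vector of two values: minimum, then maximum
limits <- range(quarters)
limits

# diff() subtracts the first value from the second,
# giving the same quantity as max(quarters) - min(quarters)
width <- diff(limits)
width
```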
IQR(object) provides the object's interquartile range; note this value may differ
slightly from that provided by other programs because there is no single accepted
definition for FU and FL (the upper and lower fourths)
> IQR(quarters)
[1] 0.07425
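The effect of the different definitions can be seen directly in R. The sketch below compares three of them for quarters: the default used by quantile (type 7), one alternative (type 6, picked here only as an example of another convention), and the hinges reported by fivenum. Which definition another program uses is not something this handout states.

```r
# Data from the handout
quarters <- c(5.683, 5.620, 5.551, 5.549, 5.536, 5.552,
              5.548, 5.539, 5.554, 5.552, 5.684, 5.632)

# Quartiles under R's default definition (type 7), as used by IQR()
q7 <- quantile(quarters, c(0.25, 0.75))

# Quartiles under an alternative definition (type 6)
q6 <- quantile(quarters, c(0.25, 0.75), type = 6)

# Lower and upper hinges: elements 2 and 4 of the five-number summary
hinges <- fivenum(quarters)[c(2, 4)]

# Spread between upper and lower values under each definition
iqr7 <- unname(diff(q7))      # matches IQR(quarters)
iqr6 <- unname(diff(q6))
hinge_spread <- diff(hinges)
c(type7 = iqr7, type6 = iqr6, hinges = hinge_spread)
```

For this data set the three values are 0.07425, 0.08075 and 0.0775, so the differences are small but visible.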
Quantitative and Visual Representations of a Distribution's Shape
skew(object) provides the skewness for an object; this function is not included in R,
but is available from the file skew&kurt.RData, which is available on the course's
I-drive account.
> skew(quarters)
[1] 0.8508155
kurt(object) provides the kurtosis for an object relative to that of a normal
distribution; this function is not included in R, but is available from the file
skew&kurt.RData, which is available on the course's I-drive account.
> kurt(quarters)
[1] -1.075001
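For readers without access to skew&kurt.RData, the sketch below shows one way such functions could be written. The definitions are an assumption, not taken from the course file: the average third (or fourth) power of the deviations from the mean, divided by the sample standard deviation raised to the same power, with 3 subtracted from the kurtosis so that a normal distribution gives 0. The names skew_sketch and kurt_sketch are hypothetical.

```r
# Data from the handout
quarters <- c(5.683, 5.620, 5.551, 5.549, 5.536, 5.552,
              5.548, 5.539, 5.554, 5.552, 5.684, 5.632)

# Assumed definition of skewness: mean cubed deviation
# scaled by the cube of the sample standard deviation
skew_sketch <- function(x) {
  mean((x - mean(x))^3) / sd(x)^3
}

# Assumed definition of kurtosis relative to a normal distribution:
# mean fourth-power deviation scaled by sd^4, minus 3
kurt_sketch <- function(x) {
  mean((x - mean(x))^4) / sd(x)^4 - 3
}

skew_sketch(quarters)
kurt_sketch(quarters)
```

With these definitions quarters gives roughly 0.851 and -1.075, which is in line with the values shown above; other definitions of skewness and kurtosis exist and give somewhat different numbers for small samples.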
hist(object) creates a histogram of the object's elements, with the number of
compartments (bins) chosen by R.
> hist(quarters)
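When R's choice of compartments is not suitable, hist accepts a breaks argument. The sketch below is illustrative: the value 10 and the title text are arbitrary choices made here, and a single number given to breaks is treated by R as a suggestion, so the number of compartments actually drawn may differ.

```r
# Data from the handout
quarters <- c(5.683, 5.620, 5.551, 5.549, 5.536, 5.552,
              5.548, 5.539, 5.554, 5.552, 5.684, 5.632)

# Histogram with a suggested number of compartments and a custom title
hist(quarters, breaks = 10, main = "Histogram of quarters")

# plot = FALSE returns the histogram's details without drawing it,
# which shows the compartment boundaries and counts R settled on
h <- hist(quarters, breaks = 10, plot = FALSE)
h$breaks
h$counts
```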
boxplot(object 1, object 2, names, horizontal = TRUE) creates a boxplot of the
object's elements (for multiple objects, a boxplot is drawn for each); names is a
vector containing the names of the objects, which adds labels on the x-axis when
plotting more than one boxplot. Setting horizontal to TRUE (the default value is
FALSE) creates a horizontal boxplot.
> boxplot(quarters)
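The example above plots a single object. To illustrate the names and horizontal entries, the sketch below splits quarters into two objects; the split into the first six and last six elements, and the labels, are arbitrary choices made here so that no new data is needed.

```r
# Data from the handout
quarters <- c(5.683, 5.620, 5.551, 5.549, 5.536, 5.552,
              5.548, 5.539, 5.554, 5.552, 5.684, 5.632)

# Two objects made from the original vector (illustrative split)
first_six <- quarters[1:6]
last_six  <- quarters[7:12]

# One boxplot per object, labeled through names, drawn horizontally
boxplot(first_six, last_six,
        names = c("first six", "last six"),
        horizontal = TRUE)
```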