rstats
TRANSCRIPT
-
7/27/2019 Rstats
1/42
Introduction To R
-
What is R?
The R statistical programming language is a free, open source package based on the S language developed by Bell Labs.
The language is very powerful for writing programs.
Many statistical functions are already built in.
Contributed packages expand the functionality to cutting-edge research.
Since it is a programming language, generating computer code to complete tasks is required.
-
Getting Started
Where to get R?
Go to www.r-project.org
Downloads: CRAN
Set your Mirror: Any one in the USA is fine.
Select Windows 95 or later.
Select base.
Select R-2.4.1-win32.exe (http://www.biometrics.mtu.edu/CRAN/bin/windows/base/R-2.4.1-win32.exe)
The others are for developers who wish to change the source code.
UNT course website for R: http://www.unt.edu/rss/SPLUSclasslinks.html
-
Getting Started
The R GUI?
-
Getting Started
Opening a script.
This gives you a script window.
-
Getting Started
Basic assignment and operations.
Arithmetic Operations:
+, -, *, /, ^ are the standard arithmetic operators.
Matrix Arithmetic:
* is element-wise multiplication
%*% is matrix multiplication
Assignment:
To assign a value to a variable use <-
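The operators on this slide can be tried out with a short sketch like the one below. The variable names and values are made up for illustration; they are not from the course data.

```r
# Assignment uses <- (written with no space between < and -)
x <- 5
y <- 2

# Standard arithmetic operators
total <- x + y      # addition
power <- x ^ y      # exponentiation: 5 squared

# Two small 2 x 2 matrices (matrix() fills column by column)
A <- matrix(1:4, nrow = 2)
B <- matrix(5:8, nrow = 2)

elementwise <- A * B     # multiplies matching entries
matprod     <- A %*% B   # true matrix product (rows times columns)

print(elementwise)
print(matprod)
```

Note that `*` requires the two matrices to have the same dimensions, while `%*%` requires the number of columns of the first to equal the number of rows of the second.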
-
Getting Started
How to use help in R?
R has a very good help system built in.
If you know which function you want help with, simply use ?_______ with the function name in the blank. Ex: ?hist
If you don't know which function to use, then use help.search("_______"). Ex: help.search("histogram")
-
Importing Data
How do we get data into R?
Remember, we have no point and click.
First make sure your data is in an easy-to-read format such as CSV (Comma Separated Values).
Use code:
D
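Reading a CSV file into a data frame named D might look like the sketch below. The file contents are invented, and the column name sex is an assumption (only wg and metmin appear in the slides); a temporary file is written first so the example is self-contained.

```r
# Hypothetical data written to a temporary CSV file (stand-in for the real file)
csv_path <- tempfile(fileext = ".csv")
writeLines(c("wg,metmin,sex",
             "5,120,M",
             "12,80,F",
             "8,200,M"), csv_path)

# read.csv() returns a data frame; header = TRUE uses the first row as column names
D <- read.csv(csv_path, header = TRUE)

str(D)    # column names and types
head(D)   # first few rows
```

With a real file, `csv_path` would simply be replaced by the path to that file.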
-
Working with data.
Subsetting data.
Use a logical operator to do this.
==, >,
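As a rough illustration of subsetting with logical operators, the sketch below filters a small made-up data frame. The column sex and all values are hypothetical.

```r
# Hypothetical data frame standing in for the course data
D <- data.frame(wg  = c(5, 12, 8, 3),
                sex = c("M", "F", "M", "F"))

# Keep the rows where a condition is TRUE; the empty slot after the comma keeps all columns
males <- D[D$sex == "M", ]
big   <- D[D$wg > 6, ]

# Conditions can be combined with & (and) or | (or)
big.f <- D[D$sex == "F" & D$wg > 6, ]

# subset() is an equivalent, more readable form
big2 <- subset(D, wg > 6)
```

The logical test produces a vector of TRUE/FALSE values, one per row, and only the TRUE rows are kept.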
-
Basic Graphics
Histogram
hist(D$wg)
-
Basic Graphics
Add a title
The main argument will give the plot an overall heading.
hist(D$wg, main='Weight Gain')
-
Basic Graphics
Adding axis labels
Use xlab and ylab to label the X and Y axes, respectively.
hist(D$wg, main='Weight Gain', xlab='Weight Gain', ylab='Frequency')
-
Basic Graphics
Changing colors
Use the col argument.
?colors will give you help on the colors.
Common colors may simply be put in using the name.
hist(D$wg, main='Weight Gain', xlab='Weight Gain', ylab='Frequency', col='blue')
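The histogram options built up over the last few slides can be run end to end with a sketch like this one. The data are simulated here as a stand-in for D$wg, so the shape of the plot will not match the slides.

```r
# Simulated weight-gain values (hypothetical; not the course data)
set.seed(1)
D <- data.frame(wg = rnorm(100, mean = 10, sd = 4))

# Title, axis labels and fill color in one call
h <- hist(D$wg,
          main = 'Weight Gain',
          xlab = 'Weight Gain',
          ylab = 'Frequency',
          col  = 'blue')

# hist() also returns the bin breaks and counts it used
h$breaks
h$counts

# colors() lists every color name that col will accept
head(colors())
```

Color names must be quoted strings; an unquoted blue would be looked up as a variable.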
-
Basic Graphics Colors
-
Boxplots
Change it!
boxplot(D$wg, main='Weight Gain', ylab='Weight Gain (lbs)')
-
Boxplots - Groupings
What if we want several box plots side by side to be able to compare them?
First subset the data into separate variables.
wg.m
-
Boxplots - Groupings
-
Boxplots - Groupings
boxplot(wg.m$wg, wg.f$wg, main='Weight Gain (lbs)',ylab='Weight Gain', names = c('Male','Female'))
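A self-contained sketch of the grouped boxplot, including the subsetting step, is given below. The data frame and its sex column are assumptions made for illustration; the names wg.m and wg.f follow the slides.

```r
# Hypothetical data: weight gain with a grouping column
D <- data.frame(wg  = c(4, 9, 12, 7, 15, 3, 8, 11, 6, 10),
                sex = rep(c("M", "F"), times = 5))

# Subset into one data frame per group
wg.m <- D[D$sex == "M", ]
wg.f <- D[D$sex == "F", ]

# One box per vector, labelled through names
b <- boxplot(wg.m$wg, wg.f$wg,
             main  = 'Weight Gain (lbs)',
             ylab  = 'Weight Gain',
             names = c('Male', 'Female'))

# The formula interface draws the same comparison without manual subsetting
boxplot(wg ~ sex, data = D, main = 'Weight Gain (lbs)', ylab = 'Weight Gain')
```

The formula form orders the boxes by the levels of the grouping variable, so the labels come from the data rather than from names.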
-
Boxplots - Groupings
Do it by shift
wg.7a
-
Boxplots - Groupings
-
Scatter Plots
Suppose we have two variables and we wish to see the relationship between them.
A scatter plot works very well.
R code:
plot(x,y)
Example:
plot(D$metmin,D$wg)
-
Scatterplots
-
Scatterplots
plot(D$metmin,D$wg,main='Met Minutes vs. Weight Gain',xlab='Mets (min)',ylab='Weight Gain (lbs)')
-
Scatterplots
plot(D$metmin,D$wg,main='Met Minutes vs. Weight Gain',
xlab='Mets (min)',ylab='Weight Gain (lbs)', pch=2 )
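The scatter plot calls above can be reproduced on simulated data as in the sketch below. The values are invented; only the column names metmin and wg come from the slides.

```r
# Simulated data (hypothetical): weight gain loosely decreasing with met minutes
set.seed(3)
metmin <- runif(50, min = 0, max = 300)
D <- data.frame(metmin = metmin,
                wg     = 15 - 0.03 * metmin + rnorm(50, sd = 2))

# pch selects the plotting symbol: 1 is the default open circle, 2 an open triangle
plot(D$metmin, D$wg,
     main = 'Met Minutes vs. Weight Gain',
     xlab = 'Mets (min)',
     ylab = 'Weight Gain (lbs)',
     pch  = 2)

# A filled circle (pch = 16) combined with a color
plot(D$metmin, D$wg, pch = 16, col = 'blue',
     xlab = 'Mets (min)', ylab = 'Weight Gain (lbs)')
```

Running ?points lists the available pch symbols.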
-
Line Plots
Often data comes through time.
Consider Dell stock
D2
-
Line Plots
-
Line Plots
plot(t1,D2$DELL,type="l",main='Dell Closing Stock Price',xlab='Time',ylab='Price $')
-
Overlaying Plots
Often we have more than one variable measured against the same predictor (X).
plot(t1,D2$DELL,type="l",main='Dell Closing Stock Price',xlab='Time',ylab='Price $')
lines(t1,D2$Intel)
-
Overlaying Graphs
-
Overlaying Graphs
lines(t1,D2$Intel, lty=2 )
-
Overlaying Graphs
-
Adding a Legend
Adding a legend is a bit tricky in R.
Syntax: legend(x, y, names, line types)
x: X coordinate
y: Y coordinate
names: names of the series, in column format
line types: corresponding line types
-
Adding a Legend
legend(60,45,c('Intel','Dell'),lty=c(1,2))
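Putting the line plot, the overlay and the legend together might look like the following sketch. The price series are simulated and t1 is assumed to be a simple time index, since its definition does not survive in the slides. The sketch also uses the keyword position "topright" instead of explicit coordinates, which is a substitution made so the legend lands inside the plot whatever the data range.

```r
# Simulated closing prices (hypothetical; not real stock data)
set.seed(2)
t1 <- 1:60
D2 <- data.frame(DELL  = 30 + cumsum(rnorm(60)),
                 Intel = 25 + cumsum(rnorm(60)))

# Make the y axis wide enough for both series before drawing the first one
yr <- range(D2$DELL, D2$Intel)

plot(t1, D2$DELL, type = "l", ylim = yr,
     main = 'Closing Stock Prices', xlab = 'Time', ylab = 'Price $')
lines(t1, D2$Intel, lty = 2)   # second series, dashed

# Names and line types are listed in the same order the series were drawn
lg <- legend("topright", legend = c('Dell', 'Intel'), lty = c(1, 2))
```

Without ylim, the axis is sized for the first series only, and parts of the overlaid line can fall outside the plotting region.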
-
Paneling Graphics
Suppose we want more than one graphic on a panel.
We can partition the graphics panel to give us a framework in which to panel our plots.
par(mfrow = c(nrow, ncol))
nrow: number of rows
ncol: number of columns
-
Paneling Graphics
Consider the following:
par(mfrow=c(2,2))
hist(D$wg, main='Histogram', xlab='Weight Gain', ylab='Frequency', col=heat.colors(14))
boxplot(wg.7a$wg, wg.8a$wg, wg.9a$wg, wg.10a$wg, wg.11a$wg, wg.12p$wg, main='Weight Gain', ylab='Weight Gain (lbs)', xlab='Shift', names=c('7am','8am','9am','10am','11am','12pm'))
plot(D$metmin, D$wg, main='Met Minutes vs. Weight Gain', xlab='Mets (min)', ylab='Weight Gain (lbs)', pch=2)
plot(t1, D2$Intel, type="l", main='Closing Stock Prices', xlab='Time', ylab='Price $')
lines(t1, D2$DELL, lty=2)
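A smaller, self-contained version of the paneling idea is sketched below on simulated data. It also saves and restores the previous graphics settings, which the slide does not do; that step is an addition here so that later plots go back to one per page.

```r
# Simulated data (hypothetical)
set.seed(4)
wg     <- rnorm(80, mean = 10, sd = 4)
metmin <- runif(80, min = 0, max = 300)

# par() returns the previous settings, so they can be restored afterwards
old <- par(mfrow = c(2, 2))   # 2 rows x 2 columns, filled row by row
during <- par("mfrow")

hist(wg, main = 'Histogram', xlab = 'Weight Gain', col = heat.colors(14))
boxplot(wg, main = 'Weight Gain', ylab = 'Weight Gain (lbs)')
plot(metmin, wg, main = 'Met Minutes vs. Weight Gain',
     xlab = 'Mets (min)', ylab = 'Weight Gain (lbs)', pch = 2)
plot(seq_along(wg), wg, type = "l", main = 'Line Plot',
     xlab = 'Index', ylab = 'Weight Gain')

par(old)   # back to the earlier layout
after <- par("mfrow")
```

Each high-level plotting call (hist, boxplot, plot) moves to the next cell of the grid, while low-level calls such as lines and legend draw in the current cell.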
-
Paneling Graphics
-
Quality - Excel
[Figure: Excel line chart "Closing Stock Prices". X axis: Time, quarterly dates from 10/2/2000 to 1/2/2007. Y axis: Price $, 0 to 50. Series: DELL, Intel.]
-
Summary
All of the R code and files can be found at: http://www.cran.r-project.org/