introduction to r - nus.edu.sgnus.edu.sg/alset/wp-content/uploads/2020/07/...contents...

Introduction to R


Post on 29-Jul-2020


Page 1

Introduction to R

Page 2

Contents – Introduction to R

• Introduction to R
  • Display datasets
  • Display First and Last 6 Observations
  • Display ADL Table Names
  • Read in Graduate Employment Survey Questionnaire Table
  • Arithmetic Operators
  • Function
    • Calculate Winsorized Mean
    • Calculate OLS Estimates and SEs
• Descriptive Statistics
  • Measures of Central Tendency
    • Mean
    • Median
    • Trimmed Mean
  • Measure of Dispersion
    • Standard Deviation
    • Median Absolute Deviation
• Graphics in R
  • Graphic Settings
  • Scatter Plot + Line Plot
  • Line + Bar Chart
  • Pie Chart
  • 3 Dimensional Contingency Table

Page 3

Display a list of datasets available in JNB
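The slide's own code did not survive in this transcript. As a minimal sketch, assuming the notebook runs a standard R kernel, base R's data() lists the datasets bundled with installed packages; the variable name available is only illustrative.

```r
# List the datasets shipped with the currently attached packages
data()

# Restrict the listing to one package and keep it as a matrix
available <- data(package = "datasets")$results
head(available[, c("Item", "Title")])
```

Datasets stored in the notebook environment itself (rather than in R packages) would need whatever listing function that environment provides, which is not shown here.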

Page 4

Display First and Last 6 Observations

head(): first few observations of the dataset
tail(): last few observations of the dataset
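A short illustration of head() and tail(), using the built-in mtcars data frame as a stand-in because the dataset used on the slide is not shown in this transcript. Both functions return 6 rows unless n is given.

```r
# mtcars is a data frame that ships with base R (stand-in dataset)
first_six <- head(mtcars)        # first 6 rows (default n = 6)
last_six  <- tail(mtcars)        # last 6 rows
first_three <- head(mtcars, n = 3)

first_six
last_six
```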

Page 6

Display ADL Table Names

Page 7

Read in Graduate Employment Survey Questionnaire

ges_2016_questions

Page 8

Arithmetic Operators
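The operator table from this slide is not in the transcript. The sketch below lists R's standard arithmetic operators; the variable names a and b are only illustrative.

```r
a <- 7
b <- 2

a + b     # addition: 9
a - b     # subtraction: 5
a * b     # multiplication: 14
a / b     # division: 3.5
a ^ b     # exponentiation: 49
a %% b    # modulus (remainder): 1
a %/% b   # integer division: 3

# Operators are vectorised: each element is doubled
doubled <- c(1, 2, 3) * 2
doubled
```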

Page 9

Function: Calculate Winsorized Mean
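The function body from this slide is missing from the transcript, so the following is only a sketch of one common definition: the k smallest values are replaced by the (k+1)-th smallest, the k largest by the (n-k)-th, and the result is averaged. The name winsorized_mean and the trim argument are hypothetical, not the slide's.

```r
# Hypothetical helper: Winsorized mean with a proportion `trim` replaced in each tail
winsorized_mean <- function(x, trim = 0.1) {
  stopifnot(trim >= 0, trim < 0.5)
  x <- sort(x[!is.na(x)])          # drop missing values, sort ascending
  n <- length(x)
  k <- floor(n * trim)             # number of values to replace in each tail
  if (k > 0) {
    x[1:k] <- x[k + 1]             # pull the lowest k values up
    x[(n - k + 1):n] <- x[n - k]   # pull the highest k values down
  }
  mean(x)
}

x <- c(1, 2, 3, 5, 6, 9, 10, 20, 34)
winsorized_mean(x, trim = 0.2)     # k = 1: 1 becomes 2 and 34 becomes 20
```

Unlike a trimmed mean, the extreme values are kept in the count; they are only capped.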

Page 10

Function: Calculate OLS Estimates and SEs
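The slide's function is not reproduced in the transcript. As a sketch under that caveat, OLS estimates and standard errors can be computed from the normal equations; ols_fit and the toy data below are hypothetical, and the result is compared with lm() as a check.

```r
# Hypothetical helper: OLS coefficients and standard errors via the normal equations
ols_fit <- function(X, y) {
  X <- cbind(Intercept = 1, as.matrix(X))        # add an intercept column
  XtX_inv <- solve(crossprod(X))                 # (X'X)^-1
  beta <- XtX_inv %*% crossprod(X, y)            # (X'X)^-1 X'y
  resid <- y - X %*% beta
  sigma2 <- sum(resid^2) / (nrow(X) - ncol(X))   # residual variance
  se <- sqrt(diag(sigma2 * XtX_inv))             # standard errors
  cbind(Estimate = drop(beta), SE = se)
}

# Toy data, made up for illustration only
x <- c(1, 2, 3, 4, 5, 6, 7, 8)
y <- c(2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1)

result <- ols_fit(x, y)
result
coef(summary(lm(y ~ x)))[, 1:2]   # should agree with `result`
```

In practice lm() is preferable because it uses a QR decomposition, which is numerically more stable than inverting X'X.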

Page 11

Descriptive Statistics

Page 12

Mean

$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{x_1 + x_2 + \dots + x_n}{n}$$

$$\bar{x} = \frac{1 + 2 + 3 + 5 + 6 + 9 + 10 + 20 + 34}{9} = 10$$

Susceptible to the influence of outliers
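The slide's nine values can be checked in R, and inflating the largest value shows how strongly a single outlier moves the mean. The replacement value 340 is made up for illustration.

```r
x <- c(1, 2, 3, 5, 6, 9, 10, 20, 34)

mean(x)                 # 10
sum(x) / length(x)      # same value, computed from the definition

# Replace the largest value with a made-up outlier
x_outlier <- replace(x, length(x), 340)
mean(x_outlier)         # 44
```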

Page 13

Median

1 2 3 4 5 6 7 8 9 10

1 1 2 2 4 5 6 7 8 100000
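Treating the two rows above as two data sets (an assumption, since the slide's layout is lost), the median of the second barely registers the extreme value 100000 while the mean is dominated by it.

```r
x_plain   <- 1:10
x_outlier <- c(1, 1, 2, 2, 4, 5, 6, 7, 8, 100000)

median(x_plain)     # 5.5: average of the 5th and 6th values
median(x_outlier)   # 4.5: unaffected by the size of the largest value
mean(x_outlier)     # 10003.6: pulled far upwards by the outlier
```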

Page 14

Trimmed Mean

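$$\bar{x}_{\text{trimmed}} = \frac{\sum_{i=k+1}^{n-k} x_i}{n - 2k}$$

Here the x_i are sorted in ascending order and k observations are dropped from each end.

Trimmed 5%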
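A minimal sketch of the trimmed mean, reusing the nine values from the Mean slide (an assumption; this slide's own data are not in the transcript). R's mean() takes a trim argument giving the fraction removed from each end, and drops floor(n * trim) observations per tail.

```r
x <- sort(c(1, 2, 3, 5, 6, 9, 10, 20, 34))
n <- length(x)
k <- 1                                    # observations dropped from each end

# Direct use of the formula: average of x[k+1], ..., x[n-k]
manual <- sum(x[(k + 1):(n - k)]) / (n - 2 * k)

# Built-in: trim = 0.2 gives floor(9 * 0.2) = 1 value dropped per tail
builtin <- mean(x, trim = 0.2)

manual
builtin
```

Both values equal 55/7, about 7.86, compared with an untrimmed mean of 10.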

Page 15

Standard Deviation

i        x1    x1 − x̄      (x1 − x̄)²    x2    x2 − x̄       (x2 − x̄)²
1        7     7−7=0       0            12    12−7=5       25
2        8     8−7=1       1            2     2−7=−5       25
3        6     6−7=−1      1            0     0−7=−7       49
4        7     7−7=0       0            14    14−7=7       49
5        7     7−7=0       0            10    10−7=3       9
6        6     6−7=−1      1            9     9−7=2        4
7        8     8−7=1       1            5     5−7=−2       4
8        7     7−7=0       0            4     4−7=−3       9
Total                      4                               174

$$s_1 = \sqrt{\frac{\sum_{i=1}^{n}(x_{1i} - \bar{x}_1)^2}{n - 1}} = \sqrt{\frac{4}{7}} = 0.755929$$

$$s_2 = \sqrt{\frac{\sum_{i=1}^{n}(x_{2i} - \bar{x}_2)^2}{n - 1}} = \sqrt{\frac{174}{7}} = 4.985694$$
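The two columns of the worked example both have mean 7 but very different spread. A short check with sd(), which uses the n − 1 denominator:

```r
x1 <- c(7, 8, 6, 7, 7, 6, 8, 7)
x2 <- c(12, 2, 0, 14, 10, 9, 5, 4)

mean(x1); mean(x2)     # both 7

sd(x1)                 # sqrt(4 / 7), about 0.755929
sd(x2)                 # sqrt(174 / 7), about 4.985694

# The same result from the definition
s2_manual <- sqrt(sum((x2 - mean(x2))^2) / (length(x2) - 1))
s2_manual
```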

Page 16

Median Absolute Deviation

$$\mathrm{MAD} = \operatorname{median}_i \left( \left| x_i - \operatorname{median}(x) \right| \right)$$

x              x − Median        |x − Median|
2              2 − 12 = −10      10
6              6 − 12 = −6       6
6              6 − 12 = −6       6
12 (Median)    12 − 12 = 0       0
17             17 − 12 = 5       5
25             25 − 12 = 13      13
32             32 − 12 = 20      20
Median                           6

Sorted absolute deviations: 0 5 6 6 10 13 20

MAD = 6
σ ≈ 1.4826 × MAD = 1.4826 × 6 = 8.8956
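The same calculation in R. Note that mad() multiplies by 1.4826 by default, so it returns the estimate of σ rather than the raw MAD; constant = 1 gives the raw value.

```r
x <- c(2, 6, 6, 12, 17, 25, 32)

raw_mad <- median(abs(x - median(x)))   # 6, straight from the definition
raw_mad

mad(x, constant = 1)                    # 6
mad(x)                                  # 1.4826 * 6 = 8.8956
```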

Page 17

R Graphics

Page 22

High-Level Plot Function

Function                         Description
plot()                           Scatterplot
hist()                           Histogram
boxplot()                        Boxplot
qqplot(), qqnorm(), qqline()     Quantile plots
interaction.plot()               Interaction plot
sunflowerplot()                  Sunflower scatterplot
pairs()                          Scatter plot matrix
symbols()                        Draw symbols on a plot
dotchart()                       Dot chart
barplot()                        Bar chart
pie()                            Pie chart
curve()                          Draw a curve from a given function
image()                          Create a grid of coloured rectangles with colours based on the values of a third variable
contour(), filled.contour()      Contour plot
persp()                          Plot 3-D surface

Page 23

Low-Level Plot Function

Function                 Description
points()                 Add points to a figure
lines()                  Add lines to a figure
text()                   Insert text in the plot region
mtext()                  Insert text in the figure and outer margins
title()                  Add figure title or outer title
legend()                 Insert legend
axis(), axis.Date()      Customize axes
abline()                 Add horizontal and vertical lines or a single line
box()                    Draw a box around the current plot
rug()                    Add a 1-D plot of the data to the figure
polygon()                Draw a polygon
rect()                   Draw a rectangle
arrows()                 Draw arrows
segments()               Draw line segments
trans3d()                Add 2-D components to a 3-D plot
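A brief sketch of how the two kinds of function work together: a high-level call starts a new figure, and low-level calls then add to it. The built-in cars dataset is used as a stand-in, and the labels are only illustrative.

```r
fit <- lm(dist ~ speed, data = cars)      # cars ships with base R

# High-level: starts a new figure
plot(cars$speed, cars$dist, xlab = "Speed", ylab = "Stopping distance")

# Low-level: each call adds to the existing figure
abline(fit, col = "red", lwd = 2)         # fitted regression line
abline(h = mean(cars$dist), lty = 2)      # horizontal reference line
rug(cars$speed)                           # 1-D plot of the x values
title(main = "Stopping distance vs speed")
legend("topleft", legend = c("OLS fit", "Mean distance"),
       col = c("red", "black"), lty = c(1, 2), lwd = c(2, 1))
```

Calling a low-level function before any high-level one fails, because there is no figure to add to.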

Page 24

Singapore Gini Coefficient from Year 2002 to 2012

Year    Gini Coefficient
2002    0.454
2003    0.457
2004    0.460
2005    0.465
2006    0.470
2007    0.482
2008    0.474
2009    0.471
2010    0.472
2011    0.473
2012    0.478

Source: Key Household Income Trends, 2012

Page 25

# Gini Coefficient of Singapore From 2002 to 2012

# Scatter Plot

Year <- c(2002:2012)

Gini <- c(0.454,0.457,0.460,0.465,0.470,0.482,0.474,0.471,0.472,0.473,0.478)

plot(Year,Gini)

# General form: plot(x, y)

Page 26

plot(Year,Gini,main="Gini Coefficient\nBased on Household Income from Work per Household Member",

sub="Source: Key Household Income Trends, 2012")

# main: title shown above the plot

# sub: subtitle shown below the x-axis label

Page 27

plot(Year,Gini,main="Gini Coefficient\nBased on Household Income from Work per Household Member",

type = "b",pch=20,

sub="Source: Key Household Income Trends, 2012")

# type="b": draw both points and lines

# pch=20: small solid circle as the point symbol

Page 28

plot(Year,Gini,main="Gini Coefficient\nBased on Household Income from Work per Household Member",

type = "b",pch=15,

col="red",lwd=2,

ylab="Gini Coefficient",

ylim=c(0.44,0.49),

sub="Source: Key Household Income Trends, 2012")

# pch=15: filled square as the point symbol

Page 29

plot(Year,Gini,main="Gini Coefficient",

type = "b",pch=15,

col="red",lwd=2,

cex.axis=1.2,cex.lab=1.5,cex.main=1.6,

ylab="Gini Coefficient",

ylim=c(0.44,0.49))

text(2008,0.44,"Based on Household Income from Work per Household Member",cex=0.7)

mtext("Source: Key Household Income Trends, 2012",side=1,line=4,at=2005)

# text() writes inside the plot region; mtext() writes in the margins

Page 30

par(bg = "lightblue")

plot(Year,Gini,main="Gini Coefficient",

type = "b",pch=15,

col="red",lwd=2,

cex.axis=1.2,cex.lab=1.5,cex.main=1.6,

ylab="Gini Coefficient",

ylim=c(0.44,0.49))

text(2008,0.44,"Based on Household Income from Work per Household Member",cex=0.7)

mtext("Source: Key Household Income Trends, 2012",side=1,line=4,at=2005)

Page 31

Number of Examinees and Graduates in China between 1996 and 2005

Year    Number of Examinees (in 10,000s)    Number of Graduates (in 10,000s)
1996    858.21                              26.02
1997    1014.31                             28.88
1998    1180.81                             34.54
1999    1305.16                             42.20
2000    1327.68                             48.94
2001    1330.43                             64.10
2002    1285.10                             129.42
2003    1155.91                             70.45
2004    1234.53                             78.81
2005    1058.04                             254.26

Source: Zhang (2009) Lifelong education (learning) in China: Present situation and development trend. Convergence, 42(1), 49-63.

Examinee <- c(858.21,1014.31,1180.81,1305.16,1327.68,1330.43,1285.10,1155.91,1234.53,1058.04)

Graduate <- c(26.02,28.88,34.54,42.20,48.94,64.10,129.42,70.45,78.81,254.26)

Year <- c(1996:2005)

Page 32

Line + Bar Chart

Page 33

barplot(Examinee)

par(new=TRUE)

plot(Year,Graduate)  # high-level plot function; par(new=TRUE) makes it draw over the bar chart instead of starting a new figure

(Figure panels labelled barplot(Examinee) and plot(Year,Graduate).)

Page 34

barplot(Examinee,

main="No. of Examinees and Graduates\nin China between 1996 and 2005")

par(new=TRUE)

plot(Year,Graduate,type="l")

Page 35

par(mar=c(5,6,4,6))

barplot(Examinee,

main="No. of Examinees and Graduates\nin China between 1996 and 2005",

las=1,

names.arg=c(1996:2005))

par(new=TRUE)

plot(Year,Graduate,type="l",col="coral1",

xaxt="n",yaxt="n",xlab="",ylab="",

lwd=3)

axis(4,las=1)

# mar=c(5,6,4,6): margins of 5 lines (bottom), 6 lines (left), 4 lines (top), 6 lines (right)

Page 37

par(mar=c(5,6,4,6))

barplot(Examinee,

main="No. of Examinees and Graduates\nin China between 1996 and 2005",

las=1,

names.arg=c(1996:2005))

par(new=TRUE)

plot(Year,Graduate,type="l",col="coral1",

xaxt="n",yaxt="n",xlab="",ylab="",

lwd=3)

axis(4,las=1)

# names.arg: a vector of names to be plotted below each bar or group of bars.

Page 38

par(mar=c(5,6,4,6))

barplot(Examinee,

main="No. of Examinees and Graduates\nin China between 1996 and 2005",

las=1,

names.arg=c(1996:2005))

par(new=TRUE)

plot(Year,Graduate,type="l",col="coral1",

xaxt="n",yaxt="n",xlab="",ylab="",

lwd=3)

axis(4,las=1)  # draw the axis on side 4 (right) with horizontal tick labels

Page 39

par(mar=c(5,6,4,6))

barplot(Examinee,

main="No. of Examinees and Graduates\nin China between 1996 and 2005",

las=1,

col="pink",

names.arg=c(1996:2005))

par(new=TRUE)

plot(Year,Graduate,type="l",col="coral1",

xaxt="n",yaxt="n",xlab="",ylab="",

lwd=3)

axis(4,las=1)

Page 40

Examinee <- c(858.21,1014.31,1180.81,1305.16,1327.68,1330.43,1285.10,1155.91,1234.53,1058.04)

Graduate <- c(26.02,28.88,34.54,42.20,48.94,64.10,129.42,70.45,78.81,254.26)

Year <- c(1996:2005)

par(mar=c(5,6,4,6))

barplot(Examinee,col="pink",

main="No. of Examinees and Graduates\nin China between 1996 and 2005",

cex.main=1.5,

xlab="Year",

cex.lab=1.2,

ylim=c(0,1400),

las=1,

names.arg=c(1996:2005))

par(new=TRUE)

plot(Year,Graduate,type="l",col="coral1",

xaxt="n",yaxt="n",xlab="",ylab="",

lwd=3)

axis(4,las=1)

mtext("Number of Graduates",side=4,line=3,cex=1.2,col="coral1")

mtext("Number of Examinees",side=2,line=3,cex=1.2,col="pink")

text(2001,25,"Source: Yearbook of Educational Statistics in China, 2006")
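One caveat with the par(new=TRUE) overlay is that the second plot uses its own x-axis (the years), which need not line up with the bar centres. A possible alternative, sketched here and not taken from the slides, is to keep the bar midpoints returned by barplot() and add the line with lines(), rescaling Graduate onto the left axis. The right-axis range of 0 to 300 and the name scale_factor are assumptions made for illustration.

```r
Examinee <- c(858.21, 1014.31, 1180.81, 1305.16, 1327.68,
              1330.43, 1285.10, 1155.91, 1234.53, 1058.04)
Graduate <- c(26.02, 28.88, 34.54, 42.20, 48.94,
              64.10, 129.42, 70.45, 78.81, 254.26)
Year <- 1996:2005

par(mar = c(5, 6, 4, 6))

# barplot() returns the x-coordinates of the bar midpoints
mids <- barplot(Examinee, names.arg = Year, las = 1,
                col = "pink", ylim = c(0, 1400))

# Assumed right-axis range: 0 to 300 (in 10,000s), mapped onto 0 to 1400
scale_factor <- 1400 / 300
lines(mids, Graduate * scale_factor, col = "coral1", lwd = 3)

# Label the right axis in the original Graduate units
ticks <- seq(0, 300, by = 50)
axis(4, at = ticks * scale_factor, labels = ticks, las = 1)
```

Because everything is drawn in one coordinate system, each point of the line sits exactly above the centre of its bar.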

Page 41

pie.sales <- c(0.12, 0.3, 0.26, 0.16, 0.04, 0.12)

names(pie.sales) <- c("Blueberry", "Cherry",

"Apple", "Boston Cream", "Other", "Vanilla Cream")

pie(pie.sales,col=rainbow(6))

# Other built-in colour palettes: heat.colors(), terrain.colors(), topo.colors(), cm.colors()
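To compare the palettes named above, the same pie chart can be drawn once per palette. This is a small sketch; the 2-by-2 layout and the loop are illustrative choices, not from the slides.

```r
pie.sales <- c(0.12, 0.3, 0.26, 0.16, 0.04, 0.12)
names(pie.sales) <- c("Blueberry", "Cherry", "Apple",
                      "Boston Cream", "Other", "Vanilla Cream")

# Each palette function takes the number of colours wanted
palettes <- list(heat = heat.colors, terrain = terrain.colors,
                 topo = topo.colors, cm = cm.colors)

old_par <- par(mfrow = c(2, 2), mar = c(1, 1, 2, 1))   # 2 x 2 grid of plots
for (name in names(palettes)) {
  pie(pie.sales, col = palettes[[name]](length(pie.sales)),
      main = paste0(name, ".colors()"))
}
par(old_par)                                           # restore settings
```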

Page 42

Table <- matrix(

c(45, 5,16, 2,

1,33, 3, 7,

20,10,56, 4,

2, 3, 5,50),

ncol=4,byrow=TRUE)

rows <- rep(1:4,4)

cols <- c(rep(1,4),rep(2,4),rep(3,4),rep(4,4))

dimnames(Table) = list(

c("Strongly Disagree", "Disagree","Agree","Strongly Agree"),

c("Strongly Disagree", "Disagree","Agree","Strongly Agree"))

library(scatterplot3d)

scatterplot3d(rows, cols, as.vector(Table),

type="h", pch=" ", angle=50,

lab=c(3,3), lwd=5,

main="3 Dimensional Contingency Table",

xlab="Current Satisfaction With Life",

ylab="Last Year Satisfaction with Life",

zlab="Observation",

x.ticklabs=rownames(Table),

y.ticklabs=colnames(Table),

y.margin.add=1.2,

color="red")


3 Dimensional Contingency Table