Categorical data with R

Tabulating data with R, 2012-10-22 @HSPH, Kazuki Yoshida, M.D., MPH-CLE student

Upload: kazuki-yoshida

Post on 27-Jun-2015


Page 1: Categorical data with R

Tabulating data with R

2012-10-22 @HSPH

Kazuki Yoshida, M.D., MPH-CLE student

FREEDOM TO KNOW

Page 2: Categorical data with R

Group Website is at:

http://rpubs.com/kaz_yos/useR_at_HSPH

Page 3: Categorical data with R

Previously in this group

- Introduction
- Reading Data into R (1)
- Reading Data into R (2)
- Descriptive statistics

Group Website: http://rpubs.com/kaz_yos/useR_at_HSPH

Page 4: Categorical data with R

Menu

- Categorical data
- How to tabulate
- Get sums and proportions

Page 5: Categorical data with R

Ingredients

Epi/Stat

- Tables
- Cross tables
- Stratified tables
- Creating categorical variables

Programming

- data()
- table(), summary()
- prop.table()
- addmargins()
- xtabs(), ftable()
- gmodels::CrossTable()
- epiR::epi.2by2()

Page 6: Categorical data with R

Categorical data

gender

country

race

ethnicity

cancer stage

disease severity

education level

Page 7: Categorical data with R

Open RStudio

Page 8: Categorical data with R

Install and load: vcd, epiR
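A minimal sketch of the install-and-load step, assuming the packages come from CRAN. gmodels is added to the list here because CrossTable() is used later in the session; that addition is an assumption, not part of the slide. The install line is commented out so the snippet also runs offline.

```r
# Packages used in this session; gmodels is needed later for CrossTable().
pkgs <- c("vcd", "epiR", "gmodels")

# Check which packages are already installed (TRUE/FALSE per package).
installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
print(installed)

# Install whatever is missing (uncomment to run; needs internet access).
# install.packages(pkgs[!installed])

# Load the installed packages so their functions and datasets are available.
for (p in pkgs[installed]) {
  library(p, character.only = TRUE)
}
```

Installing is needed once per machine; loading with library() is needed once per R session.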

Page 9: Categorical data with R

data(Arthritis)

Load the built-in dataset named “Arthritis”

We will use the “Arthritis” dataset in the vcd package

Page 10: Categorical data with R

Indexing: extraction of data from a data frame

Arthritis[1:17 , ]

Extract 1st to 17th rows (colon in between)

Show all columns (don’t forget the comma)
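The same bracket indexing can be tried on a small data frame. This is a sketch on hypothetical toy data (the `dat` data frame and its values are made up for illustration and are not the Arthritis data).

```r
# Hypothetical toy data frame (not the real Arthritis data).
dat <- data.frame(
  ID        = 1:6,
  Treatment = c("Placebo", "Treated", "Placebo", "Treated", "Placebo", "Treated"),
  Age       = c(27, 29, 30, 32, 46, 58)
)

# Rows 1 to 3, all columns: the row index goes before the comma.
first_rows <- dat[1:3, ]

# All rows, one column: the column index goes after the comma.
ages <- dat[, "Age"]

# Rows 2 to 4 of the ID and Age columns only.
subset_dat <- dat[2:4, c("ID", "Age")]

print(first_rows)
print(ages)
print(subset_dat)
```

Leaving one side of the comma empty means "everything" along that dimension.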

Page 11: Categorical data with R

Treatment vector in Arthritis data frame

Five vectors of same length tied together

Page 12: Categorical data with R

summary

summary(Arthritis)

summary of whole dataset

Page 13: Categorical data with R

Your turn

- summary(Arthritis)

adopted from Hadley Wickham

Page 14: Categorical data with R

Arthritis$Treatment

Accessing a single variable in data set

dataset name: Arthritis, variable name: Treatment

Page 15: Categorical data with R

Arthritis$Treatment

factor levels (categories)

Page 16: Categorical data with R

levels

levels(Arthritis$Treatment)

Check factor levels of a vector

Page 17: Categorical data with R

Your turn

- Arthritis$Improved
- levels(Arthritis$Improved)


Page 18: Categorical data with R

This is an ordered factor

Page 19: Categorical data with R

factor

Page 20: Categorical data with R

A factor is a categorical variable in R
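To make the difference between a plain factor and an ordered factor concrete, here is a small sketch. The `improved` vector is hypothetical; only the level names mirror those of Arthritis$Improved.

```r
# Hypothetical responses, stored as plain character strings.
improved <- c("None", "Marked", "Some", "None", "Marked", "None")

# Unordered factor: levels are sorted alphabetically by default.
f_unordered <- factor(improved)
print(levels(f_unordered))

# Ordered factor: state the level order explicitly.
f_ordered <- factor(improved, levels = c("None", "Some", "Marked"), ordered = TRUE)
print(levels(f_ordered))
print(is.ordered(f_ordered))

# Order comparisons are meaningful only for ordered factors.
at_least_some <- f_ordered >= "Some"
print(at_least_some)
```

Stating the levels explicitly also controls the order in which categories appear in tables.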

Page 21: Categorical data with R

table

table(Arthritis$Improved)

Create a single variable summary

Page 22: Categorical data with R

Your turn

- table(Arthritis$Improved)


Page 23: Categorical data with R

prop.table

prop.table(table.object)

Convert tables to proportions

Page 24: Categorical data with R

Your turn

- Improved.cat <- table(Arthritis$Improved)
- prop.table(Improved.cat)


Page 25: Categorical data with R

xtabs

xtabs(formula = ~ , data = Arthritis)

Create cross tables

Page 26: Categorical data with R

Your turn

- xtabs(~ Treatment + Improved, Arthritis)
- xtabs(~ Treatment + Improved + Sex, Arthritis)


Page 27: Categorical data with R

1st dimension

2nd dimension

3rd dimension

Page 28: Categorical data with R

addmargins

addmargins(table.object)

Add margins to tables

Page 29: Categorical data with R

Your turn

- tab1 <- xtabs(~ Treatment + Improved, Arthritis)
- addmargins(tab1)

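addmargins() also takes a margin argument that controls which totals are added. The sketch below uses a hypothetical toy data frame (not the Arthritis data) so that the totals are easy to check by eye.

```r
# Hypothetical toy data (not the real Arthritis data).
dat <- data.frame(
  Treatment = c("Placebo", "Placebo", "Placebo",
                "Treated", "Treated", "Treated", "Treated"),
  Improved  = c("None", "None", "Some",
                "None", "Some", "Marked", "Marked")
)
tab <- xtabs(~ Treatment + Improved, data = dat)

# Default: add totals along every dimension, plus the grand total.
with_all <- addmargins(tab)

# margin = 1: sum over the 1st dimension, adding a row of column totals.
with_col_totals <- addmargins(tab, margin = 1)

# margin = 2: sum over the 2nd dimension, adding a column of row totals.
with_row_totals <- addmargins(tab, margin = 2)

print(with_all)
print(with_col_totals)
print(with_row_totals)
```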

Page 30: Categorical data with R

ftable

ftable(..., exclude = c(NA, NaN), row.vars = NULL, col.vars = NULL)

Create flat tables

Good for ≥ 3 dimensions

Page 31: Categorical data with R

Your turn

- tab2 <- xtabs(~ Treatment + Improved + Sex, Arthritis)
- ftable(tab2)

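The row.vars and col.vars arguments decide which variables go down the side and which go across the top. As a sketch that needs no extra packages, the example below swaps in the built-in Titanic table from base R instead of the Arthritis data.

```r
# Titanic is a 4-dimensional table shipped with base R (datasets package):
# Class x Sex x Age x Survived.
data(Titanic)
print(dim(Titanic))

# Default layout: the last dimension goes to the columns.
print(ftable(Titanic))

# Choose which variables define the rows and which define the columns.
flat <- ftable(Titanic,
               row.vars = c("Class", "Sex"),
               col.vars = c("Age", "Survived"))
print(flat)
```

The counts are unchanged; only the layout differs, so the layout can be chosen to suit the comparison being made.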

Page 32: Categorical data with R

prop.table

prop.table(cross.table.object)

Proportions again

Page 33: Categorical data with R

Your turn

- tab3 <- xtabs(~ Treatment + Improved, Arthritis)
- prop.table(tab3) # proportion to total
- prop.table(tab3, 1) # proportion to row sum
- prop.table(tab3, 2) # proportion to column sum


1st dimension

2nd dimension
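One way to confirm which denominator each margin setting uses is to check what the proportions sum to. The counts below are hypothetical and only laid out like a Treatment by Improved table.

```r
# Hypothetical counts laid out like the Treatment x Improved table.
tab <- matrix(
  c(20, 10, 10,
    10,  5, 25),
  nrow = 2, byrow = TRUE,
  dimnames = list(Treatment = c("Placebo", "Treated"),
                  Improved  = c("None", "Some", "Marked"))
)

p_total <- prop.table(tab)      # each cell / grand total
p_row   <- prop.table(tab, 1)   # each cell / its row total
p_col   <- prop.table(tab, 2)   # each cell / its column total

print(sum(p_total))     # all cells together sum to 1
print(rowSums(p_row))   # each row sums to 1
print(colSums(p_col))   # each column sums to 1

# Row percentages rounded to one decimal place, often easier to read.
print(round(100 * p_row, 1))
```

With treatment in the rows, the row proportions (margin 1) are usually the ones of interest, since they give the outcome distribution within each treatment group.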

Page 34: Categorical data with R

chisq.test

chisq.test(cross.table.object)

Chi-squared test

Page 35: Categorical data with R

fisher.test

fisher.test(cross.table.object)

Fisher’s exact test

Page 36: Categorical data with R

Your turn

- tab3 <- xtabs(~ Treatment + Improved, Arthritis)
- chisq.test(tab3)
- fisher.test(tab3)

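Both functions return objects whose pieces can be pulled out by name. The sketch below uses hypothetical 2x2 counts. Looking at the expected counts is a common way to choose between the two tests, since a frequently cited rule of thumb prefers Fisher's exact test when expected counts fall below 5.

```r
# Hypothetical 2x2 counts (not from the Arthritis data).
tab <- matrix(
  c(12,  3,
     5, 10),
  nrow = 2, byrow = TRUE,
  dimnames = list(Treatment = c("Treated", "Placebo"),
                  Improved  = c("Yes", "No"))
)

chi <- chisq.test(tab)    # applies Yates' continuity correction for 2x2 tables
fis <- fisher.test(tab)

print(chi$expected)   # expected counts under independence
print(chi$p.value)
print(fis$p.value)
print(fis$estimate)   # conditional maximum likelihood estimate of the odds ratio
```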

Page 37: Categorical data with R

CrossTable

CrossTable(tab.2d)

available in the gmodels package

SAS-like cross tables

Page 38: Categorical data with R

Your turn

- tab3 <- xtabs(~ Treatment + Improved, Arthritis)
- CrossTable(tab3)


Page 39: Categorical data with R
Page 40: Categorical data with R

epi.2by2

epi.2by2(tab.2by2)

available in the epiR package

2x2 table with RR, RD, OR

Page 41: Categorical data with R

Your turn

- tab.2by2 <- xtabs(~ Sex + Treatment, Arthritis)
- epi.2by2(tab.2by2, units = 1)

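To see what the RR, RD and OR point estimates mean, they can be computed by hand in base R. This is a sketch with hypothetical counts, with exposure in rows and outcome in columns; it does not reproduce the confidence intervals that epi.2by2() reports.

```r
# Hypothetical 2x2 table: exposure in rows, outcome in columns.
tab <- matrix(
  c(20, 80,
    10, 90),
  nrow = 2, byrow = TRUE,
  dimnames = list(Exposure = c("Exposed", "Unexposed"),
                  Outcome  = c("Disease", "No disease"))
)

# Risk of disease within each exposure group.
risk_exposed   <- tab["Exposed", "Disease"] / sum(tab["Exposed", ])
risk_unexposed <- tab["Unexposed", "Disease"] / sum(tab["Unexposed", ])

risk_ratio      <- risk_exposed / risk_unexposed
risk_difference <- risk_exposed - risk_unexposed
odds_ratio      <- (tab[1, 1] * tab[2, 2]) / (tab[1, 2] * tab[2, 1])

print(c(RR = risk_ratio, RD = risk_difference, OR = odds_ratio))
```

Here the risks are 0.2 and 0.1, so the risk ratio is 2, the risk difference is 0.1, and the odds ratio is (20 × 90) / (80 × 10) = 2.25.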

Page 42: Categorical data with R

Creating factors

Page 43: Categorical data with R

Data in Excel

Integer

factor

factor

Page 44: Categorical data with R

dat$Stage <- factor(dat$Stage)

To convert a numeric vector to a factor
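factor() can also attach readable labels while converting. A minimal sketch, assuming a hypothetical `dat` data frame with stages coded 1 to 4 (the data and the Roman numeral labels are illustrative only):

```r
# Hypothetical cancer stages coded as numbers, as they might arrive from a spreadsheet.
dat <- data.frame(Stage = c(2, 4, 1, 3, 2, 1))

# Plain conversion: the levels are the sorted unique values, stored as text.
stage_plain <- factor(dat$Stage)
print(levels(stage_plain))

# Conversion with explicit codes and readable labels.
stage_labeled <- factor(dat$Stage,
                        levels = 1:4,
                        labels = c("I", "II", "III", "IV"))
print(table(stage_labeled))
```

Listing the levels explicitly keeps a category in the table even when no observation falls into it.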

Page 45: Categorical data with R

dat$Stage <- as.numeric(as.character(dat$Stage))

To convert back to numbers
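The detour through as.character() matters because as.numeric() applied directly to a factor returns the internal level codes, not the original values. A small sketch with a hypothetical `stage` vector:

```r
# Hypothetical stage codes stored as a factor.
stage <- factor(c(4, 2, 2, 3))
print(levels(stage))               # "2" "3" "4"

# Pitfall: as.numeric() on a factor returns the internal level codes.
codes <- as.numeric(stage)
print(codes)                       # 3 1 1 2

# Going through character first recovers the original values.
values <- as.numeric(as.character(stage))
print(values)                      # 4 2 2 3
```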

Page 46: Categorical data with R