Categorical data with R

Post on 27-Jun-2015


Tabulating data with R

2012-10-22 @HSPH

Kazuki Yoshida, M.D., MPH-CLE student

FREEDOM TO KNOW

Group Website is at:

http://rpubs.com/kaz_yos/useR_at_HSPH

Previously in this group

- Introduction

- Reading Data into R (1)

- Reading Data into R (2)

- Descriptive statistics

Menu

- Categorical data

- How to tabulate

  - Tables

  - Cross tables

  - Stratified tables

- Get sums and proportions

- Creating categorical variables

Ingredients

- data()

- table(), summary()

- prop.table()

- addmargins()

- xtabs(), ftable()

- gmodels::CrossTable()

- epiR::epi.2by2()

Epi/Stat Programming

Categorical data

gender

country

race

ethnicity

cancer stage

disease severity

education level

Open RStudio

Install and load the vcd and epiR packages

data(Arthritis)

Load the built-in dataset named “Arthritis”

We will use the “Arthritis” dataset in the vcd package

Arthritis[1:17, ]

Extract 1st to 17th rows

Show all columns

Indexing: extraction of data from a data frame

Don’t forget the comma

Colon in between

Treatment vector in Arthritis data frame

Five vectors of same length tied together
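A minimal sketch of the indexing syntax above. The data frame `toy` and its values are hypothetical, made up so the example runs without the vcd package; the same bracket syntax applies to `Arthritis`.

```r
# Hypothetical toy data frame (made-up values, not the Arthritis data)
toy <- data.frame(
  ID        = 1:6,
  Treatment = factor(c("Placebo", "Placebo", "Placebo",
                       "Treated", "Treated", "Treated")),
  Age       = c(27, 46, 59, 32, 63, 58)
)

first_rows <- toy[1:3, ]             # rows 1 to 3, all columns (note the comma)
one_cell   <- toy[2, "Age"]          # row 2, column "Age"
two_cols   <- toy[, c("ID", "Age")]  # all rows, two columns

str(toy)  # each column is a vector of the same length
```

Leaving the slot before or after the comma empty means "all rows" or "all columns" respectively.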

summary

summary(Arthritis)

Summary of the whole dataset

Your turn

- summary(Arthritis)

adapted from Hadley Wickham

Arthritis$Treatment

Accessing a single variable in a dataset

dataset name $ variable name

Arthritis$Treatment

factor levels (categories)

levels

levels(Arthritis$Treatment)

Check factor levels of a vector

Your turn

- Arthritis$Improved

- levels(Arthritis$Improved)


This is an ordered factor

factor

A factor is a categorical variable in R
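A short sketch of what an ordered factor looks like when built by hand. The vector `improved` and its values are hypothetical; only the level names mirror the Improved variable.

```r
# Hypothetical responses (made-up values)
improved <- factor(
  c("None", "Marked", "Some", "None", "Marked"),
  levels  = c("None", "Some", "Marked"),  # order of the categories
  ordered = TRUE                          # makes it an ordered factor
)

levels(improved)      # "None" "Some" "Marked"
is.ordered(improved)  # TRUE
improved < "Marked"   # comparisons follow the level order
```

Without `ordered = TRUE` the levels are still stored in the given order, but comparisons such as `<` are not meaningful.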

table

table(Arthritis$Improved)

Create a single-variable summary

Your turn

- table(Arthritis$Improved)


prop.table

prop.table(table.object)

Convert tables to proportions

Your turn

- Improved.cat <- table(Arthritis$Improved)

- prop.table(Improved.cat)
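The two steps above, counting and then converting to proportions, can be sketched on a small vector. The values in `improved` are hypothetical and stand in for `Arthritis$Improved`.

```r
# Hypothetical responses (made-up values)
improved <- c("None", "Marked", "Some", "None",
              "Marked", "None", "Some", "None")

counts <- table(improved)       # Marked 2, None 4, Some 2
props  <- prop.table(counts)    # each count divided by the total

round(100 * props, 1)           # the same proportions as percentages
```

Because `improved` is a plain character vector here, the categories appear in alphabetical order; a factor would keep its level order instead.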


xtabs

xtabs(formula = ~ , data = Arthritis)

Create cross tables

Your turn

- xtabs(~ Treatment + Improved, Arthritis)

- xtabs(~ Treatment + Improved + Sex, Arthritis)


1st dimension

2nd dimension

3rd dimension
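A minimal sketch of how each term in the formula becomes one dimension of the result. The data frame `toy` is hypothetical (eight made-up rows) and only borrows the column names used above.

```r
# Hypothetical toy data (made-up values, not the Arthritis data)
toy <- data.frame(
  Treatment = c("Placebo", "Placebo", "Placebo", "Placebo",
                "Treated", "Treated", "Treated", "Treated"),
  Sex       = c("Female", "Male", "Female", "Male",
                "Female", "Male", "Female", "Female"),
  Improved  = c("None", "None", "Some", "Marked",
                "Some", "Marked", "Marked", "None")
)

tab_2d <- xtabs(~ Treatment + Improved, data = toy)
tab_3d <- xtabs(~ Treatment + Improved + Sex, data = toy)

dim(tab_2d)               # 2 3
dim(tab_3d)               # 2 3 2
names(dimnames(tab_3d))   # "Treatment" "Improved" "Sex"
```

The first variable in the formula indexes rows, the second indexes columns, and the third produces one two-way table per level.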

addmargins

addmargins(table.object)

Add margins to tables

Your turn

- tab1 <- xtabs(~ Treatment + Improved, Arthritis)

- addmargins(tab1)
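A small sketch of what `addmargins()` adds. The counts in `tab` are hypothetical and typed in directly, so the example does not depend on the vcd package.

```r
# Hypothetical 2 x 3 table of counts (made-up numbers)
tab <- as.table(matrix(
  c(30, 12, 8, 8, 6, 20), nrow = 2,
  dimnames = list(Treatment = c("Placebo", "Treated"),
                  Improved  = c("None", "Some", "Marked"))
))

with_margins <- addmargins(tab)     # adds a "Sum" row and a "Sum" column
row_totals   <- addmargins(tab, 2)  # expands dimension 2 only: a "Sum" column

with_margins["Sum", "Sum"]          # grand total: 84
```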


ftable

ftable(..., exclude = c(NA, NaN), row.vars = NULL, col.vars = NULL)

Create flat tables

Good for ≥ 3 dimensions

Your turn

- tab2 <- xtabs(~ Treatment + Improved + Sex, Arthritis)

- ftable(tab2)
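To show the effect of `row.vars` without installing anything, this sketch uses the four-way `Titanic` table that ships with base R as a stand-in for a stratified Arthritis table. The choice of dataset is an assumption made for illustration only.

```r
# Titanic is a built-in 4-dimensional table: Class, Sex, Age, Survived
dim(Titanic)   # 4 2 2 2

# Put Class and Sex in the rows; the remaining variables go to the columns
flat <- ftable(Titanic, row.vars = c("Class", "Sex"))
flat

dim(flat)      # 8 4: one printable two-way layout
```

Swapping names between `row.vars` and `col.vars` rearranges the layout without changing any counts.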


prop.table

prop.table(cross.table.object)

Proportions again

Your turn

- tab3 <- xtabs(~ Treatment + Improved, Arthritis)

- prop.table(tab3) # proportion of the total

- prop.table(tab3, 1) # proportion of the row sum

- prop.table(tab3, 2) # proportion of the column sum


1st dimension

2nd dimension
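One way to check which margin was used is to sum the result along that dimension. This is a sketch on hypothetical counts typed in directly; `tab` is not the Arthritis table.

```r
# Hypothetical 2 x 3 table of counts (made-up numbers)
tab <- as.table(matrix(
  c(30, 12, 8, 8, 6, 20), nrow = 2,
  dimnames = list(Treatment = c("Placebo", "Treated"),
                  Improved  = c("None", "Some", "Marked"))
))

overall <- prop.table(tab)      # cells sum to 1 over the whole table
by_row  <- prop.table(tab, 1)   # each row sums to 1
by_col  <- prop.table(tab, 2)   # each column sums to 1

rowSums(by_row)
colSums(by_col)
```

Row proportions answer "among the treated, what share improved?", which is usually the comparison of interest when treatment is the row variable.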

chisq.test

chisq.test(cross.table.object)

Chi-squared test

fisher.test

fisher.test(cross.table.object)

Fisher’s exact test

Your turn

- tab3 <- xtabs(~ Treatment + Improved, Arthritis)

- chisq.test(tab3)

- fisher.test(tab3)
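Both functions return a list, so individual pieces such as the expected counts or the p-value can be pulled out by name. A sketch on hypothetical counts (the numbers in `tab` are made up):

```r
# Hypothetical 2 x 3 table of counts (made-up numbers)
tab <- as.table(matrix(
  c(30, 12, 8, 8, 6, 20), nrow = 2,
  dimnames = list(Treatment = c("Placebo", "Treated"),
                  Improved  = c("None", "Some", "Marked"))
))

chi <- chisq.test(tab)
fis <- fisher.test(tab)

chi$expected    # expected counts under independence
chi$parameter   # degrees of freedom: (2 - 1) * (3 - 1) = 2
chi$p.value
fis$p.value
```

A common rule of thumb is to prefer Fisher's exact test when some expected counts are small, which is why it helps to look at `chi$expected` first.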


CrossTable

CrossTable(tab.2d)

Available in the gmodels package

SAS-like cross tables

Your turn

- tab3 <- xtabs(~ Treatment + Improved, Arthritis)

- CrossTable(tab3)


epi.2by2

epi.2by2(tab.2by2)

Available in the epiR package

2x2 table with risk ratio (RR), risk difference (RD), and odds ratio (OR)

Your turn

- tab.2by2 <- xtabs(~ Sex + Treatment, Arthritis)

- epi.2by2(tab.2by2, units = 1)
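To make the three measures concrete, here is a hand calculation in base R. It is not how epiR computes them and it gives point estimates only, without confidence intervals. The table `tab`, its labels, and its counts are hypothetical, with the exposed group in the first row and the outcome of interest in the first column.

```r
# Hypothetical 2 x 2 table (made-up counts)
tab <- matrix(
  c(20, 10, 80, 90), nrow = 2,
  dimnames = list(Exposure = c("Exposed", "Unexposed"),
                  Outcome  = c("Disease", "No disease"))
)

risk <- tab[, "Disease"] / rowSums(tab)   # risk in each exposure group

rr <- unname(risk["Exposed"] / risk["Unexposed"])       # risk ratio
rd <- unname(risk["Exposed"] - risk["Unexposed"])       # risk difference
or <- (tab[1, 1] * tab[2, 2]) / (tab[1, 2] * tab[2, 1]) # odds ratio

c(RR = rr, RD = rd, OR = or)   # 2, 0.1, 2.25
```

All three measures depend on which row and column come first, so it is worth checking the table layout before interpreting the output.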


Creating factors

[Diagram: data in Excel, read into R as integer, converted to factor]

dat$Stage <- factor(dat$Stage)

To convert a numeric vector to a factor vector

dat$Stage <- as.numeric(as.character(dat$Stage))

To convert back to number
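The extra `as.character()` step matters because `as.numeric()` on a factor returns the internal level codes rather than the original values. A sketch with a hypothetical data frame `dat` (made-up stage values):

```r
# Hypothetical data (made-up stage values)
dat <- data.frame(Stage = c(2, 4, 4, 3))

dat$Stage <- factor(dat$Stage)   # number vector to factor
levels(dat$Stage)                # "2" "3" "4"

codes  <- as.numeric(dat$Stage)                # 1 3 3 2: level codes, not stages
values <- as.numeric(as.character(dat$Stage))  # 2 4 4 3: the original numbers
```

Going through the character labels first recovers the numbers that were originally stored.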
