Binary Response Harry R. Erwin, PhD School of Computing and Technology University of Sunderland


Binary Response

Harry R. Erwin, PhD

School of Computing and Technology

University of Sunderland

Resources

• Crawley, MJ (2005) Statistics: An Introduction Using R. Wiley.

• Freund, RJ, and WJ Wilson (1998) Regression Analysis, Academic Press.

• Gentle, JE (2002) Elements of Computational Statistics. Springer.

• Gonick, L., and Woollcott Smith (1993) A Cartoon Guide to Statistics. HarperResource (for fun).

Introduction

• These four demonstration sessions of this class address special types of data:
– Counts
– Proportions
– Survival analysis
– Binary responses (this lecture)

Binary Response

• Very common:
– dead or alive
– occupied or empty
– male or female
– employed or unemployed

• Response variable is 0 or 1.

• R assumes a binomial trial with sample size 1.
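
A minimal sketch of what "sample size 1" means in practice, using simulated data (the names `x` and `y` and the parameter values are assumptions for illustration, not from the lecture). A 0/1 response vector should give the same fit as a two-column response of successes and failures in which every row is a single trial.

```r
# Simulated data; names and parameter values are illustrative assumptions.
set.seed(42)
x <- runif(100, 0, 10)
y <- rbinom(100, size = 1, prob = plogis(-2 + 0.5 * x))  # 0/1 response

# Fit with the 0/1 vector as the response.
fit_vector <- glm(y ~ x, family = binomial)

# Same model with an explicit (successes, failures) matrix;
# each row is one trial, so successes + failures = 1.
fit_matrix <- glm(cbind(y, 1 - y) ~ x, family = binomial)

# The two sets of coefficient estimates should agree.
all.equal(coef(fit_vector), coef(fit_matrix))
```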

When to use Binary Response Data

• Do a binary response analysis only when you have unique values of one or more explanatory variables for each and every individual case.

• Otherwise lump: aggregate to the point where you have unique values. Either:
– Analyse the data as a contingency table using Poisson errors, or
– Decide which explanatory variable is key, express the data as proportions, recode as a count of a two-level factor, and assume binomial errors.
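
The second option can be sketched as follows, assuming simulated individual records in which the only explanatory variable is a two-level factor (the data frame `cases` and its columns `treatment` and `dead` are hypothetical). Because many individuals share the same explanatory value, the records are lumped into counts and fitted as proportion data with binomial errors.

```r
# Simulated individual-level records; all names and values are illustrative assumptions.
set.seed(1)
cases <- data.frame(
  treatment = factor(rep(c("control", "treated"), each = 50)),
  dead      = rbinom(100, size = 1, prob = rep(c(0.3, 0.6), each = 50))
)

# Lump: one row per treatment level, with a count of each outcome.
lumped <- aggregate(dead ~ treatment, data = cases, FUN = sum)
lumped$alive <- as.vector(table(cases$treatment)) - lumped$dead

# Proportion data with binomial errors:
# the response is a two-column matrix of (successes, failures).
fit <- glm(cbind(dead, alive) ~ treatment, family = binomial, data = lumped)
summary(fit)
```

With only two rows and two parameters this model is saturated, so the sketch shows the data layout rather than a meaningful test.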

Applications to Spatial and Time Series Statistics

• You’re assuming you’re sampling from a spatial point process. The null hypothesis is that events occur uniformly over space and as a Poisson process (memoryless) over time.

• The usual approach is described on the next slide. This addresses both location and rate of events simultaneously. Consider lumping to study the geographic or time-dependent distribution of the event rate separately.

• The problem is similar to how we model neurone spiking.
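
As a rough illustration of lumping over time, the sketch below simulates events from a homogeneous Poisson process and aggregates them into unit-width time bins to estimate the event rate. The rate, the time horizon, and all variable names are assumptions made for this example; it is not the approach described on the next slide.

```r
# Simulated event times; the rate and horizon are illustrative assumptions.
set.seed(1)
rate    <- 2                      # assumed events per unit time
horizon <- 50                     # assumed length of the observation window
gaps    <- rexp(200, rate)        # memoryless (exponential) inter-event gaps
times   <- cumsum(gaps)
times   <- times[times <= horizon]

# Lump: count the events falling in each unit-width time bin.
counts <- as.vector(table(cut(times, breaks = 0:horizon)))

# Intercept-only Poisson model; exp(intercept) estimates the event rate per bin.
fit <- glm(counts ~ 1, family = poisson)
exp(coef(fit))
```

For an intercept-only Poisson model the estimated rate is simply the mean count per bin, so the value of the model form comes when time or location is added as an explanatory variable.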

Modelling Binary Response

• Single vector with the response variable

• Use glm with family = binomial.

• Think about a complementary log-log link (cloglog in R) instead of logit.

• Fit the usual way.

• Test significance using χ².

• Don’t worry about overdispersion.

• Book example.
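
The modelling steps above can be sketched on simulated data; this is not the book example, and the variable names and parameter values are assumptions for illustration. The sketch fits a binary response with the default logit link, refits with the complementary log-log link that R's binomial family provides as `link = "cloglog"`, and tests the explanatory variable by deletion with a chi-squared test on the change in deviance.

```r
# Simulated data; names and parameter values are illustrative assumptions.
set.seed(7)
x <- runif(200, 0, 10)
y <- rbinom(200, size = 1, prob = plogis(-3 + 0.6 * x))  # single 0/1 response vector

# Default logit link.
fit_logit <- glm(y ~ x, family = binomial)

# Complementary log-log link as an alternative.
fit_cloglog <- glm(y ~ x, family = binomial(link = "cloglog"))

# Compare the two links by residual deviance (lower means a closer fit).
c(logit = deviance(fit_logit), cloglog = deviance(fit_cloglog))

# Test the significance of x by deletion:
# chi-squared test on the change in deviance between nested models.
fit_null <- update(fit_logit, . ~ . - x)
anova(fit_null, fit_logit, test = "Chisq")
```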