Introduction to Research Design. Statlab Workshop, Fall 2010. Jeremy Green, Nancy Hite



Page 1: Introduction to Research Design Statlab Workshop, Fall 2010 Jeremy Green Nancy Hite

Introduction to Research Design

Statlab Workshop, Fall 2010

Jeremy Green

Nancy Hite

Page 2: Outline of a paper

- Introduction
- Theory
- Data Description
- Analysis
- Conclusion

Page 3: Identifying a Question

- Tradeoff between work in and results
  - Easy to do, trivial results
  - Result is interesting, but difficulty is high
- New tools open up new questions
  - New statistical or computational tools make formerly difficult questions approachable
- New theory opens up new questions

Page 4: Introduction

- Topic
  - Most general level
- Question
  - What is the question you want to answer?
  - Be specific
  - Ask only what you can answer
- Review the Literature
  - “Stay the course”

Page 5: Theory

- Categorize your theory
  - Descriptive vs. causal
- Write down your theory
  - In paragraph form
  - Using a statistical model
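As an illustration of writing a theory down as a statistical model, here is a minimal sketch. The symbols and the linear form are assumptions for illustration, not taken from the slides: Y is the outcome, T a treatment indicator, X a covariate. A causal theory that the treatment changes the outcome then becomes a statement about one coefficient.

```latex
Y_i = \beta_0 + \beta_1 T_i + \beta_2 X_i + \varepsilon_i,
\qquad H_0 : \beta_1 = 0 \quad \text{vs.} \quad H_1 : \beta_1 \neq 0
```

Writing the model out this way makes it clear which quantity the hypothesis test addresses and which variables have to be measured.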

Page 6: Hypothesis

Identify testable hypotheses.
- How well does the hypothesis test the theory?
- What is the counterfactual argument?
- What is the scope of the hypothesis test?
- Spurious factors, contamination, endogenous factors

Page 7: Do you need statistics after all?

Page 8: Methodological Concerns: CONSORT Checklist

Page 9: Variables

- Dependent Variable (response, outcome, criterion)
- Independent Variables (explanatory or predictor variables)
  - Treatment Variable
  - Covariates / Confounding Variables
- Categorical and Continuous Variables

Remember: The types of variables we choose determine the statistics we use.
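The reminder above can be sketched as a small lookup. This is a rough starting point, not a rule from the workshop: the table, the function name `suggest_analysis`, and the type labels are all illustrative assumptions, limited to the analyses the workshop itself names.

```python
# Hypothetical lookup: (independent variable type, dependent variable type)
# -> a common first analysis. Illustrative only; real choices also depend on
# the design, the sample, and the assumptions of each test.
STARTING_ANALYSIS = {
    ("categorical", "continuous"): "t-test (two groups) or ANOVA (more than two groups)",
    ("continuous", "continuous"): "correlation or regression",
    ("categorical", "categorical"): "chi-square test",
}


def suggest_analysis(iv_type, dv_type):
    """Return a common first analysis for the given variable types."""
    key = (iv_type, dv_type)
    if key not in STARTING_ANALYSIS:
        # Anything outside the table is left to a more complex model.
        return "no simple default; consider a more complex model"
    return STARTING_ANALYSIS[key]


print(suggest_analysis("categorical", "continuous"))
```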

Page 10: You need Data

- Think about analyses early!
- Collecting your own data
  - Retrospective, prospective, experimental & observational methods
- Can find most data you’ll need on-line!
  - Statlab Webpage (http://statlab.stat.yale.edu)
  - Advisors
  - Yale StatCat (http://ssrs.yale.edu/statcat/)
  - ICPSR (http://www.icpsr.umich.edu)
  - Reference Librarian (Julie Linden)

Page 11: So, you want to make a survey

- Extensive on-line resources and software
- Question types determine analyses
  - Open vs. closed-ended questions, Likert scales, rank order data
  - Assumptions of normality
- Validity
  - Internal & External validity
- Pilot testing
  - You need variance to analyze!
- Sample size
  - It depends: power, effect size, cost (UCLA power calculator)
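To show how power and effect size drive sample size, here is a minimal sketch using the standard normal approximation for a two-sided comparison of two group means. It is not the UCLA calculator; the function name `n_per_group` and the default alpha and power are assumptions.

```python
import math
from statistics import NormalDist


def n_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate n per group for a two-sided, two-sample comparison of means.

    Normal approximation: n = 2 * ((z_{1 - alpha/2} + z_{power}) / d) ** 2,
    where d is the standardized effect size (Cohen's d).
    """
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    z = NormalDist()  # standard normal distribution
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_power = z.inv_cdf(power)
    return math.ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)


# Smaller effects need far larger samples.
for d in (0.2, 0.5, 0.8):
    print(d, n_per_group(d))
```

With the defaults, an effect size of 0.5 gives 63 per group under this approximation. Calculators based on the t distribution usually report a slightly larger number.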

Page 12: Once You’ve Found or Collected your data

- Download the data and documentation
  - StatTransfer (Statlab)
- Determine data file type
  - Probably a text file (.txt, .dat, .raw)
  - Converting text & delimited files
- Choose a statistical software program
  - SPSS, Stata, SAS, Matlab, Excel, R, C++
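For delimited text files, the delimiter can often be guessed before conversion. A minimal sketch using Python's standard `csv` module follows. The helper name `read_delimited`, the candidate delimiters, and the sample data are assumptions, and the guess can fail on irregular files, so inspect the result.

```python
import csv
import io


def read_delimited(text, candidates=",\t;|"):
    """Guess the delimiter of delimited text and return its rows as dicts.

    The first line is treated as the header. All values come back as strings.
    """
    dialect = csv.Sniffer().sniff(text, delimiters=candidates)
    reader = csv.DictReader(io.StringIO(text), dialect=dialect)
    return list(reader)


# Illustrative tab-delimited content, as might be found in a .txt or .dat file.
sample = "id\tage\tgroup\n1\t34\tcontrol\n2\t29\ttreatment\n"
rows = read_delimited(sample)
print(rows[0])
```

Fixed-width files (common for .dat and .raw) have no delimiter and need the column positions from the documentation instead.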

Page 13: Managing your data

- Back up all Master Data Files
  - CDR/CDRW, USB Key
- Codebook
  - All codes
  - Adding variables, cases, computing new variables
- Keep a roadmap
  - Keep a log of all analyses with what you have done
  - Save syntax files

Page 14: Data Entry - Codebook

Always create a codebook that contains:
- Instructions for entering data
- Instructions for making decisions when data are ambiguous
- Instructions for handling missing observations
- Numerical codes you will use for categorical data
- General troubleshooting information

Treat it as a working document
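A codebook can also be kept in machine-readable form so that the same codes drive data entry and analysis. The sketch below is one possible layout. Every variable name, code, and rule in it is hypothetical.

```python
# Hypothetical codebook for a small survey. All names and codes are illustrative.
CODEBOOK = {
    "gender": {
        "label": "Respondent gender",
        "codes": {1: "female", 2: "male", 3: "other"},
        "missing": {-9: "no answer"},
        "ambiguous": "If two boxes are ticked, enter -9 and note the form ID.",
    },
    "satisfaction": {
        "label": "Overall satisfaction (1 = very low, 5 = very high)",
        "codes": {1: "very low", 2: "low", 3: "neutral", 4: "high", 5: "very high"},
        "missing": {-9: "no answer", -8: "not applicable"},
        "ambiguous": "If a mark falls between two boxes, enter the lower value.",
    },
}


def decode(variable, raw_code):
    """Translate a numeric code into its label; missing codes become None."""
    entry = CODEBOOK[variable]
    if raw_code in entry["missing"]:
        return None
    if raw_code not in entry["codes"]:
        raise ValueError(f"{raw_code!r} is not a documented code for {variable!r}")
    return entry["codes"][raw_code]


print(decode("satisfaction", 4))
```

Raising an error on undocumented codes means entry mistakes surface immediately and do not pass silently into the analysis.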

Page 15: Cleaning your data

- To minimize errors while manually entering data, you can set ranges in Excel so that if a value outside the range is entered, the cell will change color.
  - To do this, go to Format - Conditional Formatting and specify the ranges for which a different format should show up.
- You can also use the data validation options. Go to Data - Validation.
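The same kind of range check can be run after entry, on an exported file. This is a minimal Python sketch, not an Excel feature. The column names, limits, and sample rows are hypothetical.

```python
# Hypothetical allowed ranges (inclusive) for each column.
VALID_RANGES = {"age": (18, 99), "satisfaction": (1, 5)}


def out_of_range(rows, valid_ranges=VALID_RANGES):
    """Return (row_number, column, value) for each value that is missing,
    non-numeric, or outside its allowed range."""
    problems = []
    for row_number, row in enumerate(rows, start=1):
        for column, (low, high) in valid_ranges.items():
            value = row.get(column)
            try:
                ok = low <= float(value) <= high
            except (TypeError, ValueError):
                ok = False  # blank or non-numeric entry
            if not ok:
                problems.append((row_number, column, value))
    return problems


rows = [
    {"age": "34", "satisfaction": "4"},
    {"age": "340", "satisfaction": "4"},  # likely a typing error
    {"age": "29", "satisfaction": ""},    # blank entry
]
problems = out_of_range(rows)
print(problems)
```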

Page 16: Keeping Track of Data Sets

- Every time you make changes to your data, save it with the current date
- Keep a document with a list of the major changes with each version
- A good idea is to keep a folder with the original data sets and create different subfolders as you make changes to the data set. Sometimes it is also a good idea to keep a working directory for currently active files
- Always make backup copies
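The dated-copy habit can be automated. The sketch below copies the working file under a dated name and appends a line to a change list. The function name `save_version`, the `CHANGES.txt` file name, and the folder layout are assumptions, not a Statlab convention.

```python
import shutil
from datetime import date
from pathlib import Path


def save_version(working_file, versions_dir, change_note, today=None):
    """Copy working_file into versions_dir under a dated name and log the change."""
    working_file = Path(working_file)
    versions_dir = Path(versions_dir)
    versions_dir.mkdir(parents=True, exist_ok=True)

    stamp = (today or date.today()).isoformat()  # e.g. 2010-10-01
    target = versions_dir / f"{working_file.stem}_{stamp}{working_file.suffix}"
    shutil.copy2(working_file, target)

    # One line per saved version: date, file name, what changed.
    with open(versions_dir / "CHANGES.txt", "a", encoding="utf-8") as log:
        log.write(f"{stamp}\t{target.name}\t{change_note}\n")
    return target
```

A call might look like `save_version("working/survey.csv", "versions", "recoded age into groups")`. Saving twice on the same day overwrites that day's copy, so add a time or counter to the name if several versions a day matter.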

Page 17: Keeping Track of Syntaxes and Outputs

- Save all the syntax you write
- Save all the output you produce and try to annotate it as much as possible
- Save your syntax and output with the data file name, a brief description of the analyses, and the current date
- Save syntax and output in a separate folder from your data
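One way to keep that naming scheme consistent is to generate the names. A minimal sketch follows. The helper `analysis_filename` and its exact name pattern are assumptions.

```python
import re
from datetime import date
from pathlib import Path


def analysis_filename(data_file, description, extension, today=None):
    """Build '<data file name>_<short description>_<date>.<extension>'."""
    stem = Path(data_file).stem
    # Lowercase the description and turn anything non-alphanumeric into hyphens.
    slug = re.sub(r"[^a-z0-9]+", "-", description.lower()).strip("-")
    stamp = (today or date.today()).isoformat()
    return f"{stem}_{slug}_{stamp}.{extension.lstrip('.')}"


print(analysis_filename("data/survey.csv", "Age by group t-test", "log"))
```

Using the same description and date for a syntax file and its output makes it easy to match the two later.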

Page 18: So, how do I analyze my data?

- Correlation
  - Correlation allows you to quantify relationships between variables (r, r-squared)
  - Regression allows prediction of dependent variable based on one or more independent variables
- Group differences
  - t-test & ANOVA
  - Chi-square for categorical and frequency data
- Significance v. effect size
- More Complex Models
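To make r, r-squared, regression, and effect size concrete, here is a minimal sketch with the textbook formulas written out in plain Python. The data are invented, and statistical software reports the same quantities along with significance tests.

```python
from math import sqrt
from statistics import mean, stdev


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def simple_regression(x, y):
    """Least-squares slope and intercept for predicting y from a single x."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    return slope, my - slope * mx


def cohens_d(group_a, group_b):
    """Standardized mean difference using the pooled standard deviation."""
    na, nb = len(group_a), len(group_b)
    pooled_var = ((na - 1) * stdev(group_a) ** 2
                  + (nb - 1) * stdev(group_b) ** 2) / (na + nb - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


# Invented example data: hours studied and test score.
hours = [1, 2, 3, 4, 5]
score = [2, 4, 5, 4, 5]

r = pearson_r(hours, score)
slope, intercept = simple_regression(hours, score)
print(f"r = {r:.3f}, r-squared = {r ** 2:.3f}")
print(f"predicted score = {intercept:.2f} + {slope:.2f} * hours")
print(f"Cohen's d = {cohens_d([5, 6, 7], [3, 4, 5]):.2f}")
```

Significance depends on sample size as well as on the size of the effect, so a measure such as Cohen's d is worth reporting next to the p-value.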

Page 19: Take Away Messages

1) Determine your question, methods and statistics before you start
2) Keep a codebook of everything
3) Keep a log of all commands issued
4) Save data at every step
5) Ask for help
6) Don’t get in over your head