PREPARING FOR DATA ANALYSIS - 2011 MBBS HONOURS PROGRAM Jenny Zhang Research Fellow School of Medicine The University of Queensland

Upload: isabel-richard

Post on 02-Jan-2016


PREPARING FOR DATA ANALYSIS - 2011 MBBS HONOURS PROGRAM

Jenny Zhang

Research Fellow

School of Medicine

The University of Queensland

OVERVIEW

Analytical plan
- Hypotheses
- Variables
- Data management
- Statistical methods

Examples of databases in Excel and SPSS

Examples of a code book and description of your variables

HYPOTHESES

Define the most appropriate hypotheses to fully answer the research question.

What are hypotheses? - Hypotheses are statements in quantitative research in which the investigator makes a prediction or a conjecture about the outcome of a relationship among attributes or characteristics.

Examples:

There is a correlation between GPA and academic performance. (a correlation study)

Socioeconomic position will be related to the utilization of preventive health services. (a cross-sectional study)

Average iron status will be different for children whose family cooks in iron pots compared to children whose family cooks in aluminum pots. (an experimental study)

VARIABLES

The attributes, characteristics, or dimensions being measured or studied.

Dependent variables (DVs)
The outcome variable, in which an effect is measured.

Independent variables (IVs)
An independent variable is a variable which is presumed to affect or determine a dependent variable.

Example: You are interested in how stress affects heart rate in humans. Your independent variable would be the stress and the dependent variable would be the heart rate. You can directly manipulate stress levels in your human subjects and measure how those stress levels change heart rate.

DATA MANAGEMENT

Data coding
Code your questions (code book) and label your variables - HSU Database.doc
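A code book can be thought of as a lookup table from each variable name to its label and its permitted codes. The sketch below shows one way to hold that in Python. The variable names (`sex`, `smoker`) and their codes are hypothetical illustrations and are not taken from HSU Database.doc.

```python
# Hypothetical code book: variable name -> variable label and permitted codes.
# These variables and codes are invented for illustration only.
CODEBOOK = {
    "sex": {
        "label": "Sex of participant",
        "codes": {1: "Male", 2: "Female"},
    },
    "smoker": {
        "label": "Current smoker",
        "codes": {0: "No", 1: "Yes", 9: "Missing"},
    },
}


def decode(variable, code):
    """Return the value label for a coded answer.

    Raises ValueError if the code is not listed in the code book,
    which is how an invalid entry would show up.
    """
    codes = CODEBOOK[variable]["codes"]
    if code not in codes:
        raise ValueError(f"{code!r} is not a valid code for {variable!r}")
    return codes[code]
```

Keeping labels and valid codes in one place means the same table can later be reused to check entered data against the coding protocol.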

Data entry
Data are entered into the dataset against each ID number.

Excel database - Jennysur_original_dataanalysis.xls - analysis_1.xls

SPSS database - TTDASS_example.sav - Jennysur_original_data analysis.sav

Data verification
Re-enter 10% of the sample to detect any entry mistakes.
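The re-entry check can be sketched as two steps: draw a random 10% of the ID numbers, then compare the second entry against the first, value by value. This is only an illustration under assumptions: the function names, the dictionary layout (ID number mapped to a row of variable values) and the fixed random seed are all hypothetical, not part of the original workflow.

```python
import random


def pick_verification_ids(ids, fraction=0.10, seed=0):
    """Randomly choose about `fraction` of the IDs (at least one) for re-entry."""
    ids = sorted(ids)
    k = max(1, round(len(ids) * fraction))
    # A seeded generator makes the selection reproducible.
    return sorted(random.Random(seed).sample(ids, k))


def find_entry_mismatches(first_entry, second_entry):
    """Compare re-entered rows with the original rows.

    Both arguments map an ID number to a dict of variable -> value.
    Returns a list of (id, variable, first_value, second_value) tuples,
    one for every value that differs between the two entries.
    """
    mismatches = []
    for pid, second_row in second_entry.items():
        first_row = first_entry[pid]
        for variable, second_value in second_row.items():
            first_value = first_row.get(variable)
            if first_value != second_value:
                mismatches.append((pid, variable, first_value, second_value))
    return mismatches
```

Each mismatch is then resolved by going back to the source questionnaire for that ID number.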

DATA MANAGEMENT

Data cleaning

1. Perform a frequency distribution of all variables to check for:
- Duplicated IDs
- Invalid values for each variable against the coding protocol
- Extreme values (determine plausibility and final inclusion or exclusion from analysis)

TTDASS_example.sav

2. Consistency checks - cross-checks as determined by your research team, e.g. filter questions checked for impossible combinations and inconsistent values and meanings; baseline data against follow-up data; term 1-5.
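The two cleaning steps above can be sketched as a handful of small checks over the entered rows. This is a minimal illustration, assuming each row is a dict that carries an `id` key; the function names and the example rule are hypothetical.

```python
from collections import Counter


def duplicated_ids(rows):
    """Return the ID numbers that appear more than once."""
    counts = Counter(row["id"] for row in rows)
    return sorted(pid for pid, n in counts.items() if n > 1)


def invalid_values(rows, variable, valid_codes):
    """Return (id, value) pairs whose value is not in the coding protocol."""
    return [(row["id"], row[variable]) for row in rows
            if row[variable] not in valid_codes]


def out_of_range(rows, variable, low, high):
    """Return (id, value) pairs outside a plausible range, for manual review."""
    return [(row["id"], row[variable]) for row in rows
            if not low <= row[variable] <= high]


def inconsistent(rows, rule):
    """Return the IDs of rows for which a cross-check rule is False."""
    return [row["id"] for row in rows if not rule(row)]
```

A filter-question cross-check could then be written as a rule, for example `inconsistent(rows, lambda r: r["smoker"] == 1 or r["cigs_per_day"] == 0)`, which flags non-smokers who nevertheless report cigarettes per day (both variable names are invented for the example). Flagged values are candidates for review, not automatic exclusions.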

Data storage

Data are de-identified and kept confidential; paper questionnaires are safely stored in a locked filing cabinet.

STATISTICAL METHODS

Choose the best way to analyse the data for each hypothesis, and give the rationale for the choice of each analysis.

Descriptive statistics
- Describe the characteristics of participants with respect to all available variables, e.g. age, sex, country of birth.
- Describe the statistics of all variables.
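As an illustration of the descriptive step, continuous variables such as age are usually summarised by centre and spread, and categorical variables such as sex by counts and percentages. The sketch below uses only the Python standard library; the function names are hypothetical, and in practice the same summaries would come from SPSS or Excel.

```python
from collections import Counter
from statistics import mean, median, stdev


def describe_continuous(values):
    """Summarise a continuous variable (needs at least two values for the SD)."""
    return {
        "n": len(values),
        "mean": mean(values),
        "sd": stdev(values),      # sample standard deviation (n - 1 denominator)
        "median": median(values),
        "min": min(values),
        "max": max(values),
    }


def describe_categorical(values):
    """Return category -> (count, percentage) for a categorical variable."""
    counts = Counter(values)
    total = len(values)
    return {category: (n, 100 * n / total) for category, n in counts.items()}
```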

Bivariate analysis (between two variables)
- Comparisons between 2 or more groups
- The relationship between the outcome variable and each independent variable
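Two common bivariate statistics are the Pearson correlation coefficient (relationship between two continuous variables) and a two-sample t statistic (comparison of means between two groups). The sketch below computes both by hand; the choice of these two statistics, and of the Welch form of the t statistic, is an assumption made for illustration and not something the slides prescribe. P-values are not computed here and would come from statistical software.

```python
from math import sqrt
from statistics import mean, variance


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def welch_t(group_a, group_b):
    """Welch two-sample t statistic (does not assume equal variances)."""
    standard_error = sqrt(variance(group_a) / len(group_a)
                          + variance(group_b) / len(group_b))
    return (mean(group_a) - mean(group_b)) / standard_error
```

For the iron-pot hypothesis, for instance, `welch_t` would be applied to the iron status values of the two groups of children.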

Multivariable analysis (1 outcome variable + 2 or more independent variables)
- General linear model
- Logistic regression model
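To make the idea of one outcome with two or more independent variables concrete, the sketch below fits an ordinary least-squares linear model by solving the normal equations. It is a teaching illustration only, written with the standard library; the function names are hypothetical, it covers the linear model and not logistic regression, and it reports coefficients without standard errors or p-values, which SPSS would provide.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    aug = [list(map(float, row)) + [float(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        # Use the row with the largest entry in this column as the pivot.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("predictors are collinear; the model cannot be fitted")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def fit_linear_model(predictors, outcome):
    """Least-squares fit of outcome = b0 + b1*x1 + b2*x2 + ...

    predictors: one row per participant, one value per independent variable.
    Returns [intercept, b1, b2, ...].
    """
    design = [[1.0] + list(row) for row in predictors]  # leading 1 = intercept
    k = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(design, outcome)) for i in range(k)]
    return solve_linear_system(xtx, xty)
```

Each coefficient is read as the change in the outcome per unit change in that independent variable, holding the other independent variables constant.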