Research Methodology for Nursing Students: Data Analysis
TRANSCRIPT
Department of Community Medicine and Public Health Universiti Malaysia Sarawak
Data Analysis
Dr Helmy Hazmi MD MComMed (Epid & Biostats)
The p value
Indicates statistical significance.
P < 0.05 is significant.
P ≥ 0.05 is not significant.
Cut-off point: 0.05
Newbie’s meaning of significant
A significant association between 2 categorical variables: chi square test.
A significant difference in mean between 2 groups: independent t test.
A significant difference in mean before and after an intervention: paired t test.
A significant correlation between 2 numerical variables: correlation test.
Hypothesis
An assumption.
A thesis
A prediction
For further investigation.
Use statistics to reject or accept (fail to reject) the null hypothesis.
Two types: null hypothesis and alternative hypothesis.
Null hypothesis, H0
States that:
There is no association between the variables of interest.
There is no difference in mean between the groups of interest.
Research examples:
There is no association between smoking and lung cancer.
There is no difference in mean BMI between the girls and boys in UNIMAS.
Alternative hypothesis, Ha
States that:
There is a significant association between the variables of interest.
There is a significant difference in mean between the groups of interest.
Research examples:
There is a significant association between smoking and lung cancer.
There is a significant difference in mean BMI between boys and girls in UNIMAS.
What does significant really mean?
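5% (α): the amount of false alarms allowed. When there is truly no relationship, a result this extreme still occurs at random / by chance 5% of the time.
95%: when there is truly no relationship, the study correctly finds none 95% of the time.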
Decision based on sample (your study) | Truth about the population: H0 TRUE (accept null hypothesis) | Truth about the population: Ha TRUE (reject null hypothesis)
Accept H0 | Correct decision | Type 2 error, β
Reject H0 | Type 1 error, α (misdiagnosis / false alarm / false positive) | Correct decision: power (1 – β)
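The 5% false alarm allowance can be illustrated with a small simulation. This is a minimal sketch and not part of the lecture: it assumes a simple two-sided z test with a known standard deviation of 1, and the function name `false_alarm_rate`, the sample size and the seed are hypothetical choices. Every simulated study is drawn from a population where H0 is true, so every "significant" result is a type 1 error.

```python
import random
from statistics import NormalDist, mean

def false_alarm_rate(n_studies=2000, n=30, alpha=0.05, seed=1):
    """Share of simulated studies that reject H0 although H0 is true."""
    rng = random.Random(seed)
    standard_normal = NormalDist()
    false_alarms = 0
    for _ in range(n_studies):
        # H0 is true here: the population mean really is 0 (SD assumed to be 1).
        sample = [rng.gauss(0, 1) for _ in range(n)]
        z = mean(sample) * n ** 0.5
        p_value = 2 * (1 - standard_normal.cdf(abs(z)))
        if p_value < alpha:
            false_alarms += 1  # a type 1 error
    return false_alarms / n_studies

print(false_alarm_rate())
```

With α = 0.05 the printed proportion should land close to 0.05, which is the "false alarms allowed" figure on this slide.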
The p-value
Compared against α.
α is the allowable type 1 error.
The critical value of α is 5% (0.05).
0.05 is the cut-off point.
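The cut-off rule can be written as a single comparison. A minimal sketch: the function name `decide` is hypothetical, and the three p values are the ones used as examples in this lecture.

```python
ALPHA = 0.05  # critical value of alpha (allowable type 1 error)

def decide(p_value, alpha=ALPHA):
    """Compare a p value with the cut-off point alpha."""
    if p_value < alpha:
        return "significant: reject the null hypothesis"
    return "not significant: do not reject the null hypothesis"

for p in (0.02, 0.08, 0.03):
    print(p, "->", decide(p))
```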
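If the p value is 0.02
If the null hypothesis were true, a result at least this extreme would occur by chance 2% of the time.
The risk of a false alarm (type 1 error) is judged against the critical limit of 5%.
2% is below the critical limit of 5%.
Note: this is not a 98% probability that the relationship is real.
Significant relationship.
Reject null hypothesis.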
If the p value is 0.08
If the null hypothesis were true, a result at least this extreme would occur by chance 8% of the time.
8% is above the critical limit of 5%.
Relationship not significant.
Accept (fail to reject) the null hypothesis.
Between smoking and lung cancer, p = 0.03
If there were truly no association, a result at least this extreme would occur by chance 3% of the time.
3% is below the cut-off of 5% (0.05).
Relationship is significant.
Reject null hypothesis.
Although there is an association between smoking and lung cancer, we do not know the direction yet.
Does smoking cause lung cancer? What are the odds of smokers getting lung cancer?
Further analysis needed.
Univariate statistical test
Relationship between an INDEPENDENT variable and a DEPENDENT variable.
Other possible concurrent factors are omitted.
Smoking and lung cancer?
HPV infection and cervical cancer?
Knowledge and practice?
Sleep hours and exam score?
A guide to choosing a statistical test
Variable 1 | Variable 2 | Statistical test
Categorical | Categorical | Chi square test
Categorical (2 levels) | Numerical | Independent t test
Numerical | Numerical | Correlation test

Pre intervention | Post intervention | Statistical test
Categorical | Categorical | Chi square test
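The guide can also be read as a lookup. A minimal sketch that covers only the four tests taught in this lecture; the function name `choose_test` and the labels "categorical" and "numerical" are hypothetical, and any combination the lecture does not cover returns None.

```python
def choose_test(variable_1, variable_2, paired=False):
    """Suggest a test from the types of two variables."""
    pair = tuple(sorted((variable_1, variable_2)))
    if paired:
        # Before and after measurements of a numerical variable.
        if pair == ("numerical", "numerical"):
            return "Paired t test"
        return None
    guide = {
        ("categorical", "categorical"): "Chi square test",
        # The categorical variable must have exactly 2 levels.
        ("categorical", "numerical"): "Independent t test",
        ("numerical", "numerical"): "Correlation test",
    }
    return guide.get(pair)

print(choose_test("categorical", "numerical"))
print(choose_test("numerical", "numerical", paired=True))
```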
Chi square test
Measures association between 2 categorical variables.
Traditionally uses a 2 by 2 table.
Larger tables are possible, but risk small cell counts.

This is a 2 by 2 table:
 | Lung cancer YES | Lung cancer NO
Smoking YES | A | B
Smoking NO | C | D

This is a 3 by 2 table:
 | Lung cancer YES | Lung cancer NO
Ex smoker | A | B
Currently smoking | C | D
Never smoked | E | F
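For a 2 by 2 table the chi square value can be worked out by hand from the cells A, B, C and D. A minimal sketch, assuming the Pearson chi square without continuity correction and 1 degree of freedom; the function name `chi_square_2x2` and the counts are hypothetical, not data from the lecture.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi square and p value for the table [[A, B], [C, D]]."""
    observed = [[a, b], [c, d]]
    row_totals = [a + b, c + d]
    column_totals = [a + c, b + d]
    n = a + b + c + d
    chi_square = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * column_totals[j] / n
            chi_square += (observed[i][j] - expected) ** 2 / expected
    # With 1 degree of freedom, P(chi square > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi_square / 2))
    return chi_square, p_value

# Hypothetical counts: smokers 10 with and 20 without lung cancer,
# non-smokers 20 with and 10 without.
chi_square, p_value = chi_square_2x2(10, 20, 20, 10)
print(round(chi_square, 3), round(p_value, 4))
```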
Assumptions for chi square test
Categorical variables only.
Samples are random.
Categories are mutually exclusive.
No more than 20% of cells should have an expected count of less than 5.
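The expected count assumption can be checked before running the test. A minimal sketch: the function names and the example table are hypothetical, and the expected count of a cell is taken as row total × column total / grand total.

```python
def expected_counts(table):
    """Expected cell counts for a table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    column_totals = [sum(column) for column in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in column_totals] for r in row_totals]

def share_of_small_cells(table, threshold=5):
    """Proportion of cells whose expected count is below the threshold."""
    cells = [value for row in expected_counts(table) for value in row]
    return sum(value < threshold for value in cells) / len(cells)

# Hypothetical 2 by 2 table with a small first column.
table = [[3, 7], [2, 28]]
share = share_of_small_cells(table)
print(expected_counts(table))
print("assumption met" if share <= 0.20 else "assumption not met")
```

Here two of the four expected counts are below 5, so the assumption is not met for this table.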
Go to http://publichealthunimas.blogspot.com/
Open the “download centre”.
Download the data file (.sav).
Chi square steps
Analyse: gender vs hypertension status
Analyse > Descriptive Statistics > Crosstabs
Enter the independent variable in the row | enter the dependent variable in the column
Statistics > check [chi square] > Continue
Cells > check [counts] [percentages] > Continue
OK
The chi square value is read here
The p value is read here.
One of the assumptions is checked here. If more than 20% of cells have an expected count of less than 5, perform Fisher's exact test and read the p value from the Fisher's exact test row.
Interpretation
The p value is 0.521.
This is more than 0.05.
There is no significant association between gender and hypertension status.
Independent t test
Categorical vs numerical variables.
The categorical variable must have 2 levels only.
Compares the mean between the 2 levels.
Examples:
Compare mean BMI between males and females.
Compare mean blood glucose (mmol/L) between patients attending PK Mosque Rd and PK Tanah Puteh.
Compare mean age between high income and low income earners.
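A comparison of means like the examples above can be sketched by hand. This is a minimal sketch and not the SPSS procedure: it assumes equal variances (the pooled Student t test), obtains the two-sided p value by numerically integrating the t density, and uses hypothetical function names and hypothetical BMI values.

```python
import math
from statistics import mean, variance

def t_two_sided_p(t, df, steps=2000):
    """Two-sided p value for a t statistic, by Simpson's rule on the t density."""
    log_coef = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    coef = math.exp(log_coef) / math.sqrt(df * math.pi)

    def density(x):
        return coef * (1 + x * x / df) ** (-(df + 1) / 2)

    upper = abs(t)
    h = upper / steps  # steps must be even for Simpson's rule
    area = density(0) + density(upper)
    for i in range(1, steps):
        area += (4 if i % 2 else 2) * density(i * h)
    area *= h / 3  # area under the density between 0 and |t|
    return max(0.0, 1 - 2 * area)

def independent_t_test(group_a, group_b):
    """Pooled-variance t test: returns (t, degrees of freedom, p value)."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    pooled = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / df
    standard_error = math.sqrt(pooled * (1 / n_a + 1 / n_b))
    t = (mean(group_a) - mean(group_b)) / standard_error
    return t, df, t_two_sided_p(t, df)

# Hypothetical BMI values for two mutually exclusive groups.
bmi_males = [24.1, 26.3, 22.8, 27.5, 25.0, 23.9]
bmi_females = [22.0, 23.4, 21.7, 24.8, 22.9, 23.1]
t, df, p_value = independent_t_test(bmi_males, bmi_females)
print(f"t = {t:.2f}, df = {df}, p = {p_value:.3f}")
```

In practice the p value is read from the statistical software output; the integration here only stands in for that step so the sketch needs no extra libraries.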
Assumptions of independent t test
Samples are random.
The 2 levels are mutually exclusive.
The numerical variable is normally distributed.
The dependent variable must be numerical.
Homogeneity of variance.
Homogeneity of variance
Homogeneous = same
Heterogeneous = different
Homogeneity of variance: the variances are the same.
Variance, $s^2 = (\text{standard deviation})^2$
Homogeneity of variance
Population 1: $s^2 = 100$
Population 2: $s^2 = 200$
Mean 1 = mean 2 = BMI 23.0
Population 2 has more people far from BMI 23.0.
The variances of the two populations are heterogeneous.
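Two groups can share a mean and still differ in variance. A minimal sketch with hypothetical BMI values; the variance ratio printed at the end is only a rough look at the assumption, since SPSS reports a formal test (Levene's test) in the t test output.

```python
from statistics import mean, stdev, variance

# Hypothetical BMI values: same mean, different spread.
group_1 = [21, 22, 23, 24, 25]
group_2 = [19, 21, 23, 25, 27]

for name, values in (("Group 1", group_1), ("Group 2", group_2)):
    # The variance is the square of the standard deviation.
    print(name, "mean =", mean(values),
          "variance =", variance(values),
          "SD =", round(stdev(values), 2))

variance_ratio = variance(group_2) / variance(group_1)
print("variance ratio =", variance_ratio)
```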
Independent t test steps
Analyse: gender vs age
Analyse > Compare Means > Independent-Samples T Test
Insert the numerical variable in the test variable box
Insert the categorical variable in the grouping variable box
Define groups: enter the codes of the 2 groups > Continue > OK
Know your coding beforehand.
The assumption of equality of variance is read here.
The 95% confidence interval is read here.
The p-value is read here.
Interpretation
Mean age for males = 49.6
Mean age for females = 46.5
The p-value is 0.049, which is lower than 0.05.
There is a significant difference in mean age between males and females.
Paired t test
Almost like the independent t test.
The two groups are related, not mutually exclusive.
Common example: mean BMI before and after exercise.
Assumptions of Paired t test
Samples are random
The two groups are related
Numerical dependent variable
Observed differences are normally distributed.
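The paired t test works on the differences within each pair, which is why the normality assumption applies to the differences. A minimal sketch with hypothetical BMI values and a hypothetical function name; it stops at the t statistic and its degrees of freedom (n – 1), and the p value is then read from the software output.

```python
import math
from statistics import mean, stdev

def paired_t_statistic(before, after):
    """Return (differences, mean difference, t statistic, degrees of freedom)."""
    differences = [b - a for b, a in zip(before, after)]
    n = len(differences)
    mean_difference = mean(differences)
    standard_error = stdev(differences) / math.sqrt(n)
    return differences, mean_difference, mean_difference / standard_error, n - 1

# Hypothetical BMI of the same 6 people before and after an intervention.
bmi_pre = [26.1, 25.4, 27.0, 24.8, 26.5, 25.9]
bmi_post = [24.9, 24.6, 25.1, 24.5, 25.0, 24.2]
differences, mean_difference, t, df = paired_t_statistic(bmi_pre, bmi_post)
print(f"mean difference = {mean_difference:.2f}, t = {t:.2f}, df = {df}")
```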
What “related” means
Not mutually exclusive.
One measurement affects the other.
“Before and after” tests.
Husband and wife pairs.
Notice how the data are entered
Steps
Analyse: BMI pre intervention vs BMI post intervention
Analyse > Compare Means > Paired-Samples T Test
Insert BMI pre intervention as Variable 1 and BMI post intervention as Variable 2 of the pair
OK
Check the assumption
Transform > Compute Variable
Compute the difference: “BMI pre – BMI post”
Create the distribution curve of the difference.
Observe whether it looks normal.
Analyse > Descriptive Statistics > Explore
The differences are normally distributed.
The p-value is read here.
Interpretation
The mean BMI before intervention: 25.7
The mean BMI after intervention: 23.8
P-value = 0.003, which is less than 0.05.
There is a significant drop in mean BMI after the exercise intervention.
Correlation
Numerical vs numerical.
Measures the strength of the relationship.
Measures the direction of the correlation.
Uses the correlation coefficient.
Reports a p value too.
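The correlation coefficient carries both pieces of information: its sign gives the direction and its size gives the strength. A minimal sketch of the Pearson coefficient computed by hand; the function name `pearson_r` and the weight and height values are hypothetical.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two numerical variables."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    return s_xy / math.sqrt(s_xx * s_yy)

# Hypothetical measurements for 6 people.
weight_kg = [52, 60, 58, 71, 65, 80]
height_cm = [155, 162, 158, 170, 171, 176]
r = pearson_r(weight_kg, height_cm)
print("direction:", "positive" if r > 0 else "negative", "| r =", round(r, 3))
```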
Correlation coefficient magnitude
Scatter plot panels: positive correlation, negative correlation, no correlation.
Strength of correlation
Pearson correlation coefficient, r (ignore the sign):
0.00 – 0.30: Negligible
0.31 – 0.50: Weak
0.51 – 0.80: Moderate
0.81 – 1.00: Strong
1.00: Perfect correlation
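The strength bands can be applied mechanically. A minimal sketch: the function name `correlation_strength` is hypothetical, the sign of r is ignored, and values that fall between two listed bands (such as 0.305) are assigned to the upper band, which is an assumption rather than something the slide states.

```python
def correlation_strength(r):
    """Label the strength of a Pearson coefficient using the bands above."""
    size = abs(r)
    if size > 1:
        raise ValueError("r must lie between -1 and 1")
    if size == 1:
        return "Perfect"
    if size > 0.80:
        return "Strong"
    if size > 0.50:
        return "Moderate"
    if size > 0.30:
        return "Weak"
    return "Negligible"

for r in (0.9, 0.7, 0.3, -0.45):
    print(r, "->", correlation_strength(r))
```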
$r^2$ (extra information)
The r square.
Coefficient of determination.
Tells how well the data fit the statistical model.
Take its square root to get the size of the correlation coefficient.
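The link between r and r square is simple arithmetic. A short sketch using r = 0.545, the coefficient reported for weight and height in the interpretation slide of this lecture; reading r square as the share of variation explained is the usual interpretation for a simple linear relationship.

```python
import math

r = 0.545                 # Pearson coefficient between weight and height
r_squared = r ** 2        # coefficient of determination
print(f"r square = {r_squared:.3f}")           # about 0.297
print(f"square root = {math.sqrt(r_squared):.3f}")  # back to the size of r
```

The square root returns only the size of r; the sign (direction) has to be taken from the correlation coefficient itself.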
Relationship between r and p value
Scatter plot panels: r = 0.9, r = 0.7, r = 0.3.
Correlation assumptions
Random samples.
Both variables are normally distributed.
Mutually exclusive
Steps:
Analyse: correlation between weight and height
Analyse > Correlate > Bivariate
Enter the variables of interest
OK
Read Pearson correlation coefficient, r, here
Read the p-value here.
Interpretation
The Pearson correlation coefficient between weight and height is 0.545.
The p-value is < 0.001, which is less than 0.05.
There is a significant, moderate positive correlation between weight and height.
For further understanding of your data
How to present in table
Descriptive
Chi square test
Independent t test
Paired t test
Correlation
Note: there should be “df” after chi square in the table header: Chi square (df)
Independent t test and correlation