Post on 02-Aug-2020


INTRODUCTION TO SPSS FOR STATISTICAL ANALYSIS

ARTHUR MARQUES

MATTY JULLAMON

2

• Workshop materials

https://guides.library.ubc.ca/library_research_commons/rworkshop

• Save the example data in the “Desktop” folder (to find the data easily)

Set up

3

• Become Familiar with the SPSS environment

• Learn how to prepare and manage data in SPSS

• Learn how to perform descriptive statistics and inferential statistics using SPSS

Learning Objectives

4

Overview of quantitative research

Research purpose → Research design → Data collection → Statistical analysis → Conclusion

Statistical analysis:

- Manipulating data

- Analyzing data

Software programs: SPSS, R, STATA, SAS, Matlab, Mplus, HLM7, …

5

Software for statistical analysis

“User-friendly”

“Common & basic functions”

Might not be flexible for complex data & analysis

“Active online community”

6

SPSS environment

7

SPSS environment is composed of 3 main windows:

• Data Editor window

(Data View + Variable View)

• Output window

• Syntax window

SPSS environment

8

• Data Editor - Data View: presents the whole “data”

SPSS environment

9

• Data Editor - Variable View: presents information on all “variables”

SPSS environment

10

• Data Editor - Variable View: “Key” information of variables

SPSS environment

Name: variable name

Decimals: decimal points of data

E.g. 0 decimal point → 1

1 decimal point → 1.0

Label: description of variable

11

• Data Editor - Variable View: “Key” information of variables

SPSS environment

Values: meaning of values in variable

E.g. 0 = male, 1 = female

Missing: numbers indicating missing data

12

• Data Editor - Variable View: “Key” information of variables

SPSS environment

Measure: type of data in variable

- Scale = continuous data

- Nominal = categorical data (unordered categories)

- Ordinal = ordinal data (ordered categories)

For more information about the scale, nominal, and ordinal options, see

https://stats.idre.ucla.edu/other/mult-pkg/whatstat/what-is-the-difference-between-categorical-ordinal-and-interval-variables/

13

• Drop-down menu in Data Editor:

SPSS environment

14

• Output: presents the “history of your analysis” and all “outputs”

SPSS environment

15

• Syntax: write “syntax”

SPSS environment

16

Data preparation in SPSS

17

• Cross-sectional design:

A set of variables measured from each person at one time point

• A set of variables:

- Gender (Male = 0, Female = 1)

- Age (range 10 – 80)

- Marital status (Married, common law = 1, Widow, divorce, separate = 2, Single, never married = 3)

- Employment (no job = 1, part time = 2, full time = 3)

- Quality of life_total (range 0 – 20)

- Distress_total (range 0 – 20)

- Self-esteem items (range 0 – 3)

• Missing data are coded as 999

Data preparation with SPSS using example data
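Outside SPSS, the same codebook can be written down explicitly. The following is a minimal Python sketch and not part of the workshop materials: the dictionary keys and the function names `clean_row` and `label` are hypothetical, and the sample record is made up. Only the value labels and the missing-data code 999 come from the variable list above.

```python
# Value labels taken from the variable list above; the key names are hypothetical.
VALUE_LABELS = {
    "gender": {0: "Male", 1: "Female"},
    "marital": {
        1: "Married, common law",
        2: "Widow, divorce, separate",
        3: "Single, never married",
    },
    "employment": {1: "No job", 2: "Part time", 3: "Full time"},
}
MISSING_CODE = 999  # missing data are coded as 999 in the example data


def clean_row(row):
    """Return a copy of one respondent's record with 999 replaced by None."""
    return {
        name: (None if value == MISSING_CODE else value)
        for name, value in row.items()
    }


def label(variable, value):
    """Look up the text label for a coded value; a missing value stays None."""
    if value is None:
        return None
    return VALUE_LABELS[variable][value]


# Illustrative record (made up, not taken from the workshop data set).
raw = {"gender": 1, "age": 999, "marital": 3, "employment": 2}
row = clean_row(raw)
print(row["age"], label("employment", row["employment"]))
```

Declaring 999 as missing before any analysis keeps it from being treated as a real age or score.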

18

• Situation 1 - Open data in “SPSS format” (.sav):

• Situation 2 - Import data in “different formats” (excel, text…):

File > Open > Data…

Data import/entry

19

• Enter your data in SPSS Data Editor – Data View:

Data import/entry

20


Checking information of variables in Variable View

E.g., Meaning of values (click it)

21

• Employment

o No job = 1

o Part time = 2

o Full time = 3

Variable View > Values > Add

Editing value label


22

Data management in SPSS

23

• Make modifications to your raw data

• Common data management tasks:

1. Merging the categories

2. Changing string to numeric data

3. Computing a new summary variable

Data management

24

• Recode function in SPSS

• Used for “Merging the categories” & “Changing string to numeric”

• Example

“Employment” with 3 categories is recoded into “Employment_new” with 2 categories:

category “no job” → “no job”

category “part time” → “having job”

category “full time” → “having job”

Data management
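The recode of “Employment” into two categories can be sketched in Python as a simple lookup. This is an illustration under stated assumptions rather than what SPSS runs internally: the old codes (1, 2, 3) and the missing code 999 come from the slides, while the new codes 0 and 1, the name `recode_employment`, and the sample values are hypothetical.

```python
MISSING_CODE = 999  # missing data code used in the example data

# Old codes: no job = 1, part time = 2, full time = 3 (from the slides).
# New codes 0 and 1 are an assumption; the slides give only the new labels.
RECODE_MAP = {1: 0, 2: 1, 3: 1}
NEW_LABELS = {0: "no job", 1: "having job"}


def recode_employment(value):
    """Map a 3-category employment code to the 2-category version."""
    if value is None or value == MISSING_CODE:
        return None  # keep missing values missing
    return RECODE_MAP[value]


# Illustrative values (made up).
employment = [1, 2, 3, 999, 2]
employment_new = [recode_employment(v) for v in employment]
print(employment_new)
```

Writing the result to a new list mirrors “Recode into different variables”: the original variable is left untouched.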

25

• Transform > Recode into different variables

Data management

26

• Compute function in SPSS

• Used for “computing a new summary variable”

• Example

“Esteem_Q1” ~ “Esteem_Q10” are summed up into “Esteem_total”:

Esteem_Q1 + … + Esteem_Q10 → Total score of Esteem

Data management
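As a rough Python sketch of the same summary score: the item names follow the slide, but the function name `esteem_total`, the sample answers, and the choice to return no total when any item is missing are assumptions made for illustration.

```python
MISSING_CODE = 999  # missing data code used in the example data
ITEMS = [f"Esteem_Q{i}" for i in range(1, 11)]  # Esteem_Q1 ... Esteem_Q10


def esteem_total(row):
    """Sum the ten self-esteem items; return None if any item is missing."""
    scores = [row[item] for item in ITEMS]
    if any(s is None or s == MISSING_CODE for s in scores):
        return None
    return sum(scores)


# Illustrative answers (made up); each item ranges from 0 to 3.
answers = [3, 2, 1, 0, 3, 2, 1, 0, 3, 2]
row = dict(zip(ITEMS, answers))
print(esteem_total(row))
```

Returning no total for incomplete responses is only one possible rule. Whatever rule is used, 999 must never be added into the score.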

27

• Transform > Compute Variable…

Data management

28

Descriptive statistics in SPSS

29

• Descriptive statistics provide a summary of your data

• Purpose of looking at descriptive statistics:

(1) Check whether valid data are loaded properly

E.g., unexpected values (e.g., 999, -2) in “Age” variable (range 10-80)

(2) Explore data

E.g., potential group differences, associations between variables

(3) Sample description

E.g., % of gender, mean and standard deviation of quality of life score

Descriptive statistics
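The first and third purposes above can be sketched with Python's standard library. The ages and gender codes below are made up for illustration, and the variable names are hypothetical. Only the valid age range (10 to 80) and the gender coding (0 = male, 1 = female) come from the slides.

```python
from collections import Counter
from statistics import mean, stdev

# Illustrative data (made up); 999 and -2 play the role of unexpected values.
ages = [25, 40, 999, 33, -2, 61]
gender = [0, 1, 1, 0, 1, 1]  # 0 = male, 1 = female

# (1) Check for values outside the valid range of Age (10 to 80).
valid_ages = [a for a in ages if 10 <= a <= 80]
flagged = [a for a in ages if not 10 <= a <= 80]

# (3) Sample description: percentages for categorical data,
#     mean and standard deviation for continuous data.
counts = Counter(gender)
percent = {code: 100 * n / len(gender) for code, n in counts.items()}

print("flagged ages:", flagged)
print("mean age:", mean(valid_ages), "SD:", round(stdev(valid_ages), 2))
print("gender %:", percent)
```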

30

Descriptive statistics in SPSS:

Descriptive statistics

31

• Frequencies for “categorical data”

Descriptive statistics - Frequencies

32

• Frequencies for “categorical data” – Descriptive statistics

Descriptive statistics - Frequencies

33

• Frequencies for “categorical data” - Bar plots

Descriptive statistics - Frequencies

[Bar plots of frequency: Employment (No job, Part-time, Full-time); Marital status (Married, Widow, Single); Gender (Male, Female)]

34

• Frequencies for “continuous data”

Descriptive statistics - Frequencies

35

• Frequencies for “continuous data” – Descriptive statistics

Descriptive statistics - Frequencies

36

• Frequencies for “continuous data” – Histograms

Descriptive statistics - Frequencies

[Histograms of frequency against the values of “Age”, “Quality of life”, and “Distress”]

37

• Scatter/Dot plots: Graphs > Legacy Dialogs > Scatter/Dot…

→ Useful to explore associations between variables

Descriptive statistics - Graphs

38

• Scatter plots: output

Descriptive statistics - Graphs

39

Inferential statistics in SPSS

40

Inferential statistics in SPSS

• Inferential statistics we are covering today…

For group comparisons:

• Independent sample T-test

• One-way ANOVA

For association:

• Pearson correlation

41

Independent sample T-test

• Independent T-test compares means between two groups

• It is often used to see whether there is a group difference in continuous data between two groups (e.g., gender, treatment vs. control)

• Example

[Figure: example scores 8 7 5 4 11 3 9 8 7 13 11 10 13 11 15 10 17 12, shown for two groups: Males and Females]

• Model assumptions

(1) Independence, (2) Normality, (3) Equal variance
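To show what the test computes, here is a minimal Python sketch of the equal-variance (pooled) t statistic. It is not SPSS output. The 18 example scores above are split into the first nine for males and the last nine for females, which is an assumption because the slide layout was lost. The function name `pooled_t` is hypothetical, and the p-value is left out because the standard library has no t distribution.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t(group_a, group_b):
    """Return (t, df) for an independent-samples t-test assuming equal variances."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    # Pooled variance: the two sample variances weighted by their degrees of freedom.
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / df
    standard_error = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    t = (mean(group_a) - mean(group_b)) / standard_error
    return t, df


# Example scores from the slide; the 9/9 split between groups is an assumption.
males = [8, 7, 5, 4, 11, 3, 9, 8, 7]
females = [13, 11, 10, 13, 11, 15, 10, 17, 12]

t, df = pooled_t(males, females)
print(f"t({df}) = {t:.3f}")
```

The degrees of freedom are the total sample size minus two.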

42

Analyze > Compare Means > Independent-Samples T Test…

Independent sample T-test

43

• Output: Test for equal variance assumption

• Conclusion:

Variances of the male group and the female group are not significantly different

Note. Given alpha level = 0.05

Independent sample T-test

44

• Output: Results of independent T-test

• Conclusion:

There was no statistically significant difference in level of quality of

life between males and females, t(198) = -1.738, p = 0.084.

Note. Given alpha level = 0.05

Independent sample T-test

45

Independent sample one-way ANOVA

• Independent sample one-way ANOVA compares means between

more than two groups

• It is often used to see whether there are group differences in

continuous data between more than two groups

• Example

[Figure: example scores 8 7 5 4 11 3 9 8 7 13 11 10 13 11 15 10 17 12, shown for three groups: Married, Widow/Sep, and Single]

• Model assumptions

(1) Independence, (2) Normality, (3) Equal variance
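The omnibus F statistic compares the variability between group means with the variability within groups. The Python sketch below is illustrative only and is not SPSS output. The 18 example scores are split into three groups of six in the order shown, which is an assumption, and the function name `one_way_anova` is hypothetical.

```python
from statistics import mean


def one_way_anova(*groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    scores = [x for g in groups for x in g]
    grand_mean = mean(scores)
    # Between-groups sum of squares: group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-groups sum of squares: scores around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(scores) - len(groups)
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within


# Example scores from the slide; the 6/6/6 split between groups is an assumption.
married = [8, 7, 5, 4, 11, 3]
widow_sep = [9, 8, 7, 13, 11, 10]
single = [13, 11, 15, 10, 17, 12]

f, df_b, df_w = one_way_anova(married, widow_sep, single)
print(f"F({df_b}, {df_w}) = {f:.3f}")
```

The first degrees-of-freedom value is the number of groups minus one. The second is the total sample size minus the number of groups.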

46

• Analyze > Compare Means > One-Way ANOVA …

Independent sample one-way ANOVA

47

• Output: Test for equal variance assumption

• Conclusion:

Variances of the married, widow, and single groups are not significantly different

Note. Given alpha level = 0.05

Independent sample one-way ANOVA

48

• Output: Overall group difference (omnibus test results)

• Conclusion:

There were statistically significant group differences in level of quality of life between marital status groups, F(2, 197) = 19.827, p < 0.001.

Note. Given alpha level = 0.05

Independent sample one-way ANOVA

49

• Output: Which groups differ? (post hoc test results)

• Conclusion:

The level of quality of life for the married group was significantly higher than for the widow group (p < 0.001).

The single group showed a significantly higher level of quality of life than the widow group (p < 0.001).

Independent sample one-way ANOVA

50

Pearson’s correlation

• Pearson’s correlation is used to examine associations between

variables (represented by continuous data) by looking at the

direction and strength of the associations

• Example

[Diagram: pairwise associations (association?) among Distress, Quality of life, and Self-esteem]

• Checking outliers

→ “Graphs > Legacy Dialogs > Scatter/Dot…”
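Direction and strength are both read from the coefficient r: the sign gives the direction and the absolute value gives the strength. The Python sketch below computes r from its definition. The scores are made up for illustration and are not the workshop data, and the function name `pearson_r` is hypothetical.

```python
from math import sqrt
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    mean_x, mean_y = mean(x), mean(y)
    # Sum of cross-products and sums of squares around the means.
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    return s_xy / sqrt(s_xx * s_yy)


# Illustrative scores (made up): higher distress paired with lower quality of life.
distress = [2, 5, 8, 11, 14]
quality_of_life = [18, 15, 11, 9, 4]

r = pearson_r(distress, quality_of_life)
print(f"r = {r:.3f}")
```

A single extreme point can change r noticeably, which is why the scatter plot check comes first.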

51

• Analyze > Correlate > Bivariate

Pearson’s correlation

52

• Output:

• Conclusion:

There were statistically significant negative correlations between quality of life and distress (r = -0.708, p < 0.001) and between self-esteem and distress (r = -0.685, p < 0.001).

There was a statistically significant positive correlation between quality of life and self-esteem (r = 0.660, p < 0.001).

Pearson’s correlation

53

EXERCISE

Does the level of distress significantly differ by employment group (no

job, part-time, full-time)?

• What statistical analysis should we use?

• What’s the DV?

• What’s the IV?

54

Ordinary least squares linear regression (for your reference)

• Ordinary least squares (OLS) linear regression is used to explain/predict the phenomenon of interest (continuous data)

• Example

Simple OLS regression: IV 1 (Distress level) → Explain/Predict? → Dependent V (Quality of life)

Multiple OLS regression: IV 1 (Distress level), IV 2 (Self-esteem), IV 3 (Gender) → Explain/Predict? → Dependent V (Quality of life)

• Model assumptions

(1) Independence, (2) Normality, (3) Equal variance, (4) Linearity

55

• Analyze > Regression > Linear …

Ordinary least squares linear regression (for your reference)

Distress, Esteem_total, Gender, Age

56

• Output

• Conclusion:

Approximately 55% of the variability in quality of life was explained by the variables in the regression model.

The overall regression model significantly explained quality of life.

Ordinary least squares linear regression (for your reference)

57

• Output:

• Conclusion:

Distress and self-esteem significantly predicted the level of quality of

life.

We would expect a 0.416-point decrease in quality of life for every one-point increase in distress, assuming all the other variables are held constant.

Ordinary least squares linear regression (for your reference)

Y = 12.041 + (-0.416)(Distress) + 0.221(Esteem) + 0.564(Gender)
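The fitted equation can be used directly to compute a predicted score. In the Python sketch below, the coefficients are copied from the equation above. The function name `predict_quality_of_life` and the example respondent are hypothetical, and gender is assumed to follow the coding used earlier (0 = male, 1 = female).

```python
# Coefficients copied from the fitted equation above.
INTERCEPT = 12.041
B_DISTRESS = -0.416
B_ESTEEM = 0.221
B_GENDER = 0.564


def predict_quality_of_life(distress, esteem, gender):
    """Predicted quality of life from the fitted regression equation."""
    return (
        INTERCEPT
        + B_DISTRESS * distress
        + B_ESTEEM * esteem
        + B_GENDER * gender
    )


# Illustrative respondent (made up): distress 10, self-esteem total 20, female.
base = predict_quality_of_life(distress=10, esteem=20, gender=1)
one_more = predict_quality_of_life(distress=11, esteem=20, gender=1)

print(f"predicted quality of life: {base:.3f}")
print(f"change for one more point of distress: {one_more - base:.3f}")
```

Raising distress by one point while holding the other inputs fixed changes the prediction by exactly the distress coefficient.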

58

• SPSS environment

• Data preparation in SPSS

• Data management in SPSS

• Descriptive statistics in SPSS

• Inferential statistics in SPSS

→ Try your own quantitative analysis in SPSS!

Summary

59

RESEARCH COMMONS: AN INTERDISCIPLINARY RESEARCH-DRIVEN LEARNING ENVIRONMENT

• Literature review

• Systematic Review Search Methodology

• Citation Management

• Thesis Formatting

• NVivo Software Support

• SPSS Software Support

• R Group

• Multi-Disciplinary Graduate Student Writing Group

60

SPSS SERVICES BY RESEARCH COMMONS

• Workshops

• One-on-one Consultation

Request form to book SPSS consultation:

http://bit.ly/UBCRCconsult

61

THANK YOU!

QUESTIONS, COMMENTS, IDEAS
