office for faculty excellence - piratepanelcore.ecu.edu/ofe/statisticsresearch/spss 2 9 15...

Post on 01-Feb-2020

By Hui Bian

Office for Faculty Excellence

• My office is located at 1001 Joyner Library, room 1006

• Email: bianh@ecu.edu

• Tel: 252-328-5428

• You can download sample data files from: http://core.ecu.edu/ofe/StatisticsResearch/

• Exercise: recode variables Q33, Q43, and Q49 into new variables Q33r, Q43r, and Q49r

–Recode 1 (0 days/times) into 0 = non-use

–Recode 2 or higher (all other categories) into 1 = use

• The coding for the new recoded variables

• Go to Transform > Recode into Different Variables

• Click the Old and New Values button to define the old and new values
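The same recode can also be written as SPSS syntax. The following is a minimal sketch, not taken from the slides. It assumes the new variables are named Q33r, Q43r, and Q49r (the names used in the next exercise); the value labels are illustrative.

```spss
* Missing values stay missing, 1 (0 days/times) becomes 0, higher categories become 1.
RECODE Q33 Q43 Q49 (MISSING=SYSMIS) (1=0) (2 THRU HIGHEST=1) INTO Q33r Q43r Q49r.
VALUE LABELS Q33r Q43r Q49r 0 'Non-use' 1 'Use'.
EXECUTE.
```

Specifications are checked from left to right, so the MISSING specification is listed first to keep any user-missing codes out of the "use" category.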

• Exercise: compute a new variable named Drug_N to assess total number of drugs that adolescents used during the last 30 days.

–Only three drugs are assessed: Q33r, Q43r, and Q49r

–The total number of drugs should be between 0 and 3.

• Go to Transform > Compute Variable

–Target Variable: type Drug_N

–Numeric Expression: SUM(Q33r,Q43r,Q49r)

• Function group: Statistical

• Functions and Special Variables: Sum

–If button: check Include all cases
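In syntax form, the Compute Variable steps above correspond roughly to the sketch below. One general property of the SUM function is worth noting, although the slide does not state it: SUM adds the valid values and returns a result as long as at least one argument is non-missing. The commented-out SUM.3 variant is shown only as an optional stricter alternative.

```spss
* Count how many of the three drugs were used (0 to 3).
COMPUTE Drug_N = SUM(Q33r, Q43r, Q49r).
EXECUTE.

* Stricter alternative: require all three values to be valid.
* COMPUTE Drug_N = SUM.3(Q33r, Q43r, Q49r).
```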

• Recode variables: convert a string variable into a numeric variable

– Example: Q2 (Gender From CSV data file) is a string variable. Convert this variable into a numeric variable Q2r with two categories: Female = 1 and Male = 2.

–Go to Transform > Recode into Different Variables

• Click Old and New Values button
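A syntax sketch of the string-to-numeric recode follows. It assumes the strings stored in Q2 are exactly 'Female' and 'Male'; the actual spelling and capitalization in the CSV file are not shown in the slides, so check the data first.

```spss
* String values must match the data exactly, including case.
RECODE Q2 ('Female'=1) ('Male'=2) INTO Q2r.
VALUE LABELS Q2r 1 'Female' 2 'Male'.
EXECUTE.
```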

• Sort cases by variables: Data > Sort Cases

• You can use Sort Cases to find missing values.
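As a sketch, the equivalent syntax for an ascending sort is shown below. Q6 is used only as an illustrative variable; the slide does not say which variable was sorted. In an ascending sort, system-missing values are typically listed first, which is what makes sorting useful for spotting them.

```spss
* Ascending sort on Q6 so that cases with missing Q6 are grouped together.
SORT CASES BY Q6 (A).
```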

• Select cases

–Example: select females for analysis.

–Go to Data > Select Cases

–Under Select: Check If condition is satisfied

–Click If button

– In the blank window type Q2 = 1

–Click Continue, click OK

You should see a new variable, filter_$, in Variable View. Deleting this variable removes the selection.

A slash through a case number marks an unselected case. Unselected cases are excluded from the data analysis.
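The Select Cases dialog creates the filter_$ variable behind the scenes. A sketch of the syntax it corresponds to is shown below; the variable label is illustrative.

```spss
* Keep only cases with Q2 = 1 (female) in subsequent analyses.
USE ALL.
COMPUTE filter_$ = (Q2 = 1).
VARIABLE LABELS filter_$ 'Q2 = 1 (FILTER)'.
FILTER BY filter_$.
EXECUTE.

* Turn the filter off to analyze all cases again.
FILTER OFF.
USE ALL.
```

Filtering does not delete anything: the unselected cases remain in the file and return to the analysis once the filter is turned off.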

• Select cases

–Exercise. Select cases who used any of cigarettes, alcohol, and marijuana during the last 30 days.

–Go to Data > Select Cases

–Check “If condition satisfied”

–Click If button

–Type Q33 > 1 | Q43 > 1 | Q49 > 1, click Continue, click OK.

• If we run Frequencies on Drug_use, we should get frequencies for drug users only
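For this exercise the only thing that changes from the Select Cases example is the condition. A minimal syntax sketch, using the expression typed in the If dialog:

```spss
* Select cases reporting any use of the three substances.
USE ALL.
COMPUTE filter_$ = (Q33 > 1 | Q43 > 1 | Q49 > 1).
FILTER BY filter_$.
EXECUTE.
```

The vertical bar is the logical OR operator, so a case is selected when at least one of the three comparisons is true.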

• Merge files: for example, we have both baseline and posttest data files and want to merge them into one file.

• Before merging files, we need to sort the cases in each file by the matching variable. In this example, code is the matching variable.

• Use baseline data file as active dataset.

• Open both baseline and posttest data files (or just open baseline data file).

• Go to Data > Merge Files. There are two choices: Add Cases and Add Variables.

• For this example, we choose Add variables (we want to add posttest variables into the file).
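The sort-then-merge sequence can be sketched in syntax as follows. The file names baseline.sav and posttest.sav and the dataset names are hypothetical placeholders; only the matching variable code comes from the slides.

```spss
* File names are hypothetical and should be replaced with the actual paths.
GET FILE='baseline.sav'.
DATASET NAME baseline.
SORT CASES BY code (A).

GET FILE='posttest.sav'.
DATASET NAME posttest.
SORT CASES BY code (A).

* Add the posttest variables to the baseline file, matching on code.
DATASET ACTIVATE baseline.
MATCH FILES /FILE=* /FILE='posttest' /BY code.
EXECUTE.
```

The asterisk stands for the active dataset, which is why the baseline file is activated before the merge.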

• Convert Multivariate to Univariate Format

–Multivariate structure: all values for each subject appear in one row, under column names that are the same for all subjects.

• Use data: restructure data_multivariate.sav

– Each subject has depression data at seven time points (pre, dep1-dep6)

• Go to Data > Restructure

• We only have one variable (depression variable) that needs to be transposed.
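The Restructure wizard generates a VARSTOCASES command. A minimal sketch for this data set is shown below. The source variable names pre and dep1 to dep6 come from the slide; the new variable names depression and time are assumptions chosen for illustration.

```spss
* Stack the seven depression measurements into one column.
* The index variable time takes the values 1 to 7 in the order listed.
VARSTOCASES
  /MAKE depression FROM pre dep1 dep2 dep3 dep4 dep5 dep6
  /INDEX=time(7)
  /NULL=KEEP.
```

After the command runs, each subject occupies seven rows, one per time point, which is the univariate format.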

• Data screening

–Understand your variables

–Do the variables meet the statistical assumptions of parametric tests?

–Check outliers

• Distribution diagnosis: Graphs

–Histograms

–Stem-and-Leaf Plots

–Box Plots

–Normal Q-Q Plots

• Three SPSS functions used for data screening

–Frequencies

–Descriptives

–Explore

• Go to Analyze > Descriptive Statistics. You should see:

–Frequencies

–Descriptives

–Explore

• Example: run Frequencies of Q49 (How many times use marijuana 30 days)

• Central tendency

–A measure that is most representative of all scores in a distribution.

–Mean, median, mode

• Dispersion (variability):

–A measure of the spread of scores in a distribution.

–Variance, standard deviation, range

• Percentile Values

–SPSS reports 25th, 50th, and 75th percentile values.

–You can also ask SPSS for any other percentile values

• Distribution

• Skewness and kurtosis are statistics that characterize the shape and symmetry of the distribution.

• The normal distribution is symmetric and has a skewness of zero. Its kurtosis is also zero as SPSS reports it (excess kurtosis).

• Skewness: a measure of the asymmetry of a distribution

–Positive skewness: a long right tail.

–Negative skewness: a long left tail.

• Kurtosis: a measure of the extent to which observations cluster around a central point.

– Leptokurtic data values are more peaked (positive kurtosis) than normal distribution.

– Platykurtic data values are flatter and more dispersed along the X axis (negative kurtosis) than normal distribution.

• Click Charts to request a histogram
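The statistics listed above (central tendency, dispersion, percentiles, skewness, and kurtosis) and the histogram can all be requested in one FREQUENCIES command. The sketch below is illustrative; the 10th and 90th percentiles are arbitrary choices added to show how extra percentiles are requested.

```spss
* Quartiles come from NTILES=4 and extra percentiles from PERCENTILES.
FREQUENCIES VARIABLES=Q49
  /NTILES=4
  /PERCENTILES=10 90
  /STATISTICS=MEAN MEDIAN MODE STDDEV VARIANCE RANGE SEMEAN SKEWNESS SESKEW KURTOSIS SEKURT
  /HISTOGRAM NORMAL
  /ORDER=ANALYSIS.
```

The NORMAL keyword superimposes a normal curve on the histogram, which makes departures from normality easier to see.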

• SPSS output of Frequency analysis

Note: Standard error is an estimate of the standard deviation of a statistic. Range is the difference between the highest value and the lowest value.

• Descriptives function

– Example: Run Descriptives of Q49

• SPSS output of Descriptives analysis

• Second example: Out of three drugs (Q33, Q43, and Q49), we want to know which drug was used most frequently among high school students.

• We use Descriptives function to sort variables by Mean or by Sum.

• Go to Analyze > Descriptive Statistics > Descriptives > Click Options

Under Display Order, check Descending means

• SPSS output

• Syntax for sort by Sum

DESCRIPTIVES VARIABLES=Q33 Q43 Q49
  /STATISTICS=SUM MEAN STDDEV
  /SORT=SUM (D).

• Explore function

–Example: run Explore for Q6 and Q49

• Explore function: click Statistics and Plots to get:

• SPSS output: descriptives, similar to the results from Frequencies and Descriptives

• SPSS output: histograms

• SPSS output: stem-and-leaf plot of Q6

• SPSS output: rotated stem-and-leaf plot

• SPSS output: normal Q-Q plots

• Box plots

Meyers, Gamst, & Guarino (2006)

• SPSS output: box plots
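The Explore dialog corresponds to the EXAMINE command. The following sketch requests the descriptives and the four kinds of plots shown above for Q6 and Q49; the remaining subcommands are the usual defaults and are included only for completeness.

```spss
* NPPLOT produces the normal Q-Q plots.
EXAMINE VARIABLES=Q6 Q49
  /PLOT BOXPLOT STEMLEAF HISTOGRAM NPPLOT
  /COMPARE GROUPS
  /STATISTICS DESCRIPTIVES
  /CINTERVAL 95
  /MISSING LISTWISE
  /NOTOTAL.
```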

• Explore Q6 by sex (Q2)

• SPSS output

• Histograms

• Stem-and-leaf plots

• Normal Q-Q plots

• Box plots
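To explore Q6 separately for each sex, the grouping variable goes after the BY keyword. A minimal sketch:

```spss
* Descriptives and plots of Q6 for each category of Q2.
EXAMINE VARIABLES=Q6 BY Q2
  /PLOT BOXPLOT STEMLEAF HISTOGRAM NPPLOT
  /STATISTICS DESCRIPTIVES
  /NOTOTAL.
```

NOTOTAL suppresses the output for the whole sample, so only the per-group results are shown.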

• Test means: t tests and Analysis of variance

– T tests

• One-sample t test

• Independent-samples t test

• Paired-samples t test

– Analysis of variance (ANOVA)

• One-way/two-way between-subjects design

• One-way/two-way within-subjects design

• Mixed design

• Go to Analyze > Compare Means

• Student’s t test

– The method assumes that the test statistic follows Student's t distribution if the null hypothesis is true.

– The paired t-test is used when you have a paired design.

– The independent t-test is used when you have an independent design.

• Independent-samples t test

– Example: we want to know if there is a difference between sex groups (Q2) in height (Q6).

–Go to Analyze > Compare Means > Independent-Samples T Test

• Test variable: Q6 (dependent variable)

• Grouping variable: Q2 (two groups: female and male)

• Coding of Q2: 1= Female and 2= Male

Click Define Groups, type 1 for Group 1 and 2 for Group 2 based upon the coding of Q2

• SPSS output

• Mean height of females = 1.62, SD = .07

• Mean height of males = 1.76, SD = .09

• t = -94.28, df = 12470.68, p < .001 (the fractional df indicates the row for equal variances not assumed)

• Conclusion: there is a significant difference in height between the female and male groups.
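The dialog steps for this test correspond to the T-TEST command. A sketch using the variables and group codes from the example:

```spss
* Compare mean height (Q6) between Q2 = 1 (female) and Q2 = 2 (male).
T-TEST GROUPS=Q2(1 2)
  /MISSING=ANALYSIS
  /VARIABLES=Q6
  /CRITERIA=CI(.95).
```

The output reports Levene's test along with two rows of t test results, one assuming equal variances and one not.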

• Analysis of Variance (ANOVA)

–Used to compare the means of two or more groups

–One-way ANOVA (between subjects): there is only one factor variable

– Example: we want to know if there is a difference in height (Q6) among four grade groups (Q3)

• Original coding of Q3

• We need to recode Q3 in order to get rid of the last category.

–Then the new variable has four categories

–Go to Transform > Recode into Different Variables

• Recoding Q3 into Q3r
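A syntax sketch of this recode is given below. The original coding of Q3 is not reproduced in this transcript, so the sketch assumes that codes 1 to 4 are the four grade levels and that the last category has a higher code. Adjust the range if the actual coding differs.

```spss
* Assumed coding: 1 to 4 are the four grade levels and the last category is dropped.
RECODE Q3 (1 THRU 4=COPY) (ELSE=SYSMIS) INTO Q3r.
EXECUTE.
```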

• Go to Analyze > General Linear Model > Univariate

• SPSS output

F(3, 12545) = 60.83, p < .001. There was a significant difference in height among the four grade levels.

• Post Hoc tests

–We have already obtained a significant omnibus F-test with a factor of four levels.

–We need to know which means are significantly different.

• Click Post Hoc button

• Results of Post Hoc tests
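The one-way ANOVA and its post hoc comparisons can be run with a single UNIANOVA command. The sketch below uses Tukey's test as an illustrative choice; the slides do not say which post hoc test was selected.

```spss
* One-way between-subjects ANOVA of height (Q6) by recoded grade (Q3r).
UNIANOVA Q6 BY Q3r
  /METHOD=SSTYPE(3)
  /INTERCEPT=INCLUDE
  /POSTHOC=Q3r(TUKEY)
  /PRINT=DESCRIPTIVE HOMOGENEITY
  /CRITERIA=ALPHA(.05)
  /DESIGN=Q3r.
```

The post hoc table compares every pair of grade levels, showing which means differ after the significant omnibus F test.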

Meyers, L. S., Gamst, G., & Guarino, A. J. (2006). Applied multivariate research: design and interpretation. Thousand Oaks, CA: Sage Publications, Inc.
