spss introductory session data entry and descriptive stats

SPSS & Quantitative Data Analysis Kulbir Singh Birak


Post on 13-Jan-2017


TRANSCRIPT

Page 1: Spss introductory session data entry and descriptive stats

SPSS & Quantitative Data Analysis

Kulbir Singh Birak

Page 2: Spss introductory session data entry and descriptive stats

SPSS is a computer program for analysing quantitative data.

This can range from basic descriptive statistics such as the mean, mode, median and range to powerful tests of significance (which help us decide whether to accept or reject a hypothesis).

What the data looks like, and what that means if anything.

What is SPSS?

Page 3: Spss introductory session data entry and descriptive stats

You can access SPSS on the vast majority of PCs at UCS: in these labs, on the Waterfront PCs and on the library PCs.

Additionally, if you wish, you can borrow a copy of SPSS to install on your own home PC or laptop. There are 16 copies in the library; you just need to borrow the disc and input the licence code that comes with it. (Licences do expire; when yours does, you can borrow a new version of SPSS or obtain a new licence code.)

Overnight loan only, or you can bring your laptop in and install it there and then.

Windows version only, no Apple version

SPSS Access

Page 4: Spss introductory session data entry and descriptive stats

Overview

Why do numbers matter in research design?

Numbers allow you to do two basic things:

- Count how often “something” happens

- Count how big an issue “something” is

Page 5: Spss introductory session data entry and descriptive stats

Overview

Once you can count the extent (how often) and nature (for quantitative research a numerical descriptor of an attribute) you can already do some pretty important things. You can answer questions such as:

How common is an issue? For instance, are black children over-represented in care? Are black adults over-represented in psychiatric hospital?

How serious is a particular issue? Or how is it distributed within a sample? For instance, how serious are the concerns about children in families allocated a social worker?

Page 6: Spss introductory session data entry and descriptive stats

Overview

Once you can count stuff you can start to answer other important and interesting questions, for instance:

Page 7: Spss introductory session data entry and descriptive stats

Students may often come to you with various questions about SPSS and difficulties that they are having

If you are lucky enough to catch them early on a lot of unnecessary frustration and stress about analysing data can be avoided.

The most important thing a student can do before they even consider methodology, methods or analysis is to have a clear research question/aim and hypotheses in place that conceptualise and operationalise the variables they wish to study.

SPSS and Quantitative Data

Page 8: Spss introductory session data entry and descriptive stats

Some Basic Definitions

A variable is the “thing” that you’re interested in studying e.g. depression, gender differences, social deprivation, specific crime rates, levels of emotionality (how emotional someone is) or different types of food!

Page 9: Spss introductory session data entry and descriptive stats

• Things like depression, gender differences, social deprivation, specific crime rates, levels of emotionality and food type, etc. are called “variables” because they vary.
● Some people are more depressed than others
● Some people are men, and others are women
● Some social policies may be more successful than others
● We may see different crimes committed in different contexts, areas
● Some people are less emotional than others
● Food types can range from pizza to hamburgers to filet mignon, or might be Thai, Ethiopian, Polish or American cuisine, etc., etc.

Page 10: Spss introductory session data entry and descriptive stats

TO “CONCEPTUALISE” A VARIABLE MEANS TO MAKE CLEAR WHAT YOU MEAN BY THE VARIABLE….

• For example, for the variable “food type,” you need to be clear about whether you mean
• (1) vegetarian or meat, OR
• (2) breakfast, lunch or dinner foods, OR
• (3) Ethiopian, Thai or American foods, OR
• (4) something else!

Page 11: Spss introductory session data entry and descriptive stats

TO “OPERATIONALISE” A VARIABLE IS TO DECIDE HOW YOU WILL MEASURE IT

• For example, if the variable you’re interested in is depression:
● Will you ask people to rate themselves, and if so, on what sort of a scale?
● Alternatively, will you measure depression by facial expression? By some behaviour that you observe? In some other way?

Page 12: Spss introductory session data entry and descriptive stats

TO “OPERATIONALISE” A VARIABLE IS TO DECIDE HOW YOU WILL MEASURE IT

• If the variable you’re studying is intelligence & you don’t think Exam scores are a good measure of intelligence, what measure WILL you use?

• Asking these sorts of questions is completing the process of “operationalising” your variables.

• Conceptualisation & Operationalisation are necessary for a Quantitative approach

Page 13: Spss introductory session data entry and descriptive stats

Exploratory

Exploratory research is undertaken when few or no previous studies exist. The aim is to look for patterns, hypotheses or ideas that can be tested and will form the basis for further research.

Typical research techniques would include case studies, observation and reviews of previous related studies and data.

Data from exploratory studies tends to be qualitative.

Descriptive

Expands on the Exploratory

Descriptive research can be used to identify and classify the elements or characteristics of the subject, e.g. number of days youth offenders remained out of trouble.

Quantitative techniques are most often used to collect, analyse and summarise data.

Causal/Relationship

Causal and Relationship research focuses on being able to predict/hypothesise cause and effect between observed behaviours, or relationships between aspects of behaviour/society/crime rates.

The idea is that Causal and Relationship research is moving a step beyond descriptive research and the quantitative data collected can be used and analysed in a manner that allows the researcher to infer a significant effect/difference or relationship

TYPES OF QUANTITATIVE RESEARCH

Page 14: Spss introductory session data entry and descriptive stats

Aims and Objectives

Page 15: Spss introductory session data entry and descriptive stats

• The Quantitative approach sets out at the start of a study with a research question and a hypothesis/prediction

• Hypotheses are formal statements of predictions derived from evidence from earlier research and/or theory.

• The null hypothesis (H0) is a statement of ‘no difference/effect/change’ between the variables

• The experimental hypothesis (H1) is a statement of difference/relationships between variables

QUANTITATIVE DESIGNS AND HYPOTHESES

Page 16: Spss introductory session data entry and descriptive stats

• Experimental Hypothesis: Students who study for tests in study groups will score significantly better on their exams than students who did not study in study groups

• Null Hypothesis: There will be no significant difference in exam results between students who do and do not study in study groups

EXAMPLE OF HYPOTHESIS

Page 17: Spss introductory session data entry and descriptive stats

This clarity in the question and hypothesis can make life markedly easier for yourselves and the student in the long run.

However, I appreciate that this is not always easy to achieve and, more often than not, will not be the case for you.

So what I will be covering with you today is a brief introduction to the SPSS interface, how we would go about the initial basics of data entry, and how to begin exploring descriptive data.

If we can, I’ll also take you through examples of some basic significance testing (otherwise I’ll put them up so they are available).

Page 18: Spss introductory session data entry and descriptive stats

A light but important session.

Going over the basics of how to input data, label your variables clearly and create codebooks.

It’s all about building up your confidence with the interface, and developing good practice.

It’s about doing the basics so as to avoid confusion later on, e.g. inputting the data correctly for different types of analysis.

Data Entry and Descriptives

Page 19: Spss introductory session data entry and descriptive stats

Hopefully you are already familiar with the idea of descriptive data.

As the name suggests they are what we use to describe the data we have.

There’s no point in knowing that the IQ scores between two groups are significantly different if we don’t have a way of describing the scores, and the difference.

Measures of central tendency: Mean, mode, median etc.

Measures of dispersion: Standard deviation etc.

Descriptive Stats

Page 20: Spss introductory session data entry and descriptive stats

Levels of Measurement

In 1946 Stevens proposed a theory of scales of measurement.

Nominal data (lowest level of measurement)

Ordinal data (ordered, but the distance between points on the scale is unknown)

Interval data (points on scale an equal distance apart)

Ratio data (equal distance between points on scale, plus a true zero)

Page 21: Spss introductory session data entry and descriptive stats

Nominal

Provides the least exact information

Participants are placed in categories

Data that is categorical e.g. gender, colours, shoe type, play behaviour

Variable must fit into one category

Measure of frequency

Numbers may be used but only as category labels

Central tendency is described using the mode

Data is represented using a frequency table or bar chart

Page 22: Spss introductory session data entry and descriptive stats

Examples: Nominal Data

Type of bicycle: mountain bike, road bike, chopper, folding, BMX.

Ethnicity: White British, Afro-Caribbean, Asian, Chinese, other, etc. (note problems with these categories).

Smoking status: smoker, non-smoker

Page 23: Spss introductory session data entry and descriptive stats

Ordinal

Simplest true scale, orders measurements along a continuum

Represents rank position in a group e.g. 1st, 2nd, 3rd … 10th

No information on difference between positions

Central tendency is described in terms of the median

Dispersion can be measured using the range or inter-quartile range (middle 50% of the distribution)

Page 24: Spss introductory session data entry and descriptive stats

Ordinal Data

A type of categorical data in which order is important.

Class of degree: 1st class, 2:1, 2:2, 3rd class, fail

Degree of illness: none, mild, moderate, acute, chronic.

Opinion of students about stats classes: very unhappy, unhappy, neutral, happy, ecstatic!

Page 25: Spss introductory session data entry and descriptive stats

Interval and ratio variables

According to Fielding & Gilbert (2000) these are often used interchangeably, and incorrectly by social scientists (pg15)

Interval, ordered categories, no inherent concept of zero (Clark 2004), we can calculate meaningful distance between categories, few real examples of interval variables in social sciences (Fielding & Gilbert 2000:15)

Ratio. A meaningful zero amount (e.g. income), possible to calculate ratios so also has the interval property (e.g. someone earning £20,000 earns twice as much as someone who earns £10,000) (Fielding & Gilbert 2000:15)

Difference between interval and ratio usually not important for statistical analysis (Fielding & Gilbert 2000:15)

Page 26: Spss introductory session data entry and descriptive stats

Interval variables- Examples

Fahrenheit temperature scale- Zero is arbitrary- 40 Degrees is not twice as hot as 20 degrees.

IQ tests. No such thing as Zero IQ. 120 IQ not twice as intelligent as 60.

Question: can we assume that attitudinal data represents real, quantifiable measured categories? (i.e. that ‘very happy’ is twice as happy as plain ‘happy’, or that ‘very unhappy’ means no happiness at all). Statisticians are not in agreement on this.

Page 27: Spss introductory session data entry and descriptive stats

Ratio variables- Examples

Can be discrete or continuous data.

The distance between any two adjacent units of measurement (intervals) is the same and there is a meaningful zero point (Papadopoulos, 2001)

Income- someone earning £20,000 earns twice as much as someone who earns £10,000.

Height

Unemployment rate- measured as the number of jobseekers as a percentage of the labour force (Papadopoulos, 2001).

Page 28: Spss introductory session data entry and descriptive stats
Page 29: Spss introductory session data entry and descriptive stats

If you are still a little worried about your understanding of Quantitative Data please see the Key Information Handout in the Folder.

By David Bowers (Learning Development)

A reasonable summary of information about quantitative data.

Data types, appropriate measures of central tendency etc.

Key Information Handout

Page 30: Spss introductory session data entry and descriptive stats

Everything we do today is about good practice.

Following the steps today, and developing correct inputting skills, will save you lots of problems and heartache later.

SPSS is fussy when it comes to the way data is entered.

Importance of Good Practice

Page 31: Spss introductory session data entry and descriptive stats

As SPSS is a Quantitative Data analysis software you often have to reduce information down to a numerical state

A Codebook allows you to keep a record of these reductions and decisions

A record of your own.

Separate from SPSS.

Electronic or on paper.

A list of variables, full names, and how you have coded data.

Codebook

Page 32: Spss introductory session data entry and descriptive stats

The codes you give data allow SPSS to analyse it.

Most analyses need numbers rather than text, so some variables need to be converted.

E.g. Gender: Female may become 1, Male may become 2.

Relationship Status: Single may become 1, Married 2, Divorced 3, Widowed 4…

Coding
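A codebook can also be kept as a small machine-readable mapping alongside the paper or electronic record. The sketch below is illustrative only: it uses Python rather than SPSS, and the variable names `gender` and `relstatus` are hypothetical. The codes follow the examples on this slide.

```python
# Hypothetical codebook: variable name -> {text response: numeric code}.
# Codes follow the slide's examples (Female = 1, Male = 2, and so on).
CODEBOOK = {
    "gender": {"Female": 1, "Male": 2},
    "relstatus": {"Single": 1, "Married": 2, "Divorced": 3, "Widowed": 4},
}


def encode(variable, response):
    """Turn a text response into the numeric code to type into SPSS."""
    return CODEBOOK[variable][response]


def decode(variable, code):
    """Look up the text label that a numeric code stands for."""
    labels = {number: text for text, number in CODEBOOK[variable].items()}
    return labels[code]


print(encode("gender", "Female"))  # 1
print(decode("relstatus", 3))      # Divorced
```

Keeping both directions (encode and decode) in one place makes it harder for the codes typed during data entry to drift away from the labels used when reading the output.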

Page 33: Spss introductory session data entry and descriptive stats

SPSS is fussy when it comes to the names you give variables.

Can’t give them a full description in the main view.

So you can give detailed labels in the special variable view.

Along with a codebook it helps keep the information clear.

Labelling

Page 34: Spss introductory session data entry and descriptive stats

Available on email that was circulated to you all

File: Data Entry Exercise 1 - Optimism Data

We’ll be creating a codebook, setting up SPSS according to the codebook, and then entering the data.

1st Exercise

Page 35: Spss introductory session data entry and descriptive stats

Good habits

Create a new Folder on your Desktop

Right-click on Desktop> New > Folder > “SPSS”

New Data Folder

Page 36: Spss introductory session data entry and descriptive stats

Start>All Programs>IBM SPSS Statistics 19.

Depending on version may have a slightly different name.

GIVE IT TIME SPSS IS RENOWNED FOR TAKING AN AGE TO OPEN UP – CLICKING AGAIN ONLY SLOWS IT DOWN MORE AS IT’LL THEN TRY TO OPEN ANOTHER SPSS WINDOW

Open SPSS

Page 37: Spss introductory session data entry and descriptive stats

Open SPSS

Page 38: Spss introductory session data entry and descriptive stats

Optimism Scale data from 4 participants

Firstly, we are going to prepare a codebook

Coding Data

Page 39: Spss introductory session data entry and descriptive stats

Optimism Hand-out

Page 40: Spss introductory session data entry and descriptive stats

Rules for naming of variables

Variable names:

must be unique (i.e. each variable in a data set must have a different name);

must begin with a letter (not a number);

cannot include full stops, spaces or other characters (!, ? * ");

cannot include words used as commands by SPSS (all, ne, eq, to, le, lt, by, or, gt, and, not, ge, with)

Coding Data
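The naming rules above can be checked before you ever open SPSS. This is a minimal sketch in Python that tests only the rules listed on this slide; the function name `name_problems` is hypothetical, and the choice to allow only letters, digits and underscores after the first letter is an assumption, since the exact set of permitted characters depends on the SPSS version.

```python
import re

# Reserved words listed on the slide.
RESERVED = {"all", "ne", "eq", "to", "le", "lt", "by",
            "or", "gt", "and", "not", "ge", "with"}

# Assumption: after the first letter, only letters, digits and
# underscores are allowed (the real SPSS rules vary by version).
NAME_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9_]*")


def name_problems(name, existing=()):
    """Return a list of reasons why `name` breaks the slide's naming rules."""
    problems = []
    if name.lower() in {other.lower() for other in existing}:
        problems.append("not unique")
    if not name[:1].isalpha():
        problems.append("must begin with a letter")
    elif not NAME_PATTERN.fullmatch(name):
        problems.append("contains a disallowed character")
    if name.lower() in RESERVED:
        problems.append("reserved word")
    return problems


print(name_problems("op1"))                    # []
print(name_problems("1op"))                    # ['must begin with a letter']
print(name_problems("op 1"))                   # ['contains a disallowed character']
print(name_problems("with"))                   # ['reserved word']
print(name_problems("op1", existing=["op1"]))  # ['not unique']
```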

Page 41: Spss introductory session data entry and descriptive stats

Optimism scale items op1 to op4

Enter the number circled: 1 (strongly disagree) to 5 (strongly agree)

Coding Data

Page 42: Spss introductory session data entry and descriptive stats

Now we have a codebook to keep things clear we can set up SPSS so it is ready for the data.

SPSS has 3 views: Data, Variable and Output.

By switching to Variable we can define the variables we need.

Creating a data file and inputting data

Page 43: Spss introductory session data entry and descriptive stats

Defining Variables

Page 44: Spss introductory session data entry and descriptive stats

Variable View

Page 45: Spss introductory session data entry and descriptive stats

Naming Variables

Page 46: Spss introductory session data entry and descriptive stats

Decimals

Page 47: Spss introductory session data entry and descriptive stats

Labels

Page 48: Spss introductory session data entry and descriptive stats

Values

Enter the relevant value and label as per your codebook, then click add. When all have been entered, click OK

Define the meaning of the values used in the codebook (Gender) and click add for each.

Page 49: Spss introductory session data entry and descriptive stats

Values

Page 50: Spss introductory session data entry and descriptive stats

Values

When entering Likert data always use the limits of the scale (1-5) even if you know that participants may not have used some of the responses. You also need to decide whether you are going to enter just the end points of the range or every labelled point.

Page 51: Spss introductory session data entry and descriptive stats

Values

Page 52: Spss introductory session data entry and descriptive stats

Data comes in different types.

Categorical (Nominal in SPSS)

Ordinal

Scale/Interval (Scale in SPSS)

Different types/measures suit different tests, different measures of central tendency, different forms of visualisation.

Makes knowing what type of data you have KEY for successful data analysis.

Measures

Page 53: Spss introductory session data entry and descriptive stats

Measures

Scale refers to interval/ratio level of measurement. There is some debate about data type in relation to Likert data; for our purposes, leave this as Scale.

Nominal refers to categorical data.

Page 54: Spss introductory session data entry and descriptive stats

Measures

Page 55: Spss introductory session data entry and descriptive stats

Now you have the variables set up ready for the data you can start to enter the actual data

Go to the Data View

Inputting Data According to the Codebook

Page 56: Spss introductory session data entry and descriptive stats

Inputting Data According to the Codebook

Page 57: Spss introductory session data entry and descriptive stats

Inputting Data According to the Codebook

Page 58: Spss introductory session data entry and descriptive stats

Saving the File

Page 59: Spss introductory session data entry and descriptive stats

Saving the File

Page 60: Spss introductory session data entry and descriptive stats

You’ve saved the data so now it is ‘safe’.

You can have a play around with it and try a few different things.

Delete a case

Insert a case between existing cases

Delete a variable

Insert a variable between existing variables

Try during the workshop/at home so you get more confident with SPSS.

Playing around with the data

Page 61: Spss introductory session data entry and descriptive stats

Available on LearnUCS.

Different experimental designs require a different style of inputting.

The structure you use will be different between Repeated (Within-Group) and Independent (Between-Group) experimental designs.

Use the wrong structure and the analysis will fall down. It will be meaningless at best.

2nd Exercise: Inputting Repeated and Independent Measures

Page 62: Spss introductory session data entry and descriptive stats

So, to recap Repeated Measures.

The same participants experience all treatments/are in all the groups/conditions.

If you wanted to investigate the effect of music on taking an IQ test participants would experience the no music condition, and the music condition.

Hopefully with some counterbalancing.

Repeated Measures

Page 63: Spss introductory session data entry and descriptive stats

Repeated Measures

Page 64: Spss introductory session data entry and descriptive stats

Repeated Measures

Page 65: Spss introductory session data entry and descriptive stats

Again to recap.

Participants are split. One group will experience one treatment/be in one group/condition.

Another group will experience the other.

Each condition will have a unique, non-shared, set of participants.

Independent
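The difference between the two data structures can be sketched outside SPSS. The Python below is illustrative only: the IQ scores are made up, the field names are hypothetical, and the layouts shown are the conventional ones (one column per condition for repeated measures; a coded grouping variable plus one score column for independent measures) rather than a copy of the exercise files.

```python
# Illustrative, made-up IQ scores for the music example.

# Repeated measures: one row per participant, one column per condition.
repeated = [
    {"id": 1, "iq_no_music": 100, "iq_music": 104},
    {"id": 2, "iq_no_music": 95, "iq_music": 93},
]

# Independent measures: one row per participant, a coded grouping
# variable (1 = no music, 2 = music) and a single score column.
independent = [
    {"id": 1, "condition": 1, "iq": 100},
    {"id": 2, "condition": 1, "iq": 95},
    {"id": 3, "condition": 2, "iq": 104},
    {"id": 4, "condition": 2, "iq": 93},
]


def scores_by_condition(rows):
    """Group the independent-design scores by their condition code."""
    groups = {}
    for row in rows:
        groups.setdefault(row["condition"], []).append(row["iq"])
    return groups


print(scores_by_condition(independent))  # {1: [100, 95], 2: [104, 93]}
```

In both layouts each row is one participant; what changes is whether a participant contributes a score to every condition (repeated) or to exactly one (independent).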

Page 66: Spss introductory session data entry and descriptive stats

Independent

Page 67: Spss introductory session data entry and descriptive stats

Independent

Page 68: Spss introductory session data entry and descriptive stats

Independent

Page 69: Spss introductory session data entry and descriptive stats

Independent

Page 70: Spss introductory session data entry and descriptive stats

Independent

Page 71: Spss introductory session data entry and descriptive stats

Independent

Page 72: Spss introductory session data entry and descriptive stats

A quick trick to show you.

Good for those who aren’t fond of a screen full of numbers.

If you have coded your variables correctly there is a button you can press that will make the numbers in your Data View appear as the value labels you defined.

For example the 1s and 2s for gender could appear as Male and Female.

Labelling Trick

Page 73: Spss introductory session data entry and descriptive stats
Page 74: Spss introductory session data entry and descriptive stats

Data Entry Exercise 1 – Optimism Data Input

Data Entry Exercise 2 – Repeated and Independent

Extra Data Entry Exercises

Exercise 3 – Giving electric shocks

Exercise 4 – Shooting people

We’ve gone through 1 and 2 here. Try them on your own.

3 and 4 for extra practice.

Make sure you are comfortable with data input, coding and labelling.

Exercises

Page 75: Spss introductory session data entry and descriptive stats

The theory and step-by-step guide will be covered in the slides following immediately below. If you complete the first exercise move onto exercise 2.

Descriptive Exercise 1: survey.sav

The data is from a survey of staff about stress and emotions.

Generate the frequencies for 1) marital status and 2) level of education

Descriptive Exercise 2: staffsurvey.sav

The data is from a staff survey with Likert scales for agreement and importance of factors.

Generate appropriate descriptive statistics to answer the following questions:

(a) What percentage of the staff in this organisation are permanent employees? (Use the variable employstatus.)

(b) What is the average length of service for staff in the organisation? (Use the variable service.)

(c) What percentage of respondents would recommend the organisation to others as a good place to work? (Use the variable recommend.)

Lab Exercises


Page 77: Spss introductory session data entry and descriptive stats

When you are trying to find your descriptive stats you need to make sure you use the right ones.

Certain types of data suit certain measures of central tendency and dispersion.

Use the wrong ones and your description of the results will be confusing, wrong and won’t match your inferential statistics.

Types of Variables & Descriptives

Page 78: Spss introductory session data entry and descriptive stats

Also known as Nominal variables in SPSS.

Data that has been classified and categorised.

So for gender, a participant will belong to a particular category of gender.

Marital status.

Anything that you can create a discrete classification of.

You can even take a scale variable like age, and force it into categories (18 and under, 19 – 25, 26 – 35 etc.).

Categorical Variables
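Forcing a scale variable such as age into categories only works cleanly if the bands do not overlap, so that every age falls into exactly one band. The sketch below shows one way to do this in Python; the band edges follow the slide's example, while the final "36 and over" band and the function name `age_band` are assumptions added for illustration.

```python
from bisect import bisect_left

# Upper limit of each band except the last, which is open-ended.
UPPER_LIMITS = [18, 25, 35]
# "36 and over" is an assumed final band; the slide only says "etc.".
LABELS = ["18 and under", "19-25", "26-35", "36 and over"]


def age_band(age):
    """Return the single band that a whole-number age falls into."""
    return LABELS[bisect_left(UPPER_LIMITS, age)]


for age in (18, 19, 25, 26, 60):
    print(age, age_band(age))
```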

Page 79: Spss introductory session data entry and descriptive stats

Measure of central tendency to use for categorical data is the mode.

Frequency of occurrence or amount.

So using gender as an example you would use the mode.

2 of the sample might be male, and 8 female.

Mode = Female.

20% male, 80% female

Categorical
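The gender example above can be reproduced in a few lines. This is a sketch in Python rather than SPSS output; it counts each category, picks the most frequent one as the mode, and converts the counts to percentages.

```python
from collections import Counter

# The slide's example sample: 2 males and 8 females.
gender = ["Male"] * 2 + ["Female"] * 8

counts = Counter(gender)
mode, mode_count = counts.most_common(1)[0]
percentages = {category: 100 * n / len(gender) for category, n in counts.items()}

print(mode)         # Female
print(percentages)  # {'Male': 20.0, 'Female': 80.0}
```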

Page 80: Spss introductory session data entry and descriptive stats

In SPSS you should use the Frequency option when you want the descriptive stats for a categorical variable.

Go to Descriptive Exercise 1 on LearnUCS.

Categorical and Frequency

Page 81: Spss introductory session data entry and descriptive stats

Save survey.sav to your SPSS folder on the Desktop from LearnUCS

Have a look at survey.sav questionnaire from LearnUCS

Open survey.sav dataset

Descriptive Exercise 1 - Survey

Page 82: Spss introductory session data entry and descriptive stats

Survey Questionnaire

Page 83: Spss introductory session data entry and descriptive stats

Frequencies

Page 84: Spss introductory session data entry and descriptive stats

Frequencies

Page 85: Spss introductory session data entry and descriptive stats

Frequencies

Page 86: Spss introductory session data entry and descriptive stats
Page 87: Spss introductory session data entry and descriptive stats

Frequency Output

Page 88: Spss introductory session data entry and descriptive stats

Frequency Output

Page 89: Spss introductory session data entry and descriptive stats

This is where graphs and the results from tests (descriptive and inferential) will appear.

Also notes about when you have saved and opened files too.

If you want to keep what is in the output you must save it specifically.

Saving the data/variable will not save what is in the output, and vice versa.

Output pages

Page 90: Spss introductory session data entry and descriptive stats

Aside from categorical measures we also have:

Ordinal

Scale/Interval (sometimes known as ratio too)

These are also generally known as continuous variables.

Usually the mean or median is the measure of central tendency used, and the standard deviation, or standard error, the measure of dispersion.

Other measures

Page 91: Spss introductory session data entry and descriptive stats

Ranked or ordered data. Sometimes Likert scales.

Has some similarity to categorical data (You might consider grade brackets to be categories; A, B, C, D, etc).

But importantly they are ranked, so there is meaning to the position. A is better than B, B better than C and so on.

The median is used here. Central point with an equal amount above/below.

Ordinal

Page 92: Spss introductory session data entry and descriptive stats

The median is used here. Central point with an equal amount above/below.

So if you had a collection of grades…

20 people had an A

10 had a B

10 had a C

10 had a D

Then B would be the median grade, as 20 people had higher, and 20 people had lower.

Ordinal
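The grade example can be checked by ranking the grades and taking the middle value. The sketch below uses Python's standard `statistics` module; the numeric ranks given to the grades are an assumption made for illustration, and any coding that preserves the order would give the same answer.

```python
import statistics

# Rank the grades so that a higher number means a better grade.
RANK = {"D": 1, "C": 2, "B": 3, "A": 4}
GRADE = {rank: grade for grade, rank in RANK.items()}

grades = ["A"] * 20 + ["B"] * 10 + ["C"] * 10 + ["D"] * 10

# median_low always returns a value that is actually in the data,
# which suits ordinal codes (no averaging of two middle values).
median_rank = statistics.median_low(RANK[g] for g in grades)
print(GRADE[median_rank])  # B
```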

Page 93: Spss introductory session data entry and descriptive stats

Imagine we wished to find the median for the highest educational level attained by a population

In descriptive exercise 1 (survey) we would click on ‘Analyze’

Using Explore to See the Median

Page 94: Spss introductory session data entry and descriptive stats

Using Explore to See the Median

Select ‘Descriptive Statistics’ and then ‘Explore’ from the Drop-down menus

Page 95: Spss introductory session data entry and descriptive stats

Using Explore to See the Median

1. When the below box opens move ‘highest educ completed’ from the left pane to the ‘Dependent List’ section

2. Click on ‘Statistics’ and choose ‘Outliers’ and ‘Continue’

3. Click on ‘Plots’ and choose ‘Histograms’ and ‘Normality Plots with tests’ and ‘Continue’

4. Click on ‘OK’

Page 96: Spss introductory session data entry and descriptive stats

Using Explore to See the Median

The resulting ‘Output’ in the Output window will show you a number of descriptive stats. We can see the median is 4 for ‘highest educ completed’, which means ‘some additional training’ is the median highest level of education completed for the 439 participants who took part in the survey.

Page 97: Spss introductory session data entry and descriptive stats

Interval – a scale with artificial limits, no true zero, and usually some form of cap. Intervals are of equal size. IQ scores for example.

Ratio – has a true zero, constant intervals and potentially little or no cap. So timing scores on a task for example.

SPSS doesn’t really differentiate between the two.

Basically if it is a form of score it is likely to be scale.

Scale

Page 98: Spss introductory session data entry and descriptive stats

The mean is the normal measure of central tendency, and the measure of dispersion the standard deviation.

So 5 people take a maths test.

They score 10, 20, 18, 12 and 5.

The average would be 13 (total/number of cases)

Scale
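The maths test example can be worked through with Python's standard `statistics` module. This is a sketch for checking the arithmetic, not SPSS output; `stdev` here is the sample standard deviation, which divides by n − 1.

```python
import statistics

scores = [10, 20, 18, 12, 5]

mean = statistics.mean(scores)  # 65 / 5
sd = statistics.stdev(scores)   # sample SD, dividing by n - 1

print(mean)          # 13
print(round(sd, 2))  # 6.08
```

The squared deviations from the mean are 9, 49, 25, 1 and 64, which sum to 148; dividing by 4 gives a variance of 37, and the square root of 37 is about 6.08.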

Page 99: Spss introductory session data entry and descriptive stats

In SPSS we just need the Descriptives option, rather than the Frequencies option.

For example, we might wish to find the mean and standard deviation for ‘age’, ‘total optimism’, ‘total mastery’, ‘total perceived stress’ and ‘total perceived control of internal states’ (PCOISS) for the participants who answered the survey we are using for exercise 1.

Scale Descriptives

Page 100: Spss introductory session data entry and descriptive stats

Descriptives

Page 101: Spss introductory session data entry and descriptive stats

Descriptives

Page 102: Spss introductory session data entry and descriptive stats

Descriptives

Page 103: Spss introductory session data entry and descriptive stats

Descriptives Output

Page 104: Spss introductory session data entry and descriptive stats

Sometimes information will be left out of a questionnaire, or the value lost, but you will still need to conduct an analysis.

What happens if someone doesn’t fill in the age box on a questionnaire?

Rather than get rid of all their data you can use the ‘Exclude cases pairwise’ option.

It excludes the case (person) only if they are missing the data required for the specific analysis. They will still be included in any of the analyses for which they have the necessary information.

Missing Data

Page 105: Spss introductory session data entry and descriptive stats

Exclude cases listwise

A more extreme option.

If the participant is missing any data then this option removes them entirely from the analysis.

A matter of judgement as to which to use.

Missing Data
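The difference between the two options can be seen on a tiny made-up data set. The sketch below is Python rather than SPSS, with hypothetical variable names and invented values; it computes the mean of each variable under pairwise exclusion and then under listwise exclusion.

```python
# Made-up survey cases; None marks a missing answer.
cases = [
    {"id": 1, "age": 25, "optimism": 14},
    {"id": 2, "age": None, "optimism": 18},
    {"id": 3, "age": 40, "optimism": None},
    {"id": 4, "age": 31, "optimism": 16},
]
VARIABLES = ["age", "optimism"]


def mean(values):
    return sum(values) / len(values)


def pairwise_means(rows):
    """Each variable uses every case that has a value for that variable."""
    return {v: mean([r[v] for r in rows if r[v] is not None]) for v in VARIABLES}


def listwise_means(rows):
    """Only cases with no missing values on any variable are used."""
    complete = [r for r in rows if all(r[v] is not None for v in VARIABLES)]
    return {v: mean([r[v] for r in complete]) for v in VARIABLES}


print(pairwise_means(cases))  # {'age': 32.0, 'optimism': 16.0}
print(listwise_means(cases))  # {'age': 28.0, 'optimism': 15.0}
```

Pairwise exclusion keeps three cases for each variable here, while listwise exclusion keeps only the two complete cases, so the two options can give different answers from the same file.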

Page 106: Spss introductory session data entry and descriptive stats
Page 107: Spss introductory session data entry and descriptive stats

Descriptive Exercise 1 – Survey

Descriptive Exercise 2 – Staff Survey

Exercises

Page 108: Spss introductory session data entry and descriptive stats

Flowchart for choosing basic statistics

Adapted from Green, J. & D’Oliveira, M. (1999). Learning to use statistical tests in psychology. Buckingham, UK: Open University Press.

[Flowchart; the decision paths recoverable from the slide are listed below.]

START

Summarising univariate data? Descriptive statistics (mean, standard deviation, variance, etc.)

Categorical & frequency data? Between: 1 or 2 sample Chi-square. Within: McNemar.

Relationships? Parametric: Pearson's r. Non-parametric: Spearman's r.

Differences? How many independent variables?

One independent variable: how many experimental conditions?

Two conditions, between participants. Parametric: unrelated t-test. Non-parametric: Mann Whitney.

Two conditions, within participants. Parametric: related t-test. Non-parametric: Wilcoxon.

3 or more conditions, between participants. Parametric: one-way between group ANOVA. Non-parametric: Kruskal-Wallis or Jonckheere trend test.

3 or more conditions, within participants. Parametric: one-way within Ss (repeated measures) ANOVA. Non-parametric: Friedman or Page’s L trend test.

Two or more independent variables: within or between participants in each condition?

Within: factorial within subjects (repeated measures) ANOVA.

Between: factorial between groups ANOVA.

Both true: factorial mixed design (split-plot) ANOVA.

Page 109: Spss introductory session data entry and descriptive stats

Coolican, H. (2014). Research Methods and Statistics in Psychology (6th ed.). Hove, UK: Psychology Press.
A good introduction to the quantitative statistics used in the social sciences. It covers the statistics taught in research methods at this level in a clear and comprehensive format.

Pallant, J. (2013). SPSS: Survival Manual (5th ed.). Maidenhead, UK: Open University Press.
A textbook that helps with the statistical programme SPSS whatever your level. It takes you through each analysis step by step, clearly and concisely, so that you learn while you put it into practice.

Field, A. (2013). Discovering Statistics Using IBM SPSS Statistics (4th ed.). London, UK: Sage.
An engaging text that covers research methods and statistics in a way that is easy to read and follow.

Recommended Reading

Page 110: Spss introductory session data entry and descriptive stats

You can use the link below to access the UCS library page, which has some useful videos showing how to use SPSS: http://libguides.ucs.ac.uk/c.php?g=264784&p=1954991

There is also a course that you can do (set up by Jen Versey, our Psychology technician, and David Mullett from the library support team): https://www.coursesites.com/webapps/Bb-sites-course-creation-BBLEARN/courseHomepage.htmlx?course_id=_383196_1

There is always the IBM SPSS guide that you can access through the help option in SPSS as a starting point.

Web Resources

Page 111: Spss introductory session data entry and descriptive stats

Descriptive Statistics

Descriptive statistics are statistics that describe data. They essentially summarise the data.

They can be either numerical or graphic.

Numerical statistics come in 2 forms:
- Measurement of central tendency
- Measurement of dispersion

Page 112: Spss introductory session data entry and descriptive stats

Measure of Central Tendency

There are three measures of central tendency (the central or typical score). Which one we use depends on our level of measurement. They are:

- Mean: the arithmetic average. The sum of all scores divided by the number of scores.
- Median: the score that falls in the exact centre of the distribution (the middlemost score).
- Mode: the most common (most frequently occurring) score.

Page 113: Spss introductory session data entry and descriptive stats

‘the mean’

Formula for the mean is

$$\bar{x} = \frac{\sum x}{N}$$

where $\bar{x}$ = the mean, $\Sigma$ = the sum of, $x$ = the scores, $N$ = the number of scores in the set

Advantages: Powerful statistic used in estimating population parameters for significant differences and correlations. Most sensitive, and works at an interval level.

Disadvantages: Can be overly sensitive, so it is easily distorted by outlier values.
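As a rough illustration of the formula and of the outlier problem, here is a small Python sketch. The scores are hypothetical and chosen only to make the arithmetic easy to follow.

```python
from statistics import mean

scores = [4, 5, 5, 6, 7]                 # hypothetical scores
mean_plain = sum(scores) / len(scores)   # sum of x divided by N: 27 / 5

with_outlier = scores + [45]             # the same scores plus one extreme value
mean_outlier = mean(with_outlier)        # 72 / 6

print(mean_plain, mean_outlier)
```

A single extreme score moves the mean from 5.4 to 12, well above every other value in the set.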

Page 114: Spss introductory session data entry and descriptive stats

‘the median’

- The measure of central tendency for ordinal data
- Shorthand may be Guildford’s (1956) Mdn
- It is the central value of a set
- A formula used to find the position of the median is $k = \frac{N + 1}{2}$
- For data sets with an odd number of values, this will reveal the central number
- For data sets with an even number of values, this will reveal the two points of data that the median falls between
- When a number of values in the data set are the same, you can use the same method, although it is not strictly correct. Luckily for us as social scientists, there are statistical packages that will take care of this for us
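The position formula can be tried out with a short Python sketch. The data and the helper `median_position` are hypothetical; `statistics.median` is used only to check the result, and it averages the two middle values when the set is even.

```python
from statistics import median

def median_position(n):
    """Position k = (N + 1) / 2, counting from 1 in the sorted data."""
    return (n + 1) / 2

odd = sorted([7, 3, 9, 5, 1])         # 1 3 5 7 9
even = sorted([7, 3, 9, 5, 1, 11])    # 1 3 5 7 9 11

k_odd = median_position(len(odd))     # 3.0: the 3rd value
k_even = median_position(len(even))   # 3.5: between the 3rd and 4th values

median_odd = median(odd)              # the 3rd value, 5
median_even = median(even)            # midpoint of 5 and 7
```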

Page 115: Spss introductory session data entry and descriptive stats

‘the mode’

The measure of central tendency for nominal scale data. We are unable to calculate the mean or median with this type of data, but we can see which value occurred most often (has the highest frequency).

There can be two modes, in which case the data are called bi-modal.

Advantages: Most typical value, unaffected by extremes, can be more informative than the mean with discrete scales.

Disadvantages: Does not account for differences between values; can’t be used in estimates of population parameters; not all that useful for small sets of data; for bi-modal data two modal values are reported; difficult to estimate accurately when data are grouped into class intervals.
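A short Python sketch shows the mode on nominal data and a bi-modal case. The answers are hypothetical; `statistics.multimode` returns every value that shares the highest frequency.

```python
from statistics import multimode

# Nominal data: hypothetical answers to "How do you travel to campus?"
one_mode = multimode(["bus", "car", "car", "walk"])

# Bi-modal data: two values share the highest frequency.
two_modes = multimode([2, 2, 3, 5, 5, 7])

print(one_mode, two_modes)
```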

Page 116: Spss introductory session data entry and descriptive stats

Measures of Spread/Dispersion

High Variability

Low Variability

Page 117: Spss introductory session data entry and descriptive stats

‘the range’

A report of the top (highest) value and the bottom (lowest) value

To calculate the range (the difference between the two), subtract the lowest value from the highest value and add 1

Advantages: Includes extremes, easy to calculate

Disadvantages: Can be distorted by extremes, can be unrepresentative of the distribution. Doesn’t tell us whether values are close to or spaced out from the mean
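The calculation, and the last disadvantage, can be sketched in Python. Both data sets and the helper `value_range` are hypothetical; the `inclusive` flag switches the "add 1" step described above on or off.

```python
def value_range(values, inclusive=True):
    """Highest value minus lowest value, plus 1 when inclusive is True."""
    spread = max(values) - min(values)
    return spread + 1 if inclusive else spread

clustered = [1, 5, 5, 5, 5, 9]   # most values sit close to the centre
spaced = [1, 2, 4, 6, 8, 9]      # values spread across the whole scale

range_clustered = value_range(clustered)   # 9 - 1 + 1
range_spaced = value_range(spaced)         # 9 - 1 + 1
```

Both sets have the same range even though their values are distributed very differently, which is why the range on its own can be unrepresentative.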

Page 118: Spss introductory session data entry and descriptive stats

‘the interquartile and semi-interquartile range’

The interquartile range allows us a better insight into how values fall in relation to the central tendency

Instead of the full range, the interquartile range represents the spread of the central 50% of values, removing the bottom and top 25%. The cut-off values are known as the 1st and 3rd quartiles, or the 25th and 75th percentiles

Page 119: Spss introductory session data entry and descriptive stats

Interquartile range

Example data (Q1, M and Q3 are marked on the slide): 3 3 4 5 6 8 10 13 14 16 19

The interquartile range is: Q3 – Q1

Semi-interquartile is half of that: (Q3 – Q1) / 2

Advantages: Representative of central group of values, useful for ordinal data

Disadvantages: No account of extremes, inaccurate where there are large class intervals
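The example data can be worked through with a short Python sketch. It uses `statistics.quantiles` with its default "exclusive" method, which is an assumption about how the quartiles are located; other packages use slightly different conventions and may report slightly different quartiles.

```python
from statistics import quantiles

data = [3, 3, 4, 5, 6, 8, 10, 13, 14, 16, 19]

# Cut the sorted data into four equal parts: Q1, the median (M) and Q3.
q1, med, q3 = quantiles(data, n=4)

iqr = q3 - q1          # interquartile range: Q3 - Q1
semi_iqr = iqr / 2     # semi-interquartile range: half of that

print(q1, med, q3, iqr, semi_iqr)
```

Under this convention Q1 is 4, the median is 8 and Q3 is 14, giving an interquartile range of 10 and a semi-interquartile range of 5.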

Page 120: Spss introductory session data entry and descriptive stats

Standard deviation and variance

These estimate from a sample how the values of a population are distributed

Standard deviation provides us with an average score telling us how different the scores are from the mean

Formula for standard deviation (std, SD, stdev)

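$$s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$$

Or

$$s = \sqrt{\frac{\sum d^2}{n - 1}}$$

where $d$ is each score's deviation from the mean ($x - \bar{x}$) and $n$ is the number of scores in the sample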
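The standard deviation can be checked by hand with a short Python sketch. It assumes the sample version of the formula, dividing by n - 1, which is what `statistics.stdev` and `statistics.variance` compute; the scores are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev, variance

scores = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical sample of scores

m = mean(scores)                              # 40 / 8 = 5
deviations = [x - m for x in scores]          # d = x - mean, for each score
sum_sq = sum(d ** 2 for d in deviations)      # sum of squared deviations = 32

var_manual = sum_sq / (len(scores) - 1)       # variance: 32 / 7
sd_manual = sqrt(var_manual)                  # standard deviation

# The library functions use the same n - 1 formula.
sd_library = stdev(scores)
var_library = variance(scores)
```

The variance is the standard deviation squared, so the two statistics always carry the same information about spread.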