4 role of statistics in research

7/21/2019 4 Role of Statistics in Research


The Role of Statistics in Research

SCALES OF MEASUREMENT
Nominal Scale
Ordinal Scale
Interval Scale
Ratio Scale
Importance of Scales of Measurement



A population is the group to which a research project is meant to generalize. A population can be broadly defined as all of the people in the world, or all living organisms; or the population can be more narrowly defined: all 18- to 22-year-olds, or all psychology majors at a particular school. Typically, a researcher cannot test all of the members of a target population. Instead, only a small percentage of that population, a sample of the members, can be tested. The sample is meant to represent the entire population, so that if we identify a characteristic of the sample (for example, that the men in the sample are taller than the women), we can infer that this characteristic of our sample can be generalized to the population.

Let's assume that a researcher has conducted a memory experiment, and the mnemonic (memory technique) group performs better than the control group. Perhaps the participants in the mnemonic group recalled an average of 18 out of 20 words, and the control group recalled an average of 16 out of 20 words. At this point, the researcher has not yet supported the alternative hypothesis that mnemonic instructions lead to better performance, even though 18 is greater than 16. The researcher needs to know that the difference observed in the sample of participants also holds in the population that the sample represents.

This chapter offers a nontechnical introduction to some statistical concepts. The purpose is to familiarize you with some of the terms and techniques for organizing and analyzing data statistically, so that you will recognize them when you encounter them as part of your studies.


SCALES OF MEASUREMENT

Satisfactory work depends on using the right tools. The same is true of statistical tools: the first step in choosing the most appropriate statistical technique is to identify the type of data being analyzed.

In psychology, researchers assume that anything that exists, be it a physical characteristic or a psychological construct, can be measured. Rating how happy you are on a scale from 1 to 10 is an example of measurement. It entails identifying a characteristic, your happiness, and quantifying the amount of happiness you are experiencing. The rules used in this example are that the number you assign to your happiness must be between 1 and 10, where 1 refers to a lack of happiness and 10 refers to an abundance of happiness.

Not all measurement systems are equivalent. Some measurements can be mathematically manipulated (for instance, by adding a constant or by taking the square root of each number) but still keep their primary characteristics. Other systems are very intolerant of any mathematical manipulation; adding a constant or taking the square root renders the data meaningless. Measurement systems can be assigned to one of four scales of measurement that vary by the level of mathematical manipulation they can tolerate. These four scales of measurement (also called levels of measurement) are the nominal, ordinal, interval, and ratio scales.

Nominal Scale

The nominal scale of measurement merely classifies objects or individuals as belonging to different categories. The order of the categories is arbitrary and unimportant. Thus, participants might be categorized as male or female, and the male category may be assigned the number 1 and the female category assigned the number 2. These numbers say nothing about the importance of one category as compared to the other. The numbers could just as well be 17.35 and 29.46. Other examples of nominal scales of measurement are numbers on basketball players' jerseys or the numbers assigned by the Department of Motor Vehicles to the license plates of cars. Numbers, when used in a nominal scale of measurement, serve as labels only and provide no information on the magnitude or amount of the characteristic being measured.

Ordinal Scale

An ordinal scale differs from a nominal scale in that the order of the categories is important. A grading system with the grades A, B, C, D, and F is an ordinal scale. The order of the categories reflects a decrease in the amount of the stuff being measured (in this case, knowledge). Note, however, that the distance between the categories is not necessarily equal. Thus, the difference between one A and one B is not necessarily the same as the difference between another A and another B. Similarly, the difference between any A and B is not necessarily the same as the difference between a B and a C.

Rank-order data are also measured on an ordinal scale. An observer may rank-order participants according to attractiveness, or a researcher may ask tasters to rank-order a number of crackers according to saltiness. When your eyesight is tested and you are asked to choose which of two lenses results in a clearer image, you are being asked to provide ordinal data. Again, when data are rank-ordered, a statement is being made about the magnitude or amount of the characteristic being measured, but the intervals between units need not be equivalent. If seven people are rank-ordered on attractiveness, the difference in attractiveness between the first and second person is not necessarily the same as the difference between the second and the third. The first and second persons may both be very attractive, with only the smallest difference between them, while the third person might be substantially less attractive than the second.
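The difference between nominal and ordinal data can be sketched in a few lines of Python. This is only an illustration: the participant list, the two coding schemes, and the helper names `modal_category` and `rank_of` are assumptions made up for the example. It shows that swapping the numeric labels of a nominal variable changes nothing, whereas the grades of an ordinal variable can be sorted but not meaningfully subtracted.

```python
from collections import Counter

# Hypothetical nominal data: the numeric codes are labels only.
sexes = ["male", "female", "female", "male", "female"]
codes_a = {"male": 1, "female": 2}
codes_b = {"male": 17.35, "female": 29.46}  # an equally valid labeling


def modal_category(values, coding):
    """Code the values numerically, find the most frequent code, decode it."""
    coded = [coding[v] for v in values]
    most_common_code, _ = Counter(coded).most_common(1)[0]
    decode = {code: label for label, code in coding.items()}
    return decode[most_common_code]


# Hypothetical ordinal data: letter grades from lowest to highest.
GRADE_ORDER = ["F", "D", "C", "B", "A"]


def rank_of(grade):
    """Position of a grade in the ordering; distances between ranks mean nothing."""
    return GRADE_ORDER.index(grade)


grades = ["B", "A", "C", "A", "F"]
sorted_grades = sorted(grades, key=rank_of, reverse=True)

print(modal_category(sexes, codes_a), modal_category(sexes, codes_b))
print(sorted_grades)
```

Whichever coding is used, the most frequent category is the same, which is all a nominal scale can tell us. The grades can be put in order, but the code never computes "A minus B", because the scale gives no meaning to that difference.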

Interval Scale

The interval scale of measurement is characterized by equal units of measurement throughout the scale. Thus, measurements made with an interval scale provide information about both the order and the relative quantity of the characteristic being measured. Interval scales of measurement, however, do not have a true zero value. A true zero means that none of the characteristic being measured remains. Temperature measurements in degrees Fahrenheit or in degrees Celsius (also called centigrade) correspond to interval scales. The distance between degrees is equal over the full length of the scale; the difference between 20° and 40° is the same as that between 40° and 60°. In neither scale, however, is there a true zero; zero simply represents another point on the scale, and negative numbers are possible and meaningful. Because there is no true zero on these scales, it is inappropriate to say that 40° is twice as warm as 20°. (In other words, ratios cannot be computed with interval scale data.)
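One way to see why ratios fail on an interval scale is to express the same temperatures in both Celsius and Fahrenheit. The short Python sketch below uses the standard conversion formula; the variable names are chosen for this illustration only.

```python
def celsius_to_fahrenheit(c):
    """Standard conversion: F = C * 9/5 + 32."""
    return c * 9 / 5 + 32


c_low, c_mid, c_high = 20.0, 40.0, 60.0
f_low, f_mid, f_high = (celsius_to_fahrenheit(c) for c in (c_low, c_mid, c_high))

# Equal intervals survive the change of units.
equal_steps_c = (c_mid - c_low) == (c_high - c_mid)
equal_steps_f = (f_mid - f_low) == (f_high - f_mid)

# Ratios do not: the "twice as warm" claim depends on the arbitrary zero.
ratio_c = c_mid / c_low
ratio_f = f_mid / f_low

print(equal_steps_c, equal_steps_f)
print(ratio_c, round(ratio_f, 2))
```

The intervals are equal in both units, but 40 divided by 20 is 2 in Celsius while the same two temperatures give a ratio of about 1.53 in Fahrenheit. A ratio that changes with the choice of units carries no information about warmth itself.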

There is a controversy among psychological researchers regarding interval and ordinal scales in relation to ratings. Suppose that a participant is asked to rate something on a scale with particular end points, such as 1 to 7 or 0 to 5. For example, a person might be asked the following rating question:

How satisfied are you with your friendships?

1    2    3    4    5    6    7    8    9    10
very dissatisfied                    very satisfied

The end numbers usually have labels, but the middle numbers sometimes do not. The controversy arises as to whether the ratings should be considered ordinal data or interval data. What has never been ascertained is whether the scales that people use in their heads have units of equal size. If the units are equal in size, the data could be regarded as interval data; if they are unequal, the data should be regarded as ordinal data. This is a point of contention because interval data often permit the use of more powerful statistics than do ordinal data.

There is still no consensus about the nature of rating scale data. In some research areas, ratings tend to be treated cautiously and are considered ordinal data. In other areas (such as language and memory studies, where participants may be asked to rate how familiar a phrase is or how strong their feeling of knowing is), ratings tend to be treated as interval data. The particular philosophy of any particular area of study is possibly best ascertained from previous research in that area.


Ratio Scale

The ratio scale of measurement provides information about order; all units are of equal size throughout the scale, and there is a true zero value that represents an absence of the characteristic being measured. The true zero allows ratios of values to be formed. Thus, a person who is 50 years old is twice as old as a person who is 25. Age in years is a ratio scale. Each year represents the same amount of time no matter where it occurs on the scale; the year between 20 and 21 years of age is the same amount of time as the year between 54 and 55.

As you may have noticed, the scales of measurement can be arranged hierarchically from nominal to ratio. Starting with the ordinal scale, each scale includes all the capabilities of the preceding scale plus something new. Thus, nominal scales are simply categorical, while ordinal scales are categorical with the addition of ordering of the categories. Interval scales of measurement involve ordered categories of equal size; in other words, the intervals between numbers on the scale are equivalent throughout the scale. Ratio scales also have equal intervals but, in addition, begin at a true zero score that represents an absence of the characteristic being measured and allows for the computation of ratios.
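The cumulative nature of this hierarchy can be written down compactly. The following Python sketch is merely one way to picture it; the `Scale` enumeration, the capability labels, and the `capabilities` function are names invented for this illustration.

```python
from enum import IntEnum


class Scale(IntEnum):
    """The four scales, numbered only to record their position in the hierarchy."""
    NOMINAL = 1
    ORDINAL = 2
    INTERVAL = 3
    RATIO = 4


# What each scale adds to the one before it (labels are illustrative).
NEW_CAPABILITY = {
    Scale.NOMINAL: "categories",
    Scale.ORDINAL: "order",
    Scale.INTERVAL: "equal intervals",
    Scale.RATIO: "true zero",
}


def capabilities(scale):
    """Everything a scale offers: its own addition plus all lower scales'."""
    return [NEW_CAPABILITY[s] for s in Scale if s <= scale]


for scale in Scale:
    print(scale.name, capabilities(scale))
```

Reading the output from top to bottom, each scale repeats the list of the scale before it and appends exactly one new property.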

Importance of Scales of Measurement

The statistical techniques that are appropriate for one scale of measurement may not be appropriate for another. Therefore, the researcher must be able to identify the scale of measurement being used, so that appropriate statistical techniques can be applied. Sometimes, the inappropriateness of a technique is subtle; at other times, it can be quite obvious, and quite embarrassing to a researcher who lets an inappropriate statistic slip by.

For example, imagine that ten people are rank-ordered according to height. In addition, information about the individuals' weight in pounds and age in years is recorded. When instructing the computer to calculate arithmetic averages, the researcher absentmindedly includes the height rankings along with the other variables. The computer calculates that the average age of the participants is 22.6 years, that the average weight of the group is 155.6 pounds, and that the average height is 5.5. That last figure is simply the average of the ranks 1 through 10, so it says nothing about how tall the participants are. Calculating an average of ordinal data, such as the height rankings in this example, will yield little useful information. Meaningful results will only be obtained by using the statistical technique appropriate to the data's scale of measurement.
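The height-ranking mistake is easy to reproduce. In the Python sketch below, the two groups of heights are hypothetical and the helper `height_ranks` is written for this illustration (it assumes no two people are the same height). Whatever the actual heights, the mean of ten ranks comes out the same.

```python
from statistics import mean


def height_ranks(heights):
    """Rank people from tallest (rank 1) to shortest; assumes no tied heights."""
    order = sorted(heights, reverse=True)
    return [order.index(h) + 1 for h in heights]


# Hypothetical heights in inches for two groups of ten people.
group_a = [60, 62, 63, 64, 65, 66, 67, 68, 70, 72]
group_b = [h + 6 for h in group_a]  # every person six inches taller

mean_rank_a = mean(height_ranks(group_a))
mean_rank_b = mean(height_ranks(group_b))

print(mean_rank_a, mean_rank_b)        # identical for any ten untied people
print(mean(group_a), mean(group_b))    # the real heights differ by six inches
```

The mean rank is 5.5 for both groups, even though one group is six inches taller on average. The number is a property of the ranking procedure, not of the people.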

On what scale of measurement would each of the following data be measured?

a. The number of dollars in one's wallet.
b. The rated sweetness of a can of soda.
c. Whether one responds yes or no to a question.
d. Height measured in inches.
e. The gender of individuals.


TYPES OF STATISTICAL TECHNIQUES

Having recognized the type of data collected, the researcher needs also to consider the question that he or she wants to answer. You can't tighten a screw with a hammer, and you can't answer one research question with a statistical test meant for a different question. Let's consider three questions that a researcher might ask:

1. How can I describe the data?
2. To what degree are these two variables related to each other?
3. Do the participants in this group have different scores than the participants in the other group?

These three questions require the use of different types of statistical techniques. The scale of measurement on which the data were collected determines more specifically which statistical tool to use.

Describing the Data

When a researcher begins organizing a set of data, it can be very useful to determine typical characteristics of the different variables. The statistical techniques used for this task are aptly called descriptive statistics. Usually, researchers use two types of descriptive statistics: a description of the average score and a description of how spread out or close together the data lie.

Averages

Perhaps the most commonly discussed characteristic of a data set is its average. However, there are three different averages that can be calculated: the mode, the median, and the mean. Each provides somewhat different information. The scale of measurement on which the data are collected will, in part, determine which average is most appropriate to use.

Let's consider a researcher who has collected data on people's weight measured in pounds; hair color categorized according to 10 shades ranging from light to dark; and eye color labeled as blue, green, brown, or other. This researcher has measured data on three different scales of measurement: ratio, ordinal, and nominal, respectively. When describing the eye colors of the participants, the researcher will need to use a different statistic than when describing the participants' average weight.

To describe the eye colors of the participants, the researcher would use the mode. The mode is defined as the score that occurs most frequently. Thus, if most of the participants had brown eyes, brown would be the modal eye color. Sometimes a set of data will have two scores that tie for occurring most frequently. In that case, the distribution is said to be bimodal. If three or more scores are tied for occurring most frequently, the distribution is said to be multimodal.
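Finding the mode, and noticing ties, can be sketched with Python's standard `statistics.multimode` function (available in Python 3.8 and later). The eye-color lists and the helper `describe_modes` are hypothetical, and the label "unimodal" for a single mode is our own addition rather than a term used in this chapter.

```python
from statistics import multimode


def describe_modes(values):
    """Return the most frequent value(s) and a label for how many there are."""
    modes = multimode(values)  # all values tied for the highest frequency
    if len(modes) == 1:
        kind = "unimodal"
    elif len(modes) == 2:
        kind = "bimodal"
    else:
        kind = "multimodal"
    return modes, kind


# Hypothetical eye-color data.
mostly_brown = ["brown", "blue", "brown", "green", "brown", "other", "blue"]
two_way_tie = ["brown", "blue", "brown", "blue", "green"]
three_way_tie = ["brown", "blue", "green"]

print(describe_modes(mostly_brown))
print(describe_modes(two_way_tie))
print(describe_modes(three_way_tie))
```

Because the mode only counts how often each label occurs, it works even though the eye colors have no order and no numeric value.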

In our example, hair color is measured on an ordinal scale of measurement, since we have no evidence that the ten shades of hair color are equally distant from each other. To describe average hair color, the researcher could use the mode, the median, or perhaps both. The median is defined as the middle point in a set of scores, the point below which 50% of the scores fall. The median is especially useful because it provides information about the distribution of other scores in the set. If the median hair color is the eighth darkest hair category, then we know that half of the participants had hair in categories 8 to 10, and that the other half of the participants had hair in categories 1 to 8.
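A median for ordinal categories can be found without doing any arithmetic on the category codes. The sketch below uses `statistics.median_low` from the Python standard library, which always returns a value that actually occurs in the data; the list of hair shades is hypothetical.

```python
from statistics import median_low

# Hypothetical hair-shade categories (1 = lightest, 10 = darkest).
hair_shades = [3, 8, 9, 8, 10, 5, 8, 2, 9]

# median_low sorts the data and picks the middle value (the lower of the
# two middle values when the count is even), so no averaging is performed.
median_shade = median_low(hair_shades)

at_or_below = sum(1 for s in hair_shades if s <= median_shade)
at_or_above = sum(1 for s in hair_shades if s >= median_shade)

print(median_shade, at_or_below, at_or_above)
```

With these nine participants the median category is 8, and at least half of the participants fall at or below it and at least half at or above it, which is the sense in which the median locates the middle of the distribution.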

Finally, our researcher will want to describe the participants' average weight. The researcher could use the mode or the median here, or the researcher might choose to use the mean. The mean is the arithmetic average of the scores in a distribution; it is calculated by adding up the scores in the distribution and dividing by the number of scores. The mean is probably the most commonly used type of average, in part because it is mathematically very manipulable. It is difficult to write a formula that describes how to calculate the mode or median, but it is not difficult to write a formula for adding a set of scores and dividing the sum by the number of scores. Because of this, the mean can be embedded within other formulas.

The mean does, however, have its limitations. Scores that are inordinately large or small (called outliers) are given as much weight as every other score in the distribution. This can affect the mean score, which will be inflated if the outlier is large and deflated if the outlier is small. For example, suppose a set of exam scores is 82, 88, 84, 86, and 20. The mean of these scores is 72, although four of the five people earned scores in the 80s. The inordinately small score, the outlier 20, deflated the mean. Researchers need to watch out for this problem when using means. Nevertheless, the mean is still a very popular average. The mean can be used with data measured on interval and ratio scales. It is sometimes used with numerical ordinal data (such as rating scales), but it cannot be used with rank-order data or data measured on a nominal scale.
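The effect of the outlier in the exam-score example can be checked directly. The sketch below uses the five scores from the text and the standard-library functions `statistics.mean` and `statistics.median`; comparing against the median and against the mean without the outlier is our own addition for illustration.

```python
from statistics import mean, median

exam_scores = [82, 88, 84, 86, 20]  # the scores from the example above

mean_with_outlier = mean(exam_scores)
median_with_outlier = median(exam_scores)

# Drop the outlier (20) to see how far it pulled the mean down.
without_outlier = [s for s in exam_scores if s != 20]
mean_without_outlier = mean(without_outlier)

print(mean_with_outlier)     # 72
print(median_with_outlier)   # 84
print(mean_without_outlier)  # 85
```

The single score of 20 lowers the mean from 85 to 72, while the median stays in the 80s, where four of the five scores actually lie.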

The mode, median, and mean are ways of describing the average score among a set of data. They are often called measures of central tendency because they tend to describe the scores in the middle of the distribution (although the mode need not be in the middle at all).

A researcher observes cars entering a parking lot and records the gender of the driver, the number of people in the car, the type of car (Ford, Chevrolet, Mazda, etc.), and the speed at which the car drives through the lot (measured with a radar gun in mph).

a. For each type of data measured, what would be an appropriate average to calculate (mode, median, and/or mean)?
b. One driver traveled through the parking lot at a speed 20 mph higher than any other driver. Which type of average would be most affected by this one score?

Another important characteristic of a set of data is the degree to which the scores are close to the average or are spread out. Statistics that describe this characteristic are called measures of dispersion.

Measures of Dispersion

Although they can be used with nominal and ordinal data, measures of dispersion are used primarily with interval or ratio data. The most straightforward measure of dispersion is the range. The range describes the number of score values over which the scores are spread, in either a discrete data set or a continuous distribution. In a discrete data set, fractions of scores are not possible, such as the number of times a sample of women have been pregnant; as they say, you can't be a little pregnant; she either is or she isn't. In a continuous distribution, fractions of scores are possible, such as the weights of the people in a sample. The range is computed by subtracting the lowest score from the highest score and adding 1:

Range = Highest - Lowest + 1

We add 1 so that the range will include both the highest value and the lowest value. For example, if the weights of the people in a sample include 110, 125, 176, and 207 pounds, with 110 the lightest and 207 the heaviest, the range is

207 - 110 + 1 = 98

The sample of scores covers 98 pounds from the lightest to the heaviest.


The range tells us over how many scores the data are spread, but it does not give us any information about how the scores are distributed over the range. It is limited because it relies on only two scores from the entire distribution. But it does provide us with some useful information about the spread of the scores, and it is appropriate for use with ordinal, interval, and ratio data.
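The range as defined in this chapter (highest minus lowest, plus 1) takes one line to compute. In the Python sketch below, the function name `inclusive_range` and both lists of weights are illustrative assumptions, chosen so that the lightest and heaviest values are 110 and 207 pounds.

```python
def inclusive_range(scores):
    """Range as defined in this chapter: highest - lowest + 1."""
    return max(scores) - min(scores) + 1


# Two hypothetical samples with the same lightest and heaviest person.
spread_out = [110, 125, 176, 207]
clustered = [110, 150, 150, 150, 150, 207]

print(inclusive_range(spread_out))  # 98
print(inclusive_range(clustered))   # 98
```

Both samples have a range of 98 even though the second is tightly clustered around 150 pounds, which illustrates the limitation just described: the range depends on only the two most extreme scores.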

A more commonly used measure of dispersion is the standard deviation. The standard deviation may be thought of as expressing the average distance that the scores in a set of data fall from the mean. For example, imagine that the mean score on an exam was 74. If the class all performed about the same, the scores might range from 67 to 81; this set of data would have a relatively small standard deviation, and the average distance from the mean of 74 would be fairly small. On the other hand, if the members of the class performed less consistently (if some did very well, but others did quite poorly, perhaps with scores ranging from 47 to 100), the standard deviation would be quite large; the average distance from the mean of 74 would be fairly big.

The standard deviation and its counterpart, the variance (the standard deviation squared), are probably the most commonly used measures of dispersion. They are used individually and also are embedded within other more complex formulas. To calculate a standard deviation or variance, you need to know the mean. Because we typically calculate a mean with data measured on interval or ratio scales, standard deviation and variance are not appropriate for use with nominal data.
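The two classes described above can be imitated with made-up scores. In the Python sketch below, both score lists are hypothetical and were chosen so that each has a mean of 74. The code uses the population formulas `pstdev` and `pvariance` from the standard library; the formula presented in the book's appendix may differ (for example, by dividing by N - 1), so treat the exact values as illustrative only.

```python
from statistics import mean, pstdev, pvariance

# Hypothetical exam scores; both classes have a mean of 74.
consistent_class = [67, 70, 74, 74, 78, 81]
inconsistent_class = [47, 60, 74, 74, 89, 100]

for label, scores in (("consistent", consistent_class),
                      ("inconsistent", inconsistent_class)):
    sd = pstdev(scores)      # population standard deviation
    var = pvariance(scores)  # population variance, the square of sd
    print(label, mean(scores), round(sd, 2), round(var, 2))
```

The means are identical, but the standard deviation of the inconsistent class is several times larger, which matches the description of scores falling, on average, much farther from 74.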

Learning to calculate standard deviation and variance is not necessary for the purposes of this book (although it is presented in appendix A). The underlying concept (the notion of how spread out or clustered the data are) is important, however, especially in research where two or more groups of data are being compared. This issue will be discussed a little later in the chapter.

The weather report includes information about the normal temperature for the day. Suppose that today the temperature is 10 degrees above normal. To determine if today is a very strange day or not especially strange, we need to know the standard deviation. If we learn that the standard deviation is 15 degrees, what might we conclude about how normal or abnormal the weather is today? If the standard deviation is 5 degrees, what does this suggest about today's weather?

Measures of Relationships

Often a researcher will want to know more than the average and degree of dispersion for different variables. Sometimes, the researcher wants to learn how much two variables are related to one another. In this case, the researcher would want to calculate a correlation. A correlation is a measure of the degree of relationship between two variables. For example, if we collected data on the number of hours students studied for a midterm exam and the grades received on that exam, a correlation could be calculated between the hours studied and the midterm grade. We might find that those with higher midterm grades tended to study more hours, while those with lower midterm grades tended to study for fewer hours. This is described as a positive correlation. With a positive correlation, an increase in one variable is accompanied by an increase in the other variable. With a negative correlation, by contrast, an increase in one variable is accompanied by a decrease in the other variable. A possible negative correlation might occur between the number of hours spent watching television the night before an exam and the scores on the exam. As the number of hours of viewing increases, the exam scores decrease.

A mathematical formula is used to calculate a correlation coefficient, and the resulting number will be somewhere between -1.00 and +1.00. The closer the number is to either +1.00 or -1.00, the stronger the relationship between the variables is. The closer the number is to 0.00, the weaker the correlation is. Thus, +.85 represents a relatively strong positive correlation, but +.03 represents a weak positive correlation. Similarly, -.91 represents a strong negative correlation, but -.12 represents a relatively weak negative correlation. The strength of the relationship is represented by the absolute value of the correlation coefficient. The direction of the relationship is represented by the sign of the correlation coefficient. Therefore, -.91 represents a stronger correlation than does +.85.
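The chapter does not give the formula for the correlation coefficient, so the following Python sketch should be read as one common way of computing a product-moment correlation, not as the book's own procedure. The function name `pearson_r` and all of the data (hours studied, hours of television, exam scores) are hypothetical.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Product-moment correlation for paired scores from the same participants.

    Assumes both lists have the same length (at least 2) and that neither
    variable is constant; a constant variable would cause division by zero.
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need paired scores from at least two participants")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sum_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sum_xx = sum((x - mean_x) ** 2 for x in xs)
    sum_yy = sum((y - mean_y) ** 2 for y in ys)
    return sum_xy / sqrt(sum_xx * sum_yy)


# Hypothetical paired data: each position is one student.
hours_studied = [1, 2, 3, 4, 5, 6]
midterm_scores = [55, 60, 58, 70, 75, 82]
hours_of_tv = [0, 1, 2, 3, 4, 5]
exam_scores = [90, 85, 80, 70, 72, 60]

r_study = pearson_r(hours_studied, midterm_scores)
r_tv = pearson_r(hours_of_tv, exam_scores)

print(round(r_study, 2), round(r_tv, 2))
print(abs(r_tv) > abs(r_study))  # strength is judged by absolute value
```

The sign of each coefficient gives the direction of the relationship, and the absolute value gives its strength, so a negative coefficient can describe the stronger of two relationships.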

A particular type of graph called a scattergram is used to demonstrate the relationship between two variables. The two variables (typically called the x and the y variables) are plotted on the same graph. The x variable is plotted along the horizontal x-axis, and the y variable is plotted along the vertical y-axis. Figure 4.1 is a scattergram of the hypothetical data for number of hours studied and midterm exam scores.

Each point on figure 4.1 represents the two scores for each person. To calculate a correlation there must be pairs of scores generated by one set of participants, not two separate sets of scores generated by separate sets of participants. Notice that the points tend to form a pattern from the lower left corner to the upper right corner. This lower left to upper right pattern is an indication of a positive correlation. For a negative correlation, the points show a pattern from the top left corner to the bottom right corner. Furthermore, the more closely the points fall along a straight line, the stronger the correlation between the two variables. Figure 4.2 presents several scattergrams representing positive and negative correlations of various strengths.

Several types of correlations can be calculated. The two most common are Pearson's product-moment correlation (more often called Pearson's r) and Spearman's rho (for which the corresponding Greek symbol is ρ). Pearson's r is used when the two variables being correlated are measured on interval or ratio scales. When one or both variables are measured on an ordinal scale, especially if the variables are rank-ordered, Spearman's rho is appropriate. Other correlation coefficients can be calculated for situations when, for example, one variable is measured on an

Figure 4.1 Scattergram of midterm exam scores and numbers of hours spent studying for the exam (horizontal axis: midterm exam score)

Figure 4.2 Scattergrams representing correlations of different strengths and directions: (a) strong positive, (b) strong negative, (c) weak positive, (d) weak negative, (e) no correlation


7/8

76 ('lt,tpl1'1.

l,'prrl.

Table

4.1

Statistical

Technique

Some

Appropriate

statistics

for

Different

,,..r.ilrrJl

cales

of

Measurement

I Irt' liolt' ol St.ttistics

itr lit'st',tt'r lt

77

'l'hcrc

irrc

thrce ways tcl

mcasure an averagc:

thc moclr', tltt'rttt'tli,rrr,

ancl thc

mean. The mode is the most

frequent

score;

the

mediartr is tltt't'r'tt

tral

scclre;

and the

mean is the arithmetic average of

the data set.

Measures of

dispersion

provide

information about

how clusterecj

together or spread out

the data are

in

a

distribution.

The range describes

the

number

of score

values the

data are spread across.

The variance

and

standard

deviation provide

information about

the average

distance the

scores fall from the

mean.

A

researcher

might

also

ask

if

two

variables

are

related

to

each

other.

This

question

is answered by calculating

a correlation coefficient.

The cor-

relation

coefficient

is

a

number between

-1.00

and

+1.00. The

closer

the

coefficient

is to

either

-1.00

or

+1.00,

the stronger

the correlation

is.

The

negative and positive signs

indicate whether the

variables are changing

in

the same

direction

(a

positive correlation)

or

in

opposite

directions

(a

negative

correlation).

Finally,

a

researcher may wish to compare

sets of scores

in

order

to

determine

if

an

independent variable

had an

effect

on a

dependent

vari-

able. A number of statistical techniques can

be used to

look

for

this

differ-

ence.

The appropriate technique

depends on

a number of

factors, such

as

the

number

of groups

being compared

and the scale of

measurement on

which the data were collected.

If data at the ratio or

interval

level

were collected,

the statistical

tech-

niques

that look for

differences between groups

have the same

underly-

ing logic. A difference between groups

is

considered

to exist

when

the

variation among the scores

between the groups

is considerably greater

than the

variation among the scores

ruithin the group.

When data are measured on ordinal

or

nominal scales, other

statisti-

cal techniques can

be used; these tend to be

less

powerful

than

those used

for data on

ratio and interval scales,

though.

Statistical

techniques

are necessary

to test research

hypotheses once

data have been collected.

Knowledge of

this field

is

essential

for research

psychologists.

IvrponrANT

TEnvrs AND

CoNcnprs

analysis of

variance

(ANOVA)

median

between-groups

variance

mode

bimodal

correlation

descriptive statistics

error

variance

interval scale

mean

measurement

multimodal

negative

correlation

nominal scale

nonparametric tests

ordinal scale

outliers

parameter

measures

of central

tendency parametric

tests

Scales

of

Measurement

Nominal

Ordinal

lnterval

Ratio

.

Averages

mode

mode,

median

mode,

median,

mean

mode,

median,

mean

.

Measures

of

dispersion

range, s.d.,"

variance

range,

s.d.,

variance

.

Correlations

eQhi)

coefficient

Spearman's

p

Pearson's

r

Pearson's

r

4.

Single

group

compared

to

a

population

72

Goodness--

72

cooaness-

of-Fit

of-Fit

z-test,

single_

sample

t

z-test,

single_

sample

t

5.

Two

separate

grouPs

72

Tolb

x2

Tol

Wilcoxon's

rank-sum,

72

Tol

Wilcoxon's

rank-sum,

72

Tol,

indepen-

5.

Three

or

more

grouPs

72

Tol

dent-samples

t

72

Tol

ANOVA,

Kruskal-Wallis

ANOVA,

Kruskal-Wallis

.

One group

tested

twice

Mann-Whitney

Mann-Whitney

U,

dependent-

samples

t

U,

dependent-

samples

t

Standard

a"ui".ioi---

b

77

turt

of

independence

SurvuvlARy

Researchers use statistics to help them test their hypotheses. Often, statistics are used to generalize the results from a sample to a larger population. Which type of statistical technique is used depends on the scale of measurement on which the data are collected. Data measured on a nominal scale are classified in different categories. Order is not important for nominal data, but it is for data measured on an ordinal scale of measurement. The data measured on an interval scale of measurement are also ordered but, in addition, the units of measurement are equal throughout the scale. The ratio scale of measurement is much like the interval scale, except that it includes a true zero, which indicates a zero amount of the construct being measured. The scale of measurement for the data and the question being asked by the researcher determine what statistical technique should be used. When describing data, descriptive statistics are used. These include averages and measures of dispersion.
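As a rough illustration of the measures of dispersion named in this chapter (range, variance, standard deviation), here is a short Python sketch. The scores are hypothetical, the function name `dispersion` is invented for the example, and the population formulas (dividing by N) are an assumption; a course may instead use the sample formulas that divide by N - 1.

```python
from statistics import pstdev, pvariance


def dispersion(scores):
    """Range, variance, and standard deviation of a list of scores."""
    return {
        "range": max(scores) - min(scores),
        # Assumption: population formulas, which divide by N.
        "variance": pvariance(scores),
        "standard deviation": pstdev(scores),
    }


scores = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical interval or ratio scores
print(dispersion(scores))          # range 7, variance 4, standard deviation 2.0
```

The standard deviation is the square root of the variance, so it is expressed in the same units as the scores themselves.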

measures of dispersion
population



positive correlation
range
ratio scale
sample
scattergram
standard deviation
t-test
variance
within-group variance


Concept Question 4.2
a. For gender, the mode; for the number in the car, the median and/or mode; for the type of car, the mode; for the speed, the mean, median, and/or mode.
b. The mean.

Concept Question 4.3
If the standard deviation is 15, a day that is 10 degrees above the normal temperature is not an unusually warm day; however, if the standard deviation is 5, a day that is 10 degrees above the normal temperature is twice the average distance from the mean (roughly), and thus is an unusually warm day.
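The reasoning in this answer amounts to dividing the difference from the mean by the standard deviation. A minimal Python sketch follows; the function name is hypothetical and the figures are the ones given in the answer.

```python
def standard_deviations_from_mean(difference, standard_deviation):
    """Express a difference from the mean in standard-deviation units."""
    return difference / standard_deviation


for sd in (15, 5):
    z = standard_deviations_from_mean(10, sd)
    print(f"s.d. = {sd}: 10 degrees above normal is {z:.2f} standard deviations out")
```

With a standard deviation of 15 the day sits about 0.67 standard deviations above normal, and with a standard deviation of 5 it sits 2 standard deviations above normal.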

Exercises
1. Nominal: license plate numbers, eye color. Ordinal: ordered preference for five types of cookies, class rank. Interval: degrees Fahrenheit, money in your checking account (assuming you can overdraw). Ratio: loudness in decibels, miles per gallon. There are of course any number of other correct answers.

3. a. range, standard deviation, and variance
   b. range, standard deviation, and variance
   c. range

5. A positive correlation describes a relationship in which two variables change together in the same direction. For example, if the number of violent crimes increases as crowding increases, that would be a positive correlation. A negative correlation describes a relationship in which two variables change together in opposite directions. For instance, if weight gained increases as the amount of exercise decreases, that would be a negative correlation.
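The sign of a correlation can be checked numerically. The Python sketch below computes Pearson's r by hand for two hypothetical data sets modeled on the examples in this answer; the numbers are invented for illustration and the helper name `pearson_r` is an assumption.

```python
from math import sqrt
from statistics import mean


def pearson_r(xs, ys):
    """Pearson's r: covariation of x and y scaled by their separate variation."""
    mean_x, mean_y = mean(xs), mean(ys)
    sum_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sum_xx = sum((x - mean_x) ** 2 for x in xs)
    sum_yy = sum((y - mean_y) ** 2 for y in ys)
    return sum_xy / sqrt(sum_xx * sum_yy)


# Hypothetical data: crimes rise with crowding (same direction).
crowding = [1, 2, 3, 4, 5]
violent_crimes = [2, 4, 5, 4, 7]

# Hypothetical data: weight gained falls as exercise rises (opposite directions).
exercise_hours = [1, 2, 3, 4, 5]
weight_gained = [7, 4, 5, 4, 2]

print(pearson_r(crowding, violent_crimes))       # positive value
print(pearson_r(exercise_hours, weight_gained))  # negative value
```

The sign of r gives the direction of the relationship, while its distance from zero gives the strength.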

EXERCISES

1. Provide examples of variables corresponding to each of the scales of measurement.
2. a. If a researcher measures height in inches, what averages might be calculated?
   b. If a researcher measures height by assigning people to the categories short, medium, and tall, what averages might be calculated?

3. a. If a researcher measures weight in pounds, what measures of dispersion could be calculated?
   b. If a researcher measures weight in ounces, what measures of dispersion could be calculated?
   c. If a researcher measures weight by assigning each person to either [...], what measures of dispersion could be calculated?

4. Which correlation is stronger: -.97 or +.55?

5. What is the difference between a positive and a negative correlation? Provide an example other than the one in the chapter.

6. A researcher studying the effect of a speed-reading course on reading times compares the scores of a group that has taken the course with those of a control group. The researcher finds that the ratio of the variation between the groups to the variation within the group is equal to .76. A colleague does a similar study and finds a ratio of between-groups variation to within-group variation of 7.32. Which ratio is more likely to suggest a significant difference between reading groups?

ANSWERS TO CONCEPT QUESTIONS AND ODD-NUMBERED EXERCISES

Note: There will often be more than one correct answer for each of these questions. Consult with your instructor about your own answers.

Concept Question 4.1
a. ratio
b. ordinal or interval
c. nominal
d. ratio
e. nominal
f. ordinal
