4 role of statistics in research

Upload: kong-tzeh-fung

Post on 04-Feb-2018

TRANSCRIPT

  • 7/21/2019 4 Role of Statistics in Research

    1/8

    The Role of Statistics in Research

    SCALES OF MEASUREMENT
    Nominal Scale
    Ordinal Scale
    Interval Scale
    Ratio Scale
    Importance of Scales of Measurement

    ... project is meant to generalize. A population can be as broadly defined as all of the people in the world, or all living organisms; or the population can be more narrowly defined: all 18- to 22-year-olds, or all psychology majors at a particular school. Typically, a researcher cannot test all of the members of a target population. Instead, only a small percentage of that population, a sample of the members, can be tested. The sample is meant to represent the entire population, so that if we identify a characteristic of the sample (for example, that the men in the sample are taller than the women), [...] that a characteristic of our sample can be generalized to the population.

    Let's assume that a researcher has conducted a memory experiment, and the mnemonic (memory [...]) group performs better than the control group. Perhaps the participants in the mnemonic group recalled an average of 18 out of 20 words, and the control group participants recalled an average of 16 out of 20 words. At this point, the researcher has not yet supported the alternative hypothesis that mnemonic instructions lead to better performance, even though 18 is greater than 16. The researcher [...] know that the sample of participants in the [...].

    This chapter offers a nontechnical introduction to some statistical concepts. The purpose is to familiarize you with some of the terms and [...] organizing and analyzing data statistically, so that you will [...] encounter as part of your studies.

    SCALES OF MEASUREMENT

    [...] satisfactory work depends on using the right tools. [...] statistical tools; [...] the most [...] technique is to identify the type of data being analyzed.

    In psychology, researchers assume that anything that exists, be it a [...] psychological construct, [...] how happy [...] measurement. It entails identifying a characteristic, your happiness, and quantifying the amount of happiness you are experiencing. The rules used in this example are that the number you assign to your happiness must be between 1 and 10, where 1 refers to a lack of happiness and 10 refers to an abundance of happiness.

    Not all measurement systems are equivalent. Some measurements can be mathematically manipulated (for instance, by adding a constant or by taking the square root of each number) but still keep their primary characteristics. Other systems are very intolerant of any mathematical manipulation; adding a constant or taking the square root renders the data meaningless. Measurement systems can be assigned to one of four scales of measurement that vary by the level of mathematical manipulation they can tolerate. These four scales of measurement (also called levels of measurement) are the nominal, ordinal, interval, and ratio scales.

    Nominal Scale

    The nominal scale of measurement merely classifies objects or individuals as belonging to different categories. The order of the categories is arbitrary and unimportant. Thus, participants might be categorized as male or female, and the male category may be assigned the number 1 and the female category assigned the number 2. These numbers say nothing about the importance of one category as compared to the other. The numbers could just as well be 17.35 and 29.46. Other examples of nominal scales of measurement are numbers on basketball players' jerseys or the numbers assigned by the Department of Motor Vehicles to the license plates of cars. Numbers, when used in a nominal scale of measurement, serve as labels only, and provide no information on the magnitude or amount of the characteristic being measured.

    Ordinal Scale

    An ordinal scale differs from a nominal scale in that the order of the categories is important. A grading system with the grades A, B, C, D, and F is an ordinal scale. The order of the categories reflects a decrease in the amount of the stuff being measured: in this case, knowledge. Note, however, that the distance between the categories is not necessarily equal. Thus, the difference between one A and one B is not necessarily the same as the difference between another A and another B. Similarly, the difference between any A and B is not necessarily the same as the difference between a B and a C.

    Rank-order data are also measured on an ordinal scale. An observer may rank-order participants according to attractiveness, or a researcher may ask tasters to rank-order a number of crackers according to saltiness. When your eyesight is tested and you are asked to choose which of two lenses results in a clearer image, you are being asked to provide ordinal data. Again, when data are rank-ordered, a statement is being made about the magnitude or amount of the characteristic being measured, but the intervals between units need not be equivalent. If seven people are rank-ordered on attractiveness, the difference in attractiveness between the first and second person is not necessarily the same as the difference between the second and the third. The first and second persons may both be very attractive, with only the smallest difference between them, while the third person might be substantially less attractive than the second.
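The contrast between nominal labels and ordinal ranks can be sketched in a few lines of Python. This is only an illustration: the sample codes and the underlying "attractiveness scores" are invented, and all variable names are assumptions rather than anything from the chapter.

```python
from collections import Counter

# Nominal scale: the numeric codes are labels only, so swapping them
# for any other distinct values leaves the category counts unchanged.
codes = [1, 2, 2, 1, 2]                # hypothetical sample: 1 = male, 2 = female
relabel = {1: 17.35, 2: 29.46}         # the alternative labels mentioned in the text
recoded = [relabel[c] for c in codes]

counts_original = sorted(Counter(codes).values())
counts_recoded = sorted(Counter(recoded).values())

# Ordinal scale: ranks preserve order but discard the size of the gaps.
scores = {"first": 9.8, "second": 9.7, "third": 4.0}   # made-up underlying scores
ordered = sorted(scores, key=scores.get, reverse=True)
ranks = {name: position for position, name in enumerate(ordered, start=1)}

gap_first_second = scores["first"] - scores["second"]
gap_second_third = scores["second"] - scores["third"]

print(counts_original == counts_recoded)      # relabeling changes nothing
print(ranks)                                  # adjacent ranks differ by exactly 1
print(gap_second_third > gap_first_second)    # but the underlying gaps differ
```

The ranks alone would never reveal that the third person sits far below the other two, which is the limitation of ordinal data described above.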

    Interval Scale

    The interval scale of measurement is characterized by equal units of measurement throughout the scale. Thus, measurements made with an interval scale provide information about both the order and the relative quantity of the characteristic being measured. Interval scales of measurement, however, do not have a true zero value. A true zero means that none of the characteristic being measured remains. Temperature measurements in degrees Fahrenheit or in degrees Celsius (also called centigrade) correspond to interval scales. The distance between degrees is equal over the full length of the scale; the difference between 20° and 40° is the same as that between 40° and 60°. In neither scale, however, is there a true zero; zero simply represents another point on the scale, and negative numbers are possible and meaningful. Because there is no true zero on these scales, it is inappropriate to say that 40° is twice as warm as 20°. (In other words, ratios cannot be computed with interval scale data.)
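One way to see why ratios fail on an interval scale is to express the same two temperatures in both units. The sketch below uses the standard Celsius-to-Fahrenheit conversion; the function and variable names are illustrative choices, not part of the chapter.

```python
def celsius_to_fahrenheit(celsius):
    """Convert degrees Celsius to degrees Fahrenheit."""
    return celsius * 9 / 5 + 32

low_c, mid_c, high_c = 20.0, 40.0, 60.0
low_f, mid_f, high_f = (celsius_to_fahrenheit(t) for t in (low_c, mid_c, high_c))

# Equal intervals survive the change of units:
# both 20-degree Celsius steps become 36-degree Fahrenheit steps.
step_1_f = mid_f - low_f
step_2_f = high_f - mid_f

# Ratios do not survive: the "twice as warm" claim depends on the unit chosen.
ratio_c = mid_c / low_c      # 40 / 20
ratio_f = mid_f / low_f      # 104 / 68

print(step_1_f == step_2_f)  # True: intervals are meaningful
print(ratio_c, ratio_f)      # different ratios for the same two temperatures
```

Because the ratio changes when only the unit changes, it cannot describe anything real about the temperatures themselves.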

    There is a controversy among psychological researchers regarding interval and ordinal scales in relation to ratings. Suppose that a participant is asked to rate something on a scale with particular end points, such as 1 to 7 or 0 to 5. For example, a person might be asked the following rating question:

    How satisfied are you with your friendships?

    1     2     3     4     5     6     7     8     9     10
    very dissatisfied                          very satisfied

    The end numbers usually have labels, but the middle numbers sometimes do not. The controversy arises as to whether the ratings should be considered ordinal data or interval data. What has never been ascertained is whether the scales that people use in their heads have units of equal size. If the units are equal in size, the data could be regarded as interval data; if they are unequal, the data should be regarded as ordinal data. This is a point of contention because interval data often permit the use of more powerful statistics than do ordinal data.

    There is still no consensus about the nature of rating scale data. In some research areas, ratings tend to be treated cautiously and are considered ordinal data. In other areas, such as language and memory studies, where participants may be asked to rate how familiar a phrase is or how strong their feeling of knowing is, ratings tend to be treated as interval data. The particular philosophy of any particular area of study is possibly best ascertained from previous research in that area.

    Ratio Scale

    The ratio scale of measurement provides information about order; all units are of equal size throughout the scale, and there is a true zero value that represents an absence of the characteristic being measured. The true zero allows ratios of values to be formed. Thus, a person who is 50 years old is twice as old as a person who is 25. Age in years is a ratio scale. Each year represents the same amount of time no matter where it occurs on the scale; the year between 20 and 21 years of age is the same amount of time as the year between 54 and 55.

    As you may have noticed, the scales of measurement can be arranged hierarchically from nominal to ratio. Starting with the ordinal scale, each scale includes all the capabilities of the preceding scale plus something new. Thus, nominal scales are simply categorical, while ordinal scales are categorical with the addition of ordering of the categories. Interval scales of measurement involve ordered categories of equal size; in other words, the intervals between numbers on the scale are equivalent throughout the scale. Ratio scales also have equal intervals but, in addition, begin at a true zero score that represents an absence of the characteristic being measured and allows for the computation of ratios.
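The hierarchy just described, where each scale keeps the capabilities of the one before it and adds something new, can be written down as a small lookup. The property labels and the function name are shorthand invented for this sketch.

```python
# Scales listed from least to most informative.
SCALES = ["nominal", "ordinal", "interval", "ratio"]

# What each scale adds on top of the preceding one (labels are shorthand).
NEW_PROPERTY = {
    "nominal": "categories",
    "ordinal": "ordered categories",
    "interval": "equal intervals",
    "ratio": "true zero",
}

def properties(scale):
    """Return the properties of a scale: its own plus those of every preceding scale."""
    position = SCALES.index(scale)
    return [NEW_PROPERTY[name] for name in SCALES[: position + 1]]

for name in SCALES:
    print(name, "->", properties(name))
```

Reading the output top to bottom reproduces the progression in the paragraph above: only the ratio scale ends with a true zero, which is what permits ratios.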

    Importance of Scales of Measurement

    The statistical techniques that are appropriate for one scale of measurement may not be appropriate for another. Therefore, the researcher must be able to identify the scale of measurement being used, so that appropriate statistical techniques can be applied. Sometimes, the inappropriateness of a technique is subtle; at other times, it can be quite obvious, and quite embarrassing to a researcher who lets an inappropriate statistic slip by. For example, imagine that ten people are rank-ordered according to height. In addition, information about the individuals' weight in pounds and age in years is recorded. When instructing the computer to calculate arithmetic averages, the researcher absentmindedly includes the height rankings along with the other variables. The computer calculates that the average age of the participants is 22.6 years, that the average weight of the group is 155.6 pounds, and that the average height is 5.5. Calculating an average of ordinal data, such as the height rankings in this example, will yield little useful information. Meaningful results will only be obtained by using the statistical technique appropriate to the data's scale of measurement.
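A short sketch shows why averaging ranks is uninformative: any ten untied ranks average to the same number, whatever the people's actual heights. The two groups of heights (in inches) and the helper name are invented for illustration.

```python
from statistics import mean

def to_ranks(values):
    """Rank values from 1 (largest) to n (smallest); assumes there are no ties."""
    ordered = sorted(values, reverse=True)
    return [ordered.index(value) + 1 for value in values]

# Two hypothetical groups of ten people, heights in inches.
short_group = [58, 59, 60, 61, 62, 63, 64, 65, 66, 67]
tall_group = [68, 69, 70, 71, 72, 73, 74, 75, 76, 77]

mean_rank_short = mean(to_ranks(short_group))
mean_rank_tall = mean(to_ranks(tall_group))

print(mean(short_group), mean(tall_group))   # the real averages differ by ten inches
print(mean_rank_short, mean_rank_tall)       # the average rank is identical
```

The mean of the ranks depends only on how many people were ranked, so it says nothing about how tall the group is.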

    On what scale of measurement would each of the following data be measured?

    a. The number of dollars in one's wallet.
    b. The rated sweetness of a can of soda.
    c. Whether one responds yes or no to a question.
    d. Height measured in inches.
    e. The gender of individuals.
    [...]

    TYPES OF STATISTICAL TECHNIQUES

    Having recognized the type of data collected, the researcher needs also to consider the question that he or she wants to answer. You can't tighten a screw with a hammer, and you can't answer one research question with a statistical test meant for a different question. Let's consider three questions that a researcher might ask:

    1. How can I describe the data?
    2. To what degree are these two variables related to each other?
    3. Do the participants in this group have different scores than the participants in the other group?

    These three questions require the use of different types of statistical techniques. The scale of measurement on which the data were collected determines more specifically which statistical tool to use.

    Describing the Data

    When a researcher begins organizing a set of data, it can be very useful to determine typical characteristics of the different variables. The statistical techniques used for this task are aptly called descriptive statistics. Usually, researchers use two types of descriptive statistics: a description of the average score and a description of how spread out or close together the data lie.

    Averages

    Perhaps the most commonly discussed characteristic of a data set is its average. However, there are three different averages that can be calculated: the mode, the median, and the mean. Each provides somewhat different information. The scale of measurement on which the data are collected will, in part, determine which average is most appropriate to use.

    Let's consider a researcher who has collected data on people's weight measured in pounds; hair color categorized according to 10 shades ranging from light to dark; and eye color labeled as blue, green, brown, or other. This researcher has measured data on three different scales of measurement: ratio, ordinal, and nominal, respectively. When describing the eye colors of the participants, the researcher will need to use a different statistic than when describing the participants' average weight.

    To describe the eye color of the participants, the researcher would use the mode. The mode is defined as the score that occurs most frequently. Thus, if most of the participants had brown eyes, brown would be the modal eye color. Sometimes a set of data will have two scores that tie for occurring most frequently. In that case, the distribution is said to be bimodal. If three or more scores are tied for occurring most frequently, the distribution is said to be multimodal.
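As a minimal sketch, the mode of nominal data such as eye color can be found with `multimode` from Python's standard `statistics` module (Python 3.8 or later). The sample data, the helper name `describe_modes`, and the label "unimodal" for the single-mode case are choices made for this illustration.

```python
from statistics import multimode

def describe_modes(values):
    """Return the most frequent value(s) and a label for the kind of distribution."""
    modes = multimode(values)    # all values tied for most frequent, in first-seen order
    if len(modes) == 1:
        label = "unimodal"
    elif len(modes) == 2:
        label = "bimodal"
    else:
        label = "multimodal"
    return modes, label

# Hypothetical eye-color samples.
mostly_brown = ["brown", "blue", "brown", "green", "brown", "other"]
two_way_tie = ["brown", "blue", "brown", "blue", "green"]

print(describe_modes(mostly_brown))   # (['brown'], 'unimodal')
print(describe_modes(two_way_tie))    # (['brown', 'blue'], 'bimodal')
```

Counting is the only operation involved, which is why the mode is usable even when the values are mere labels.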

    In our example, hair color is measured on an ordinal scale of measurement, since we have no evidence that the ten shades of hair color are equally distant from each other. To describe average hair color, the researcher could use the mode, the median, or perhaps both. The median is defined as the middle point in a set of scores, the point below which 50% of the scores fall. The median is especially useful because it provides information about the distribution of other scores in the set. If the median fell in the eighth darkest hair category, then we know that half of the participants had hair in categories 8 to 10, and that the other half of the participants had hair in categories 1 to 8.
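For ordinal categories like the ten hair shades, a median can be taken without any arithmetic on the category numbers. The sketch below uses `median_low` from the standard `statistics` module, which always returns a value that actually occurs in the data; the nine sample shades are invented.

```python
from statistics import median_low

# Hypothetical hair shades for nine participants, coded 1 (lightest) to 10 (darkest).
hair_shades = [2, 5, 8, 8, 9, 10, 3, 8, 10]

median_shade = median_low(hair_shades)   # middle value of the sorted list

# Roughly half of the participants fall at or below the median category,
# and roughly half fall at or above it.
at_or_below = sum(1 for shade in hair_shades if shade <= median_shade)
at_or_above = sum(1 for shade in hair_shades if shade >= median_shade)

print(median_shade)               # 8
print(at_or_below, at_or_above)   # 6 6
```

Only the ordering of the shades is used here, so the result stays valid even though the distances between shades are unknown.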

    Finally, our researcher will want to describe the participants' average weight. The researcher could use the mode or the median here, or the researcher might choose to use the mean. The mean is the arithmetic average of the scores in a distribution; it is calculated by adding up the scores in the distribution and dividing by the number of scores. The mean is probably the most commonly used type of average, in part because it is mathematically very manipulable. It is difficult to write a formula that describes how to calculate the mode or median, but it is not difficult to write a formula for adding a set of scores and dividing the sum by the number of scores. Because of this, the mean can be embedded within other formulas.

    The mean does, however, have its limitations. Scores that are inordinately large or small (called outliers) are given as much weight as every other score in the distribution. This can affect the mean score, which will be inflated if the outlier is large and deflated if the outlier is small. For example, suppose a set of exam scores is 82, 88, 84, 86, and 20. The mean of these scores is 72, although four of the five people earned scores in the 80s. The inordinately small score, the outlier 20, deflated the mean. Researchers need to watch out for this problem when using means. Nevertheless, the mean is still a very popular average. The mean can be used with data measured on interval and ratio scales. It is sometimes used with numerical ordinal data (such as rating scales), but it cannot be used with rank-order data or data measured on a nominal scale.
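The exam-score example can be checked directly. The sketch below uses the five scores from the text and, as an added comparison that the chapter does not make at this point, also computes the median and the mean without the outlier.

```python
from statistics import mean, median

exam_scores = [82, 88, 84, 86, 20]          # scores from the example; 20 is the outlier

mean_all = mean(exam_scores)                # pulled down by the single low score
median_all = median(exam_scores)            # middle score, barely affected by the outlier
mean_without_outlier = mean([s for s in exam_scores if s != 20])

print(mean_all)               # 72
print(median_all)             # 84
print(mean_without_outlier)   # 85
```

The 13-point gap between the two means comes entirely from one score, while the median stays in the 80s with the rest of the class.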

    The mode, median, and mean are ways of describing the average score among a set of data. They are often called measures of central tendency because they tend to describe the scores in the middle of the distribution (although the mode may not be in the middle at all).

    A researcher observes cars entering a parking lot and records the gender of the driver, [...] the type of car (Ford, Chevrolet, Mazda, etc.), and the speed at which the car drives through the lot (measured with a radar gun in mph).

    a. For each type of data measured, what would be an appropriate average to calculate (mode, median, and/or mean)?
    b. One driver traveled through the parking lot at a speed 20 mph higher than any other driver. What type of average would be most affected by this one score?

    Another characteristic of a set of data is the degree to which the scores are close to the average or are spread out. Statistics that describe this characteristic are called measures of dispersion.

    Measures of Dispersion

    Although they can be used with nominal and ordinal data, measures of dispersion are used primarily with interval or ratio data. The most straightforward measure of dispersion is the range. The range is the number of possible values for scores in a discrete data set or continuous distribution. In a discrete data set, fractions of scores are not possible, such as the number of times a sample of women have been pregnant; as they say, you can't be a little pregnant: she either is or she isn't. In a continuous distribution set, fractions of scores are possible, such as the weights of people in a sample. The range is calculated by subtracting the lowest score from the highest and adding 1:

    Range = Highest - Lowest + 1

    We add 1 so that the range will include both the highest value and the lowest value. For example, for a sample of weights running from 110 to 202 pounds:

    202 - 110 + 1 = 93

    The sample of scores covers 93 pounds from the lightest to the heaviest.

    The range tells us over how many scores the data are spread, but it does not give us any information about how the scores are distributed over the range. It is limited because it relies on only two scores from the entire distribution. But it does provide us with some useful information about the spread of the scores, and it is appropriate for use with ordinal, interval, and ratio data.
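The range formula above translates into a one-line function. In the sketch below, only the endpoints 110 and 202 come from the chapter's example; the in-between weights and the function name `inclusive_range` are invented to show that very different distributions can share the same range.

```python
def inclusive_range(scores):
    """Range as defined in the text: highest score minus lowest score, plus 1."""
    return max(scores) - min(scores) + 1

# Hypothetical weights in pounds; both samples run from 110 to 202.
evenly_spread = [110, 133, 156, 179, 202]
clustered_low = [110, 112, 113, 115, 202]

print(inclusive_range(evenly_spread))   # 93
print(inclusive_range(clustered_low))   # 93
```

Both samples give the same answer because only the two extreme scores enter the calculation, which is the limitation the paragraph above points out.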

    A more commonly used measure of dispersion is the standard deviation. The
    standard deviation may be thought of as expressing the average distance
    that the scores in a set of data fall from the mean. For example, imagine
    that the mean score on an exam was 74. If the class all performed about
    the same, the scores might range from 67 to 81; this set of data would
    have a relatively small standard deviation, and the average distance from
    the mean of 74 would be fairly small. On the other hand, if the members of
    the class performed less consistently (if some did very well, but others
    did quite poorly, perhaps with scores ranging from 47 to 100), the
    standard deviation would be quite large; the average distance from the
    mean of 74 would be fairly big.
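
    The exam example can be made concrete with a short Python sketch. The two
    score lists below are hypothetical: they were chosen only so that both
    classes have a mean of 74 and span roughly the ranges mentioned in the
    text (67 to 81, and 47 to 100). The sketch assumes the population form of
    the standard deviation (`statistics.pstdev`); the formula in the book's
    appendix may differ.

```python
from statistics import mean, pstdev

# Hypothetical exam scores; both classes average 74.
consistent_class = [67, 70, 72, 74, 76, 78, 81]
varied_class = [47, 55, 65, 74, 83, 94, 100]

for label, scores in (("consistent", consistent_class), ("varied", varied_class)):
    # pstdev divides by N (population standard deviation).
    print(label, "mean:", mean(scores), "standard deviation:", round(pstdev(scores), 2))
```

    The means are identical, but the second class shows a much larger
    standard deviation because its scores lie farther from 74 on average.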

    The standard deviation and its counterpart, the variance (the standard
    deviation squared), are probably the most commonly used measures of
    dispersion. They are used individually and also are embedded within other
    more complex formulas. To calculate a standard deviation or variance, you
    need to know the mean. Because we typically calculate a mean with data
    measured on interval or ratio scales, standard deviation and variance are
    not appropriate for use with nominal data.
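
    As a rough sketch of why the mean is needed first, the variance can be
    computed as the average squared distance from the mean, and the standard
    deviation as its square root. The function names and scores are invented
    for illustration, and the sketch assumes division by N; some textbooks
    divide by N - 1 when working with samples, so this may not match the
    formula in the appendix.

```python
from math import isclose, sqrt


def variance(scores):
    """Average squared distance from the mean (dividing by N)."""
    mean_score = sum(scores) / len(scores)
    return sum((x - mean_score) ** 2 for x in scores) / len(scores)


def standard_deviation(scores):
    """Square root of the variance."""
    return sqrt(variance(scores))


# Hypothetical scores with a mean of 5.
scores = [2, 4, 4, 4, 5, 5, 7, 9]
print(variance(scores))            # 4.0
print(standard_deviation(scores))  # 2.0

# The variance is the standard deviation squared.
assert isclose(standard_deviation(scores) ** 2, variance(scores))
```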

    Learning to calculate standard deviation and variance is not necessary
    for the purposes of this book (although it is presented in appendix A).
    The underlying concept, the notion of how spread out or clustered the
    data are, is important, however, especially in research where two or more
    groups of data are being compared. This issue will be discussed a little
    later in the chapter.

    The weather report includes information about the normal temperature for
    the day. Suppose that today the temperature is 10 degrees above normal.
    To determine if today is a very strange day or not especially strange, we
    need to know the standard deviation. If we learn that the standard
    deviation is 15 degrees, what might we conclude about how normal or
    abnormal the weather is today? If the standard deviation is 5 degrees,
    what does this suggest about today's weather?

    Measures of Relationships

    Often a researcher will want to know more than the average and degree of
    dispersion for different variables. Sometimes, the researcher wants to
    learn how much two variables are related to one another. In this case,
    the researcher would want to calculate a correlation. A correlation is a
    measure of the degree of relationship between two variables. For example,
    if we collected data on the number of hours students studied for a
    midterm exam and the grades received on that exam, a correlation could be
    calculated between the hours studied and the midterm grade. We might find
    that those with higher midterm grades tended to study more hours, while
    those with lower midterm grades tended to study for fewer hours. This is
    described as a positive correlation. With a positive correlation, an
    increase in one variable is accompanied by an increase in the other
    variable. With a negative correlation, by contrast, an increase in one
    variable is accompanied by a decrease in the other variable. A possible
    negative correlation might occur between the number of hours spent
    watching television the night before an exam and the scores on the exam.
    As the number of hours of viewing increases, the exam scores decrease.
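
    The direction of a correlation can be illustrated with a small Python
    sketch. It uses the standard Pearson product-moment formula, which is one
    common way to compute a correlation and is an assumption here, since the
    chapter does not print the formula. All of the data are hypothetical and
    were invented to mirror the two examples in the text.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation for paired scores from one set of participants."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least two pairs")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sum_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sum_xx = sum((x - mean_x) ** 2 for x in xs)
    sum_yy = sum((y - mean_y) ** 2 for y in ys)
    # Note: this divides by zero if either variable has no variation.
    return sum_xy / sqrt(sum_xx * sum_yy)


# Hypothetical paired data, one pair per student.
hours_studied = [1, 2, 3, 5, 6, 8]
midterm_grade = [58, 65, 70, 78, 85, 92]
tv_hours = [0, 1, 2, 3, 4, 5]
exam_score = [95, 88, 84, 75, 70, 62]

r_study = pearson_r(hours_studied, midterm_grade)
r_tv = pearson_r(tv_hours, exam_score)
print("study hours vs. grade:", round(r_study, 2))  # positive sign
print("TV hours vs. score:", round(r_tv, 2))        # negative sign
```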

    A mathematical formula is used to calculate a correlation coefficient,
    and the resulting number will be somewhere between -1.00 and +1.00. The
    closer the number is to either +1.00 or -1.00, the stronger the
    relationship between the variables is. The closer the number is to 0.00,
    the weaker the correlation is. Thus, +.85 represents a relatively strong
    positive correlation, but +.03 represents a weak positive correlation.
    Similarly, -.91 represents a strong negative correlation, but -.12
    represents a relatively weak negative correlation. The strength of the
    relationship is represented by the absolute value of the correlation
    coefficient. The direction of the relationship is represented by the sign
    of the correlation coefficient. Therefore, -.91 represents a stronger
    correlation than does +.85.
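
    The rule that strength depends on the absolute value while direction
    depends on the sign can be checked with a few lines of Python. The four
    coefficients are the ones used in the paragraph above; the helper name
    `direction` is invented for this sketch.

```python
def direction(r):
    """Describe the direction of a correlation from its sign."""
    if r > 0:
        return "positive"
    if r < 0:
        return "negative"
    return "none"


coefficients = [0.85, 0.03, -0.91, -0.12]

# Sort from strongest to weakest using the absolute value only.
by_strength = sorted(coefficients, key=abs, reverse=True)
for r in by_strength:
    print(f"{r:+.2f} strength={abs(r):.2f} direction={direction(r)}")
```

    Sorting by absolute value places -.91 ahead of +.85, matching the
    conclusion in the text.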

    A particular type of graph called a scattergram is used to demonstrate
    the relationship between two variables. The two variables (typically
    called the x and the y variables) are plotted on the same graph. The x
    variable is plotted along the horizontal x-axis, and the y variable is
    plotted along the vertical y-axis. Figure 4.1 is a scattergram of the
    hypothetical data for number of hours studied and midterm exam scores.
    Each point on figure 4.1 represents the two scores for each person. To
    calculate a correlation there must be pairs of scores generated by one
    set of participants, not two separate sets of scores generated by
    separate sets of participants. Notice that the points tend to form a
    pattern from the lower left corner to the upper right corner. This lower
    left to upper right pattern is an indication of a positive correlation.
    For a negative correlation, the points show a pattern from the top left
    corner to the bottom right corner. Furthermore, the more closely the
    points fall along a straight line, the stronger the correlation between
    the two variables. Figure 4.2 presents several scattergrams representing
    positive and negative correlations of various strengths.

    Several types of correlations can be calculated. The two most common are
    Pearson's product-moment correlation (more often called Pearson's r) and
    Spearman's rho (for which the corresponding Greek symbol is ρ). Pearson's
    r is used when the two variables being correlated are measured on
    interval or ratio scales. When one or both variables are measured on an
    ordinal scale, especially if the variables are rank-ordered, Spearman's
    rho is appropriate. Other correlation coefficients can be calculated for
    situations when, for example, one variable is measured on an

    Figure 4.1 Graph of midterm exam scores and numbers of hours spent
    studying for the exam (scattergram; horizontal axis: midterm exam score)

    Figure 4.2 Scattergrams representing correlations of different strengths
    and directions: (a) strong positive, (b) strong negative, (c) weak
    positive, (d) weak negative, (e) no correlation
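
    For rank-ordered data, the chapter names Spearman's rho as the
    appropriate coefficient. One common way to obtain it, assumed here
    because the chapter does not give the formula, is to convert each
    variable to ranks and then apply the Pearson formula to the ranks. The
    function names and the data below are hypothetical.

```python
from math import isclose, sqrt


def ranks(values):
    """Rank scores from 1 (smallest); tied scores share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover every score tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson_r(xs, ys):
    """Pearson correlation for paired scores (undefined if a list is constant)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sum_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sum_xx = sum((x - mean_x) ** 2 for x in xs)
    sum_yy = sum((y - mean_y) ** 2 for y in ys)
    return sum_xy / sqrt(sum_xx * sum_yy)


def spearman_rho(xs, ys):
    """Spearman's rho computed as Pearson's r on the ranks."""
    return pearson_r(ranks(xs), ranks(ys))


# Hypothetical data: y always rises with x, but not along a straight line.
x_scores = [1, 2, 3, 4, 5]
y_scores = [1, 4, 9, 16, 25]
print(round(pearson_r(x_scores, y_scores), 3))
print(round(spearman_rho(x_scores, y_scores), 3))
```

    Because rho looks only at rank order, it treats any consistently rising
    relationship as a perfect positive correlation, whereas Pearson's r is
    lower when the points do not fall on a straight line.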


    Table 4.1 Some Appropriate Statistics for Different Scales of Measurement

    Statistical Technique


    There are three ways to measure an average: the mode, the median, and the
    mean. The mode is the most frequent score; the median is the central
    score; and the mean is the arithmetic average of the data set.
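
    The three averages can be computed with Python's standard `statistics`
    module. The scores below are hypothetical and chosen only so that each
    average is easy to verify by hand.

```python
from statistics import mean, median, multimode

# Hypothetical scores (invented for illustration).
scores = [70, 72, 72, 75, 86]

modes = multimode(scores)      # most frequent score(s); a list, since ties are possible
central_score = median(scores) # middle score once the data are ordered
average = mean(scores)         # arithmetic average: sum divided by count

print("mode:", modes)
print("median:", central_score)
print("mean:", average)
```

    `multimode` returns a list because a distribution can have more than one
    mode, as with the bimodal and multimodal distributions listed among the
    chapter's terms.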

    Measures of dispersion provide information about how clustered together
    or spread out the data are in a distribution. The range describes the
    number of score values the data are spread across. The variance and
    standard deviation provide information about the average distance the
    scores fall from the mean.

    A researcher might also ask if two variables are related to each other.
    This question is answered by calculating a correlation coefficient. The
    correlation coefficient is a number between -1.00 and +1.00. The closer
    the coefficient is to either -1.00 or +1.00, the stronger the correlation
    is. The negative and positive signs indicate whether the variables are
    changing in the same direction (a positive correlation) or in opposite
    directions (a negative correlation).

    Finally, a researcher may wish to compare sets of scores in order to
    determine if an independent variable had an effect on a dependent
    variable. A number of statistical techniques can be used to look for this
    difference. The appropriate technique depends on a number of factors,
    such as the number of groups being compared and the scale of measurement
    on which the data were collected.

    If data at the ratio or interval level were collected, the statistical
    techniques that look for differences between groups have the same
    underlying logic. A difference between groups is considered to exist when
    the variation among the scores between the groups is considerably greater
    than the variation among the scores within the group. When data are
    measured on ordinal or nominal scales, other statistical techniques can
    be used; these tend to be less powerful than those used for data on ratio
    and interval scales, though.
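
    The between-groups versus within-group logic can be sketched as a ratio
    of two variance estimates. The version below uses the F ratio from a
    one-way analysis of variance, which is an assumption on our part, since
    the summary describes the logic without giving a formula. The function
    name and the reading-time data are invented for illustration.

```python
from math import isclose


def variation_ratio(groups):
    """Between-groups variance estimate divided by within-group estimate."""
    all_scores = [x for group in groups for x in group]
    grand_mean = sum(all_scores) / len(all_scores)
    group_means = [sum(group) / len(group) for group in groups]

    # Between groups: how far each group mean lies from the grand mean.
    ss_between = sum(
        len(group) * (m - grand_mean) ** 2 for group, m in zip(groups, group_means)
    )
    # Within groups: how far each score lies from its own group mean.
    ss_within = sum(
        (x - m) ** 2 for group, m in zip(groups, group_means) for x in group
    )

    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    # Note: this divides by zero if every group has no internal variation.
    return (ss_between / df_between) / (ss_within / df_within)


# Hypothetical reading times: groups far apart with little internal spread.
ratio_far = variation_ratio([[20, 22, 19, 21], [30, 31, 29, 32]])
# Hypothetical reading times: groups that overlap almost completely.
ratio_close = variation_ratio([[25, 30, 20, 27], [26, 29, 21, 28]])
print(round(ratio_far, 2), round(ratio_close, 2))
```

    A ratio well above 1 means the groups differ from each other by more than
    the scores differ within each group; a ratio below 1 means the opposite.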

    Statistical techniques are necessary to test research hypotheses once
    data have been collected. Knowledge of this field is essential for
    research psychologists.

    IMPORTANT TERMS AND CONCEPTS

    analysis of variance (ANOVA)
    median
    between-groups variance
    mode
    bimodal
    correlation
    descriptive statistics
    error variance
    interval scale
    mean
    measurement
    multimodal
    negative correlation
    nominal scale
    nonparametric tests
    ordinal scale
    outliers
    parameter
    measures of central tendency
    parametric tests

    Scales of Measurement: Nominal, Ordinal, Interval, Ratio

    1. Averages
       Nominal: mode
       Ordinal: mode, median
       Interval: mode, median, mean
       Ratio: mode, median, mean

    2. Measures of dispersion
       Interval: range, s.d.(a), variance
       Ratio: range, s.d., variance

    3. Correlations
       Nominal: φ (phi) coefficient
       Ordinal: Spearman's ρ
       Interval: Pearson's r
       Ratio: Pearson's r

    4. Single group compared to a population
       Nominal: χ² Goodness-of-Fit
       Ordinal: χ² Goodness-of-Fit
       Interval: z-test, single-sample t
       Ratio: z-test, single-sample t

    5. Two separate groups
       χ² TOI(b); Wilcoxon's rank-sum; independent-samples t

    6. Three or more groups
       χ² TOI; ANOVA; Kruskal-Wallis

    7. One group tested twice
       Mann-Whitney U; dependent-samples t

    (a) Standard deviation
    (b) χ² test of independence
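
    The first row of Table 4.1 can be expressed as a simple lookup, which
    shows how the scale of measurement limits the choice of average. The
    dictionary and function names are invented for this sketch; only the
    scale-to-average pairings come from the table.

```python
# Averages listed in Table 4.1 for each scale of measurement.
APPROPRIATE_AVERAGES = {
    "nominal": ["mode"],
    "ordinal": ["mode", "median"],
    "interval": ["mode", "median", "mean"],
    "ratio": ["mode", "median", "mean"],
}


def averages_for(scale):
    """Return the averages the table lists for a given scale of measurement."""
    key = scale.strip().lower()
    if key not in APPROPRIATE_AVERAGES:
        raise ValueError(f"unknown scale of measurement: {scale!r}")
    return APPROPRIATE_AVERAGES[key]


print(averages_for("Nominal"))
print(averages_for("ratio"))
```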

    SUMMARY

    Researchers use statistics to help them test their hypotheses. Often,
    statistics are used to generalize the results from a sample to a larger
    population. Which type of statistical technique is used depends on the
    scale of measurement on which the data are collected.

    Data measured on a nominal scale are classified in different categories.
    Order is not important for nominal data, but it is for data measured on
    an ordinal scale of measurement. The data measured on an interval scale
    of measurement are also ordered but, in addition, the units of
    measurement are equal throughout the scale. The ratio scale of
    measurement is much like the interval scale, except that it includes a
    true zero, which indicates a zero amount of the construct being measured.

    The scale of measurement for the data and the question being asked by the
    researcher determine what statistical technique should be used. When
    describing data, descriptive statistics are used. These include averages
    and measures of dispersion.

    measures of dispersion
    population


    positive correlation
    range
    ratio scale
    sample
    scattergram
    standard deviation
    t-test
    variance
    within-group variance

    EXERCISES


    Concept Question 4.2
    a. For gender, the mode; for the number in the car, the median and/or
       mode; for the type of car, the mode; for the speed, the mean, median,
       and/or mode.
    b. The mean.

    Concept Question 4.3
    If the standard deviation is 15, a day that is 10 degrees above the
    normal temperature is not an unusually warm day; however, if the standard
    deviation is 5, a day that is 10 degrees above the normal temperature is
    twice the average distance from the mean (roughly), and thus is an
    unusually warm day.
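
    The reasoning in this answer amounts to dividing the day's departure from
    normal by the standard deviation. A minimal Python sketch follows; the
    function name is invented, and the numbers are the ones given in the
    question.

```python
from math import isclose


def distance_in_sd_units(deviation, standard_deviation):
    """Express a departure from the mean as a multiple of the standard deviation."""
    if standard_deviation <= 0:
        raise ValueError("standard deviation must be positive")
    return deviation / standard_deviation


degrees_above_normal = 10

wide_spread = distance_in_sd_units(degrees_above_normal, 15)
narrow_spread = distance_in_sd_units(degrees_above_normal, 5)
print(round(wide_spread, 2))  # less than one standard deviation from normal
print(narrow_spread)          # two standard deviations from normal
```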

    Exercises
    1. Nominal: license plate numbers, eye color. Ordinal: ordered preference
       for five types of cookies, class rank. Interval: degrees Fahrenheit,
       money in your checking account (assuming you can overdraw). Ratio:
       loudness in decibels, miles per gallon. There are of course any number
       of other correct answers.

    3. a. range, standard deviation, and variance
       b. range, standard deviation, and variance
       c. range

    5. A positive correlation describes a relationship in which two variables
       change together in the same direction. For example, if the number of
       violent crimes increases as crowding increases, that would be a
       positive correlation. A negative correlation describes a relationship
       in which two variables change together in opposite directions. For
       instance, if weight gained increases as the amount of exercise
       decreases, that would be a negative correlation.

    1. Provide examples of variables corresponding to each of the scales of
       measurement.
    2. a. If a researcher measures height in inches, what averages might be
          calculated?
       b. If a researcher measures height by assigning people to the
          categories short, medium, and tall, what averages might be
          calculated?

    3. a. If a researcher measures weight in pounds, what measures of
          dispersion could be calculated?
       b. If a researcher measures weight in ounces, what measures of
          dispersion could be calculated?
       c. If a researcher measures weight by assigning each person to either
          [...], what measures of dispersion could be calculated?

    4. Which correlation is stronger: -.97 or +.55?

    5. What is the difference between a positive and a negative correlation?
       Provide an example other than the one in the chapter.

    6. A researcher studying the effect of a speed-reading course on reading
       times compares the scores of a group that has taken the course with
       those of a control group. The researcher finds that the ratio of the
       variation between the groups to the variation within the group is
       equal to .76. A colleague does a similar study and finds a ratio of
       between-groups variation to within-group variation of 7.32. Which
       ratio is more likely to suggest a significant difference between
       reading groups?

    ANSWERS TO CONCEPT QUESTIONS AND ODD-NUMBERED EXERCISES

    Note: There will often be more than one correct answer for each of these
    questions. Consult with your instructor about your own answers.

    Concept Question 4.1
    a. ratio
    b. ordinal or interval
    c. nominal
    d. ratio
    e. nominal
    f. ordinal