5 comparison of two groups

Upload: som-piseth

Post on 03-Apr-2018


  • 7/28/2019 5 Comparison of Two Groups

    1/17

    Introduction to Bivariate Statistical Analysis:

    Comparison of Two Groups


    Aims

    Difference Hypotheses

    Nominal Scales: Difference of Proportions

    Ordinal Scales: Wilcoxon Test

    NPAR Tests

    2010 Poch Bunnak 2


    Recall the link b/w research purposes & statistics


    Difference Hypotheses

    Examples of difference hypotheses:

    There is a difference between men and women with regard to earnings.

    Men are more likely than women to migrate.

    Migration propensity tends to be higher among men than among women.

    Students have to learn to develop difference hypotheses: the IV must be binary or categorical.

    Replace the variables in the examples above.


    Start-up

    Statistical tests are much more commonly applied to two-sample comparisons than to one-sample comparisons.

    The two samples compared can be:

    Independent samples: whether a member is chosen for inclusion in one sample does not depend on which members are selected in the other.

    Groups to be compared can be derived by dividing a larger sample into subgroups: men vs. women, rural vs. urban. If the overall sample was randomly selected, these subgroups constitute independent random samples.

    Dependent samples: they occur when members of one sample are matched with members of the other (husbands-wives, same sample at time 1 and time 2).

    Know your binary IV and the DV measurement.


    Hypothesis Tests for Categorical Data

    Categorical Data

    Tests of Proportions; Test of Independence: χ² Test

    1 Population: Z Test (for parametric test) and Binomial Sign Test (see NPAR below)

    2 Large Ind. Samples: Z Test

    2 Small Ind. Samples: χ² Test (any tables)

    2 Small Ind. Samples: Fisher (fe < 5; 2x2)


    2 Dep. Samples: McNemar (pXp tables)


    Nominal Scales: Difference of Proportions

    For a 2x2 table. Use the data in Table 7.1 as an example.

    Ha: Any change in support for the president's performance (measured by % approval) after 2 months in office?

    The parameter to be tested is π2 − π1, where π2 is the proportion of approval in Feb. and π1 is the proportion of approval in Jan. Thus, Ha: π2 ≠ π1 and H0: π2 = π1.

    Based on the data, the estimated π2 − π1 = .04. Now we need to examine if the change is statistically significant.

    Use the formula on page 169 to compute the 99% CI and the 95% CI.

    Use the z formula on page 170 and the formula on page 169 to perform a statistical test, and check against the critical z value at α = .001 and at α = .05.


    Nominal Scales: Difference of Proportions, cont.

    99% CI for π2 − π1: .04 ± .044 = (−.004, .084). The interval contains 0, so although the estimate suggests that π2 is greater than π1, there is insufficient evidence to conclude that the difference is statistically significant at the α = .01 level.

    Is the change statistically significant at the α = .05 level?

    Z = 2.35. Table A: for z = 2.35, p = .0094 one-sided; p = 2(.0094) = .0188 two-sided.

    Thus, at the α = .05 level there is evidence of a change in support for the president's performance, though not enough evidence to reject H0 at the α = .01 level.

    Requirements for interval estimation of the difference between two proportions: n1p1, n1(1−p1), n2p2, and n2(1−p2) are all ≥ 5.
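    The CI and z test for a difference of proportions can be sketched in a few lines of Python. This is a minimal illustration, not the textbook's worked example: the counts passed to `two_proportion_summary` are hypothetical (the sample sizes behind Table 7.1 are not given on the slides), and the choice of an unpooled standard error for the CI and a pooled one for the test is an assumption about the formulas on pages 169-170.

```python
from math import sqrt
from statistics import NormalDist


def two_sided_p(z):
    """Two-sided p-value for a standard normal test statistic."""
    return 2 * (1 - NormalDist().cdf(abs(z)))


def two_proportion_summary(x1, n1, x2, n2, conf=0.95):
    """Estimate pi2 - pi1, its confidence interval, and a z test of H0: pi2 = pi1."""
    p1, p2 = x1 / n1, x2 / n2
    diff = p2 - p1
    # Unpooled standard error, used for the confidence interval.
    se_ci = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    z_crit = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    ci = (diff - z_crit * se_ci, diff + z_crit * se_ci)
    # Pooled standard error under H0, used for the test statistic.
    pooled = (x1 + x2) / (n1 + n2)
    se_test = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = diff / se_test
    return diff, ci, z, two_sided_p(z)


# Hypothetical counts: 50 of 100 approve in month 1, 60 of 100 in month 2.
diff, ci, z, p = two_proportion_summary(50, 100, 60, 100)
print(f"diff={diff:.2f}  95% CI=({ci[0]:.3f}, {ci[1]:.3f})  z={z:.2f}  p={p:.3f}")

# The slide's own statistic: z = 2.35 gives a two-sided p of about .0188.
print(round(two_sided_p(2.35), 4))
```

    The last line reproduces the slide's two-sided p-value from its reported z, which is the only part of the example that uses numbers from the slides.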


    Other Nominal Tests

    1. Chi-square Test (See HD 6 for more)

    It generalizes the 2-sample Z-test for use with more than 2 proportions, and is equivalent to the Z-test when comparing 2 proportions.

    Assumptions (Page 209):

    2 nominal variables

    Random or stratified random sample

    2x2 table: fe ≥ 5 in all cells

    RxC table: fe ≥ 5 in at least 75% of cells and fe ≥ 1 in the remaining cells

    χ² statistic = Σ (fo − fe)² / fe

    H0: the 2 nominal variables are independent

    All χ² test values (p) are one-sided; that is, for tests with sig. level α, compare the χ² statistic to the critical value χ²(df, 1−α), with df = (r−1)(c−1).

    Conclusion: Reject H0 at the α level if p < α.
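    As a rough illustration of the χ² statistic for a 2x2 table, the sketch below computes the expected frequencies, the statistic, and the df = 1 p-value using only the standard library. The table values are hypothetical, and the p-value shortcut via `erfc` is valid only for df = 1 (a 2x2 table).

```python
from math import erfc, sqrt


def chi_square_2x2(table):
    """Chi-square test of independence for a 2x2 table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    min_fe = float("inf")
    for i, row in enumerate(table):
        for j, fo in enumerate(row):
            fe = row_totals[i] * col_totals[j] / n  # expected frequency
            min_fe = min(min_fe, fe)
            chi2 += (fo - fe) ** 2 / fe
    # For df = 1, the upper-tail chi-square probability is erfc(sqrt(chi2 / 2)).
    p = erfc(sqrt(chi2 / 2))
    return chi2, p, min_fe


# Hypothetical table: rows = two groups, columns = approve / disapprove.
chi2, p, min_fe = chi_square_2x2([[50, 50], [60, 40]])
print(f"chi2={chi2:.3f}  p={p:.3f}  smallest fe={min_fe:.0f}")
```

    With two groups, the statistic equals the square of the pooled two-sample z statistic for the same counts, which is the equivalence mentioned on the slide.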


    Other Nominal Tests, cont.

    2. Fisher's Exact Test

    Assumptions: does not rely on the normality assumption, but uses the exact distribution of the data.

    For 2x2 tables only, when fe < 5 (use in place of the χ² test)

    H0: there is no relationship between the two variables.

    The p-value can then be computed by calculating the number of possible arrangements of observations that produce tables that are more extreme than the observed, and then dividing this by the total number of possible arrangements of the observations.

    See SPSS output for understanding

    3. McNemar's Test

    For comparing proportions from paired data that have PxP (r = c) tables
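    The counting argument behind Fisher's exact test (item 2 above) can be sketched as follows. The 2x2 counts are hypothetical, and the two-sided rule used here (sum the probabilities of all tables that are no more likely than the observed one, with margins held fixed) is one common convention, assumed rather than taken from the slides.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    r1, r2 = a + b, c + d      # row totals
    c1 = a + c                 # first column total
    n = r1 + r2

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(r1, x) * comb(r2, c1 - x) / comb(n, c1)

    p_obs = prob(a)
    lo, hi = max(0, c1 - r2), min(r1, c1)
    # Sum over tables at least as extreme (no more probable) than the observed one.
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))


# Hypothetical small-sample table.
print(round(fisher_exact_two_sided(3, 1, 1, 3), 4))
```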


    Interval Scales: Difference of Means, Large Samples

    2 studies in:          1965    1985
    Sample:                x       y
    Standard deviation:    $300    $700
    N:                     25      50

    Ha: μ1 ≠ μ2 (change in the mean health expenses)

    H0: μ1 = μ2 (no change)

    z test for large samples (n1 and n2 ≥ 20)

    t test for small samples (n1 and n2 < 20)

    The samples must be independent

    The DV is normally distributed

    The variances of the DV in the two populations are equal.

    Formula on page 172 to compute the CI; formula on page 173 to compute z for testing the hypothesis
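    A minimal sketch of the large-sample z test for a difference of means follows. The standard deviations and sample sizes are those on the slide; the two sample means (1000 and 1300) are hypothetical placeholders, because the means did not survive in the slide text. The standard error formula sqrt(s1²/n1 + s2²/n2) is assumed to be the one on page 172.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_z(mean1, sd1, n1, mean2, sd2, n2, conf=0.95):
    """Large-sample CI and z test for mu2 - mu1 with independent samples."""
    diff = mean2 - mean1
    se = sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    z_crit = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    ci = (diff - z_crit * se, diff + z_crit * se)
    z = diff / se
    p = 2 * (1 - NormalDist().cdf(abs(z)))  # two-sided p-value
    return se, ci, z, p


# sd and n from the slide; the means 1000 and 1300 are hypothetical.
se, ci, z, p = two_sample_z(1000, 300, 25, 1300, 700, 50)
print(f"se={se:.1f}  95% CI=({ci[0]:.0f}, {ci[1]:.0f})  z={z:.2f}  p={p:.4f}")
```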


    Interval Scales: Difference of Means for Large Samples, cont.

    See computation in the Excel (file: compare 2 groups)

    Verify with SPSS

    Difference of proportions cannot be obtained with SPSS

    Difference of means can be obtained using Analyze > Compare Means > Independent-Samples T Test

    Note that SPSS treats the population parameter as unknown, thus only the t test is available.


    Paired differences for dependent samples

    This is the case when cases in sample 1 are matched with cases in sample 2.

    When comparing the means of the 2 samples (μ2 and μ1), inference about H0 is based on the single sample of the d (difference) distribution.

    The data must be restructured or have a match ID var.

    H0: μ2 = μ1; Ha: μ2 ≠ μ1. See example 7.5

    Use the paired-test .sav file: Analyze > Compare Means > Paired-Samples T Test
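    The idea that a paired comparison reduces to a one-sample test on the differences d can be sketched as below. The scores are the three pairs that appear on the data-restructuring slide (Score1 = 60, 70, 80; Score2 = 80, 95, 95); treating them as the paired-test example is an assumption, and the function name is illustrative only.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(x1, x2):
    """Paired t statistic: a one-sample t test on the differences d = x2 - x1."""
    d = [b - a for a, b in zip(x1, x2)]
    n = len(d)
    se = stdev(d) / sqrt(n)      # standard error of the mean difference
    return mean(d), se, mean(d) / se, n - 1


score1 = [60, 70, 80]
score2 = [80, 95, 95]
d_bar, se, t, df = paired_t(score1, score2)
print(f"mean d={d_bar}  se={se:.3f}  t={t:.3f}  df={df}")
```

    The t value is then compared with the critical t for n − 1 degrees of freedom, as in the SPSS Paired-Samples T Test output.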


    SPSS T-Test of Paired Differences

    Descript. stats for each var.

    This is not part of the paired-sample test

    T-Test stats for paired samples. Report those encircled.


    How to restructure your data?

    Data restructure

    ID = identifier and Therapy = index

    ID  Therapy  Score
    1   1        60
    2   1        70
    3   1        80
    1   2        80
    2   2        95
    3   2        95

    ID  Score1  Score2
    1   60      80
    2   70      95
    3   80      95

    Data > Restructure

    Restructure selected vars into cases

    Select one group

    Score1 and Score2 = vars to be transposed and pair = fixed var
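    Outside SPSS, the same wide-to-long restructuring can be sketched in plain Python. The data are the three cases from the slide; the function and its parameter names are hypothetical and only illustrate what "restructure selected variables into cases" does.

```python
def wide_to_long(rows, id_var, value_vars, index_name, value_name):
    """Turn one row per case (wide) into one row per measurement (long)."""
    long_rows = []
    for index, var in enumerate(value_vars, start=1):
        for row in rows:
            long_rows.append(
                {id_var: row[id_var], index_name: index, value_name: row[var]}
            )
    return long_rows


wide = [
    {"ID": 1, "Score1": 60, "Score2": 80},
    {"ID": 2, "Score1": 70, "Score2": 95},
    {"ID": 3, "Score1": 80, "Score2": 95},
]

# ID is the identifier (fixed), Score1/Score2 are transposed, Therapy is the index.
long_rows = wide_to_long(wide, "ID", ["Score1", "Score2"], "Therapy", "Score")
for row in long_rows:
    print(row["ID"], row["Therapy"], row["Score"])
```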

    Other Implications of Paired-Sample Tests

    Checking Reliability:

    The paired t-test can be used to check reliability, especially test-retest reliability

    Suppose that the result above is based on the test-retest data (therapy 1 and 2). You should report:

    Are the two means different? Large or small difference?

    Report the paired samples correlation coefficient r: large (≈ 1) or small (≈ 0). The larger the r, the stronger the association and the more reliable is the therapy.

    The paired samples test statistics are of lesser importance.
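    The paired-samples correlation used as a test-retest reliability check is the ordinary Pearson r between the two sets of scores. The sketch below computes it by hand; the scores are the three pairs from the restructuring slide, used here only as an illustration of the calculation.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equally long lists of scores."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


test_scores = [60, 70, 80]      # first measurement
retest_scores = [80, 95, 95]    # second measurement
print(round(pearson_r(test_scores, retest_scores), 3))
```

    An r close to 1 indicates that cases keep their relative positions between the two measurements, even when the means differ.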


    Ordinal Scales

    Independent Samples:

    Mann-Whitney U Test (nonparametric, equivalent

    to t test): Tests whether two independent samples are from the

    same population.

    Requires an ordinal level of measurement.

    U is the number of times a value in the first group

    precedes a value in the second group, when values are

    sorted in ascending order.

    It is more powerful than the median test since it uses the ranks of the cases.
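    The definition of U as a count of orderings can be illustrated directly. In this sketch the data are hypothetical, and counting a tie as half a precedence is a common convention that is assumed here, not stated on the slide.

```python
def mann_whitney_u(group1, group2):
    """Number of (group1, group2) pairs in which the group1 value comes first."""
    u = 0.0
    for a in group1:
        for b in group2:
            if a < b:
                u += 1.0
            elif a == b:
                u += 0.5  # assumed convention: a tie counts as half
    return u


# Hypothetical ordinal scores for two independent groups.
g1 = [1, 2, 4]
g2 = [3, 5, 6]
u1 = mann_whitney_u(g1, g2)
u2 = mann_whitney_u(g2, g1)
print(u1, u2, u1 + u2 == len(g1) * len(g2))
```

    The two U values always add up to n1 × n2, so a U far from n1 × n2 / 2 suggests that one group tends to rank below the other.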


    Ordinal Scales

    Dependent Samples:

    Sign Test (See NPAR tests below):

    Note about the two sign tests:

    One is for one population with dichotomous data, and the test is based on the binomial distribution. SPSS: NPAR Tests > Binomial.

    The other is for paired ordinal data. SPSS: NPAR Tests > 2 Related Samples > check the Sign box. This is called the SIGN TEST.

    Wilcoxon Signed-Rank Test (See NPAR tests below)


    Other Ordinal-Data Tests (Not covered)

    Kolmogorov-Smirnov Z:

    A test of whether two groups come from the same distribution. It is based on the maximum absolute difference between the two cumulative distributions.

    Moses Test:

    A nonparametric test designed to test hypotheses in which it is expected that the experimental variable will affect some subjects in one direction and other subjects in the opposite direction.

    Tests for extreme responses compared to the control group. Requires an ordinal scale of measurement. This test focuses on the span of the control group, and is a measure of how much extreme values in the experimental group influence the span when combined with the control group.

    Wald-Wolfowitz runs:

    A nonparametric test of the hypothesis that two samples come from the same population.

    Requires at least an ordinal scale of measurement. The values of the observations from both samples are combined and ranked from smallest to largest.

    Runs are sequences of values from the same group. If the samples are from the same population, the two groups should be randomly scattered throughout the ranking.


    Nonparametric Tests (NPT)

    NPT can be used with nominal data, ordinal data, or interval/ratio data when no assumption can be made about the pop. prob. distribution. Below we describe some NPT for at least numerically ordinal data.

    1. Sign Test (One Population = Binomial Test)

    The sign test is used to test if there is a difference in preference between two alternatives.

    Ex. In a study of rural development, n villagers were asked if they prefer raising pigs or raising fish.

    H0: p = .5, i.e., the proportion preferring raising pigs is .5 (the same as the proportion preferring raising fish)

    Ha: p ≠ .5

    If H0 cannot be rejected, there is no evidence indicating that a difference in preference exists.

    Tests of H0: small sample (n ≤ 20) and large sample (n > 20)


    Nonparametric Tests (NPT), cont.

    1.A. Sign Test for n ≤ 20:

    Data: 12 Rs. 4 prefer raising fish, 8 raising pigs

    Steps in conducting the sign test:

    a. H0: P(fish) = .5; Ha: P(fish) ≠ .5 (2-tailed).

    b. Assign a + sign to those preferring raising fish and a − sign to those preferring the alternative (raising pigs). The number of + signs is used in the calculations to determine if H0 is rejected.

    c. Under H0, the number of + signs has a binomial probability distribution.

    d. For n = 12, H0 is rejected at α = .05 if n of + signs < 3 or n of + signs > 9


    Binomial Probability Distribution (n = 12)

    At p < .05:

    Lower end: p < .025; n of +s should be ≤ 2 because the sum of the prob of 0, 1, and 2 is .0002 + .0029 + .0161 = .0192 < .025

    Higher end: p < .025; n of +s should be ≥ 10 because the sum of the prob of 10, 11, and 12 is .0192 < .025

    [Bar chart: binomial probabilities by number of + signs (0 to 12) for n = 12, p = .5. P(0) = P(12) = .0002; P(1) = P(11) = .0029; P(2) = P(10) = .0161; P(3) = P(9) = .0537; P(4) = P(8) = .1208; P(5) = P(7) = .1934]

    Thus, H0 is rejected at α = .05 if n of + signs < 3 or n of + signs > 9
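    The rejection region for the small-sample sign test can be checked by summing binomial probabilities directly. This is a small sketch with illustrative function names; it assumes a two-sided test with α/2 in each tail and p = .5 under H0, as on the slide.

```python
from math import comb


def binom_pmf(k, n, p=0.5):
    """Binomial probability of exactly k successes in n trials."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)


def sign_test_cutoffs(n, alpha=0.05):
    """Largest lower cutoff c with P(X <= c) <= alpha / 2, and its mirror n - c."""
    lower, cum = -1, 0.0
    for k in range(n + 1):
        step = binom_pmf(k, n)
        if cum + step > alpha / 2:
            break
        cum += step
        lower = k
    return lower, n - lower, cum


lower, upper, tail = sign_test_cutoffs(12)
print(f"reject H0 if + signs <= {lower} or >= {upper} (each tail = {tail:.4f})")

# The slide's data: 4 of 12 respondents prefer fish (4 + signs).
n_plus = 4
print("reject H0" if n_plus <= lower or n_plus >= upper else "do not reject H0")
```

    For n = 12 this gives the cutoffs 2 and 10, which is the same rule as "fewer than 3 or more than 9 + signs".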


    Nonparametric Tests (NPT), cont.

    1.B. Sign Test for n > 20:

    Use the z distribution, with mean μ = .5n and standard deviation σ = sqrt(.25n)

    Suppose the data contain n = 30 preferences for fish or for pigs:

    μ = .5*n = .5*30 = 15; σ = sqrt(.25n) = 2.74

    The distribution is normal, so we reject H0 if z < −1.96 or z > 1.96 at α = .05

    In the example, z falls between −1.96 and 1.96. Therefore, H0 is not rejected, meaning that there is no evidence of a difference in preference between raising fish and raising pigs.

    Use binomial.sav data: NPAR Tests > Binomial
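    The large-sample version of the sign test can be sketched as a normal approximation. Here n = 30 comes from the slide, but the count of 18 + signs is hypothetical, since the actual split between fish and pigs is not legible in the slide text.

```python
from math import sqrt
from statistics import NormalDist


def sign_test_z(n_plus, n):
    """Normal approximation to the sign test: z for the number of + signs."""
    mu = 0.5 * n
    sigma = sqrt(0.25 * n)
    z = (n_plus - mu) / sigma
    p = 2 * (1 - NormalDist().cdf(abs(z)))  # two-sided p-value
    return mu, sigma, z, p


# n = 30 as on the slide; 18 + signs is a hypothetical count.
mu, sigma, z, p = sign_test_z(18, 30)
print(f"mu={mu}  sigma={sigma:.2f}  z={z:.2f}  p={p:.3f}")
print("reject H0" if abs(z) > 1.96 else "do not reject H0")
```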

    Nonparametric Tests (NPT), cont.

    2. Sign Test for Paired Ordinal Data:

    It makes few assumptions about the data, but it is also not very powerful

    Used to test the hypothesis that two variables have the same distribution. H0: the median difference is zero.

    We only need the signs (+ or −) of the differences to evaluate this null hypothesis. The differences between the two variables for all cases are computed and classified as either + or − (ties excluded).

    Under H0, half of the differences are positive (> 0) and half are negative (< 0). The two-sided p-value is P(X ≤ 1 or X ≥ 7), where X = number of positive differences.

    Use the binomial distribution table for n = 8, p = 0.5: P-value = P(X=0) + P(X=1) + P(X=7) + P(X=8)

    = .0039 + .0313 + .0313 + .0039

    = .0704

    H0 is not rejected; the data are consistent with a median difference of zero, i.e., with the two populations having the same distribution.

    SPSS: use binomial sign test paired data.sav: Analyze > Nonparametric Tests > 2 Related Samples > Sign.
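    The exact p-value for the paired sign test is a sum of binomial tail probabilities, which the sketch below computes without a table. The split of 7 positive and 1 negative difference is an assumption consistent with the tail terms on the slide (X = 0, 1, 7, 8 with n = 8); the result is the same if the signs are reversed.

```python
from math import comb


def sign_test_p(n_pos, n_neg):
    """Exact two-sided sign test p-value (ties already excluded)."""
    n = n_pos + n_neg
    k = min(n_pos, n_neg)
    # Probability of a split at least this lopsided in one direction, p = .5.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


p = sign_test_p(7, 1)
print(round(p, 4))
print("reject H0" if p < 0.05 else "do not reject H0")
```

    The exact value is 18/256, about .0703; the slide's .0704 comes from adding table probabilities that were each rounded to four decimals.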


    Nonparametric Tests (NPT), cont.

    3. Mann-Whitney U Test

    Other names: Wilcoxon rank-sum test, Mann-Whitney-Wilcoxon, Mann-Whitney test

    It is used to test if there is a difference between two populations (H0: the 2 pops are identical)

    Assumptions:

    2 independent samples

    equal variances


    SPSS Result for Mann-Whitney Test

    Report Ranks Table, Z, and Sig.

    Describe the results in the Ranks Table: Therapy 1 gives poorer results than Therapy 2 (mean rank 2.33 vs. 4.67)

    The difference is not significant at the α = .05 level (z = −1.52, p > .05)


    Nonparametric Tests (NPT), cont.

    3.A. Mann-Whitney Test for small samples (N ≤ 10):

    Steps:

    i. Rank the combined data from lowest to highest (lowest = 1, highest = n, ties = average)

    ii. Split the ranked data by groups and compute the sum of ranks for each group, symbolized by T

    iii. Find the possible values of T for one group (H0 group, ex., BT).

    For BT of n = 4, min T = 1+2+3+4 = 10 and max T = 6+7+8+9 = 30. Thus, the possible T for BT is (10, 30)

    If the 2 pops are identical, the value of T for BT would be near the average of (10+30)/2 = 20.

    iv. Use the table of critical values for the Mann-Whitney-Wilcoxon Test to compute:

    TU = n1(n1+n2+1) − TL (TL = 12 for n1 = 4, n2 = 5, and α = .05)

    Reject H0 if T < TL or if T > TU
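    The small-sample steps can be traced in code. In this sketch the group sizes (4 and 5) and TL = 12 are taken from the slide, while the raw scores and helper names are hypothetical; TL itself still has to come from the printed table of critical values.

```python
def average_ranks(values):
    """Rank values from lowest (1) to highest (n), giving ties the average rank."""
    ordered = sorted(values)
    rank_of = {}
    for v in set(ordered):
        first = ordered.index(v) + 1            # first 1-based position of v
        rank_of[v] = first + (ordered.count(v) - 1) / 2
    return [rank_of[v] for v in values]


def rank_sum(group, other):
    """Sum of ranks T for `group` after ranking both groups together."""
    ranks = average_ranks(group + other)
    return sum(ranks[: len(group)])


def rank_sum_limits(n1, n2, t_lower):
    """Smallest and largest possible T, and TU = n1(n1+n2+1) - TL."""
    n = n1 + n2
    t_min = n1 * (n1 + 1) // 2
    t_max = sum(range(n - n1 + 1, n + 1))
    return t_min, t_max, n1 * (n1 + n2 + 1) - t_lower


bt = [12, 15, 18, 20]             # hypothetical scores, n1 = 4
other = [14, 16, 22, 25, 30]      # hypothetical scores, n2 = 5
t = rank_sum(bt, other)
t_min, t_max, t_upper = rank_sum_limits(4, 5, t_lower=12)
print(t, (t_min, t_max), t_upper)
print("reject H0" if t < 12 or t > t_upper else "do not reject H0")
```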

    Nonparametric Tests (NPT), cont.

    3.B. Mann-Whitney Test for large samples (N > 10):

    Follow steps i-iii above

    iv. Since n is large, the sampling distribution of T is normal. Compute:

    Mean: μT = ½ [n1(n1+n2+1)]

    Standard deviation: σT = sqrt[(1/12) n1 n2 (n1+n2+1)]

    Z = (T − μT) / σT

    Reject H0 at α = .05 if z < −1.96 or z > 1.96.

    SPSS: Analyze > Nonparametric Tests > 2 Independent Samples
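    The normal approximation for the rank sum T is short enough to verify by hand. The sketch below uses T = 7 with n1 = n2 = 3, which is inferred from the mean ranks of 2.33 and 4.67 on the SPSS result slide (an assumption, since the raw data are not shown). It applies no tie correction, and with so few cases the approximation is only illustrative.

```python
from math import sqrt


def rank_sum_z(t, n1, n2):
    """Normal approximation for the rank sum T of the group with n1 cases."""
    mu = n1 * (n1 + n2 + 1) / 2
    sigma = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    return mu, sigma, (t - mu) / sigma


# T = 7 and n1 = n2 = 3 are inferred from the reported mean ranks (assumption).
mu, sigma, z = rank_sum_z(7, 3, 3)
print(f"mu={mu}  sigma={sigma:.3f}  z={z:.2f}")
print("reject H0" if abs(z) > 1.96 else "do not reject H0")
```

    The resulting z is close to the z = −1.52 shown in the SPSS output, and it falls inside ±1.96, in line with the non-significant result.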


    Nonparametric Tests (NPT), cont.

    4. Wilcoxon signed-rank test (= Wilcoxon Test in SPSS)

    Assumptions:

    2 dependent samples, as in the sign test, but it is better than the sign test because it compares the signs and the rank magnitude of the differences.

    DV (the pairs' scores) can be ordinal or interval

    The difference between pairs of scores is ordinally scaled

    The signed-rank test compares the sum of the average ranks of positive differences (R1) to that of the negative differences (R2).

    H0: the median difference is zero (these rank sums are about equal).

    Compute the p-value to reject or accept H0:

    If there are fewer than 16 non-zero differences, use the Rosner Table

    If there are 16 or more non-zero differences, use the z (normal) score. This is what you get when running Wilcoxon signed-rank data in SPSS.

    Use Wilcoxon signed test for paired data.sav: Analyze > Nonparametric Tests > Two-Related-Samples Test > select the pair variables > ensure that Wilcoxon is checked

    In the output, note the +, −, = ranks and test statistics. Those should be reported and interpreted, in addition to the mean rank for each paired rank.
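    The two rank sums compared by the signed-rank test can be computed as in the sketch below. The before/after scores are hypothetical, zero differences are dropped, and tied absolute differences receive the average rank; the function name is illustrative.

```python
def signed_rank_sums(x1, x2):
    """Rank sums of positive (R1) and negative (R2) differences x2 - x1."""
    d = [b - a for a, b in zip(x1, x2) if b != a]   # drop zero differences
    abs_sorted = sorted(abs(v) for v in d)

    def avg_rank(v):
        # Average of the 1-based positions that |difference| = v occupies.
        first = abs_sorted.index(v) + 1
        return first + (abs_sorted.count(v) - 1) / 2

    r_pos = sum(avg_rank(abs(v)) for v in d if v > 0)
    r_neg = sum(avg_rank(abs(v)) for v in d if v < 0)
    return r_pos, r_neg


# Hypothetical paired scores; the third pair has a zero difference.
before = [10, 12, 9, 14, 11]
after = [13, 11, 9, 18, 13]
r1, r2 = signed_rank_sums(before, after)
print(r1, r2)
```

    Under H0 the two sums should be about equal; here they add up to 4 × 5 / 2 = 10 for the four non-zero differences, which is a quick arithmetic check.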


    Summary

    Independent samples

    Normality assumptions

    No normality assumption

    Categorical DV

    Ordinal DV: Mann-Whitney U

    Interval DV


    Writing the SPSS Outputs


    Writing the SPSS Outputs: Your Table
