Experimental & Behavioral Economics Lecture 6: Non-parametric tests and selection of sample size Based on Siegel, Sidney, and N. J. Castellan (1988) Nonparametric statistics for the behavioral sciences, McGraw-Hill, New York, and teaching material by John Duffy (University of Pittsburgh) Bernd Rönz (HU Berlin) David Danz Summer term 2014 1


Experimental & Behavioral Economics

Lecture 6: Non-parametric tests and selection of sample size

Based on Siegel, Sidney, and N. J. Castellan (1988)

Nonparametric statistics for the behavioral sciences, McGraw-Hill, New York, and teaching material by

John Duffy (University of Pittsburgh) Bernd Rönz (HU Berlin)

David Danz

Summer term 2014


Contents

1. Introduction (recap hypothesis testing)

2. Common (non-parametric) tests in experimental economics

3. Selection of sample size (power analysis)


Hypothesis testing

• The research hypothesis is the prediction derived from the theory under test.

• Null hypothesis (H0)

– is a hypothesis of “no effect” (e.g., μ1 = μ2)

– usually formulated for the purpose of being rejected

– If rejected, the alternative hypothesis (H1) is supported (not necessarily true)

• Alternative hypothesis (H1)

– is the operational statement of the experimenter's research hypothesis.

– nature of the research hypothesis determines how H1 should be stated (e.g., μ1 ≠ μ2, or μ1 < μ2, or μ1 > μ2)


Hypothesis testing

• The region of rejection

– is a region of the sampling distribution under H0 (the sampling distribution includes all possible values that a test statistic can take on)

– consists of a set of possible values which are so extreme that when H0 is true the probability of observing them is very small (α)
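As a rough illustration of the region of rejection, the following sketch assumes a test statistic that is standard normal under H0 (an assumption made only for this example; the function names are hypothetical). It returns the boundaries of the rejection region for a given α and checks whether an observed statistic falls inside it.

```python
from statistics import NormalDist


def critical_values(alpha, two_sided=True):
    """Boundaries of the rejection region for a standard-normal statistic."""
    z = NormalDist()  # assumed sampling distribution of the statistic under H0
    if two_sided:
        upper = z.inv_cdf(1 - alpha / 2)
        return (-upper, upper)
    return (z.inv_cdf(1 - alpha),)


def reject_h0(statistic, alpha=0.05, two_sided=True):
    """True if the observed statistic lies in the region of rejection."""
    bounds = critical_values(alpha, two_sided)
    if two_sided:
        lower, upper = bounds
        return statistic < lower or statistic > upper
    return statistic > bounds[0]


print(critical_values(0.05))
print(reject_h0(2.3))
```

With a one-sided H1 the whole probability mass α sits in one tail, so the boundary moves closer to the center of the distribution.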

[Figure: distribution of some test statistic under H0]

NON-PARAMETRIC TESTS


Non-parametric tests

• Advantages (+)

– If the sample size is very small, there may be no alternative to using a nonparametric statistical test (unless the nature of the population distribution is known exactly)

– Usually make fewer assumptions about the data

– Interpretation of nonparametric statistical tests is often more straightforward than the interpretation of parametric tests (they are also easier to learn and to apply)

• Disadvantages (−)

– If the assumptions of a parametric statistical model are met in the data, then parametric statistical tests are usually more efficient (non-parametric tests have lower power-efficiency)

– Parametric statistical tests have been systematized: different tests are simply variations on a central theme (non-parametric tests are less systematic)


Non-parametric tests

• Two independent samples (e.g., between-subject design: same measure for each subject in two treatments)

– Fisher’s Exact Test / Chi-Square Test of independence

– Median test

– Wilcoxon-Mann-Whitney Test / Robust Rank Order Test

– Kolmogorov-Smirnov Test

• Two dependent samples (e.g., within-subject design: two measures or repeated measure for each subject)

– McNemar test

– Sign test / Wilcoxon Signed-Rank Test


Scales

1. Nominal (or categorical) scale

– numbers or other symbols are used to classify an object, person, or characteristic (i.e., to identify the groups to which various objects belong)

– Example: Gender

2. Ordinal (or ranking) scale

– (1) + objects in one category of a scale stand in some kind of relation >R to objects in other categories (“higher”, “more preferred”, “more difficult”, etc.)

– Example: Socioeconomic status, grades

3. Interval Scale

– (2) + distances or differences between any two numbers on the scale can be interpreted in a meaningful way

– Example: Temperature

4. Ratio Scale

– (3) + has a true zero point as its origin, thus the ratio of any two scale points is independent of the unit of measurement

– Example: Weight, age


NON-PARAMETRIC TESTS INDEPENDENT SAMPLES


Fisher’s Exact Test

• Two independent samples

• Binary variables

Two independent samples Binary variables (nominal or ordinal)

Fisher’s Exact Test

• H0: No relation between the variables (independence)

• … under H0, the conditional probability of observing success for one variable is independent of the realization of the other variable. i.e., Pr(+|I) = Pr(+|II) = Pr(+)


Fisher’s Exact Test

• Hypergeometric distribution describes the probability of k successes in n draws without replacement from a finite population of size N containing exactly K successes.

• In our contingency table:

– K = (A + C)

– N = (A + B + C + D)

– k = A

– n = (A + B)
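For reference, the hypergeometric probability can be written out with the symbols defined above. This is the standard textbook form, stated first in general notation and then in terms of the cell counts; the factorial version on the last line follows by expanding the binomial coefficients.

```latex
\Pr(V = k) = \frac{\binom{K}{k}\binom{N-K}{n-k}}{\binom{N}{n}}
\quad\Longrightarrow\quad
\Pr(V = A) = \frac{\binom{A+C}{A}\binom{B+D}{B}}{\binom{N}{A+B}}
           = \frac{(A+B)!\,(C+D)!\,(A+C)!\,(B+D)!}{N!\,A!\,B!\,C!\,D!}
```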


Fisher’s Exact Test

• Idea:

– Regard marginal totals as fixed

– A finite population of size N has (A+C) elements of group I and (B+D) elements of group II

– We draw a random sample of size (A+B) without replacement

– V is a random variable = number of observations sampled from group I

– In our sample, the realization of V is V = A

Under H0, the probability that V takes on the value A is given by the hypergeometric distribution


Fisher’s Exact Test

• With the marginal totals fixed, we can write down all possible contingency tables → the possible tables are completely determined by the alternative values for A (V)

• The p-value is the probability (under H0) of sampling the observed or a “more extreme” contingency table

• Let A be the observed frequency in the cell where the row and column containing the smallest and second smallest marginal frequencies intersect.

• Observed or “more extreme” contingency tables (two-sided):

– |D| ≥ |Dobserved| = |A/(A+C) − B/(B+D)|

• Reject H0 if Pr(|D| ≥ |Dobserved|) < α
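The procedure can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: it assumes the table layout used above (columns are groups I and II, rows are the outcomes + and −, so A and B are the “+” counts), assumes both column totals are nonzero, and the function name is hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, col1, col2 = a + b, a + c, b + d
    n_total = col1 + col2

    def prob(k):
        # Hypergeometric probability of k "+" outcomes in group I, margins fixed.
        return comb(col1, k) * comb(col2, row1 - k) / comb(n_total, row1)

    def diff(k):
        # D = difference between the "+" proportions of groups I and II.
        return k / col1 - (row1 - k) / col2

    d_obs = abs(diff(a))
    k_min = max(0, row1 - col2)
    k_max = min(row1, col1)
    # Sum over all tables that are at least as extreme as the observed one.
    return sum(prob(k) for k in range(k_min, k_max + 1)
               if abs(diff(k)) >= d_obs - 1e-12)


print(fisher_exact_two_sided(3, 1, 1, 3))  # hypothetical counts
```

The small tolerance in the comparison only guards against floating-point rounding when two tables have the same |D|.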



Fisher’s Exact Test

• Example

• II = observed: D = 0.589

• Pr(|D| ≥ |Dobserved|) =

Chi-square test of independence

• Two variables, independent observations

• Generalization of Fisher’s exact test to more than two discrete categories

• Expected frequencies in each discrete category should not be too small

– expected frequencies of each cell must exceed 1

– at most 20% of the cells may have expected frequencies less than 5

Two independent samples Nominal or ordinal scaling

Chi-square test of independence


• H0: The variables are statistically independent = no relation = groups are sampled from the same population

Chi-square test of independence

• Idea: Test whether the deviations of observed cell proportions (conditional probabilities) from cell proportions expected under H0 (independence) exceed what we can expect by chance (random deviations)


Chi-square test of independence

• Test statistic: $X^2 = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(n_{ij} - E_{ij})^2}{E_{ij}}$

• nij = observed number of cases categorized in the ith row of the jth column

• Eij = number of cases expected in the ith row of the jth column when H0 is true

Chi-square test of independence

• Asymptotically (as N gets large), X² follows a chi-square distribution with df = (r – 1)(c – 1), where r is the number of rows and c is the number of columns in the contingency table
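A minimal sketch of the computation, assuming the expected frequency of a cell under independence is (row total × column total) / N. The function name and the example counts are hypothetical.

```python
def chi_square_independence(table):
    """Return (X2, df) for an r x c contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    x2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count in cell (i, j) when rows and columns are independent.
            expected = row_totals[i] * col_totals[j] / n
            x2 += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return x2, df


x2, df = chi_square_independence([[10, 20, 30], [20, 20, 20]])
# Tabulated critical value for df = 2 and alpha = 0.05 is 5.991.
print(x2, df, x2 > 5.991)
```

For these made-up counts the statistic stays below the critical value, so H0 would not be rejected at α = 0.05.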


Chi-square test of independence

• Example

– df = (r – 1)(c – 1) = 2

Reject H0 since the value of X² is beyond the critical value with df = 2 and α = 0.05

Chi-square test of independence

• Remark for 2x2 tables:

– If N is not too large, use Fisher’s exact test

– If N is large (say N > 30), use the chi-square test, but employ the test statistic with continuity correction (Yates): $X^2 = \frac{N\left(|AD - BC| - N/2\right)^2}{(A+B)(C+D)(A+C)(B+D)}$

The median test

• Two independent groups

• At least ordinal scale

• H0: Groups do not differ in central tendency = groups have been drawn from populations with the same median

Two independent samples At least ordinal scale

The median test

• Idea:

– first determine the median score for the combined group (i.e., the median for all scores in both samples)

– if both groups are samples from populations whose medians are the same, we would expect about half of each group's scores to be above the combined median and about half to be below


The median test

• Under H0, the sampling distribution of

– the number of the m cases in group I that fall above the combined median (A) and

– the number of the n cases in group II that fall above the combined median (B)

is the hypergeometric distribution: $P[A, B] = \binom{m}{A}\binom{n}{B} \Big/ \binom{m+n}{A+B}$
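The counting step and the hypergeometric probability can be sketched as follows. The sketch dichotomizes scores into “exceeds the combined median” versus “does not exceed it”; the function names and the scores are hypothetical.

```python
from math import comb
from statistics import median


def median_test_counts(x, y):
    """Number of scores above the combined median in group I (A) and group II (B)."""
    combined = median(x + y)
    a = sum(v > combined for v in x)
    b = sum(v > combined for v in y)
    return a, b


def median_test_prob(a, b, m, n):
    """Hypergeometric probability of observing exactly (A, B) under H0."""
    return comb(m, a) * comb(n, b) / comb(m + n, a + b)


x = [5, 6, 7, 8]  # hypothetical scores, group I
y = [1, 2, 3, 4]  # hypothetical scores, group II
a, b = median_test_counts(x, y)
print(a, b, median_test_prob(a, b, len(x), len(y)))
```

A p-value is obtained by summing this probability over the observed table and all more extreme tables, exactly as in Fisher's exact test.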


The median test

• Remarks

– When several scores may fall right at the combined median:

i. The groups may be dichotomized as those scores that exceed the median and those that do not.

ii. If m + n is large, and if only a few cases fall at the combined median, those few cases may be dropped from the analysis.

Better to do (i) and check whether it makes a difference if the analysis is based on “greater than or equal to” rather than “greater than”.

– There may be no alternative to the median test, even for interval-scale data, e.g., with censored data (some observations may be “off the scale” and therefore measured as the maximum (or minimum) previously assigned to the observations.)


Wilcoxon Mann-Whitney Test

(a.k.a. Mann–Whitney U test, Wilcoxon rank-sum test, or Wilcoxon–Mann–Whitney test)

• Two independent groups

• At least ordinal scale

• Asymptotically equivalent to a t-test.

• H0: X and Y come from the same population, Pr(X > Y) = ½ = Pr(X < Y). … the median is the same in both groups (assuming that the variances of the distributions in both groups are equal)

• H1 (one-tail):

– X is stochastically larger than Y, Pr(X > Y) > ½

– … the “bulk” of the elements in X are larger than the bulk of the elements in Y

• H1 (two-tail): Pr(X > Y) ≠ ½

Wilcoxon Mann-Whitney Test

• Idea:

– m = number of observations in the sample from group X

– n = number of observations in the sample from group Y

– combine the observations from both groups and rank them in order of increasing size – lowest ranks are assigned to the largest negative values (if any)

– Note that the sum of the first N = (m+n) integers is 1 + 2 + 3 + . . . + N = N(N + 1)/2

– Wx is the sum of the ranks in group X

– Wy is the sum of the ranks in group Y

– Thus, Wx + Wy = N(N + 1)/2


Wilcoxon Mann-Whitney Test

• Idea:

– If H0 is true, we would expect the average ranks in each of the two groups to be about equal.

– If Wx is very large (or very small), then we may have reason to suspect that the samples were not drawn from the same population.

– The sampling distribution of Wx (together with m and n) when H0 is true is known

– Hence, we can determine the probability associated with the occurrence under H0 of any Wx as extreme as the observed value.


Wilcoxon Mann-Whitney Test

• Example

– Wx = 3 + 5 + 7 = 15

– Wy = 1 + 2 + 4 + 6 = 13

– Pr(Wx ≥ 15, n = 4, m = 3) = .20

– (Pr(Wx ≤ 15, n = 4, m = 3) = .8857)

Do not reject H0
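For small samples the exact null distribution of Wx can be obtained by enumerating every way of assigning m of the N ranks to group X. The sketch below does this for the case Wx = 15, m = 3, n = 4; the raw scores are hypothetical and were chosen only so that group X receives the ranks 3, 5 and 7. It assumes there are no ties.

```python
from itertools import combinations


def rank_sum(x, y):
    """Wx: sum of the ranks of x in the combined sample (no ties assumed)."""
    combined = sorted(x + y)
    return sum(combined.index(v) + 1 for v in x)


def prob_wx_at_least(w_obs, m, n):
    """Exact Pr(Wx >= w_obs) under H0: every m-subset of ranks is equally likely."""
    subsets = list(combinations(range(1, m + n + 1), m))
    return sum(sum(s) >= w_obs for s in subsets) / len(subsets)


x = [2.0, 3.5, 5.0]       # hypothetical scores with ranks 3, 5, 7
y = [1.0, 1.5, 3.0, 4.2]  # hypothetical scores with ranks 1, 2, 4, 6
w_x = rank_sum(x, y)
print(w_x, prob_wx_at_least(w_x, len(x), len(y)))
```

Of the 35 possible rank assignments, 7 give a rank sum of 15 or more, which yields the probability 0.20.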


Wilcoxon Mann-Whitney Test

• Remarks

– Use normal approximation for large samples (m > 10 or n > 10)

• Then $z = \frac{W_x - m(N+1)/2}{\sqrt{mn(N+1)/12}}$ is asymptotically normally distributed with zero mean and unit variance.

– Wilcoxon test has greater power than the median test

• The Wilcoxon test considers the rank value of each observation rather than simply its location with respect to the combined median, and, thus, uses more of the information in the data.


Wilcoxon Mann-Whitney Test

• Remarks

– When ties occur

• each of the tied observations is assigned the average of the ranks they would have had if no ties had occurred

• Correction of test statistic may be necessary (see Siegel & Castellan, 1988)

– Wilcoxon Mann-Whitney Test may be regarded as a permutation test applied to the ranks of the observations and, thus, constitutes a good approximation to the permutation test.


Robust Rank Order Test

• In order to interpret Wilcoxon tests as a test for equality of medians, we have to assume equal variances in both groups

• The robust Rank Order Test relaxes the assumption of the same variances, i.e., the underlying distributions may be different when testing equality of medians

• As before:

– Two independent groups, at least ordinal scale

– m = number of observations in the sample from group X

– n = number of observations in the sample from group Y

– combine the observations from both groups and rank them in order of increasing size, where the lowest ranks are assigned to the largest negative values (if any)


Robust Rank Order Test

• Procedure

– For each observation in X [Y] we count the number of observations of Y [X] with a lower rank (“placement of Xi [Yj]”) =: U(YXi) [=: U(XYj)]

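– Calculate the mean of the placements in X [and Y]: $U(YX) = \frac{1}{m}\sum_{i=1}^{m} U(YX_i)$ [$U(XY) = \frac{1}{n}\sum_{j=1}^{n} U(XY_j)$]

– Calculate the index of variability of U(YXi) and U(XYj): $V_x = \sum_{i=1}^{m} \left[U(YX_i) - U(YX)\right]^2$, $V_y = \sum_{j=1}^{n} \left[U(XY_j) - U(XY)\right]^2$

– Test statistic with known distribution: $\grave{U} = \frac{m\,U(YX) - n\,U(XY)}{2\sqrt{V_x + V_y + U(XY)\,U(YX)}}$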
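The steps of the procedure can be sketched as below. The statistic is computed in the form given in Siegel & Castellan (1988); the function name and the raw scores are hypothetical, with the scores chosen so that group X holds ranks 3, 5, 7 and group Y holds ranks 1, 2, 4, 6. Ties are assumed away.

```python
from math import sqrt


def robust_rank_order(x, y):
    """Robust rank order statistic for two independent samples (no ties)."""
    m, n = len(x), len(y)
    # Placement of each Xi: number of Y observations below it (and vice versa).
    place_x = [sum(yj < xi for yj in y) for xi in x]
    place_y = [sum(xi < yj for xi in x) for yj in y]
    mean_x = sum(place_x) / m
    mean_y = sum(place_y) / n
    # Indices of variability of the placements.
    v_x = sum((u - mean_x) ** 2 for u in place_x)
    v_y = sum((u - mean_y) ** 2 for u in place_y)
    return (m * mean_x - n * mean_y) / (2 * sqrt(v_x + v_y + mean_x * mean_y))


x = [2.0, 3.5, 5.0]       # hypothetical scores with ranks 3, 5, 7
y = [1.0, 1.5, 3.0, 4.2]  # hypothetical scores with ranks 1, 2, 4, 6
print(round(robust_rank_order(x, y), 2))
```

For these ranks the mean placements are 3 and 0.75, the variability indices are 2 and 2.75, and the statistic is about 1.13.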


Robust Rank Order Test

• Example

– U(YX) = 3

– U(XY) = .75

– Vx = 2

– Vy = 2.75

– Ù = 1.13

– Pr(Ù > 1.13) > 0.1

Do not reject H0 (same conclusion as with the Wilcoxon Mann-Whitney test)


Kruskal-Wallis Test

• Do k > 2 independent samples (ordinal/ordered data) come from the same or different populations?

• Extension of Mann-Whitney to three or more samples.

• Analogue to the F-test used in analysis of variance, but without the assumption that all populations under comparison are normally distributed.

• H0: All k samples have the same distribution functions.

• H1: At least two of the samples have different distribution functions.

More than two independent samples At least ordinal scale

Kolmogorov-Smirnov Test

• Prerequisites

– Here: Two independent samples/groups

– At least interval scale

• H0: samples have been drawn from the same population (i.e., from populations with the same distribution)

• Sensitive to any kind of difference in the distributions from which the two samples were drawn: differences in location (central tendency), in dispersion, in skewness, etc.

Two independent samples At least interval scale

Kolmogorov-Smirnov Test

• Idea

– If the two samples have been drawn from the same population distribution, then the cumulative distribution functions (CDF) of both samples are expected to be close to each other

– If the two sample CDFs are "too far apart" at any point, this suggests that the samples come from different populations

→ Large deviations between the two sample CDFs are evidence against H0


Kolmogorov-Smirnov Test

• Procedure

– determine the empirical CDF for each sample by using the same intervals for both distributions

– for each interval we subtract one step function from the other

– test focuses on the largest of these observed deviations

– Sm(X) := empirical CDF for sample A (of size m), i.e., Sm(X) =K/m, where K is the number of observations equal to or less than X

– Sn(X) := empirical CDF for sample B (of size n)

– Kolmogorov-Smirnov two-sample test statistic

• one-sided: Dm,n = max[Sm(X)− Sn(X)]

• two-sided: Dm,n = max[|Sm(X)− Sn(X)|]

– Reject H0 if Dm,n is too large (sampling distributions of Dm,n are known, depend on nature of H1)
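A minimal sketch of the statistic, evaluating both empirical CDFs at every observed value (the only points where the step functions can change). The function name and the sample values are hypothetical; critical values still have to be taken from tables.

```python
def ks_two_sample(x, y, two_sided=True):
    """Kolmogorov-Smirnov two-sample statistic D_{m,n}."""
    m, n = len(x), len(y)
    best = 0.0
    for point in sorted(set(x) | set(y)):
        s_m = sum(v <= point for v in x) / m  # empirical CDF of sample A
        s_n = sum(v <= point for v in y) / n  # empirical CDF of sample B
        diff = s_m - s_n
        best = max(best, abs(diff) if two_sided else diff)
    return best


print(ks_two_sample([1, 3, 5], [2, 4, 6]))
print(ks_two_sample([1, 2, 3], [4, 5, 6]))
```

In the one-sided version the sign of the difference matters, so swapping the two samples can change the result.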



Kolmogorov-Smirnov Test

• Example

– Dm,n = 0.70, m = 9, n = 10

– Value of test statistic greater than critical value -> reject H0.


Kolmogorov-Smirnov Test

• Remarks

– Can also be used to test an empirical distribution from one sample against some theoretical distribution (like the corresponding chi-square test); the theoretical distribution must then be continuous


NON-PARAMETRIC TESTS DEPENDENT SAMPLES


McNemar test

• Two related (dependent) samples

• Binary variable

• Test for the significance of changes in some binary response (e.g., by treatment manipulation)

• Often used in the context of “before and after” designs

Two dependent samples Binary variable (nominal or ordinal scale)

McNemar test

• Idea

– B, C: # individuals who responded the same on each treatment (+ and –, respectively)

– A, D: # individuals whose responses changed between treatments (from + to –, and from – to +, respectively)


McNemar test

• Idea

– Thus, (A + D) is the total number of people whose responses changed.

– Focus on the cells in which changes may occur: without any treatment effect, changes in each direction would be equally likely.

– H0: The expected number of observations in each of the cells A and D is (A + D)/2


McNemar test

• Remember: Test statistic for the Chi-square test of independence: $X^2 = \sum_{i} \frac{(O_i - E_i)^2}{E_i}$

– Oi = number of cases observed in category i

– Ei = number of cases expected in category i (under H0)

• Applied to the cells counting changes, we obtain McNemar’s statistic $X^2 = \frac{(A - D)^2}{A + D}$, which (approximately) follows a chi-square distribution with df = 1


McNemar test

• Remarks

– Correction for continuity (Yates) gives a better approximation (the correction is necessary because a continuous distribution (chi-square) is used to approximate a discrete distribution): $X^2 = \frac{(|A - D| - 1)^2}{A + D}$

– If the total number of “changes“ (A+D) is less than 10, use the binomial test rather than the McNemar test.
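Both remarks can be combined in one small routine. This is a sketch under stated assumptions: a and d are the two cells counting changes, the corrected statistic is (|A − D| − 1)² / (A + D), the p-value for df = 1 uses the identity Pr(χ² > x) = erfc(√(x/2)), and the exact fallback is a two-sided binomial test with p = ½. The function name and the example counts are hypothetical.

```python
from math import comb, erfc, sqrt


def mcnemar(a, d, small=10):
    """Return (X2, p). X2 is None when the exact binomial test is used."""
    changes = a + d
    if changes < small:
        # Few changes: exact two-sided binomial test on the smaller count.
        k = min(a, d)
        tail = sum(comb(changes, i) for i in range(k + 1)) / 2 ** changes
        return None, min(1.0, 2 * tail)
    x2 = (abs(a - d) - 1) ** 2 / changes  # continuity-corrected statistic
    p = erfc(sqrt(x2 / 2))                # upper tail of chi-square, df = 1
    return x2, p


print(mcnemar(14, 4))  # hypothetical change counts
print(mcnemar(1, 5))   # fewer than 10 changes, binomial fallback
```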


McNemar test

• Example

– Under H0, Pr(X² > 1.25) > 0.05

Do not reject H0

Sign test

• Two related samples

• Variable under consideration has a continuous distribution

• Xi: score of subject i in treatment X

• Yi: score of subject i in treatment Y

• H0: Pr(Xi > Yi) = Pr(Xi < Yi) = ½, i.e., the median difference between X and Y is zero

Two dependent samples At least ordinal scale

Sign test

• Idea

– focus on the direction of the difference between every Xi and Yi, noting whether the sign of the difference is positive or negative

– When H0 is true, we would expect the number of pairs which have (Xi > Yi) to be about equal to the number of pairs which have (Xi < Yi).

– H0 is rejected if too few differences of one sign occur.


Sign test

• The probability associated with the occurrence of a particular number of positive (and negative) differences can be determined by the binomial distribution with

– p = 1/2,

– N = the number of pairs.

• If a matched pair shows no difference (i.e., the difference is zero and has no sign), it is dropped from the analysis and N is reduced accordingly.
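A minimal sketch of the two-sided sign test based on the binomial distribution with p = ½. Pairs with zero difference are dropped before counting. The function name and the scores are hypothetical; they are constructed to give 12 informative pairs with 2 negative differences.

```python
from math import comb


def sign_test(x, y):
    """Return (N, k, two-sided p), where k is the count of the rarer sign."""
    diffs = [xi - yi for xi, yi in zip(x, y) if xi != yi]  # drop tied pairs
    n = len(diffs)
    k = min(sum(d > 0 for d in diffs), sum(d < 0 for d in diffs))
    cdf = sum(comb(n, i) for i in range(k + 1)) / 2 ** n   # Pr(X <= k)
    return n, k, min(1.0, 2 * cdf)


x = [5] * 10 + [1, 1, 3]  # hypothetical scores under treatment X
y = [3] * 13              # hypothetical scores under treatment Y
print(sign_test(x, y))
```

Here one pair is tied and removed, leaving N = 12 and k = 2, so the two-sided p-value is 2 · 79/4096 ≈ 0.0386.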


Sign test

• Example

– Probability of observing k of n differences being negative: $\Pr(X = k) = \binom{n}{k} p^k (1-p)^{n-k}$

– With n = 12, p = ½, the probability of observing 2 or fewer negative (or positive) signs = 2*Pr(X ≤ 2) = 2*0.0193 = 0.0386

Reject H0

k    pdf      cdf
0    0.0002   0.0002
1    0.0029   0.0032
2    0.0161   0.0193
3    0.0537   0.0730
4    0.1208   0.1938
5    0.1934   0.3872
6    0.2256   0.6128
7    0.1934   0.8062
8    0.1208   0.9270
9    0.0537   0.9807
10   0.0161   0.9968
11   0.0029   0.9998
12   0.0002   1.0000

Sign test

• Remark

– For large samples (say, N > 35), normal approximation to the binomial distribution is used


Wilcoxon Signed-Rank Test

• Sign test uses only information about the direction of the differences within pairs

• The Wilcoxon signed-rank test also uses the relative magnitude: it gives more weight to a pair which shows a large difference between the two conditions than to a pair which shows a small difference.

59

Two dependent samples At least ordinal scale

(a.k.a. Wilcoxon T test)


Wilcoxon Signed-Rank Test

• Idea

– Calculate the difference di = Xi – Yi for each matched pair of observations

– Rank di's without respect to sign

– Assign to each rank the sign (+ or –) of the di which it represents.

– If H0 is true, the sum of the ranks with plus signs and the sum of the ranks with minus signs are expected to be about equal

– Reject H0 if the sum of the positive ranks is too different from the sum of the negative ranks (suggesting that treatment X differs from treatment Y)
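The steps above can be sketched as follows. The data are hypothetical and the function name `signed_rank_sums` is illustrative; zero differences are dropped and tied magnitudes receive their average rank, which is the usual convention for this test.

```python
def signed_rank_sums(x, y):
    """Return (T_plus, T_minus, N) for paired samples x and y."""
    # Step 1: differences d_i = x_i - y_i; zero differences are dropped
    d = [xi - yi for xi, yi in zip(x, y) if xi != yi]
    # Step 2: rank |d_i| in ascending order, averaging ranks of tied magnitudes
    order = sorted(range(len(d)), key=lambda i: abs(d[i]))
    ranks = [0.0] * len(d)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and abs(d[order[j + 1]]) == abs(d[order[i]]):
            j += 1
        avg = (i + j) / 2 + 1  # average of the ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    # Step 3: attach the sign of d_i to each rank and sum by sign
    t_plus = sum(r for r, di in zip(ranks, d) if di > 0)
    t_minus = sum(r for r, di in zip(ranks, d) if di < 0)
    return t_plus, t_minus, len(d)

# Hypothetical matched pairs (one pair has a zero difference and is dropped)
x = [12, 15, 9, 20, 14, 11]
y = [10, 15, 12, 13, 13, 7]
print(signed_rank_sums(x, y))  # (12.0, 3.0, 5)
```

In this example the two sums add up to 15 = 5·6/2, the sum of the ranks 1 to 5.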

60

Two dependent samples At least ordinal scale


Wilcoxon Signed-Rank Test

• N = number of nonzero di’s.

• T+ = sum of the ranks which have a positive sign

• T– = sum of the ranks which have a negative sign

• Note: the sum of all of the ranks is N(N + 1)/2 = T+ + T–

• The distribution of T+ under H0 is known (the Wilcoxon signed-rank test corresponds to a permutation test for paired observations based on ranks rather than on the scores di)

61

Two dependent samples At least ordinal scale


Wilcoxon Signed-Rank Test

• Example

– T+ = 10 + 12 + 6 + 3 + 8 + 5 + 11 + 9 + 2 + 7 = 73

– 2*Pr(T+ ≥ 73, N=12) = 0.0048

Reject H0 (as with sign test, but note lower p-value here)

62
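The exact p-value in this example can be checked by enumeration. Under H0 each of the 2^N sign assignments to the ranks 1, …, N is equally likely, so Pr(T+ ≥ t) is the share of rank subsets whose sum is at least t. The sketch below assumes no ties and no zero differences; the function name `signed_rank_upper_tail` is illustrative.

```python
def signed_rank_upper_tail(t: int, n: int) -> float:
    """Pr(T+ >= t) under H0 for n nonzero, untied differences."""
    max_sum = n * (n + 1) // 2
    # counts[s] = number of subsets of the ranks {1, ..., n} whose sum is s
    counts = [0] * (max_sum + 1)
    counts[0] = 1
    for rank in range(1, n + 1):
        # Go downwards so that each rank is used at most once
        for s in range(max_sum, rank - 1, -1):
            counts[s] += counts[s - rank]
    return sum(counts[t:]) / 2**n

p_two_sided = 2 * signed_rank_upper_tail(73, 12)
print(p_two_sided)
```

This gives 20/4096 ≈ 0.0049. The value 0.0048 on the slide is presumably twice a tabulated one-sided probability of .0024.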

Two dependent samples At least ordinal scale


Wilcoxon Signed-Rank Test

• Remarks

– Ties

• pairs with di = 0 are dropped from the analysis and the sample size is reduced accordingly.

• When two or more d's have the same magnitude, their rank is the average of the ranks which would have been assigned if the d's had differed slightly

– Large Samples

• T+ is approximately normally distributed with mean $\mu_{T^+} = N(N+1)/4$ and variance $\sigma^2_{T^+} = N(N+1)(2N+1)/24$
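A sketch of the large-sample approximation, using the standard null moments of T+ (mean N(N+1)/4, variance N(N+1)(2N+1)/24) without a correction for ties. It is applied to the small example with T+ = 73 and N = 12 only to show how far the approximation is from the exact result at that sample size; the function name `signed_rank_z` is illustrative.

```python
from math import sqrt
from statistics import NormalDist

def signed_rank_z(t_plus: float, n: int) -> float:
    """z statistic for T+ under H0 (no tie correction)."""
    mean = n * (n + 1) / 4
    var = n * (n + 1) * (2 * n + 1) / 24
    return (t_plus - mean) / sqrt(var)

z = signed_rank_z(73, 12)
p_approx = 2 * (1 - NormalDist().cdf(abs(z)))
print(f"z = {z:.3f}, two-sided p = {p_approx:.4f}")
```

With N = 12 the approximate p-value is somewhat larger than the exact one, which is why exact tables are preferred for small samples.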

63

Two dependent samples At least ordinal scale


SELECTION OF SAMPLE SIZE - POWER ANALYSIS

64


Power analysis

65

                                 True state of the world (population)
Test result (based on sample)    H0 is true          H1 is true
Do not reject H0                 Correct (1 − α)     Type II error (β)
Reject H0                        Type I error (α)    Correct (1 − β), “power”


Power analysis

• Type I error:

– rejecting H0 when it is, in fact, true.

– Pr(Type I error) =: α

– In experimental economics, common values of α are .05 and .01

• Type II error:

– failing to reject H0 when, in fact, it is false.

– Pr(Type II error) =: β

66

[Figure: distribution of the test statistic under H0 and true distribution of the test statistic]


Power analysis


• Power of a test:

– Probability of correctly detecting an effect when it really exists in the population = 1 − Pr(Type II error) = 1 − β

– Usually desired to be ≥ 0.80

67


Power analysis

68

[Figure: distribution of the test statistic under H0 and true distribution of the test statistic]


Power analysis

• Power depends on

– Level of significance α (+)

– True effect size in the population (+)

– Sample size N (+)

– Variance in the data (−)

– The kind of test (e.g., Sign test versus Wilcoxon signed rank test)

– The nature of H1 (a one-sided test has more power than a two-sided test if the effect lies in the hypothesized direction)

– … and other variables, depending upon the test being done.
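The directions listed above can be illustrated with a case where power has a closed form. The sketch below assumes a one-sample z-test of H0: μ = 0 with known standard deviation, which is an assumption made here for illustration and not a test discussed in the slides; the function name `z_test_power` and all parameter values are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def z_test_power(effect, sigma, n, alpha=0.05, two_sided=False):
    """Power of a one-sample z-test of H0: mu = 0 when the true mean is `effect`."""
    std = NormalDist()
    shift = effect * sqrt(n) / sigma  # mean of the z statistic under H1
    if two_sided:
        crit = std.inv_cdf(1 - alpha / 2)
        return (1 - std.cdf(crit - shift)) + std.cdf(-crit - shift)
    crit = std.inv_cdf(1 - alpha)
    return 1 - std.cdf(crit - shift)

base = z_test_power(effect=0.5, sigma=1.0, n=20)
print("baseline          ", round(base, 3))
print("alpha = .01       ", round(z_test_power(0.5, 1.0, 20, alpha=0.01), 3))
print("larger effect     ", round(z_test_power(0.8, 1.0, 20), 3))
print("larger sample     ", round(z_test_power(0.5, 1.0, 40), 3))
print("larger variance   ", round(z_test_power(0.5, 2.0, 20), 3))
print("two-sided H1      ", round(z_test_power(0.5, 1.0, 20, two_sided=True), 3))
```

Each line changes one factor relative to the baseline, so the output shows the sign of its effect on power.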

69


Power analysis

• Given

– The kind of test (and nature of H1)

– Probability of Type-I error

– Power (1−β)

– Presumed size of effect/parameter

– Variance in the data

– (and further assumptions, depending on the test)

we can determine the smallest sample size needed to detect the presumed effect with probability 1−β
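As an illustration of this calculation, the sketch below again assumes a one-sided, one-sample z-test with known standard deviation (an assumption of this example, not a test from the slides). For that test the required sample size is the smallest N with N ≥ ((z_{1−α} + z_{1−β})·σ/δ)², where δ is the presumed effect. The function name `required_n` is hypothetical.

```python
from math import ceil
from statistics import NormalDist

def required_n(effect, sigma, alpha=0.05, power=0.80):
    """Smallest N giving at least `power` for a one-sided one-sample z-test."""
    std = NormalDist()
    z_alpha = std.inv_cdf(1 - alpha)  # critical value of the test
    z_beta = std.inv_cdf(power)       # quantile for the desired power 1 - beta
    return ceil(((z_alpha + z_beta) * sigma / effect) ** 2)

# Presumed effect of half a standard deviation, alpha = .05, power = .80
print(required_n(effect=0.5, sigma=1.0))  # 25
```

Halving the presumed effect roughly quadruples the required sample size, since N grows with (σ/δ)².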

70


Power analysis

• Ways to determine power

– If the (approximate) distribution of the test statistic is known, we may calculate the power for given parameters directly

– If the distribution of the test statistic is not known or if an analytical solution is too tedious (e.g., for some parameter of a structural model), we may determine power by simulation

• Example: Two-sample test of proportions

See script
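The script referred to above is not reproduced here. The following is an independent minimal sketch of power by simulation for a two-sample test of proportions, assuming equal group sizes and a two-sided pooled z-test. The function names, the proportions 0.3 and 0.6, and the group size of 50 are all hypothetical.

```python
import random
from math import sqrt
from statistics import NormalDist

def prop_test_rejects(x1, n1, x2, n2, alpha=0.05):
    """Two-sided pooled z-test of H0: p1 = p2; True if H0 is rejected."""
    p_pool = (x1 + x2) / (n1 + n2)
    se = sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    if se == 0:
        return False  # only successes or only failures: no evidence of a difference
    z = (x1 / n1 - x2 / n2) / se
    return abs(z) > NormalDist().inv_cdf(1 - alpha / 2)

def simulated_power(p1, p2, n, reps=2000, alpha=0.05, seed=1):
    """Share of simulated experiments (n subjects per group) that reject H0."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(reps):
        x1 = sum(rng.random() < p1 for _ in range(n))  # successes in group 1
        x2 = sum(rng.random() < p2 for _ in range(n))  # successes in group 2
        rejections += prop_test_rejects(x1, n, x2, n, alpha)
    return rejections / reps

print("power:", simulated_power(0.3, 0.6, n=50))
print("size :", simulated_power(0.5, 0.5, n=50))
```

Running the same simulation with p1 = p2 estimates the actual Type I error rate, which is a useful check that the test holds its nominal level at the chosen sample size.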

71