tauchi – tampere unit for computer-human interaction erit 2015: data analysis and interpretation...

TAUCHI – Tampere Unit for Computer-Human Interaction

ERIT 2015: Data analysis and interpretation (1 & 2)

Hanna Venesvirta
Tampere Unit for Computer-Human Interaction


Aims

• See which analyses are used for the course projects, and how


Overview of comparing samples

• Our aim is simple: we wish to find out whether the means of our collected data samples are far enough apart to conclude that the underlying means are likely to be different


Overview of comparing samples

• Null hypothesis: there is no difference
• Alternative hypothesis: there is a difference
• The aim is to reject the null hypothesis
• A result is “statistically significant” when data like ours would be unlikely if the null hypothesis were true: p-value < 0.05


Analysis of Variance (ANOVA)

• Is used to find differences between the means of more than two samples
  • Two-sample designs: t-tests
• Can be used for testing the effects of more than one independent variable (IV) at the same time
  • 2-way / 3-way / etc. designs


Repeated measures ANOVA

• Is used if we have measured all the participants under all the different levels of the (different) IV(s)

• Standard ANOVA cannot be used as the data is correlated


Analysis example step-by-step

• Experimental task: select an object as fast as possible
• Dependent variable: selection time (ms)
• One independent variable: diameter of an object
  • With three levels: diameter either 25, 30, or 40 mm
• All the participants performed the same task -> one-way within-subjects design


Participant no.   Diameter: 25 mm   Diameter: 30 mm   Diameter: 40 mm
 1                2491              1240              1155
 2                6462              1852              2603
 3                1007               738               747
 4                1164               860               806
 5                1890              1919              1226
 6                3400              1238              1386
 7                1092               874               874
 8                2180              1635              1880
 9                1614               949               971
10                1663              1442              1782
11                1599              1066              1277
12                1082              1160              1254
13                1425              1142              1329
14                1212              2521              1197
15                2542              1703              1128
16                1472              1861              1349
17                1463              1090              1017
18                1048              1073              1037
19                1289              1857              1175
20                1712               899               917

Note! The values per participant per level of IV are averages of several tasks - usually one exact task is repeated several times during the trial.




           Diameter: 25 mm   Diameter: 30 mm   Diameter: 40 mm
Mean       1890              1356              1255
S.E.M.     276.25            105.30            95.39


As each participant's value is already an average over several tasks, these means are averages of averages.
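The Mean and S.E.M. rows can be reproduced outside a spreadsheet. The following is a minimal sketch in Python, assuming the values are copied from the example table (rounded to whole milliseconds, so the last decimals may differ slightly from the slide). The names `DATA` and `mean_and_sem` are illustrative only.

```python
import math
import statistics

# Selection times (ms), one row per participant: (25 mm, 30 mm, 40 mm).
# Values copied from the example table, rounded to whole milliseconds.
DATA = [
    (2491, 1240, 1155), (6462, 1852, 2603), (1007, 738, 747), (1164, 860, 806),
    (1890, 1919, 1226), (3400, 1238, 1386), (1092, 874, 874), (2180, 1635, 1880),
    (1614, 949, 971), (1663, 1442, 1782), (1599, 1066, 1277), (1082, 1160, 1254),
    (1425, 1142, 1329), (1212, 2521, 1197), (2542, 1703, 1128), (1472, 1861, 1349),
    (1463, 1090, 1017), (1048, 1073, 1037), (1289, 1857, 1175), (1712, 899, 917),
]


def mean_and_sem(values):
    """Return (mean, standard error of the mean) for one sample."""
    sd = statistics.stdev(values)  # sample SD, n - 1 in the denominator
    return statistics.mean(values), sd / math.sqrt(len(values))


columns = list(zip(*DATA))  # one tuple of 20 values per diameter level
summary = [mean_and_sem(col) for col in columns]

for label, (m, sem) in zip(("25 mm", "30 mm", "40 mm"), summary):
    print(f"Diameter {label}: mean = {m:.0f} ms, S.E.M. = {sem:.1f} ms")
```

The S.E.M. is the sample standard deviation divided by the square root of the number of participants, which is also the quantity used for the error bars in the graphs.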


Visualize your data! (in this case: means)

• Good for initial inspection of the possible difference

• Excellent for showing a summary of the results to the reader


Visualize your data!

• Column graphs are good when presenting means

• The one below shows only the means of different levels of the IV

[Figure: column graph of mean pointing time (ms) by target size (Diameter 25 mm, Diameter 30 mm, Diameter 40 mm); y-axis from 0 to 2000 ms]


Visualize your data!

• This one also shows the variability of the data sample
  • Here: Standard Error of the Mean (S.E.M.)

For adding error bars to graphs, see, e.g., https://www.youtube.com/watch?v=AfAG61UWsWA

[Figure: column graph of mean pointing time (ms) by target size, with S.E.M. error bars; y-axis from 0 to 2500 ms]


[Figure: column graph of mean pointing time (ms) by target size, repeated from the previous slide]


Means: 1890 ms (25 mm), 1356 ms (30 mm), 1255 ms (40 mm)
S.E.M.: 276.25 (25 mm), 105.30 (30 mm), 95.39 (40 mm)

The means differ!

…significantly?


Data to SPSS?

1) Select the “Variable View” tab from the bottom left corner
2) Add descriptive variable names
3) Select the “Data View” tab from the bottom left corner
4) Add your data by, e.g., copy-and-paste from, e.g., Excel
• NOTE! Paste only the numbers – you already defined the column headings in steps 1-2


Parametric tests – One-way repeated measures ANOVA and Paired samples t-test


From the output, find the table called “Tests of Within-Subjects Effects” – this is where the ANOVA result is
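To see where the F-value in that table comes from, the one-way repeated measures ANOVA can be computed by hand. This is a minimal sketch, not the SPSS procedure itself. It assumes the rounded values from the example table, and all variable names are illustrative. It gives the uncorrected ("Sphericity Assumed") degrees of freedom. The F-value itself is the same on every row of the SPSS table, only the degrees of freedom and the p-value change.

```python
# Selection times (ms), one row per participant: (25 mm, 30 mm, 40 mm).
# Values copied from the example table, rounded to whole milliseconds.
DATA = [
    (2491, 1240, 1155), (6462, 1852, 2603), (1007, 738, 747), (1164, 860, 806),
    (1890, 1919, 1226), (3400, 1238, 1386), (1092, 874, 874), (2180, 1635, 1880),
    (1614, 949, 971), (1663, 1442, 1782), (1599, 1066, 1277), (1082, 1160, 1254),
    (1425, 1142, 1329), (1212, 2521, 1197), (2542, 1703, 1128), (1472, 1861, 1349),
    (1463, 1090, 1017), (1048, 1073, 1037), (1289, 1857, 1175), (1712, 899, 917),
]

n = len(DATA)      # number of participants
k = len(DATA[0])   # number of levels of the IV

grand_mean = sum(sum(row) for row in DATA) / (n * k)
level_means = [sum(col) / n for col in zip(*DATA)]
participant_means = [sum(row) / k for row in DATA]

# Partition the total variation: levels of the IV, participants, and the rest.
ss_total = sum((x - grand_mean) ** 2 for row in DATA for x in row)
ss_levels = n * sum((m - grand_mean) ** 2 for m in level_means)
ss_participants = k * sum((m - grand_mean) ** 2 for m in participant_means)
ss_error = ss_total - ss_levels - ss_participants

df_levels = k - 1
df_error = (k - 1) * (n - 1)
f_value = (ss_levels / df_levels) / (ss_error / df_error)

print(f"F({df_levels}, {df_error}) = {f_value:.2f}")
```

Removing the between-participant variation from the error term is what separates this from a standard ANOVA on the same numbers.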


…but which row to read?

???


Go back up and find the table called “Mauchly's Test of Sphericity”

It tests whether the data look like this: …or more like this:


If the result from this test is significant, the variances of the differences between the levels of the IV are not equal, that is, sphericity cannot be assumed


Back to the result table…

As we cannot assume sphericity, we cannot read the first row of the result table; thus we read the second row (Greenhouse-Geisser)


Also with the Greenhouse-Geisser corrected degrees of freedom, the significance value is less than 0.05


NOTE: If the result in the Mauchly's test table is not significant, we can assume sphericity, and thus we can use the result from the first row
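The corrected degrees of freedom on the Greenhouse-Geisser row are the uncorrected ones multiplied by an estimate called epsilon. The sketch below uses the standard Greenhouse-Geisser formula on the covariance matrix of the three levels. It assumes the rounded example data, and the function names are illustrative, not SPSS terms.

```python
# Selection times (ms), one row per participant: (25 mm, 30 mm, 40 mm).
# Values copied from the example table, rounded to whole milliseconds.
DATA = [
    (2491, 1240, 1155), (6462, 1852, 2603), (1007, 738, 747), (1164, 860, 806),
    (1890, 1919, 1226), (3400, 1238, 1386), (1092, 874, 874), (2180, 1635, 1880),
    (1614, 949, 971), (1663, 1442, 1782), (1599, 1066, 1277), (1082, 1160, 1254),
    (1425, 1142, 1329), (1212, 2521, 1197), (2542, 1703, 1128), (1472, 1861, 1349),
    (1463, 1090, 1017), (1048, 1073, 1037), (1289, 1857, 1175), (1712, 899, 917),
]


def covariance_matrix(rows):
    """Sample covariance matrix (n - 1 denominator) of the columns of rows."""
    n = len(rows)
    cols = list(zip(*rows))
    means = [sum(c) / n for c in cols]
    return [
        [sum((a - ma) * (b - mb) for a, b in zip(ca, cb)) / (n - 1)
         for cb, mb in zip(cols, means)]
        for ca, ma in zip(cols, means)
    ]


def greenhouse_geisser_epsilon(rows):
    """Greenhouse-Geisser estimate of epsilon for a one-way within design."""
    s = covariance_matrix(rows)
    k = len(s)
    mean_diag = sum(s[i][i] for i in range(k)) / k
    mean_all = sum(sum(row) for row in s) / (k * k)
    row_means = [sum(row) / k for row in s]
    numerator = (k * (mean_diag - mean_all)) ** 2
    denominator = (k - 1) * (
        sum(v * v for row in s for v in row)
        - 2 * k * sum(m * m for m in row_means)
        + (k * mean_all) ** 2
    )
    return numerator / denominator


n, k = len(DATA), len(DATA[0])
epsilon = greenhouse_geisser_epsilon(DATA)
df1 = epsilon * (k - 1)
df2 = epsilon * (k - 1) * (n - 1)
print(f"epsilon = {epsilon:.2f}, corrected df = {df1:.1f} and {df2:.1f}")
```

Epsilon equals 1 when sphericity holds perfectly and shrinks towards 1/(k - 1) as the violation grows, so the correction only ever reduces the degrees of freedom.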


• Thus there is a difference, but where? -> we shall find out by running pairwise comparisons with paired samples t-tests
• NOTE: pairwise comparisons are not run if the ANOVA shows a non-significant result
• Comparing
  • 25 mm to 30 mm,
  • 25 mm to 40 mm, and
  • 30 mm to 40 mm
  -> 3 comparisons

Multiple comparisons – remember to adjust the significance level in order to control the Type I error rate!

Bonferroni correction: original significance level / number of comparisons

Here: 0.05/3 = ~0.017
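The Bonferroni arithmetic can be sketched as follows. The p-values are the "Sig. (2-tailed)" values reported in the Paired Samples Test output of this example. The variable names are illustrative.

```python
ALPHA = 0.05  # original significance level

# Sig. (2-tailed) values from the Paired Samples Test output of this example.
p_values = {
    "25 mm vs 30 mm": 0.059,
    "25 mm vs 40 mm": 0.007,
    "30 mm vs 40 mm": 0.337,
}

# Bonferroni: divide the significance level by the number of comparisons.
adjusted_alpha = ALPHA / len(p_values)

significant = {pair: p < adjusted_alpha for pair, p in p_values.items()}

print(f"adjusted significance level = {adjusted_alpha:.3f}")
for pair, is_significant in significant.items():
    print(f"{pair}: p = {p_values[pair]:.3f}, significant: {is_significant}")
```

An equivalent approach is to multiply each p-value by the number of comparisons and compare the result against the original 0.05.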


Paired Samples Test

Paired differences (Mean, Std. Deviation, Std. Error Mean, 95% Confidence Interval of the Difference), followed by t, df and Sig. (2-tailed):

Pair                                   Mean      Std. Deviation   Std. Error Mean   95% CI Lower   95% CI Upper   t       df   Sig. (2-tailed)
Pair 1: Diameter25mm - Diameter30mm    534.250   1188.619         265.783           -22.041        1090.542       2.010   19   .059
Pair 2: Diameter25mm - Diameter40mm    634.769    942.435         210.735           193.696        1075.842       3.012   19   .007
Pair 3: Diameter30mm - Diameter40mm    100.519    456.377         102.049          -113.072         314.110       0.985   19   .337

From the output, find the table called “Paired Samples Test” – here are the results

The p-value of the 25 mm vs. 40 mm comparison is smaller than the adjusted significance level (0.007 < 0.017), thus the significant difference is in this comparison…
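The columns of the Paired Samples Test table can be recomputed from the raw data. The sketch below assumes the rounded example values, so the last decimals may differ slightly from the SPSS output. The critical value `T_CRIT_DF19` is taken from a t table and is valid only for df = 19. `paired_t` is an illustrative helper, not an SPSS or library function.

```python
import math
import statistics

# Selection times (ms), one row per participant: (25 mm, 30 mm, 40 mm).
# Values copied from the example table, rounded to whole milliseconds.
DATA = [
    (2491, 1240, 1155), (6462, 1852, 2603), (1007, 738, 747), (1164, 860, 806),
    (1890, 1919, 1226), (3400, 1238, 1386), (1092, 874, 874), (2180, 1635, 1880),
    (1614, 949, 971), (1663, 1442, 1782), (1599, 1066, 1277), (1082, 1160, 1254),
    (1425, 1142, 1329), (1212, 2521, 1197), (2542, 1703, 1128), (1472, 1861, 1349),
    (1463, 1090, 1017), (1048, 1073, 1037), (1289, 1857, 1175), (1712, 899, 917),
]

T_CRIT_DF19 = 2.093  # two-tailed 95 % critical value of t for df = 19 (t table)


def paired_t(a, b):
    """Paired samples t-test statistics for two equally long samples."""
    diffs = [x - y for x, y in zip(a, b)]
    mean_d = statistics.mean(diffs)
    se = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return {
        "mean": mean_d,
        "se": se,
        "t": mean_d / se,
        "df": len(diffs) - 1,
        "ci": (mean_d - T_CRIT_DF19 * se, mean_d + T_CRIT_DF19 * se),
    }


d25, d30, d40 = zip(*DATA)
results = {
    "25 mm - 30 mm": paired_t(d25, d30),
    "25 mm - 40 mm": paired_t(d25, d40),
    "30 mm - 40 mm": paired_t(d30, d40),
}

for pair, r in results.items():
    low, high = r["ci"]
    print(f"{pair}: t({r['df']}) = {r['t']:.3f}, 95% CI [{low:.1f}, {high:.1f}]")
```

A confidence interval that does not include zero corresponds to p < 0.05 before the Bonferroni adjustment. The adjusted level of ~0.017 is stricter.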

[Figure: column graph of mean pointing time (ms) by target size]

…and you can check the direction of the difference from, e.g., the graph you made.


ANOVA reporting

• When reporting an ANOVA result, the following is needed:
  • (corrected) degrees of freedom (here: ~1.2 and ~22.4)
  • F-value (here: ~5.6)
  • p-value (here: p < 0.05)


ANOVA reporting

“One-way within-subjects ANOVA with object diameter as a factor revealed a statistically significant effect of object diameter, F(1.2, 22.4) = 5.6, p < 0.05.”


Reporting pairwise comparisons

• When reporting the results from pairwise comparisons (with paired samples t-tests), the following is needed:
  • Degrees of freedom (here: 19)
  • t-value (here: ~3.0)
  • p-value (here: p < 0.01)



Reporting pairwise comparisons

“Post hoc pairwise comparisons for the object diameter showed that the participants pointed significantly faster at the 40 mm diameter object than at the 25 mm diameter object, t(19) = 3.0, p < 0.01. Other pairwise comparisons were not statistically significant.”


Task: parametric analysis (1)

• Take the data from the following slide and
  • Create a data matrix in Excel & SPSS
  • Visualize the data
    • A column graph is recommended
  • Run the analysis with SPSS
  • Write down the results
• Send the graph & the written-down results to Hanna via mail as .pdf
• We shall take a look at this next week


Task: parametric analysis (2)

• Experimental task: select an object as accurately as possible
• Dependent variable: errors
• One independent variable: diameter of an object
  • With three levels: diameter either 25, 30, or 40 mm

Errors   P1   P2   P3   P4   P5    P6    P7    P8   P9    P10   P11   P12   P13   P14   P15   P16   P17   P18   P19   P20
25 mm    25   50   0    50   50    25    0     25   12.5  50    0     0     12.5  87.5  25    25    25    37.5  12.5  0
30 mm    0    12.5 0    0    25    12.5  37.5  0    25    0     12.5  12.5  50    75    12.5  12.5  25    12.5  12.5  0
40 mm    0    25   0    0    12.5  0     25    0    0     12.5  0     0     25    0     0     0     0     25    12.5  12.5


Non-parametric analysis – Friedman’s test and Wilcoxon Repeated Measures Signed-Rank test


Non-parametric analysis

• These analyses do not make any assumptions about the probability distribution of the data
• Before the analysis, the data is transformed to ranks (by the statistical software)
• Usually a non-parametric test has an equivalent parametric test
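The rank transformation mentioned above is normally done by the statistical software. As an illustration only, here is a minimal sketch of ranking one participant's values, with tied values sharing the average of their ranks. The function name `average_ranks` is hypothetical.

```python
def average_ranks(values):
    """Rank values from smallest (1) to largest; ties get the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        # Extend 'end' over the run of values equal to the one at 'pos'.
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared_rank = (pos + end) / 2 + 1  # average of the 1-based positions
        for i in order[pos:end + 1]:
            ranks[i] = shared_rank
        pos = end + 1
    return ranks


# Participant 1 of the example (25, 30, 40 mm): no ties.
print(average_ranks([2491, 1240, 1155]))
# Participant 7 of the example: the 30 mm and 40 mm values are tied
# in the rounded table, so they share the rank 1.5.
print(average_ranks([1092, 874, 874]))
```

Because only the order of the values matters after ranking, a single extreme value (such as the 6462 ms of participant 2) has no more influence than any other largest value.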


Equivalent tests

Parametric tests                      Non-parametric tests
• One-way repeated measures ANOVA     • Friedman’s rank test for k-correlated samples
• Matched pairs t-test                • Wilcoxon Repeated Measures Signed-Rank test


Friedman’s rank test for k-correlated samples


From the output, find “Test Statistics”

Test Statistics (a)

N             20
Chi-Square    10.900
df            2
Asymp. Sig.   .004

a. Friedman Test
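The Friedman statistic can be sketched from the within-participant ranks. This sketch assumes the rounded selection times from the example table and applies no correction for ties. In the rounded table participant 7 has a tie that the unrounded data behind the SPSS output apparently does not, so the statistic computed here comes out slightly above the 10.9 shown. The shortcut p = exp(-chi²/2) holds only for df = 2, that is, for three levels.

```python
import math

# Selection times (ms), one row per participant: (25 mm, 30 mm, 40 mm).
# Values copied from the example table, rounded to whole milliseconds.
DATA = [
    (2491, 1240, 1155), (6462, 1852, 2603), (1007, 738, 747), (1164, 860, 806),
    (1890, 1919, 1226), (3400, 1238, 1386), (1092, 874, 874), (2180, 1635, 1880),
    (1614, 949, 971), (1663, 1442, 1782), (1599, 1066, 1277), (1082, 1160, 1254),
    (1425, 1142, 1329), (1212, 2521, 1197), (2542, 1703, 1128), (1472, 1861, 1349),
    (1463, 1090, 1017), (1048, 1073, 1037), (1289, 1857, 1175), (1712, 899, 917),
]


def average_ranks(values):
    """Rank values from smallest (1) to largest; ties get the average rank."""
    ranks = []
    for v in values:
        smaller = sum(1 for other in values if other < v)
        equal = sum(1 for other in values if other == v)
        ranks.append(smaller + (equal + 1) / 2)
    return ranks


n = len(DATA)      # participants
k = len(DATA[0])   # levels of the IV

# Rank the three values of each participant, then sum the ranks per level.
ranked = [average_ranks(row) for row in DATA]
rank_sums = [sum(col) for col in zip(*ranked)]

# Friedman chi-square without tie correction.
chi_square = 12 / (n * k * (k + 1)) * sum(r * r for r in rank_sums) - 3 * n * (k + 1)
df = k - 1

# For df = 2 the chi-square upper tail probability is exactly exp(-x / 2).
p_value = math.exp(-chi_square / 2)

print(f"rank sums = {rank_sums}")
print(f"chi-square({df}) = {chi_square:.3f}, p = {p_value:.4f}")
```

The 25 mm level collects the largest rank sum, which matches the direction seen in the means.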


• Again, there is a difference; find out where -> pairwise comparisons, this time done with a non-parametric paired samples test
• Comparing
  • 25 mm to 30 mm,
  • 25 mm to 40 mm, and
  • 30 mm to 40 mm
  -> 3 comparisons

Multiple comparisons – remember to adjust the significance level in order to control the Type I error rate!

Bonferroni correction: original significance level / number of comparisons

Here: 0.05/3 = ~0.017


Wilcoxon Repeated Measures Signed-Rank test


Test Statistics (a)

                         Diameter30mm - Diameter25mm   Diameter40mm - Diameter25mm   Diameter40mm - Diameter30mm
Z                        -2.165 (b)                    -3.472 (b)                    -0.261 (b)
Asymp. Sig. (2-tailed)   .030                          .001                          .794

a. Wilcoxon Signed Ranks Test
b. Based on positive ranks.

From the output, find “Test Statistics”

Again, the p-value of the 25 mm vs. 40 mm comparison is smaller than the adjusted significance level (0.001 < 0.017), thus the significant difference is here again.
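The Z-value of the significant comparison can be reproduced with the normal approximation of the Wilcoxon signed-rank test. This is a minimal sketch, assuming the rounded example data for the 25 mm and 40 mm levels, where no difference is zero and no two absolute differences are tied. The tie correction that statistical software applies is therefore left out. The function name `wilcoxon_z` is illustrative.

```python
import math

# Selection times (ms) per participant, copied from the example table.
D25 = [2491, 6462, 1007, 1164, 1890, 3400, 1092, 2180, 1614, 1663,
       1599, 1082, 1425, 1212, 2542, 1472, 1463, 1048, 1289, 1712]
D40 = [1155, 2603, 747, 806, 1226, 1386, 874, 1880, 971, 1782,
       1277, 1254, 1329, 1197, 1128, 1349, 1017, 1037, 1175, 917]


def wilcoxon_z(first, second):
    """Normal-approximation Z and two-tailed p for second - first.

    Zero differences are dropped; tied absolute differences are assumed
    not to occur (no tie correction is applied).
    """
    diffs = [b - a for a, b in zip(first, second) if b != a]
    n = len(diffs)
    # Rank the absolute differences from smallest (1) to largest (n).
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    w_plus = sum(rank for rank, i in enumerate(order, start=1) if diffs[i] > 0)
    w_minus = n * (n + 1) / 2 - w_plus
    w = min(w_plus, w_minus)
    mean_w = n * (n + 1) / 4
    sd_w = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (w - mean_w) / sd_w
    p = math.erfc(abs(z) / math.sqrt(2))  # two-tailed normal probability
    return z, p


z, p = wilcoxon_z(D25, D40)
print(f"Z = {z:.3f}, p = {p:.6f}")
```

Only two participants were slower with the 40 mm object than with the 25 mm object, which is why the sum of the positive ranks is so small.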



Reporting Friedman’s test

“Friedman's test showed that there was a statistically significant effect of object diameter, χ²(2) = 10.9, p < 0.01.”


Reporting Wilcoxon test

Note: double-click the Test Statistics table (twice) and you will see a more accurate p-value; in this case p = 0.000517, which is significant at the 0.01 level, as the Bonferroni-adjusted level is 0.01 / 3 = ~0.0033.

“Post hoc pairwise comparisons with Wilcoxon signed-rank tests showed that the selection time was significantly shorter when the object diameter was 40 mm than when the diameter was 25 mm, Z = -3.47, p < 0.01. Other pairwise comparisons were not statistically significant.”


1) Check the previous task 2) Make non-parametric analysis

• Check your previous exercise and compare it to Hanna’s answer
  • The answer is on the course website, under schedule
  • Try to fix your answer if it is different
• Run a non-parametric analysis on the same error data in SPSS
  • Are the results different in any way?
• Mail the possible fix of the 1st task and the written-down results of the 2nd to Hanna
  • If the results differ, note it, and how
