Formal User Studies
Marti Hearst (UCB SIMS)
SIMS 213, UI Design & Development
April 13, 1999

Outline
Experiment Design
– Factoring Variables
– Interactions
Special considerations when involving human participants
Example: Marking Menus
– Motivation
– Hypotheses
– Design
– Analysis

Formal Usability Studies
Situations in which these are useful
– to determine time requirements for task completion
– to compare two designs on measurable aspects
  » time required
  » number of errors
  » effectiveness for achieving very specific tasks
Require Experiment Design

Experiment Design
Experiment design involves determining how many experiments to run and which attributes to vary in each experiment
Goal: isolate which aspects of the interface really make a difference

Experiment Design
Decide on
– Response variables
  » the outcome of the experiment
  » usually the system performance
  » aka dependent variable(s)
– Factors (aka attributes)
  » aka independent variables
– Levels (aka values for attributes)
– Replication
  » how often to repeat each combination of choices

Experiment Design
Studying a system; ignoring users
Say we want to determine how to configure the hardware for a personal workstation
– Hardware choices
  » which CPU (three types)
  » how much memory (four amounts)
  » how many disk drives (from 1 to 3)
– Workload characteristics
  » administration, management, scientific

Experiment Design
We want to isolate the effect of each component for the given workload type. How do we do this?
– WL1 CPU1 Mem1 Disk1
– WL1 CPU1 Mem1 Disk2
– WL1 CPU1 Mem1 Disk3
– WL1 CPU1 Mem2 Disk1
– WL1 CPU1 Mem2 Disk2
– …
There are (3 CPUs)*(4 memory sizes)*(3 disk-drive counts)*(3 workload types) = 108 combinations!

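A minimal sketch of the count above, assuming Python. The full factorial design is the Cartesian product of the levels of every factor. The level names (CPU1, Mem1, and so on) are illustrative placeholders in the style of the list above, not actual hardware.

```python
from itertools import product

# Levels for each factor; the names are illustrative placeholders.
workloads = ["WL1", "WL2", "WL3"]
cpus = ["CPU1", "CPU2", "CPU3"]
memories = ["Mem1", "Mem2", "Mem3", "Mem4"]
disks = ["Disk1", "Disk2", "Disk3"]

# Full factorial design: one run for every combination of levels.
full_factorial = list(product(workloads, cpus, memories, disks))

print(len(full_factorial))  # 3 * 3 * 4 * 3 = 108
for run in full_factorial[:5]:
    print(" ".join(run))    # same order as the list on the slide
```

The last factor passed to `product` varies fastest, which is why the first runs differ only in the disk level.
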
Experiment Design
One strategy to reduce the number of comparisons needed:
– pick just one attribute
– vary it
– hold the rest constant
Problems:
– inefficient
– might miss effects of interactions

Interactions among Attributes

Non-interacting:
        A1   A2
  B1     3    5
  B2     6    8

Interacting:
        A1   A2
  B1     3    5
  B2     6    9

[Figure: two line plots of these tables, labeled "Non-interacting" and "Interacting", with axis labels A1, A2 and B1, B2]

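One way to read the two tables, sketched in Python under the assumption that each cell is a single measured response: compare the effect of switching A1 to A2 at each level of B. If the two effects match, the factors do not interact. The function name `interaction_contrast` is made up for this illustration.

```python
def interaction_contrast(table):
    """Effect of A at B2 minus effect of A at B1.

    table[b][a] holds the response for level b of B and level a of A.
    A result of 0 means the effect of A does not depend on B.
    """
    effect_of_a_at_b1 = table["B1"]["A2"] - table["B1"]["A1"]
    effect_of_a_at_b2 = table["B2"]["A2"] - table["B2"]["A1"]
    return effect_of_a_at_b2 - effect_of_a_at_b1


# Values copied from the two tables on the slide.
non_interacting = {"B1": {"A1": 3, "A2": 5}, "B2": {"A1": 6, "A2": 8}}
interacting = {"B1": {"A1": 3, "A2": 5}, "B2": {"A1": 6, "A2": 9}}

print(interaction_contrast(non_interacting))  # 0: A adds 2 at both levels of B
print(interaction_contrast(interacting))      # 1: A adds 2 at B1 but 3 at B2
```

In a plot, a contrast of 0 corresponds to parallel lines. With real, noisy data the contrast would need a significance test rather than a comparison against exactly 0.
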
Experiment Design
Another strategy: figure out which attributes are important first
Do this by just comparing a few major attributes at a time
– if an attribute has a strong effect, include it in future studies
– otherwise assume it is safe to drop it
This strategy also allows you to find interactions between attributes

Experiment Design
Common practice: Fractional Factorial Design
– Just compare important subsets
– Use experiment design to partially vary the combinations of attributes
Blocking
– Group factors or levels together
– Use a Latin Square design to arrange the blocks

Between Groups vs. Within Groups
Do participants see only one design or both?
Between groups
– two groups of test users
– each group uses only 1 of the systems
Within groups experiment
– one group of test users
  » each person uses both systems
  » can’t use the same tasks (learning)
– Why is this a consideration?
– People often learn during the experiment.
Adapted from slide by James Landay

Special Considerations for Formal Studies with Human Participants
Studies involving human participants vs. measuring automated systems
– people get tired
– people get bored
– people (may) get upset by some tasks
– learning effects
  » people will learn how to do the tasks (or the answers to questions) if repeated
  » people will (usually) learn how to use the system over time

More Special Considerations
High variability among people
– especially when involved in reading/comprehension tasks
– especially when following hyperlinks! (can go all over the place)

Experiment Design Example: Marking Menus
Based on Kurtenbach, Sellen, and Buxton, Some Articulatory and Cognitive Aspects of “Marking Menus”, Graphics Interface ’94, http://reality.sgi.com/gordo_tor/papers

Experiment Design Example: Marking Menus
Pie marking menus can reveal
– the available options
– the relationship between mark and command
1. User presses down with stylus
2. Menu appears
3. User marks the choice, an ink trail follows

Why Marking Menus?
Supporting markings with pie menus should help transition between novice and expert
Useful for keyboardless devices
Useful for large screens
Pie menus have been shown to be faster than linear menus in certain situations

What do we want to know?
Are marking menus better than pie menus?
– Do users have to see the menu?
– Does leaving an “ink trail” make a difference?
– Do people improve on these new menus as they practice?
Related questions:
– What, if any, are the effects of different input devices?
– What, if any, are the effects of different size menus?

Experiment Factors
Isolate the following factors (independent variables):
– Menu condition
  » exposed, hidden, hidden w/marks (E, H, M)
– Input device
  » mouse, stylus, trackball (M, S, T)
– Number of items in menu
  » 4, 5, 7, 8, 11, 12 (note: both odd and even)
Response variables (dependent variables):
– Response Time
– Number of Errors

Experiment Hypotheses
Note these are stated in terms of the factors (independent variables)
– Exposed menus will yield faster response times and lower error rates, but not when menu size is small
– Response variables will monotonically increase with menu size for exposed menus
– Response time will be sensitive to number of menu choices for hidden menus (familiar ones will be easier, e.g., 8 and 12)
– Stylus better than Mouse better than Trackball

Experiment Hypotheses
– Device performance independent of menu type
– Performance on hidden menus (both marking and hidden) will improve steadily across trials. Performance on exposed menus will remain constant.
Experiment Design
Participants
– 36 right-handed people
  » usually gender distribution is stated
– considerable mouse experience
– (almost) no trackball, stylus experience
Task
– Select target “slices” from a series of different pie menus as quickly and accurately as possible
– Menus were simply numbered segments
  » meaningful items would have longer learning times
– Participants saw running scores
  » lose points for wrong selection

Experiment Design
One between-subjects factor
– Menu Type
  » Three levels: E, H, or M
Two within-subjects factors
– Device Type
  » Three levels: M, T, or S
– Number of Menu Items
  » Six levels: 4, 5, 7, 8, 11, 12
How should we arrange these?

Experiment Design
Between-subjects design (36 participants, 12 per menu type):
   E    H    M
  12   12   12
How to arrange the devices?

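A minimal sketch of one common way to form the three between-subjects groups: shuffle the participants and cut the list into equal parts. The slides do not say how participants were actually assigned, so random assignment is an assumption here, and the participant IDs and the helper name `assign_groups` are hypothetical.

```python
import random


def assign_groups(participants, groups, seed=None):
    """Randomly split participants into equally sized groups."""
    rng = random.Random(seed)        # seeded so the split can be reproduced
    shuffled = list(participants)
    rng.shuffle(shuffled)
    size = len(shuffled) // len(groups)
    return {
        group: shuffled[i * size:(i + 1) * size]
        for i, group in enumerate(groups)
    }


# 36 hypothetical participant IDs, three menu-type groups (E, H, M).
participants = [f"P{i:02d}" for i in range(1, 37)]
assignment = assign_groups(participants, ["E", "H", "M"], seed=0)

for group, members in assignment.items():
    print(group, len(members))       # each group has 12 members
```

Each participant appears in exactly one group, which is what makes menu type a between-subjects factor.
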
Experiment Design
   E    H    M
  12   12   12

   M    T    S
   T    S    M
   S    M    T
A Latin Square: no row or column repeats a label

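A small sketch of how a square like this can be produced and checked, assuming Python. Cyclic rotation is only one of several ways to build a Latin square, and the slides do not say how the original one was constructed; for the labels M, T, S it happens to reproduce the arrangement shown. Both function names are made up for this example.

```python
def cyclic_latin_square(labels):
    """Build an n x n Latin square by rotating the label list one step per row."""
    n = len(labels)
    return [[labels[(row + col) % n] for col in range(n)] for row in range(n)]


def is_latin_square(square):
    """True if every row and every column contains each label exactly once."""
    n = len(square)
    symbols = set(square[0])
    rows_ok = all(len(row) == n and set(row) == symbols for row in square)
    cols_ok = all({square[r][c] for r in range(n)} == symbols for c in range(n))
    return len(symbols) == n and rows_ok and cols_ok


square = cyclic_latin_square(["M", "T", "S"])  # mouse, trackball, stylus
for row in square:
    print(" ".join(row))
print(is_latin_square(square))  # True
```

Because each device appears once in every position, order effects such as fatigue or learning are spread evenly across devices.
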
Experiment Design
   E    H    M

   M    T    S
   T    S    M
   S    M    T
How to arrange the menu sizes?
Block by size, then randomize the blocks.

Experiment Design
   E    H    M

   M    T    S
   T    S    M
   S    M    T
Example arrangement of menu-size blocks:
   5   11
  12    8
   7    4
Block by size, then randomize the blocks.

Experiment Design
   E    H    M

   M    T    S
   T    S    M
   S    M    T
Example arrangements of menu-size blocks:
   5   11
  12    8
   7    4

   7    8
  12    5
   4   11
40 trials per block

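A sketch of "block by size, then randomize the blocks" in Python. It assumes that each participant sees all six menu sizes once per device and that a block is 40 trials with one menu size on one device; the slides imply this but do not spell it out. The names `block_order` and `session_plan` are hypothetical.

```python
import random

MENU_SIZES = [4, 5, 7, 8, 11, 12]
TRIALS_PER_BLOCK = 40


def block_order(rng):
    """Return the six menu-size blocks in a fresh random order."""
    order = list(MENU_SIZES)
    rng.shuffle(order)
    return order


def session_plan(device_order, seed=None):
    """List the (device, menu size) blocks for one participant, in running order."""
    rng = random.Random(seed)
    return [(device, size) for device in device_order for size in block_order(rng)]


# One row of the Latin square gives this participant's device order.
plan = session_plan(["M", "T", "S"], seed=1)
total_trials = len(plan) * TRIALS_PER_BLOCK

print(len(plan))      # 3 devices * 6 menu sizes = 18 blocks
print(total_trials)   # 18 blocks * 40 trials = 720 trials
```

Keeping all trials of one menu size together (blocking) avoids switching menu layouts on every trial, while shuffling the blocks keeps menu size from being confounded with practice.
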
Experiment Overall Results

  Group     Mean RT (s.d.)   Mean Errors (s.d.)   Mean % Errors
  Exposed   0.98 (.23)       0.64 (1.0)           1.6%
  Hidden    1.10 (.31)       3.27 (3.57)          8.2%
  Marking   1.10 (.31)       3.76 (3.67)          9.4%

So exposing menus is faster … or is it?
Let’s factor things out more.

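A quick consistency check on the table, sketched in Python. It assumes that the mean error counts are per block of 40 trials. That is an inference from the "40 trials per block" note on the design slide and is not stated with the results themselves.

```python
TRIALS_PER_BLOCK = 40  # assumption: error counts are per 40-trial block

# Mean error counts and reported percentages, copied from the results table.
mean_errors = {"Exposed": 0.64, "Hidden": 3.27, "Marking": 3.76}
reported_percent = {"Exposed": 1.6, "Hidden": 8.2, "Marking": 9.4}

percent_errors = {
    group: 100 * errors / TRIALS_PER_BLOCK
    for group, errors in mean_errors.items()
}

for group, percent in percent_errors.items():
    print(f"{group}: computed {percent:.3f}%, reported {reported_percent[group]}%")
```

Under that assumption the computed percentages agree with the reported column to within rounding.
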
A Learning Effect
When we graph over the number of trials, we find a difference between exposed and hidden menus.
This suggests that participants may eventually become faster using marking menus. (hypothesized)
A later study verified this.

Factoring to Expose Interactions
Increasing menu size increases selection time and number of errors (hypothesized)
No differences across menu groups in terms of response time.
That is, until we factor by menu size AND group
– Then we see that menu size has effects on hidden groups not seen on exposed group
– This was hypothesized (12 easier than 11)

Factoring to Expose Interactions
Stylus and mouse outperformed trackball (hypothesized)
Stylus and mouse the same (not hypothesized)
Initially, effect of input device did not interact with menu type
– this is when comparing globally
– BUT ...
More detailed analysis:
– Compare both by menu type and device type
– Stylus significantly faster with Marking group
– Trackball significantly slower with Exposed group
– Not hypothesized!

[Figure: average response time and errors as a function of device, menu size, and menu type]
Potential explanations:
– Markings provide feedback for when stylus is pressed properly.
– Ink trail is consistent with the metaphor of using a pen.

Experiment Design
   E    H    M

   M    T    S
   T    S    M
   S    M    T
How can we tell if order in which the device appears has an effect on the final outcome?
Some evidence:
– There is no significant difference among devices in the Hidden group.
– Trackball was slowest and most error prone in all three cases.
– Still, there may be some hidden interactions, but unlikely to be strong given the previous graph.

Statistical Tests
Need to test for statistical significance
– This is a big area
– Assuming a normal distribution:
  » Student’s t-test to compare the means of two groups
  » ANOVA to compare the means of more than two groups

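A minimal sketch of the two-sample Student's t statistic in Python, using only the standard library. It assumes independent samples, roughly normal data, and equal variances in both groups. The response times are invented for illustration and are not data from the marking-menu study; `student_t` is a made-up name.

```python
from math import sqrt
from statistics import mean, variance


def student_t(a, b):
    """Two-sample Student's t statistic with pooled variance.

    Returns the t statistic and its degrees of freedom.
    """
    na, nb = len(a), len(b)
    df = na + nb - 2
    # Pooled estimate of the common variance of the two groups.
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    t = (mean(a) - mean(b)) / sqrt(pooled * (1 / na + 1 / nb))
    return t, df


# Hypothetical response times in seconds for two designs.
design_a = [0.91, 1.02, 0.88, 1.10, 0.95, 1.01]
design_b = [1.12, 1.25, 1.03, 1.30, 1.08, 1.19]

t, df = student_t(design_a, design_b)
print(f"t = {t:.2f} with {df} degrees of freedom")
```

The statistic is then compared with a t distribution with `df` degrees of freedom, from a table or a statistics library, to get a p-value. With more than two groups, ANOVA takes the place of repeated pairwise t-tests.
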
Analyzing the Numbers
Example: trying to get task time <=30 min.
– test gives: 20, 15, 40, 90, 10, 5
– mean (average) = 30
– median (middle) = 17.5
– looks good!
– wrong answer, not certain of anything
Factors contributing to our uncertainty
– small number of test users (n = 6)
– results are very variable (standard deviation = 32)
  » std. dev. measures dispersal from the mean
Adapted from slide by James Landay

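The summary numbers above can be reproduced with Python's standard `statistics` module. This sketch uses the sample standard deviation (dividing by n - 1), which is what matches the value of about 32 quoted on the slide.

```python
from statistics import mean, median, stdev

# Task times in minutes, from the slide.
times = [20, 15, 40, 90, 10, 5]

print(mean(times))             # 30
print(median(times))           # 17.5
print(round(stdev(times), 1))  # 31.8, i.e. about 32
```

The single 90-minute result pulls the mean well above the median, which is part of why six results say little about the typical task time.
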
Analyzing the Numbers (cont.)
This is what statistics is for
Crank through the procedures and you find
– 95% certain that typical value is between 5 & 55
Usability test data is quite variable
– need lots to get good estimates of typical values
– 4 times as many tests will only narrow range by 2x
Adapted from slide by James Landay

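A sketch of where "between 5 & 55" can come from, assuming the task times 20, 15, 40, 90, 10, 5 and a normal-approximation interval with z = 1.96. The slide does not name the procedure, so the choice of z is an assumption; with only six users a t-based interval would be wider. The helper names are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def half_width(sd, n, z=1.96):
    """Half-width of a normal-approximation 95% confidence interval."""
    return z * sd / sqrt(n)


def normal_ci(sample, z=1.96):
    """Confidence interval for the mean: mean +/- z * s / sqrt(n)."""
    m = mean(sample)
    h = half_width(stdev(sample), len(sample), z)
    return m - h, m + h


times = [20, 15, 40, 90, 10, 5]
low, high = normal_ci(times)
print(round(low, 1), round(high, 1))  # 4.6 55.4

# Four times as many tests at the same spread: the range shrinks by about 2x.
ratio = half_width(32, 6) / half_width(32, 24)
print(ratio)
```

The half-width scales with 1/sqrt(n), which is why quadrupling the number of tests only halves the range.
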
Followup Work
Hierarchical Marking Menu study

Followup Work
Results of use of marking menus over an extended period of time
– two person extended study
– participants became much faster using gestures without viewing the menus

Followup Work
Results of use of marking menus over an extended period of time
– participants temporarily returned to “novice” mode when they had been away from the system for a while

Summary
Formal studies can reveal detailed information but take extensive time/effort
Human participants entail special requirements
Experiment design involves
– Factors, levels, participants, tasks, hypotheses
– Important to consider which factors are likely to have real effects on the results, and isolate these
Analysis
– Often need to involve a statistician to do it right
– Need to determine statistical significance
– Important to make plots and explore the data

References
Kurtenbach, Sellen, and Buxton, Some Articulatory and Cognitive Aspects of “Marking Menus”, Graphics Interface ’94, http://reality.sgi.com/gordo_tor/papers
Kurtenbach and Buxton, User Learning and Performance with Marking Menus, Graphics Interface ’94, http://reality.sgi.com/gordo_tor/papers
Jain, The Art of Computer Systems Performance Analysis, Wiley, 1991
http://www.statsoft.com/textbook/stanman.html
Gonick and Smith, The Cartoon Guide to Statistics, HarperPerennial, 1993
Dix et al. textbook