Experimental Design
• What is it?
• When to use it?
• Types of Variables
• Setting up an Experiment
• Case Study
• Analyzing the data
9/25/2008 Comp 7570 - CSCW (PPI) 2
HCI
• Concerned with the study, design and implementation of human-centric interactive computer systems
• Interdisciplinary including design, psychology, anthropology, sociology, ergonomics, etc…
• Significant amount devoted to evaluation of novel designs
• But then how do we evaluate systems or design choices?
Types of evaluation
• Users not involved
• Supported by practice/theory
• Occurs in realistic setting
• External validity: degree to which research results apply to real situations
• Large Sampling
• Subjective/qualitative
Done this someway
• In one form or another we have resorted to experimenting
• Also an important tool for survival!
– experimented with various types of ear plugs
– experimented with different types of pacifiers
– experimented with various types of snow tires
– etc…
• But somewhat different, i.e. less formal
Approaches: Naturalistic
• Naturalistic:
– describes an ongoing process as it evolves over time
– observation occurs in realistic setting
• ecologically valid
– “real life”
• External validity
– degree to which research results apply to real situations
Approaches: Naturalistic
• Advantage
– Can state something about the user’s behavior in an actual environment
• Disadvantage
– Cannot control (or even know) all the contributing factors to user’s performance
• e.g. do they use menus more frequently than toolbar buttons because the icons are not comprehensible, because the buttons are too small, or simply because they do not know that the buttons exist? (The list can go on.)
Approaches: Experimental
• In certain cases you want to make a statement about a particular UI design choice
– e.g. “I really want to know whether the size of buttons contributes to how quickly users click on them”
or
– e.g. “I want to know whether a menu designed in a circular shape (pie menu) is more effective than a regular menu”
or
– e.g. “I want to find out whether technique 1 (or system 1) is better than technique 2 (system 2)”
• You want to make some general statements that are widely applicable (not restricted to your app)
Approaches: Experimental
• Experimental
– study relations by manipulating one or more independent variables
• experimenter controls all environmental factors
– observe effect on one or more dependent variables
• Internal validity
– confidence that we have in our explanation of experimental results
• Trade-off: Natural vs. Experimental
– precision and direct control over experimental design
versus
– desire for maximum generalizability in real life situations
Quantitative Evaluation
• What task to evaluate?
– Depends on application
– Attempt to find canonical task(s)
• i.e. what would be a set of tasks that can be used to test whether larger icons contribute to faster selection?
• Common measures
– Task completion time
– Error rate
– Learning rate (novice -> expert transition)
– Fatigue, comfort?
– etc.
What task to evaluate?
• Example: Pointing Device Evaluation
• Real task: interacting with GUIs
– pointing is fundamental
• Experimental task: target acquisition
– abstract, elementary, essential
[Figure: target acquisition task, with dimensions labeled D and W]
HCI as a Science
• Researchers in HCI seek to:
– Establish relationships between circumstances and interaction behavior
– Fit these relationships into a given body of knowledge
• Need to demonstrate that one event is related to a second event in some predictable way
• At least one of these events must be measurable
• Have to deal with variability between humans and also between trials (different in other sciences)
– handle variability using randomness, multiple subjects, statistics
– handle variability by controlling as much as possible
Example
• Is it easier to read with CAPS or without Caps?
• Want to make a conclusive and general statement whether CAPS are more efficient than non-Caps
• Conclusion would look like:
– “for text, CAPS are 20% less efficient than non-Caps”
or
– “for text, CAPS are 25% more efficient than non-Caps”
Example
• How do we test this question?
• Need to come up with
– a hypothesis
– a set of variables we are going to manipulate
– a set of variables we are going to measure
– a way to reduce the number of confounding variables
– a task
– a set of randomized trials
Example
THE BROWN FOX JUMPED OVER THE MOON.
OR, SHOULD IT SAY THE BROWN FOX
JUMPED OVER THE CAT.
Example
The brown fox jumped over the moon. Or, should it say the brown fox jumped over the cat.
Example
• Would it be sufficient to simply show those two slides and do some measurements?
• What are some problems with this kind of setup?
• What would we measure?
• Let’s first look at some definitions
Hypothesis
• Statement or claim that the experimenter wants to test
• Defines the nature of the relationship between two types of variables
Hypothesis
H0: there is no difference in the number of cavities between children and teenagers who use Crest toothpaste and those who use no-teeth toothpaste
H1: children and teenagers using Crest toothpaste have fewer cavities than those who use no-teeth toothpaste
Hypothesis
H0: there is no difference in user performance (time and error rate) when selecting a single item from a pop-up or a pull down menu, regardless of the subject’s previous expertise in using a mouse or using the different menu types
[Figure: two menu layouts for the menus File, Edit, View, Insert, each showing the items New, Open, Close, Save]
Hypothesis
• Hypothesis can be softer and uncertain:
– Will color affect recognition speed?
– Will proximity affect perceptual organization?
– Etc…
Independent Variables
• At least one circumstance is of major interest in an experiment
– i.e. menu type in selection time experiment OR text type
• Referred to as an independent variable
– Independent of the subject’s behavior or performance
• Want to choose two or more levels of this circumstance to present (manipulate)
– Nothing the subject does can change the levels of the independent variable
• CAPS vs. non-caps
• What are the independent variables in the toothpaste experiment? What are the different levels?
Dependent Variables
• Want to measure a subject’s behavior in response to manipulations of the independent variable
• Dependent variable, depends on what the subject does
• Statement about the expected nature of the relationship between the independent and dependent variables is referred to as hypothesis (as seen previously)
Control Variables
• Only want to manipulate one circumstance
– the independent variable
• All other circumstances need to be controlled
• These become control variables
– control font of two different types of menus
– control color coding on two different types of visualizations
• Have to be controlled across all levels of the IV
– confirm that change in dependent variable due to change in independent variable
• However impossible to control everything
– More control leads to less generalization
Confounding Variables
• A confounding variable is any factor that varies with the independent variable
• Suppose we want to use 5 different levels for text type
– “subjects respond more quickly to the last 2”
– “subjects respond more quickly after practice”
– Practice is confounded with text type, so either one could explain the faster responses
• Coke vs. Pepsi
Random Variables
• Want to avoid confounded effects; allow variables to randomly vary: random variables
• Selecting subjects is usually done randomly
– For testing effect of color on visibility of an object
• choose subjects randomly from a large population
• choose colors to be tested on randomly as well
– Age factors, eye deficiencies, and other elements would randomly enter into the equation (can eliminate some of these)
• Can flip a coin, throw dice, allow a random number generator to select for us
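The random selection described above can be sketched in a few lines of Python. This is only an illustration: the subject labels, the two condition names, and the fixed seed (used so the example is repeatable) are assumptions, not part of the lecture.

```python
import random

def assign_randomly(subjects, conditions, seed=None):
    """Shuffle the subjects, then deal them round-robin into the conditions."""
    rng = random.Random(seed)  # a seed only makes the example repeatable
    shuffled = list(subjects)
    rng.shuffle(shuffled)
    groups = {c: [] for c in conditions}
    for i, subject in enumerate(shuffled):
        groups[conditions[i % len(conditions)]].append(subject)
    return groups

subjects = [f"S{i}" for i in range(1, 21)]  # hypothetical subject IDs
groups = assign_randomly(subjects, ["CAPS", "non-caps"], seed=1)
for condition, members in groups.items():
    print(condition, members)
```

Because the shuffle decides who lands in which group, factors such as age or eyesight are spread across conditions by chance rather than by the experimenter's choice.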
Example
• In the previous example what may be a hypothesis?
– H1: “Users are slower reading CAPS”
– H2: “There is no difference in reading rates”
– H3: “CAPS are less memorable”
• What variables do we manipulate, i.e. what are the independent variables?
– Text type, i.e. CAPS or no Caps (Two levels)
• What variables do we measure, i.e. what are the dependent variables?
– Let’s look first at the hypothesis
• H1 or H2: reading speed
• H3: recall after 2 hours
Example
• What variables do we control?
• What may be some confounding variables and how do we counter these?
Experimental Design
• Manipulating and Measuring Variables
• Within vs. Between Subjects Design
• Single vs. Multiple Variable Experiment
Choosing an Independent Variable
• Should be what the experimenter wants to manipulate:
– Font 10 vs. 12 vs. 14 (IV=font size)
– Bar graph vs. line graph (IV=type of graph)
– Are children more violent after being exposed to games with violence? What is the IV?
• In the last question need to define “violence”, i.e. what is the operational definition of violence in games?
– Is there shooting/hurting/physical contact?
– Are the actions moral/immoral (stealing, deceiving, etc.)?
– Language abuse?
– Would it be considered violent if outside the game?
Single Variable Experiment
• Only one independent variable
• Two-level experiment: the IV has two levels (simplest case, where one is the experimental group and the other control group), i.e. existence vs. non-existence
• Advantages:
– Way of finding out if IV is worth studying
– Results easy to interpret and analyze
– Some cases do not need more than two levels
• investigating two interaction techniques
• two educational methods
• etc.
Single Variable Experiment
• Disadvantages:
– Sometimes does not say much about the relationship between the IV and the DV
[Figure: three plots of Reading Time against Print Size (12 and 10)]
Single Variable Experiment
• Multilevel Experiments: single variable experiments where IV has > 2 levels
• Advantages:
– Have better handle over IV-DV relationship
– The more levels added the less critical is the range of IV (balance between realistic and large enough)
• Disadvantages:
– Requires more time and effort than 2-level (within-subjects increases time for each subject, between-subjects requires additional subjects)
– Statistical tests more complex
– Need to know when to limit the number of levels
[Figure: two plots of Average Test Score against Anxiety Level, one with two levels (Low, High) and one with three levels (Low, Neutral, High)]
Multiple Variable Experiment
• Most frequent design combines several variables in a factorial combination that pairs each level of one IV with each level of the others; this is referred to as a factorial design
• 2 levels for Caps/no-caps and 3 levels for font size (small/medium/large)
– Gives 2 x 3 design
[Figure: 2 x 3 grid with rows Caps (Yes, No) and columns Font Size (Small, Medium, Large)]
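The cells of a factorial design can be enumerated mechanically. The sketch below does this for the 2 x 3 example using `itertools.product`; the variable names are illustrative only.

```python
from itertools import product

caps_levels = ["Caps", "no-caps"]          # IV 1: two levels
font_sizes = ["small", "medium", "large"]  # IV 2: three levels

# Every level of one IV is paired with every level of the other.
cells = list(product(caps_levels, font_sizes))
for caps, size in cells:
    print(caps, size)
print(len(cells), "conditions")
```

Adding a third IV multiplies the number of cells again, which is one reason factorial experiments become time-consuming quickly.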
Multiple Variable Experiment
• Advantages
– Interactions between IVs can be studied (interaction occurs when the relationship between one IV and subject’s behavior depends on the level of a second IV)
– Can add additional circumstances by making them IVs
– When circumstance that could add variability to the data is made into a factor, the amount of variability decreases
• Disadvantages
– Time-consuming and costly
– Analysis more complicated, need to typically do an ANOVA
– Assumption that variability in data approximates a normal distribution (don’t know until completed experiment)
– Interpretation of results is more complex
Range of the Independent Variable
• Range is the difference between the highest and lowest level of a variable; no specific guidelines, need to fit it in the experiment
• Realistic range: do not choose levels that are so wide that effects will definitely be found without carrying out the experiment
• Range that shows effect: should be large enough to have an effect
– If interested in effect of font size on reading speed, choosing between font 14 vs. font 15 could lead to false conclusions
• Pilot experiment: similar to real experiment but data thrown out; can test design before proceeding
Choosing a Dependent Variable
• Measure of the subject’s behavior
• Need operational definition; i.e. “do violent games result in children’s aggression?”
• How do we measure aggressiveness?
– Panel of judges observing playing behavior + rating
– Give a selection of toys and observe how they play
– Narrate frustrating stories and count number of direct-attacks
• In HCI it can be a bit more straightforward fortunately
• But need to also define validity and reliability of the measurements
Reliability/Repeatability
• Would the same results be achieved if the test were repeated?
– Experiment is perfectly reliable if you get same results each time experiment is repeated
• Problems
– Individual differences:
• best user 10x faster than slowest
• best 25% of users ~2x faster than slowest 25%
– Unreliable instruments
• e.g., built in clock vs. stop watch
• Partial Solution
– Reasonable number and range of users tested
– Correlate data from repeated measurements
Validity
• Are you measuring what you think you’re measuring?
– Errors in equipment
– Errors in procedure
– Incorrect pool of subjects
– Errors in questions asked, variables measured
Directly Observable Dependent Variables
• Directly observable DVs can be measured directly; indirect DVs use secondary measures
– i.e. physiological measures with a lie detector
– response time to measure how much info. is processed
• Single dependent variable: measuring only accuracy or speed; usually not sufficiently indicative of performance
– i.e. could be very fast but also very inaccurate
• Multiple dependent variables: speed-accuracy tradeoffs, for example, give a better overall indication of performance
– i.e. more “valid”
• Composite dependent variable: multiple dependent variables combined to form one variable
Experimental Design
• Individual differences
– Need more than one subject
– Usually multiple subjects (n=at least 10, ideally much more)
– how to distribute tasks amongst subjects?
Within vs. Between Subjects Design
• Within subject design:
– Pros:
• All subjects do all conditions
• Fewer subjects, less individual differences
• Easier stats analysis
– Cons:
• Transfer effects
– Doing 1 condition affects following condition
• Often you want subjects to learn extensively
• Between subjects design:
– Pros:
• Subjects only do one condition
• No transfer effects
• Train to high skill
– Cons:
• More subjects, individual differences
• Harder stats analysis
[Figure: two subject allocations]
Same subjects in both conditions: Condition 1: Subjects 1 to 10; Condition 2: Subjects 1 to 10
Different subjects in each condition: Condition 1: Subjects 1 to 10; Condition 2: Subjects 11 to 20
Experimental Design
• Order of presentation in within-subjects designs
– ABBA counterbalancing:
• Every subject does trials in the order: A, B, B, A
• Any confounding effect (e.g., learning curve) is counterbalanced
Trial #                        1    2    3    4
Condition                      A    B    B    A
Linear confounding effect      10   20   30   40
Resulting confound: A: 10+40 = 50, B: 20+30 = 50
Nonlinear confounding effect   5    30   50   60
Resulting confound: A: 5+60 = 65, B: 30+50 = 80
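The arithmetic in this table can be reproduced with a short Python sketch. The function name is illustrative, and the trial-by-trial order of the nonlinear values (5, 30, 50, 60) is an assumption; only the per-condition sums are stated on the slide.

```python
def abba_totals(effects, order="ABBA"):
    """Sum the per-trial confounding effect that lands on each condition."""
    totals = {}
    for condition, effect in zip(order, effects):
        totals[condition] = totals.get(condition, 0) + effect
    return totals

linear = [10, 20, 30, 40]    # effect grows by the same amount each trial
nonlinear = [5, 30, 50, 60]  # effect grows unevenly (assumed trial order)

print(abba_totals(linear))     # {'A': 50, 'B': 50}
print(abba_totals(nonlinear))  # {'A': 65, 'B': 80}
```

With a linear effect both conditions absorb the same total, so the confound cancels; with a nonlinear effect the totals differ, so ABBA ordering alone does not remove it.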
Experimental Design
• Order of presentation in within-subjects designs
– Make order a between-subjects variable
– Fully counterbalanced:
• Combinatorial explosion when n>4
• Needs lots of subjects
Two conditions (2 orders):
A B
B A
Three conditions (6 orders):
A B C
A C B
B A C
B C A
C A B
C B A
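A fully counterbalanced design uses every possible order of the conditions, so the number of orders is n factorial. The sketch below lists the orders with `itertools.permutations` and prints how fast the count grows; the function name is illustrative.

```python
from itertools import permutations
from math import factorial

def all_orders(conditions):
    """Return every presentation order of the given conditions."""
    return [" ".join(order) for order in permutations(conditions)]

print(all_orders("AB"))
print(all_orders("ABC"))

# The number of orders (and so the minimum number of subjects) is n!.
for n in range(2, 7):
    print(n, "conditions ->", factorial(n), "orders")
```

Five conditions already need 120 orders, which is why partial counterbalancing is used for larger designs.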
Experimental Design
• Order of presentation in within-subjects designs
– Partial counterbalancing, e.g., Latin square:
• Ensures each level appears in every position in order equally often
• n rows x n columns and each treatment occurs once in each row and in each column
A B C
B C A
C A B
• Balanced Latin Square:
– Each condition precedes and follows each of the other conditions equally often:
A B C D
B D A C
D C B A
C A D B
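Both properties can be checked mechanically. The sketch below tests the two squares from this slide; it interprets "precedes and follows" as "immediately precedes", which is an assumption about the intended definition, and the function names are illustrative.

```python
from collections import Counter

def is_latin(square):
    """Each symbol occurs exactly once in every row and every column."""
    n = len(square)
    symbols = set(square[0])
    rows_ok = all(len(row) == n and set(row) == symbols for row in square)
    cols_ok = all(set(col) == symbols for col in zip(*square))
    return len(symbols) == n and rows_ok and cols_ok

def is_balanced(square):
    """Latin, and every ordered pair (x immediately before y) occurs equally often."""
    n = len(square)
    pairs = Counter((row[i], row[i + 1]) for row in square for i in range(n - 1))
    return is_latin(square) and len(pairs) == n * (n - 1) and len(set(pairs.values())) == 1

four = ["ABCD", "BDAC", "DCBA", "CADB"]  # 4 x 4 square from the slide
three = ["ABC", "BCA", "CAB"]            # 3 x 3 square from the slide

print(is_latin(four), is_balanced(four))    # True True
print(is_latin(three), is_balanced(three))  # True False
```

The 3 x 3 square is a valid Latin square but not balanced: A is always followed by B and never by C.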
Experimental Design
• Why counterbalance?
– Reduce transfer effects
– Assumes symmetric transfer
• A-B transfer == B-A transfer
– If asymmetric transfer
• i.e., A-B transfer > or < B-A transfer then use a between-subjects design
– Range effects
• People tend to perform best in middle of range of trials
• does between-subjects design solve this?
– Context effect: when one level of IV is used subjects establish a context
Activity
• How would you carry out the experiment for comparing CAPS to non-caps, i.e. what would be your design?
Activity
• Design an experiment to compare a pop-up linear menu vs. a pie menu
– Subjects?
– Hypothesis?
– IV?
– DV?
– Design?
– Task (s)?
[Figure: two menus offering the same choices, labeled Day, Evening, Night, Split in one and Day Shift, Evening Shift, Night Shift, Split Shift in the other]
Activity
• Design an experiment to test whether adding color coding to a menu interface improves accuracy
• Subjects?
• Hypothesis?
• IV?
• DV?
• Design?
• Task (s)?
Activity
• One possible solution (many others exist)
– Subjects: Taken from user population
– Hypothesis: Color coding will make selection more accurate
– IV: Color coding
– DV: Accuracy measured as number of errors
– Design: between groups to ensure no transfer of learning (or within groups with appropriate safeguards if subjects are scarce)
– Task: the interfaces are identical in each of the conditions, except that, in the second, color is added to indicate related menu items. Subjects are presented with a screen of menu choices (ordered randomly) and verbally told what they have to select. Selection must be made within a strict time limit, after which the screen clears. Failure to select the correct item is deemed an error. Each presentation places items in new positions. Subjects perform in one of two conditions.
July 14-16, 2004
Example
The Effect of Shading in Extracting Structure from Space-Filling Visualizations
Motivation
• Hierarchies are abundant and interacted with on a regular basis
• For adequate navigation, the structure has to be explicit
• Hierarchies are generally represented as trees
• Structure is explicit, but space-inefficient & navigation complexity increases with size
Space-Filling Visualization
• Developed to make more efficient use of display space
– e.g.: Treemap [Shneiderman, 1990]
• Characterized by compactness and effectiveness of showing node size
• However, the structure is no longer explicit
• Can shading facilitate the extraction of structure information?
CushionMap: “Shaded Treemap”
• CushionMap (SequoiaView™) uses shading to give a 2½-D impression, to make structure more explicit [van Wijk, 1999]
Structure-from-Shading (1)
• Evidence that our visual system extracts shading information early on
• Simple shading information processed preattentively [Enns & Rensink, 1990]
Structure-from-Shading (2)
• Shading and contour combine to strongly influence the shape of an object [Sun and Perona, 1996]
• We innately make assumptions about shading information [Ramachandran, 1988]
Structure-from-Shading (3)
• Shading useful in extracting structure information in node-link diagrams [Irani and Ware, 2001]
Structure-from-Shading (4)
• Some evidence that shading impairs size judgments
• 2D bar/pie charts better than 3D counterpart [Carswell et al, 1991]
• Similarly 2D line graphs lower accuracy than 3D counterpart [Zacks et al, 1998]
Study Methodology
• Hypotheses
• Participants
• Apparatus and task
• Experimental factors
• Study Design
Experiment - Hypotheses
• Hypothesis 1: shading (CM) will result in higher performance on structure related tasks than the no-shading condition (TM)
• Hypothesis 2: shading (CM) will result in lower performance on tasks related to file and directory size comparisons than the no-shading condition (TM)
Participants
• 20 undergraduate students (paid) participated
• Random assignment to one of two conditions: CM first or TM first
• All familiar with concept of file and directory management tasks/routines
• None had experience with SequoiaView™
Experiment – Method
• Half started on TreeMap (TM) the other half on CushionMap (CM)
• Used 2 different hierarchies H1 and H2
• {CM-H1, TM-H2}, {CM-H2, TM-H1}, {TM-H1, CM-H2}, and {TM-H2, CM-H1}.
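The four orders listed above follow from crossing the starting tool with the starting hierarchy. The sketch below generates them and deals 20 participants across them at random; the even split of five per order and the fixed seed are assumptions for illustration (the slides state only that half started on each tool).

```python
from itertools import product
import random

tools = ["CM", "TM"]
hierarchies = ["H1", "H2"]

# Each order pairs a first (tool, hierarchy) with the other tool and the other hierarchy.
orders = []
for first_tool, first_h in product(tools, hierarchies):
    second_tool = "TM" if first_tool == "CM" else "CM"
    second_h = "H2" if first_h == "H1" else "H1"
    orders.append((f"{first_tool}-{first_h}", f"{second_tool}-{second_h}"))

rng = random.Random(0)            # seed chosen only for repeatability
participants = list(range(1, 21)) # hypothetical participant numbers
rng.shuffle(participants)
assignment = {p: orders[i % len(orders)] for i, p in enumerate(participants)}

for order in orders:
    members = sorted(p for p, o in assignment.items() if o == order)
    print(order, members)
```

Pairing each tool with both hierarchies keeps the hierarchy from being confounded with the tool.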
Experiment – Tasks
• Tasks divided into two major categories:
– Structure-based
• Count the number of directories in the hierarchy
• Find the directory with the most number of files
• Count the number of subdirectories in a given directory
• Count the number of files in a given subdirectory
• Find the directory with the most number of bit map files (.bmp)
• Count the number of sub-directories that contain bitmap (.bmp) files
– Size-based
• Find the smallest directory in the hierarchy
• Find the largest file in the hierarchy
• Find the largest file in a given directory
• Find the largest mp3 file in the hierarchy
Experiment – Measurements
• Measure: subjects’ performance on each task with respect to two variables:
– time until completion (0 to 45 seconds)
– successful/unsuccessful completion (0/1)
• Timeouts classified as failures
• Unsuccessful trials and timeouts not included in average completion time calculations
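The scoring rule above can be sketched as follows. The data are made up, and recording a timeout as a failed trial at the 45-second limit is an assumption about how the log might look, not something the slides specify.

```python
from statistics import mean

TIME_LIMIT = 45.0  # seconds, the upper bound stated above

def summarize(trials):
    """trials: list of (seconds, success) pairs for one participant.

    Timeouts are assumed to be logged as (TIME_LIMIT, False), so they
    count as failures and are excluded from the average time.
    """
    successful_times = [seconds for seconds, success in trials if success]
    return {
        "n_success": len(successful_times),
        "mean_time": mean(successful_times) if successful_times else None,
    }

# Made-up example: two successes, one timeout, one wrong answer.
trials = [(12.0, True), (TIME_LIMIT, False), (30.5, True), (20.0, False)]
print(summarize(trials))  # {'n_success': 2, 'mean_time': 21.25}
```

Excluding failed trials keeps the time measure from being dominated by the 45-second cap, at the cost of averaging over different numbers of trials per participant.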
Experiment – Results (2)
[Figure: bar charts of Completion Time and # of Tasks Successfully Completed for TM and CM, by task type (Structure, Size)]
Average Completion Time (seconds)
– Structure: TM = 21.5 (6.1), CM = 16.2 (3.7)
– Size: TM = 17.9 (5.4), CM = 20.2 (5.4)
Average # of tasks successfully completed
– Structure: TM = 2.7 (1.5), CM = 4.9 (0.8)
– Size: TM = 3.4 (0.7), CM = 3.1 (0.9)
Experiment – Results (3)
Completion Time
– Structure: CM significantly faster than TM (p=0.0021)
– Size: No significant difference between CM and TM
Completion Success
– Structure: Subjects significantly more accurate on CM over TM (p<0.001)
– Size: No significant difference between CM and TM
Experiment – Subjective Evaluation
5 = “strongly agree”, 1 = “strongly disagree”
TM    CM    Statement
3.65  4.40  1. I was able to count the number of directories using toolname.
3.70  4.60  2. I was able to find the bitmap (.bmp) files using toolname.
3.95  4.55  3. I was able to detect the type of files using toolname.
3.60  4.35  4. I was able to find subdirectories using toolname.
3.05  3.95  5. I was able to find the files inside a sub-directory using toolname.
3.50  3.95  6. I was able to find the largest file using toolname.
3.30  3.90  7. I was able to compare the sizes of files using toolname.
3.70  4.40  8. I was able to find the largest directory using toolname.
4.00  4.35  9. After the training session I knew how to use toolname.
3.05  2.05  10. I found toolname confusing to use.
Discussion
[Figure: chart of Level of Support for Tasks Based on Structure against Level of Support for Tasks Based on Size, each ranging over Low, Medium, High, Very High; Sunburst is marked with a “?”]
Discussion
• Tested the effect of shading on non-explicit structures (CM vs. TM)
• Confirmed the first hypothesis
– Users were faster and more accurate in completing directory management tasks with the shaded hierarchies
• Did not obtain any conclusive results on the unfavorable effect of shading for size-based tasks
• Need to investigate the ability of users to extract structure from space-filling techniques
Introduction
2D navigation: scrolling
2D navigation: zooming
2D navigation: overview+detail
2D navigation: focus+context
Design goals
Off-screen object awareness
Full-scale view: see key features of a potential target
Context visibility: see environment surrounding target
Minimal navigation: move off-screen with little effort
Halos: Off-screen object visualization
[Baudisch and Rosenholtz, 2003]
Proxies: Interacting with remote objects
[Baudisch et al., 2003] [Bezerianos and Balakrishnan, 2005]
Hop (halo+proxies)
Halos – provide awareness of off-screen content
Laser Beam – invokes replica of content off-screen
Proxy – permits inspection before going off-screen
Teleporting – navigate to area of off-screen interest
Video
Evaluation: Task
Evaluation: Conditions
• Navigation Techniques
– Zooming - two-level zoom
– Panning - grab-and-drag panning
– Hopping
• Off-Screen Distance
– 600 and 1200 pixels off the edge
• Density
– Few - 15 objects
– Some - 30 objects
– Many - 60 objects
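If the three factors above are fully crossed (an assumption; the slide lists the levels but does not state the design), the experiment has 3 x 2 x 3 = 18 conditions. A minimal Python sketch that enumerates them, with variable names chosen here purely for illustration:

```python
from itertools import product

# Levels as listed on the slide; the variable names are illustrative.
techniques = ["Zooming", "Panning", "Hopping"]
distances_px = [600, 1200]                       # off-screen distance in pixels
densities = {"Few": 15, "Some": 30, "Many": 60}  # label -> number of objects

# Assumption: a fully crossed (factorial) design, one condition per combination.
conditions = [
    {"technique": t, "distance_px": d, "density": label, "objects": n}
    for t, d, (label, n) in product(techniques, distances_px, densities.items())
]

for condition in conditions:
    print(condition)
print(len(conditions), "conditions")
```

Listing the conditions explicitly makes it easy to check that every combination of levels is covered before trials are assigned to participants.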
Video
Results: Completion Time
[Figure: two panels, one for targets 600 pixels off-screen and one for 1200 pixels, plotting completion time (secs, 0 to 25) against density (15, 30, 60 objects) for Hop, Zoom, and Pan]
Possible explanations
Fewer navigation operations with hop than with pan or zoom
1.3 operations for hop vs. 16 for zoom and 21 for pan
Panning causes less disorientation in high density
Zooming requires more trips in high densities
Navigation patterns – zooming (1)
Initial view through viewport
Navigation patterns – zooming (2)
Zoom out - locate large cluster
Navigation patterns – zooming (3)
Zoom into large cluster
Navigation patterns – panning (1)
Initial view
Navigation patterns – panning (2)
Goes off-screen first
Navigation patterns – panning (3)
Pans consistently in one direction
Navigation patterns – hopping (1)
Initial view
Navigation patterns – hopping (2)
Projects laser beam on individual halos
Navigation patterns – hopping (3)
Later, performs large sweep to invoke proxies
Main Findings
Hop selection time approximately half of zoom or pan
Performance with hop remained constant
Zooming times increased with density and object distance
Panning improved as number of objects increased
Limitations to hop
Clutter
Proxies must show sufficient detail; selection is difficult without it
Proxies lose large-scale context
Getting lost
Recap
• Choosing IVs and DVs
• Range of IVs
• Determining reliability and validity
• Within-subjects & between-subjects design
• Single variable vs. multi-variable designs
Interpreting Experimental Results
• Plotting Frequency Distributions
• Statistics for Describing Distributions
• Plotting Relationships Between Variables
• Describing the Strength of a Relationship
• Interpreting Results from Factorial Experiments
• Inferential Statistics
Statistical analysis
• Calculations that tell us
– mathematical attributes about our data sets
• mean, amount of variance, ...
– how data sets relate to each other
• whether we are “sampling” from the same or different distributions
– the probability that our claims are correct
• “statistical significance”
Questions one might ask
• Is there a difference?
– Is one system better than another?
• Techniques addressing this are called hypothesis testing
• The answers are not simply yes/no, but of the form: we are 99% certain that selection on 5 item menus is faster than on 7 item menus
• How big is the difference?
– e.g. selection from 5 items is 270 ms faster than from 7 items
– Called point estimation, often obtained by averages
• How accurate is the estimate?
– e.g. selection is faster by 270 +/- 30 ms
– Answers to this are in the form of standard deviations or confidence intervals
– “we are 95% certain that the difference in response time is between 240 and 300 ms”
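As a rough illustration of point estimation and interval estimation, the sketch below computes a mean difference and an approximate 95% confidence interval. The data are hypothetical (invented here so that the mean comes out at 270 ms), and the interval uses a normal approximation, which is an assumption; for samples this small a t critical value would be more appropriate.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

# Hypothetical per-participant differences in selection time (ms):
# time on the 7-item menu minus time on the 5-item menu.
differences_ms = [250, 310, 270, 230, 290, 260, 300, 240, 280, 270]

# Point estimate: "how big is the difference?"
point_estimate = mean(differences_ms)

# Standard error of the mean difference (sample standard deviation / sqrt(n)).
standard_error = stdev(differences_ms) / sqrt(len(differences_ms))

# Assumption: normal approximation for a two-sided 95% interval.
z = NormalDist().inv_cdf(0.975)
half_width = z * standard_error
ci_low = point_estimate - half_width
ci_high = point_estimate + half_width

print(f"difference = {point_estimate:.0f} ms")
print(f"95% CI = [{ci_low:.1f}, {ci_high:.1f}] ms")
```

The interval answers "how accurate is the estimate?": a narrower interval means the point estimate is pinned down more tightly by the data.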
Interpreting Results
• First two rules:
– Look at the data
• a graph, histogram or table of results could be more instructive
• Exposes outliers, which need to be removed to avoid biases
– Save the data
• May want to try different analyses on the data
• Trace back the analysis to the raw data collected
• Choice of statistical analysis depends on type of data and questions to be answered
Plotting Frequency Distributions
• Plot a frequency distribution telling us how frequently each score appears in the data
• “Frequency” is the number of raw data points that fall into each score category
• Useful first step in finding out whether there is a difference between conditions
• Example: two groups
– Want to determine whether video game players who play racing games are more comfortable (less anxious) with fast drivers
Plotting Frequency Distributions
| Game Player (subject) | Score | Non-Player (subject) | Score |
|---|---|---|---|
| 1 | 62 | 11 | 55 |
| 2 | 56 | 12 | 42 |
| 3 | 67 | 13 | 61 |
| 4 | 91 | 14 | 58 |
| 5 | 53 | 15 | 70 |
| 6 | 87 | 16 | 47 |
| 7 | 51 | 17 | 62 |
| 8 | 63 | 18 | 36 |
| 9 | 46 | 19 | 74 |
| 10 | 71 | 20 | 51 |

By looking at distributions we can notice that there are no differences
[Figure: two histograms, one for Game Player and one for Non-Player, showing frequency (0 to 3.5) per score category (0-9 through 90-99)]
Plotting Frequency Distributions
• A normal distribution fits a complex mathematical formula. For our purposes, a distribution is normal if it fits a bell-shaped curve
• Important to know whether distribution is normal so that you can apply appropriate statistical tests
• Could also have bimodal, truncated or skewed distributions
• Although it is nice to see the frequency distribution, it is also nice to have a single number representing how subjects performed
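The frequency distributions for the game-player example can be tabulated with a few lines of Python. This is a minimal sketch: the scores are the ones listed for the two groups, while the 10-point bin width and the function names are choices made here for illustration.

```python
from collections import Counter

# Comfort scores from the game-player example (subjects 1-10 and 11-20).
players = [62, 56, 67, 91, 53, 87, 51, 63, 46, 71]
non_players = [55, 42, 61, 58, 70, 47, 62, 36, 74, 51]


def frequency_distribution(scores, bin_width=10):
    """Count how many scores fall into each bin, keyed by the bin's lower edge."""
    return Counter((score // bin_width) * bin_width for score in scores)


def print_histogram(title, scores, bin_width=10):
    """Print a text histogram with one row per score category."""
    counts = frequency_distribution(scores, bin_width)
    print(title)
    for low in range(0, 100, bin_width):
        print(f"{low:2d}-{low + bin_width - 1:2d} {'#' * counts[low]}")


print_histogram("Game Player", players)
print_histogram("Non-Player", non_players)
```

Changing `bin_width` shows how the choice of score categories alters the apparent shape of a distribution, which matters when judging whether it looks normal, bimodal, or skewed.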
Statistics for Describing Distributions
• Two types of statistics are typically used: descriptive and inferential
• A descriptive statistic is simply a number that allows the experimenter to describe some characteristic of a distribution
• Inferential will be discussed later
Statistics for Describing Distributions
• One important descriptor is the location of the middle of a distribution (central tendency)
• Mode, the most frequently occurring score
• Median, the middle score, with an equal number of scores above it and below it
• Mean, the arithmetic average of the scores
• Which to use depends on the distribution, what purpose the average plays, and your judgment
– outliers vs. no outliers
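The three measures of central tendency can be compared on the game-player scores with Python's standard `statistics` module. This is only a sketch: taking the mode over 10-point score categories (because every raw score is unique) and the extra score of 250 used as an outlier are both assumptions made for illustration.

```python
from statistics import mean, median, multimode

# Comfort scores of the ten game players.
players = [62, 56, 67, 91, 53, 87, 51, 63, 46, 71]

print("mean  ", mean(players))
print("median", median(players))

# Every raw score occurs once, so the mode is taken over 10-point categories.
decades = [score // 10 * 10 for score in players]
print("modal categories", sorted(multimode(decades)))

# Hypothetical outlier: one extreme score pulls the mean far more than the median.
with_outlier = players + [250]
print("mean with outlier  ", round(mean(with_outlier), 1))
print("median with outlier", median(with_outlier))
```

The last two lines illustrate the "outliers vs. no outliers" point: the median barely moves when an extreme score is added, while the mean shifts substantially.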
Statistics for Describing Distributions
• Another important statistic is the measure of dispersion, or how spread out the scores are
• Range, difference between largest and smallest value
• Variance, calculated by computing deviation of each score from the mean, squaring these, adding them up, and dividing by number of scores
• Std deviation, simply the square root of the variance
• A smaller std deviation indicates that the scores cluster more tightly around the mean, so the mean describes them with less error
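The dispersion measures above can be written out directly. The sketch below follows the definition given here, dividing the sum of squared deviations by the number of scores (the population variance); the sample variance, which divides by one less than the number of scores, is the other common convention. Function names are illustrative.

```python
from math import sqrt

# Comfort scores of the ten game players.
players = [62, 56, 67, 91, 53, 87, 51, 63, 46, 71]


def score_range(scores):
    """Difference between the largest and smallest value."""
    return max(scores) - min(scores)


def variance(scores):
    """Mean of the squared deviations from the mean (divides by the number of scores)."""
    m = sum(scores) / len(scores)
    return sum((s - m) ** 2 for s in scores) / len(scores)


def std_deviation(scores):
    """Square root of the variance."""
    return sqrt(variance(scores))


print("range   ", score_range(players))
print("variance", round(variance(players), 2))
print("std dev ", round(std_deviation(players), 2))
```

The same results are available from `statistics.pvariance` and `statistics.pstdev` in the standard library; `statistics.variance` and `statistics.stdev` use the n - 1 divisor instead.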
Plotting Relationships Between Variables
• The reason for running an experiment is to determine whether there is a relationship between the IV and the DV
• It is useful to draw a graph to represent the experimental relationship
• Plot DV on y-axis and IV on x-axis
• What types of graphs to use:
– If IV levels cannot be represented by numbers use bar graphs
– If IV is continuous use histogram or line graph
Plotting Relationships Between Variables
[Figure: bar graph showing mean comfort scores (0 to 70) for players (P) and non-players (NP)]
[Figure: line graph showing mean comfort scores (0 to 70) for players after several months of gaming (months 1 to 5)]
Strength of a Relationship
• The previous graphs plotted a descriptive statistic (the mean) rather than individual data points
• Rarely will every data point fall on a smooth function
• If you plot the raw data you will very likely find some variability or spread – a scatter plot
Scatterplots
[Figure: example scatterplots with correlations of +.87, -1.0, and 0]
Strength of a Relationship
• Correlation:
– Measures the extent to which two concepts are related
• e.g. years of university training vs. computer ownership per capita
• How?
– obtain the two sets of measurements
– calculate correlation coefficient
• +1: positively correlated
• 0: no correlation (no relation)
• –1: negatively correlated
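A minimal sketch of the "calculate correlation coefficient" step, using Pearson's r. The two lists are the 14 condition 1 / condition 2 pairs from the example on the following slides; the function name is illustrative.

```python
from math import sqrt

# The 14 paired measurements from the condition 1 / condition 2 example.
condition_1 = [5, 4, 6, 4, 5, 3, 5, 4, 5, 6, 6, 7, 6, 7]
condition_2 = [6, 5, 7, 4, 6, 5, 7, 4, 7, 7, 6, 7, 8, 9]


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


r = pearson_r(condition_1, condition_2)
print(f"r = {r:.3f}, r^2 = {r * r:.3f}")
```

Squaring r gives the r2 value reported with the scatterplot of these data. The coefficient says nothing about which variable, if either, causes the other.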
Strength of a Relationship
| condition 1 | condition 2 |
|---|---|
| 5 | 6 |
| 4 | 5 |
| 6 | 7 |
| 4 | 4 |
| 5 | 6 |
| 3 | 5 |
| 5 | 7 |
| 4 | 4 |
| 5 | 7 |
| 6 | 7 |
| 6 | 6 |
| 7 | 7 |
| 6 | 8 |
| 7 | 9 |

[Figure: scatterplot of salary per year (*10,000) against pickles eaten per month, r2 = .668]
Correlation
| Pickles eaten per month | Salary per year (*10,000) |
|---|---|
| 5 | 6 |
| 4 | 5 |
| 6 | 7 |
| 4 | 4 |
| 5 | 6 |
| 3 | 5 |
| 5 | 7 |
| 4 | 4 |
| 5 | 7 |
| 6 | 7 |
| 6 | 6 |
| 7 | 7 |
| 6 | 8 |
| 7 | 9 |

[Figure: scatterplot of salary per year (*10,000) against pickles eaten per month, r2 = .668]
Which conclusion could be correct?
- Eating pickles causes your salary to increase
- Making more money causes you to eat more pickles
- Pickle consumption predicts higher salaries because
older people tend to like pickles better than younger
people, and older people tend to make more money than
younger people
Correlation
• Dangers
– attributing causality
• a correlation does not imply cause and effect
• cause may be due to a third “hidden” variable related to both other variables
– drawing strong conclusion from small numbers
• unreliable with small groups
• be wary of accepting anything more than the direction of correlation unless you have at least 40 subjects
Correlation
• Cigarette Consumption
• Crude male death rate for lung cancer in 1950 vs. per capita consumption of cigarettes in 1930 in various countries
• The correlation is strong (.73), but can you prove from this data that cigarette smoking causes death?
• Possible hidden variables:
– age
– poverty
Regression
• Calculates a line of “best fit”
• Use the value of one variable to predict the value of the other
• e.g., 60% of people with 3 years of university own a computer
[Figure: scatterplot of condition 2 against condition 1 with the line of best fit, y = .988x + 1.132, r2 = .668]
[Table: the condition 1 and condition 2 values, the same 14 pairs listed on the Strength of a Relationship slide]
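The line of best fit can be reproduced with an ordinary least-squares calculation. The sketch below uses the 14 condition 1 / condition 2 pairs from the correlation example; the function name and the prediction at x = 5 are illustrative additions.

```python
# The 14 paired measurements from the condition 1 / condition 2 example.
condition_1 = [5, 4, 6, 4, 5, 3, 5, 4, 5, 6, 6, 7, 6, 7]
condition_2 = [6, 5, 7, 4, 6, 5, 7, 4, 7, 7, 6, 7, 8, 9]


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_line(condition_1, condition_2)
print(f"y = {slope:.3f}x + {intercept:.3f}")

# Use the value of one variable to predict the value of the other.
x_new = 5
print(f"predicted condition 2 at condition 1 = {x_new}: {slope * x_new + intercept:.2f}")
```

The fitted coefficients agree with the equation shown on the plot to three decimal places.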
Interpreting Results from Factorial Experiments
• Example:
– time it takes subjects to read paragraphs typed in 12-point or 10-point print
– 8-year olds in one group, 12-year olds in another group
• Cannot simply ask whether the independent variable has had an effect on the dependent variable
• Must ask more specifically:
– Is there an effect of print size? (main effect)
– Is there an effect of age? (main effect)
– Does the effect of one variable depend on the level of the other? (interaction)
Interpreting Results from Factorial Experiments
• Main Effects
– To evaluate the main effect of an IV we must average across the levels of the other variable
– To determine the effect of print size we need to find a point halfway between the two levels of age at each level of print size
• We observe that a change in print size (10-point to 12-point) causes a change in the DV (time) – “yes, there is a main effect of print size”
– To determine the effect of age we need to find a point halfway between the two levels of print size at each level of age
• We observe that a change in age (increase) causes a change in the DV (time decreases) – “yes, there is a main effect of age”
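The averaging described above can be sketched as a marginal-means calculation. The cell means below are hypothetical values chosen to resemble the pattern described (they are not read off the graph), and the function name is illustrative.

```python
from statistics import mean

# Hypothetical mean reading times per cell; keys are (age in years, print size in points).
# These numbers are invented for illustration and are not taken from the slides.
reading_time = {
    (8, 10): 40.0,
    (8, 12): 30.0,
    (12, 10): 20.0,
    (12, 12): 20.0,
}


def marginal_means(cells, factor_index):
    """Average the cell means across the other factor, for each level of one factor."""
    levels = sorted({key[factor_index] for key in cells})
    return {
        level: mean(value for key, value in cells.items() if key[factor_index] == level)
        for level in levels
    }


by_print_size = marginal_means(reading_time, 1)  # averaged over age
by_age = marginal_means(reading_time, 0)         # averaged over print size

print("mean time by print size:", by_print_size)
print("mean time by age:       ", by_age)
```

A main effect shows up as a difference between the marginal means of a factor. Whether that difference is reliable is a question for inferential statistics, not for this descriptive calculation.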
Interpreting Results from Factorial Experiments
[Figure: reading time (10 to 40) against print size (10, 12), with one line per age (8 years, 12 years), plus two derived plots used to judge each main effect. Main effect of print size? yes. Main effect of age? yes.]
Interpreting Results from Factorial Experiments
• Interactions
– To determine whether the IVs interact we must ask:
• is the effect of print size different for each age? (or)
• is the effect of age different for each print size?
• 1st question:
– we see that going from 10-point to 12-point causes a decrease in reading time for 8-year olds but no difference for 12-year olds
• 2nd question:
– we see that the difference between reading times for the two ages is larger for 10-point than for 12-point
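Both questions can be answered numerically by comparing simple effects. The sketch below uses hypothetical cell means that follow the pattern just described (a decrease for 8-year olds, no difference for 12-year olds); the numbers and function names are assumptions made for illustration.

```python
# Hypothetical mean reading times per cell; keys are (age in years, print size in points).
# These numbers are invented for illustration and are not taken from the slides.
reading_time = {
    (8, 10): 40.0,
    (8, 12): 30.0,
    (12, 10): 20.0,
    (12, 12): 20.0,
}


def effect_of_print_size(cells, age):
    """Change in reading time when going from 10-point to 12-point, for one age group."""
    return cells[(age, 12)] - cells[(age, 10)]


def effect_of_age(cells, print_size):
    """Difference in reading time between 8-year olds and 12-year olds at one print size."""
    return cells[(8, print_size)] - cells[(12, print_size)]


# 1st question: is the effect of print size different for each age?
print("print-size effect, 8 years: ", effect_of_print_size(reading_time, 8))
print("print-size effect, 12 years:", effect_of_print_size(reading_time, 12))

# 2nd question: is the effect of age different for each print size?
print("age effect, 10-point:", effect_of_age(reading_time, 10))
print("age effect, 12-point:", effect_of_age(reading_time, 12))

# A non-zero difference between the simple effects indicates an interaction.
interaction_contrast = effect_of_print_size(reading_time, 8) - effect_of_print_size(reading_time, 12)
print("interaction contrast:", interaction_contrast)
```

Both questions yield the same contrast, which is why either one is enough to detect an interaction. In a graph, a non-zero contrast appears as non-parallel lines.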
Interpreting Results from Factorial Experiments
[Figure: reading time against print size (10, 12) for ages 8 years and 12 years. Interaction? yes]
Activity
[Figure: two plots of reading time against print size (10, 12) for ages 8 years and 12 years]
For each plot: Print size? Age? Interaction?
Answers, in the order listed on the slide: No, Yes, No, Yes, Yes, No
Inferential Statistics
• Many experiments test one design against another
– i.e. the independent variable is usually discrete
• Can have discrete variables or continuous variables
– Discrete take on finite number of values (screen color)
– Continuous take on any value (person’s height, time to complete task)
• Special case when continuous variable is positive (response time cannot be < 0)
Choosing a Statistical Technique
| | Independent Variable | Dependent Variable | Technique |
|---|---|---|---|
| Parametric | Two-valued | Normal | Student’s t-test on difference of means |
| | Discrete | Normal | ANOVA (ANalysis Of VAriance) |
| | Continuous | Normal | Linear (non-linear) regression factor analysis |
| Non-parametric | Two-valued | Continuous | Wilcoxon (Mann-Whitney) rank-sum test |
| | Discrete | Continuous | Rank-sum versions of ANOVA |
| | Continuous | Continuous | Spearman’s rank correlation |
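As an example of the first row of the table (two-valued IV, normally distributed DV), the sketch below computes Student's t statistic for the difference of means between the game players and non-players from the earlier example. It assumes a between-subjects design, roughly normal scores, and equal variances in the two groups; the critical value is the standard two-tailed table value for 18 degrees of freedom at the .05 level. Names are illustrative.

```python
from math import sqrt
from statistics import mean, variance

# Comfort scores from the game-player example (IV: player vs. non-player).
players = [62, 56, 67, 91, 53, 87, 51, 63, 46, 71]
non_players = [55, 42, 61, 58, 70, 47, 62, 36, 74, 51]


def pooled_t(a, b):
    """Student's t statistic for two independent groups, assuming equal variances."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    # statistics.variance is the sample variance (divides by n - 1).
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    standard_error = sqrt(pooled_var * (1 / na + 1 / nb))
    return (mean(a) - mean(b)) / standard_error, df


t, df = pooled_t(players, non_players)

# Two-tailed critical value for df = 18 at alpha = .05, from a standard t table.
T_CRITICAL_DF18 = 2.101

print(f"t({df}) = {t:.3f}")
print("significant at .05" if abs(t) > T_CRITICAL_DF18 else "not significant at .05")
```

With these twenty scores the difference in means does not reach the .05 level, which is consistent with the impression given by the frequency distributions.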