evaluating ubicomp applications in the field
Post on 07-Jan-2016
Evaluating ubicomp applications in the field
Gregory D. Abowd, Distinguished Professor
School of Interactive Computing and GVU Center, Georgia Tech
Ubicomp evaluation in the field
• Weiser (CACM 93): “Applications are of course the whole point of ubiquitous computing.”
– Applications are about the real world
• Design: finding appropriate ubicomp solutions for real-world problems.
• Evaluation: Demonstrating that a solution works for its intended purpose.
• But these activities are intertwined.
Evaluation in the field, Gregory D. Abowd, Ubicomp 2011 Tutorial
Defining evaluation terms
• Formative and Summative
• Who is involved in evaluation
• Empirical, Quantitative, and Qualitative Evidence
• Approaches to evaluation
Formative and Summative
Formative
– Assess a system being designed
– Gather input to inform design
Summative
– Assess an existing system
– Summary judgement of success criteria
• Which to use?
– Depends on
• maturity of system
• how evaluation results will be used
– Same technique can be used for either
Who is involved with evaluation?
• End users, or other stakeholders
• The designers or HCI experts
Form of data gathered
Empirical: based on evidence from real users
Quantitative: objective measurement of behavior
Qualitative: subjective recording of experience
Mixed methods: a combination of quant and qual
Approach
• Predictive modeling
• Controlled experiment
• Naturalistic study
Predictive Modeling
• Try to predict usage before real users are involved
• Conserve resources (quick & low cost)
• Model based
– Calculate properties of interaction
– Fitts’ Law, Keystroke Level Model
• Review based
– HCI experts (not real users) interact with system, find potential problems, and give prescriptive feedback
– Best if they:
• Haven’t used earlier prototype
• Familiar with domain or task
• Understand user perspectives
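As an illustration of the model-based approach, here is a minimal Python sketch of the two models named above. It assumes the Shannon formulation of Fitts’ Law, MT = a + b · log2(D/W + 1), and commonly cited Keystroke Level Model operator times from Card, Moran and Newell. The function names, the constants a and b, and the operator table are illustrative assumptions and are not taken from the tutorial; in practice a and b are fitted to data for a specific device.

```python
import math

# Commonly cited KLM operator times in seconds (assumed values; K in
# particular varies widely with typing skill).
KLM_OPERATOR_SECONDS = {
    "K": 0.28,  # press a key (average non-secretarial typist)
    "P": 1.10,  # point at a target with a pointing device
    "H": 0.40,  # move hands between keyboard and pointing device
    "M": 1.35,  # mental preparation
}


def fitts_movement_time(distance, width, a, b):
    """Predicted movement time (seconds) using the Shannon formulation.

    distance and width must share the same unit; a and b are
    device-specific constants obtained by regression (placeholders here).
    """
    return a + b * math.log2(distance / width + 1)


def klm_time(operators):
    """Predicted task time (seconds) for a sequence of KLM operators."""
    return sum(KLM_OPERATOR_SECONDS[op] for op in operators)


# Hypothetical task: think, reach for the mouse, point at a button, press a key.
print(round(klm_time("MHPK"), 2))
# Hypothetical target 7 units away and 1 unit wide, with made-up a and b.
print(round(fitts_movement_time(7, 1, a=0.1, b=0.15), 2))
```

Estimates like these are cheap to produce before any real user is involved, which is the point of predictive modeling, but they say nothing about how a system fits into everyday use.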
Controlled experimentation
Lab studies, quantitative results
– Typically in a closed, lab setting
– Manipulate independent variables to see effect on dependent variables
– Replicable
– Expensive, requires real users and lab
– Can use follow-up interviews for qualitative results
Naturalistic Study
Or the field study
– Observation occurs in “real life” setting
– Watch process over time
– “Ecologically valid” is in tension with controlled and “scientific”
– What is observed can vary tremendously
We will focus on this form of evaluation
Why is this so hard?
Historical overview
Some examples of ubicomp technologies in the field
Examples of field studies
• Xerox PARC
– PARCTab
– Liveboard
• Georgia Tech
– Classroom 2000
– Digital Family Portrait
– Personal Audio Loop
– Abaris, CareLog and BabySteps
– Cellphone proximity
– SMS for asthma
Xerox PARC
• Computing at different scales
– Inch
– Foot
– Yard
• They were pretty successful at inch and yard scales with two very different approaches to evaluation
Xerox PARC inch scale
• The PARCTab
– Location-aware thin client
– Deployed in CSL building
• Built devices and programmable context-awareness
• Gave it to community to see what they would produce
Xerox PARC yard scale
• The Liveboard
– Pen-based electronic whiteboard
– Deployed in one meeting room
• Designed solution for IP meetings
– Supporting work of one individual
Georgia Tech Classroom 2000
• The Liveboard in an educational setting
• Establishing theme of automated capture
• Room takes notes on behalf of student
• 4-year study of impact on teaching and learning experience.
• The living laboratory
Georgia Tech Aware Home
• What value is it to have a home that knows where its occupants are and what they are doing?
• A living laboratory?
– Great for feasibility studies and focus groups
– Never anyone’s “home”
Georgia Tech Digital Family Portrait
• Great example of a formative study done in the wild
• Sensing replaced by phone calls
• Similar to Intel Research CareNet Display
Georgia Tech Technology and Autism
• Ubicomp got very personal for me in 2002
• Improving data collection
From whimsical to inspirational…
Dec. 1998: Aidan, 18 months
July 1999: Aidan, 26 months
Detailed Scoring
Manual Calculations
Hand Plotting
Abaris: Embedding Capture
Leverages basic therapy protocol to minimize intrusion
Speech detection to timestamp beginning of trial
Record handwriting using Anoto digital pen to collect grades and timestamp end of trial
Julie Kientz, Ph.D.
Abaris: Embedding Access
Julie Kientz
Abaris: Study
• 4-month real-use deployment study
– Case study: therapy team for one child
• 52 therapy sessions (50+ hours of video)
• 6 team meetings
• Data collected
– Video coding and analysis of team decisions during sampled meetings
• Meetings without Abaris: 39 decision points across 3 meetings
• Meetings with Abaris: 42 decision points across 3 meetings
– Interviews with team members
– Software logging of Abaris
Full study details: Chapter 5
Julie A. Kientz, Georgia Tech
Results: Easier Data Capture
• Therapists were able to learn to use the Abaris system with very quick training
• Therapists spent less time processing paperwork
Results: Access to Data
Percentage of decision points in which a given artifact was used. Oftentimes, therapists used multiple artifacts.
Artifacts            % Without Abaris   % With Abaris
Video*                      0.0              45.2
Graphs                     56.0              81.9
Data sheets**              20.5              45.2
Therapy samples*            0.0              19.0
Reenactment                 0.0               4.8
Memory                     92.3              83.3
Ext. Observations          25.6              21.4
Therapist Notes**           5.1              19.0
* p < .01   ** p < .05
For full details, see Chapter 5
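The measure reported here (the share of decision points at which each artifact was used) can be sketched in a few lines of Python. This is only an illustration of the arithmetic, assuming each coded decision point is recorded as a set of artifact names. The function name and the sample records are hypothetical and are not data from the Abaris study.

```python
from collections import Counter


def artifact_usage_percentages(decision_points):
    """Percentage of decision points at which each artifact was used.

    decision_points is a list of sets of artifact names. Because one
    decision point can involve several artifacts, the percentages can
    add up to more than 100.
    """
    if not decision_points:
        return {}
    counts = Counter()
    for artifacts in decision_points:
        counts.update(set(artifacts))  # count each artifact once per decision point
    total = len(decision_points)
    return {name: round(100.0 * n / total, 1) for name, n in counts.items()}


# Hypothetical coded meeting with four decision points.
coded_meeting = [
    {"Memory", "Graphs"},
    {"Memory"},
    {"Video", "Graphs"},
    {"Memory", "Data sheets"},
]
print(artifact_usage_percentages(coded_meeting))
```

Computing the same table once per condition (with and without the system) gives the two columns that are then compared.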
Results: Improving Collaboration
[Figure: Average Participation Levels With and Without Abaris. Y-axis: participation level (1 = low, 3 = high). With Abaris: 2.44; Without Abaris: 1.98.]
Analysis of decision points in team meetings indicated an increase in collaboration
Interviews with therapists after meeting confirmed these numbers
p < .01
For full details, see Chapter 5
(with Gillian Hayes (GT), Juane Heflin (Georgia State), Cobb County Special Ed. and Behavior Imaging Solutions, Inc.)
Collecting rich behavioral data in the unstructured natural environment
Retroactively saving important video
Conscious selection of relevant video episodes
Gillian Hayes
After-the-fact capture and annotation
Examples of field studies
• Xerox PARC
– PARCTab
– Liveboard
• Georgia Tech
– Classroom 2000
– Digital Family Portrait
– Personal Audio Loop
– Abaris, CareLog and BabySteps
– Cellphone proximity
– SMS for asthma
Others
• Intel Research
– CareNet Display
– Reno
– UbiFit, UbiGarden
• EQUATOR
– Mixed reality games
Lessons from our ancestors
• Evaluation takes time
– The experience of ubicomp does not come overnight
– The “abowd” unit of evaluation; nothing substitutes for real use
• Bleeding edge technology means people must keep system afloat
– Users cannot be expected to handle, or be bothered with, installation or maintenance (C2K)
More lessons
• Sometimes you want to evaluate in the field without any working system
– The idea of experience prototypes (or paratypes), or humans as sensors
3 Types of Field Studies (Ch. 4 by Brush)
• Or, why do the study?
• In somewhat chronological order:
– Understand current behavior
– Proof of concept
– Experience using a prototype
Understanding Behavior
Insight into current practice, baseline of behaviors
– To inform new designs
– To use as comparison at some later date
• Brush and Inkpen (2007)
• Patel et al. (2006)
Proof of Concept
Bleeding-edge prototypes but not in the lab.
• Context-Aware Power Management (2007)
• TeamAwear (2007)
Experience
Prolonged use that is not about feasibility of the technology, but about the impact on the everyday experience.
• CareNet (2004)
• Advanced User Resource Annotation
Study Design
• Study Design
• Data collection techniques
– Surveys, interviews, field notes, logging/instrumentation, experience sampling, diaries
• How long should the study be?
– The “abowd” as a unit of time for a field study.
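Of the data collection techniques listed, experience sampling is the easiest to make concrete in code: participants are prompted at unpredictable moments during the day. The Python sketch below shows one possible way to draw such a schedule, assuming a fixed daily window and a minimum gap between prompts. The function name, parameters, window and date are all hypothetical choices for illustration; real studies pick them to suit the participants and the question being asked.

```python
import random
from datetime import datetime, timedelta


def sample_prompt_times(day_start, day_end, n_prompts, min_gap_minutes=30, seed=None):
    """Draw n_prompts random prompt times in [day_start, day_end].

    Consecutive prompts are at least min_gap_minutes apart. The trick is
    to draw sorted offsets from a shortened window and then shift the
    i-th offset by i * min_gap_minutes.
    """
    rng = random.Random(seed)
    window = int((day_end - day_start).total_seconds() // 60)
    slack = window - (n_prompts - 1) * min_gap_minutes
    if n_prompts < 1 or slack < 0:
        raise ValueError("window too short for the requested prompts and gap")
    offsets = sorted(rng.randint(0, slack) for _ in range(n_prompts))
    return [
        day_start + timedelta(minutes=offset + i * min_gap_minutes)
        for i, offset in enumerate(offsets)
    ]


# Hypothetical schedule: six prompts between 09:00 and 21:00 on an arbitrary day.
start = datetime(2011, 9, 17, 9, 0)
end = datetime(2011, 9, 17, 21, 0)
for prompt in sample_prompt_times(start, end, n_prompts=6, seed=1):
    print(prompt.strftime("%H:%M"))
```

Passing a seed makes a schedule reproducible, which helps when the same prompt times need to be matched against logs afterwards.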
Participants
• Ethics
• Selection of the right participants
• Number of participants
• Compensation
Analysis of data
• Quantitative data
– Relevant statistical methods based on data collection
• Qualitative data
– Often unstructured and must be processed (or coded) to be understood and analyzed further
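Which statistical method is relevant depends on how the data were collected, and the slides do not name one. As one hedged illustration for the small samples typical of field studies, the Python sketch below runs an exact two-sided permutation test on the difference of means between two conditions. The function name and the ratings are hypothetical. Enumerating every split is only practical for small groups.

```python
from itertools import combinations


def exact_permutation_p(group_a, group_b):
    """Exact two-sided permutation test on the difference of group means.

    Enumerates every way of splitting the pooled values into groups of
    the original sizes and returns the fraction of splits whose absolute
    mean difference is at least as large as the observed one.
    """
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    n_b = len(pooled) - n_a
    total = sum(pooled)

    def mean_diff(sum_a):
        return sum_a / n_a - (total - sum_a) / n_b

    observed = abs(mean_diff(sum(group_a)))
    extreme = 0
    splits = 0
    for indices in combinations(range(len(pooled)), n_a):
        splits += 1
        sum_a = sum(pooled[i] for i in indices)
        # The small tolerance guards against floating-point rounding.
        if abs(mean_diff(sum_a)) >= observed - 1e-12:
            extreme += 1
    return extreme / splits


# Hypothetical 1-3 ratings from two conditions (not data from any study here).
with_system = [3, 3, 2, 3]
without_system = [1, 2, 1, 2]
print(exact_permutation_p(with_system, without_system))
```

A test like this makes no distributional assumptions, but it does assume the observations are exchangeable between conditions, which needs checking against the study design.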
Let’s reflect on the other readings…
• Provide summary based on:
– Type of study (Brush taxonomy)
– What was its purpose?
– How study was designed
• Data to be collected, conditions of study, number and kind of participants, length of study
– How results were analyzed
– Did the study meet its purpose?