Usability Evaluation of Multi-Device/Platform User Interfaces Generated by Model-Driven Engineering
DESCRIPTION
Nowadays, several Computer-Aided Software Engineering environments exploit Model-Driven Engineering (MDE) techniques to generate a single user interface for a given computing platform, or multi-platform user interfaces for several computing platforms simultaneously. There is therefore a need to assess the usability of those generated user interfaces, either in isolation or compared to each other. This paper describes an MDE approach that generates multi-platform graphical user interfaces (e.g., desktop, web), which were subjected to an exploratory controlled experiment. The usability of user interfaces generated for these two platforms and used on multiple display devices (i.e., standard size, large, and small screens) was examined in terms of satisfaction, effectiveness, and efficiency. An experiment with a factorial repeated-measures design was conducted with 31 participants, namely postgraduate students and professors selected by convenience sampling. The data were collected with questionnaires and forms and were analyzed using parametric and non-parametric tests, such as ANOVA with repeated measures and Friedman's test, respectively. With a confidence level of 95%, efficiency was significantly better on large screens than on small ones, and on the desktop platform rather than on the web platform. The experiment also suggests that satisfaction tends to be better on standard size screens than on small ones. The results suggest that the tested MDE approach should incorporate enhancements in its multi-device/platform user interface generation process in order to improve the usability of the generated user interfaces.
TRANSCRIPT
Usability Evaluation of Multi-Device/Platform User Interfaces Generated by Model-Driven Engineering
Nathalie Aquino1, Jean Vanderdonckt1,2, Nelly Condori-Fernández1, Óscar Dieste3, Óscar Pastor1
1Centro de Investigación en Métodos de Producción de Software, Universidad Politécnica de Valencia, Spain; {naquino, nelly, opastor}@pros.upv.es
2Université catholique de Louvain, Louvain School of Management (LSM), [email protected]
3Facultad de Informática, Universidad Politécnica de Madrid, [email protected]
This work has been developed with the support of MICINN under the project SESAMO TIN2007-62894 and co-financed with ERDF, ITEA2 Call 3 UsiXML project under reference 20080026, MITYC under the project MyMobileWeb TSI-020301-2009-014, and GVA under grant BFPI/2008/209.
Agenda
• Introduction
• Experiment Definition
• Experiment Planning and Operation
• Validity Evaluation
• Analysis
• Conclusion
Introduction
• Multiple computing platforms and display devices
• Interactive applications
[Figure: interactive applications running on devices that share one computing platform, and on devices with different computing platforms]
Introduction
• Model-Driven Engineering (MDE) of User Interfaces (UIs)
– Good option to develop multi-device/platform interactive systems
– Several approaches currently exist: UIML, UsiXML, Teresa, Maria, Just-UI, OO-Method, among others
[Diagram: MDE of UIs aligned with MDA abstraction levels — interaction requirements, domain, and tasks (CIM); abstract user interface (PIM); concrete user interface (PSM); user interface code (code) — all within a user interface context defined by the user, the environment, and the platform]
Introduction
• General question: What is the quality of the multi-device/platform UIs generated by MDE?
Usability is a very important aspect of quality, especially in the case of interactive applications.
• Research question: Is the usability of UIs generated by MDE the same on different devices and platforms?
Experiment Definition
• In order to address the research question, an exploratory usability evaluation was conducted in a controlled experimental context
• Goal of the experiment:
Analyze multi-device/platform graphical UIs generated by MDE
for the purpose of evaluating their quality
with respect to usability
from the point of view of the researchers
in the context of computer science professionals using UIs of an interactive application on different computing platforms and devices
Experiment Definition
• Usability was measured in accordance with ISO 9241-11
– User satisfaction
– Effectiveness
– Efficiency
• Computing platforms
– Desktop: C# running on .NET
– Web: JavaServer Faces running on Java
• Devices
– Small screen: ultra-mobile PC and stylus
– Standard size screen: PC, mouse, and keyboard
– Wide screen: PC connected to a TV, mouse, and keyboard
• MDE approach
– OO-Method/OlivaNOVA
Experiment Definition
• OO-Method/OlivaNOVA - Presentation Model
[Diagram of the Presentation Model, organized in three levels. Level 1: Hierarchical Action Tree. Level 2: interaction units — Service Interaction Unit (SIU), Instance Interaction Unit (IIU), Population Interaction Unit (PIU), and Master/Detail Interaction Unit (MDIU, which combines a master interaction unit with detail interaction units). Level 3: elementary patterns used by the interaction units — introduction, defined selection, argument grouping, population preload, dependency, conditional navigation, filter, order criterion, display set, actions, and navigations.]
Interaction Units: main interactive operations.
SIU: scenario to execute a service.
PIU: scenario where multiple objects are presented.
IIU: scenario where information about a single object is presented. Special case of PIU.
MDIU: presents collections of objects that belong to different interrelated classes. Combination of other interaction units.
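To make the three-level structure more concrete, here is a minimal, illustrative Python sketch of how it could be represented as a data structure. This is not OlivaNOVA's actual metamodel; all class names, fields, and example values are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class InteractionUnit:
    """Level 2: a main interactive operation (SIU, IIU, PIU, or MDIU)."""
    name: str
    kind: str                                          # "SIU", "IIU", "PIU", or "MDIU"
    patterns: List[str] = field(default_factory=list)  # Level 3 elementary patterns

@dataclass
class HierarchicalActionTree:
    """Level 1: the tree that organizes access to the interaction units."""
    label: str
    children: List["HierarchicalActionTree"] = field(default_factory=list)
    unit: Optional[InteractionUnit] = None             # leaves point to an interaction unit

# Hypothetical example: a PIU listing clients, reached from the action tree.
clients_piu = InteractionUnit(
    name="Clients",
    kind="PIU",
    patterns=["filter", "order criterion", "display set", "actions", "navigations"],
)
root = HierarchicalActionTree(
    label="Sales application",
    children=[HierarchicalActionTree(label="Manage clients", unit=clients_piu)],
)
```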
Experiment Planning and Operation
• Hypotheses
– Null hypotheses: when using interfaces automatically generated from InteractionUnit, the UsabilityAspect is the same for different platforms and devices
– Alternate hypotheses: when using interfaces automatically generated from InteractionUnit, the UsabilityAspect is not the same for different platforms and devices
– The following values are combined:
InteractionUnit = {SIU, PIU}
UsabilityAspect = {user satisfaction, effectiveness, efficiency}
– This yields 2 × 3 = 6 pairs of null and alternate hypotheses (enumerated in the sketch below)
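For illustration only, a tiny sketch that enumerates the six null hypotheses implied by the combination above (a hypothetical script, not part of the experiment's materials):

```python
from itertools import product

interaction_units = ["SIU", "PIU"]
usability_aspects = ["user satisfaction", "effectiveness", "efficiency"]

# 2 interaction units x 3 usability aspects = 6 null/alternate hypothesis pairs
for i, (iu, aspect) in enumerate(product(interaction_units, usability_aspects), start=1):
    print(f"H0.{i}: when using interfaces automatically generated from {iu}, "
          f"{aspect} is the same for different platforms and devices")
```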
Experiment Planning and Operation
• Response variables
– Satisfaction: overall satisfaction, system usefulness, information quality, and interface quality (measured with the CSUQ)
– Effectiveness: task completion percentage
– Efficiency: task completion percentage in relation to time on task (see the sketch below)
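As an illustration of how the two objective measures relate, a minimal sketch follows; the function names and the time unit are assumptions, not taken from the experiment's materials.

```python
def effectiveness(task_completion_percentage: float) -> float:
    """Effectiveness is the task completion percentage itself (0-100)."""
    return task_completion_percentage

def efficiency(task_completion_percentage: float, time_on_task_minutes: float) -> float:
    """Efficiency relates the task completion percentage to the time spent on
    the task; here it is expressed as percentage points completed per minute."""
    return task_completion_percentage / time_on_task_minutes

# Hypothetical example: a task 80% completed in 4 minutes.
print(effectiveness(80.0))    # 80.0
print(efficiency(80.0, 4.0))  # 20.0
```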
Experiment Planning and Operation
• Factors
– Device: small screen, standard size screen, wide screen
– Platform: desktop, web
Experiment Planning and Operation
• Experimental subjects
– Selected by convenience sampling: computer science postgraduate students and professors from Universidad Politécnica de Valencia were invited to participate
– Participation was voluntary
– Subjects did not receive incentives
– Subjects did not receive training
– 31 people participated
Experiment Planning and Operation
• Objects of study
– Multi-device/platform graphical UIs generated by MDE
http://www.pros.upv.es/users/naquino/mdp-usability-eval/
Web Site
Experiment Planning and Operation
• Experiment design
– Factorial 3×2×2 design with repeated measures: every subject tested all 12 combinations of Device {Small, Standard, Wide} × Platform {Desktop, Web} × Interaction Unit {SIU, PIU}
– The order in which subjects tested the different combinations was randomized (see the sketch below)
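Below is a sketch, not the actual randomization procedure used in the experiment, of how the order of the 12 combinations could be randomized for each subject:

```python
import random
from itertools import product

DEVICES = ["small screen", "standard size screen", "wide screen"]
PLATFORMS = ["desktop", "web"]
INTERACTION_UNITS = ["SIU", "PIU"]

def randomized_order(seed: int):
    """Return the 12 device x platform x interaction-unit combinations
    in a random order for one subject (repeated-measures design)."""
    combinations = list(product(DEVICES, PLATFORMS, INTERACTION_UNITS))
    random.Random(seed).shuffle(combinations)
    return combinations

# Hypothetical example: the testing order for subject number 7.
for step, combination in enumerate(randomized_order(seed=7), start=1):
    print(step, combination)
```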
Experiment Planning and Operation
• Tasks
– 12 tasks were defined, 6 for SIU and 6 for PIU
– Tasks were similar regarding complexity
– Subjects used a different, randomly assigned task in each of the 12 combinations of device, platform, and interaction unit
Web Site
Experiment Planning and Operation
• Experimental procedure
– Presentation with general information and instructions
– Demographic questionnaire
– Guideline: specifies the combination of device, platform, and interaction unit to use, together with the task to do
– For each task: start and completion time, and the CSUQ
Experiment Planning and Operation
• Experimental procedure
– Groups of at most three subjects at a time, on different days
– Two hours to complete the experiment
Experiment Planning and Operation
• Data collection
– Tasks were corrected to obtain the task completion percentage (effectiveness)
– Time on task was derived from the start and completion time of each task (efficiency)
– Overall satisfaction, system usefulness, information quality, and interface quality were aggregated from the CSUQ answers, following the rules specified by the designers of the CSUQ (see the sketch below)
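As an illustration only, assuming the commonly cited CSUQ scoring rules (19 items on a 7-point scale; system usefulness from items 1-8, information quality from items 9-15, interface quality from items 16-18, and overall satisfaction from all 19 items), the aggregation could look like this sketch; verify the groupings against the original CSUQ publication before reuse.

```python
from statistics import mean

def csuq_scores(answers):
    """Aggregate the 19 CSUQ item answers (1-7 scale) into the four scores.
    Item groupings follow the commonly cited scoring rules; treat them as an
    assumption and check the original CSUQ publication before relying on them."""
    assert len(answers) == 19, "the CSUQ has 19 items"
    return {
        "overall satisfaction": mean(answers),       # items 1-19
        "system usefulness": mean(answers[0:8]),     # items 1-8
        "information quality": mean(answers[8:15]),  # items 9-15
        "interface quality": mean(answers[15:18]),   # items 16-18
    }

# Hypothetical example: a respondent answering 5 to every item.
print(csuq_scores([5] * 19))
```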
Validity Evaluation
• Conclusion validity
– Threat category: reliability of measures
• Threat: satisfaction is a subjective measure. Action taken: the CSUQ has excellent psychometric reliability properties that have been reported in the literature
• Threat: start and completion times of tasks were measured manually
– Threat category: reliability of the application of treatments to subjects
• Threat: evaluations were carried out on different occasions. Actions taken: a standard procedure was designed to be applied equally on each occasion; assignment of devices, platforms, and tasks was carried out randomly
– Threat category: random heterogeneity of subjects
• Threat: subjects had different levels of experience in using the different devices
Validity Evaluation
• Internal validity
– Threat category: instrumentation
• Threat: badly designed instruments. Action taken: instrumentation, tasks, and objects of study were pre-validated by two persons
– Threat category: maturation
• Threat: repeated measures (learning effect). Action taken: different tasks, but with similar complexity, were proposed
• Threat: long experiment. Action taken: a five-minute break was given to participants at each change of device
Validity Evaluation
• Construct validity
– Threat: not having representative material. Action taken: we used an application implemented by the developers of the OlivaNOVA tool; this application is used in training courses about the tool
Validity Evaluation
• External validity
– Threat category: interaction of selection and treatment
• Threat: not having a representative population from which to generalize results. Action taken: subjects had different levels of experience regarding the different devices, but all of them had a background in computer science
– Threat category: interaction of setting and treatment
• Threat: not having representative material from which to generalize results. Action taken: we selected OO-Method/OlivaNOVA as a representative approach of MDE of UIs since it has been patented and is currently being used in commercial and industrial environments
Analysis
• Statistical analysis
– Statistical Package for the Social Sciences (SPSS) V16.0
– Confidence level of 95% (α = 0.05)
Analysis
• Data distribution for SIU
[Box plots of overall satisfaction, effectiveness, and efficiency. Line 1: small screen and web platform; Line 2: small screen and desktop platform; Line 3: standard screen and web platform; Line 4: standard screen and desktop platform; Line 5: wide screen and web platform; Line 6: wide screen and desktop platform]
Analysis
• Data distribution for PIU
[Box plots of overall satisfaction, effectiveness, and efficiency. Line 1: small screen and web platform; Line 2: small screen and desktop platform; Line 3: standard screen and web platform; Line 4: standard screen and desktop platform; Line 5: wide screen and web platform; Line 6: wide screen and desktop platform]
Analysis
• Main analysis performed on response variables
– Descriptive statistics: mean and standard deviation
– Outlier identification: box plots (see the sketch below)
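For illustration, a sketch of the descriptive statistics and a box-plot style outlier check, assuming the usual 1.5 × IQR whisker rule that box plots rely on; this is not the exact SPSS procedure used in the study.

```python
import numpy as np

def describe_and_flag_outliers(values):
    """Mean, standard deviation, and values falling outside the box-plot
    whiskers (more than 1.5 x IQR below Q1 or above Q3)."""
    values = np.asarray(values, dtype=float)
    q1, q3 = np.percentile(values, [25, 75])
    iqr = q3 - q1
    lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return {
        "mean": float(values.mean()),
        "std": float(values.std(ddof=1)),
        "outliers": values[(values < lower) | (values > upper)].tolist(),
    }

# Hypothetical example with one obvious outlier.
print(describe_and_flag_outliers([12, 14, 15, 13, 16, 40]))
```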
Analysis
• Main analysis performed on response variables
– Hypotheses testing (see the sketch below)
• Performed both with and without discarding outliers
• Normality test: Kolmogorov-Smirnov
• If the distribution is normal, a parametric test: ANOVA with repeated measures; if significant, estimated marginal means (for paired comparisons) with Bonferroni as the confidence interval adjustment
• If the distribution is not normal, a non-parametric test: the Friedman test; if significant, Wilcoxon signed-rank tests (for paired comparisons) with a Bonferroni correction to control the error rate
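The study used SPSS; purely as an illustration, here is a sketch of the non-parametric path of this procedure using SciPy. The condition names and data are hypothetical placeholders, and SciPy's tests differ in some details from their SPSS counterparts.

```python
from itertools import combinations
import numpy as np
from scipy import stats

# One efficiency value per subject (31 subjects) for each of six
# device/platform conditions; the data here are random placeholders.
rng = np.random.default_rng(0)
conditions = ["small/web", "small/desktop", "standard/web",
              "standard/desktop", "wide/web", "wide/desktop"]
efficiency = {c: rng.uniform(0, 30, size=31) for c in conditions}

# Normality check per condition (Kolmogorov-Smirnov against a fitted normal).
for cond, values in efficiency.items():
    _, p = stats.kstest(values, "norm", args=(values.mean(), values.std(ddof=1)))
    print(f"{cond}: KS p = {p:.3f}")

# Omnibus non-parametric test across the six related samples.
_, p = stats.friedmanchisquare(*efficiency.values())
print(f"Friedman p = {p:.3f}")

# If significant: pairwise Wilcoxon signed-rank tests with a Bonferroni correction.
pairs = list(combinations(conditions, 2))
corrected_alpha = 0.05 / len(pairs)
for a, b in pairs:
    _, p = stats.wilcoxon(efficiency[a], efficiency[b])
    print(f"{a} vs {b}: p = {p:.4f} (significant if p < {corrected_alpha:.4f})")
```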
Analysis
• Main results for SIU
– Overall satisfaction, system usefulness, information quality, and efficiency are significantly better (confidence level of 95%) in the desktop platform than in the web platform
– Efficiency is significantly better in the standard size screen than in the small one, and significantly better in the wide screen than in the small one
– Effectiveness is significantly better when combining the small screen with the desktop platform than when combining the standard size screen with the web platform (only when discarding outliers)
– Interface quality was not affected by the use of different-sized devices or platforms
Analysis
• Main results for PIU
– Efficiency is significantly better for large screens than for small ones, and for the desktop platform than for the web one
– Overall satisfaction and system usefulness tend to be better (confidence level of 90%) for standard size screens than for small ones
– Information quality, interface quality, and effectiveness were not affected by the use of different-sized devices or platforms
Conclusion
• Regarding platforms, the desktop platform obtained the best results
• Regarding devices, the one with the small screen obtained the worst results
• Possible causes
– OO-Method/OlivaNOVA is mainly used to develop organizational information systems; in those environments, the desktop platform and standard size screens are more common than the other options
– The kinds of user interfaces that people are used to on small devices are different from the type of user interfaces generated with OlivaNOVA
• OO-Method/OlivaNOVA should incorporate enhancements in order to generate multi-device/platform user interfaces with suitable usability
Thank you very much for your attention
www.pros.upv.es