Asking users & experts

The aims

• Discuss the role of interviews & questionnaires in evaluation.
• Teach basic questionnaire design.
• Describe how to do interviews, heuristic evaluation & walkthroughs.
• Describe how to collect, analyze & present data.
• Discuss strengths & limitations of these techniques.

Interviews

• Unstructured - are not directed by a script. Rich but not replicable.

• Structured - are tightly scripted, often like a questionnaire. Replicable but may lack richness.

• Semi-structured - guided by a script but interesting issues can be explored in more depth. Can provide a good balance between richness and replicability.

Basics of interviewing

• Remember the DECIDE framework
• Goals and questions guide all interviews
• Two types of questions:
  - ‘closed questions’ have a predetermined answer format, e.g., ‘yes’ or ‘no’
  - ‘open questions’ do not have a predetermined format

• Closed questions are quicker and easier to analyze

Things to avoid when preparing interview questions

• Long questions
• Compound sentences - split them into two
• Jargon & language that the interviewee may not understand
• Leading questions that make assumptions, e.g., ‘why do you like …?’
• Unconscious biases, e.g., gender stereotypes

Components of an interview

• Introduction - introduce yourself, explain the goals of the interview, reassure about the ethical issues, ask to record, present an informed consent form.

• Warm-up - make first questions easy & non-threatening.

• Main body - present questions in a logical order.

• A cool-off period - include a few easy questions to defuse tension at the end

• Closure - thank the interviewee and signal the end, e.g., switch the recorder off.

The interview process

• Use the DECIDE framework for guidance
• Dress in a similar way to participants
• Check recording equipment in advance
• Devise a system for coding names of participants to preserve confidentiality
• Be pleasant
• Ask participants to complete an informed consent form
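One possible system for coding participant names is sketched below. It is only an illustration: the function name `assign_participant_codes`, the `P01`-style code format, and the example names are all assumptions, not part of the original slides. The idea is that transcripts and recordings carry only the code, while the name-to-code key is stored separately.

```python
import random

def assign_participant_codes(names, seed=None):
    """Map each real name to an opaque code such as 'P03'.

    Codes are handed out in shuffled order so they do not reveal the
    order in which people were recruited. The returned mapping is the
    confidential key and should be stored apart from the interview data.
    """
    rng = random.Random(seed)
    shuffled = list(names)
    rng.shuffle(shuffled)
    # Pad to at least two digits so codes sort naturally (P01, P02, ...).
    width = max(2, len(str(len(shuffled))))
    return {name: f"P{i:0{width}d}" for i, name in enumerate(shuffled, start=1)}

# Hypothetical participants, for illustration only.
key = assign_participant_codes(
    ["Alice Example", "Bob Example", "Chen Example"], seed=1
)
for name, code in key.items():
    print(code, "<-", name)
```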

Probes and prompts

• Probes - devices for getting more information, e.g., ‘would you like to add anything?’

• Prompts - devices to help interviewee, e.g., help with remembering a name

• Remember that probing and prompting should not create bias.

• Too much probing or prompting can encourage participants to try to guess the answer.

Group interviews

• Also known as ‘focus groups’
• Typically 3-10 participants
• Provide a diverse range of opinions
• Need to be managed to ensure that:
  - everyone contributes
  - discussion isn’t dominated by one person
  - the agenda of topics is covered

Analyzing interview data

• Depends on the type of interview
• Structured interviews can be analyzed like questionnaires
• Unstructured interviews generate data like that from participant observation
• It is best to analyze unstructured interviews as soon as possible to identify topics and themes from the data

Questionnaires

• Questions can be closed or open
• Closed questions are easiest to analyze, and the analysis may be done by computer
• Can be administered to large populations
• Paper, email & the web are used for dissemination
• Advantage of electronic questionnaires is that data goes into a database & is easy to analyze
• Sampling can be a problem when the size of a population is unknown, as is common online

Questionnaire style

• Varies according to goal so use the DECIDE framework for guidance

• Questionnaire format can include:
  - ‘yes’, ‘no’ checkboxes
  - checkboxes that offer many options
  - Likert rating scales
  - semantic scales
  - open-ended responses
• Likert scales have a range of points
• 3, 5, 7 & 9 point scales are common
• There is debate about which is best

Developing a questionnaire

• Provide a clear statement of purpose & guarantee participants anonymity
• Plan questions - if developing a web-based questionnaire, design off-line first
• Decide on whether phrases will all be positive, all negative or mixed
• Pilot test questions - are they clear, is there sufficient space for responses?
• Decide how data will be analyzed & consult a statistician if necessary

Encouraging a good response

• Make sure the purpose of the study is clear
• Promise anonymity
• Ensure the questionnaire is well designed
• Offer a short version for those who do not have time to complete a long questionnaire
• If mailed, include a stamped addressed envelope (s.a.e.)
• Follow up with emails, phone calls, letters
• Provide an incentive
• 40% response rate is high, 20% is often acceptable

Advantages of online questionnaires

• Responses are usually received quickly
• No copying and postage costs
• Data can be collected in a database for analysis
• Time required for data analysis is reduced
• Errors can be corrected easily
• Disadvantage - sampling is problematic if population size is unknown
• Disadvantage - preventing individuals from responding more than once

Problems with online questionnaires

Sampling is problematic if population size is unknown

Preventing individuals from responding more than once

Individuals have also been known to change questions in email questionnaires

Questionnaire data analysis & presentation

• Present results clearly - tables may help
• Simple statistics can say a lot, e.g., mean, median, mode, standard deviation
• Percentages are useful but give the population size as well
• Bar graphs show categorical data well
• More advanced statistics can be used if needed
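The simple statistics named above can be computed with Python's standard library. The sketch below is illustrative only: the responses are invented, the 5-point coding (1 = strongly disagree, 5 = strongly agree) is an assumption, and `summarize` is a hypothetical helper name. Whether a mean is meaningful for Likert data is itself debated, so the median and the percentage per scale point are reported alongside it.

```python
from collections import Counter
from statistics import mean, median, mode, stdev

# Invented responses to one 5-point Likert item
# (1 = strongly disagree ... 5 = strongly agree).
responses = [4, 5, 3, 4, 2, 4, 5, 3, 4, 4]

def summarize(values):
    """Return simple descriptive statistics for one closed question."""
    n = len(values)
    counts = Counter(values)
    return {
        "n": n,  # always report the number of respondents
        "mean": mean(values),
        "median": median(values),
        "mode": mode(values),
        "stdev": stdev(values),  # sample standard deviation
        # Percentage of respondents choosing each scale point.
        "percent": {point: 100 * counts[point] / n for point in sorted(counts)},
    }

summary = summarize(responses)
for point, pct in summary["percent"].items():
    print(f"scale point {point}: {pct:.0f}% (n={summary['n']})")
```

Printing the percentage together with `n` follows the advice to give the population size whenever percentages are shown.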

Asking experts

• Experts use their knowledge of users & technology to review software usability

• Expert critiques (crits) can be formal or informal reports

• Heuristic evaluation is a review guided by a set of heuristics

• Walkthroughs involve stepping through a pre-planned scenario noting potential problems

Heuristic evaluation

• Developed by Jakob Nielsen in the early 1990s
• Based on heuristics distilled from an empirical analysis of 249 usability problems
• These heuristics have been revised for current technology, e.g., HOMERUN for web
• Heuristics still needed for mobile devices, wearables, virtual worlds, etc.
• Design guidelines form a basis for developing heuristics

Nielsen’s heuristics

• Visibility of system status
• Match between system and real world
• User control and freedom
• Consistency and standards
• Help users recognize, diagnose, recover from errors
• Error prevention
• Recognition rather than recall
• Flexibility and efficiency of use
• Aesthetic and minimalist design
• Help and documentation

Discount evaluation

• Heuristic evaluation is referred to as discount evaluation when 5 evaluators are used.

• Empirical evidence suggests that on average 5 evaluators identify 75-80% of usability problems.
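A rough way to see why a handful of evaluators goes a long way is a simple independence model, sketched below. This model is an assumption added for illustration and is not stated in the slides: it supposes every evaluator finds each problem independently with the same probability. The value 0.25 is a hypothetical detection rate, chosen only because it yields a figure in the 75-80% range for 5 evaluators.

```python
def proportion_found(evaluators, detection_rate):
    """Expected share of problems found by at least one evaluator.

    Assumes each evaluator independently finds any given problem with
    probability `detection_rate` (an illustrative simplification).
    """
    return 1 - (1 - detection_rate) ** evaluators

# Hypothetical detection rate of 25% per evaluator.
for n in range(1, 8):
    print(f"{n} evaluator(s): {proportion_found(n, 0.25):.0%}")
```

Under these assumptions the curve flattens quickly, so each extra evaluator beyond about five adds comparatively little.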

3 stages for doing heuristic evaluation

• Briefing session to tell experts what to do
• Evaluation period of 1-2 hours in which:
  - each expert works separately
  - each takes one pass to get a feel for the product
  - each takes a second pass to focus on specific features
• Debriefing session in which experts work together to prioritize problems

Advantages and problems

• Few ethical & practical issues to consider
• Can be difficult & expensive to find experts
• Best experts have knowledge of application domain & users
• Biggest problems:
  - important problems may get missed
  - many trivial problems are often identified

Cognitive walkthroughs

• Focus on ease of learning
• Designer presents an aspect of the design & usage scenarios
• One or more experts walk through the design prototype with the scenario
• Expert is told the assumptions about user population, context of use, task details
• Experts are guided by 3 questions

The 3 questions

• Will the correct action be sufficiently evident to the user?

• Will the user notice that the correct action is available?

• Will the user associate and interpret the response from the action correctly?

As the experts work through the scenario they note problems

Pluralistic walkthrough

• Variation on the cognitive walkthrough theme
• Performed by a carefully managed team
• The panel of experts begins by working separately
• Then there is managed discussion that leads to agreed decisions
• The approach lends itself well to participatory design

Key points

• Structured, unstructured, semi-structured interviews, focus groups & questionnaires

• Closed questions are easiest to analyze & can be replicated

• Open questions are richer

• Check boxes, Likert & semantic scales

• Expert evaluation: heuristic & walkthroughs

• Relatively inexpensive because no users

• Heuristic evaluation relatively easy to learn

• May miss key problems & identify false ones

Testing & modeling users

The aims

• Describe how to do user testing.
• Discuss the differences between user testing, usability testing and research experiments.
• Discuss the role of user testing in usability testing.
• Discuss how to design simple experiments.
• Describe GOMS, the keystroke level model, Fitts’ law and discuss when these techniques are useful.
• Describe how to do a keystroke level analysis.

Experiments, user testing & usability testing

• Experiments test hypotheses to discover new knowledge by investigating the relationship between two or more things – i.e., variables.

• User testing is applied experimentation in which developers check that the system being developed is usable by the intended user population for their tasks.

• Usability testing uses a combination of techniques, including user testing & user satisfaction questionnaires.

User testing is not research

User testing

• Aim: improve products
• Few participants
• Results inform design
• Not perfectly replicable
• Controlled conditions
• Procedure planned
• Results reported to developers

Research experiments

• Aim: discover knowledge
• Many participants
• Results validated statistically
• Replicable
• Strongly controlled conditions
• Experimental design
• Scientific paper reports results to community

User testing

• Goals & questions focus on how well users perform tasks with the product
• Comparison of products or prototypes common
• Major part of usability testing
• Focus is on time to complete task & number & type of errors
• Informed by video & interaction logging
• User satisfaction questionnaires provide data about users’ opinions

Testing conditions

• Usability lab or other controlled space
• Major emphasis on:
  - selecting representative users
  - developing representative tasks
• 5-10 users typically selected
• Tasks usually last no more than 30 minutes
• The test conditions should be the same for every participant
• Informed consent form explains ethical issues

Type of data (Wilson & Wixon, ‘97)

• Time to complete a task
• Time to complete a task after a specified time away from the product
• Number and type of errors per task
• Number of errors per unit of time
• Number of navigations to online help or manuals
• Number of users making a particular error
• Number of users completing task successfully
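Several of the measures listed above can be derived from a simple per-participant record of each test session. The sketch below is a minimal illustration: the `Session` fields, the `task_metrics` helper, and all of the numbers are invented for the example and do not come from the slides.

```python
from dataclasses import dataclass
from statistics import mean

@dataclass
class Session:
    participant: str   # coded participant ID
    seconds: float     # time spent on the task
    errors: int        # number of errors observed
    completed: bool    # whether the task was completed successfully

# Invented results for one task.
sessions = [
    Session("P01", 95.0, 2, True),
    Session("P02", 140.0, 5, False),
    Session("P03", 80.0, 0, True),
    Session("P04", 110.0, 1, True),
    Session("P05", 125.0, 3, True),
]

def task_metrics(results):
    """Compute a few common user-testing measures for one task."""
    completed = [s for s in results if s.completed]
    total_minutes = sum(s.seconds for s in results) / 60
    return {
        # Time to complete, over successful attempts only.
        "mean_time_completed": mean(s.seconds for s in completed),
        "errors_per_task": mean(s.errors for s in results),
        "errors_per_minute": sum(s.errors for s in results) / total_minutes,
        "users_completing": len(completed),
        "users_with_errors": sum(1 for s in results if s.errors > 0),
    }

metrics = task_metrics(sessions)
for name, value in metrics.items():
    print(f"{name}: {value}")
```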

Usability engineering orientation

• Current level of performance
• Minimum acceptable level of performance
• Target level of performance

How many participants is enough for user testing?

• The number is largely a practical issue
• Depends on:
  - schedule for testing
  - availability of participants
  - cost of running tests
• Typically 5-10 participants
• Some experts argue that testing should continue until no new insights are gained

Experiments

• Predict the relationship between two or more variables

• Independent variable is manipulated by the researcher

• Dependent variable depends on the independent variable

• Typical experimental designs have one or two independent variables

Experimental designs

• Different participants - single group of participants is allocated randomly to the experimental conditions

• Same participants - all participants appear in both conditions

• Matched participants - participants are matched in pairs, e.g., based on expertise, gender

Advantages & disadvantages

| Design | Advantages | Disadvantages |
|---|---|---|
| Different participants | No order effects | Many subjects needed & individual differences are a problem |
| Same participants | Few individuals, no individual differences | Counter-balancing needed because of ordering effects |
| Matched participants | Same as different participants but individual differences reduced | Cannot be sure of perfect matching on all differences |
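Random allocation (different participants) and counter-balancing (same participants) can both be done mechanically. The sketch below is one possible approach, not a prescribed procedure: the function names, the participant codes, and the two condition labels are all hypothetical, and the counter-balancing shown covers only the two-condition case.

```python
import random

def allocate_between(participants, conditions, seed=None):
    """Different-participants design: shuffle, then deal participants
    round-robin into conditions so group sizes stay balanced."""
    rng = random.Random(seed)
    shuffled = list(participants)
    rng.shuffle(shuffled)
    return {p: conditions[i % len(conditions)] for i, p in enumerate(shuffled)}

def counterbalance_within(participants, conditions):
    """Same-participants design with two conditions: alternate the order
    (A then B, B then A) so ordering effects are spread across both."""
    a, b = conditions
    orders = [(a, b), (b, a)]
    return {p: orders[i % 2] for i, p in enumerate(participants)}

# Hypothetical participant codes and conditions.
people = [f"P{i:02d}" for i in range(1, 9)]
groups = allocate_between(people, ["design A", "design B"], seed=42)
orders = counterbalance_within(people, ["design A", "design B"])

for p in people:
    print(p, "| between:", groups[p], "| within order:", orders[p])
```

With eight participants, each condition receives four people in the between-groups allocation, and half of the participants see each ordering in the within-groups schedule.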

Predictive models

• Provide a way of evaluating products or designs without directly involving users
• Psychological models of users are used to test designs
• Less expensive than user testing
• Usefulness limited to systems with predictable tasks - e.g., telephone answering systems, mobiles, etc.
• Based on expert behavior

GOMS (Card et al., 1983)

• Goals - the state the user wants to achieve e.g., find a website

• Operators - the cognitive processes & physical actions performed to attain those goals, e.g., decide which search engine to use

• Methods - the procedures for accomplishing the goals, e.g., drag mouse over field, type in keywords, press the go button

• Selection rules - determine which method to select when there is more than one available

Keystroke level model

GOMS has also been developed further into a quantitative model - the keystroke level model. This model allows predictions to be made about how long it takes an expert user to perform a task.

Response times for keystroke level operators

| Operator | Description | Time (sec) |
|---|---|---|
| K | Pressing a single key or button: average skilled typist (55 wpm) | 0.22 |
| K | Pressing a single key or button: average non-skilled typist (40 wpm) | 0.28 |
| K | Pressing shift or control key | 0.08 |
| K | Pressing a single key or button: typist unfamiliar with the keyboard | 1.20 |
| P | Pointing with a mouse or other device on a display to select an object. This value is derived from Fitts’ Law, which is discussed below. | 0.40 |
| P1 | Clicking the mouse or similar device | 0.20 |
| H | Bring ‘home’ hands on the keyboard or other device | 0.40 |
| M | Mentally prepare/respond | 1.35 |
| R(t) | The response time is counted only if it causes the user to wait. | t |
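A keystroke level analysis amounts to writing the task as a sequence of operators and summing their times. The sketch below uses the operator times from the table above, with K set to the 0.22 s value for an average skilled typist. The task itself, the `predict_time` helper, and the placement of the M operators are illustrative assumptions; deciding where mental preparation occurs is a judgment the analyst has to make.

```python
# Operator times in seconds, taken from the table above.
# K uses the value for an average skilled typist (55 wpm).
OPERATOR_TIMES = {"K": 0.22, "P": 0.40, "P1": 0.20, "H": 0.40, "M": 1.35}

def predict_time(operators, k_time=OPERATOR_TIMES["K"]):
    """Sum operator times for a sequence such as ["M", "P", "P1"].

    A system response the user must wait for is written as the tuple
    ("R", seconds). Pass k_time to model a different typing skill.
    """
    total = 0.0
    for op in operators:
        if isinstance(op, tuple) and op[0] == "R":
            total += op[1]
        elif op == "K":
            total += k_time
        else:
            total += OPERATOR_TIMES[op]
    return total

# Hypothetical task: reach for the mouse, decide, point at a text field
# and click, return to the keyboard, decide, then type a 4-letter word.
task = ["H", "M", "P", "P1", "H", "M"] + ["K"] * 4
print(f"Predicted expert time: {predict_time(task):.2f} s")  # 4.98 s
```

Passing `k_time=0.28` gives the corresponding estimate for an average non-skilled typist on the same task.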

Fitts’ Law (Paul Fitts 1954)

• The law predicts that the time to point at an object using a device is a function of the distance from the target object & the object’s size.

• The further away & the smaller the object, the longer the time to locate it and point.

• Useful for evaluating systems for which the time to locate an object is important such as handheld devices like mobile phones
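The slides describe the law qualitatively. One widely used way of writing it, the so-called Shannon formulation, is T = a + b log2(D/W + 1), where D is the distance to the target and W is its width. The sketch below uses that form as an assumption, since the slides do not give a formula. The constants `a` and `b` are device-specific and normally fitted from measurements; the defaults here are placeholders with no empirical meaning.

```python
import math

def fitts_time(distance, width, a=0.0, b=0.1):
    """Predicted pointing time in seconds, using the Shannon formulation
    T = a + b * log2(distance / width + 1).

    distance and width must use the same units. a and b are placeholder
    constants for illustration; real values come from measured data.
    """
    return a + b * math.log2(distance / width + 1)

near_large = fitts_time(distance=70, width=10)    # log2(8)  = 3 bits
far_small = fitts_time(distance=310, width=10)    # log2(32) = 5 bits

print(f"near target: {near_large:.2f} s, far target: {far_small:.2f} s")
```

As the law predicts, the farther target takes longer to acquire, and halving the width at a fixed distance also increases the predicted time.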

Key points

• User testing is a central part of usability testing
• Testing is done in controlled conditions
• User testing is an adapted form of experimentation
• Experiments aim to test hypotheses by manipulating certain variables while keeping others constant
• The experimenter controls the independent variable(s) but not the dependent variable(s)
• There are three types of experimental design: different-participants, same-participants, & matched participants
• GOMS, Keystroke level model, & Fitts’ Law predict expert, error-free performance
• Predictive models are used to evaluate systems with predictable tasks such as telephones
