
Heuristic Evaluation

John Kelleher

What do you want for your product? Good quality? Inexpensive? Quick to get to the market?
Good, cheap, quick: pick any two.

- Old engineer’s saying

Outline

- Discount usability engineering
- Heuristic evaluation
- Heuristics
- How to perform an HE
- HE vs. user testing
- How well does HE work?

Discount Usability Engineering

- Cheap
  - no special labs or equipment needed
  - the more careful you are, the better it gets
- Fast
  - on the order of 1 day to apply
  - standard usability testing may take weeks
- Easy to use
  - can be taught in 2-4 hours

Expert Evaluation

Advantages
- Strongly diagnostic
- Overview of whole interface
- Few resources needed (except for experts)
- Cheap
- High potential return: detects significant problems

Disadvantages
- Relies on role playing, which can be restricting
- Subject to bias
- Problems locating experts
- Cannot capture real user behaviour

Heuristic Evaluation

- Developed by Jakob Nielsen (www.useit.com)
- Helps find usability problems in a UI design
- Small set (3-5) of evaluators examine the UI
  - independently check for compliance with usability principles (“heuristics”)
  - different evaluators will find different problems
- Can be performed on a working UI or on sketches

Heuristic Evaluation (cont.)

- Each evaluator goes through the UI several times
  - inspects various dialogue elements
  - compares them with a list of usability principles
  - considers any additional principles or results that come to mind
- Usability principles
  - Nielsen’s “heuristics”
  - supplementary list of category-specific heuristics
    - competitive analysis & user testing of existing products
- Use violations to redesign/fix problems

Heuristics (original)

- H1-1: Simple and natural dialog
- H1-2: Speak the users’ language
- H1-3: Minimize users’ memory load
- H1-4: Consistency
- H1-5: Feedback
- H1-6: Clearly marked exits
- H1-7: Shortcuts
- H1-8: Precise and constructive error messages
- H1-9: Prevent errors
- H1-10: Help and documentation

Phases of Heuristic Evaluation

1) Pre-evaluation training
   - give evaluators needed domain knowledge and information on the scenario
2) Evaluation
   - individuals evaluate and then aggregate results
3) Severity rating
   - determine how severe each problem is (priority)
4) Debriefing
   - discuss the outcome with the design team

How to Perform Evaluation

- Design may be a verbal description, paper mock-up, working prototype, or running system. [When evaluating paper mock-ups, pay special attention to missing dialogue elements!]
- Optionally provide evaluators with some domain-specific training.
- Each evaluator works alone (~1–2 hours).
- Interface is examined in two passes: the first pass focuses on general flow, the second on individual dialogue elements.
- Notes are taken either by the evaluator or by the evaluation manager.
- Independent findings are aggregated.
- Severity ratings are assigned first individually and are then aggregated.
- Group debriefing session to suggest possible redesigns.
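The aggregation step can be sketched in a few lines of Python. This is only an illustration: the evaluator names, the problem descriptions, and the helper `aggregate_findings` are all hypothetical, and it assumes duplicate reports have already been given the same wording.

```python
from collections import defaultdict


def aggregate_findings(findings_by_evaluator):
    """Merge independent problem lists into one list, recording who found what."""
    found_by = defaultdict(set)
    for evaluator, problems in findings_by_evaluator.items():
        for problem in problems:
            found_by[problem].add(evaluator)
    # Most widely reported problems first; ties broken alphabetically.
    return sorted(
        ((problem, sorted(evaluators)) for problem, evaluators in found_by.items()),
        key=lambda item: (-len(item[1]), item[0]),
    )


# Hypothetical findings from three evaluators who worked alone.
findings = {
    "eval_a": ["no cancel button", "jargon in labels"],
    "eval_b": ["no cancel button", "inconsistent fonts"],
    "eval_c": ["no cancel button", "jargon in labels", "error message too late"],
}

merged = aggregate_findings(findings)
for problem, evaluators in merged:
    print(f"{problem}: found by {len(evaluators)} of {len(findings)} evaluators")
```

Keeping the list of evaluators per problem (not just a count) makes it easy to see that different evaluators find different problems.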

Severity Rating

- Used to allocate resources to fix problems
- Estimates of need for more usability efforts
- Combination of
  - frequency
  - impact
  - number of affected users
- Should be calculated after all evaluations are in
- Should be done independently by all judges


Severity Ratings (cont.)

0 - don’t agree that this is a usability problem

1 - cosmetic problem

2 - minor usability problem

3 - major usability problem; important to fix

4 - usability catastrophe; imperative to fix
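A minimal sketch of how independent ratings on the 0-4 scale might be combined. The slides do not say how ratings are aggregated; taking the mean across judges is an assumption made here, and the problems and ratings are hypothetical.

```python
from statistics import mean

# The 0-4 scale from the slide.
SEVERITY_LABELS = {
    0: "not a usability problem",
    1: "cosmetic problem",
    2: "minor usability problem",
    3: "major usability problem",
    4: "usability catastrophe",
}


def aggregate_severity(ratings_by_problem):
    """Average each problem's independent ratings and sort most severe first."""
    for problem, ratings in ratings_by_problem.items():
        if not all(r in SEVERITY_LABELS for r in ratings):
            raise ValueError(f"ratings for {problem!r} must be integers from 0 to 4")
    # Assumption: the aggregate is the plain mean of the judges' ratings.
    averaged = {problem: mean(ratings) for problem, ratings in ratings_by_problem.items()}
    return sorted(averaged.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical ratings from three judges who rated independently.
ratings = {
    "error message too late": [4, 3, 4],
    "inconsistent fonts": [1, 2, 1],
    "no cancel button": [3, 3, 4],
}

prioritized = aggregate_severity(ratings)
for problem, score in prioritized:
    print(f"{score:.1f}  {problem}")
```

The sorted list gives a rough priority order for allocating resources to fixes.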

How Many Problems Found?

Evaluation   Number of    Total Known   Average Proportion of Problems
Name         Evaluators   Problems      Found per Evaluator
Teledata     37           52            51%
Mantel       77           30            38%
Savings      34           48            26%
Transport    34           34            20%

Four heuristic evaluations were conducted by “usability novices” (Nielsen93, UE)

Aggregated Evaluations

Individual evaluators found relatively few problems.

Aggregating the evaluations of several individuals produced much better results:

Aggregate size   1     2     3     5     10
Teledata         51%   71%   81%   90%   97%
Mantel           38%   52%   60%   70%   83%
Savings          26%   41%   50%   63%   78%
Transport        20%   33%   42%   55%   71%

Aggregated Evaluations

[Figure: average proportion of usability problems found by aggregates of size 1 to 30.]
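One way to compute an "average proportion found by aggregates of size k" is to take every possible group of k evaluators, pool what they found, and average the share of known problems covered. The sketch below does this on hypothetical data; the function name and the findings are illustrative, and it is not necessarily the procedure used in the cited studies.

```python
from itertools import combinations


def average_proportion_found(findings, total_problems, size):
    """Average share of known problems found across all groups of `size` evaluators."""
    if not 1 <= size <= len(findings):
        raise ValueError("size must be between 1 and the number of evaluators")
    shares = [
        len(set().union(*group)) / total_problems  # pool the group's findings
        for group in combinations(findings, size)
    ]
    return sum(shares) / len(shares)


# Hypothetical data: which of 4 known problems (numbered 1-4) each evaluator found.
findings = [{1, 2}, {1, 3}, {2}, {1, 4}]

for size in range(1, len(findings) + 1):
    share = average_proportion_found(findings, total_problems=4, size=size)
    print(f"aggregates of size {size}: {share:.0%}")
```

With many evaluators the number of groups grows quickly, so a random sample of groups would be a reasonable substitute for full enumeration.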

Debriefing

- Conduct with evaluators, observers, and development team members
- Discuss general characteristics of the UI
- Suggest potential improvements to address major usability problems
- Make it a brainstorming session
  - little criticism until end of session

Examples

- Can’t copy info from one window to another
  - violates “Minimize the users’ memory load” (H1-3)
  - fix: allow copying
- Typography uses a mix of upper/lower case formats and fonts
  - violates “Consistency and standards” (H2-4)
  - slows users down
  - probably wouldn’t be found by user testing
  - fix: pick a single format for the entire interface

HE vs. User Testing

- HE is much faster
  - 1-2 hours per evaluator vs. days to weeks
- HE doesn’t require interpreting users’ actions
- User testing is far more accurate (by def.)
  - takes into account actual users and tasks
  - HE may miss problems & find “false positives”
- Good to alternate between HE and user testing
  - find different problems
  - don’t waste participants

Results of Using HE

- Discount: benefit-cost ratio of 48 [Nielsen94]
  - cost was $10,500 for benefit of $500,000
  - value of each problem ~15K (Nielsen & Landauer)
  - how might we calculate this value?
    - in-house productivity; open market sales
- Correlation between severity & finding w/ HE
- Single evaluator achieves poor results
  - only finds 35% of usability problems
  - 5 evaluators find ~75% of usability problems
  - why not more evaluators? 10? 20?
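The benefit-cost ratio quoted above follows directly from the two dollar figures on the slide. A small worked check (the function name is illustrative):

```python
def benefit_cost_ratio(benefit, cost):
    """Return benefit divided by cost; both are in the same currency."""
    if cost <= 0:
        raise ValueError("cost must be positive")
    return benefit / cost


# Figures from the slide: $500,000 benefit for a $10,500 cost.
ratio = benefit_cost_ratio(500_000, 10_500)
print(f"benefit-cost ratio: {ratio:.1f}")  # about 47.6, which rounds to 48
```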

[Figure: number of evaluators vs. problems found]

[Figure: cost vs. benefit]

Experience of Evaluators

- Experience of evaluators influences results.
- Study of one interface, the Banking System, a touch-tone “voice response” telephone banking system, by 3 groups of evaluators:
  - 31 “novice” evaluators: computer science students with no formal knowledge of UI or usability (no usability expertise).
  - 19 “regular” specialists: people with UI and usability experience, but no expertise in voice-response systems (usability expertise).
  - 14 “double” specialists: people with expertise both in usability and in telephone-operated interfaces (usability and domain expertise).
- Task: transfer $1000 from savings account to checking account.


Sample Banking System Dialogue

[First there is a short dialogue in which the user is identified by entering an identification number and access code – this is not part of the evaluation exercise]

1) S: Enter one for account information, three for transfers between your own accounts, . . .

2) U: 3# {the user interrupts the system}
3) S: Enter account to transfer from.
4) U: 1234567890# {savings account number}
5) S: Enter account to transfer to.
6) U: # {an abbreviation for the checking account}
7) S: Enter amount in cents.
8) U: 100000#
9) S: From account number twelve thirtyfour fiftysix seventyeight ninety to account number primary account, a transfer of one thousand dollars is to be made. Press one to confirm, zero to cancel.
10) U: 1#
11) S: You do not have access to this function.

Major Usability Problems

No.  Problem                                                   Novice  Regular  Double
1    Error message appears too late.                           68%     84%      100%
2    Do not require dollar amount to be entered in cents.      68%     74%      79%
3    The error message is not precise.                         55%     63%      64%
4    The error message is not constructive.                    6%      11%      21%
5    Replace term "primary account" with "checking account".   10%     47%      43%
6    Let users choose account from a menu.                     16%     32%      43%
7    Only require a # where it is necessary.                   3%      32%      71%
8    Give feedback as name of chosen account.                  6%      26%      64%
     Average for major problems                                29%     46%      61%

Proportion of novice, specialist, and double specialist usability evaluators finding problems in the Banking System. Results from Nielsen [1992].

Minor Usability Problems

No.  Problem                                                   Novice  Regular  Double
9    Read menu item description before action number.          3%      11%      71%
10   Avoid gap in menu numbers between 1 and 3.                42%     42%      79%
11   Provide earlier feedback.                                 42%     63%      71%
12   Replace use of 1/0 for accept/reject with #/*.            6%      21%      43%
13   Remove the field label "number" when no number is given.  10%     32%      36%
14   Change prompt "account" to "account number".              6%      37%      36%
15   Read numbers one digit at a time.                         6%      47%      79%
16   Use "press" consistently and avoid "enter".               0%      32%      57%
     Average for minor problems                                15%     36%      59%


Results

Average proportion of usability problems found by aggregates of novice evaluators, regular specialists, and double specialists. Results from Nielsen [1992].

Heuristic Evaluation Test

The following figure illustrates a checkout screen for an online store. We describe ten usability violations. Each violation is labelled with a number on the figure.

For each problem, suggest a solution.

[Figure: checkout screen for an online store, with the ten violations labelled 1 to 10]

10 Heuristic Violations

1. H2-1 Visibility of system status
   Problem: The UI only says that you are in stage 3; it does not tell the user how many more stages are left.
   Solution: Indicate that the user is in Stage 3 of 6, or provide a timeline along the top of the page that steps the user through the transaction as they progress.

2. H2-2 Match between system and the real world
   Problem: The term “Wagon” does not match the user’s conceptual model of shopping.
   Solution: Change the term “Wagon” to “Cart” or “Basket”.

10 Heuristic Violations (contd)

3. H2-8 Aesthetic and minimalist design
   Problem: The “news from the net” section has nothing to do with the user’s transaction. This information is distracting and can lead the user to leave the site to explore a news story without completing their transaction.
   Solution: Remove this section. This kind of information can be provided after the transaction is completed.

4. H2-9 Help users recognize, diagnose, and recover from errors
   Problem: The message tells the user that the form has errors, but it doesn’t tell them which fields have errors. The user could create more errors by changing fields that were originally correct.
   Solution: Mark the fields that need to be changed. Move the error message to the top of the page and highlight the fields that the user needs to fix.
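The solution to violation 4 depends on the form handler reporting errors per field rather than as a single "form has errors" flag. A minimal sketch, assuming hypothetical field names (`name`, `email`, `card_number`) and deliberately simple checks:

```python
def validate_checkout(form):
    """Return a dict mapping each invalid field name to a specific message."""
    errors = {}
    if not form.get("name", "").strip():
        errors["name"] = "Enter the name for the order."
    if "@" not in form.get("email", ""):
        errors["email"] = "Enter an email address that contains '@'."
    if not form.get("card_number", "").replace(" ", "").isdigit():
        errors["card_number"] = "Enter the card number using digits only."
    return errors


# Hypothetical submission with one bad field; only that field is flagged.
errors = validate_checkout(
    {"name": "A. Shopper", "email": "shopper.example.com", "card_number": "4111 1111"}
)
for field, message in errors.items():
    print(f"{field}: {message}")
```

Because the result names each failing field, the page can highlight exactly those fields and leave the correct ones alone.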

10 Heuristic Violations (contd)

5. H2-4 Consistency and standards
   Problem: The ‘Modify’ and ‘Change’ buttons seem to have the same functionality, so they should be labeled the same. If they do have distinct functions, they should be labeled more clearly and moved so that they do not mislead the user.
   Solution: Change the labels on the buttons to ‘Change Item’.

6. H2-3 User control and freedom
   Problem: The user is given only one choice: to proceed to the next page. There is no option to cancel or go back.
   Solution: Add cancel and back buttons to give the user more control over the process.

10 Heuristic Violations (contd)

7. H2-2 Match between system and the real world
   Problem: ‘Transmit’ is not a common term; it is a technical term for sending a form to be processed.
   Solution: Change the term to something clearer, like ‘Submit’.

8. H2-6 Recognition rather than recall
   Problem: To insert an item, the user has to recall the item number. This is too much for the user to remember, especially if there is no correlation between the code and the item.
   Solution: Provide a link for the user to continue shopping. This will allow the user to go back to the initial page and search and browse for items they might want to add to their cart.

10 Heuristic Violations (contd)

9. H2-4 Consistency and standards
   Problem: The text is in blue and underlined, signaling to the user that the text is a hyperlink, which it probably isn’t.
   Solution: Change the color and remove the underlining. Ideally this section should not even be on this page.

10. H2-5 Error prevention
    Problem: The fields for phone numbers are not fixed in length, so users can easily enter invalid data.
    Solution: To prevent users from accidentally entering incorrect data, set widths for the text fields so that a format is provided, or provide an example of how the entry should be made.
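Showing an example format works best when the same format is also checked on entry. A minimal sketch, assuming a hypothetical `NNN-NNN-NNNN` phone format; the actual format for the store in the figure is not given in the slides.

```python
import re

# Assumed format for illustration only; shown next to the field as a hint.
PHONE_EXAMPLE = "555-123-4567"
PHONE_PATTERN = re.compile(r"\d{3}-\d{3}-\d{4}")


def check_phone(value):
    """Return None if the value matches the example format, else a hint message."""
    if PHONE_PATTERN.fullmatch(value.strip()):
        return None
    return f"Enter the phone number like {PHONE_EXAMPLE}."


print(check_phone("555-123-4567"))  # valid input, so no hint
print(check_phone("5551234"))       # invalid input, so the hint is returned
```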

Summary

- Heuristic evaluation is a discount method
- A single evaluator finds only a small subset of potential problems
- Have evaluators go through the UI twice
- Ask them to see if it complies with heuristics
  - note where it doesn’t and say why
- Combine the findings from 3 to 5 evaluators
- Have evaluators independently rate severity
- Discuss problems with design team
- Alternate with user testing
- May miss domain-specific problems
