remote testing

Post on 27-Jan-2016



REMOTE TESTING

39TUR 2012/2013

Remote Testing


Software for X

Example: A piece of software needs to be tested for the market in country X.

Possibilities:
– Invite 10 people from X to the Czech Republic
• Air tickets, accommodation, visa
• Not their own environment
– Go to X
• Use a local recruitment agency
• Rent a usability lab
• Vaccination
– Is it always necessary?
– Use remote testing

Traditional Methods vs. Remote Testing

Traditional methods
• Participants sit in the lab
• Testers physically observe & record

Remote testing
• Participants sit in their office/home
• Testers observe their screen via a cable & record

Hierarchy of remote testing methods

Same place, same time: Classic usability testing (up until now)
Same place, different time: (Not included in the course)
Different place, same time: Remote testing, teleconferencing
Different place, different time: Surveys, on-line evaluation tools


Same time, different place

Synchronous
– People connected via teleconferencing (moderator, participant, stakeholders)

Remote Testing

The testers observe the participants remotely
– Via telephone
– Via videoconferencing
– Via screen capturing and streaming software
• Could be a combination of a remote desktop (VNC, …) + a screen grabber (Camtasia, …)

The methodology is similar to that of classic usability tests
– With certain differences

Quality Comparison

Selvaraj & Houck-Whitaker:
– Remote tests are at least as effective as the “traditional” methodology

Benefits
– Time and cost savings
• You and your participants don’t need to spend time traveling
– Realistic context of use
• You reach people in their own environment (which might also be a drawback)
– Geographic representation
• Different parts of the globe can be covered
– Access to professionals
• It’s easier to ask a professional earning “over 9000” to take part in the test because it will take less of their time

Limitations
– Lack of nonverbal signs
• Communication delay
• Low resolution of the video, or perhaps no video link at all
– No control over the participant’s conditions
• Hard to check that the software is installed correctly
• Hard to make sure the participant is not being disturbed
– The moderator can’t assist the users on-site
• The users are on their own when using the system
– A higher level of user IT literacy is expected
• Cannot test with novice users
– Computer security measures
– Will the users trust our application?
• People are afraid of spyware
• Privately owned vs. corporate computers
– Will the stakeholders believe that it’s not fake?
• There is no observer room

Costs Comparison

Traditional tests
• Lab
• Equipment
• Recruitment
• Travel costs
• User incentives – physical presents, money

Remote tests
• Online meeting
• User’s equipment
• Recruitment
• No travel costs
• User incentives – electronic coupons, money

Remote Testing Overview

Very similar to “classic usability testing”
– Define Objectives & Target Audience
– Set up Test Scenario
– Recruit Test Users
– Carry out Tests
– Analyze Findings
– Design Report & Brief Stakeholders

What are you testing? Who are you testing?

Representative Tasks
– Within time limits & user capabilities
– In line with test objectives

Methods of data collection
– Screen capture
– Questionnaires, interviews

Remote Testing: Test preparation

Discuss the objectives with the project stakeholders
Develop instructions for the participants
Run a pilot test with home users
Apply the changes suggested by the results of the pilot test

Remote Testing: Recruitment

Define user profile & recruitment criteria
Set up a recruitment screener
– Screener can be filled out on the web
• Questionnaires feed a database of potential participants
• Selection from the database
– Telephone screener
• Very low success rate (telephone marketing failure)

Decide on the incentives
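A web screener is, in effect, a filter over questionnaire answers. The following is a minimal sketch of that idea, not the way any particular recruitment tool works. The field names (`country`, `uses_product_weekly`, `can_install_software`) and the criteria themselves are hypothetical and would come from your own user profile.

```python
from dataclasses import dataclass


@dataclass
class ScreenerResponse:
    # All fields are hypothetical examples of screener questions.
    name: str
    country: str
    uses_product_weekly: bool
    can_install_software: bool


def is_eligible(response: ScreenerResponse, target_country: str) -> bool:
    """Apply the (assumed) recruitment criteria to one screener response."""
    return (
        response.country == target_country
        and response.uses_product_weekly
        and response.can_install_software
    )


def select_participants(responses, target_country, limit):
    """Return up to `limit` eligible respondents, in the order they answered."""
    eligible = [r for r in responses if is_eligible(r, target_country)]
    return eligible[:limit]
```

Selecting in arrival order is only one option; a real study might instead balance the sample across age groups or experience levels.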


Recruitment channels
– Web
• Social networks, mailing lists, job portals
– Traditional: newspapers, ads
• With a URL to enter
• Unlikely to succeed
– Recruitment agency
• May be important when testing in an unknown market
• Perhaps better-targeted participants
– Web – advanced services
• ethnio.com
• clicktale.com

ethn.io

Recruiting people directly from a website
Procedure:
– Set up a screener in your ethnio.com profile
– Set up your website to display the screener
– A website visitor will see the screener
– If the visitor responds, you will be notified immediately
– You contact the person by telephone / e-mail

Such participants are likely to be in your target group.

ethn.io – Example of Screener


Remote Testing: Recruitment

Specific requirements
– The users must be able to install:
• The software that is to be tested
• The tools used for the test
– The task sheet for the participants must be more specific
• There is no moderator at their location

Consent solicitation
– Participants can’t physically sign a sheet of paper
– By voice, saying “Yes, I agree.”
– By clicking “I agree” on the screener form

Remote Testing: Technology to use

Teleconferencing
– Skype
– Screen capture and streaming
• VNC
• Remote Desktop in MS Windows

Carry out the Test

During the test:
– Confirm user profile eligibility
– Ask for permission to record the session
– Limit moderator intrusion
– Encourage thinking aloud
– Take notes
– Deliver incentive/payment
– Have fun

Analysis & Reports

• During tests: track all usability issues
• After each test: compare notes & analyze
• After all tests: summarize patterns & major problems
• Set up report & sample videos
• Communicate to all stakeholders

Same place, different time

Data are physically acquired
Data are picked up later on

Examples:
– Customer satisfaction survey books
– Elections
– (Geocaching)

Different time, different place

Asynchronous
– Messages are passed between the testers and the participants
– The whole test can take a considerable amount of time due to communication delays between the testers and the participants
– Testers provide instructions
• Through a website / e-mail message
– Participants provide data
• By answering a questionnaire
• By monitored interaction with the product
– The data are aggregated automatically

Features
– Can be done automatically
– Good for quantitative data collection
– Good when there are a lot of participants (25 – 100)

Drawback
– We control the conditions even less

Questionnaire-based Testing

Questionnaire
– A set of questions
• With defined responses ([yes][no], [1][2][3][4][5], …)
• Open-ended questions
– The same questionnaire administered to all participants
– Easy to administer
• Point to a web form
• Send a structured e-mail
– Easy to process
• Automatic processing of the web forms
• Automatic processing of returned e-mails
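As an illustration of the "easy to process" point, the sketch below tallies answers to questions with defined responses. It assumes each returned form has already been parsed into a dictionary mapping a question ID to the chosen answer; the question IDs (`q1`, `q2`) are made up for the example.

```python
from collections import Counter, defaultdict


def tally_responses(submissions):
    """Count how often each defined answer was given, per question.

    `submissions` is a list of dicts such as {"q1": "yes", "q2": 4}.
    Questions a participant skipped are simply absent from their dict.
    """
    tallies = defaultdict(Counter)
    for submission in submissions:
        for question, answer in submission.items():
            tallies[question][answer] += 1
    return {question: dict(counts) for question, counts in tallies.items()}


def answer_share(tally, answer):
    """Fraction of respondents to one question who chose `answer`."""
    total = sum(tally.values())
    return tally.get(answer, 0) / total if total else 0.0
```

Open-ended answers do not fit this scheme and still need to be read and coded by hand.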


Not many people respond to questionnaires
– Need to “market” the study well

How to aim for a specific target group?
– The questionnaire should contain some screening questions
• The questionnaire contains a screener

Danger of …
– … self-selection!

SUMI

Software Usability Measurement Inventory
Measuring software quality from the user’s point of view
– “Quality of Use”

Input:
– The software or its prototype must exist
– 10 users minimum

Output:
– Five “grades”: Efficiency, Affect, Helpfulness, Control, Learnability
– Based on an existing database of gathered questionnaires
• Kept by the authors of SUMI

SUMI: Use

How it can be used:
– Assess new products during product evaluation
– Make comparisons between products or versions of products
– Set targets for future application developments

Able to test verifiable goals for quality of use
Track achievement of targets during product development
In a quantitative manner

Source: http://www.ucc.ie/hfrg/questionnaires/sumi/whatis.html

SUMI: Scales

Efficiency
– Tasks are completed by the user in a direct and timely manner

Affect
– How much the product captures the user’s emotional responses

Helpfulness
– The product seems to assist the user

Control
– Users feel that they set the pace, not the product (they are in control)

Learnability
– Ease with which the user can learn to use the software and/or new features

SUMI: Questionnaire

50 fixed and predefined questions, such as:
– “This software responds too slowly to inputs”
– “I would recommend this software to my colleagues”
– “The instructions and prompts are helpful”
– “I sometimes wonder if I am using the right command”
– “I think this software is consistent”

Responses to these questions:
– “Yes”
– “No”
– “Undecided”

SUMI: Processing

The mapping between questions and scales is not disclosed.
– SUMI is a commercial service
– Know-how of the authors

Procedure:
– Participants try the tested system
– Testers administer the SUMI questionnaires to the participants
– Responses to the questionnaires are sent to the authors of SUMI
• e-mail, web form, …
– Testers receive the grades from SUMI
• A nominal fee (hundreds of USD)

SUMI: Example Evaluation

(Figure: example SUMI results, with the reference value marked)

SUMI: Evaluation

The data provided are relative to the corpus of previously gathered data
– The values show the usability of the system compared to the reference score (50 on each scale)

The data can be used to compare two different systems
– Better score vs. worse score
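Once the grades have come back from the SUMI authors, both readings described above are simple arithmetic. The sketch below does not reproduce SUMI scoring, which is not disclosed; it only compares grades that have already been delivered. The function names are made up for illustration.

```python
REFERENCE_SCORE = 50  # reference value on each scale, as stated in the slides
SCALES = ("Efficiency", "Affect", "Helpfulness", "Control", "Learnability")


def deviation_from_reference(scores):
    """Per scale: how far a system sits above (+) or below (-) the reference."""
    return {scale: scores[scale] - REFERENCE_SCORE for scale in SCALES}


def better_system_per_scale(scores_a, scores_b):
    """Per scale: 'A' if system A scored higher, 'B' if lower, 'tie' if equal."""
    result = {}
    for scale in SCALES:
        if scores_a[scale] > scores_b[scale]:
            result[scale] = "A"
        elif scores_a[scale] < scores_b[scale]:
            result[scale] = "B"
        else:
            result[scale] = "tie"
    return result
```

With only a handful of users per system, small differences between two scores should be read with caution.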

Is it enough to provide unbiased and objective results?
– YES

Is it enough to give insights into particular problems?
– NO … we only have 5 numbers as an output
– We know nothing about the sources of errors

Automated User Testing


Federico M. Facca


Automating Usability Testing

Usability Testing
– a prototype or the final application is provided to a set of users, and the evaluator collects and analyzes usage data
– can be based on a set of predetermined tasks

What can be automated in such a method?
– capture of usage data
– analysis based on predefined metrics or a model

Usability Evaluation of:
– navigation
– functionalities

Capturing Data

Information
– easy to record but difficult to interpret (e.g., keystrokes)
– meaningful but difficult to label correctly (e.g., when can a task be considered completed?)

Method Type:
– Performance logging (e.g. events and time of occurrence, no evaluator)
– Remote testing (e.g. assigned task performed by user and monitored by evaluators)

Capturing Data – the Web – Server-side Logging

Web servers commonly log each user request

Available information is:
– IP address, request time, requested page, referrer

We can derive:
– Number of visitors
– Breakdown by countries
– Coverage by robots …
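To make the derivation concrete, here is a rough sketch that parses lines in the common Apache/NGINX "combined" log format and counts requests, distinct visitor addresses, and robot traffic. It assumes that format, treats any user agent containing "bot" as a robot (a crude heuristic), and equates one IP address with one visitor, which is only an approximation. A breakdown by country would additionally need an IP geolocation database and is left out.

```python
import re
from collections import Counter

# Combined log format:
# host ident user [time] "method page protocol" status bytes "referrer" "user-agent"
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<page>\S+)[^"]*" (?P<status>\d{3}) \S+ '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)


def parse_line(line):
    """Return the named fields of one log line, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    return match.groupdict() if match else None


def summarize(lines):
    """Derive simple traffic figures from an iterable of log lines."""
    entries = [e for e in (parse_line(line) for line in lines) if e]
    robots = [e for e in entries if "bot" in e["agent"].lower()]
    humans = [e for e in entries if "bot" not in e["agent"].lower()]
    return {
        "requests": len(entries),
        "unique_visitors": len({e["ip"] for e in humans}),  # distinct IPs only
        "robot_requests": len(robots),
        "page_hits": Counter(e["page"] for e in humans),
    }
```

Lines that do not match the pattern are skipped silently; a production script would probably want to count and report them.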

Server-side Logging

Pro
– huge quantity of easily available data
– does not require “ideal” users

Typical questions that we can answer:
– “Which content is interesting?”
– “Do people reach all content?”
• “Is all content necessary?” … which is not the same as:
• “Is the navigation good?”
– “Does the new design keep people longer on site?”
– “Does the new design make people buy more?”
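The question "Do people reach all content?" reduces to a set difference between the pages that exist and the pages that show up in the log. A minimal sketch, assuming you already have both lists (the page paths below are invented):

```python
def unreached_pages(site_pages, requested_pages):
    """Return pages that exist on the site but never appear in the log."""
    return sorted(set(site_pages) - set(requested_pages))


site = ["/", "/contact", "/staff/cv", "/news"]
seen = ["/", "/news", "/", "/contact"]
print(unreached_pages(site, seen))  # prints ['/staff/cv']
```

As the slide warns, an unreached page may be unnecessary or merely hard to find; the log alone cannot tell the two apart.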

Disadvantages:
– Highly quantitative method
– Almost no data on exact user interaction with the interface

Client-side Logging

Dedicated tools and settings
– The web client must be enhanced to log information on interaction
– The client pushes information into a repository on the testers’ server

Available information is:
– IP address, request time, requested page, referring page, mouse position on the screen, clicked links, back button…

Pro
– actual data of exact user interaction with the interface
– sessions are automatically reconstructed

Against:
– The participant must use this enhanced browser.
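Session reconstruction can be sketched as grouping logged events per client and starting a new session after a long pause. This is one common heuristic, not necessarily what any of the tools do; the 30-minute timeout and the `(client_id, timestamp, action)` event shape are assumptions made for the example.

```python
from collections import defaultdict

SESSION_TIMEOUT = 30 * 60  # seconds of inactivity; an assumed threshold


def reconstruct_sessions(events, timeout=SESSION_TIMEOUT):
    """Split events into sessions.

    `events` is an iterable of (client_id, timestamp_seconds, action) tuples
    in any order. Returns a list of (client_id, [(timestamp, action), ...]).
    """
    by_client = defaultdict(list)
    for client_id, timestamp, action in events:
        by_client[client_id].append((timestamp, action))

    sessions = []
    for client_id, items in by_client.items():
        items.sort(key=lambda item: item[0])  # order each client's events by time
        current = [items[0]]
        for previous, event in zip(items, items[1:]):
            if event[0] - previous[0] > timeout:
                # Gap too long: close the running session and start a new one.
                sessions.append((client_id, current))
                current = []
            current.append(event)
        sessions.append((client_id, current))
    return sessions
```

A client-side logger can usually attach an explicit session or participant ID, which makes this guesswork unnecessary.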


Tools

“Formal” Client Side User Tracking/Analysis
– Commercial tools
• ETHNIO (http://www.ethnio.com/)
• Ulog/Observer (http://www.noldus.com)
• UserZoom (http://www.userzoom.com)
• ClickTale (http://www.clicktale.com/)
• Usabilla (http://www.usabilla.com)
• Nielsen Eye Tracking (example in the next slides)
– Other tools (some are a bit old)
• WebQuilt (http://guir.berkeley.edu/projects/webquilt/)
• SCONE/TEA (http://www.scone.de/docus.html)
• NIST WebMetrics (http://zing.ncsl.nist.gov/WebTools/, not only for tracking and relative analysis)

“Informal” Client Side Tracking/Analysis
– Commercial tools
• Google Analytics (http://www.google.com/analytics/)
• Fireclick (http://www.fireclick.com/)
• SiteCatalyst (http://www.omniture.com/products/web_analytics)
• Hitslink (http://os.hitslink.com/)
• Crazy Egg (http://crazyegg.com), a nice example
• Usabilla (http://usabilla.com)
• … tons really
– Free tools
• Search with Google, you can find some

Server side analysis
– Again tons of solutions!

usabilla.com

Main principle
– Testers present a website screenshot
– Participants mark points on the screenshot according to tasks, e.g.:
• “Click on the element that you would remove from the page.”
– Comments can be added
– The testers can see the results in an aggregate form
• Individual points and comments are anonymous

Task: “Click on the element that you found most interesting.”
(Screenshot: responses from the user)

usabilla.com: Example

Object of the test
– Czech version of the home page of the DCGI website

Participants
– CTU students

Tasks
– Click on the element that you found most interesting
– Click on the elements that you like most
– Click on the elements that you would remove from the page
– Where would you look for contact information?
– Where would you look for CVs of the faculty members?

Recruitment
– 18 people asked to participate
– People who were online at the moment
– Via ICQ and Skype
– 10 users participated

Data acquired
– 95 points (for all tasks)
– 3 comments

Total time to carry out this test: 1 hour

usabilla.com: Example – All points (for all tasks)

usabilla.com: Example – Most interesting

usabilla.com: Example – What to remove

Comments


usabilla.com: Example – What to remove – Comments

The users marked almost all elements for removal
– Can we trust such data?
– We cannot assume that all the users preferred to remove all the elements
– We need to interpret this as possible dissatisfaction with the layout of the website
• This needs to be verified and made concrete
• A separate test (using a different method) needs to be applied

Actual responses from the participants:
– "osklive menu"
• (the menu is ugly)
– "mno nevim jestli odstranil, ale mail to moc nepripomina"
• (well, probably not really remove but it does not look like an e-mail)
– "no tohle je asi z nejakeho publikacniho systemu, me to trochu rozciluje"
• (this is probably from some CMS, makes me angry a bit)

How to interpret these?
– Only suggestions for further testing
– (Not enough data for conclusive statements)

usabilla.com: Example – Where to find faculty CVs?

Very nice and conclusive cluster.

usabilla.com: Example – Where to find contact info?

Most people would look here
But some assumed there were other ways

usabilla.com:

General rule:
– We should trust a point only if it is confirmed by multiple instances

Benefits:
– A rapid method of testing
– Very easy analysis of data

Drawbacks:
– No protection from “malevolent participants”
• “I’ll click you to death!”
– The motivation for placing the points is not always understood
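The general rule above can be approximated by binning the marked points into a grid and keeping only the cells that several points fall into. This is a simple stand-in for proper clustering, not how usabilla.com aggregates its data; the cell size of 50 pixels and the minimum of 3 points are arbitrary values chosen for the sketch.

```python
from collections import Counter


def confirmed_cells(points, cell_size=50, min_count=3):
    """Return grid cells that contain at least `min_count` marked points.

    `points` is an iterable of (x, y) pixel coordinates on the screenshot.
    The result maps (column, row) of each qualifying cell to its point count.
    """
    counts = Counter((x // cell_size, y // cell_size) for x, y in points)
    return {cell: n for cell, n in counts.items() if n >= min_count}
```

For example, `confirmed_cells([(10, 10), (20, 30), (45, 5), (300, 300)])` keeps only the cell `(0, 0)` with three points and drops the isolated click. A fixed grid can split a real cluster that straddles a cell border, so borderline cases are worth checking by eye. The threshold also does nothing against a single participant clicking the same spot repeatedly unless points are first de-duplicated per participant.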


SCONE / TEA

Support for Formal User Testing:
– task specification
– browser control
– user tracking according to task specification