an initial study on ideal gui test case replayability · –we take test cases built for the first...

An Initial Study on Ideal GUI Test Case Replayability

Arthur-Jozsef Molnar Babes-Bolyai University

• GUI applications are ubiquitous

–Lots of tools for design

–Lots of platforms to choose from

–Lots of widget toolkits

–Great testing processes

An Initial Study on Ideal GUI Test Case Replayability 2

What’s the problem?

• GUI applications are ubiquitous

–Lots of tools for design

–Lots of platforms to choose from

–Lots of widget toolkits

–Great testing processes


What’s the problem?

• Capture/Replay tools – Marathon

• http://www.marathontesting.com/

– Abbot • http://abbot.sourceforge.net/doc/overview.shtml

– Pounder • http://pounder.sourceforge.net/

• Model based approaches – Automated testing using GUITAR[1]

• Reverse engineering the target GUI

• Creating Test Cases • Running Test Cases • Checking the outcome ?


Existing solutions…

[1] GUITAR framework - http://sourceforge.net/projects/guitar/

FreeMind application - http://freemind.sourceforge.net/wiki/index.php/Main_Page

GUIs change…

• Using genetic algorithms to repair non-executable tests – Si Huang, Myra B. Cohen, and Atif M. Memon. Repairing GUI test suites using a genetic algorithm.

Proceedings of the 2010 Third International Conference on Software Testing, Verification and Validation, pages 245–254, Washington, DC, USA. IEEE Computer Society, 2010

• Inserting and removing events – Atif M. Memon. Automatically repairing event sequence-based GUI test suites for regression

testing. ACM Transactions Software Engineering Methodology, 18, 4:1–4:36, November 2008.

• Identifying functionally equivalent GUI elements and updating tests – Scott McMaster and Atif M. Memon. An extensible heuristic-based framework for GUI test case

maintenance. Proceedings of the IEEE International Conference on Software Testing, Verification, and Validation Workshops, pages 251–254, Washington DC, USA. IEEE Computer Society, 2009


Reusing GUI test cases

• Proposed solutions …

– Using genetic algorithms to repair non-executable tests

– Inserting and removing events

– Identifying functionally equivalent GUI elements and updating tests

No turnkey solution … yet!


Reusing GUI test cases

How many test case can be repaired and …

how well

?


The catch

1. We use 28 versions of two well established open source applications – FreeMind. – jEdit.

2. We build information on functionally equivalent GUI widgets between consecutive versions. – Our oracle information.

3. We generate test cases of length 2, 3, 4 and 5. – A test case is a sequence of GUI interactions

4. We partition test cases into one of four categories. – We simulate test case execution due to computation constraints

5. We assess the results – Cross-sectional study – Longitudinal study


Our study

• 13 versions studied • November 2000 – September 2007


1. The applications - FreeMind • Taken from project CVS

• 3500 – 65K lines of code

• 13.5 million downloads

• 17 versions studied

• January 2000 – May 2010


1. The applications - jEdit • Versions 2.3pre2 – 4.3.2

• 23K – 106K lines of code

• 6.5 million downloads


2. Functionally equivalent widgets

Examples of functionally equivalent

widgets


GUI Element properties used

Name Description

Accelerator

The shortcut key combination used to trigger the binded action

Class The class implementing the widget

Icon The path to the widget’s icon

Text The text associated with the widget

Title The widget’s title.

X

The X coordinate of the widget’s upper corner in its containg window

Y The Y coordinate of the widget’s upper corner in its containg window

Width The widget’s width

Height The widget’s height

Index The widget’s index in the parent container



Name Description

Parent The Modes menu

Children -

Class javax.swing.JMenuItem

Accelerator Alt+3

Title -

Icon -

X 319

Y 45

Width 129

Height 21

Index 0

Name Description

Parent The javax.swing.JToolBar element

Children -

Class org.gjt.sp.jedit.gui.EnhancedButton

Accelerator -

Title -

Icon org/gjt/sp/jedit/icons/Cut24.gif

X 196

Y 47

Width 30

Height 30

Index 7


• All length-2 test cases (event interaction)

• 10.000 length-3, length-4 and length-5 test cases (each)

• Grand total: – 404.826 FreeMind test cases

– 564.869 jEdit test cases

• Test cases generated and replay simulated for all subsequent versions 3 times => ~ 24 million test case execution simulations


3. Generated test cases

• Replayable using widget Id – Widget Id computed by some of its properties – Used by GUITAR – Easy to implement but error prone – Sadly, the state of the art

• Replayable after repair – Perfectly replayable

• Repairable – Not the same intermediary steps

• Unrepairable – Some test steps cannot be replicated – Might still be usable – No correspondance with initial test case


4. Test case replayability

• Cross-sectional – We use consecutive application versions

– Test replay is simulated and test cases classified

– Can reveal version pairs that have been heavily changed

• Longitudinal – We take test cases built for the first studied version

– Simulate their execution on all subsequent versions

– Classify them

– A true measure of test case replayability


4. Study approaches


4. Cross-section results - FreeMind

0%

10%

20%

30%

40%

50%

60%

70%

80%

90%

100%

2000.112000.12

2000.122001.01

2001.012001.04

2001.042001.05

2001.052001.06

2001.062001.07

2001.072001.08

2001.082003.12

2003.122004.01

2004.012004.02

2004.022004.03

2004.032007.09

Replayable using widget Id Replayable after repair Repairable Unrepairable


4. Cross-section results - jEdit

0%

10%

20%

30%

40%

50%

60%

70%

80%

90%

100%

2.3pre22.3final

2.3final2.4final

2.4final2.5pre5

2.5pre52.5final

2.5final2.6pre7

2.6pre72.6final

2.6final3.0final

3.0final3.1pre1

3.1pre13.1pre3

3.1pre33.1final

3.1final3.2final

3.2final4.0final

4.0final4.2pre2

4.2pre24.2final

4.2final4.3.0final

4.3.0final4.3.2final

Replayable using widget Id Replayable after repair Repairable Unrepairable


4. Longitudinal results - FreeMind

0%

10%

20%

30%

40%

50%

60%

70%

80%

90%

100%

November2000

December2000

January 2001 April 2001 May 2001 June 2001 July 2001 August 2001 December2003

January 2004 February2004

March 2004

ID Replayable Replayable Repairable Unrepairable


4. Longitudinal results - jEdit

0%

10%

20%

30%

40%

50%

60%

70%

80%

90%

100%

ID Replayable Replayable Repairable Unrepairable

• What affects test replayability?

– Major GUI changes

– Test case age

– Test case length

• Most test cases can be at least repaired


4. Conclusions

• A more comprehensive study

– More versions

– More applications

• .NET, Qt, SWT, mobile …

• Run test cases, not just simulate them


4. Future steps


Thank you !

an initial study on ideal gui test case replayability · –we take test cases built for the first...

Documents