remi: defect prediction for efficient api testing (esec/fse 2015, industrial track)

REMI: Defect Prediction for Efficient API Testing

ESEC/FSE 2015, Industrial track

September 3, 2015

Mijung Kim*, Jaechang Nam*, Jaehyuk Yeon+, Soonhwang Choi+, and Sunghun Kim*

*Department of Computer Science and Engineering, HKUST+Software R&D Center, Samsung Electronics CO., LTD

2

...

3

Motivation• Reliability of APIs

• Cost-intensive software quality assurance (QA) tasks at Samsung– Creating test cases for APIs– Testing APIs

4

Testing Phase at Samsung

Test Cases (TCs) Run

TCs

Passing

TestsFailin

g TestsFix

APIs

APIs

• Test Cases– 1 or 2 test cases for each API• Uniformly distributed across all APIs• Tedious task and not efficient

Modification and Enhancement

5

Goal

• Apply software defect prediction for the efficient API testing.

6

Proposed Approach: REMIRisk Evaluation Method for Interface testing

Test Cases (TCs) Run

TCs

Passing

TestsFailin

g TestsFix

APIs

APIs

REMI(API Rank)

Modification and Enhancement

7

Predict

Training

?

?

Model

API Project A

: Metric value: Buggy-labeled instance: Clean-labeled instance

?: Unlabeled instance

Software Defect Prediction

Related WorkMunson@TSE`92, Basili@TSE`95, Menzies@TSE`07,Hassan@ICSE`09, Bird@FSE`11,D’ambros@EMSE112Lee@FSE`11,...

8

ApproachREMI: Risk Evaluation Method for

Interface testing

CollectMetrics

Aggregate

Metrics

LabelBuggy/Clean

APIs

BuildPredictio

nModel

RankAPIs

API Ranks

SW Repository

DefectHisto

ry

Funtion-level metrics

CodeMetrics

(28)

AltCountLineBlank, AltCountLineCode, AltCountLineComment, CountInput, CountLine, CountLineBlank, CountLineCode, CountLineCodeDecl, CountLineCodeExe, CountLineComment, CountLineInactive, CountLinePreprocessor, CountOutput, CountPath, CountSemicolon, CountStmt, CountStmtDecl, CountStmtEmpty, CountStmtExe, Cyclomatic, CyclomaticModified, CyclomaticStrict, Essential, Knots, MaxEssentialKnots, MaxNesting, MinEssentialKnots, RatioCommentToCode

ProcessMetrics

(12)(Moser@ICSE`08)

numCommits, totalLocDeleted, avgLocDeleted, maxLocDeleted, totalLocAdded, avgLocAdded, maxLocAdded,totalLocChurn, avgLocChurn, maxLocChurn, totalDistinctAuthors, numFixes

Rank APIs in Package A

Probability

1 api_a 98%2 api_b 90%3 api_c 79%... ... ...25 api_y 40%26 api_z 30%Test

Failures

Depth 2(LoC=10)

Depth 1(LoC=7)

Depth 0(LoC=3)

my_api()

func_1()

func_3()

func_4()

func_2()

func_5()

9

Research Questions• Test Development Phase– Can writing test cases based on REMI

ranking identify additional/distinct defects?

• Test Execution Phase– Can test execution with REMI identify

defects more quickly than that without REMI?

10

Experimental Setup• Random Forest• Subject– Tizen-wearable APIs• Applied REMI for 36 functional packages

with about 1100 APIs– Release Candidates (RC)• RC2 to RC3

RCn-1 RCn

Build Rank

11

RESULT

12

Results for Test Development Phase

Detect additional defects after applying REMI

RC2 before REMI RC2 after REMI0

1

2

The number of additional defects by modifying and enhancing test cases

13

Results for Test Development Phase

Detect additional defects after applying REMI

RC3 before REMI RC3 after REMI0

1

2

The number of additional defects by modifying and enhancing test cases

14

Results for Test Execution Phase

Quickly detect defects with the same number of test cases

Version REMIBug Detection Ability

Test Run DefectsDetected

Detection Rate

RC2without

REMI 873 6.5 0.74%

with REMI 873 18 2.06%

RC3without

REMI 845 8.1 0.96%

with REMI 845 9 1.07%

Test cases from the selected 30% of APIs

15

Developer Feedbacks• “The list of risky APIs provided before

conducting QA activities is helpful for testers to allocate their testing effort efficiently, especially with tight time constraints.”

• “In the process of applying REMI, overheads arise during the tool configuration and executions (approximately 1 to 1.5 hours).”

• “It is difficult to collect the defect information to label buggy/clean APIs without noise.”

16

Conclusion• REMI– Defect Prediction is helpful in

industry• Test Case Development

– Could identify additional defects by developing new test cases for risky APIs.

• Test Execution Phase– Identify defects more quickly and efficiently

when with the tight time constraints.