remi: defect prediction for efficient api testing (esec/fse 2015, industrial track)

16
REMI: Defect Prediction for Efficient API Testing ESEC/FSE 2015, Industrial track September 3, 2015 Mijung Kim*, Jaechang Nam* , Jaehyuk Yeon + , Soonhwang Choi + , and Sunghun Kim* *Department of Computer Science and Engineering, HKUST + Software R&D Center, Samsung Electronics CO., LTD

Upload: sung-kim

Post on 10-Apr-2017

1.195 views

Category:

Software


0 download

TRANSCRIPT

Page 1: REMI: Defect Prediction for Efficient API Testing (ESEC/FSE 2015, Industrial track)

REMI: Defect Prediction for Efficient API Testing

ESEC/FSE 2015, Industrial track

September 3, 2015

Mijung Kim*, Jaechang Nam*, Jaehyuk Yeon+, Soonhwang Choi+, and Sunghun Kim*

*Department of Computer Science and Engineering, HKUST+Software R&D Center, Samsung Electronics CO., LTD

Page 2: REMI: Defect Prediction for Efficient API Testing (ESEC/FSE 2015, Industrial track)

2

...

Page 3: REMI: Defect Prediction for Efficient API Testing (ESEC/FSE 2015, Industrial track)

3

Motivation• Reliability of APIs

• Cost-intensive software quality assurance (QA) tasks at Samsung– Creating test cases for APIs– Testing APIs

Page 4: REMI: Defect Prediction for Efficient API Testing (ESEC/FSE 2015, Industrial track)

4

Testing Phase at Samsung

Test Cases (TCs) Run

TCs

Passing

TestsFailin

g TestsFix

APIs

APIs

• Test Cases– 1 or 2 test cases for each API• Uniformly distributed across all APIs• Tedious task and not efficient

Modification and Enhancement

Page 5: REMI: Defect Prediction for Efficient API Testing (ESEC/FSE 2015, Industrial track)

5

Goal

• Apply software defect prediction for the efficient API testing.

Page 6: REMI: Defect Prediction for Efficient API Testing (ESEC/FSE 2015, Industrial track)

6

Proposed Approach: REMIRisk Evaluation Method for Interface testing

Test Cases (TCs) Run

TCs

Passing

TestsFailin

g TestsFix

APIs

APIs

REMI(API Rank)

Modification and Enhancement

Page 7: REMI: Defect Prediction for Efficient API Testing (ESEC/FSE 2015, Industrial track)

7

Predict

Training

?

?

Model

API Project A

: Metric value: Buggy-labeled instance: Clean-labeled instance

?: Unlabeled instance

Software Defect Prediction

Related WorkMunson@TSE`92, Basili@TSE`95, Menzies@TSE`07,Hassan@ICSE`09, Bird@FSE`11,D’ambros@EMSE112Lee@FSE`11,...

Page 8: REMI: Defect Prediction for Efficient API Testing (ESEC/FSE 2015, Industrial track)

8

ApproachREMI: Risk Evaluation Method for

Interface testing

CollectMetrics

Aggregate

Metrics

LabelBuggy/Clean

APIs

BuildPredictio

nModel

RankAPIs

API Ranks

SW Repository

DefectHisto

ry

Funtion-level metrics

CodeMetrics

(28)

AltCountLineBlank, AltCountLineCode, AltCountLineComment, CountInput, CountLine, CountLineBlank, CountLineCode, CountLineCodeDecl, CountLineCodeExe, CountLineComment, CountLineInactive, CountLinePreprocessor, CountOutput, CountPath, CountSemicolon, CountStmt, CountStmtDecl, CountStmtEmpty, CountStmtExe, Cyclomatic, CyclomaticModified, CyclomaticStrict, Essential, Knots, MaxEssentialKnots, MaxNesting, MinEssentialKnots, RatioCommentToCode

ProcessMetrics

(12)(Moser@ICSE`08)

numCommits, totalLocDeleted, avgLocDeleted, maxLocDeleted, totalLocAdded, avgLocAdded, maxLocAdded,totalLocChurn, avgLocChurn, maxLocChurn, totalDistinctAuthors, numFixes

Rank APIs in Package A

Probability

1 api_a 98%2 api_b 90%3 api_c 79%... ... ...25 api_y 40%26 api_z 30%Test

Failures

Depth 2(LoC=10)

Depth 1(LoC=7)

Depth 0(LoC=3)

my_api()

func_1()

func_3()

func_4()

func_2()

func_5()

Page 9: REMI: Defect Prediction for Efficient API Testing (ESEC/FSE 2015, Industrial track)

9

Research Questions• Test Development Phase– Can writing test cases based on REMI

ranking identify additional/distinct defects?

• Test Execution Phase– Can test execution with REMI identify

defects more quickly than that without REMI?

Page 10: REMI: Defect Prediction for Efficient API Testing (ESEC/FSE 2015, Industrial track)

10

Experimental Setup• Random Forest• Subject– Tizen-wearable APIs• Applied REMI for 36 functional packages

with about 1100 APIs– Release Candidates (RC)• RC2 to RC3

RCn-1 RCn

Build Rank

Page 11: REMI: Defect Prediction for Efficient API Testing (ESEC/FSE 2015, Industrial track)

11

RESULT

Page 12: REMI: Defect Prediction for Efficient API Testing (ESEC/FSE 2015, Industrial track)

12

Results for Test Development Phase

Detect additional defects after applying REMI

RC2 before REMI RC2 after REMI0

1

2

The number of additional defects by modifying and enhancing test cases

Page 13: REMI: Defect Prediction for Efficient API Testing (ESEC/FSE 2015, Industrial track)

13

Results for Test Development Phase

Detect additional defects after applying REMI

RC3 before REMI RC3 after REMI0

1

2

The number of additional defects by modifying and enhancing test cases

Page 14: REMI: Defect Prediction for Efficient API Testing (ESEC/FSE 2015, Industrial track)

14

Results for Test Execution Phase

Quickly detect defects with the same number of test cases

Version REMIBug Detection Ability

Test Run DefectsDetected

Detection Rate

RC2without

REMI 873 6.5 0.74%

with REMI 873 18 2.06%

RC3without

REMI 845 8.1 0.96%

with REMI 845 9 1.07%

Test cases from the selected 30% of APIs

Page 15: REMI: Defect Prediction for Efficient API Testing (ESEC/FSE 2015, Industrial track)

15

Developer Feedbacks• “The list of risky APIs provided before

conducting QA activities is helpful for testers to allocate their testing effort efficiently, especially with tight time constraints.”

• “In the process of applying REMI, overheads arise during the tool configuration and executions (approximately 1 to 1.5 hours).”

• “It is difficult to collect the defect information to label buggy/clean APIs without noise.”

Page 16: REMI: Defect Prediction for Efficient API Testing (ESEC/FSE 2015, Industrial track)

16

Conclusion• REMI– Defect Prediction is helpful in

industry• Test Case Development

– Could identify additional defects by developing new test cases for risky APIs.

• Test Execution Phase– Identify defects more quickly and efficiently

when with the tight time constraints.