components - graph based detection of library api limitations

Graph-based Detection of Library API Imitations. October 6, 2011. Chengnian Sun, Siau-Cheng Khoo, Shao Jie Zhang, National University of Singapore

Upload: icsm-2011

DESCRIPTION

Paper: Graph-based Detection of Library API Imitations. Authors: Chengnian Sun, Siau-Cheng Khoo, Shao Jie Zhang (all from National University of Singapore). Session: Research Track Session 7: Component

TRANSCRIPT

Page 1: Components - Graph Based Detection of Library API Limitations

Graph-based Detection of

Library API Imitations

October 6, 2011

Chengnian Sun, Siau-Cheng Khoo, Shao Jie Zhang

National University of Singapore

Page 2: Components - Graph Based Detection of Library API Limitations

Motivation – Software Libraries

Common practice to employ 3rd-party software libraries

Providing certain functionalities / hiding implementation details

Improving productivity

Well tested

Enhancing program quality

Application Programming Interfaces (APIs)

Exported by libraries

Ways for programmers to interact with libraries

Page 3: Components - Graph Based Detection of Library API Limitations

Motivation – Problem

APIs are not always effectively used by programmers

Imitation: client code re-implements the behavior of library APIs

Reasons

Unfamiliarity with the library

Library evolution

Cost

Wasted resources, time and energy

Error-prone code and software maintenance issues

Page 6: Components - Graph Based Detection of Library API Limitations

Motivation – Example from JBoss

Imitation (1): method.getInterceptors() == null || method.getInterceptors().length < 1

API: return (interceptors != null && interceptors.length > 0)

Page 7: Components - Graph Based Detection of Library API Limitations

Motivation – Example from JBoss

Imitation (1): method.getInterceptors() == null || method.getInterceptors().length < 1

Refactor to: !method.hasAdvices()
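The relationship between the imitation and the API can be checked with a small sketch. Only getInterceptors(), hasAdvices() and the two expressions come from the slides; the MethodInfo class below is a hypothetical stand-in for the JBoss type, and the helper names are made up for illustration.

```java
/** Hypothetical stand-in for the JBoss class; only the two methods named on the slides. */
class MethodInfo {
    private final Object[] interceptors;

    MethodInfo(Object[] interceptors) {
        this.interceptors = interceptors;
    }

    Object[] getInterceptors() {
        return interceptors;
    }

    // Library API, with the body shown on the slide.
    boolean hasAdvices() {
        return (interceptors != null && interceptors.length > 0);
    }
}

class ImitationExample {
    // Client-side imitation: re-implements the negation of hasAdvices().
    static boolean noAdvicesImitation(MethodInfo method) {
        return method.getInterceptors() == null
            || method.getInterceptors().length < 1;
    }

    // Refactored client code: delegates to the library API.
    static boolean noAdvicesRefactored(MethodInfo method) {
        return !method.hasAdvices();
    }
}
```

Both helpers agree for a null array, an empty array and a non-empty array, which is why the refactoring preserves behavior in this sketch.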

Page 10: Components - Graph Based Detection of Library API Limitations

Motivation

A library API imitation can be

Not exactly the same as the API implementation

Inter-procedural

Goal: to accurately detect such imitations

Page 11: Components - Graph Based Detection of Library API Limitations

Detection of Library API Imitations

Motivation

Definitions

Data Dependency Graph

Trace & Subtrace

Trace Subsumption

Potential Imitation

Algorithms

Pre-processing & Post-validation

Case Studies

Conclusion

Page 12: Components - Graph Based Detection of Library API Limitations

Definitions – Overview

Employing Data Dependency Graphs (DDGs) to represent code

Semantic representation

Capturing data flows within a method

Carrying a portion of control flow information

A library DDG is trace-subsumed by a client DDG ⇒ potential API imitation

Relaxation of sub-graph isomorphism

More efficient

Minor-difference tolerant

Page 13: Components - Graph Based Detection of Library API Limitations

Definitions – Data Dependency Graph

DDG – a graphical representation of a method

Vertices: basic statements (three address form)

Edges v → u: direction represents data dependency

Vertex u is data dependent on vertex v if a variable var is defined at v and used at u, and there is an execution path P from v to u along which var is not redefined.
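A minimal sketch of this definition, restricted to straight-line code where there is only one execution path, so "not redefined along P" reduces to "the most recent definition". The Stmt type and the build method are hypothetical names, not taken from the talk; branches and loops would need a full reaching-definitions analysis instead.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class StraightLineDdg {
    /** One three-address statement: the variable it defines (may be null) and the variables it uses. */
    static final class Stmt {
        final String def;
        final List<String> uses;

        Stmt(String def, String... uses) {
            this.def = def;
            this.uses = Arrays.asList(uses);
        }
    }

    /** Returns edges {v, u}, meaning statement u is data dependent on statement v. */
    static List<int[]> build(List<Stmt> stmts) {
        List<int[]> edges = new ArrayList<>();
        Map<String, Integer> lastDef = new HashMap<>(); // variable -> index of its latest definition
        for (int u = 0; u < stmts.size(); u++) {
            Stmt s = stmts.get(u);
            for (String var : s.uses) {
                Integer v = lastDef.get(var);
                if (v != null) {
                    edges.add(new int[] { v, u }); // var defined at v, used at u, not redefined in between
                }
            }
            if (s.def != null) {
                lastDef.put(s.def, u); // later uses see this definition only
            }
        }
        return edges;
    }
}
```

For the sequence t1 = a + b; t2 = t1 * 2; t1 = c; r = t1 + t2, the sketch yields edges 0 → 1, 2 → 3 and 1 → 3. There is no edge 0 → 3 because t1 is redefined at statement 2.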

Page 16: Components - Graph Based Detection of Library API Limitations

Definitions – Trace & Subtrace

A trace in a data dependency graph

A path of vertices, <v1, v2, …, vm>

The first vertex is an entry of the graph

Given two traces T1 = <v1, v2, …, vm> and T2 = <u1, u2, …, un>, T1 is a subtrace of T2 (T1 ≤ T2) if there exists an integer i, 0 ≤ i ≤ n – m, such that match(v1, u_{1+i}), match(v2, u_{2+i}), …, match(vm, u_{m+i})

The subtrace relation is a generalization of the substring relation.

Example: T1 = <C, D, E> is a subtrace of T2 = <A, B, C, D, E, F> with i = 2
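The subtrace definition translates almost directly into code. This is a sketch under one assumption: the slides do not spell out what match compares, so it is passed in as a predicate. The class and method names are hypothetical.

```java
import java.util.List;
import java.util.function.BiPredicate;

class Subtrace {
    /** True if t1 occurs in t2 as a contiguous run, comparing vertices with the given match predicate. */
    static <V> boolean isSubtrace(List<V> t1, List<V> t2, BiPredicate<V, V> match) {
        int m = t1.size();
        int n = t2.size();
        for (int i = 0; i <= n - m; i++) { // candidate offset, 0 <= i <= n - m
            boolean all = true;
            for (int k = 0; k < m && all; k++) {
                all = match.test(t1.get(k), t2.get(k + i)); // 0-based form of match(v_k, u_{k+i})
            }
            if (all) {
                return true;
            }
        }
        return false;
    }
}
```

With String::equals as the predicate, <C, D, E> is a subtrace of <A, B, C, D, E, F> (offset 2), while <C, E> is not, because the match must be contiguous.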

Page 17: Components - Graph Based Detection of Library API Limitations

Definitions – Trace Subsumption

A data dependency graph Glib

A data dependency graph Gclt

Gclt trace-subsumes Glib if and only if for each trace Tlib of Glib there exists at least one trace Tclt of Gclt such that Tlib is a subtrace of Tclt
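A brute-force reading of trace subsumption, as a sketch only: it enumerates every maximal trace of both graphs and compares vertex labels for equality. This is not the depth-first algorithm presented later in the talk. The Ddg class is hypothetical, graphs are assumed acyclic, and match is assumed to be label equality.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class NaiveTraceSubsumption {
    /** A DDG as vertex labels plus successor lists; assumed acyclic in this sketch. */
    static final class Ddg {
        final String[] labels;
        final int[][] succ;

        Ddg(String[] labels, int[][] succ) {
            this.labels = labels;
            this.succ = succ;
        }
    }

    /** Label sequences of all maximal traces (from an entry vertex to a vertex without successors). */
    static List<List<String>> maximalTraces(Ddg g) {
        boolean[] hasPred = new boolean[g.labels.length];
        for (int[] targets : g.succ) {
            for (int t : targets) {
                hasPred[t] = true;
            }
        }
        List<List<String>> out = new ArrayList<>();
        for (int v = 0; v < g.labels.length; v++) {
            if (!hasPred[v]) { // entry vertex: no incoming dependency edge
                extend(g, v, new ArrayList<>(), out);
            }
        }
        return out;
    }

    private static void extend(Ddg g, int v, List<String> path, List<List<String>> out) {
        path.add(g.labels[v]);
        if (g.succ[v].length == 0) {
            out.add(new ArrayList<>(path));
        }
        for (int next : g.succ[v]) {
            extend(g, next, path, out);
        }
        path.remove(path.size() - 1);
    }

    /** True if every library trace is a subtrace of at least one client trace. */
    static boolean traceSubsumes(Ddg client, Ddg library) {
        List<List<String>> clientTraces = maximalTraces(client);
        for (List<String> lib : maximalTraces(library)) {
            boolean found = false;
            for (List<String> clt : clientTraces) {
                if (Collections.indexOfSubList(clt, lib) >= 0) { // contiguous occurrence
                    found = true;
                    break;
                }
            }
            if (!found) {
                return false;
            }
        }
        return true;
    }
}
```

Checking only maximal traces is enough here: any prefix of a matched library trace is matched at the same offset. The number of traces can grow exponentially with graph size, so this form is only useful for illustrating the definition.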

Page 18: Components - Graph Based Detection of Library API Limitations

Definitions – Potential Imitation

A client method Clt potentially imitates a library method Lib if there exist

A DDG Gclt of Clt, resulting from inlining zero or more method calls into Clt

A DDG Glib of Lib, resulting from inlining zero or more method calls into Lib

such that Gclt trace-subsumes Glib

Page 19: Components - Graph Based Detection of Library API Limitations

Detection of Library API Imitations

Motivation

Definitions

Algorithms

Overall Algorithm

Trace Subsumption Checking

Pre-processing & Post-validation

Case Studies

Conclusion

Page 20: Components - Graph Based Detection of Library API Limitations

Algorithms – Overall Algorithm

Input

A library API Lib

A client method Clt

A set S of all method calls in both Lib and Clt

Output: true if Clt potentially imitates Lib

Body

for each subset s of S {
    Lib’ = a copy of Lib with calls in s inlined
    Clt’ = a copy of Clt with calls in s inlined
    if the DDG of Clt’ trace-subsumes the DDG of Lib’
        return true
}
return false
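The subset loop above can be sketched with a bitmask enumeration. The inlining and DDG steps are not reproduced: they are folded into a caller-supplied predicate, subsumesAfterInlining, which is a hypothetical placeholder for "inline the calls in s into copies of Lib and Clt, build both DDGs, and test trace subsumption".

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Predicate;

class ImitationSearch {
    /**
     * Tries every subset s of the given calls and returns true as soon as one
     * inlining choice makes the check succeed. Assumes fewer than 31 calls so
     * that a subset fits in an int bitmask.
     */
    static <C> boolean potentiallyImitates(List<C> calls, Predicate<Set<C>> subsumesAfterInlining) {
        int n = calls.size();
        for (int mask = 0; mask < (1 << n); mask++) {
            Set<C> s = new LinkedHashSet<>();
            for (int k = 0; k < n; k++) {
                if ((mask & (1 << k)) != 0) {
                    s.add(calls.get(k)); // call k is inlined in this candidate
                }
            }
            if (subsumesAfterInlining.test(s)) {
                return true;
            }
        }
        return false;
    }
}
```

The loop visits 2^n subsets for n calls, so the cost of this formulation grows quickly with the number of call sites.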

Page 21: Components - Graph Based Detection of Library API Limitations

Algorithms – Trace Subsumption

Input

A DDG of a library API Glib

A DDG of a client method Gclt

Output

true if Gclt trace subsumes Glib

Depth-first search with step-by-step checking

Page 22: Components - Graph Based Detection of Library API Limitations

Algorithms – An Example

[Figure: library and client data dependency graphs for the running example; not preserved in the transcript]

Stack: (empty)

Current: (empty)

Page 23: Components - Graph Based Detection of Library API Limitations

Algorithms – An Example

Locating all vertices in client matching each entry of the library

Stack: (A, {A, A})

Current: (empty)

Page 24: Components - Graph Based Detection of Library API Limitations

Algorithms – An Example

Locating client vertices matching library A’s successor D

Stack: (empty)

Current: (A, {A, A})

Page 25: Components - Graph Based Detection of Library API Limitations

Algorithms – An Example

Locating client vertices matching library A’s successor D

Stack: (D, {D})

Current: (A, {A, A})

Page 26: Components - Graph Based Detection of Library API Limitations

Algorithms – An Example

Locating client vertices matching library A’s successor B

Stack: (D, {D})

Current: (A, {A, A})

Page 27: Components - Graph Based Detection of Library API Limitations

Algorithms – An Example

Locating client vertices matching library A’s successor B

Stack (top first): (B, {B}), (D, {D})

Current: (A, {A, A})

Page 28: Components - Graph Based Detection of Library API Limitations

Algorithms – An Example

Locating client vertices matching library B’s successors (B has none)

Stack: (D, {D})

Current: (B, {B})

Page 29: Components - Graph Based Detection of Library API Limitations

Algorithms – An Example

Locating client vertices matching library D’s successor M

Stack: (empty)

Current: (D, {D})

Page 30: Components - Graph Based Detection of Library API Limitations

Detection of Library API Imitations

Motivation

Definitions

Algorithms

Pre-processing & Post-validation

Case Studies

Conclusion

Page 31: Components - Graph Based Detection of Library API Limitations

Pre-processing Libraries

Remove nullness checks

if (a == null) {
    return Constant;
} else {
    a.XXX();
}

Remove assertions

if (…)
    throw new Exception();
…

Remove exception handlers

try {
} catch (…) {}

Page 32: Components - Graph Based Detection of Library API Limitations

Post-validating Reported Imitations

Reject the following two cases

Unmatched Inlined Vertices in Client

Matching All References to Library Locals

Page 34: Components - Graph Based Detection of Library API Limitations

Case Studies

Evaluation measure

Subjects – 10 open-source Java projects

Testbed: Intel Core 2 Quad CPU at 3.00 GHz with 8 GB memory

Page 36: Components - Graph Based Detection of Library API Limitations

Case Studies – Two Experiments

Detecting Imitations of Imported Libraries

Testing all method pairs (lib, clt), where the declaring class of lib is already imported in the client class

Precision = 313 / 383 = 82%

Runtime = 314 seconds

Detecting Imitations of Static Libraries

Testing all method pairs (lib, clt), where lib is a public static method

Precision = 116 / 155 = 75%

Runtime = 396 seconds
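The two percentages can be reproduced from the reported counts. The sketch assumes the usual reading of precision, with the numerator as the confirmed imitations and the denominator as all reported ones; the class and method names are hypothetical.

```java
class Precision {
    /** Precision as a whole-number percentage: confirmed imitations over reported imitations. */
    static long percent(int confirmed, int reported) {
        return Math.round(100.0 * confirmed / reported);
    }
}
```

Precision.percent(313, 383) gives 82 and Precision.percent(116, 155) gives 75, matching the rounded figures on the slide.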

Page 37: Components - Graph Based Detection of Library API Limitations

Case Studies – Example of Static API

Page 38: Components - Graph Based Detection of Library API Limitations

Conclusion

Employing 3rd-party software libraries is common practice

Client code re-implements behavior of existing APIs

An algorithm based on data dependency graphs to detect complex imitations

Average precision: 82% (imported libraries) and 75% (static libraries)

Page 39: Components - Graph Based Detection of Library API Limitations

Thank you.

Q&A
