lase : locating and applying systematic edits by learning from examples

31
Lase: Locating and Applying Systematic Edits by Learning from Examples Na Meng* Miryung Kim* Kathryn S. McKinley* + The University of Texas at Austin* Microsoft Research +

Upload: arella

Post on 23-Feb-2016

47 views

Category:

Documents


0 download

DESCRIPTION

Lase : Locating and Applying Systematic Edits by Learning from Examples. Na Meng * Miryung Kim* Kathryn S. McKinley* + The University of Texas at Austin* Microsoft Research +. Motivating Scenario. Pat needs to update database transaction code to prevent SQL injection attacks. - PowerPoint PPT Presentation

TRANSCRIPT

Page 1: Lase : Locating and Applying Systematic Edits by Learning from Examples

Lase: Locating and Applying Systematic Edits by Learning from Examples

Na Meng* Miryung Kim* Kathryn S. McKinley*+

The University of Texas at Austin*Microsoft Research+

Page 2: Lase : Locating and Applying Systematic Edits by Learning from Examples

2

Motivating Scenario

Aold

Anew

Bold

Bnew

Cold

Cnew

Pat needs to update database transaction code to prevent SQL injection attacks

Page 3: Lase : Locating and Applying Systematic Edits by Learning from Examples

3

Systematic Editing

• Similar but not identical changes to multiple contexts

• Manual, tedious, and error-prone• Source transformation tools require describing

edits in a formal language[CHP91, ER02, LR95] • Bug fixing tools handle simple stylized code

changes[WNGF09, JZDLL12, SMS13]• Sydit does not find edit locations automatically

when applying systematic edits[MKM11]

Page 4: Lase : Locating and Applying Systematic Edits by Learning from Examples

4

Lase Workflow

Dold

Dsuggested

LASE selects methods & suggests edits

Aold

Anew

Bold

Bnew

User selects examples

Iold

Isuggested

Xold

Xsuggested

… …

Page 5: Lase : Locating and Applying Systematic Edits by Learning from Examples

5

Dnew

Syntactic program differencing

Aold

Apply edit script

Identify common edit

Generalize identifier

Bold

Anew

Bnew

Extract context

Find edit location

Object next = e.next();

Object next = iter.next();

Object next = v$0.next();

Approach Overview

✗ no match

✔ match

DoldColdPhase: I. Create edit script II. III.

Page 6: Lase : Locating and Applying Systematic Edits by Learning from Examples

6

Step 1. Syntactic Program Differencing

operation definitioninsert(u, v, k) insert node u and position it as the (k+1)th

child of node vdelete(u) delete node uupdate(u, v) replace u with vmove(u, v, k) delete u from its current position and insert

u as the (k+1)th child of v

Input: mold, mnew

Output: Edit operations

Page 7: Lase : Locating and Applying Systematic Edits by Learning from Examples

7

Step 2: Identify Common Edit

• Longest Common Edit Operation Subsequence

insert(Object next = e.next()…)insert(if(next instanceof MVAction)insert(((MVAction)next).update())update(print(next.toString())) to …

insert(Object next = iter.next()…)update(print(next.getString())) to …insert(if(next instanceof MVAction)insert(((MVAction)next).update())delete(System.out.println(…))

insert(Object next = e.next()…)insert(if(next instanceof MVAction))insert(((MVAction)next).update())

insert(Object next = iter.next()…)insert(if(next instanceof MVAction))insert(((MVAction)next).update())

Edit script A Edit script B

Page 8: Lase : Locating and Applying Systematic Edits by Learning from Examples

8

Step 3: Generalize Identifier

• Keep the original type, method, and variable names if examples agree

• Abstract identifiers if examples disagree

8

Generalized name

Name in mA

Namein mB

Variable Map next next nextv$0 e iter

Method Map next next nextType Map Object Object Object

Iterator Iterator Iterator

Object next = e.next(); Object next = iter.next();

Object next = v$0.next();

Page 9: Lase : Locating and Applying Systematic Edits by Learning from Examples

9

Step 4: Extract Context

Object next = e.next(); Object next = iter.next();

Iterator e = fActions.values().iterator();… …while(e.hasNext())

Iterator iter = getActions().values().iterator();… …while(iter.hasNext())

Iterator v$0 = u$0:FieldAccessOrMethodInvocation.values().iterator();… …while(v$0.hasNext())

Generalized name Namein mA

Namein mB

Uncertain Map u$0:FieldAccessOrMethodInvocation fActions getActions()Variable Map v$0 e iterMethod Map values values values

iterator iterator iteratorhasNext hasNext hasNext

TypeMap Iterator Iterator Iterator

Page 10: Lase : Locating and Applying Systematic Edits by Learning from Examples

10

Phase II. Find Edit Locations

Dold

Iterator e = fActions.values().iterator();

Iterator v$0 = u$0:FieldAccessOrMethodInvocation.values().iterator();

Generalized name Name in mDUncertain Map u$0:FieldAccessOrMethodInvocation fActionsVariable Map v$0 eMethod Map values values

iterator iteratorTypeMap Iterator Iterator

Aold

Bold

Anew

Bnew

Page 11: Lase : Locating and Applying Systematic Edits by Learning from Examples

11

Phase III. Applying Edit Script

• Customize general edit scripts– Identifier concretization– Edit position concretization

• Apply the customized edit scripts

Page 12: Lase : Locating and Applying Systematic Edits by Learning from Examples

12

Example 1:Comment[] getLeadingComments(ASTNode node) {- if (this.leadingComments != null) {+ if (this.leadingPts >= 0) {- int[] range = (int[]) this.leadingComments.get(node);+ int[] range = null;+ for (int i = 0; range == null && i <= this.leadingPtr; i++) {+ if (this.leadingNodes[i] == node) + range = this.leadingIndexes[i]; + } if (range != null) { … … return leadingComments; }} return null; }

Example 2:Comment[] getTrailingComments(ASTNode node) {- if (this.trailingComments != null) {+ if (this.trailingPts >= 0) {- int[] range = (int[]) this.trailingComments.get(node);+ int[] range = null;+ for (int i = 0; range == null && i <= this.trailingPtr; i++) {+ if (this.trailingNodes[i] == node) + range = this.trailingIndexes[i]; + } if (range != null) { … … return trailingComments; }} return null;}

update (if (this.v$0 != null) ) to (if (this.v$1 >= 0) )insert (int[] range = null; …)insert (for (int i = 0; range == null && i <= this.v$1; i++) …)insert (if (this.v$2[i] == node) …)insert (range = this.v$3[i]; …)delete (int[] range = (int[]) this.v$0.get(node); )

Page 13: Lase : Locating and Applying Systematic Edits by Learning from Examples

13

Found location:public int getExtendedEnd (ASTNode node) { int end = node.getStartPosition() + node.getLength(); if (this.trailingComments != null) { int[] range = (int[]) this.trailingComments.get(node);

if (range != null) { … … } } else { … … } return end - 1;}

Suggested version:public int getExtendedEnd (ASTNode node) { int end = node.getStartPosition() + node.getLength(); if (this.v$1 >= 0) { int[] range = null; for (int i = 0; range == null && i <= this.v$1; i++) { if (this.v$2[i] == node) range = this.v$3[i]; } if (range != null) { … … } } else { … … } return end - 1;}

Page 14: Lase : Locating and Applying Systematic Edits by Learning from Examples

14

Outline

• Phase I: Creating Abstract Edit Scripts– Syntactic Program Diff– Identify Common Edit– Generalize Identifier– Extract Context

• Phase II: Find Edit Locations • Phase III: Apply Edit Script• Evaluation

Page 15: Lase : Locating and Applying Systematic Edits by Learning from Examples

15

Test Suite

• 24 repetitive bug fixes that require multiple check-ins [Park et al., MSR 2012]– 2 from Eclipse JDT and 22 from Eclipse SWT– Each bug is fixed in multiple commits– Clones of at least two lines between patches checked in at

different times– We use the first two changed methods as input examples

• 37 systematic edits that require similar changes to different methods

Page 16: Lase : Locating and Applying Systematic Edits by Learning from Examples

16

RQ1: Precision, Recall, and Accuracy

Precision (P): What percentage of all found locations are correctly identified?

Recall (R): What percentage of all expected locations are correctly identified?

Accuracy (A): How similar is Lase-generated version to developer-generated version?

Page 17: Lase : Locating and Applying Systematic Edits by Learning from Examples

17

On average, Lase finds edit locations with 99% precision, 89% recall, and 91% accuracy.

For three bugs, Lase suggests in total 9 edits that developers missed and later confirmed.

Index Bug(patches) mi

Edit Location Operations

Σ ✔ P% R% A% E C AE%

2 82429(2) 16 13 12 92 75 81 9 9 100

4 139329(3) 6 2 2 100 33 74 6 3 50

7 103863(5) 7 7 7 100 100 100 34 34 100

8 129314(3) 3 4 4 100 100 100 2 2 100

16 95409(3) 7 9 9 100 100 78 4 4 100

24 98198(2) 9 15 15 100 100 95 3 3 100

Page 18: Lase : Locating and Applying Systematic Edits by Learning from Examples

18

RQ2: Sensitivity to number of exemplar edits

• 7 cases in the oracle data set• Enumerate subsets of exemplar edits

Page 19: Lase : Locating and Applying Systematic Edits by Learning from Examples

19

# of exemplars

P% R% A%

Index 4

1 100 17 100

2 100 51 72

3 100 82 67

4 100 96 67

5 100 100 67

Index 7

1 100 59 100

2 100 83 100

3 100 84 100

4 100 88 100

5 100 92 100

6 100 96 100

Index 12

1 100 54 92

2 78 90 85

3 49 98 83

4 31 100 82

As the number of exemplar edits increases,

P does not change because exemplar edits are similar, except for case 12R is more sensitive to the number of exemplar editsR increases as a function of exemplar editsA decreases when exemplar edits are differentA remains the same or increases when the exemplar edits are very similar

Page 20: Lase : Locating and Applying Systematic Edits by Learning from Examples

20

Conclusion

• Lase automates edit location search and program transformation application

• Lase achieves 99% precision, 89% recall, and 91% accuracy

• Future Work– Integrate with automated compilation and testing– Automatically detect repetitive change examples

to infer program transformations

TOOL DEMO: @MARINA, 13:30 ON FRIDAY

Page 21: Lase : Locating and Applying Systematic Edits by Learning from Examples

21

Acknowledgement

• This work was supported in part by the National Science Foundation under grants CAREER-1149391, CCF-1117902, CCF-1043810, SHF-0910818, CCF-1018271, CCF-0811524, and a Microsoft SEIF award

Thank you!

Page 22: Lase : Locating and Applying Systematic Edits by Learning from Examples

22

References I• [Meng et al. 2011] Na Meng, Miryung Kim and Kathryn S.

McKinley. Systematic editing: Generating program transformations from an example. In PLDI ‘11.

• [Kamiya et al. 2002] Toshihiro Kamiya and Shinji Kusumoto and Katsuro Inoue. CCFinder: A multilinguistic token-based code clone detection system for large scale source code. In TSE ’02.

• [Lozano et al. 2004] Antoni Lozano and Gabriel Valiente. On the maximum common embedded subtree problem for ordered trees. In C. Iliopoulos and T Lecroq, editors, String Algorithmics, 2004.

• [Park et al. MSR 2012] J. Park, M. Kim, B. Ray, and D.-H. Bae. An empirical study of supplementary bug fixes. In MSR ’12.

Page 23: Lase : Locating and Applying Systematic Edits by Learning from Examples

23

References II• [JZDLL12] G. Jin,W. Zhang, D. Deng, B. Liblit, and S. Lu.

Automated concurrency bug fixing. In PLDI ’12.• [CHP91] J. R. Cordy, C. D. Halpern, and E. Promislow. Txl: A

rapid prototyping system for programming language dialects. Computer Languages, 1991.

• [G10] S. Gulwani. Dimensions in program synthesis. In PPDP ’10.

• [WNGF09] W. Weimer, T. Nguyen, C. Le Goues, and S. Forrest. Automatically finding patches using genetic programming. In ICSE ’09.

Page 24: Lase : Locating and Applying Systematic Edits by Learning from Examples

24

References III• [ER02] M. Erwig and D. Ren. A rule-based language for

programming software updates. In RULE ’02.• [LR95] D. A. Ladd and J. C. Ramming.A*: A language for imple-

menting language processors. In TSE’95.• [SMS13] S. Son, K. S. McKinley, and V. Shmatikov. Fix Me Up:

Repairing access-control bugs in web applications. In NDSS’13.

Page 25: Lase : Locating and Applying Systematic Edits by Learning from Examples

25

Step 4: Common Edit Context Extraction

• Extract all potential common context• Refine the common context– Consistent identifier mapping – Embedded subtree isomorphism– Program dependence equivalence

Page 26: Lase : Locating and Applying Systematic Edits by Learning from Examples

26

Step 4: Common Edit Context Extraction (1/4)

1 12 23

3

• Finding common text with clone detection (CCFinder [Kamiya et al. 2002])

Page 27: Lase : Locating and Applying Systematic Edits by Learning from Examples

27

Step 4: Common Edit Context Extraction (2/4)

• Identifier generalizationIterator e = fActions.values().iterator();while (e.hasNext()) {

Iterator iter = getActions().values().iterator();while (iter.hasNext()) {

Iterator v$0 = u$0:FieldAccessOrMethodInvocation.values().iterator();while (v$0.hasNext()) {

Abstract identifier Identifier in mA

Identifierin mB

Uncertain Map u$0:FieldAccessOrMethodInvocation fActions getActions()Variable Map v$0 e iterMethod Map values values values

iterator iterator iteratorhasNext hasNext hasNext

TypeMap Iterator Iterator Iterator

Page 28: Lase : Locating and Applying Systematic Edits by Learning from Examples

28

Step 4: Common Edit Context Extraction (3/4)

• Maximum Common Embedded Subtree Extraction (MCESE) [Lozano et al. 2004]

1

2

3

1

2 3

1,2,3,-3,-2,-1 1,2,-2,3,-3,-1

1,2,-2,-1

1

2

1

2

Page 29: Lase : Locating and Applying Systematic Edits by Learning from Examples

29

Step 4: Common Edit Context Extraction (4/4)

• Program dependence analysis

Abstractidentifier

Identifier in mA

Identifierin mB

Variable Map v$0 e iterMethod Map values values values

… ….

Object next = e.next();

while (e.hasNext()) {

Iterator e = fActions.values().iterator();

Page 30: Lase : Locating and Applying Systematic Edits by Learning from Examples

30

?When more than two examples?

Aold

Anew

Bold

Bnew

Cold

Cnew

EAB EAC

Dold

Dnew

EAD

EABCEACD

EABCD

Page 31: Lase : Locating and Applying Systematic Edits by Learning from Examples

31

public void setBackgroundPattern (Pattern pattern){ if (handle == 0) SWT.error(SWT.ERROR_GRAPHIC_DISPOSED); if (pattern == null) SWT.error(SWT.ERROR_NULL_ARGUMENT); if (pattern.isDisposed()) SWT.error(SWT.ERROR_INVALID_ARGUMENT); initGdip(false, false); if (data.gdipBrush != 0) destroyGdipBrush(data.gdipBrush); data.gdipBrush = Gdip.Brush_Clone(pattern.handle);

data.backgroundPattern = pattern;}

public void setBackgroundPattern (Pattern pattern){ if (handle == 0) SWT.error(SWT.ERROR_GRAPHIC_DISPOSED); if (pattern != null && pattern.isDisposed()) SWT.error(SWT.ERROR_INVALID_ARGUMENT); initGdip(false, false); if (data.gdipBrush != 0) destroyGdipBrush(data.gdipBrush); if (pattern != null) { data.gdipBrush = Gdip.Brush_Clone(pattern.handle); } else { data.gdipBrush = 0; } data.backgroundPattern = pattern;}