icse2010 adams presentation

Identifying Crosscutting Concerns Using Historical Code Changes Bram Adams Zhen Ming Jiang Ahmed E. Hassan SAIL, Queen's University http://sailhome.cs.queensu.ca/~bram/

Upload: sailqu

Post on 12-Apr-2017

TRANSCRIPT

Page 1: Icse2010 adams presentation

Identifying Crosscutting Concerns Using Historical Code Changes

Bram Adams, Zhen Ming Jiang, Ahmed E. Hassan

SAIL, Queen's University

http://sailhome.cs.queensu.ca/~bram/

Page 2: Icse2010 adams presentation

What are crosscutting concerns?

2

Page 5: Icse2010 adams presentation

Crosscutting Concerns

multi-threading

tracing

exception handling

data persistence

security

memory cleanup

3D rendering

performance

sound support

3

Page 15: Icse2010 adams presentation

1. Which concerns are implemented?

2. Where?

3. How are concerns composed together?

(Crosscutting) Concern Mining

4

Page 16: Icse2010 adams presentation

1. What is a Crosscutting Concern?

2. The Concern Mining Process and its Shortcomings

3. COMMIT

4. Case Study

5. Conclusion

5

Page 17: Icse2010 adams presentation

Concern Mining Process

data source

1. concern seeds

2. concerns

3. expanded concerns

4. concern composition

Concern Mining Techniques :-)

MANUAL :-(

6

Page 25: Icse2010 adams presentation

S1: Limited Context

thread()

process()

mutex

semaphore_t

address

sender

subject

CVS

block()

DEFINED_LINUX

clean()

1 2 3 4

7

Page 29: Icse2010 adams presentation

S2: Noise

1 2 3 4

8

Page 32: Icse2010 adams presentation

S3: No Composition

random

encrypt

decrypt

seed

1 2 3 4

9

Page 34: Icse2010 adams presentation

1. What is a Crosscutting Concern?

2. The Concern Mining Process and its Shortcomings

3. COMMIT

4. Case Study

5. Conclusion

10

Page 35: Icse2010 adams presentation

COMMIT: COncern Mining using Mutual Information over Time

analyze historical changes to all code entities

statistical clustering based on mutual information

CVS

11

limited context

noise

no composition

Page 38: Icse2010 adams presentation

S1. Historical Data Sources

CVS

transactions

function call or variable access added

intentional co-addition of calls and accesses

concern seed

12
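The co-addition idea on this slide can be sketched in a few lines. This is only an illustration under stated assumptions: the input format (each transaction as a list of `(function, added_entity)` pairs) and the name `co_addition_counts` are hypothetical and do not come from the slides, which do not show how COMMIT extracts or stores its transactions.

```python
from collections import Counter
from itertools import combinations


def co_addition_counts(transactions):
    """Count how often two entities were added in the same transaction.

    `transactions` is a hypothetical input format: one list per version
    control transaction, holding (function, added_entity) pairs, where
    added_entity is a function call or variable access that was added.
    """
    pair_counts = Counter()
    for additions in transactions:
        # Each entity counts once per transaction, however many
        # functions received a call or access to it.
        entities = sorted({entity for _function, entity in additions})
        for pair in combinations(entities, 2):
            pair_counts[pair] += 1
    return pair_counts


if __name__ == "__main__":
    history = [
        [("open_conn", "lock"), ("open_conn", "unlock")],
        [("send", "lock"), ("send", "unlock"), ("send", "log")],
        [("recv", "log")],
    ]
    for pair, count in co_addition_counts(history).most_common():
        print(pair, count)
```

Pairs that are co-added repeatedly are candidate material for a concern seed; raw counts alone do not separate intentional co-addition from coincidence, which is the gap the mutual information step is meant to close.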

Page 44: Icse2010 adams presentation

S2. Mutual Information

13

Page 57: Icse2010 adams presentation

S2. Mutual Information

How much does the occurrence of one code entity reveal about the occurrence of another?

14
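The question on this slide is what mutual information measures. The slides do not give the formula COMMIT uses, so the sketch below assumes the textbook definition for two binary variables ("entity a occurs in the transaction" and "entity b occurs in the transaction"), estimated from transaction frequencies. The function name and the set-per-transaction input are hypothetical.

```python
from math import log2


def mutual_information(transactions, a, b):
    """Mutual information (in bits) between the occurrence of a and b.

    `transactions` is a list of sets of entity names. This is the
    standard definition
        I(A;B) = sum over x,y of p(x,y) * log2(p(x,y) / (p(x) * p(y)))
    for x, y in {absent, present}; it is an assumed stand-in, not
    necessarily the exact measure used by COMMIT.
    """
    n = len(transactions)
    if n == 0:
        return 0.0
    # Joint counts over (a present?, b present?).
    counts = {(x, y): 0 for x in (0, 1) for y in (0, 1)}
    for entities in transactions:
        counts[(int(a in entities), int(b in entities))] += 1
    # Marginal probabilities of each variable.
    p_a = {x: (counts[(x, 0)] + counts[(x, 1)]) / n for x in (0, 1)}
    p_b = {y: (counts[(0, y)] + counts[(1, y)]) / n for y in (0, 1)}
    mi = 0.0
    for (x, y), c in counts.items():
        if c:  # empty cells contribute nothing
            p_xy = c / n
            mi += p_xy * log2(p_xy / (p_a[x] * p_b[y]))
    return mi


if __name__ == "__main__":
    history = [{"lock", "unlock"}, {"lock", "unlock"}, {"log"}, {"log"}]
    print(mutual_information(history, "lock", "unlock"))
```

Note that mutual information is symmetric and is also high for entities that never occur together, so a seed-finding step built on it would presumably also check that the entities actually co-occur.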

Page 63: Icse2010 adams presentation

S3. Concern Relations

seed graph

composite concern

simple concern

15
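The slide names a seed graph with composite and simple concerns but the transcript does not show how edges are defined. The sketch below is one plausible reading, stated as an assumption: two seeds are linked when they share at least one code entity, a connected component with several seeds is treated as a composite concern, and an isolated seed as a simple concern. All names are hypothetical.

```python
def group_seeds(seeds):
    """Split seeds into composite and simple concerns.

    `seeds` maps a seed name to its set of code entities. Assumption:
    seeds sharing an entity are connected in the seed graph.
    """
    names = list(seeds)
    adjacency = {name: set() for name in names}
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            if seeds[first] & seeds[second]:
                adjacency[first].add(second)
                adjacency[second].add(first)

    seen = set()
    components = []
    for name in names:
        if name in seen:
            continue
        # Depth-first traversal collects one connected component.
        component, stack = set(), [name]
        while stack:
            current = stack.pop()
            if current in component:
                continue
            component.add(current)
            stack.extend(adjacency[current] - component)
        seen |= component
        components.append(component)

    composite = [c for c in components if len(c) > 1]
    simple = [c for c in components if len(c) == 1]
    return composite, simple


if __name__ == "__main__":
    example = {
        "s1": {"encrypt", "seed"},
        "s2": {"seed", "random"},
        "s3": {"log"},
    }
    print(group_seeds(example))
```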

Page 66: Icse2010 adams presentation

1. What is a Crosscutting Concern?

2. The Concern Mining Process and its Shortcomings

3. COMMIT

4. Case Study

5. Conclusion

16

Page 67: Icse2010 adams presentation

Case Study

1996-2002 (800 kLOC)

1993-2003 (2 MLOC)

17

Page 68: Icse2010 adams presentation

Comparative Study

18

CBFA

HAM

COMMIT

similar entity names ✖

identical set of callers ✖

mutual information ✔

limited context

noise

no composition

CVS

CVS

snapshot

Page 73: Icse2010 adams presentation

Study Design

CBFA: top 20

HAM: top 20

COMMIT: top 20

concern?

19

Page 83: Icse2010 adams presentation

H1. Richer Data Sources Yield Richer Seeds

CVS

[Figure: two bar charts comparing CBFA, HAM and COMMIT, with y-axes running 0-40 and 0-225; series labels: #non-function entities, #functions. Annotated percentages, in extraction order: 50%, 79%, 83%, 29%, 88%, 75%.]

20

Page 86: Icse2010 adams presentation

H2. COMMIT Identifies a Larger Percentage of Unique Concerns

[Figure: two bar charts comparing CBFA, HAM and COMMIT on a 0-100 scale. Annotated percentages, in extraction order: 56% and 56% (first chart); 87.5% and 50% (second chart).]

21

Page 89: Icse2010 adams presentation

H3. COMMIT complements CBFA and HAM (1)

[Figure: two three-set Venn diagrams over CBFA, HAM and COMMIT. Region counts, in extraction order: 0, 0, 1, 0, 8, 14, 9 (first diagram); 1, 0, 0, 0, 9, 14, 9 (second diagram).]

22

Page 91: Icse2010 adams presentation

H3. COMMIT complements CBFA and HAM (2)

kernel

device drivers: d1 d2 d3 d4 d5 d6 d7 d8 d9

CBFA concern (e.g., driver API)

HAM concern (e.g., cloned driver code)

COMMIT concern (e.g., driver + infrastructure)

23

Page 97: Icse2010 adams presentation

ODBC Data Retrieval Composite Concern

ODBC

1. connection configuration

2. connection error handling

3. data transfer

4. SQL-to-ODBC conversion

5. ODBC-to-ESQL conversion

6. conversion error handling

24

Page 106: Icse2010 adams presentation

36 seeds

ODBC Data Retrieval Concern

5 other composite concerns

25

Page 109: Icse2010 adams presentation

Threats to Validity

• generalizability to other systems

• subjectivity ↔ substantial agreement (Kappa)

• seed quality not checked

• threshold optimization is task-specific
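The "substantial agreement (Kappa)" bullet presumably refers to an inter-rater agreement statistic; the slide does not say which variant was used or how many raters took part. As a minimal sketch, assuming two raters and Cohen's kappa (on the commonly used Landis and Koch scale, values from 0.61 to 0.80 are labelled "substantial"):

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same items.

    Assumed variant: the slides only mention "Kappa". Inputs are
    equal-length sequences of labels, e.g. 1 = "is a concern".
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two non-empty ratings of equal length")
    n = len(rater_a)
    # Observed agreement: fraction of items with identical labels.
    observed = sum(x == y for x, y in zip(rater_a, rater_b)) / n
    # Agreement expected by chance from each rater's label frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical label
    return (observed - expected) / (1 - expected)


if __name__ == "__main__":
    first = [1, 1, 1, 1, 0, 0, 0, 0, 1, 0]
    second = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]
    print(round(cohens_kappa(first, second), 2))
```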

26

Page 110: Icse2010 adams presentation

1. What is a Crosscutting Concern?

2. The Concern Mining Process and its Shortcomings

3. COMMIT

4. Case Study

5. Conclusion

27

Page 111: Icse2010 adams presentation

QUESTIONS?

Crosscutting Concerns

multi-threading

tracing

exception handling

data persistence

security

memory cleanup

3D rendering

performance

sound support

Concern Mining Shortcomings

S1. limited seed context

S2. noise between seeds

S3. no composition of concerns

COMMIT

CVS

transactions

function call or variable access added

intentional co-addition of calls and accesses

concern seed

COMMIT complements CBFA and HAM

[Figure: two three-set Venn diagrams over CBFA, HAM and COMMIT. Region counts, in extraction order: 0, 0, 1, 0, 8, 14, 9 (first diagram); 1, 0, 0, 0, 9, 14, 9 (second diagram).]

28