architecture extraction from code

Architecture Extraction from code Course Seminar By Sanyam Goyal (06005008) Guide : Prof. R.K. Joshi

Upload: sanyamgoyal

Post on 28-Jan-2015

Page 1: Architecture Extraction From Code

Architecture Extraction from code

Course Seminar

By

Sanyam Goyal (06005008)

Guide : Prof. R.K. Joshi


Page 2: Architecture Extraction From Code

Outline

Motivation

Different Techniques

◦ Clustering based

   FOCUS

   ROMANTIC

◦ Pattern based

Conclusion

References

Page 3: Architecture Extraction From Code

Motivation

Old software systems are often undocumented or only sparsely documented. Even where documentation exists, it rarely states explicitly the architecture that the code embodies.

New changes to a system require knowledge of the implicit architecture that the system possesses.

Page 4: Architecture Extraction From Code

Motivation

Legacy Transformation

Converting a 10,000-line COBOL program to C/C++ is a tough task if the programmer is unaware of the underlying architecture.

Page 5: Architecture Extraction From Code

Motivation

System evolution

As a system evolves, it tends to drift from its original architecture.

It is therefore important to recover or reconstruct the architecture of the system, so that new changes do not break the existing working model.

Page 6: Architecture Extraction From Code

Different Techniques

Approaches to architecture extraction fall mainly into two classes: clustering-based techniques and pattern-based techniques.

Clustering-based techniques build the architecture gradually by grouping components. This is a bottom-up process: it starts from low-level knowledge such as source code and progressively discovers the complete architecture.

Pattern-based techniques, on the other hand, are top-down: a conceptual architecture of the system is first expressed in terms of some pattern, and the software system is then searched for instances of that pattern.

Page 7: Architecture Extraction From Code

Clustering based Techniques.

These techniques treat different parts of the source code as a single component (cluster), then use a hierarchical approach to derive the architecture.

Page 8: Architecture Extraction From Code

Clustering based Techniques.

[Figure: diagram labeled Cluster, Cluster, Architecture]

Page 9: Architecture Extraction From Code

Clustering based Techniques.

FOCUS

ROMANTIC


Page 10: Architecture Extraction From Code

FOCUS

In software evolution, architecture erosion is a common problem: the architecture is modified to the point where its basic properties no longer hold.

FOCUS is an approach to recovering the architecture of such applications.

It lets engineers focus their attention directly on the part of the system most affected by a change, and recover the architecture incrementally.

Page 11: Architecture Extraction From Code

FOCUS


Page 12: Architecture Extraction From Code

FOCUS

Architecture Recovery

System Evolution


Page 13: Architecture Extraction From Code

Architecture Recovery


Page 14: Architecture Extraction From Code

Architecture recovery

1. Identify components.

• Use call graphs and class diagrams.

• Inheritance, aggregation.

2. Propose an idealized architectural model.

• Based on the requirements of the future evolution.

• May incorrectly characterize some aspects of the application.

3. Map the identified components to the different architectural elements.

• Yields an intermediate architecture, which contains those components that fit the idealized architecture.

4. Identify key use cases.

• UML use cases express the application requirements.

• The use cases fall into three categories:

• First, cases that are unaffected by the desired modification.

• Second, cases that correspond to the new changes.

• Third, existing cases that must now be modified to accommodate the changes.

Page 15: Architecture Extraction From Code

Architecture recovery

5. Analyze component interactions.

• Along with the static relationships between classes, their interactions and control flow must also be analyzed.

• UML sequence diagrams.

6. Generate the refined architecture.

• The control flow identified in the previous step is used to map the remaining components to architectural elements and to find their interrelations with the components discovered earlier.

• At the same time, it can reveal inconsistencies (missing component interactions) introduced in step 2.
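
Steps 3 and 6 can be illustrated with a minimal Python sketch. Every name in it (`IDEALIZED_LINKS`, `component_to_element`, `observed_calls`, the element and component names) is hypothetical and chosen for illustration. FOCUS as described in these slides is an engineer-driven process, so this is only one possible way to check a mapping against an idealized model.

```python
# Hypothetical idealized model: which architectural elements may talk to which.
IDEALIZED_LINKS = {("UI", "Logic"), ("Logic", "Storage")}

# Step 3 (assumed result): components identified in step 1, mapped to elements.
component_to_element = {
    "DrawView": "UI",
    "ShapeEditor": "Logic",
    "FileStore": "Storage",
}

# Step 5 (assumed result): component interactions observed in the control flow.
observed_calls = [
    ("DrawView", "ShapeEditor"),
    ("ShapeEditor", "FileStore"),
    ("DrawView", "FileStore"),     # not permitted by the idealized model
    ("ShapeEditor", "UndoStack"),  # UndoStack has not been mapped yet
]


def refine(mapping, calls, allowed_links):
    """Return (unmapped components, interactions missing from the model)."""
    unmapped = set()
    missing = []
    for caller, callee in calls:
        if caller not in mapping or callee not in mapping:
            unmapped.update(c for c in (caller, callee) if c not in mapping)
            continue
        link = (mapping[caller], mapping[callee])
        # Calls inside one element are fine; others must be in the model.
        if link[0] != link[1] and link not in allowed_links:
            missing.append((caller, callee, link))
    return unmapped, missing


unmapped, missing = refine(component_to_element, observed_calls, IDEALIZED_LINKS)
print("still to map:", sorted(unmapped))
print("interactions missing from the idealized model:", missing)
```

The unmapped components are candidates for the remaining mapping work in step 6, and the missing interactions are the inconsistencies that trace back to the idealized model of step 2.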

Page 16: Architecture Extraction From Code

System evolution


Page 17: Architecture Extraction From Code

System evolution

1. Propose an idealized architecture evolution plan.

• A high-level architecture evolution plan, such as a distributed client-server model.

2. Add or modify components.

• Identify which components need modification and what new components need to be added.

3. Update component interactions.

• Modify the existing component interactions as well; adding a new feature may cause earlier component interactions to change.

4. Generate the evolved architecture.

• The changes made in the previous steps are integrated with the original architecture that was recovered.

• The resulting architecture serves as the basis for all design issues related to the new modifications to the system.

5. Set the new focus.

• The components affected by the previous iteration become the new focus, so that they can be refined further.
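
One evolution iteration can be sketched in a few lines of Python. The function `evolve`, its parameters, and the component names are hypothetical. The rule used to pick the next focus (changed components plus the endpoints of changed interactions) is an assumption made for illustration, not something the slides specify.

```python
def evolve(components, interactions, added, modified, new_interactions):
    """One evolution iteration; returns the evolved model and the next focus."""
    evolved_components = set(components) | set(added)
    evolved_interactions = set(interactions) | set(new_interactions)
    # Components touched directly, plus the endpoints of every new interaction.
    focus = set(added) | set(modified)
    for src, dst in new_interactions:
        focus.update((src, dst))
    return evolved_components, evolved_interactions, focus


# Assumed output of the recovery phase.
recovered_components = {"DrawView", "ShapeEditor", "FileStore"}
recovered_interactions = {("DrawView", "ShapeEditor"), ("ShapeEditor", "FileStore")}

components, interactions, focus = evolve(
    recovered_components,
    recovered_interactions,
    added={"NetworkProxy"},
    modified={"FileStore"},
    new_interactions={("ShapeEditor", "NetworkProxy"), ("NetworkProxy", "FileStore")},
)
print(sorted(focus))
```

The returned `focus` set is what the next recovery iteration would concentrate on.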

Page 18: Architecture Extraction From Code

FOCUS

Figure 4. Architecture obtained in the first iteration of recovery of the Drawcli application

Page 19: Architecture Extraction From Code

FOCUS

In summary, FOCUS is an approach to recovering and evolving the architecture of moderately sized OO applications.

It reduces the complexity of the recovery process by letting engineers focus on a particular segment of the architecture.

Recovery is not a one-time process: the architecture must be maintained as the system evolves.

Page 20: Architecture Extraction From Code

ROMANTIC

The main idea behind this approach is to use the structural and semantic properties of a system to arrive at a quasi-automatic process for extracting architecture from OO systems.

Semantic information about the system, such as architectural patterns and architectural quality, is used to reduce the need for human interaction.

The approach is based on two principles.

Page 21: Architecture Extraction From Code

Principle 1 - Extraction of architecture from object-oriented systems

Page 22: Architecture Extraction From Code

Principle 1 - Extraction of architecture from object-oriented systems

Define the mapping between object concepts and architectural elements.

Extract the architecture in terms of classes, shapes, components and connectors.

A shape is a collection of components, where each component is composed of several classes that may belong to different packages.

Each shape is partitioned into two parts: 1) the shape interface and 2) the center.

The "shape interface" contains all the classes (components) that have links to the outside of the shape; for example, a class may call a function defined in a different class (component) outside the shape. The remaining classes of the shape are referred to as the "center".
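
The interface/center split can be sketched in Python as follows. The function `split_shape`, the class names, and the representation of links as `(source, target)` pairs are assumptions made for illustration.

```python
def split_shape(shape, dependencies):
    """Partition a shape (set of class names) into (interface, center)."""
    interface = set()
    for src, dst in dependencies:
        # A link crosses the shape boundary when exactly one end lies inside.
        if (src in shape) != (dst in shape):
            interface.add(src if src in shape else dst)
    return interface, shape - interface


shape = {"Parser", "Lexer", "TokenBuffer"}
dependencies = [
    ("Parser", "Lexer"),
    ("Lexer", "TokenBuffer"),
    ("Parser", "Logger"),  # Logger lies outside the shape
]

interface, center = split_shape(shape, dependencies)
print(sorted(interface), sorted(center))
```

Here only `Parser` has a link leaving the shape, so it forms the shape interface and the other two classes form the center.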

Page 23: Architecture Extraction From Code

Principle 2 - Guides of the architecture extraction process

Page 24: Architecture Extraction From Code

Principle 2 - Guides of the architecture extraction process

An architecture is considered relevant if it satisfies these four guides.

First, it must be semantically correct, meaning that it must not convey meaningless information.

Second, its architectural quality must be good.

Third, it should respect, as far as possible, the recommendations of the architect and the constraints and specifications given in the system documentation.

Last, it must be adaptable to the specifics of the deployment hardware architecture.

Page 25: Architecture Extraction From Code

How to evaluate software architecture semantics

The semantic sub-characteristics defined are composability, autonomy and specificity.

These sub-characteristics are refined into component properties such as cohesion and coupling, which are then linked to shape properties.

Shape properties are measured mainly by two parameters: cohesion and coupling.

Page 26: Architecture Extraction From Code

How to evaluate software architecture semantics

A semantic measurement function is defined on the three sub-characteristics specificity, composability and autonomy, written Spe, C and Auto respectively.

[Equation: semantic measurement function in terms of Spe, C, Auto and the parameters Ai]

Here each Ai is chosen by the engineers. This function is used in a hierarchical clustering algorithm to obtain a shape (a partition of the classes) and, consequently, a semantically correct architecture.
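
A minimal Python sketch of how such a function could drive hierarchical clustering. The slides do not give the actual formula or metrics, so the weighted-sum form of `semantic_score`, the values in `WEIGHTS`, and the toy measures used for Spe, C and Auto (all derived from counts of internal and boundary-crossing dependencies) are assumptions for illustration only.

```python
from itertools import combinations

# Hypothetical class-level dependencies, treated as undirected links.
EDGES = {("A", "B"), ("B", "C"), ("A", "C"), ("C", "D"), ("D", "E")}
WEIGHTS = (1.0, 1.0, 1.0)  # assumed engineer-chosen weights for Spe, C, Auto


def internal_external(cluster):
    """Count links inside the cluster and links crossing its boundary."""
    internal = sum(1 for a, b in EDGES if a in cluster and b in cluster)
    external = sum(1 for a, b in EDGES if (a in cluster) != (b in cluster))
    return internal, external


def semantic_score(cluster):
    """Assumed weighted sum of three toy sub-characteristic measures."""
    internal, external = internal_external(cluster)
    pairs = len(cluster) * (len(cluster) - 1) / 2
    spe = internal / pairs if pairs else 0.0   # density of internal links
    total = internal + external
    comp = internal / total if total else 0.0  # share of links kept inside
    auto = 1.0 / (1.0 + external)              # drops as outside links grow
    w1, w2, w3 = WEIGHTS
    return (w1 * spe + w2 * comp + w3 * auto) / (w1 + w2 + w3)


def cluster_classes(classes, target):
    """Agglomerative clustering: repeatedly merge the best-scoring pair."""
    clusters = [frozenset([c]) for c in classes]
    while len(clusters) > target:
        a, b = max(combinations(clusters, 2),
                   key=lambda pair: semantic_score(pair[0] | pair[1]))
        clusters = [c for c in clusters if c not in (a, b)] + [a | b]
    return clusters


shapes = cluster_classes(["A", "B", "C", "D", "E"], target=2)
print(sorted(sorted(s) for s in shapes))
```

On this toy graph the tightly connected classes `A`, `B`, `C` end up in one group and `D`, `E` in the other. Stopping at a fixed number of clusters is a simplification; a real process would need a stopping criterion tied to the quality guides.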

Page 27: Architecture Extraction From Code

How to evaluate software architecture semantics

Page 28: Architecture Extraction From Code

ROMANTIC

In summary, this approach is based on the semantic characteristics of components.

These characteristics help in partitioning the system's classes and extracting components from them.

Page 29: Architecture Extraction From Code

Pattern based techniques


Page 30: Architecture Extraction From Code

Pattern based techniques

Top-down approach.

Relies heavily on interaction with a human, both to supply a mental model of the architecture and to verify the results of recovery.

Two phases

◦ Offline phase: call graphs, class diagrams, dependency graphs.

◦ Online phase: the user provides architectural patterns in the form of a query.

An A* search algorithm matches the different sub-graphs that satisfy the pattern given by the user.

Page 31: Architecture Extraction From Code

System Representation

The software system is represented as an entity-relationship source graph Gs = (Ns, Rs).

The nodes of the graph (ni) represent system entities such as data types, files, classes, variables, functions and interfaces.

The edges (ej) represent interactions between system entities, such as a call to a function from a class, or inheritance.

The graph is decomposed into sub-graphs, where each sub-graph serves as a separate search space.
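
The source graph can be sketched in Python as a node table plus a list of typed edges. All entity names and relation labels below are hypothetical. The slides do not say how the graph is decomposed, so splitting it into connected components is an assumption used here only to show what "separate search spaces" could look like.

```python
from collections import defaultdict

# Nodes Ns: entity name -> entity kind.
nodes = {
    "server.c": "file",
    "client.c": "file",
    "handle_request": "function",
    "send_query": "function",
    "Logger": "class",
    "FileLogger": "class",
}

# Edges Rs: (source, relation, target).
edges = [
    ("server.c", "contains", "handle_request"),
    ("client.c", "contains", "send_query"),
    ("send_query", "calls", "handle_request"),
    ("FileLogger", "inherits", "Logger"),
]


def decompose(nodes, edges):
    """Split the source graph into connected sub-graphs (search spaces)."""
    neighbours = defaultdict(set)
    for src, _relation, dst in edges:
        neighbours[src].add(dst)
        neighbours[dst].add(src)
    seen, subgraphs = set(), []
    for start in nodes:
        if start in seen:
            continue
        component, stack = set(), [start]
        while stack:  # depth-first walk over undirected neighbours
            node = stack.pop()
            if node not in component:
                component.add(node)
                stack.extend(neighbours[node] - component)
        seen |= component
        subgraphs.append(component)
    return subgraphs


subgraphs = decompose(nodes, edges)
for sub in subgraphs:
    print(sorted(sub))
```

Each resulting sub-graph can then be searched independently for pattern instances.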

Page 32: Architecture Extraction From Code

Pattern representation

The abstract patterns are represented in a query language (AQL).

An AQL query can be represented as a graph that consists of composite nodes and links.

Each node is expanded into a pattern region $G_i^{pr}$, and each link is expanded into a group of edges $R_i^{m \leftrightarrow pr}$.

In the ith matching phase, the pattern region is matched with the source graph $G_{g(i)}^{sr}$, and the edges $R_i^{m \leftrightarrow pr}$ are matched with the corresponding connector edges $R_i^{m \leftrightarrow sr}$.

Page 33: Architecture Extraction From Code

Pattern representation

Sample AQL Query

BEGIN-AQL
  SUBSYSTEM S1
    MAIN-SEEDS files server.c client.c
    IMPORTS: rsrc ?IR
      RESOURCES
        Rsrc ?R1(2…4) S2
    EXPORTS
      RESOURCES: rsrc ?ER
        Rsrc ?R3(2…10) S2
    CONTAINS FILES : file $CFI(3…13) Files client.c server.c
    RELOCATES: NO files file1.php TO S4
END-AQL

Page 34: Architecture Extraction From Code

Graph matching

The query graph (pattern graph) $G_i^{p}$ is matched with the input graph $G_i^{I}$ in k different phases, where k is the number of AQL queries.

The results obtained in each phase must satisfy the constraints specified by the AQL query.

Page 35: Architecture Extraction From Code

Graph matching

An A* search algorithm is used.

The algorithm generates a search tree that represents the recovery of module Mi (corresponding to the ith AQL query). The root of the tree is the match between the main seed ni in the source graph and the first placeholder (node) ni in $G_i^{pr}$.

The non-leaf nodes at each level of the search tree represent alternative matches of a placeholder against nodes of the source graph; the leaf nodes represent the solutions obtained from the matching process.

As in A*, the cost of matching a source node with the corresponding pattern region is evaluated at each step, and the node with the least cost is expanded.

The difference from standard A* is that the depth of each search tree is bounded by the number of placeholder nodes in the particular pattern region, or by the number of modules in the AQL query.
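
The search described above can be sketched in Python as a least-cost-first search over partial matches. The source graph (`KIND`, `USES`), the pattern encoding (`PATTERN`) and the cost model (one unit per unsatisfied pattern edge) are all hypothetical. The heuristic is taken to be zero for simplicity, which reduces A* to uniform-cost search; a real matcher would add an estimate of the remaining matching cost.

```python
import heapq
from itertools import count

# Hypothetical source graph: node -> kind, plus directed "uses" edges.
KIND = {"server.c": "file", "net.c": "file", "log.c": "file", "util.h": "header"}
USES = {("server.c", "net.c"), ("server.c", "util.h"), ("net.c", "util.h")}

# Hypothetical pattern region: placeholders in matching order.
# Each entry: (required kind, indices of earlier placeholders that must use it).
PATTERN = [("file", []), ("file", [0]), ("header", [0, 1])]


def step_cost(assigned, candidate, used_by):
    """Number of pattern edges the candidate fails to satisfy."""
    return sum(1 for i in used_by if (assigned[i], candidate) not in USES)


def match(main_seed):
    """Least-cost-first search; depth is bounded by the number of placeholders."""
    tie = count()  # tie-breaker so equal-cost entries pop in insertion order
    frontier = [(0, next(tie), (main_seed,))]  # root: seed fills placeholder 0
    while frontier:
        cost, _, assigned = heapq.heappop(frontier)
        if len(assigned) == len(PATTERN):  # leaf: every placeholder is matched
            return cost, assigned
        kind, used_by = PATTERN[len(assigned)]
        for node, node_kind in KIND.items():
            if node_kind == kind and node not in assigned:
                new_cost = cost + step_cost(assigned, node, used_by)
                heapq.heappush(frontier, (new_cost, next(tie), assigned + (node,)))
    return None


print(match("server.c"))
```

Starting from the seed `server.c`, the search finds a zero-cost match; starting from `log.c`, every complete match leaves some pattern edges unsatisfied, so the returned cost is greater than zero.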

Page 36: Architecture Extraction From Code

CONCLUSION

Legacy systems, architectural erosion and the need for modularity are a few of the reasons engineers need to recover architecture.

The report compares different approaches to recovering software architecture. The comparison is based on the inputs each approach takes: user cooperation, knowledge of the previous architecture, and the available documentation.

The common characteristic of all these approaches is that they recover the architecture incrementally.

Both clustering-based and pattern-based techniques have proven to be efficient in recovering architecture.

Page 37: Architecture Extraction From Code

References/Readings

Lei Ding; Medvidovic, N., "Focus: a light-weight, incremental approach to software architecture recovery and evolution," Software Architecture, 2001. Proceedings. Working IEEE/IFIP Conference on , vol., no., pp.191-200, 2001

Sartipi, K.; Kontogiannis, K., "A graph pattern matching approach to software architecture recovery ," Software Maintenance, 2001. Proceedings. IEEE International Conference on , vol., no., pp.408-419, 2001

Chardigny, S.; Seriai, A.; Oussalah, M.; Tamzalit, D., "Extraction of Component-Based Architecture from Object-Oriented Systems," Software Architecture, 2008. WICSA 2008. Seventh Working IEEE/IFIP Conference on , vol., no., pp.285-288, 18-21 Feb. 2008

D Pollet, S Ducasse, L Poyet, I Alloui, S Cimpan, … - … European Conference on Software Maintenance and Reengineering -people.untyped.org

http://www.sparxsystems.com.au/uml-tutorial.html

Czeranski, J.; Eisenbarth, T.; Kienle, H.; Koschke, R.; Simon, D., "Analyzing xfig using the Bauhaus tool," Reverse Engineering, 2000. Proceedings. Seventh Working Conference on , vol., no., pp.197-199, 2000


Page 38: Architecture Extraction From Code

References/Readings

Eixelsberger, W.; Ogris, M.; Gall, H.; Bellay, B., "Software architecture recovery of a program family," Software Engineering, 1998. Proceedings of the 1998 International Conference on , vol., no., pp.508-511, 19-25 Apr 1998

J. M. Bieman and B.-K. Kang. Cohesion and reuse in an object-oriented system. In Proc. of the Symp. on Software reusability, SSR ’95, pages 259–262, 1995.

R. Agrawal and R. Srikant. Fast algorithms for mining association rules. In Proceedings of the 20th International Conference on Very Large Databases, pages 487–499, 1994.

ISO/IEC-9126-1. Software engineering - Product quality - Part 1: Quality Model. ISO-IEC, 2001.

C. Szyperski. Component Software. ISBN: 0-201-17888-5. Addison-Wesley, 1998.

Page 39: Architecture Extraction From Code

Thanks!
