Page 1: Refactoring Erlang Programs

Refactoring Erlang Programs

Huiqing Li

Simon Thompson

University of Kent

Page 2: Refactoring Erlang Programs

Overview

What is refactoring?

Examples

The process of refactoring

Tool building and infrastructure

What is in Wrangler … demo

Latest advances: data, processes, erlide.

Page 3: Refactoring Erlang Programs

Introducing refactoring

Page 4: Refactoring Erlang Programs

Soft-ware

There’s no single correct design …

… different options for different situations.

Maintain flexibility as the system evolves.

Page 5: Refactoring Erlang Programs

Refactoring

Refactoring means changing the design or structure of a program … without changing its behaviour.

Refactor / Modify

Page 6: Refactoring Erlang Programs

Examples

Page 7: Refactoring Erlang Programs

Generalisation

Original:

    -module(test).
    -export([f/1]).

    add_one([H|T]) ->
        [H+1 | add_one(T)];
    add_one([]) -> [].

    f(X) -> add_one(X).

After generalising over the constant 1:

    -module(test).
    -export([f/1]).

    add_one(N, [H|T]) ->
        [H+N | add_one(N,T)];
    add_one(N, []) -> [].

    f(X) -> add_one(1, X).

After renaming add_one to add_int:

    -module(test).
    -export([f/1]).

    add_int(N, [H|T]) ->
        [H+N | add_int(N,T)];
    add_int(N, []) -> [].

    f(X) -> add_int(1, X).

Generalisation and renaming

Page 8: Refactoring Erlang Programs

Generalisation

Before:

    -export([printList/1]).

    printList([H|T]) ->
        io:format("~p\n", [H]),
        printList(T);
    printList([]) -> true.

    printList([1,2,3])

After:

    -export([printList/2]).

    printList(F, [H|T]) ->
        F(H),
        printList(F, T);
    printList(F, []) -> true.

    printList(
        fun(H) ->
            io:format("~p\n", [H])
        end,
        [1,2,3]).

Page 9: Refactoring Erlang Programs

Generalisation

Before:

    -export([printList/1]).

    printList([H|T]) ->
        io:format("~p\n", [H]),
        printList(T);
    printList([]) -> true.

After (the exported interface printList/1 is unchanged):

    -export([printList/1]).

    printList(F, [H|T]) ->
        F(H),
        printList(F, T);
    printList(F, []) -> true.

    printList(L) ->
        printList(
            fun(H) ->
                io:format("~p\n", [H]) end,
            L).

Page 10: Refactoring Erlang Programs

Asynchronous to synchronous

Before (asynchronous):

    Pid ! {self(), msg}

    {Parent, msg} ->
        body

After (synchronous):

    Pid ! {self(), msg},
    receive {Pid, ok} -> ok end

    {Parent, msg} ->
        Parent ! {self(), ok},
        body
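
The synchronous version can be tried out as a small module. This is only a sketch: the names `sync_sketch`, `start/0`, `send_sync/1` and the placeholder `body/0` are assumptions made for illustration and do not come from the slides. Like the slide, it has no timeout, so the sender blocks for as long as the receiver stays silent.

```erlang
-module(sync_sketch).
-export([start/0, send_sync/1]).

%% Start a receiver process running the refactored receive loop.
start() ->
    spawn(fun loop/0).

loop() ->
    receive
        {Parent, msg} ->
            Parent ! {self(), ok},   % acknowledgement added by the refactoring
            body(),
            loop();
        stop ->
            ok
    end.

%% Hypothetical placeholder for the receiver's real work.
body() ->
    ok.

%% Sender side after the refactoring: send, then wait for the
%% acknowledgement from this particular receiver.
send_sync(Pid) ->
    Pid ! {self(), msg},
    receive
        {Pid, ok} -> ok
    end.
```

Matching on `{Pid, ok}` rather than a bare `ok` keeps the sender from accepting an acknowledgement that came from some other process.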

Page 11: Refactoring Erlang Programs

Refactoring

Page 12: Refactoring Erlang Programs

Refactoring = Transformation + Condition

Transformation

Ensure change at all those points needed.

Ensure change at only those points needed.

Condition

Is the refactoring applicable?

Will it preserve the semantics of the module? the program?
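
A condition is easiest to see on a small case such as renaming a variable. The module below is an illustrative sketch (all names are made up for the example): the same transformation is applied twice, once with a fresh name and once with a name that is already bound in the body.

```erlang
-module(rename_sketch).
-export([original/1, safe_rename/1, unsafe_rename/1]).

%% Original: X is the parameter, Y is bound in the body.
original(X) ->
    Y = 2,
    X + Y.

%% Renaming X to the fresh name Z leaves the binding structure intact.
safe_rename(Z) ->
    Y = 2,
    Z + Y.

%% Renaming X to Y breaks the condition: `Y = 2` is no longer a binding
%% but a match against the parameter, so the call fails with badmatch
%% unless the argument happens to be 2.
unsafe_rename(Y) ->
    Y = 2,
    Y + Y.
```

All three functions compile, which is why the condition has to be checked by the tool rather than left to the compiler.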

Page 13: Refactoring Erlang Programs

Transformations

full stop one

Page 14: Refactoring Erlang Programs

Condition > Transformation

Renaming an identifier

"The existing binding structure should not be affected. No binding for the new name may intervene between the binding of the old name and any of its uses, since the renamed identifier would be captured by the renaming. Conversely, the binding to be renamed must not intervene between bindings and uses of the new name."

Page 15: Refactoring Erlang Programs

Which refactoring exactly?

Generalise f by making 23 a parameter of f:

f(X) ->

Con = 23,

g(X) + Con + 23.

• This one occurrence?

• All occurrences (in the body)?

• Some of the occurrences … to be selected.
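
The choices lead to different programs. A minimal sketch of two of them follows; `g/1` is not defined on the slide, so a stand-in is assumed here, and the names `f_one` and `f_all` are invented to let both results live in one module.

```erlang
-module(gen_sketch).
-export([f/1, f_one/2, f_all/2]).

%% Hypothetical helper; the slide leaves g/1 undefined.
g(X) -> X * 2.

%% Original function from the slide.
f(X) ->
    Con = 23,
    g(X) + Con + 23.

%% Only the selected (last) occurrence of 23 becomes the parameter N.
f_one(N, X) ->
    Con = 23,
    g(X) + Con + N.

%% Every occurrence of 23 in the body becomes N.
f_all(N, X) ->
    Con = N,
    g(X) + Con + N.
```

Called with `N = 23` both variants agree with `f/1`, but they generalise differently: `f_one(1, 1)` and `f_all(1, 1)` return different values.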

Page 16: Refactoring Erlang Programs

Compensate or crash?

Compensate (keep oldFun as a wrapper):

    -export([oldFun/1,
             newFun/1]).

    oldFun(L) ->
        newFun(L).

    newFun(L) ->
        … .

or crash (oldFun no longer exists)?

    -export([newFun/1]).

    newFun(L) ->
        … .

Page 17: Refactoring Erlang Programs

Refactoring tools

Page 18: Refactoring Erlang Programs

Tool support

Bureaucratic and diffuse.

Tedious and error prone.

Semantics: scopes, types, modules, …

Undo/redo

Enhanced creativity

Page 19: Refactoring Erlang Programs

Semantic analysis

Binding structure

• Dynamic atom creation, multiple binding occurrences, pattern semantics etc.

Module structure and projects

• No explicit projects for Erlang; cf Erlide / Emacs.

Type and effect information

• Need effect information for e.g. generalisation.

Page 20: Refactoring Erlang Programs

Erlang refactoring: challenges

Multiple binding occurrences of variables.

Indirect function call or function spawn: apply (lists, rev, [[a,b,c]])

Multiple arities … multiple functions: rev/1

Concurrency

Refactoring within a design library: OTP.

Side-effects.
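
Three of these challenges fit in a few lines of code. The module below is an illustrative sketch with invented names; it is not taken from Wrangler or its test suite.

```erlang
-module(challenge_sketch).
-export([classify/1, rev/1, rev/2, indirect/1]).

%% Multiple binding occurrences: Sign is bound in both branches and used
%% afterwards, so a rename has to treat all three occurrences as one name.
classify(N) ->
    case N > 0 of
        true  -> Sign = positive;
        false -> Sign = non_positive
    end,
    Sign.

%% Multiple arities, multiple functions: rev/1 and rev/2 are distinct
%% functions, so renaming one must leave the other alone.
rev(L) -> rev(L, []).

rev([H|T], Acc) -> rev(T, [H|Acc]);
rev([], Acc)    -> Acc.

%% Indirect call: the function name appears only as an atom argument,
%% which a search for the text "rev(" would not find.
indirect(L) ->
    apply(?MODULE, rev, [L]).
```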

Page 21: Refactoring Erlang Programs

Static vs dynamic

Aim to check conditions statically.

Static analysis tools possible … but some aspects intractable: e.g. dynamically manufactured atoms.

Conservative vs liberal.

Compensation?
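
A dynamically manufactured atom can be sketched as follows. The module and function names are assumptions for the example; the point is only that the callee's name never appears in the source text.

```erlang
-module(dynamic_sketch).
-export([rev/1, call_by_name/2]).

rev(L) -> lists:reverse(L).

%% The function name is assembled at run time, so a static check cannot
%% tell in general that renaming rev/1 affects this call.
call_by_name(Prefix, Arg) ->
    Name = list_to_atom(Prefix ++ "ev"),
    apply(?MODULE, Name, [Arg]).
```

A conservative tool would refuse or warn when it sees `list_to_atom/1` feeding `apply/3`; a liberal one would go ahead and leave `call_by_name("r", L)` to fail at run time after the rename.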

Page 22: Refactoring Erlang Programs

Architecture of Wrangler

Page 23: Refactoring Erlang Programs

Wrangler in Emacs

Page 24: Refactoring Erlang Programs

Wrangler in Emacs

Page 25: Refactoring Erlang Programs

Wrangler refactorings

Rename variable/function/module

Generalise function definition

Move a function definition to another (new) module

Function extraction

Fold expression against function

Expression search

Detect duplicate code

Tuple function parameters

From tuple to record

Page 26: Refactoring Erlang Programs

Wrangler demo

Page 28: Refactoring Erlang Programs

Tool building

Page 29: Refactoring Erlang Programs

Wrangler and RefactorErl

Wrangler

• Lightweight.

• Better integration with interactive tools (e.g. emacs).

• Undo/redo external?

• Ease of implementing conditions.

RefactorErl

• Higher entry cost.

• Better for a series of refactorings on a large project.

• Transaction support.

• Ease of implementing transformations.

Page 30: Refactoring Erlang Programs

Integration … with IDEs

Back to the future? Programmers' preference for emacs and gvim …

… though some IDE interest: Eclipse, NetBeans …

Issue of integration with multiple IDEs: building common interfaces.

Page 31: Refactoring Erlang Programs

Integration … with tools

Test data sets and test generation.

Makefiles, etc.

Working with macros e.g. QuickCheck uses Erlang macros …

… in a particular idiom.

Page 32: Refactoring Erlang Programs

APIs … programmer / user

API in Erlang to support user-programmed refactorings:

• declarative, straightforward and complete

• but relatively low-level.

Higher-level combining forms?

• OK for transformations, but need a separate condition language.

Page 33: Refactoring Erlang Programs

Verification and validation

Possible to write formal proofs of correctness:

• check conditions and transformations

• different levels of abstraction

• possibly name-binding substitution for renaming etc.

• more abstract formulation for e.g. data type changes.

Use of Quviq QuickCheck to verify refactorings in Wrangler.
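
The idea behind testing a refactoring is to check that the code before and after agrees on many generated inputs. The sketch below does this by hand with the standard `rand` module; it is a stand-in written for this example, not QuickCheck and not the way Wrangler's own tests are written. The two functions are the generalisation example from earlier in the talk.

```erlang
-module(equiv_sketch).
-export([check/1]).

%% Version before generalisation.
add_one([H|T]) -> [H + 1 | add_one(T)];
add_one([])    -> [].

%% Version after generalisation and renaming.
add_int(N, [H|T]) -> [H + N | add_int(N, T)];
add_int(_N, [])   -> [].

%% Build a random list of 0 to 20 integers between -100 and 100.
random_list() ->
    Len = rand:uniform(21) - 1,
    [rand:uniform(201) - 101 || _ <- lists:seq(1, Len)].

%% Run Runs random tests; true if both versions agree on every input.
check(Runs) ->
    lists:all(fun(_) ->
                      L = random_list(),
                      add_one(L) =:= add_int(1, L)
              end,
              lists:seq(1, Runs)).
```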

Page 34: Refactoring Erlang Programs

Clone detection

Page 35: Refactoring Erlang Programs

The Wrangler Clone Detector

Uses syntactic and static semantic information.

Syntactically well-formed code fragments

… identical after consistent renaming of variables,

… with variations in literals, layout and comments.

Integrated within the refactoring environment.

Page 36: Refactoring Erlang Programs

The Wrangler Clone Detector

Makes use of both the token stream and the annotated AST.

Token-based approaches

• Efficient.

• Report non-syntactic clones.

AST-based approaches

• Report syntactic clones.

• Checking for consistent renaming is easier.
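
Consistent renaming means that the variables of one fragment map one-to-one onto the variables of the other. The following sketch checks this on token lists produced by the standard `erl_scan` module; it works on tokens only to stay short, whereas the slides describe the check as being done on the annotated AST. The module and function names are invented.

```erlang
-module(rename_check_sketch).
-export([consistent/2]).

%% Tokenise a source fragment and drop location information.
tokens(Src) ->
    {ok, Toks, _End} = erl_scan:string(Src),
    [strip(T) || T <- Toks].

strip({Cat, _Loc, Val}) -> {Cat, Val};
strip({Sym, _Loc})      -> Sym.

%% True if the fragments are equal up to a one-to-one variable renaming.
consistent(SrcA, SrcB) ->
    match(tokens(SrcA), tokens(SrcB), #{}, #{}).

match([], [], _AB, _BA) ->
    true;
match([{var, A} | As], [{var, B} | Bs], AB, BA) ->
    %% A must map to B and B back to A, or both must be unseen so far.
    case {maps:get(A, AB, B), maps:get(B, BA, A)} of
        {B, A} -> match(As, Bs, AB#{A => B}, BA#{B => A});
        _      -> false
    end;
match([T | As], [T | Bs], AB, BA) ->
    match(As, Bs, AB, BA);
match(_, _, _AB, _BA) ->
    false.
```

For instance, `"X + Y * X"` and `"A + B * A"` pass the check, while `"X + Y * X"` and `"A + B * B"` do not, although both pairs look identical once every variable is replaced by the same placeholder.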

Page 40: Refactoring Erlang Programs

The Wrangler Clone Detector

Source files pass through the following stages (stage → output):

• Tokenisation → token stream

• Normalisation → normalised token stream

• Suffix tree construction → suffix tree

• Clone collector → initial clones

• Clone filter → filtered initial clones

• Parsing + static analysis → annotated ASTs

• Clone decomposition → syntactic clones

• Consistent renaming checking → clones to report

• Formatting → reported code clones
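
The first stages of such a pipeline can be sketched with the standard `erl_scan` module. Two simplifications are assumed here and are not taken from Wrangler: the normalisation rule (variables and non-atom literals are reduced to their token category, atoms are kept) is a guess at a reasonable choice, and a naive count of fixed-length token windows stands in for the suffix tree.

```erlang
-module(clone_sketch).
-export([normalise/1, repeated/2]).

%% Tokenisation + normalisation.
normalise(Src) ->
    {ok, Toks, _End} = erl_scan:string(Src),
    [norm(T) || T <- Toks].

norm({atom, _Loc, Name}) -> {atom, Name};  % keep atoms (assumed rule)
norm({Cat, _Loc, _Val})  -> Cat;           % var, integer, string, ...
norm({Sym, _Loc})        -> Sym.           % punctuation and keywords

%% Naive stand-in for the suffix tree: every window of Len normalised
%% tokens that occurs at least twice is a candidate clone.
repeated(Src, Len) ->
    Windows = windows(normalise(Src), Len),
    Counts = lists:foldl(
               fun(W, M) ->
                       maps:update_with(W, fun(C) -> C + 1 end, 1, M)
               end, #{}, Windows),
    [W || {W, C} <- maps:to_list(Counts), C >= 2].

windows(Toks, Len) when length(Toks) < Len -> [];
windows([_ | T] = Toks, Len) ->
    [lists:sublist(Toks, Len) | windows(T, Len)];
windows([], _Len) -> [].
```

The windows this reports may cut across expression boundaries, which is the "non-syntactic clones" weakness of a purely token-based approach and the reason for the later decomposition and renaming checks.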

Page 41: Refactoring Erlang Programs

Clone detection demo

Page 45: Refactoring Erlang Programs

Support for clone removal

Refactorings to support clone removal.

Function extraction.

Generalise a function definition.

Fold against a function definition.
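
The three refactorings combine naturally when removing a clone. The module below is a hand-written sketch of the before and after states; the functions and their names are invented for the example and were not produced by Wrangler.

```erlang
-module(clone_removal_sketch).
-export([total_a_before/1, total_b_before/1, total_a/1, total_b/1]).

%% Before: the two bodies differ only in a literal.
total_a_before(L) -> lists:sum([X * 2 || X <- L]) + 1.
total_b_before(L) -> lists:sum([X * 3 || X <- L]) + 1.

%% Steps 1 and 2: extract the shared expression into a function, then
%% generalise that function over the literal that varies.
scaled_sum(N, L) -> lists:sum([X * N || X <- L]).

%% Step 3: fold both original bodies against the new definition.
total_a(L) -> scaled_sum(2, L) + 1.
total_b(L) -> scaled_sum(3, L) + 1.
```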

Page 46: Refactoring Erlang Programs

Case studies

Applied the clone detector to Wrangler itself with threshold values of 30 and 2.

36 final clone classes were reported … 12 are across modules, and 3 are duplicated function definitions. Without syntactic checking and consistent variable renaming checking, 191 would have been reported.

Applied to a third-party code base (32k LOC, 89 modules): 109 clone classes reported.

Page 47: Refactoring Erlang Programs

Data-oriented refactorings

Page 48: Refactoring Erlang Programs

Tupling parameters

With separate parameters:

    -module(tup1).
    -export([gcd/2]).

    gcd(X,Y) ->
        if X>Y ->
               gcd(X-Y,Y);
           Y>X ->
               gcd(Y-X,X);
           true ->
               X
        end.

With a tupled parameter:

    -module(tup1).
    -export([gcd/1]).

    gcd({X,Y}) ->
        if X>Y ->
               gcd({X-Y,Y});
           Y>X ->
               gcd({Y-X,X});
           true ->
               X
        end.

2

Page 49: Refactoring Erlang Programs

Introduce records …

With tuples:

    -module(rec1).

    g({A, B})->
        A + B.

    h(X, Y)->
        g({X, X}),
        g(Y).

With a record (fields f1 and f2):

    -module(rec1).

    -record(rec,{f1, f2}).

    g(#rec{f1=A, f2=B})->
        A + B.

    h(X, Y)->
        g(#rec{f1=X,f2=X}),
        g(#rec{
            f1=element(1,Y),
            f2=element(2,Y)}).

Page 50: Refactoring Erlang Programs

Introduce records in a project

Need to replace other expressions …

• Replace tuples with record

• Record update expression

• Record access expression

Chase dependencies across functions …

… and across modules.
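
The kinds of expression that have to change can be sketched in one module. The record reuses the `rec`, `f1` and `f2` names from the earlier example; the functions are invented for illustration.

```erlang
-module(rec_sketch).
-export([swap_tuple/1, swap_rec/1, sum_rec/1, from_tuple/1]).

-record(rec, {f1, f2}).

%% Before: the pair is taken apart and rebuilt as a plain tuple.
swap_tuple({A, B}) -> {B, A}.

%% After: record access (R#rec.f2) and record update (R#rec{...}).
swap_rec(R) ->
    R#rec{f1 = R#rec.f2, f2 = R#rec.f1}.

%% After: a tuple pattern becomes a record pattern.
sum_rec(#rec{f1 = A, f2 = B}) -> A + B.

%% Conversion at a boundary where callers still pass tuples, which is
%% where dependencies across functions and modules show up.
from_tuple({A, B}) -> #rec{f1 = A, f2 = B}.
```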

Page 51: Refactoring Erlang Programs

Refactoring and Concurrency

Page 52: Refactoring Erlang Programs

Wrangler and processes

Refactorings which address processes

• Register a process.

• Rename a registered process.

• From function to process.

• Add tags to messages sent / received.
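
A rough sketch of what three of these refactorings produce is given below. The module, the registered name `square_server` and the message tags are all assumptions chosen for the example, not output from Wrangler.

```erlang
-module(proc_sketch).
-export([square/1, start/0, square_via_process/1, stop/0]).

%% Before: an ordinary function.
square(N) -> N * N.

%% After "from function to process" and "register a process": the
%% computation runs in a server process with a registered name.
start() ->
    Pid = spawn(fun loop/0),
    register(square_server, Pid),
    Pid.

loop() ->
    receive
        {square, From, N} ->                 % tagged request
            From ! {square_result, N * N},   % tagged reply
            loop();
        stop ->
            ok
    end.

%% Callers now send a tagged message and wait for the tagged reply.
square_via_process(N) ->
    square_server ! {square, self(), N},
    receive
        {square_result, R} -> R
    end.

stop() ->
    square_server ! stop,
    ok.
```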

Page 53: Refactoring Erlang Programs

Challenges to implementation

Data gathering is a challenge because

• Processes are syntactically implicit.

• Pid to process links are implicit.

• Communication structure is implicit.

• Side effects.

Page 54: Refactoring Erlang Programs

Underlying analysis

Analyses include

• Annotation of the AST, using call graph.

• Forward program slicing.

• Backwards program slicing.

Page 55: Refactoring Erlang Programs

Wrangler and Erlide

Page 56: Refactoring Erlang Programs

Wrangler and Erlide

Erlide is an Eclipse plugin for Erlang.

• Distribution simplified.

• Integration with the edit undo history.

• Notion of project.

• Refactoring API in the Eclipse LTK.

Ongoing support for Erlide from Ericsson.

Page 59: Refactoring Erlang Programs

Issues on integration

LTK has a fixed workflow for interactions.

• New file vs set of diffs as representation.

• Fold and generalise interaction pattern.

• Cannot support rename / create file.

Other refactorings involve search … a different API.

Page 60: Refactoring Erlang Programs

Conclusions

Page 61: Refactoring Erlang Programs

Future work

Concurrency: continue work.

Refactoring within a design library: OTP.

Working with Erlang Training and Consulting.

Continue integration with Eclipse + other IDEs.

Test and property refactoring in .

Clone detection: fuller integration.

Page 62: Refactoring Erlang Programs

Acknowledgements

Wrangler development funded by EPSRC.

The developers of syntax-tools, distel and Erlide.

George Orosz and Melinda Toth.

Zoltan Horvath and the RefactorErl group at Eotvos Lorand Univ., Budapest.

Page 63: Refactoring Erlang Programs

http://projects.cs.kent.ac.uk