![Page 1: Reverse Engineering automation](https://reader033.vdocuments.us/reader033/viewer/2022061611/5551a7f1b4c9053c488b4fd4/html5/thumbnails/1.jpg)
Reverse Engineering automation
by Anton Dorfman PHDAYS 2014, Moscow
![Page 2: Reverse Engineering automation](https://reader033.vdocuments.us/reader033/viewer/2022061611/5551a7f1b4c9053c488b4fd4/html5/thumbnails/2.jpg)
• Fan of & Fun with Assembly language
• Researcher
• Scientist
• Teaching Reverse Engineering since 2001
• Candidate of Technical Sciences
• Lecturer at Samara State Technical University and Samara State Aerospace University
About me
![Page 3: Reverse Engineering automation](https://reader033.vdocuments.us/reader033/viewer/2022061611/5551a7f1b4c9053c488b4fd4/html5/thumbnails/3.jpg)
• Intro
• Simple
• Trace & Coverage
• Graph
• Program Slicing
• All Together
Agenda
![Page 4: Reverse Engineering automation](https://reader033.vdocuments.us/reader033/viewer/2022061611/5551a7f1b4c9053c488b4fd4/html5/thumbnails/4.jpg)
Intro
![Page 5: Reverse Engineering automation](https://reader033.vdocuments.us/reader033/viewer/2022061611/5551a7f1b4c9053c488b4fd4/html5/thumbnails/5.jpg)
• Iterative process
• Understand a small piece of code and form an abstraction of it in mind
• Understand all pieces of code in a procedure, unite their abstractions and form an abstraction of the function
• And so on
• Good visualization is important
• Many routine tasks
Reverse Engineering
![Page 6: Reverse Engineering automation](https://reader033.vdocuments.us/reader033/viewer/2022061611/5551a7f1b4c9053c488b4fd4/html5/thumbnails/6.jpg)
• Code localization
• Data flow dependencies
• Code flow dependencies
• Local variable checking
• Checking of procedure input and output parameters
• Variable range checking
• Label naming
• Function naming
• Function prototyping
Routine tasks of RE
![Page 7: Reverse Engineering automation](https://reader033.vdocuments.us/reader033/viewer/2022061611/5551a7f1b4c9053c488b4fd4/html5/thumbnails/7.jpg)
• Largest research school: Professor Thomas W. Reps, University of Wisconsin-Madison, http://pages.cs.wisc.edu/~reps/
• In Russia: Institute for System Programming of the Russian Academy of Sciences, http://www.ispras.ru
Automatic program analysis - Science
![Page 8: Reverse Engineering automation](https://reader033.vdocuments.us/reader033/viewer/2022061611/5551a7f1b4c9053c488b4fd4/html5/thumbnails/8.jpg)
• Dynamic Binary Instrumentation (DBI)
• Intermediate representation (IR)
• System emulators
Technologies that help
![Page 9: Reverse Engineering automation](https://reader033.vdocuments.us/reader033/viewer/2022061611/5551a7f1b4c9053c488b4fd4/html5/thumbnails/9.jpg)
Simple
![Page 10: Reverse Engineering automation](https://reader033.vdocuments.us/reader033/viewer/2022061611/5551a7f1b4c9053c488b4fd4/html5/thumbnails/10.jpg)
• Function
• Variable
• Label
Just naming
![Page 11: Reverse Engineering automation](https://reader033.vdocuments.us/reader033/viewer/2022061611/5551a7f1b4c9053c488b4fd4/html5/thumbnails/11.jpg)
Trace & Coverage
![Page 12: Reverse Engineering automation](https://reader033.vdocuments.us/reader033/viewer/2022061611/5551a7f1b4c9053c488b4fd4/html5/thumbnails/12.jpg)
• Also called an execution trace
• Trace of program execution
• Simple case: just a list of the addresses that the instruction pointer takes during a single run
Code Trace
![Page 13: Reverse Engineering automation](https://reader033.vdocuments.us/reader033/viewer/2022061611/5551a7f1b4c9053c488b4fd4/html5/thumbnails/13.jpg)
Code Trace example
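As a rough illustration of the simple case, the sketch below runs a toy program and records every address the instruction pointer visits. The four-instruction machine, its mnemonics and the addresses are invented for this example; a real trace would come from a debugger, a DBI framework or an emulator.

```python
# Toy program: address -> (mnemonic, operand). The instruction set is
# hypothetical and exists only for this illustration.
PROGRAM = {
    0x00: ("mov", 3),      # counter = 3
    0x01: ("dec", None),   # counter -= 1
    0x02: ("jnz", 0x01),   # jump back while counter != 0
    0x03: ("halt", None),
}

def run_and_trace(program, entry=0x00):
    """Execute the toy program and return the list of visited addresses."""
    trace = []
    ip, counter = entry, 0
    while True:
        trace.append(ip)               # record every value the IP takes
        op, arg = program[ip]
        if op == "halt":
            return trace
        if op == "mov":
            counter = arg
        elif op == "dec":
            counter -= 1
        if op == "jnz" and counter != 0:
            ip = arg                   # taken branch
        else:
            ip += 1                    # fall through to the next address

trace = run_and_trace(PROGRAM)
print([hex(a) for a in trace])
```

The loop body shows up three times in the trace, which is exactly the information that coverage later discards.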
![Page 14: Reverse Engineering automation](https://reader033.vdocuments.us/reader033/viewer/2022061611/5551a7f1b4c9053c488b4fd4/html5/thumbnails/14.jpg)
• First used as a measure of the degree to which the source code of a program is tested by a particular test suite
• List of instructions executed during a single run
• List of unique addresses from the program trace
Code Coverage
![Page 15: Reverse Engineering automation](https://reader033.vdocuments.us/reader033/viewer/2022061611/5551a7f1b4c9053c488b4fd4/html5/thumbnails/15.jpg)
Code Coverage example
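Reducing a trace to coverage can be sketched in one line: keep each address once. The sample trace below is made up, and keeping first-execution order is a choice of this sketch rather than part of the definition.

```python
def coverage(trace):
    """Return the unique addresses of a trace, in order of first execution."""
    return list(dict.fromkeys(trace))   # dict keys keep insertion order

# Made-up trace of a short loop that runs three times.
trace = [0x00, 0x01, 0x02, 0x01, 0x02, 0x01, 0x02, 0x03]
print([hex(a) for a in coverage(trace)])
```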
![Page 16: Reverse Engineering automation](https://reader033.vdocuments.us/reader033/viewer/2022061611/5551a7f1b4c9053c488b4fd4/html5/thumbnails/16.jpg)
• The difference between the code coverage of different runs can help locate the code that implements a particular functionality
• Common code coverage means common functionality
• More runs give more coverage differences and more precise code localization
Code Coverage Diff
![Page 17: Reverse Engineering automation](https://reader033.vdocuments.us/reader033/viewer/2022061611/5551a7f1b4c9053c488b4fd4/html5/thumbnails/17.jpg)
Code Coverage Diff Example
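One way to read the coverage diff idea is as set arithmetic: keep what every run exercising the feature has in common, then subtract everything seen in runs that did not exercise it. The function name `localize` and all addresses below are assumptions made for this sketch.

```python
def localize(runs_with, runs_without):
    """Addresses hit in every run that used the feature and in no run that did not."""
    common = set.intersection(*map(set, runs_with))
    baseline = set().union(*map(set, runs_without))
    return sorted(common - baseline)

# Made-up coverage sets: 0x40xx is start-up code, 0x41xx is the feature.
with_feature = [
    {0x4000, 0x4004, 0x4100, 0x4104},
    {0x4000, 0x4100, 0x4104, 0x4200},
]
without_feature = [
    {0x4000, 0x4004},
    {0x4000, 0x4200},
]
print([hex(a) for a in localize(with_feature, without_feature)])
```

Adding more runs to either list can only shrink the result, which matches the point that more runs give more precise localization.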
![Page 18: Reverse Engineering automation](https://reader033.vdocuments.us/reader033/viewer/2022061611/5551a7f1b4c9053c488b4fd4/html5/thumbnails/18.jpg)
• The collection of all memory accesses performed by an application in a single run
• Includes both writes and reads
Memory Trace
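A memory trace can be pictured as a log that grows by one record per access. The `TracedMemory` class and the `Access` record below are hypothetical helpers for this sketch; real tools collect the same information through instrumentation or emulation.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Access:
    kind: str      # "R" for read, "W" for write
    address: int
    value: int

class TracedMemory:
    """Memory model that logs every access (hypothetical helper)."""
    def __init__(self):
        self.cells = {}
        self.log = []

    def read(self, address):
        value = self.cells.get(address, 0)   # unwritten cells read as 0
        self.log.append(Access("R", address, value))
        return value

    def write(self, address, value):
        self.cells[address] = value
        self.log.append(Access("W", address, value))

mem = TracedMemory()
mem.write(0x1000, 7)
mem.write(0x1004, mem.read(0x1000) + 1)   # one read followed by one write
for access in mem.log:
    print(access.kind, hex(access.address), access.value)
```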
![Page 19: Reverse Engineering automation](https://reader033.vdocuments.us/reader033/viewer/2022061611/5551a7f1b4c9053c488b4fd4/html5/thumbnails/19.jpg)
• Includes the code trace
• Includes all register values and memory values at every execution point
• May be absolute: save all values
• Or relative: save only the values that changed at this execution point
Full Program Trace
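The absolute and relative forms carry the same information, which the following sketch tries to show by converting one into the other and back. The register names and values are made up, and the sketch assumes that the set of tracked locations never shrinks between execution points.

```python
def to_relative(snapshots):
    """Keep the first snapshot whole, then only the values that changed."""
    relative, previous = [], {}
    for state in snapshots:
        relative.append({k: v for k, v in state.items() if previous.get(k) != v})
        previous = state
    return relative

def to_absolute(relative):
    """Rebuild the full state at every execution point from the deltas."""
    absolute, state = [], {}
    for delta in relative:
        state = {**state, **delta}
        absolute.append(state)
    return absolute

# Made-up register and memory values at three execution points.
snapshots = [
    {"eip": 0x401000, "eax": 0, "ebx": 5, "[0x1000]": 0},
    {"eip": 0x401002, "eax": 5, "ebx": 5, "[0x1000]": 0},
    {"eip": 0x401005, "eax": 5, "ebx": 5, "[0x1000]": 5},
]
deltas = to_relative(snapshots)
print(deltas[1])
```

The relative form is much smaller for long runs, at the price of having to replay deltas to recover the state at a given point.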
![Page 20: Reverse Engineering automation](https://reader033.vdocuments.us/reader033/viewer/2022061611/5551a7f1b4c9053c488b4fd4/html5/thumbnails/20.jpg)
Graph
![Page 21: Reverse Engineering automation](https://reader033.vdocuments.us/reader033/viewer/2022061611/5551a7f1b4c9053c488b4fd4/html5/thumbnails/21.jpg)
• Directed graph that shows control dependencies between blocks of instructions
• Each node represents a basic block
• Basic block: a piece of code that starts at a jump target and ends with a jump, with no jump or jump target inside the block
• Two special blocks: the entry block and the exit block
Control Flow Graph (CFG)
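The basic block definition translates into the usual "leaders" procedure: the first instruction, every jump target and every instruction following a jump start a new block. The sketch below applies it to a made-up listing; the mnemonics `jmp`, `jcc` (any conditional jump) and `ret` are placeholders, and indirect jumps are not handled.

```python
def basic_blocks(instructions):
    """Split a list of (address, mnemonic, target) into basic blocks and edges."""
    addresses = [address for address, _, _ in instructions]
    leaders = {addresses[0]}
    for index, (_, op, target) in enumerate(instructions):
        if op in ("jmp", "jcc"):
            leaders.add(target)                      # a jump target starts a block
            if index + 1 < len(instructions):
                leaders.add(addresses[index + 1])    # so does the instruction after a jump
    blocks, edges, current = {}, set(), None
    for index, (address, op, target) in enumerate(instructions):
        if address in leaders:
            current = address
            blocks[current] = []
        blocks[current].append(address)
        following = addresses[index + 1] if index + 1 < len(instructions) else None
        if op in ("jmp", "jcc"):
            edges.add((current, target))             # jump edge
        if op not in ("jmp", "ret") and following in leaders:
            edges.add((current, following))          # fall-through edge
    return blocks, edges

# Made-up loop: block 1 tests the condition, block 3 is the loop body.
listing = [
    (0, "mov", None),
    (1, "cmp", None),
    (2, "jcc", 5),
    (3, "add", None),
    (4, "jmp", 1),
    (5, "ret", None),
]
blocks, edges = basic_blocks(listing)
print(blocks)
print(sorted(edges))
```

Here the block at address 0 plays the role of the entry block and the block at address 5 the role of the exit block.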
![Page 22: Reverse Engineering automation](https://reader033.vdocuments.us/reader033/viewer/2022061611/5551a7f1b4c9053c488b4fd4/html5/thumbnails/22.jpg)
![Page 23: Reverse Engineering automation](https://reader033.vdocuments.us/reader033/viewer/2022061611/5551a7f1b4c9053c488b4fd4/html5/thumbnails/23.jpg)
• Directed graph that represents calling relationships between subroutines in a computer program
• Each node represents a procedure
• Each edge (a, b) indicates that procedure a calls procedure b
• A cycle in the graph indicates recursive procedure calls
• A static call graph represents every possible run of the program
• A dynamic call graph is a record of one execution of the program
Call Graph
![Page 24: Reverse Engineering automation](https://reader033.vdocuments.us/reader033/viewer/2022061611/5551a7f1b4c9053c488b4fd4/html5/thumbnails/24.jpg)
Call Graph example
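The remark about cycles can be checked mechanically: a procedure is recursive, directly or mutually, if it can reach itself along call edges. The procedure names below are invented, and the graph is stored as a plain dictionary for the purpose of this sketch.

```python
def recursive_procedures(call_graph):
    """Return the procedures that can reach themselves through call edges."""
    def reachable(start):
        seen, stack = set(), list(call_graph.get(start, ()))
        while stack:
            node = stack.pop()
            if node not in seen:
                seen.add(node)
                stack.extend(call_graph.get(node, ()))
        return seen
    return sorted(p for p in call_graph if p in reachable(p))

# Made-up call graph: an edge (a, b) is stored as b in call_graph[a].
call_graph = {
    "main": ["parse", "report"],
    "parse": ["parse_expr"],
    "parse_expr": ["parse_term"],
    "parse_term": ["parse_expr"],   # mutual recursion
    "report": [],
}
print(recursive_procedures(call_graph))
```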
![Page 25: Reverse Engineering automation](https://reader033.vdocuments.us/reader033/viewer/2022061611/5551a7f1b4c9053c488b4fd4/html5/thumbnails/25.jpg)
• Directed graph that represents data dependencies between a number of operations
• Each node represents an operation
• Each edge represents a variable
Data Flow Graph (DFG)
![Page 26: Reverse Engineering automation](https://reader033.vdocuments.us/reader033/viewer/2022061611/5551a7f1b4c9053c488b4fd4/html5/thumbnails/26.jpg)
Data Flow Graph example
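For straight-line code a data flow graph can be built in one pass: remember which operation last produced each variable and draw an edge, labelled with that variable, to every operation that consumes it. The three operations below are made up and the tuple layout is an assumption of this sketch.

```python
def data_flow_graph(operations):
    """operations: list of (result, operator, operands). Returns labelled edges."""
    producer = {}          # variable -> index of the operation that last wrote it
    edges = []
    for index, (result, _, operands) in enumerate(operations):
        for name in operands:
            if name in producer:
                edges.append((producer[name], index, name))  # edge carries the variable
        producer[result] = index
    return edges

# t1 = a + b; t2 = t1 * c; t3 = t1 - t2
operations = [
    ("t1", "+", ("a", "b")),
    ("t2", "*", ("t1", "c")),
    ("t3", "-", ("t1", "t2")),
]
for src, dst, var in data_flow_graph(operations):
    print(f"op{src} --{var}--> op{dst}")
```

Inputs such as `a`, `b` and `c` have no producing operation here, so they do not appear as edges.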
![Page 27: Reverse Engineering automation](https://reader033.vdocuments.us/reader033/viewer/2022061611/5551a7f1b4c9053c488b4fd4/html5/thumbnails/27.jpg)
• Ottenstein & Ottenstein, PDG, 1984
• Actually a procedure dependence graph, because it was introduced for programs with one procedure
• Each node represents a statement
• Two types of edges
◦ Control dependence: between a predicate and the statements it controls
◦ Data dependence: between statements modifying a variable and those that may reference it
• A special “Entry” node is connected to all nodes that are not control dependent
Program Dependence Graph (PDG)
![Page 28: Reverse Engineering automation](https://reader033.vdocuments.us/reader033/viewer/2022061611/5551a7f1b4c9053c488b4fd4/html5/thumbnails/28.jpg)
PDG example
![Page 29: Reverse Engineering automation](https://reader033.vdocuments.us/reader033/viewer/2022061611/5551a7f1b4c9053c488b4fd4/html5/thumbnails/29.jpg)
• Horwitz, Reps & Binkley, SDG, 1990
• Includes a PDG for every procedure
• New nodes: Call Site, Procedure Entry, Actual-in-argument, Actual-out-argument, Formal-in-parameter, Formal-out-parameter
• 3 new edge types
◦ Call edge: connects “call site” and “procedure entry”
◦ Parameter-in edge: connects “Actual-in” with “Formal-in”
◦ Parameter-out edge: connects “Actual-out” with “Formal-out”
System Dependence Graph (SDG)
![Page 30: Reverse Engineering automation](https://reader033.vdocuments.us/reader033/viewer/2022061611/5551a7f1b4c9053c488b4fd4/html5/thumbnails/30.jpg)
![Page 31: Reverse Engineering automation](https://reader033.vdocuments.us/reader033/viewer/2022061611/5551a7f1b4c9053c488b4fd4/html5/thumbnails/31.jpg)
![Page 32: Reverse Engineering automation](https://reader033.vdocuments.us/reader033/viewer/2022061611/5551a7f1b4c9053c488b4fd4/html5/thumbnails/32.jpg)
Program Slicing
![Page 33: Reverse Engineering automation](https://reader033.vdocuments.us/reader033/viewer/2022061611/5551a7f1b4c9053c488b4fd4/html5/thumbnails/33.jpg)
• Large programs must be decomposed for understanding and manipulation.
• Usually they are decomposed into procedures and abstract data types.
• Program slicing is decomposition based on data flow and control flow analysis.
• A study showed that experienced programmers slice mentally while debugging.
• “The mental abstraction people make when they are debugging a program” [Weiser]
Program Slicing - Mark Weiser, 1979
![Page 34: Reverse Engineering automation](https://reader033.vdocuments.us/reader033/viewer/2022061611/5551a7f1b4c9053c488b4fd4/html5/thumbnails/34.jpg)
• A slice: all the statements of a program that may affect the values of some variables in a set V at some point of interest i.
• A slicing criterion of a program P is a tuple (i, V), where i is a statement in P and V is a subset of the variables in P.
• Slicing criterion: C = (i, V)
What is a Slice?
![Page 35: Reverse Engineering automation](https://reader033.vdocuments.us/reader033/viewer/2022061611/5551a7f1b4c9053c488b4fd4/html5/thumbnails/35.jpg)
Example of Slices
![Page 36: Reverse Engineering automation](https://reader033.vdocuments.us/reader033/viewer/2022061611/5551a7f1b4c9053c488b4fd4/html5/thumbnails/36.jpg)
• Direction of slicing
◦ Backward
◦ Forward
• Slicing techniques
◦ Static
◦ Dynamic
◦ Conditioned
• Levels of slices
◦ Intraprocedural slicing
◦ Interprocedural slicing
Slicing classifications
![Page 37: Reverse Engineering automation](https://reader033.vdocuments.us/reader033/viewer/2022061611/5551a7f1b4c9053c488b4fd4/html5/thumbnails/37.jpg)
• The original slicing method
• A backward slice of a program with respect to a program point i and a set of program variables V consists of all statements and predicates in the program that may affect the value of the variables in V at i
• Answers the question “what program components might affect a selected computation?”
• Preserves the meaning of the variable(s) in the slicing criterion for all possible inputs to the program
Backward slicing
![Page 38: Reverse Engineering automation](https://reader033.vdocuments.us/reader033/viewer/2022061611/5551a7f1b4c9053c488b4fd4/html5/thumbnails/38.jpg)
Slice criterion <12, i>
1   main()
2   {
3       int i, sum;
4       sum = 0;
5       i = 1;
6       while (i <= 10)
7       {
8           sum = sum + 1;
9           ++i;
10      }
11      cout << sum;
12      cout << i;
13  }
Backward slicing example
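A backward slice can be sketched as reachability over dependence edges. The table below was written by hand for the `sum`/`i` program of this example and is an assumption of the sketch, not the output of a real analysis; the declaration on line 3 and the braces are left out.

```python
# statement -> statements it depends on, through data or control
DEPENDS_ON = {
    4: set(),         # sum = 0
    5: set(),         # i = 1
    6: {5, 9},        # while (i <= 10) reads i
    8: {4, 8, 6},     # sum = sum + 1 reads sum, controlled by 6
    9: {5, 9, 6},     # ++i reads i, controlled by 6
    11: {4, 8},       # cout << sum
    12: {5, 9},       # cout << i
}

def backward_slice(depends_on, statement):
    """Return every statement that may affect `statement`, including itself."""
    seen, stack = set(), [statement]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(depends_on[current])
    return sorted(seen)

print(backward_slice(DEPENDS_ON, 12))   # criterion: variable i at statement 12
```

With these edges, everything that only concerns `sum` (statements 4, 8 and 11) stays outside the slice for `i` at statement 12.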
![Page 39: Reverse Engineering automation](https://reader033.vdocuments.us/reader033/viewer/2022061611/5551a7f1b4c9053c488b4fd4/html5/thumbnails/39.jpg)
• A forward slice of a program with respect to a program point i and a set of program variables V consists of all statements and predicates in the program that may be affected by the value of the variables in V at i
• Answers the question “what program components might be affected by a selected computation?”
• Can show the code affected by a modification to a single statement
Forward Slicing
![Page 40: Reverse Engineering automation](https://reader033.vdocuments.us/reader033/viewer/2022061611/5551a7f1b4c9053c488b4fd4/html5/thumbnails/40.jpg)
Slice criterion <3, sum>
1   main()
2   {
3       int i, sum;
4       sum = 0;
5       i = 1;
6       while (i <= 10)
7       {
8           sum = sum + 1;
9           ++i;
10      }
11      cout << sum;
12      cout << i;
13  }
Forward Slicing example
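A forward slice follows the same dependence edges in the opposite direction. The dependence table below is hand-written for the `sum`/`i` program and is an assumption of this sketch; because the declaration on line 3 is not modelled, the sketch starts from the first assignment to `sum` (statement 4) instead.

```python
# statement -> statements it depends on, through data or control
DEPENDS_ON = {
    4: set(),         # sum = 0
    5: set(),         # i = 1
    6: {5, 9},        # while (i <= 10) reads i
    8: {4, 8, 6},     # sum = sum + 1 reads sum, controlled by 6
    9: {5, 9, 6},     # ++i reads i, controlled by 6
    11: {4, 8},       # cout << sum
    12: {5, 9},       # cout << i
}

def forward_slice(depends_on, statement):
    """Return every statement that may be affected by `statement`, including itself."""
    affects = {s: set() for s in depends_on}
    for target, sources in depends_on.items():
        for source in sources:
            affects[source].add(target)      # reverse each dependence edge
    seen, stack = set(), [statement]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(affects[current])
    return sorted(seen)

print(forward_slice(DEPENDS_ON, 4))
```

Starting from statement 5 instead reaches almost the whole program, because `i` controls the loop that updates `sum`.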
![Page 41: Reverse Engineering automation](https://reader033.vdocuments.us/reader033/viewer/2022061611/5551a7f1b4c9053c488b4fd4/html5/thumbnails/41.jpg)
• Static slicing does not make any assumptions about the input
• Slices are derived from the source code for all possible input values
• May lead to relatively big slices
• Contains all statements that may affect a variable for every possible execution
• Current static methods can only compute approximations
Static Slicing
![Page 42: Reverse Engineering automation](https://reader033.vdocuments.us/reader033/viewer/2022061611/5551a7f1b4c9053c488b4fd4/html5/thumbnails/42.jpg)
Slice criterion (12, i)
1   main()
2   {
3       int i, sum;
4       sum = 0;
5       i = 1;
6       while (i <= 10)
7       {
8           sum = sum + 1;
9           ++i;
10      }
11      cout << sum;
12      cout << i;
13  }
Static Slicing example
![Page 43: Reverse Engineering automation](https://reader033.vdocuments.us/reader033/viewer/2022061611/5551a7f1b4c9053c488b4fd4/html5/thumbnails/43.jpg)
• First introduced by Korel and Laski
• Dynamic slicing assumes a fixed input for a program
• Only the dependences that occur in a specific execution of the program are taken into account
• Computed on a given input
• The dynamic slicing criterion is a triple (input, occurrence of a statement, variable): it specifies the input and distinguishes between different occurrences of a statement in the execution history
Dynamic Slicing
![Page 44: Reverse Engineering automation](https://reader033.vdocuments.us/reader033/viewer/2022061611/5551a7f1b4c9053c488b4fd4/html5/thumbnails/44.jpg)
1.  read(n)
2.  for i := 1 to n do
3.      a := 2
4.      if c1 == 1 then
5.          if c2 == 1 then
6.              a := 4
7.          else
8.              a := 6
9.      z := a
10. write(z)
Dynamic Slicing example
• Assumptions
◦ Input n is 1
◦ c1 and c2 are both true
◦ Execution history is 1¹, 2¹, 3¹, 4¹, 5¹, 6¹, 9¹, 2², 10¹
◦ Slice criterion <1, 10¹, z>
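A dynamic slice for this example can be approximated by replaying the execution history and linking each executed statement to the most recent definitions of the variables it reads and to the most recent execution of the predicate that controls it. The three tables below were written by hand for the ten-line program and are assumptions of this sketch; the loop counter's own update is ignored.

```python
# Per-statement facts for the example program, written by hand for this sketch.
DEFS = {1: "n", 2: "i", 3: "a", 6: "a", 8: "a", 9: "z"}
USES = {2: {"n"}, 4: {"c1"}, 5: {"c2"}, 9: {"a"}, 10: {"z"}}
CONTROL_PARENT = {3: 2, 4: 2, 9: 2, 5: 4, 6: 5, 8: 5}

def dynamic_slice(history, criterion_index):
    """Return the statements that affected the event at history[criterion_index]."""
    last_def, last_seen, deps = {}, {}, []
    for position, stmt in enumerate(history):
        mine = {last_def[v] for v in USES.get(stmt, ()) if v in last_def}
        parent = CONTROL_PARENT.get(stmt)
        if parent in last_seen:
            mine.add(last_seen[parent])   # most recent run of the controlling predicate
        deps.append(mine)
        if stmt in DEFS:
            last_def[DEFS[stmt]] = position
        last_seen[stmt] = position
    seen, stack = set(), [criterion_index]
    while stack:                          # walk the recorded dependences backwards
        position = stack.pop()
        if position not in seen:
            seen.add(position)
            stack.extend(deps[position])
    return sorted({history[p] for p in seen})

# Execution history for n = 1 with c1 and c2 true (occurrence numbers dropped).
history = [1, 2, 3, 4, 5, 6, 9, 2, 10]
print(dynamic_slice(history, history.index(10)))
```

Statement 3 (`a := 2`) drops out because its value was overwritten by statement 6 before `z := a` ran, and statement 8 drops out because it never executed.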
![Page 45: Reverse Engineering automation](https://reader033.vdocuments.us/reader033/viewer/2022061611/5551a7f1b4c9053c488b4fd4/html5/thumbnails/45.jpg)
Assumption: input ‘a’ is a positive number
1. read(a)
2. if (a < 0)
3.     a = -a
4. x = 1/a
Conditioned slice example
![Page 46: Reverse Engineering automation](https://reader033.vdocuments.us/reader033/viewer/2022061611/5551a7f1b4c9053c488b4fd4/html5/thumbnails/46.jpg)
• Computes a slice within one procedure
• Consists basically of two steps:
◦ A single slice of the procedure containing the slicing criterion is made
◦ Procedure calls from within this procedure are sliced using new criteria
Intraprocedural slicing
![Page 47: Reverse Engineering automation](https://reader033.vdocuments.us/reader033/viewer/2022061611/5551a7f1b4c9053c488b4fd4/html5/thumbnails/47.jpg)
• Computes a slice over an entire program
• Two ways of crossing a procedure boundary
◦ Up: going from the sliced procedure into the calling procedure
◦ Down: going from the sliced procedure into the called procedure
• Must be context sensitive
Interprocedural Slicing
![Page 48: Reverse Engineering automation](https://reader033.vdocuments.us/reader033/viewer/2022061611/5551a7f1b4c9053c488b4fd4/html5/thumbnails/48.jpg)
• Chopping
• Value Set Analysis
Also
![Page 49: Reverse Engineering automation](https://reader033.vdocuments.us/reader033/viewer/2022061611/5551a7f1b4c9053c488b4fd4/html5/thumbnails/49.jpg)
• CodeSurfer
◦ Commercial product by GrammaTech Inc.
◦ GUI based
◦ Scripting language: Tk
• Unravel
◦ Static program slicer developed at NIST
◦ Slices ANSI C programs
◦ Limitations are in the treatment of unions, forks and pointers to functions
Program slicing tools
![Page 50: Reverse Engineering automation](https://reader033.vdocuments.us/reader033/viewer/2022061611/5551a7f1b4c9053c488b4fd4/html5/thumbnails/50.jpg)
All together
![Page 51: Reverse Engineering automation](https://reader033.vdocuments.us/reader033/viewer/2022061611/5551a7f1b4c9053c488b4fd4/html5/thumbnails/51.jpg)
• Slicing of a register on code coverage
• Graph-based view of file reading and moves between memory blocks
Some Results