Post on 08-Mar-2021
Template-Guided Concolic Testing via Online Learning
Sooyoung Cha
Korea University
ASE'18 @Montpellier, France
(joint work with Seonho Lee and Hakjoo Oh)
Concolic Testing
● Concolic testing (Concrete and Symbolic executions)
– An effective software testing method.
– SAGE: found 30% of all Windows 7 WEX security bugs.
● Open Challenge: Path Explosion
– # of execution paths: 2^(# of branches)
– ex) grep-2.2 (3,836 branches): 2^3,836 paths (worst case)
– Exploring all paths is impossible.
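To get a feel for the worst-case bound 2^(# of branches), the short calculation below counts the decimal digits of 2^3,836, the figure quoted for grep-2.2. The variable names are illustrative only.

```python
# Worst-case number of execution paths: 2 ** (# of branches).
branches = 3836  # grep-2.2, as quoted on the slide

worst_case_paths = 2 ** branches      # Python integers have arbitrary precision
digits = len(str(worst_case_paths))   # number of decimal digits

print(f"2^{branches} has {digits} decimal digits")
```

The result is a number with 1,155 decimal digits, which is why enumerating every path is out of the question.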
My Research Area
● Motivation: Path-Explosion.
– Remedies: Search Heuristic, Search Space Reduction, Seed Input, Constraint Solver, …
– Designed by experts (manually).
● Data-Driven Concolic Testing.
– Search Heuristic (ICSE’18)
– Search Space Reduction (ASE’18)
– Generated by machine (automatically).
Search Heuristic
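● A search heuristic selects branches that are likely to maximize code coverage.
● Each heuristic has its own criteria to pick a branch.
– DFS, BFS, Random, Generational, CFDS, CGS, ParaDySE, ...
[Figure: the 1st execution follows branches b1, b2, b3 and the next input comes from Solve(b1∧b2∧¬b3); the 2nd execution's tree has branches b1, b2, b3, b4, b5 and the next input comes from Solve(b1∧¬b2); ...]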
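The branch-negation step that every search heuristic drives can be sketched as follows. This is a minimal illustration, not CREST's implementation: a path is a list of branch labels, and the hypothetical helper `negate_at` keeps the prefix before the selected branch and flips that branch. The selectors `dfs_pick` and `random_pick` are simplified stand-ins for real heuristics.

```python
import random

def negate_at(path, index):
    """Return the constraint to solve: the prefix of `path` before `index`,
    followed by the negation of the branch at `index`."""
    return path[:index] + ["¬" + path[index]]

def dfs_pick(path):
    """Simplified DFS-style choice: negate the deepest branch."""
    return len(path) - 1

def random_pick(path, rng=random):
    """Simplified random choice: negate any branch on the path."""
    return rng.randrange(len(path))

first_execution = ["b1", "b2", "b3"]
constraint = negate_at(first_execution, dfs_pick(first_execution))
print(" ∧ ".join(constraint))  # b1 ∧ b2 ∧ ¬b3
```

Heuristics differ only in how they choose `index`; the solver call that turns the constraint into a new input is omitted here.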
Motivation
● Using search heuristics alone is not sufficient.
– Code coverage converges in practical settings.
– 7h and 10 cores in parallel!!
Goal: Draw the red graph line
● Improving branch coverage in practical settings.
– Branch coverage ↑ → Bug-finding ↑
Key Observation
● Conventional concolic testing tracks all input values as symbolic.
– Input: α1 α2 α3 α4 α5 α6 α7 α8
[Figure: the search heuristic working over the fully symbolic input]
● Reducing the search space of concolic testing!
Key Ideas
● Template-Guided Concolic Testing.
– Template: a partially symbolized input.
– ex) ‘-’ ‘g’ α3 α4 ‘1’ ‘d’ α7 α8
– Selectively treat input values as symbolic.
– Replace unselected input values with concrete inputs.
● Online Learning.
– Automatically generating, using, and refining templates.
Effectiveness
● Achieve greater branch coverage in practical settings.
● Find real bugs in the latest versions of C programs.
– grep-3.1, sed-4.4, gawk-4.2.1
Template-Guided Concolic Testing
● Template
– A set of concrete values and their positions.
– (ex) Template1: {(3, ‘-’), (4, ‘S’), (6, ‘A’), (8, ‘L’)}
– Conventional Concolic Testing: α1 α2 α3 α4 α5 α6 α7 α8
– Template-Guided Concolic Testing: α1 α2 ‘-’ ‘S’ α5 ‘A’ α7 ‘L’
● Challenge: Finding effective templates!
– Choosing input values to track symbolically.
– Replacing the remaining inputs with appropriate concrete values.
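A template given as a set of (position, value) pairs can be applied to an input vector as sketched below. This is an illustrative sketch, not the tool's code: the function name `apply_template` and the string encoding of symbolic values (`"α1"` through `"α8"`) are assumptions, and positions are taken to be 1-based to match Template1.

```python
def apply_template(num_inputs, template):
    """Build a partially symbolized input.

    `template` maps 1-based positions to concrete values; every other
    position stays symbolic (written here as the string "αi").
    """
    return [template.get(i, f"α{i}") for i in range(1, num_inputs + 1)]

template1 = {3: "-", 4: "S", 6: "A", 8: "L"}
print(apply_template(8, template1))
# ['α1', 'α2', '-', 'S', 'α5', 'A', 'α7', 'L']
```

Only the four positions left symbolic contribute to the path constraints, which is how the template shrinks the search space.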
Template-Guided Concolic Testing with Online Learning
● Goal: perform concolic testing while learning useful templates online.
● Input: pgm (the program under test)
1. Conventional Concolic Testing
2. Sequential Pattern Mining
3. Pattern Ranking
4. Pattern to Template
5. Template-Guided Concolic Testing
1. Conventional Concolic Testing
● Collect effective test-cases during concolic testing.
– Collect only “effective test-cases”: collecting all test-cases can cause serious performance degradation.
[Figure: pgm with Input1 and Input2 (α1 α2 α3 α4 α5 α6 α7 α8) → Conventional Concolic Testing → Effective Test-Cases (rows such as “− X * *”, “− 2 R L”, “2 X ? #”, “− Y − 5”, “− − P −”, “− − s y”, “− − c l”, “3 − s h”) → Sequential Pattern Mining]
2. Sequential Pattern Mining
● Extract common patterns from the effective test-cases.
– Called “sequential pattern mining” in the data mining community.
– Use a recent algorithm: CloFAST (1).
– ex) 14,604 effective test-cases → 6,176 patterns (in 5 min)
[Figure: Effective Test-Cases → Sequential Pattern Mining → Candidate Patterns (P1, P2, P3, P4) → Pattern Ranking]
(1) Fabio Fumarola, Pasqua Fabiana Lanotte, Michelangelo Ceci, and Donato Malerba. 2016. CloFAST: closed sequential pattern mining using sparse and vertical id-lists. Knowledge and Information Systems.
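To make the mining step concrete, the sketch below enumerates frequent subsequences by brute force. It is not CloFAST (which mines closed patterns using sparse and vertical id-lists); the function `mine_patterns`, its parameters `min_support` and `max_len`, and the toy test-cases are all hypothetical, and the approach only suits tiny inputs.

```python
from collections import Counter
from itertools import combinations

def mine_patterns(test_cases, min_support, max_len=3):
    """Return subsequences (as tuples) that occur in at least `min_support`
    test-cases. Elements keep their order but need not be adjacent."""
    support = Counter()
    for tc in test_cases:
        seen = set()  # count each pattern at most once per test-case
        for n in range(1, max_len + 1):
            for idxs in combinations(range(len(tc)), n):
                seen.add(tuple(tc[i] for i in idxs))
        support.update(seen)
    return {p: c for p, c in support.items() if c >= min_support}

# Toy test-cases, made up for illustration.
cases = ["-X**", "-XRL", "2X?#"]
patterns = mine_patterns(cases, min_support=2, max_len=2)
print(patterns)
```

With these three inputs the surviving patterns are `-` (support 2), `X` (support 3), and the two-element pattern `- X` (support 2).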
3. Pattern Ranking
● Choose the top-k patterns from the candidates.
● The idea for ranking.
– Reflect the experience of previously evaluated patterns.
[Figure: Pattern Ranking takes (1) the Candidate Patterns (P1, P2, P3, P4) and (2) the Good and Bad Pattern sets (Good P: P1 = − X X −; Bad P: P2 = − s −), and outputs the top-K Patterns (Top 1 to Top 4) → Pattern to Template]
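The slides do not give the ranking formula, so the sketch below shows only one plausible way to "reflect the experience of previously evaluated patterns": score a candidate by its similarity to the closest good pattern minus its similarity to the closest bad pattern. The similarity measure (longest common subsequence, normalized by length) and the names `lcs_len`, `similarity`, and `rank_patterns` are assumptions, not the authors' method.

```python
def lcs_len(a, b):
    """Length of the longest common subsequence of sequences a and b."""
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0]
        for j, y in enumerate(b, start=1):
            cur.append(prev[j - 1] + 1 if x == y else max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]

def similarity(p, q):
    """Assumed similarity in [0, 1]: LCS length over the longer length."""
    return lcs_len(p, q) / max(len(p), len(q), 1)

def rank_patterns(candidates, good, bad, k):
    """Return the k candidates closest to good patterns and farthest from bad ones."""
    def score(p):
        g = max((similarity(p, q) for q in good), default=0.0)
        b = max((similarity(p, q) for q in bad), default=0.0)
        return g - b
    return sorted(candidates, key=score, reverse=True)[:k]

top = rank_patterns(["-s", "-X-", "-X--"], good=["-XX-"], bad=["-s-"], k=2)
print(top)  # ['-X--', '-X-']
```

Before any template has been evaluated both sets are empty, every score is 0, and the original candidate order is kept.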
4. Pattern to Template
● Transform the top-k patterns into templates.
(1). Collect the test-cases containing the pattern.
(2). Identify the positions where each value appears most frequently.
[Figure: each of the top-K Patterns (Top 1 to Top 4) is combined with the Effective Test-Cases that contain it to produce the Templates T1 and T2 → Template-Guided Concolic Testing]
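Steps (1) and (2) can be sketched as below. This is a minimal sketch under stated assumptions: a pattern "is contained" in a test-case when it occurs as a subsequence, the leftmost occurrence is used, and positions are reported 1-based. The names `match_positions` and `pattern_to_template` and the toy test-cases are hypothetical.

```python
from collections import Counter

def match_positions(pattern, test_case):
    """Leftmost 0-based positions where `pattern` occurs in `test_case`
    as a subsequence, or None if it does not occur."""
    positions, start = [], 0
    for value in pattern:
        try:
            idx = test_case.index(value, start)
        except ValueError:
            return None
        positions.append(idx)
        start = idx + 1
    return positions

def pattern_to_template(pattern, test_cases):
    """Map each pattern value to the position where it appears most often.

    If two values end up at the same position, the later one wins.
    """
    counters = [Counter() for _ in pattern]
    for tc in test_cases:                      # step (1): keep matching test-cases
        positions = match_positions(pattern, tc)
        if positions is None:
            continue
        for counter, pos in zip(counters, positions):
            counter[pos] += 1
    template = {}
    for value, counter in zip(pattern, counters):
        if counter:                            # step (2): most frequent position
            template[counter.most_common(1)[0][0] + 1] = value
    return template

cases = ["-X**", "-2RL", "a-X*", "-X--"]       # toy test-cases
print(pattern_to_template("-X", cases))        # {1: '-', 2: 'X'}
```

Here three test-cases contain the pattern, and in two of them `-` sits at position 1 and `X` at position 2, so those positions are fixed in the template.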
5. Template-Guided Concolic Testing
● Run concolic testing with templates.
– Evaluate the quality of each template.
– Accumulate its pattern into the good or bad pattern set.
– Good P: # of branches covered by T1 > threshold
– Bad P: # of branches covered by T2 ≤ 1
[Figure: Templates T1, T2 → Template-Guided Concolic Testing; the evaluated patterns are added to the sets used by Pattern Ranking (Good P: P1 = − X X −, + P3 = − X − −; Bad P: P2 = − s −, + P4 = − X −), and the loop returns to Conventional Concolic Testing]
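The good/bad rule (good if the template covers more branches than a threshold, bad if it covers at most one) can be written down directly. In this sketch the function name `update_knowledge`, the threshold of 10, and the coverage counts are made-up illustration values; patterns that fall between the two conditions are simply left unclassified.

```python
def update_knowledge(pattern, covered, threshold, good, bad):
    """Record `pattern` as good or bad from the branches its template covered.

    Good: covered > threshold. Bad: covered <= 1. Otherwise: no change.
    """
    if covered > threshold:
        good.add(pattern)
    elif covered <= 1:
        bad.add(pattern)

good, bad = {"-XX-"}, {"-s-"}                  # knowledge so far
update_knowledge("-X--", covered=40, threshold=10, good=good, bad=bad)
update_knowledge("-X-", covered=1, threshold=10, good=good, bad=bad)
update_knowledge("-dX", covered=5, threshold=10, good=good, bad=bad)
print(sorted(good), sorted(bad))
```

After these calls the good set holds `-XX-` and `-X--`, the bad set holds `-s-` and `-X-`, and `-dX` is in neither.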
Template-Guided Concolic Testing with Online Learning
● Can select more useful patterns based on increased knowledge.
● Knowledge
– Good P: P1: − X X −, P3: − X − −, …, + P19: − d u
– Bad P: P2: − s −, P4: − X −, …, + P20: − k p
Experiments
● Implemented in CREST.
● Used 5 open-source C programs.

Program       # Total branches   LOC
vim-5.7       35,464             165K
gawk-3.0.3    8,038              30K
grep-2.2      3,836              15K
sed-1.17      2,650              9K
tree-1.6.0    1,440              4K

● Compared with conventional concolic testing.
– CGS, CFDS, Random, Generational, DFS, ParaDySE.
● Applied our technique on the best search heuristic.
– T-CGS, T-CFDS.
The Same Evaluation Settings
● The same testing budget.
– vim: 70h
– gawk, grep, sed, tree: 7h
● The same cores.
– Using 10 cores in parallel.
● The same initial inputs.
Effectiveness
● Accumulated branch coverage (5 red lines).
– T-CGS exclusively covered 833 branches.
Effectiveness
● Bug-finding.
– The five bug inputs for the latest versions of C programs.
– Our technique (all inputs) > conventional (2/5 inputs)
● Demo: attacking our lab server.
– All the memory of our lab server will be exhausted.
Learned Patterns
● Top 5 good and bad patterns for increasing branch coverage.
– Good and bad patterns are hardly distinguishable.
– Manually selecting good patterns is highly tricky.
Tool: ConTest
● Make our tool publicly available.
● Name our tool via the template-guided approach.
– Challenge: Choosing input values to track symbolically.
– Input: C o n c o l i c T e s t i n g
– Learned Template: {(0, ‘C’), (1, ‘o’), (2, ‘n’), (9, ‘T’), (10, ‘e’), (11, ‘s’), (12, ‘t’)}
– Result: C o n T e s t
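The naming joke can be checked mechanically. Reading the learned template as 0-based positions into the string "Concolic Testing" (an assumption about the indexing, though it is the only reading under which every pair matches), the concrete values spell the tool's name. The variable names are illustrative.

```python
learned_template = {0: "C", 1: "o", 2: "n", 9: "T", 10: "e", 11: "s", 12: "t"}
original = "Concolic Testing"

# Every (position, value) pair of the template agrees with the original string.
consistent = all(original[i] == ch for i, ch in learned_template.items())

# Concatenating the concrete values in position order gives the tool name.
name = "".join(learned_template[i] for i in sorted(learned_template))
print(consistent, name)  # True ConTest
```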
Thank You
URL: https://github.com/kupl/ConTest