Software Design Rules
Monica Lam
Stanford University
Joint work with: Sudheendra Hangal, David Heine, Ben Livshits, Michael Martin, John Whaley

Software is Full of Errors
Error rate: 1-4.5 errors per 1000 lines
Windows 2000
  35M LOC, 63000 known bugs at time of release
  2 per 1000 lines
Large consumer software
  Formal specification & verification infeasible

Buffer Overruns
A buffer access must stay within bounds
Buffer overruns are responsible for over 50% of major vulnerabilities
  MS Blaster, Slammer, Code Red, Nimda, …, Internet Worm (1988)
  Responsible for damages in the billions

Memory Leaks
All unused memory should be freed once and only once
Memory exhaustion may cause long-running programs to fail.

SQL Injection Errors
[Diagram: user, web server front end, database. The user asks “Give me Bob’s credit card #” or “Delete all records”]
User may not supply SQL queries to databases directly
One of the top ten vulnerabilities

Happy-go-lucky SQL Query
User supplies: name, password
SELECT UserID, CreditCard FROM Records WHERE
Name = ' + name + ' AND PW = ' + password + '

Fun with SQL
“--” means “the rest are comments” in SQL
SELECT UserID, CreditCard FROM Records WHERE:
Name = 'bob' AND PW = 'foo'
Name = 'bob'--' AND PW = 'x'
Name = 'bob' or 1=1--' AND PW = 'x'
Name = 'bob'; DROP Records--' AND PW = 'x'
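The injected queries above can be reproduced in a few lines of Java. This is a minimal sketch, not code from the talk: the class `QueryDemo` and its method names are illustrative assumptions. `buildQuery` concatenates user input the way the slide does; `lookup` shows the usual JDBC alternative, in which `PreparedStatement` binds the values separately from the SQL text.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

public class QueryDemo {
    /** Vulnerable: user input becomes part of the SQL text (hypothetical helper). */
    public static String buildQuery(String name, String password) {
        return "SELECT UserID, CreditCard FROM Records WHERE Name = '" + name
             + "' AND PW = '" + password + "'";
    }

    /** Fixed SQL text; the two '?' placeholders are filled in by the driver. */
    public static final String SAFE_QUERY =
        "SELECT UserID, CreditCard FROM Records WHERE Name = ? AND PW = ?";

    /** Safer variant: values are bound as data and never parsed as SQL. */
    public static ResultSet lookup(Connection conn, String name, String password)
            throws SQLException {
        PreparedStatement ps = conn.prepareStatement(SAFE_QUERY);
        ps.setString(1, name);
        ps.setString(2, password);
        return ps.executeQuery();
    }
}
```

Passing the name `bob'--` to `buildQuery` yields the second query on the slide: everything after the comment marker, including the password check, is ignored by the database.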

Design Rules
Buffer overruns, memory leaks, SQL injections
Same error is often repeated many times
May be specific to a language, a class of applications, a program
Important: may be critical to security
Succinct: governs many lines of code

General Practice
Design rules are implicit
Violations of design rules are rampant
Tools: Purify, grep, Emacs, program environments

Leverage Computer Power
Courtesy: Intel

Programs Don’t Grow Exponentially!
[Chart: Size of Microsoft Windows in millions of lines of code (0 to 50) by year (1990 to 2005); points labeled NT 3.1, 95, 98, NT 5.0, 2000, XP]

This Talk
New generation of “Computer-Aided Programming Tools” to enforce critical software design rules

Custom Design Rule Checkers
Intrinsa, Coverity: C checkers
Built-in rules for operating systems
Relatively simple analysis
Unsound
Found thousands of critical errors in Windows, Linux, BSD, …

New-Generation Tools
[Diagram labels: Design Rules; Advanced Static Analysis; Dynamic Detection & Recovery; User-Supplied, Application-Specific; Automatic Extraction]

PQL: a Program Query Language

PQL: a Program Query Language
[Diagram labels: Design Rules; Advanced Static Analysis; Dynamic Detection & Recovery; User-Supplied, Application-Specific; Automatic Extraction]

PQL Query
Pattern: illegal sequences of operations on related objects
  Looks like the simplest code excerpt with the pattern
Action: print out result, halt program, or recover

SQL Injection
Direct flow:
x = req.getParameter();
stmt.executeQuery(x);
Indirect flow through assignments and fields:
o1 = req.getParameter();
formal = o1;
o2.f = formal;
o3.g = o2.f;
stmt.executeQuery(o3.g);
[Diagram labels: getParameter, x, executeQuery; caption “PQL:”]

Basic Question
Do p1 and p2 point to the same object?
Undecidable!
[Diagram labels: getParameter, x, executeQuery, with pointers p1 and p2]

Dynamic Checker
Do p1 and p2 point to the same object?
Instrument Java byte codes
  getParameter: record ID of p1’s pointee
  executeQuery: check if p2’s pointee has been recorded
[Diagram labels: getParameter, x, executeQuery, with pointers p1 and p2]
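The two instrumentation hooks can be sketched as follows. This is an illustration under stated assumptions, not the talk's instrumenter: the class `TaintChecker` and the hook names are hypothetical, and object identity stands in for the "ID of the pointee". An identity-based set is used so that two distinct strings with equal contents are not confused.

```java
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Set;

public class TaintChecker {
    // Identity-based set: membership is decided by ==, not equals().
    private final Set<Object> recorded =
        Collections.newSetFromMap(new IdentityHashMap<Object, Boolean>());

    /** Hook for getParameter: remember the returned object, then pass it through. */
    public <T> T onGetParameter(T value) {
        recorded.add(value);
        return value;
    }

    /** Hook for executeQuery: true if the argument is a recorded object. */
    public boolean onExecuteQuery(Object arg) {
        return recorded.contains(arg);
    }
}
```

Because only identities are compared, this sketch follows an object through assignments and fields but not through operations that create a new object, such as string concatenation.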

Static Checker
Do p1 and p2 point to the same object?
Pointer alias analysis
[Diagram labels: getParameter, x, executeQuery, with pointers p1 and p2]

Pointer Alias Analysis
[Whaley & Lam, PLDI 2004]

Pointer Analysis
h1: v1 = new Object();
h2: v2 = new Object();
v1.f = v2;
v3 = v1.f;
Input Relations
  vPointsTo(v1,h1)
  vPointsTo(v2,h2)
  Store(v1,f,v2)
  Load(v1,f,v3)
Output Relations
  hPointsTo(h1,f,h2)
  vPointsTo(v3,h2)
[Diagram: v1 points to h1, v2 points to h2, field f of h1 points to h2, v3 points to h2]

Inference Rule in Datalog
Stores: v1.f = v2;
hPointsTo(h1, f, h2) :- Store(v1, f, v2), vPointsTo(v1, h1), vPointsTo(v2, h2).
[Diagram: v1 points to h1, v2 points to h2, field f of h1 points to h2]

Pointer Alias Analysis
Specified by 5 Datalog rules
  Creation sites
  Assignments
  Stores
  Loads
  Type filter
Apply rules until they converge
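"Apply rules until they converge" can be sketched as a naive fixpoint loop. This is an assumption-laden illustration, not the BDD-based solver from the talk: the class `PointsTo` is hypothetical, creation sites are supplied as initial `vPointsTo` facts, the assignment and load rules are written in the standard form (only the store rule appears on the slides), and the type filter is omitted.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class PointsTo {
    public final Set<List<String>> vPointsTo = new HashSet<>(); // (v, h)
    public final Set<List<String>> hPointsTo = new HashSet<>(); // (h1, f, h2)
    public final List<String[]> assigns = new ArrayList<>();    // {dst, src}: dst = src
    public final List<String[]> stores = new ArrayList<>();     // {base, f, src}: base.f = src
    public final List<String[]> loads = new ArrayList<>();      // {base, f, dst}: dst = base.f

    /** Re-applies the rules until no new fact is derived. */
    public void solve() {
        boolean changed = true;
        while (changed) {
            changed = false;
            for (List<String> p : new ArrayList<>(vPointsTo)) {
                String v = p.get(0), h = p.get(1);
                // Assignments: vPointsTo(dst, h) :- Assign(dst, v), vPointsTo(v, h).
                for (String[] a : assigns)
                    if (a[1].equals(v)) changed |= vPointsTo.add(Arrays.asList(a[0], h));
                // Stores: hPointsTo(h, f, h2) :- Store(v, f, v2), vPointsTo(v, h), vPointsTo(v2, h2).
                for (String[] s : stores)
                    if (s[0].equals(v))
                        for (List<String> q : new ArrayList<>(vPointsTo))
                            if (q.get(0).equals(s[2]))
                                changed |= hPointsTo.add(Arrays.asList(h, s[1], q.get(1)));
                // Loads: vPointsTo(dst, h2) :- Load(v, f, dst), vPointsTo(v, h), hPointsTo(h, f, h2).
                for (String[] l : loads)
                    if (l[0].equals(v))
                        for (List<String> hp : new ArrayList<>(hPointsTo))
                            if (hp.get(0).equals(h) && hp.get(1).equals(l[1]))
                                changed |= vPointsTo.add(Arrays.asList(l[2], hp.get(2)));
            }
        }
    }
}
```

Seeding it with the input relations from the Pointer Analysis slide derives the two output relations shown there, hPointsTo(h1,f,h2) and vPointsTo(v3,h2).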

Method Invocations
Context-insensitive analysis is imprecise
Unrealizable paths
Object id(Object x) {
  return x;
}
a = id(b); c = id(d);

Context Sensitivity
Context sensitivity is important for precision.
Conceptually give each caller its own copy.
Object id(Object x) { return x; }
Object id(Object x) { return x; }
a = id(b); c = id(d);

Cloning-Based Analysis
Simple brute force technique.
Clone every path through the call graph.
Run context-insensitive algorithm on expanded call graph.
The catch: exponential blowup

Cloning is exponential!

Recursion
Actually, cloning is unbounded in the presence of recursive cycles.
Technique: We treat all methods within a strongly-connected component as a single node.

Recursion
[Diagram: a call graph with nodes A through G, shown next to its expanded version in which E, F, and G appear multiple times]

Top 20 Sourceforge Java Apps: Number of Clones
[Chart: number of clones (log scale, 10^0 to 10^16) versus size of program in variable nodes (log scale, 1000 to 1000000)]

Cloning is infeasible (?)
Typical large program has ~10^14 paths
If you need 1 byte to represent a clone:
  256 terabytes of storage
  > 12 times size of Library of Congress
  1GB DIMMs: $98.6 million
    Power: 96.4 kilowatts (128 homes)
  300 GB hard disks: 939 x $250 = $234,750
    Time to read sequentially: 70.8 days

BDD comes to the rescue
Many similarities across contexts.
Many copies of nearly-identical results.
BDDs (binary decision diagrams) can represent large sets of redundant data efficiently.

Static Checker Generation
PQL Query (2 lines) -> Datalog (10 lines) -> BDD code (1000 lines)

BDD: Binary Decision Diagrams

Call Graph Relation
Call graph expressed as a relation.
Five edges:
  Calls(A,B)
  Calls(A,C)
  Calls(A,D)
  Calls(B,D)
  Calls(C,D)
[Diagram: call graph with nodes A, B, C, D]

Call Graph Relation
Relation expressed as a binary function.
A=00, B=01, C=10, D=11
x1 x2 x3 x4   f
0  0  0  0    0
0  0  0  1    1
0  0  1  0    1
0  0  1  1    1
0  1  0  0    0
0  1  0  1    0
0  1  1  0    0
0  1  1  1    1
1  0  0  0    0
1  0  0  1    0
1  0  1  0    0
1  0  1  1    1
1  1  0  0    0
1  1  0  1    0
1  1  1  0    0
1  1  1  1    0
[Diagram: call graph with nodes A (00), B (01), C (10), D (11)]
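The encoding can be checked mechanically. The sketch below is illustrative only: the class `CallsRelation` is hypothetical, and it assumes that x1x2 encodes the caller and x3x4 the callee, which is consistent with the leaf values in the decision diagrams that follow.

```java
public class CallsRelation {
    // Two-bit codes from the slide: A=00, B=01, C=10, D=11 (index into NAMES).
    static final String NAMES = "ABCD";
    static final String[] EDGES = {"AB", "AC", "AD", "BD", "CD"};

    /** f(x1,x2,x3,x4): true iff the method coded x1x2 calls the method coded x3x4. */
    public static boolean f(int x1, int x2, int x3, int x4) {
        char caller = NAMES.charAt(2 * x1 + x2);
        char callee = NAMES.charAt(2 * x3 + x4);
        for (String e : EDGES) {
            if (e.charAt(0) == caller && e.charAt(1) == callee) return true;
        }
        return false;
    }

    /** The 16 outputs of f, for rows 0000, 0001, ..., 1111 in order. */
    public static String truthTable() {
        StringBuilder sb = new StringBuilder();
        for (int row = 0; row < 16; row++) {
            boolean bit = f((row >> 3) & 1, (row >> 2) & 1, (row >> 1) & 1, row & 1);
            sb.append(bit ? '1' : '0');
        }
        return sb.toString();
    }
}
```

Exactly five of the sixteen rows evaluate to 1, one per call edge.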

Binary Decision Diagrams
Graphical encoding of a truth table.
[Diagram: decision tree testing x1, x2, x3, x4 in order, with a 0 edge and a 1 edge out of each node; the two subtrees carry the leaves 0 1 1 1 0 0 0 1 and 0 0 0 1 0 0 0 0]

Binary Decision Diagrams
Collapse redundant nodes.
[Diagrams, four steps: the leaves are merged into a single 0 and a single 1, then duplicate x4 nodes and duplicate x3 nodes are merged]

Binary Decision Diagrams
Eliminate unnecessary nodes.
[Diagrams, two steps: nodes that are not needed are removed, leaving one x1 node, two x2 nodes, two x3 nodes, and one x4 node]
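The two reduction steps can be sketched in code. This is a toy illustration, not the BDD library used in the talk: the class `TinyBdd` and its methods are hypothetical, and the truth table is the Calls relation with x1x2 assumed to encode the caller. The helper `mk` applies both rules: it skips a node whose two children are identical, and it reuses an existing node with the same variable and children.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class TinyBdd {
    // Node ids 0 and 1 are the terminals; other nodes are {variable, low child, high child}.
    private final List<int[]> nodes = new ArrayList<>();
    private final Map<String, Integer> unique = new HashMap<>();

    public TinyBdd() {
        nodes.add(new int[] {0, 0, 0}); // terminal 0
        nodes.add(new int[] {0, 1, 1}); // terminal 1
    }

    private int mk(int var, int low, int high) {
        if (low == high) return low;            // eliminate an unnecessary node
        String key = var + ":" + low + ":" + high;
        Integer existing = unique.get(key);
        if (existing != null) return existing;  // collapse a redundant (duplicate) node
        nodes.add(new int[] {var, low, high});
        unique.put(key, nodes.size() - 1);
        return nodes.size() - 1;
    }

    private int build(boolean[] truth, int from, int length, int var) {
        if (length == 1) return truth[from] ? 1 : 0;
        int half = length / 2;
        int low = build(truth, from, half, var + 1);          // rows where var = 0
        int high = build(truth, from + half, half, var + 1);  // rows where var = 1
        return mk(var, low, high);
    }

    /** Builds the reduced diagram for a table of 2^n rows; x1 is the most significant bit. */
    public int build(boolean[] truth) {
        return build(truth, 0, truth.length, 1);
    }

    /** Follows 0/1 edges from the given node; x[0] is x1, x[1] is x2, and so on. */
    public boolean eval(int node, boolean[] x) {
        while (node > 1) {
            int[] n = nodes.get(node);
            node = x[n[0] - 1] ? n[2] : n[1];
        }
        return node == 1;
    }

    public int internalNodeCount() {
        return nodes.size() - 2;
    }

    /** Truth table of the Calls relation, rows 0000 through 1111. */
    public static boolean[] callsTable() {
        String rows = "0111000100010000";
        boolean[] t = new boolean[16];
        for (int i = 0; i < 16; i++) t[i] = rows.charAt(i) == '1';
        return t;
    }
}
```

For this table the sketch ends with six decision nodes (one for x1, two for x2, two for x3, one for x4) instead of the fifteen in the full decision tree.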

Binary Decision Diagrams
Represent tiny and huge relations compactly
Size depends on redundancy
  Similar contexts have similar numberings
  Variable ordering in the BDD
    10 minutes or “runs out of memory”
    Active machine learning algorithm

Expanded Call Graph
[Diagram: call graph with nodes A through H, arranged as A; B, C, D; E; F, G; H, with edge labels 0, 1, 2. In the expanded version E, F, and G each appear 3 times and H appears 6 times]

Numbering Clones
[Diagram: the same call graph and its expansion, with context numbers on the edges: 0 on the edges near A, 0, 1, and 2 on the edges into E, 0-2 on the edges into F and G, 0-2 and 3-5 on the two edges into H; the six clones of H are numbered 0 through 5]
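The numbering can be sketched as follows. This is an illustration under assumptions, not the algorithm as implemented in the talk: the class `CloneNumbering` is hypothetical, the edge list is my reading of the diagram (A calls B, C, D; those call E; E calls F and G; both call H), and the edges must be listed so that every edge into a method comes before any edge out of it.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class CloneNumbering {
    // Assumed call edges {caller, callee}, in topological order of the callers.
    public static final String[][] CALLS = {
        {"A", "B"}, {"A", "C"}, {"A", "D"},
        {"B", "E"}, {"C", "E"}, {"D", "E"},
        {"E", "F"}, {"E", "G"},
        {"F", "H"}, {"G", "H"}
    };

    /** Number of contexts (clones) per method: the number of call paths from the root. */
    public static Map<String, Long> contextCounts(String root, String[][] calls) {
        Map<String, Long> count = new LinkedHashMap<>();
        count.put(root, 1L);
        for (String[] e : calls) {
            count.merge(e[1], count.get(e[0]), Long::sum);
        }
        return count;
    }

    /** For each edge, the contiguous range of callee contexts it is assigned. */
    public static List<String> edgeRanges(String root, String[][] calls) {
        Map<String, Long> count = new LinkedHashMap<>();
        count.put(root, 1L);
        List<String> ranges = new ArrayList<>();
        for (String[] e : calls) {
            long callerContexts = count.get(e[0]);
            long start = count.getOrDefault(e[1], 0L);
            count.put(e[1], start + callerContexts);
            ranges.add(e[0] + "->" + e[1] + ": " + start + "-" + (start + callerContexts - 1));
        }
        return ranges;
    }
}
```

Giving each edge a contiguous block of callee contexts keeps the numbers of related contexts close together, which fits the talk's remark that similar contexts have similar numberings.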

Experience
Context-sensitive points-to analysis
  800K byte codes in less than 20 minutes
PQL suitable for many known error patterns
  SQL injection
  Resource leakage
  Persistence object management
Applied to 6 large programs unknown to us
  Found 44 critical errors easily

New-Generation Tools
[Diagram labels: Design Rules; Advanced Static Analysis; Dynamic Detection & Recovery; User-Supplied, Application-Specific; Automatic Extraction; PQL]
50
Automatic Design Rule Extraction
Too many design rules to specifyPrinciple: Most of the code is correctInconsistency = Errors

Clouseau: a static memory leak detector

Ownership Model
Owner pointer
  Obligated to delete object, or pass ownership to another pointer
Every object has an owner pointer
No memory leaks, no double deletes

Past Experience
Decorate each function parameter with “own” or “not-own”
Not enforced: many mistakes

Assignment Statement
a = new int; b = a; delete a;
a = new int; b = a; delete b;
[Diagrams: in both cases a and b point to the same int]
For the statement b = a:
  own(b), own(a): ownership before the statement
  own'(b), own'(a): ownership after the statement
  own(b) = 0
  own(a) = own'(a) + own'(b)
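The constraints for b = a can be explored by brute force. The sketch below is an illustration, not Clouseau's solver: the class `OwnershipConstraint` is hypothetical, unprimed values are taken to mean ownership before the assignment and primed values ownership after it, and a later `delete p` is assumed to require that p be the owner at that point.

```java
import java.util.ArrayList;
import java.util.List;

public class OwnershipConstraint {
    /**
     * All 0/1 assignments {own(a), own(b), own'(a), own'(b)} that satisfy
     * own(b) = 0 and own(a) = own'(a) + own'(b).
     */
    public static List<int[]> solutions() {
        List<int[]> result = new ArrayList<>();
        for (int a = 0; a <= 1; a++)
            for (int b = 0; b <= 1; b++)
                for (int a2 = 0; a2 <= 1; a2++)
                    for (int b2 = 0; b2 <= 1; b2++)
                        if (b == 0 && a == a2 + b2) result.add(new int[] {a, b, a2, b2});
        return result;
    }

    /** Keeps the solutions in which the pointer deleted afterwards ('a' or 'b') is the owner. */
    public static List<int[]> solutionsWithDelete(char deleted) {
        List<int[]> result = new ArrayList<>();
        for (int[] s : solutions()) {
            int ownsAfter = (deleted == 'a') ? s[2] : s[3];
            if (ownsAfter == 1) result.add(s);
        }
        return result;
    }
}
```

Under these assumptions each of the two code fragments has exactly one consistent assignment: with `delete a`, ownership stays with a; with `delete b`, it moves to b.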

Clouseau
Automatically infers function signatures
  Based on “new” and “delete”
  Constraint: there is always 1 owning pointer
  Inconsistency: error
A 125K-line commercial program
  50 lines of user specification on container structure (generic data types)
  Root-causes 82% of memory leaked dynamically
  Found many additional errors
[Heine & Lam, PLDI 2003]

New-Generation Tools
[Diagram labels: Design Rules; Advanced Static Analysis; Dynamic Detection & Recovery; User-Supplied, Application-Specific; Automatic Extraction; Clouseau; DIDUCE]

DIDUCE
Dynamic Invariant Deduction ∪ Checking Engine
(deduces, but sometimes incorrectly)

Motivation
Difficulty in finding root cause of complex errors
DIDUCE can potentially pinpoint the culprit automatically

Example
For line # 1234,
  Times 1-1000000: 0 <= X <= 3
  Time 1000001: X = 0xa3d025ef
Aha! Must be an error

DIDUCE Design
Monitors every memory access!
Learns signature for the data accessed by each bytecode from correct runs
Reports anomalies observed
Anomalies signal cause of errors
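One simple way to learn a signature and flag anomalies is to track which bits of a value have never changed. The sketch below is a simplified illustration and an assumption on my part, not DIDUCE's actual implementation; the class `BitInvariant` is hypothetical, and one instance would be kept per monitored program point.

```java
public class BitInvariant {
    private boolean seen = false;
    private int first;     // first value observed at this program point
    private int mask = ~0; // bits that have held the same value in every observation

    /**
     * Records a value. Returns true if it differs from earlier values in a bit
     * that had been invariant so far; the invariant is then relaxed to admit it.
     */
    public boolean observe(int value) {
        if (!seen) {
            seen = true;
            first = value;
            return false;
        }
        int violated = (value ^ first) & mask;
        mask &= ~violated;
        return violated != 0;
    }
}
```

With the example from the previous slide, training on values 0 through 3 leaves only the two low bits free, so a later 2 passes silently while 0xa3d025ef is reported. Reports raised during training runs would simply be ignored.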

Experience
Pinpointed line of code causing errors
  Errors in IMAP servers: change triggers a latent error elsewhere
  Memory subsystem simulator: otherwise unknown errors
Reported corner cases of interest

Lessons
Dumb but tireless
  High overhead but pain-free
  Unaffected by programmers’ misconceptions
Finds many serendipitous design rules
  Need to trigger just 1 of the rules
[Hangal & Lam, ICSE 02]

Summary
[Diagram labels: Design Rules; Advanced Static Analysis; Dynamic Detection & Recovery; User-Supplied, Application-Specific; Automatic Extraction]

Conclusions
New tools that exploit computing power to manage software complexity
New advanced analyses
  Pointer alias analysis, binary decision diagrams
Analysis available to programmers
  PQL: easy to express execution patterns
Automatic design rule extraction
  Inconsistencies indicate errors