kmax: finding all configurations of kbuild makefiles statically · 2020-04-21 · what kmax offers...
Kmax: Finding All Configurations of Kbuild Makefiles Statically
Paul Gazzillo
Stevens Institute
ESEC/FSE 2017 Paderborn, Germany
Let’s Talk About Makefiles
Variability in Linux Kbuild
• Kbuild is Linux’s Makefile-based build system
• Linux has 14,000+ configuration options
  • 2^14,000 configurations in the worst case
• 1,985 Kbuild Makefiles
• 29,525 SLoC
• Controlling 19,651 C files
What Kmax Offers
• Lack tools to reason about Makefile variability
• Simple questions are hard
  • What C files comprise the Linux kernel?
• Kmax is a static analysis of Kbuild Makefiles
• Finds all C files and their configurations
  • 1-2k more C files compared to previous heuristics
• Takes minutes
• Finds dead code
Makefile Syntax
• Variable expansion: $(CONFIG_A)
  • Expands to runtime value of CONFIG_A
• String concatenation: obj-$(CONFIG_B)
  • “obj-” plus the value of CONFIG_B
• String values are not quoted
• All values are strings
• In Linux, boolean inputs are “y” or undefined
  • Simulates booleans with string values
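The expansion and concatenation rules above can be mimicked in a few lines of Python. This is only a sketch: the helper name `expand` is hypothetical, and it handles plain, non-nested `$(NAME)` references, not the full Make language.

```python
import re

def expand(text, env):
    """Replace each $(NAME) with env[NAME]; undefined names expand to ""."""
    return re.sub(r"\$\((\w+)\)", lambda m: env.get(m.group(1), ""), text)

# CONFIG_B set to "y": the concatenation yields the variable name obj-y.
name_on = expand("obj-$(CONFIG_B)", {"CONFIG_B": "y"})
# CONFIG_B undefined: the reference expands to nothing, leaving obj-.
name_off = expand("obj-$(CONFIG_B)", {})
print(name_on, name_off)
```

Because every value is a string and an undefined variable is just the empty string, a boolean option decides which variable name the concatenation produces.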
    obj-y := fork.o
    ifeq ($(CONFIG_A),y)
    BITS := 32
    else
    BITS := 64
    endif
    obj-$(CONFIG_B) += probe_$(BITS).o
    built-in.o: $(obj-y)
            # do compilation
• Takes CONFIG_A and CONFIG_B as boolean inputs
  • “y” or undefined
• Sets obj-y to the set of object files, conditioned on inputs
• Compiles and links C files in obj-y
• Assignment: obj-y gets fork.o to compile
• Conditional: value of BITS depends on CONFIG_A
Callout on “y”: Kbuild-speak for Boolean “true”
• Concatenation: right-hand side computed from BITS, implicitly depends on CONFIG_A
• Runtime variable name construction:
  • Variable to assign depends on value of CONFIG_B
• Appends probe_*.o to either obj-y or obj-
  • Kbuild won’t build the files in obj-
• Challenge for static approaches
Also a string!
What C files does this build and when?
Compute All Configurations
• Take all combinations of CONFIG_A and CONFIG_B
• Exponential in number of configuration options
• Has duplicate information
| CONFIG_A | CONFIG_B | obj-y | obj- |
|---|---|---|---|
| on | on | fork.o probe_32.o | (undefined) |
| on | off | fork.o | probe_32.o |
| off | on | fork.o probe_64.o | (undefined) |
| off | off | fork.o | probe_64.o |
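The table can be reproduced by brute force. The sketch below hand-translates the example Makefile into a hypothetical Python function `run_makefile` and evaluates it once per configuration; it is an illustration of the enumeration idea, not something Kmax or Kbuild does.

```python
from itertools import product

def run_makefile(config_a, config_b):
    """Evaluate the example Makefile for one concrete configuration."""
    variables = {"obj-y": ["fork.o"]}
    bits = "32" if config_a else "64"          # the ifeq/else on CONFIG_A
    target = "obj-y" if config_b else "obj-"   # obj-$(CONFIG_B)
    variables.setdefault(target, []).append(f"probe_{bits}.o")
    return variables

# One run per combination of the two options: 2^2 = 4 rows.
table = {}
for a, b in product([True, False], repeat=2):
    table[(a, b)] = run_makefile(a, b)

for (a, b), variables in table.items():
    print(a, b, variables["obj-y"], variables.get("obj-", "(undefined)"))
```

With n options the loop runs 2^n times, which is why this approach cannot scale to thousands of options, and fork.o is recomputed in every row even though it never varies.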
Solution Approaches?
• Brute force
  • Too many possible configurations
• Dynamic analysis
  • GOLEM heuristically chooses configurations to run
  • Still too many configurations
• grep
  • Runtime string manipulation limits effectiveness
• Parsing
  • Syntax is not enough, need semantics
  • KBuildMiner is an example of the parsing approach
Key Insight
Paths are configurations. A static analysis can collect configuration information if it is path-sensitive and has a precise string abstraction.
Kmax’s Static Analysis
• Static analysis analyzes all paths
  • Paths are configurations
• Path abstraction treats configurations symbolically
• String abstraction enumerates concrete values
• Scalability and precision
  • Efficient symbolic representation
  • Aggressively trim infeasible paths
Path Abstraction
• Boolean expressions of configuration options
  • Symbolic, e.g., CONFIG_B ∧ ¬CONFIG_A
• Implemented with binary decision diagrams (BDDs)
  • Easy to join and deduplicate paths
  • Easy to trim infeasible paths
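To make the join and trim operations concrete, here is a small sketch that stands in for BDDs with something simpler: each path condition is the explicit set of configurations that satisfy it. This substitute is an assumption made for illustration; it is exponential in the number of options, which is exactly what a BDD avoids. The names `OPTIONS`, `var`, and `neg` are hypothetical.

```python
from itertools import product

OPTIONS = ("CONFIG_A", "CONFIG_B")
# Every configuration as a tuple of booleans; this set is the path "True".
ALL = frozenset(product([False, True], repeat=len(OPTIONS)))

def var(name):
    """Condition that holds exactly when option `name` is enabled."""
    i = OPTIONS.index(name)
    return frozenset(c for c in ALL if c[i])

def neg(cond):
    """Logical negation: every configuration not in `cond`."""
    return ALL - cond

a, b = var("CONFIG_A"), var("CONFIG_B")
path = b & neg(a)          # CONFIG_B ∧ ¬CONFIG_A
joined = (b & a) | path    # joining both branches collapses to CONFIG_B
infeasible = a & neg(a)    # empty: no configuration reaches this path
print(joined == b, len(infeasible))
```

Like a BDD, the set form is canonical, so two equivalent conditions compare equal (deduplication) and an infeasible path is recognizable as the empty set.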
String Abstraction
• Enumerate all possible concrete strings
• Relies on path abstraction to be efficient
• For example, one string may be
    [ “probe_32.o” if BITS==32 ∧ CONFIG_B ,
      “probe_64.o” if BITS==64 ∧ CONFIG_B ]
• Akin to conditional symbol tables
  • Previous variability-aware approaches [Garrido & Johnson ’05, Kaestner et al ’11, Gazzillo & Grimm ’12, Walkingshaw et al ’14, Nguyen et al ’14, Meinicke et al ’16]
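A conditional string can be sketched as a list of (string, condition) pairs. In the sketch below, conditions are again explicit sets of configurations rather than BDDs, and the helpers `literal`, `concat`, and `restrict` are hypothetical names. Since BITS is “32” exactly when CONFIG_A holds in the running example, the slide's `BITS==32` appears here as the condition `A`.

```python
from itertools import product

# Configurations are tuples (CONFIG_A, CONFIG_B); ALL is the path "True".
ALL = frozenset(product([False, True], repeat=2))
A = frozenset(c for c in ALL if c[0])
B = frozenset(c for c in ALL if c[1])

def literal(text):
    """A string that has the same value in every configuration."""
    return [(text, ALL)]

def concat(left, right):
    """Pairwise concatenation, dropping pairs whose conditions never co-occur."""
    return [(s + t, p & q) for s, p in left for t, q in right if p & q]

def restrict(values, path):
    """Keep only the part of each value that is reachable under `path`."""
    return [(s, c & path) for s, c in values if c & path]

bits = [("32", A), ("64", ALL - A)]
probe = restrict(concat(concat(literal("probe_"), bits), literal(".o")), B)
print([s for s, _ in probe])
```

Enumeration stays small only because infeasible combinations are dropped as soon as their condition becomes empty, which is the dependence on the path abstraction noted above.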
1 obj-y := fork.o
2 ifeq ($(CONFIG_A),y)
3 BITS := 32
4 else
5 BITS := 64
6 endif
7 obj-$(CONFIG_B) += probe_$(BITS).o
Before line 1:
• Current path: True (all configurations)
• Symbol table: (empty)
After line 1:
• Current path: True (all configurations)
• Symbol table:
    obj-y = [ “fork.o” if True ]
Line 2, entering the ifeq branch:
• Current path: CONFIG_A
• Symbol table:
    obj-y = [ “fork.o” if True ]
After line 3:
• Current path: CONFIG_A
• Symbol table:
    obj-y = [ “fork.o” if True ]
    BITS = [ “32” if CONFIG_A ]
Line 4, entering the else branch:
• Current path: ¬CONFIG_A
• Symbol table:
    obj-y = [ “fork.o” if True ]
    BITS = [ “32” if CONFIG_A ]
After line 5:
• Current path: CONFIG_B ∧ ¬CONFIG_A
• Symbol table:
    obj-y = [ “fork.o” if True ]
    BITS = [ “32” if CONFIG_A,
             “64” if ¬CONFIG_A ]
After line 6 (endif):
• Current path: True (all configurations)
• Symbol table:
    obj-y = [ “fork.o” if True ]
    BITS = [ “32” if CONFIG_A,
             “64” if ¬CONFIG_A ]
At line 7:
• Current path: True (all configurations)
• Symbol table:
    obj-y = [ “fork.o” if True ]
    BITS = [ “32” if CONFIG_A,
             “64” if ¬CONFIG_A ]
• Which variable does obj-$(CONFIG_B) assign to?
Runtime Variable Names
    obj-$(CONFIG_B) += probe_$(BITS).o
• Expand to all assignments
    ifeq ($(CONFIG_B),y)
    obj-y += probe_$(BITS).o
    else
    obj- += probe_$(BITS).o
    endif
• Evaluate under resulting new paths
• obj-y’s final value tells us that
• “fork.o” is in all configurations
• “probe_32.o” when CONFIG_B ∧ CONFIG_A
• “probe_64.o” when CONFIG_B ∧ ¬ CONFIG_A
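The whole walkthrough can be replayed in a short sketch. It is not Kmax's algorithm: conditions are explicit sets of configurations instead of BDDs, the Makefile lines are applied by hand, and the helpers `assign` and `append` are hypothetical. It also uses one simplification, labeled in the comments: a variable's entries are read as alternatives for `BITS` and as list elements with a presence condition for `obj-y`.

```python
from itertools import product

# Configurations are tuples (CONFIG_A, CONFIG_B); ALL is the path "True".
ALL = frozenset(product([False, True], repeat=2))
A = frozenset(c for c in ALL if c[0])
B = frozenset(c for c in ALL if c[1])

table = {}  # variable name -> list of (string, condition)

def assign(name, value, path):
    """':=' under `path`: older values survive only outside the path."""
    kept = [(s, c - path) for s, c in table.get(name, []) if c - path]
    table[name] = kept + [(value, path)]

def append(name, value, path):
    """'+=' under `path`: add one more element, present only under `path`."""
    table.setdefault(name, []).append((value, path))

assign("obj-y", "fork.o", ALL)   # line 1
assign("BITS", "32", A)          # lines 2-3, path CONFIG_A
assign("BITS", "64", ALL - A)    # lines 4-5, path ¬CONFIG_A
# Line 7: $(CONFIG_B) is "y" under CONFIG_B and "" otherwise, so the
# statement expands to one assignment per possible variable name.
for suffix, name_path in (("y", B), ("", ALL - B)):
    for bits, bits_path in list(table["BITS"]):
        append("obj-" + suffix, f"probe_{bits}.o", name_path & bits_path)

for filename, cond in table["obj-y"]:
    print(filename, sorted(cond))
```

Reading off `table["obj-y"]` gives the three entries listed above, each with its condition, after a single pass over the Makefile rather than one run per configuration.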
More Details in the Paper
• Complete analysis algorithm
• Handling runtime variable name construction
• Updating the symbol table with assignments
  • Disjoint and complete configuration coverage
• Undefined variable configurations
• Trimming infeasible configurations
• Converting conditionals to BDDs
• Gathering configuration options from Kconfig
Evaluation
Experimental Setup
• Kmax evaluated on two Kbuild clients
  • Linux v3.19
  • BusyBox v1.25.0
• Experiment #1: correctness
  • Checks for missing C files in Kmax output
• Experiment #2: comparison to previous work
  • Check C files against two previous heuristics
• Experiment #3: running time
  • Compare running time with previous tools
Experiment #1: Correctness
• Compare .c files in source tree with Kmax output
• Not all .c files destined for kernel binary
• Verify Kmax excluded only non-kernel .c files
Experiment #1: Correctness (results)
[Results table not recovered from the slide; its callouts read: “These are not compilation units”, “Dead code!”, “Kmax misses none”]
Experiment #2: Comparison
• Compared to two previous tools’ heuristics
  • KBuildMiner parses Kbuild Makefiles
  • GOLEM runs Makefiles one configuration at a time
• These were not advertised as complete solutions
• Missing: should be included but weren’t
• Misidentified: shouldn’t be included, e.g., dead code
| Tool | x86 C Files | Missed | Misidentified |
|---|---|---|---|
| Kmax | 14,783 | - | - |
| KBuildMiner | 14,904 | 319 | 440 |
| GOLEM | 14,460 | 713 | 390 |
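The three columns are related by simple arithmetic if one assumes that “missed” and “misidentified” are both measured against Kmax's file set; the slides do not state this outright, but the numbers are consistent with it. A quick check, with all figures copied from the table:

```python
kmax_files = 14_783  # Kmax's x86 C file count, taken as the reference set

reported = {
    # tool: (C files found, missed, misidentified)
    "KBuildMiner": (14_904, 319, 440),
    "GOLEM": (14_460, 713, 390),
}

# Under the assumption above: found = reference - missed + misidentified.
derived = {
    tool: kmax_files - missed + misidentified
    for tool, (_, missed, misidentified) in reported.items()
}

for tool, (found, _, _) in reported.items():
    print(tool, found, derived[tool], found == derived[tool])
```

Both rows match, so KBuildMiner's larger total reflects extra misidentified files rather than better coverage.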
Experiment #3: Running Time
• x86 version of kernel source
• 5 running time collections per tool
• KBuildMiner’s parsing approach is the fastest
• GOLEM far slower than both, taking hours
• Kmax is more precise with little additional overhead
| Tool | Mean Running Time |
|---|---|
| Kmax | 84.15 sec |
| KBuildMiner | 45.00 sec |
| GOLEM | 3.42 hrs |
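To put the running times on one scale, the short calculation below converts the reported means to seconds and takes ratios. Only the three figures come from the table; the variable names are illustrative.

```python
SECONDS_PER_HOUR = 3600

mean_seconds = {
    "Kmax": 84.15,
    "KBuildMiner": 45.00,
    "GOLEM": 3.42 * SECONDS_PER_HOUR,
}

# How much longer Kmax takes than the parsing-only tool.
kmax_vs_kbuildminer = mean_seconds["Kmax"] / mean_seconds["KBuildMiner"]
# How much longer the dynamic approach takes than Kmax.
golem_vs_kmax = mean_seconds["GOLEM"] / mean_seconds["Kmax"]

print(f"Kmax / KBuildMiner: {kmax_vs_kbuildminer:.2f}x")
print(f"GOLEM / Kmax: {golem_vs_kmax:.0f}x")
```

Kmax takes a little under twice as long as KBuildMiner, a difference of about 39 seconds, while GOLEM takes roughly 146 times as long as Kmax.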
Future Work
• Integration into variability-aware analyses, e.g., bugfinders
• Variability-aware dependence graphs
• Application to other Makefiles
Conclusion
• Kmax algorithm
  • Path-sensitive static analysis
  • Enumerates concrete strings
  • Symbolic configuration expressions
• Evaluation on Linux and BusyBox
  • Finds all C files and their configurations
  • More precise than heuristics with little overhead
  • Finds dead code
Thank You! Questions?
Kmax Repository: https://github.com/paulgazz/kmax
https://paulgazzillo.com @paul_gazzillo