This paper is included in the Proceedings of the 2020 USENIX Annual Technical Conference, July 15–17, 2020. ISBN 978-1-939133-14-4. Open access to the Proceedings of the 2020 USENIX Annual Technical Conference is sponsored by USENIX.

FuZZan: Efficient Sanitizer Metadata Design for Fuzzing
Yuseok Jeon, Purdue University; WookHyun Han, KAIST; Nathan Burow, Purdue University; Mathias Payer, EPFL
https://www.usenix.org/conference/atc20/presentation/jeon



FuZZan: Efficient Sanitizer Metadata Design for Fuzzing

Yuseok Jeon, Purdue University
WookHyun Han, KAIST
Nathan Burow, Purdue University
Mathias Payer, EPFL

Abstract

Fuzzing is one of the most popular and effective techniques for finding software bugs. To detect triggered bugs, fuzzers leverage a variety of sanitizers in practice. Unfortunately, sanitizers target long-running experiments (e.g., developer test suites), not fuzzing, where execution time is highly variable, ranging from extremely short to long. Design decisions made for developer test suites introduce high overhead on short-lived fuzzing executions, decreasing the fuzzer's throughput and thereby reducing effectiveness.

The root cause of this sanitization overhead is the heavy-weight metadata structure that is optimized for frequent metadata operations over long executions. To address this, we design new metadata structures for sanitizers, and propose FuZZan to automatically select the optimal metadata structure without any user configuration. Our new metadata structures have the same bug detection capabilities as the ones they replace. We implement and apply these ideas to Address Sanitizer (ASan), which is the most popular sanitizer.

Our evaluation shows that on the Google fuzzer test suite, FuZZan improves fuzzing throughput over ASan by 48% starting with Google's provided seeds (52% when starting with empty seeds on the same applications). Due to this improved throughput, FuZZan discovers 13% more unique paths given the same 24 hours and finds bugs 42% faster. Furthermore, FuZZan catches all bugs ASan does; i.e., we have not traded precision for performance. Our findings show that sanitizer performance overhead is avoidable when metadata structures are designed for fuzzing, and that the performance difference makes a meaningful difference in squashing software bugs.

    1 Introduction

Fuzzing [33] is a powerful and widely used software security testing technique that uses randomly generated inputs to find bugs. Fuzzing has seen near-ubiquitous adoption in industry, and has discovered countless bugs. For example, the state-of-the-art fuzzer American Fuzzy Lop (AFL) has discovered hundreds of bugs in widely-used software [57], while Google has found 16,000 bugs in Chrome and 11,000 bugs in over 160 other open source projects using fuzzing [10]. On its own, fuzzing only discovers a subset of all triggered bugs, e.g., failed assertions or memory errors causing segmentation faults. Bugs that silently corrupt the program's memory state, without causing a crash, are missed. To detect such bugs, fuzzers must be paired with sanitizers that enforce security policies at runtime by turning a silent corruption into a crash. To date, around 34 sanitizers [47] have been prototyped. So far, only the LLVM-based sanitizers ASan, MSan, LeakSan, UBSan, and TSan have seen widespread use. For brevity, we use "sanitizers" to refer to such frequently used sanitizers in the rest of the paper.

Unfortunately, sanitizers are designed for developer-driven software testing rather than fuzzing, and are consequently optimized for minimal per-check cost, not startup/teardown of the metadata structure. They are therefore based around a shadow-memory data structure wherein the address space is partitioned, and metadata is encoded into the "shadow" memory at a constant offset from program memory. Optimizing for long executions makes sense in the context of developer-driven software testing, which generally verifies correct behavior on expected input, leading to relatively long test execution times. Fuzzing has a more diverse set of inputs that cause both short (i.e., invalid inputs) and long-running executions, across billions of executions. For example, the Chrome developers use Address Sanitizer (ASan) for their unit tests and long-running integration tests [39]. However, the underlying design decisions that make ASan a highly performant sanitizer for long-running tests result in high performance overhead (up to 6.59×) for short executions, as observed in a fuzzing environment¹. This high overhead reduces throughput, thereby preventing a fuzzer from finding bugs effectively.

We analyze the source of this overhead across a variety of sanitizers, and attribute the cost to heavy-weight metadata structures employed by these sanitizers. For example, Address Sanitizer maps an additional 20TB of memory for each execution, Memory Sanitizer (MSan) 72TB, and Thread Sanitizer (TSan) 97TB on a 64-bit platform. The high setup/teardown cost of heavy-weight metadata structures is amortized over the long execution of programs due to the low per-check cost. In contrast, a fuzzing campaign typically consists of massive amounts of short-lived executions, effectively transforming what is a large one-time cost into a large runtime cost. For example, Table 1 indicates that memory management is the main source of overhead for ASan under fuzzing on the Google fuzz test suite, accounting for 40.16% of the total execution time we observe. Memory management is the key bottleneck for using sanitizers with fuzzers, and has to date gone unaddressed.

¹ The average time for a single execution across the first 500,000 tests for the full Google fuzzer test suite is 0.61ms.

USENIX Association 2020 USENIX Annual Technical Conference 249

Instead, increasing the efficiency and efficacy of fuzzing has received significant research attention on two fronts: (i) mechanisms that reduce the overhead of fuzzers [27,55,57]; and (ii) mechanisms that reduce the overhead of sanitization on longer-running tests and conflicts between sanitizers [25,37,38,52,54]. These works address fuzzers and sanitizers in isolation, ignoring the core sanitizer design decision to optimize for long-running test cases using a heavy-weight metadata structure, which limits sanitizer performance in combination with fuzzers. Consequently, optimization of sanitizer memory management for short execution times remains an open challenge, motivated by the need to design sanitizers that are optimal under fuzz testing.

We present FuZZan, which uses a two-pronged approach to optimize sanitizers for short execution times, as seen under fuzzing: (i) two new light-weight metadata structures that trade significantly reduced startup/teardown costs² for moderately higher (or equivalent) per-access costs, and (ii) a dynamic metadata structure switching technique, which dynamically selects the optimal metadata structure during a fuzzing campaign based on the current execution profile of the program, i.e., how often the metadata is accessed. Each of our proposed metadata structures is optimized for different execution patterns; i.e., they have different costs for creating an entry when an object is allocated versus looking up information in the metadata table. By observing the metadata access and memory usage patterns at runtime, FuZZan dynamically switches to the best metadata structure without user interaction, and tunes this configuration throughout the fuzzing campaign.

We apply our ideas to ASan, which is the most widely used sanitizer [43,44,47]. ASan focuses on memory safety violations, arguably the most dangerous class of bugs, accounting for 70% of vulnerabilities at Microsoft [34], and has already detected over 10,000 memory safety violations [9,12,50] in various applications (e.g., over 3,000 bugs in Chrome in 3 years [50]) and the Linux kernel (e.g., over 1,000 bugs [12,51]) by using a customized kernel address sanitizer (KASan). We further apply FuZZan to MSan and MOpt-AFL.

FuZZan improves fuzzing throughput over ASan by 52% when starting with empty seeds and 48% when starting with Google's seed corpus, averaged across all applications in the Google fuzzer test suite [11] as part of our input record/replay fuzzing experiment. Due to this improved throughput, FuZZan discovers 13% more unique paths (with an improvement in throughput of 61% compared to ASan) given the standard 24-hour fuzz testing with widely used real-world software and a provided corpus of starting seeds.

² Compared to ASan, our min-shadow memory mode reduces the time that startup/teardown functions spend in the kernel by 62% on the first 500,000 tests across the full Google fuzzer test suite.

Mode     ASan init time   ASan logging time   Memory mgmt. time   # page faults
         ms (%)           ms (%)              ms (%)
Native   0.00 (0.00%)     0.00 (0.00%)        0.05 (11.49%)       2,569
ASan     0.17 (10.58%)    0.30 (18.86%)       0.63 (40.16%)       11,967

Table 1: Comparison between native and ASan executions with a breakdown of time spent in memory management, and time spent for ASan's initialization and logging. Results are aggregated over 500,000 executions of the full Google fuzzer test suite [11]. Times are shown in milliseconds, and % denotes the ratio to total execution time.

Crucially, FuZZan achieves this without any reduction in bug-finding ability. Therefore, FuZZan strictly increases the performance of ASan-enabled fuzzing, resulting in finding the same bugs in less time than using ASan with the same fuzzer.

    Our contributions are:

1. Identifying and analyzing the primary source of overhead when sanitizers are used with fuzzing, and pinpointing the sanitizer design decisions that cause the overhead;

2. Designing and implementing a sanitizer optimization (FuZZan) and applying it to ASan; that is, we design several new metadata structures along with a dynamic metadata structure switching mechanism to choose the optimal structure at runtime. We also validate the generality of our design by further applying it to MSan and MOpt-AFL;

3. Evaluating FuZZan on the Google fuzzer test suite and other widely used real-world software and showing that FuZZan effectively improves fuzzing throughput (and therefore discovers more unique bugs or paths given the same amount of time).

    2 Background and Analysis

We present an overview of fuzzing overhead and ASan (our target sanitizer). Further, we detail the design conflicts between ASan and fuzzing when used in combination.

2.1 Fuzzing overhead

Given the same input generation capabilities, a fuzzer's throughput (executions per second) is critical to its effectiveness in finding bugs. Greater throughput results in more code and data paths being explored, and thus potentially triggers more bugs. Running a fuzzer imposes some overhead on the program, a major component of which is the repeated execution of the target program's initialization routines. These routines (including program loading, execve, and initialization) do not change across test cases, and hence result in repeated and unnecessary startup costs. To reduce this overhead, many fuzzers leverage a fork server. A fork server loads and executes the target program to a fully-initialized state, and then clones this process to execute each test case. This ensures that the execution of each test case begins from an initialized state, and removes the overhead associated with the initial startup.
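As a toy illustration of the fork-server pattern described above, the following Python sketch runs each test case in a freshly cloned process. The function names and exit codes are our own; a real fork server such as AFL's communicates with the instrumented binary over pipes rather than Python callables.

```python
import os

def crashy_target(data):
    # Hypothetical stand-in for an instrumented target program.
    if data.startswith(b"BUG"):
        raise RuntimeError("memory corruption detected")

def fork_server_run(test_case, target):
    """Execute one test case in a forked child, fork-server style:
    the parent stays fully initialized; each test runs in a clone."""
    pid = os.fork()
    if pid == 0:
        # Child: run the test from the already-initialized state.
        try:
            target(test_case)
            os._exit(0)      # clean exit: no bug triggered
        except Exception:
            os._exit(139)    # simulate a crash (SIGSEGV-like status)
    # Parent: reap the child and report its outcome to the fuzzer.
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status) if os.WIFEXITED(status) else -1
```

Because each child inherits the parent's initialized memory image, per-test startup shrinks to the cost of fork plus copy-on-write and teardown work, which is exactly the cost that ASan's large metadata mappings inflate.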

Another technique for reducing process initialization costs is in-process fuzzing, such as AFL's persistent mode and libFuzzer. In-process fuzzing wraps each test in one iteration of a loop in one process, thus avoiding starting a new process for each test. However, in-process fuzzing generally requires manual analysis and code changes [13,58]. Additionally, in-process fuzzing requires the target code to be stateless across executions, as all tests share one process environment; otherwise, the execution of one test may affect subsequent ones, potentially leading to false positives. Consequently, testers should avoid in-process fuzzing for library code using global variables. Bugs found through in-process fuzzing may not be reproducible, as it is not always possible to construct a valid calling context to trigger detected bugs in the target function, and side-effects across multiple function calls may not be captured [32]. Because of these limitations, in-process fuzzing is used on stateless functions in libraries, while the fork server model (i.e., out-of-process fuzzing) remains the most general mode for fuzzing programs.
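The state-leakage problem with in-process fuzzing can be shown in a few lines: a target that touches global state behaves differently depending on how many tests ran before it in the same process. The target, its threshold, and the loop are contrived purely for illustration.

```python
seen = []  # global state shared by every iteration of the in-process loop

def stateful_target(data):
    # Each test pollutes state that later tests observe.
    seen.append(data)
    if len(seen) > 3:
        # A spurious "bug": it depends on prior tests, not on 'data',
        # so replaying this input alone will not reproduce the crash.
        raise RuntimeError("failure caused by leftover state")

def in_process_fuzz(inputs):
    """Persistent-mode style loop: all tests share one process."""
    failures = []
    for data in inputs:
        try:
            stateful_target(data)
        except RuntimeError:
            failures.append(data)
    return failures
```

Here the fourth input "crashes" regardless of its content — a false positive that a fork server, which gives every test a pristine process, would never report.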

2.2 Address Sanitizer

All sanitizers leverage a customized metadata structure [47]. Out of many different metadata schemes, shadow memory (both direct-mapped and multi-level shadow) is the most widely used [4,14–16,29,30,42,45,48,49,56]. ASan enforces memory safety by encoding the accessibility of each byte in shadow memory. Allocated (and therefore accessible) areas are marked and padded with inaccessible redzones. In particular, direct-mapped shadow memory encodes the validity of the entire virtual memory space, with every 8 bytes of memory mapping to 1 byte in shadow memory. Shadow memory encodes the state of application memory: the 8-bit shadow value k encodes that the first k bytes of the corresponding 8-byte word are accessible (k = 0 means all 8 bytes are accessible; negative values mark inaccessible regions such as redzones). The corresponding shadow memory address for a byte of memory is at:

    addr_shadow = (addr >> 3) + offset

where addr is the accessed address. Generally, ASan only inserts redzones on the high-address side of each object, as the preceding object's redzone suffices for the low-address side. ASan also instruments each runtime memory access to check if the accessed memory is in a redzone, and if so, faults. ASan's effectiveness in detecting hard-to-catch memory bugs has led to its widespread adoption. It has become best practice [47] to use ASan (or KASan [20], the kernel equivalent) with a fuzzer to improve the bug detection capability.
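The mapping and encoding above can be mimicked in a few lines of Python. The dict-backed shadow is purely illustrative (real ASan reads raw shadow memory), and the offset constant is the one ASan uses on Linux x86-64; other platforms use different offsets.

```python
SHADOW_OFFSET = 0x7FFF8000  # ASan's shadow offset on Linux x86-64

def shadow_addr(addr, offset=SHADOW_OFFSET):
    # Every 8-byte granule of application memory maps to one shadow byte.
    return (addr >> 3) + offset

def accessible(shadow, addr):
    """'shadow' maps a granule index (addr >> 3) to its shadow value k."""
    k = shadow.get(addr >> 3, 0)
    if k < 0:
        return False                 # negative values mark redzones
    return k == 0 or (addr & 7) < k  # k > 0: only the first k bytes are valid

shadow = {2: 0,    # granule [16, 24): all 8 bytes accessible
          3: 3,    # granule [24, 32): only the first 3 bytes accessible
          4: -1}   # granule [32, 40): redzone
```

An instrumented load of address a then amounts to reading the byte at shadow_addr(a) and applying the accessible check before the real memory access.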

    2.3 Overhead Analysis of Fuzzing with ASan

To understand ASan's overhead with fuzzing, we analyze the Linux kernel functions used during fuzzing campaigns. Table 1 shows the overhead added by ASan, broken out across ASan's logging, ASan's initialization, and memory management. Our experiments measure the ratio of the time spent in the kernel functions compared to the total execution time for a number of target programs.

Note that memory management makes up 40.16% of ASan's total execution time, as opposed to 11.49% for the base case, and that memory management is more than double the overhead of ASan's logging and initialization combined. ASan's heavy use of the virtual address space results in 4.66× the page faults compared to native execution. Our memory management overhead numbers reflect the time spent by the kernel in the four core page table management functions: (i) unmap_vmas (24.6%), (ii) free_pgtable (4.7%), (iii) do_wp_page (8.2%), and (iv) sys_mmap (2.6%).

Notably, unmap_vmas and free_pgtable correspond to 73% of ASan's measured memory management overhead across the four core page table management functions. The execution time for these two functions (unmap_vmas and free_pgtable) is 10× higher than when executing without ASan. To break this overhead down: when executing a test under the fork server mode, a fuzzer needs to create a new process for each test. During initialization, ASan reserves memory space (20TB total, including 16TB of shadow memory and a separate 4TB for the heap on 64-bit platforms) and then poisons the shadow memory for globals and the heap. Accessing these pages incurs additional page faults, and thus page table management overhead in the kernel. Note that the large heap area causes sparse page table entries (PTEs), which increases the number of pages used for the page table and the memory management overhead.

Existing techniques to deal efficiently with large allocations do not help here. Lazy page allocation of the large virtual memory area used by ASan does not mitigate memory management overhead in this case, as many of the pages are accessed when shadow memory is poisoned. Poisoning forces a copy even for copy-on-write pages, and thus increases page table management cost. During execution, memory allocations and accesses cause additional shadow memory pages to be used, again incurring page faults and page table management. When the process exits, the kernel clears all page table entries through unmap_vmas and releases memory for the page table (via free_pgtables). The cost of these two functions is correlated with the number of physical pages used by the process. As fuzzing leads to repeated, short executions, such bookkeeping introduces considerable memory management overhead. In contrast to these active memory management functions, sys_mmap only accounts for 7% of ASan's memory management overhead. This is the expense of reserving all virtual memory areas. However, large areas that are actively accessed by ASan incur considerable additional expense, as detailed above.

For completeness, we note that our analysis finds that ASan performs excessive "always-on" logging (18.86%) by default, and that ASan's initial poisoning of global variables (10.58%) is inefficient. Combined, these additional sources of overhead account for 29.44% overhead. We address these engineering shortcomings in our evaluation, but they are neither our core contributions nor the choke point in fuzzing with ASan.

    3 FuZZan design

FuZZan has two design goals: (1) define new light-weight metadata structures, and (2) automatically switch between metadata structures depending on the runtime execution profile. In this section, we present how we design each component of FuZZan to achieve both goals, as illustrated in Figure 1.

    3.1 FuZZan Metadata Structures

To minimize startup/teardown costs while maintaining reasonable access costs, FuZZan introduces two new metadata structures: (i) a red-black tree (RB-tree) metadata structure, which has low startup and teardown costs but high per-access costs; and (ii) min-shadow memory, which has medium startup/teardown costs and low per-access costs (on par with ASan). Table 2 shows a qualitative comparison of the different metadata schemes that we propose in this section; see Table 4 for quantitative results. The RB-tree is optimal for short executions with few metadata accesses as it emphasizes low startup and teardown costs, while min-shadow memory is best suited for executions with a mid-to-high number of metadata accesses as it has lower per-metadata-access costs while still avoiding the full startup/teardown overhead imposed by ASan's shadow memory.

[Figure 1: Overview of FuZZan's architecture and workflow. The fuzzer's fuzzing module (1) measures the target program's behavior (§3.2.1), (2) calculates the best metadata structure (§3.2.2), and (3) switches the target to the selected metadata structure (§3.2): FuZZan's RB-tree, FuZZan's min-shadow memory, or ASan's shadow memory, driven by dynamic feedback from FuZZan's sampling mode.]

Metadata structure           Startup/Teardown cost   Access cost
ASan shadow memory           High                    Low
FuZZan customized RB-tree    Low                     High
FuZZan min-shadow memory    Medium                  Low

Table 2: Comparison of metadata structures.

    3.1.1 Customized RB-Tree

To optimize ASan's metadata structure for test cases where a fuzz testing application only executes for a very short time with few metadata accesses, we introduce a customized RB-tree, shown in Figure 2. Nodes in the RB-tree store the redzone for each object. Although each metadata access operation (insert, delete, and search) in the RB-tree is slower than its counterpart in the shadow memory metadata structure, our RB-tree has the following benefits: (i) low total memory overhead (leading to low startup/teardown overhead); (ii) removal of poisoning/unpoisoning page faults (as each RB-tree node compactly stores the redzone addresses and these nodes are grouped together in memory); and (iii) a faster range search than shadow memory for operations such as memcpy. For example, in order to check a memcpy, ASan must validate each byte individually using shadow memory. However, in our approach, we can verify such operations through only two range queries, for memcpy's source and destination memory address ranges.

In our RB-tree design, when an object is allocated (e.g., through malloc), the range of the object's high-address redzone is stored in a node of the RB-tree. During a query, if the address range of the target is lower than the start address of the node, we search the left subtree (and vice versa). If the address is not found in the tree, it is a safe memory access. During redzone removal, the requested address range may only be a subset of an existing node's range (and not the full range of a target node in the RB-tree). In this case, the RB-tree deletes the existing RB-tree node, creates new RB-tree nodes which have non-overlapping address ranges (e.g., the left and right side of the overlapped area), and inserts these nodes into the RB-tree. Since we reuse ASan's memory allocator and memory layout (e.g., redzones between objects and a quarantine zone for freed objects), FuZZan provides the same detection capability as ASan.

[Figure 2: Design of FuZZan's customized RB-tree. The tree stores redzone address ranges and serves insert/delete and (range) search operations; an optional hash map over address ranges and an optional cache of recently validated addresses can be consulted before the tree itself is searched.]
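A minimal stand-in for this redzone index, using Python's bisect over sorted arrays in place of an actual red-black tree (the class and method names are ours), shows both the range query used to vet a memcpy and the node-splitting removal described above:

```python
import bisect

class RedzoneIndex:
    """Toy ordered map from redzone start to end with range queries
    and partial removal; a balanced tree would give the same API."""

    def __init__(self):
        self.starts, self.ends = [], []

    def insert(self, lo, hi):
        i = bisect.bisect_left(self.starts, lo)
        self.starts.insert(i, lo)
        self.ends.insert(i, hi)

    def overlaps(self, lo, hi):
        # Range query: does [lo, hi) touch any redzone? Two of these
        # suffice to vet a memcpy (source and destination ranges).
        i = bisect.bisect_right(self.starts, lo) - 1
        if i >= 0 and self.ends[i] > lo:
            return True
        i += 1
        return i < len(self.starts) and self.starts[i] < hi

    def remove(self, lo, hi):
        # Partial removal splits a node into non-overlapping remainders.
        # Simplification: assumes [lo, hi) lies within a single node, as
        # when unpoisoning one object's redzone.
        i = bisect.bisect_right(self.starts, lo) - 1
        if i < 0:
            return
        s, e = self.starts[i], self.ends[i]
        del self.starts[i]
        del self.ends[i]
        if s < lo:
            self.insert(s, lo)   # left remainder
        if hi < e:
            self.insert(hi, e)   # right remainder

rz = RedzoneIndex()
rz.insert(0x100, 0x110)  # redzone behind a hypothetical 16-byte object
```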

    3.1.2 Min-shadow memory

The idea behind min-shadow memory (for executions with a mid-to-high number of metadata accesses) is to limit the accessible virtual address space, effectively shrinking the size of the required shadow memory. As the size of shadow memory is a key driver of overhead in the fuzzing environment, this enhances performance.

Figure 3 illustrates how min-shadow memory converts a 64-bit program running in a 48-bit address space to run in a 32-bit address-space window (1GB for the stack, 1GB for the heap, and 2GB for the BSS, data, and text sections combined). Note that pointers remain 64 bits wide and the code remains unchanged: the mapped address space is simply restricted, allowing min-shadow memory to have a partial shadow memory map. To shrink a program's memory space, we move the heap (by modifying ASan's heap allocator) and remap the stack to a new address space. Min-shadow memory remaps parts of the address space, but programs remain 64-bit programs. To accommodate larger heap sizes, we create additional min-shadow memory binaries with heap sizes of 4GB, 8GB, and 16GB.

[Figure 3: ASan and min-shadow memory modes' memory mapping on 64-bit platforms. ASan (top) reserves 20TB of memory space for the heap (4TB) and shadow memory (16TB); min-shadow memory mode (bottom) instead reserves a 4GB application window (1GB stack, 1GB heap, 2GB BSS/data/text) plus 512MB of shadow memory. Each application's stack, heap, and other sections (BSS, data, and text) map to the corresponding shadow regions. Further, the shadow memory region is mapped inaccessible.]

Our approach allows testing 64-bit code with 64-bit pointers without having to map shadow tables for the entire address space. We disagree with the recommendation of the ASan developers to compile programs as 32-bit executables, as changing the target architecture, pointer length, and data type sizes will hide bugs. Furthermore, min-shadow memory provides greater flexibility compared to using the x32 ABI [53] mode (i.e., running the processor in 64-bit mode but using 32-bit pointers and arithmetic, limiting the program to a virtual address space of 4GB), as min-shadow memory can provide various heap size options.
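The size reduction follows directly from the 8-to-1 direct mapping. A quick arithmetic check of the figures quoted in the text — 16TB of shadow for ASan's full mapping of the 128TB (47-bit) user address space, versus 512MB for a 4GB window:

```python
GB, TB = 2**30, 2**40

def shadow_bytes(app_bytes, granule=8):
    # Direct-mapped shadow: one shadow byte per 8-byte application granule.
    return app_bytes // granule

# ASan shadows the whole 47-bit user space; min-shadow only a 4GB window.
full_shadow = shadow_bytes(128 * TB)
mini_shadow = shadow_bytes(4 * GB)
```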

    3.2 Dynamic metadata structure switching

Dynamic metadata structure switching automatically selects the optimal metadata scheme based on observed behavior. At the beginning of a fuzzing campaign, dynamic metadata structure switching assesses the initial behavior and then periodically samples behavior, adjusting the metadata structure if necessary. Our intuition for dynamic metadata structure switching is that, during fuzzing, metadata access patterns and memory usage remain similar across runs and change in phases. While the fuzzer is mutating a specific input, the executions of the newly created inputs are similar regarding their control flow and memory access patterns compared to the source input. However, new coverage may lead to different execution behaviors. We therefore design a dynamic metadata structure switching technique that periodically and conditionally samples the execution and adjusts the underlying metadata structure according to the observed execution behavior.

Dynamic metadata structure switching compiles the program in four different ways in preparation for fuzzing: ASan, RB-tree, min-shadow memory, and sampling mode. The sampling mode repeatedly samples the runtime parameters and then selects the optimal metadata structure. The selection of the optimal metadata structure is governed by FuZZan's metadata structure switching policy.

    3.2.1 Sampling mode

The sampling mode measures the behavior of the target program using the min-shadow memory-1GB metadata mode and, based on the behavior, reports the currently optimal metadata structure. The sampling mode profiles the following parameters: (i) the number of metadata accesses during insert, delete, and search; and (ii) memory consumption. Note that this information can be collected by simple counters: profiling is therefore light-weight.

Dynamic metadata structure switching starts in sampling mode and selects the optimal mode based on the observed behavior. Dynamic metadata structure switching then periodically (e.g., every 1,000 executions) and conditionally (e.g., when the fuzzer starts mutating a new test case) samples executions to select the optimal metadata structure based on the current behavior. To reduce the cost of periodic sampling, dynamic metadata structure switching implements a continuous back-off strategy that gradually increases the sampling interval as long as the metadata structure does not change (similar to TCP's slow start [17]). Note that bugs may be triggered during sampling mode. As such, we maintain ASan's error detection capabilities while sampling to ensure that we do not miss any bugs.
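The back-off schedule can be sketched as follows. The base interval of 1,000 executions comes from the text; the doubling growth factor and the cap are our own illustrative choices, not FuZZan's documented constants.

```python
BASE_INTERVAL = 1000   # from the text: sample roughly every 1,000 executions
MAX_INTERVAL = 64000   # illustrative cap on the back-off

def next_interval(interval, mode_changed):
    # Grow the sampling interval while the selected metadata structure
    # is stable (cf. TCP slow start); reset it when the selection changes.
    if mode_changed:
        return BASE_INTERVAL
    return min(interval * 2, MAX_INTERVAL)
```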

    3.2.2 Metadata structure switching policies

Our metadata structure switching policy is based on a mapping of metadata access frequency to the corresponding metadata structure. This heuristic is relatively simple in order to achieve a low sampling overhead. To determine the best cutoff points, we compile all 26 applications in Google's fuzzer test suite in two different ways: RB-tree and min-shadow memory. We then test these different configurations against 50,000 recorded inputs and determine the best metadata structure depending on the observed parameters, measuring execution time. Profiling reveals that the frequency of metadata access (insert, delete, and search) is the primary factor that influences metadata structure overhead, which confirms our original assumption. In this policy, depending on the metadata access frequency, we select different metadata structures (based on statistics from profiling): RB-tree if there are fewer than 1,000 accesses, and min-shadow memory if there are more than 1,000 accesses. Additionally, if the selected heap size goes beyond a threshold, we sequentially switch to other modes (min-shadow memory-4G, 8G, 16G, and ASan), thus increasing heap memory for continuous fuzzing.
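The policy above reduces to a small decision function. The 1,000-access cutoff and the heap-size escalation order come from the text; the exact per-mode byte thresholds are our assumptions for illustration.

```python
GB = 2**30

def select_mode(metadata_accesses, heap_bytes):
    # Few metadata accesses: the RB-tree's cheap startup/teardown wins.
    if metadata_accesses < 1000:
        return "rb-tree"
    # Otherwise pick the smallest min-shadow variant whose heap fits,
    # falling back to stock ASan for very large heaps.
    for limit in (1, 4, 8, 16):
        if heap_bytes <= limit * GB:
            return f"min-shadow-{limit}G"
    return "asan"
```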

    4 Implementation

We implement FuZZan's two metadata structures and dynamic metadata structure switching mode on top of ASan in LLVM [28] (version 7.0.0). We support and interact with AFL [57] (version 2.52b). To address the other sources of overhead in ASan (shown in Table 1), we also implement two additional optimizations: (i) removal of unnecessary initialization; and (ii) removal of unnecessary logging. Our implementation consists of 3.5k LOC in total (mostly in LLVM, with minor extensions to AFL).

RB-tree. The RB-tree requires modifications to ASan's memory access instrumentation, as our RB-tree is not based on a shadow memory metadata structure. Thus, we modify all memory access checks, including interceptors, to use the appropriate RB-tree operations instead of the equivalent shadow memory operations. As an optimization, and for compatibility with min-shadow memory mode, the RB-tree mode also reserves 1GB for the heap memory allocator. A compact heap reduces memory management overhead. The RB-tree mode is used when fuzz tests only execute for a very short time with few metadata accesses (i.e., they allocate a relatively small amount of memory).

Min-shadow memory. Unlike the RB-tree, we are able to repurpose ASan's existing memory access checks, as the min-shadow memory metadata structure is based on a shadow memory scheme. To shrink a 64-bit program's address space, we modify ASan's internal heap setup and remap the stack using Kroes et al.'s linker/loader tricks [22]. More specifically, based on this script, we hook __libc_start_main using LD_PRELOAD, remap the stack to a new address, update rbp and rsp, and then call the original __libc_start_main. This allows us to reduce ASan's shadow map requirements from 16TB of mapped (but not necessarily allocated) virtual memory to 512MB (1 bit of shadow for each byte in our 4GB address-space window). We also create an additional 192MB of shadow memory for ASan's secondary allocator and dynamic libraries (which are remapped above the stack). Finally, we implement four different min-shadow memory modes with increasing heap sizes (1GB, 4GB, 8GB, and 16GB) to handle the different memory requirements of a variety of programs.

Heap size triggers. As previously stated, min-shadow memory is configured for different heap sizes. We therefore use out-of-memory (OOM) errors to trigger callbacks that notify FuZZan to increase the heap size.

AFL modifications. The target program is compiled once per FuZZan mode. By default, AFL uses a random number generator (RNG) to assign an ID to each basic block within the target program. Unfortunately, this would result in the same input producing different coverage maps across the set of compiled targets, breaking AFL's code coverage analysis. We therefore modify AFL to use the same RNG seed across the set of compiled targets. This ensures that the same input produces the same coverage map across all compiled variants.

Removing unnecessary initialization. ASan makes a number of global constructor calls on program startup, performing several do_wp_page calls for copy-on-write. These constructor calls are unnecessarily repeated each time AFL executes a new test input, leading to redundant operations. Unfortunately, the AFL fork server is unaware of ASan's initialization routines. Therefore, to remove unnecessary (re-)initialization across fuzzing runs, we modify ASan's LLVM pass so that global variable initialization occurs before AFL's fork server starts. This is achieved by adjusting the priority of the global constructors which contain ASan's initialization function.

Removing unnecessary logging. ASan provides logging functionality for error reporting (e.g., saving allocation sizes and thread IDs during object allocation). Unfortunately, this logging functionality introduces additional page faults and performance overhead. However, this logging is unnecessary because fuzzing inherently enables replay by storing test inputs that trigger new behavior. Complete logging information can be recovered by replaying a given input with a

    254 2020 USENIX Annual Technical Conference USENIX Association

fully-instrumented program. We therefore identify and disable ASan's logging functionality (e.g., StackDepot) for fuzzing runs, allowing it to be re-enabled for reportable runs.

    5 Evaluation

We provide a security and performance evaluation of FuZZan. First, we verify that FuZZan and ASan have the same error-detection capabilities. Second, we evaluate the efficiency of FuZZan's new metadata structures and dynamic metadata structure switching mode using deterministic input from a record/replay infrastructure to ensure fair comparisons. Next, to consider the random nature of fuzzing and to show FuZZan's real-world impact, we evaluate FuZZan's efficiency without deterministic input. Here we evaluate the number of code paths found by FuZZan in a 24 hour period, demonstrating the impact of FuZZan's increased performance. We also measure FuZZan's bug finding speed by using known bugs in Google's fuzzer test suite to verify that FuZZan maximizes fuzzing execution speed while providing the exact same bug detection capabilities as ASan. Finally, we port FuZZan to another sanitizer (MSan) [48] and another AFL-based fuzzer (MOpt-AFL) [31] to verify its flexibility.

Evaluation setup. All of our experiments are performed on a desktop running Ubuntu 18.04.3 LTS with a 32-core AMD Ryzen Threadripper 2990WX, 64GB of RAM, a 1TB SSD, and Simultaneous MultiThreading (SMT) disabled (to guarantee a single fuzzing instance is assigned to each physical core). Across all experiments, we apply FuZZan to AFL's fork server mode, which is a widely-used and highly optimized out-of-process fuzzing mode. We evaluate FuZZan on all applications in the Google fuzzer test suite [11] and other widely used real-world software.

Evaluation strategy. Evaluating fuzzing effectiveness is challenging. In a recent study of how to evaluate fuzzing, Klees et al. [21] find that the inherent randomness of the fuzzer's input generation can lead to seemingly large but spurious differences in fuzzing effectiveness. However, we are at an advantage as we do not need to compare different fuzzers, nor do we change the input generation.
We therefore record the fuzzer-generated inputs during a regular run of AFL, and then replay these recorded inputs to compare our different ASan optimizations to the same baseline, effectively controlling for randomness in input generation by using the same input for all experiments. For our experiments we record the first 500,000 executions for replay, yielding a large enough test corpus for reasonable performance comparisons. We also undertake a real-world fuzzing campaign (i.e., without inhibiting fuzzing randomness by record/replay) to measure FuZZan's real-world impact on code path exploration. Finally, Klees et al. demonstrate the importance of the initial seed(s) when evaluating fuzz testing, as performance can vary substantially depending on which seed is used. We therefore compare two

CWE (ID)                             Good tests (Pass/Total)   Bad tests (Pass/Total)
Stack-based Buffer Overflow (121)    2,432 / 2,432             2,314 / 2,432
Heap-based Buffer Overflow (122)     1,594 / 1,594             1,328 / 1,594
Buffer Under-write (124)             682 / 682                 641 / 682
Buffer Over-read (126)               524 / 524                 359 / 524
Buffer Under-read (127)              682 / 682                 641 / 682
Total                                5,914 / 5,914             5,283 / 5,914

Table 3: The three different metadata structure modes' detection capability based on the Juliet Test Suite for memory corruption CWEs. FuZZan and ASan have identical results. Good tests have no memory corruption, checking for false positives. Bad tests are intentionally buggy, checking for false negatives.

scenarios: (i) starting with the empty seed; and (ii) starting with a set of valid seeds (we use Google's provided seeds for the input record/replay experiment and randomly selected seeds of the right file type for our real-world fuzz testing).

    5.1 Detection capability

We verify that FuZZan and ASan detect the same set of bugs in three different ways. First, we use the NIST Juliet test suite [35], which is a collection of test cases containing common vulnerabilities based on the Common Weakness Enumeration (CWE). We use the full Juliet test suite for memory corruption CWEs to verify FuZZan's capability to detect the same classes of bugs as ASan, without introducing false positives or negatives. Second, to verify that FuZZan and ASan also have the same detection capability under fuzz testing, we use the Google fuzzer test suite and our recorded input corpus. Finally, we leverage the complete set of ASan's public unit tests as a further sanity check.

For the Juliet test suite (Table 3), we select CWEs related to memory corruption bugs and obtain the same detection results from the three different modes (ASan's shadow memory, RB-tree, and min-shadow memory). To validate FuZZan against ASan on the Google fuzzer test suite, we compare AFL crash reports across the full set of target programs in the Google fuzzer test suite with our recorded inputs (to identify both false positives and false negatives). Note that we force ASan to crash (the default setting under fuzz testing) when a memory error happens, as fuzzers depend on program crashes to detect bugs. As expected, FuZZan's different modes all obtain the same crash results as ASan. However, we encounter minor differences between FuZZan and ASan when sanity-checking on the ASan unit tests. These differences are due to internal changes we made when developing FuZZan, such as min-shadow memory's changed memory layout (failed test cases include features such as fixed memory addresses).


Modes     Empty seed                                  Provided seed
          time (s)  vs. Native (%)  vs. ASan (%)      time (s)  vs. Native (%)  vs. ASan (%)
Native    199       -               -                 274       -               -
ASan      809       306             -                 1,105     303             -
RB-tree   1,541     673             90                3,308     1,106           199
Min-1G    443       122             -45               632       131             -43
Min-4G    465       133             -43               666       143             -40
Min-8G    467       134             -42               685       150             -38
Min-16G   477       139             -41               710       159             -36

Table 4: Comparison between the four min-shadow memory modes, RB-tree, Native, and ASan execution overhead during input record and replay fuzz testing with empty and provided seed sets. The time (s) column indicates the average of all 26 applications' execution times during testing. A positive percentage (e.g., 20%) denotes overhead while a negative percentage indicates a speedup.

5.2 Efficiency of new metadata structures

We perform input record/replay fuzz testing to evaluate the effectiveness of FuZZan's new metadata structures. Doing so isolates the effects of our metadata structures by removing most of the randomness/variation from a typical fuzzing run.

Over the full Google fuzzer test suite, the RB-tree, without any other optimization, shows shorter execution times than ASan if the target application has fewer than 1,000 metadata accesses; conversely, the RB-tree is slower than ASan when the target application has more than 1,000 metadata accesses. As shown in Table 4, several applications in the Google fuzzer test suite have more than 1,000 metadata accesses, and so the RB-tree is slower than ASan on average.

Despite being slower on average, the RB-tree can be faster on individual applications and inputs. For instance, FuZZan in RB-tree mode demonstrates a 19% performance improvement (up to 45% faster) for 15 applications (the remaining 11 applications show higher overhead compared to ASan) when benchmarked using the inputs generated from an empty seed. On the subset of applications for which seeds are provided, RB-tree shows less performance improvement (17%, and up to 39% faster) for 14 applications (the remaining 12 applications show higher overhead than ASan) when benchmarked using inputs generated from those seeds, as provided seeds help to create valid input, lengthening execution times and thus increasing metadata accesses. Note that RB-tree shows the best fuzzing performance when the target application (e.g., c-ares) has fewer than 1,000 metadata accesses. Additionally, even for applications where RB-tree is slower across all inputs, it is still faster on inputs with few metadata accesses. The variable performance of RB-tree, which is highly dependent on the number of metadata accesses, highlights the need for dynamic metadata structure switching to automatically select the optimal metadata structure.

Min-shadow memory mode, without additional optimization, outperforms ASan on all 26 programs (for both empty

Modes            Empty seed                                  Provided seed
                 time (s)  vs. Native (%)  vs. ASan (%)      time (s)  vs. Native (%)  vs. ASan (%)
Logging-Opt.     613       208             -24               891       225             -19
Init-Opt.        686       244             -15               987       260             -11
Logging+Init     552       177             -32               826       201             -25
Min-Shadow       443       122             -45               632       131             -43
Min-Shadow-Opt.  385       93              -52               574       109             -48
Dynamic          387       94              -52               578       111             -48

Table 5: Comparison between FuZZan's three different optimization modes, native min-shadow memory (1G) mode, min-shadow memory (1G) mode with FuZZan's two optimizations, and dynamic metadata structure switching (Dynamic) mode execution overhead during input record and replay fuzz testing of all 26 applications.

    ModesASan’s

    init timems (%)

    ASan’slogging time

    ms (%)

    Memorymanage time

    ms (%)

    Page fault#

    Native 0.00 (0.00%) 0.00 (0.00%) 0.05 (11.49%) 2,569ASan 0.17 (10.58%) 0.30 (18.86%) 0.63 (40.16%) 11,967Min 0.10 (9.51%) 0.01 (1.33%) 0.24 (24.77%) 7,386

    Min-Opt. 0.00 (0.00%) 0.00 (0.00%) 0.24 (24.71%) 6,139

Table 6: Comparison between native, ASan, min-shadow memory (1G), and min-shadow memory with our two optimizations, with a breakdown of time spent in memory management and time spent in ASan's initialization and logging. Results are aggregated over 500,000 executions of the full Google fuzzer test suite. Times are shown in milliseconds, and % denotes each component's share of a single execution's time.

and provided seeds), as shown in Table 4. More specifically, the average improvement is 45% when starting with an empty seed and 43% when starting with the provided seeds. While different min-shadow memory heap configurations show gradual increases in memory overhead (from 1GB to 16GB, in line with the heap size), all of them outperform ASan (at worst, min-shadow memory is still 36% faster than ASan with a provided seed).

Additionally, both metadata configurations can utilize our two engineering optimizations, i.e., removing logging and modifying ASan's initialization (as described in § 4). Table 5 shows that the average improvement of removing unnecessary logging is 24% when starting with an empty seed and 19% when starting with the provided seeds. Similarly, modifying the initialization sequence improves performance by 15% when starting with an empty seed and by 11% when starting with the provided seeds. Combining the two engineering optimizations with min-shadow memory demonstrates synergistic effects: the combined performance is 52% faster for empty seeds (7% better than plain min-shadow memory) and 48% faster for provided seeds (5% better than plain min-shadow memory).

    Overall, FuZZan’s metadata structures show better perfor-

    256 2020 USENIX Annual Technical Conference USENIX Association

  • mance than ASan’s shadow memory for all 26 Google fuzzertest suite applications. As shown in Table 6, the main reasonsfor FuZZan’s improvement are: (i) the smaller memory spacereduces memory management overhead as page table manage-ment is more lightweight and incurs fewer page faults, (ii) ourtwo engineering optimizations further reduce overhead andnumber of page faults by removing unnecessary operations,and (iii) the min-shadow memory mode has the same O(1)time complexity for accessing target shadow memory asaccessing the original ASan metadata. However, we alsoobserve that the RB-tree is faster than min-shadow memoryfor some configurations and programs (e.g., c-ares-CVE).This motivates the need for dynamic metadata structureswitching, which observes program behavior and dynamicallyselects the best metadata structure based on this behavior.

    5.3 Efficiency of dynamic metadata structure

As described in § 3.2, the dynamic metadata structure switching mode leverages runtime feedback to select the optimal metadata structure, dynamically tuning fuzzing performance. The intuition behind the dynamic metadata structure switching mode is that (i) no single metadata structure is best across all applications; (ii) the best metadata structure is not known a priori, so the analyst cannot pre-select the optimal metadata structure; and (iii) fuzzing goes through phases, e.g., alternating between longer running tests (e.g., exploring new coverage) and shorter running tests (e.g., invalid input mutations searching for new code paths). A consequence of these phases is that the same metadata structure is not optimal for every input to a given application. To verify the effectiveness of dynamic metadata structure switching, which is implemented based on these intuitions, we apply dynamic metadata structure switching mode to fuzz testing of seven applications widely used in fuzzing evaluations and all 26 applications in Google's fuzzer test suite.

Our evaluation of dynamic metadata structure switching validates our intuitions, as shown in Figure 4. Observe that different applications are dominated by different metadata structures, e.g., c-ares by RB-tree and pngfix by min-shadow memory. This is because dynamic metadata structure switching automatically selects the optimal metadata structure (which is unknown a priori). Because dynamic metadata structure switching is automatic, it prevents users from making errors such as selecting RB-tree for applications with a large number of metadata accesses, and removes the need for any user-driven profiling to make metadata decisions. Further, dynamic metadata structure switching scales with applications' memory requirements as they increase when the fuzzer finds deeper test cases, as evidenced by size, pngfix, and nm switching to different min-shadow memory modes (4GB, 8GB, and 16GB heap sizes) without user intervention. Without dynamic metadata structure switching, inefficient min-shadow memory modes would be used at the beginning

[Figure 4: stacked-bar chart. X-axis: c-ares, vorbis, pngfix, size, nm; y-axis: 0-100% metadata structure selection frequency. Legend: ASan shadow memory, FuZZan RB-tree, FuZZan Min-shadow-1G, FuZZan Min-shadow-4G, FuZZan Min-shadow-8G, FuZZan Min-shadow-16G. Per-bar switch totals: 270, 350, 60, 95, 682.]

Figure 4: Evaluating the frequency of metadata structure switching and each metadata structure selection over the first 500,000 tests each for c-ares and vorbis in Google's fuzzer test suite and pngfix, size, and nm. The number on each bar indicates the total number of metadata switches.

of fuzzing campaigns, or users would have to pause and restart fuzzing campaigns to change metadata modes.

As an extreme example highlighting the need for automatic metadata switching, the nm benchmark changes metadata structures 682 times, underscoring the infeasibility of having a human analyst determine the single best metadata structure.

As a result of these factors, FuZZan's dynamic metadata structure switching mode improves performance over ASan by 52% when starting with empty seeds and 48% when starting with non-empty seeds. Further, compared to native execution, ASan has 306% overhead and FuZZan 94% (212 percentage points less) with empty seeds, while ASan has 303% overhead and FuZZan 111% (192 percentage points less) with non-empty seeds. Note that dynamic metadata structure switching has identical fuzzing performance to using min-shadow memory with a 1GB heap alone, and improves performance over RB-tree by up to 870%. Consequently, automating metadata selection adds no noticeable overhead, while substantially improving user experience. We recommend using dynamic metadata structure switching mode for the following four reasons: (i) if the target application exceeds FuZZan's heap memory limit (1GB), dynamic metadata structure switching automatically increases the heap size for the few executions that require it (a fixed heap size results in false positive crashes due to heap memory exhaustion); (ii) it prevents users from selecting an incorrect metadata structure; (iii) using only one metadata structure (e.g., min-shadow memory) may miss the opportunity to further improve throughput, as, in some cases, RB-tree (or some future metadata structure) may be faster than min-shadow memory; and (iv) manually selecting a metadata structure requires extra effort (e.g., measuring each metadata structure's efficiency for the target application), which dynamic metadata structure switching mode avoids by automatically selecting the optimal metadata structure.


Programs  Native             ASan              FuZZan
          exec#    path#     exec#   path#     exec# (%)    path# (%)
cxxfilt   86M      2,769     33M     2,442     51M (55%)    2,651 (9%)
file      29M      1,126     7M      763       9M (29%)     845 (11%)
nm        51M      1,272     7M      822       12M (71%)    872 (6%)
objdump   95M      883       15M     567       17M (13%)    595 (5%)
pngfix    36M      971       18M     912       33M (83%)    982 (8%)
size      52M      703       17M     626       32M (88%)    656 (5%)
tcpdump   70M      3,587     11M     1,540     20M (82%)    2,032 (32%)
Total     419M     11,311    108M    7,672     174M (61%)   8,633 (13%)

Table 7: Evaluating FuZZan's total execution count and unique discovered paths over 24 hours of fuzz testing with provided seeds. M denotes 1,000,000 (one million) and the ratio (%) is the improvement of FuZZan relative to ASan.

Programs        ASan TTE (s)   FuZZan TTE (s)   rate (%)   Type (source)
c-ares          45             25               46         BO (ares_create_query.c:196)
json            29             11               61         AF (fuzzer-parse_json.cpp:50)
libxml2         7,314          4,194            43         BO (CVE-2015-8317)
openssl-1.0.1f  443            336              24         BO (t1_lib.c:2586)
pcre2           7,056          4,020            43         BO (pcre2_match.c:5968)
Total           14,887         8,586            42         -

Table 8: Evaluating FuZZan's bug finding speed. TTE denotes the mean time-to-exposure. AF denotes an assertion failure and BO denotes a buffer overflow.

    5.4 Real-world fuzz testing

Our experiments validating FuZZan use a record/replay approach to avoid any impact of randomness, allowing meaningful comparisons to a baseline. However, real-world fuzzing is highly stochastic, and so we also evaluate FuZZan in the context of several real-world end-to-end fuzzing campaigns without deterministic input record/replay. For this experiment, we select the following widely used programs: cxxfilt, nm, objdump, size (all from binutils-2.31), file (version 5.35), pngfix (from libpng 1.6.38), and tcpdump (version 4.10.0). Klees et al. [21] select and test cxxfilt, nm, and objdump in their fuzzing evaluation study. The remaining four programs (size, file, pngfix, and tcpdump) are widely tested by recent fuzzing works [1, 3, 6, 26, 36, 46]. For each binary, we run a fuzzing campaign. Each campaign is conducted for 24 hours and repeated five times. We measure the number of total executions and discovered unique paths when fuzzing with seeds of the right file type from each program's seed corpus and three different configurations: native, ASan, and FuZZan's dynamic metadata structure switching mode, and report the mean over the five campaigns.

As a result, FuZZan improves throughput over ASan by 61% (up to 88%). Interestingly, FuZZan discovers 13% more unique paths in the same 24 hour period due to improved throughput. Our evaluation also shows that improved throughput increases the possibility of finding more bugs in the same amount of time, as we discuss next.

Modes        time (s)   vs. Native (%)   vs. MSan (%)   vs. MSan-nolock (%)
Native       146        -                -              -
MSan         14,074     9,575            -              -
MSan-nolock  386        165              -97            -
Min-16G      335        130              -98            -13

Table 9: Comparison between Native, MSan, MSan-nolock, and min-shadow memory execution overhead during input record and replay fuzz testing with provided seed sets. MSan-nolock disables lock/unlock for MSan's logging depots. Time (s) indicates the average execution time. Positive percentages denote overhead, negative percentages denote speedup.

5.5 Bug finding effectiveness

FuZZan increases throughput while maintaining ASan's bug detection capability, potentially enabling it to find more bugs. To demonstrate this, we evaluate FuZZan's bug finding speed and compare it to a fuzzing campaign with ASan. In this evaluation, we target five applications in Google's fuzzer test suite. These applications are chosen because we found bugs in them (using ASan and dynamic metadata structure switching mode) within a 24 hour fuzzing campaign. We use the seeds provided by the test suite and repeat each campaign five times. Note that we do not replay recorded inputs during these campaigns, instead letting the fuzzer generate random inputs. Table 8 shows the mean time (over five campaigns) to find each bug. Notably, FuZZan finds all bugs up to 61% (mean 42%) faster than ASan, and is faster in all cases. This experiment reinforces our belief that throughput is paramount when fuzzing with sanitizers.

    5.6 FuZZan Flexibility

Applying FuZZan to Memory Sanitizer. Like ASan, numerous sanitizers use shadow memory for their metadata structures [47]. For example, other popular sanitizers, such as Memory Sanitizer (MSan) [48] and Thread Sanitizer (TSan) [42], also rely on shadow memory for metadata. FuZZan optimizes sanitizer usage of shadow memory without modifying the stored shadow information or how the sanitizer uses that information. Consequently, porting our shadow metadata improvements in FuZZan from ASan to other sanitizers is a simple engineering exercise. To demonstrate this, we port FuZZan to MSan. In so doing, we shrink MSan's memory space to implement min-shadow memory 16G for MSan (1GB for the stack, 16GB for the heap, and 2GB for the BSS, data, and text sections combined). We only implement one metadata mode for our MSan proof-of-concept to validate our claim that applying FuZZan to other shadow-memory-based sanitizers is an engineering exercise.

Table 9 summarizes MSan's performance overhead in different modes for all 26 evaluated applications. Initially,


min-shadow memory shows high overhead, around 96 times native. Analyzing this, we found that MSan's fork() interceptor locks all logging depots before fork() and similarly unlocks them afterwards to avoid deadlocks. However, as explained in § 4, locking/unlocking logging depots is unnecessary for fuzzing because these depots exist for bug reporting, and fuzzing inherently enables replay by storing test inputs when the fuzzer finds bugs. We thus disable these lock/unlock functions to create the MSan-nolock mode, which has reasonable overhead (2.6 times that of native).

    FuZZan’s MSan min-shadow memory 16G mode shows13% performance improvement compared to MSan-nolockmode, demonstrating FuZZan’s efficacy when applied toMSan. We expect that additional optimization and the appli-cation of the dynamic switch mode will lead to even higherperformance improvement. We leave this engineering as futurework.Applying FuZZan to MOpt-AFL. FuZZan is not coupledto a particular fuzzer or fuzzer version. Most modernfuzzers [2, 3, 31, 31] extend AFL, so our approach appliesbroadly. To demonstrate this, we apply FuZZan to MOpt-AFL [31], which is an efficient mutation scheduling schemeto achieve better fuzzing efficiency. We modify MOpt-AFLto add FuZZan’s profiling feedback and dynamic metadataswitching functions. To measure FuZZan’s impact onMOpt-AFL, we select seven real-world applications (the sameset as Table 7) and fuzz them for 24 hours each, repeating theexperiment five times to control for randomness in the results.On average, ASan-MOpt-AFL mode discovers 85% moreunique paths given the same 24 hours time due to MOpt-AFL’seffectiveness compared to ASan. Notably, FuZZan-MOpt-AFL mode discovers 112% more unique paths (27% higherthan ASan-MOpt-AFL) due to the improved throughput.

    6 Discussion

In this section, we summarize some potential areas for future work, a possible security extension enabled by FuZZan, and lessons learned in designing FuZZan.

Removing conflicts between sanitizers. ASan's shadow memory scheme conflicts with those of other sanitizers that are also based on shadow memory, e.g., MSan and TSan. Each sanitizer interprets the shadow memory in a mutually exclusive manner, prohibiting the use of multiple concurrent sanitizers. For example, ASan uses shadow memory as a metadata store, while MSan prohibits access to the same memory range. FuZZan's new metadata structures can be adapted to avoid this conflict and enable true composition of sanitizers, since we use lightweight, independent metadata structures. Each sanitizer can map its own instance of our metadata structure, and all sanitizers may coexist in a single process. However, some engineering effort is required to port sanitizers to our new metadata structures. An alternate approach would be to have one metadata structure that stores information for all sanitizers. Whether a unified metadata structure or a metadata structure per sanitizer is more efficient is an interesting research question.

Possible security extension. Unfortunately, ASan's virtual memory requirements directly conflict with fuzzers' ability to detect certain out-of-memory (OOM) bugs. For example, fuzzers typically limit memory usage to detect OOM errors when parsing malformed input. However, ASan's large virtual memory requirement masks OOM bugs, leaving them undetected because of the difficulty of setting precise memory limits. Consequently, using a compact metadata structure with ASan not only improves performance, but also enables an extension of ASan's policy to cover OOM bugs.

Lessons learned. Our initial metadata design leveraged a two-layered shadow memory metadata structure that split metadata lookups into two parts: a lookup into a top-level metadata structure, followed by a lookup into a second-level metadata structure, a la page tables. While this design vastly reduced memory consumption and management overhead, the additional runtime cost per metadata access from the extra indirection resulted in the two-layer structure being slower than ASan in all cases.

For dynamic metadata structure switching, we evaluated two additional policies: (i) utilizing more detailed metadata access information, such as each object type's (e.g., stack) metadata access (e.g., insert) count and each operation's microbenchmark results; and (ii) running each metadata mode, measuring their execution times, and selecting the fastest mode. In our evaluation, the additional sampling complexity of these policies outweighed any gains from more precisely selecting a metadata structure.

    7 Related Work

    7.1 Reducing Fuzzing Overhead

Several approaches reduce the overhead of fuzzing. One approach is to reduce the execution time of each iteration. AFL supports a deferred fork server, which requires a manual call to the fork server. The analyst is encouraged to use the deferred fork server and manually initiate the fork server as late as possible to reduce not only overhead from linking and libc initialization, but also overhead from the initialization of the target program. Deferred mode, however, cannot reduce the teardown overhead of heavy metadata structures. AFL's persistent mode and libFuzzer eliminate the overhead of creating a new process. However, these approaches require manual effort, and users must know the target programs. Xu et al. [55] implement several new OS primitives to improve the efficiency of fuzzing on multicore platforms. In particular, by supporting a new system call, snapshot, instead of fork, they reduce the overhead of creating a process. Moreover, they reduce the overhead from file system contention through a dual file system service.


However, this approach requires kernel modifications for the new primitives, and does not reduce the overhead of sanitizers.

Another approach is to improve fuzzing itself so that it can find more crashes within the same number of executions. AFLFast [3] adopts a Markov chain model to select a seed: if inputs mutated from a seed explore more new paths, the seed has a higher probability of being selected. Given target source locations, AFLGo [2] selects seeds that have higher probabilities of reaching those locations. Several approaches adopt hybrid fuzzing, taint analysis, and machine learning to help fuzzers explore more paths. SAVIOR [8] uses hybrid fuzzing, combining it with concolic execution to explore code blocks guarded by complex branch conditions. RedQueen [1] uses taint analysis and symbolic execution for the same purpose. VUzzer [40] also uses dynamic taint analysis and mutates bytes which are related to target branch conditions to efficiently explore paths. TIFF [18] infers the type of the input bytes through dynamic taint analysis and uses the type information to mutate the input. Matryoshka [7] uses both data flow and control flow information to explore nested branches. In addition to hybrid fuzzing with traditional techniques such as symbolic and concolic execution, NEUZZ [46] adapts a neural network and sets the number of covered paths as an objective function to maximize path coverage. Angora [6] adapts both taint analysis and a gradient descent algorithm to improve the number of covered paths. These approaches do not reduce the execution time of each iteration and are therefore orthogonal to our work; we can use them to further increase fuzzing performance.

    7.2 Optimizing Sanitizers

Since C/C++ are memory- and type-unsafe languages, several sanitizers [47] target memory safety violations [5, 23, 41, 48, 49] and type safety violations [14, 19, 24, 29]. Despite their broad use, sanitizers have several limitations, such as high overhead, limited detection abilities, and incompatibility with other sanitizers.

To reduce sanitizer overhead, ASAP [52] and PartiSan [25] disable check instrumentation on the hot path according to their policies. The intuition of both approaches is that most of a sanitizer's overhead comes from checks on a few hot code paths that are frequently executed (e.g., instrumentation in a loop). ASAP removes check instrumentation on the hot path based on pre-calculated profiling results at compile time. In PartiSan [25], Lettner et al. propose runtime partitioning to more effectively remove check instrumentation based on runtime information during execution. However, both approaches miss a main source of overhead when reducing the cost of ASan during fuzzing campaigns: the overhead is due to memory management, not the low-overhead safety checks. As ASAP and PartiSan target the cost of checks, they are complementary to FuZZan. To fuzz quickly, there is also the option of generating a corpus with a normal binary and then feeding that corpus to an ASan binary. FuZZan can also adopt this

    option for fast fuzzing.

Pina et al. [38] use multi-version execution to concurrently run sanitizer-protected processes together with native processes, synchronizing all versions at the system-call level. To synchronize all versions, they use a system-call buffer and a Domain-Specific Language [37] to resolve conflicts between different program versions. Xu et al. [54] propose Bunshin to reduce the overhead of sanitizers and conflicts based on an N-version system through their check distribution, sanitizer distribution, and cost distribution policies. Since these approaches are based on N-version systems, they increase hardware requirements, such as several dedicated cores and at least N times the memory. Also, these approaches do not address the fundamental problem of ASan's memory overhead.

    8 Conclusion

Combining a fuzzer with sanitizers is a popular and effective approach to maximize bug finding efficacy. However, several design choices of current sanitizers hinder fuzzing effectiveness, increasing the runtime cost and reducing the benefit of combining fuzzing and sanitization.

We show that the root cause of this overhead is the heavy metadata structure used by sanitizers, and propose FuZZan to optimize sanitizer metadata structures for fuzzing. We implement and apply these ideas to ASan. We design new metadata structures to replace ASan's rigid shadow memory, reducing the memory management overhead while maintaining the same error detection capabilities. Our dynamic metadata structure adaptively selects the most efficient metadata structure for the current fuzzing campaign without manual configuration.

Our evaluation shows that FuZZan improves performance over ASan by 52% when starting with empty seeds (48% with Google's seed corpus). Based on improved throughput, FuZZan discovers 13% more unique paths given the same 24 hours and finds bugs 42% faster. The open-source version of FuZZan is available at https://github.com/HexHive/FuZZan.

    Acknowledgments

We thank the anonymous reviewers and our shepherd Julia Lawall for their detailed feedback. This project has received funding from the European Research Council (ERC) under the European Union's Horizon 2020 research and innovation program (grant agreement No. 850868), NSF CNS-1801601, and ONR award N00014-18-1-2674. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of our sponsors.

    260 2020 USENIX Annual Technical Conference USENIX Association

References

[1] Cornelius Aschermann, Sergej Schumilo, Tim Blazytko, Robert Gawlik, and Thorsten Holz. REDQUEEN: Fuzzing with Input-to-State Correspondence. In Proceedings of the Network and Distributed System Security Symposium (NDSS), 2019.

[2] Marcel Böhme, Van-Thuan Pham, Manh-Dung Nguyen, and Abhik Roychoudhury. Directed greybox fuzzing. In Proceedings of the ACM Conference on Computer and Communications Security (CCS), 2017.

[3] Marcel Böhme, Van-Thuan Pham, and Abhik Roychoudhury. Coverage-based greybox fuzzing as Markov chain. In Proceedings of the ACM Conference on Computer and Communications Security (CCS), 2016.

[4] Derek Bruening and Qin Zhao. Practical memory checking with Dr. Memory. In Proceedings of the Annual IEEE/ACM International Symposium on Code Generation and Optimization (CGO), 2011.

[5] Nathan Burow, Derrick McKee, Scott A Carr, and Mathias Payer. CUP: Comprehensive User-Space Protection for C/C++. In Proceedings of the Asia Conference on Computer and Communications Security (ASIACCS), 2018.

[6] Peng Chen and Hao Chen. Angora: Efficient fuzzing by principled search. In Proceedings of the IEEE Symposium on Security and Privacy (SP), 2018.

[7] Peng Chen, Jianzhong Liu, and Hao Chen. Matryoshka: Fuzzing Deeply Nested Branches. In Proceedings of the ACM Conference on Computer and Communications Security (CCS), 2019.

[8] Yaohui Chen, Peng Li, Jun Xu, Shengjian Guo, Rundong Zhou, Yulong Zhang, Long Lu, et al. SAVIOR: Towards Bug-Driven Hybrid Testing. In Proceedings of the IEEE Symposium on Security and Privacy (SP), 2020.

[9] Google. Address Sanitizer Found Bugs. https://github.com/google/sanitizers/wiki/AddressSanitizerFoundBugs.

[10] Google. Clusterfuzz. https://google.github.io/clusterfuzz/.

[11] Google. Fuzzer test suite. https://github.com/google/fuzzer-test-suite.

[12] Google. Kernel Address Sanitizer (KASan), a fast memory error detector for the Linux kernel. https://github.com/google/kasan/wiki.

[13] Google. Libfuzzer tutorial. https://github.com/google/fuzzer-test-suite/blob/master/tutorial/libFuzzerTutorial.md.

[14] Istvan Haller, Yuseok Jeon, Hui Peng, Mathias Payer, Cristiano Giuffrida, Herbert Bos, and Erik van der Kouwe. TypeSan: Practical type confusion detection. In Proceedings of the ACM Conference on Computer and Communications Security (CCS), 2016.

[15] Niranjan Hasabnis, Ashish Misra, and R Sekar. Light-weight bounds checking. In Proceedings of the International Symposium on Code Generation and Optimization (CGO), 2012.

[16] Reed Hastings. Purify: Fast detection of memory leaks and access errors. In Proceedings of the USENIX Security Symposium (SEC), 1992.

[17] Van Jacobson. Congestion avoidance and control. ACM SIGCOMM Computer Communication Review, 1988.

[18] Vivek Jain, Sanjay Rawat, Cristiano Giuffrida, and Herbert Bos. TIFF: Using Input Type Inference To Improve Fuzzing. In Proceedings of the Annual Computer Security Applications Conference (ACSAC), 2018.

[19] Yuseok Jeon, Priyam Biswas, Scott Carr, Byoungyoung Lee, and Mathias Payer. HexType: Efficient Detection of Type Confusion Errors for C++. In Proceedings of the ACM Conference on Computer and Communications Security (CCS), 2017.

[20] Linux kernel documentation. The Kernel Address Sanitizer (KASAN). https://www.kernel.org/doc/html/v4.14/dev-tools/kasan.html.

[21] George Klees, Andrew Ruef, Benji Cooper, Shiyi Wei, and Michael Hicks. Evaluating fuzz testing. In Proceedings of the ACM Conference on Computer and Communications Security (CCS), 2018.

[22] Taddeus Kroes, Koen Koning, Cristiano Giuffrida, Herbert Bos, and Erik van der Kouwe. Fast and generic metadata management with mid-fat pointers. In Proceedings of the European Workshop on Systems Security (EuroSec), 2017.

[23] Byoungyoung Lee, Chengyu Song, Yeongjin Jang, Tielei Wang, Taesoo Kim, Long Lu, and Wenke Lee. Preventing Use-after-free with Dangling Pointers Nullification. In Proceedings of the Network and Distributed System Security Symposium (NDSS), 2015.

[24] Byoungyoung Lee, Chengyu Song, Taesoo Kim, and Wenke Lee. Type Casting Verification: Stopping an Emerging Attack Vector. In Proceedings of the USENIX Security Symposium (SEC), 2015.


[25] Julian Lettner, Dokyung Song, Taemin Park, Per Larsen, Stijn Volckaert, and Michael Franz. PartiSan: fast and flexible sanitization via run-time partitioning. In Proceedings of the International Symposium on Research in Attacks, Intrusions, and Defenses (RAID), 2018.

[26] Yuekang Li, Bihuan Chen, Mahinthan Chandramohan, Shang-Wei Lin, Yang Liu, and Alwen Tiu. Steelix: program-state based binary fuzzing. In Proceedings of the Joint Meeting on Foundations of Software Engineering (FSE), 2017.

[27] LLVM. LibFuzzer – a library for coverage-guided fuzz testing. https://llvm.org/docs/LibFuzzer.html.

[28] LLVM. The LLVM Compiler Infrastructure Project. http://llvm.org/.

[29] LLVM. TySan: A type sanitizer. https://reviews.llvm.org/D32199.

[30] Alexey Loginov, Suan Hsi Yong, Susan Horwitz, and Thomas Reps. Debugging via run-time type checking. In Proceedings of the International Conference on Fundamental Approaches to Software Engineering (FASE), 2001.

[31] Chenyang Lyu, Shouling Ji, Chao Zhang, Yuwei Li, Wei-Han Lee, Yu Song, and Raheem Beyah. MOPT: Optimized Mutation Scheduling for Fuzzers. In Proceedings of the USENIX Security Symposium (SEC), 2019.

[32] Valentin Jean Marie Manès, HyungSeok Han, Choongwoo Han, Sang Kil Cha, Manuel Egele, Edward J Schwartz, and Maverick Woo. The art, science, and engineering of fuzzing: A survey. IEEE Transactions on Software Engineering, 2019.

[33] Barton P Miller, Louis Fredriksen, and Bryan So. An empirical study of the reliability of UNIX utilities. Communications of the ACM, 1990.

[34] Matt Miller. Trends, challenge, and shifts in software vulnerability mitigation. https://github.com/Microsoft/MSRC-Security-Research/blob/master/presentations/2019_02_BlueHatIL/2019_01%20-%20BlueHatIL%20-%20Trends%2C%20challenge%2C%20and%20shifts%20in%20software%20vulnerability%20mitigation.pdf.

[35] NIST. Juliet test suite. https://samate.nist.gov/SARD/testsuite.php.

[36] Hui Peng, Yan Shoshitaishvili, and Mathias Payer. T-Fuzz: fuzzing by program transformation. In Proceedings of the IEEE Symposium on Security and Privacy (SP), 2018.

[37] Luís Pina, Daniel Grumberg, Anastasios Andronidis, and Cristian Cadar. A DSL approach to reconcile equivalent divergent program executions. In Proceedings of the USENIX Annual Technical Conference (ATC), 2017.

[38] Luís Pina, Anastasios Andronidis, and Cristian Cadar. FreeDA: Deploying Incompatible Stock Dynamic Analyses in Production via Multi-Version Execution. In Proceedings of the ACM International Conference on Computing Frontiers (CF), 2018.

[39] The Chromium Project. Address Sanitizer (ASan). https://www.chromium.org/developers/testing/addresssanitizer.

[40] Sanjay Rawat, Vivek Jain, Ashish Kumar, Lucian Cojocar, Cristiano Giuffrida, and Herbert Bos. Vuzzer: Application-aware evolutionary fuzzing. In Proceedings of the Network and Distributed System Security Symposium (NDSS), 2017.

[41] Konstantin Serebryany, Derek Bruening, Alexander Potapenko, and Dmitriy Vyukov. AddressSanitizer: A fast address sanity checker. In Proceedings of the USENIX Annual Technical Conference (ATC), 2012.

[42] Konstantin Serebryany and Timur Iskhodzhanov. ThreadSanitizer: data race detection in practice. In Proceedings of the Workshop on Binary Instrumentation and Applications (WBIA), 2009.

[43] Kostya Serebryany. Hardware Memory Tagging to make C/C++ memory safe(r). https://github.com/google/sanitizers/blob/master/hwaddress-sanitizer/HardwareMemoryTaggingtomakeC_C++memorysafe(r)-iSecCon2018.pdf.

[44] Kostya Serebryany. Sanitize, Fuzz, and Harden Your C++ Code. https://www.usenix.org/sites/default/files/conference/protected-files/enigma_slides_serebryany.pdf.

[45] Julian Seward and Nicholas Nethercote. Using Valgrind to Detect Undefined Value Errors with Bit-Precision. In Proceedings of the USENIX Annual Technical Conference (ATC), 2005.

[46] Dongdong She, Kexin Pei, Dave Epstein, Junfeng Yang, Baishakhi Ray, and Suman Jana. Neuzz: Efficient fuzzing with neural program smoothing. In Proceedings of the IEEE Symposium on Security and Privacy (SP), 2019.

[47] Dokyung Song, Julian Lettner, Prabhu Rajasekaran, Yeoul Na, Stijn Volckaert, Per Larsen, and Michael Franz. SoK: sanitizing for security. In Proceedings of the IEEE Symposium on Security and Privacy (SP), 2019.


[48] Evgeniy Stepanov and Konstantin Serebryany. MemorySanitizer: fast detector of uninitialized memory use in C++. In Proceedings of the Annual IEEE/ACM International Symposium on Code Generation and Optimization (CGO), 2015.

[49] Erik Van Der Kouwe, Vinod Nigade, and Cristiano Giuffrida. DangSan: Scalable use-after-free detection. In Proceedings of the European Conference on Computer Systems (EuroSys), 2017.

[50] Dmitry Vyukov. Address/Thread/MemorySanitizer Slaughtering C++ bugs. https://www.slideshare.net/sermp/sanitizer-cppcon-russia.

[51] Dmitry Vyukov. Syzbot. https://syzkaller.appspot.com/upstream.

[52] Jonas Wagner, Volodymyr Kuznetsov, George Candea, and Johannes Kinder. High system-code security with low overhead. In Proceedings of the IEEE Symposium on Security and Privacy (SP), 2015.

[53] Wikipedia. x32 ABI. https://en.wikipedia.org/wiki/X32_ABI.

[54] Meng Xu, Kangjie Lu, Taesoo Kim, and Wenke Lee. Bunshin: Compositing Security Mechanisms through Diversification. In Proceedings of the USENIX Annual Technical Conference (ATC), 2017.

[55] Wen Xu, Sanidhya Kashyap, Changwoo Min, and Taesoo Kim. Designing New Operating Primitives to Improve Fuzzing Performance. In Proceedings of the ACM Conference on Computer and Communications Security (CCS), 2017.

[56] Yves Younan. FreeSentry: protecting against use-after-free vulnerabilities due to dangling pointers. In Proceedings of the Network and Distributed System Security Symposium (NDSS), 2015.

[57] Michal Zalewski. American Fuzzy Lop. http://lcamtuf.coredump.cx/afl.

[58] Michal Zalewski. New in AFL: persistent mode. https://lcamtuf.blogspot.com/2015/06/new-in-afl-persistent-mode.html.
