![Page 1: Chris Murphy, Moses Vaughan, Waseem Ilahi, Gail Kaiser Columbia University](https://reader036.vdocuments.us/reader036/viewer/2022062315/568150fb550346895dbf1a11/html5/thumbnails/1.jpg)
Automatic Detection of Previously-Unseen Application
States for Deployment Environment Testing and Analysis
Chris Murphy, Moses Vaughan, Waseem Ilahi, Gail Kaiser
Columbia University
![Page 2: Chris Murphy, Moses Vaughan, Waseem Ilahi, Gail Kaiser Columbia University](https://reader036.vdocuments.us/reader036/viewer/2022062315/568150fb550346895dbf1a11/html5/thumbnails/2.jpg)
Overview
Deployment environment testing and analysis can find defects that were not found prior to release.
These approaches potentially suffer from high overhead.
We seek to increase efficiency by running tests and analysis only in previously-unseen application states.
![Page 3: Chris Murphy, Moses Vaughan, Waseem Ilahi, Gail Kaiser Columbia University](https://reader036.vdocuments.us/reader036/viewer/2022062315/568150fb550346895dbf1a11/html5/thumbnails/3.jpg)
In Vivo Testing [Murphy et al. ICST’09]
Automatically conduct tests as the software is running in the deployment environment
Unit tests, integration tests, etc.
“Runtime assertions with side effects”
Looks for defects in previously untested application states
![Page 4: Chris Murphy, Moses Vaughan, Waseem Ilahi, Gail Kaiser Columbia University](https://reader036.vdocuments.us/reader036/viewer/2022062315/568150fb550346895dbf1a11/html5/thumbnails/4.jpg)
In Vivo test
1. Function f is about to be executed with input x in state S
2. Create a sandbox for the test
3. Outside the sandbox: execute f(x); the program continues
4. Inside the sandbox: execute INVtest_f(x) in state S and report violations
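The flow above can be sketched in a few lines of Python. This is only an illustration, not the In Vivo Testing implementation: the sandbox is approximated with a deep copy of the state (an assumption), and `run_with_in_vivo_test`, `insert_sorted` and `inv_test_insert_sorted` are hypothetical names invented for the example.

```python
import copy


def run_with_in_vivo_test(f, inv_test_f, x, state):
    """Execute f(x) in the real state and INVtest_f(x) in a sandboxed copy."""
    violations = []
    # Sandbox (assumption): a deep copy of the state, so that any side
    # effects of the test stay invisible to the running program.
    sandbox = copy.deepcopy(state)
    try:
        if not inv_test_f(x, sandbox):
            violations.append(f"INVtest_{f.__name__}({x!r}) failed")
    except Exception as exc:  # a crashing test is reported as a violation too
        violations.append(f"INVtest_{f.__name__}({x!r}) raised {exc!r}")
    # The program itself continues with the real, untouched state.
    return f(x, state), violations


def insert_sorted(x, state):
    """Hypothetical function under test: add x and keep the list sorted."""
    state["items"].append(x)
    state["items"].sort()
    return len(state["items"])


def inv_test_insert_sorted(x, state):
    """Hypothetical in vivo test: size grows by one and order is preserved."""
    before = len(state["items"])
    insert_sorted(x, state)
    items = state["items"]
    return len(items) == before + 1 and items == sorted(items)
```

Because the test mutates only the copy, the element is inserted into the real state exactly once, by the call to `f` itself.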
![Page 5: Chris Murphy, Moses Vaughan, Waseem Ilahi, Gail Kaiser Columbia University](https://reader036.vdocuments.us/reader036/viewer/2022062315/568150fb550346895dbf1a11/html5/thumbnails/5.jpg)
Research Question
Can the approach be made more efficient by running tests and performing analysis only in application states that the program has not seen before?
Yes, assuming we can quickly determine whether the current state has not previously been seen
![Page 6: Chris Murphy, Moses Vaughan, Waseem Ilahi, Gail Kaiser Columbia University](https://reader036.vdocuments.us/reader036/viewer/2022062315/568150fb550346895dbf1a11/html5/thumbnails/6.jpg)
Broader Impact
Other approaches may also use data about deployment environment execution:
Monitoring: anomaly detection
Profiling: path coverage, line coverage, etc.
Fault localization
All of these may benefit if analysis is done only in previously-unseen states
![Page 7: Chris Murphy, Moses Vaughan, Waseem Ilahi, Gail Kaiser Columbia University](https://reader036.vdocuments.us/reader036/viewer/2022062315/568150fb550346895dbf1a11/html5/thumbnails/7.jpg)
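Analysis
Overhead of the standard approach:
$T_{standard} = N \cdot t_i$
Overhead in previously unseen states:
$T_{unseen} = N \cdot \mathrm{DSP} \cdot (t_d + t_u + t_i)$
Overhead in previously seen states:
$T_{seen} = N \cdot (1 - \mathrm{DSP}) \cdot t_d$
Find DSP so that $T_{unseen} + T_{seen} \le T_{standard}$
where:
$N$ is the number of tests
$t_i$ is the time per test
DSP is the Distinct State Percentage (used in the formulas as a fraction between 0 and 1)
$t_d$ is the time to determine whether the state has been seen before
$t_u$ is the time to update the list of seen states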
![Page 8: Chris Murphy, Moses Vaughan, Waseem Ilahi, Gail Kaiser Columbia University](https://reader036.vdocuments.us/reader036/viewer/2022062315/568150fb550346895dbf1a11/html5/thumbnails/8.jpg)
Analysis Results
To be more efficient, we need:
$\mathrm{DSP} \le (t_i - t_d) / (t_i + t_u)$
If $t_d$ and $t_u$ are much less than $t_i$, the right side of the inequality approaches 1.
This means that even if nearly all states are distinct, the new approach will still be more efficient.
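The bound follows directly from the three overhead expressions of the analysis. The intermediate steps below are filled in here for completeness and do not appear on the slides; they assume $N > 0$ and $t_i + t_u > 0$.

```latex
\begin{align*}
T_{unseen} + T_{seen} &\le T_{standard} \\
N \cdot \mathrm{DSP} \cdot (t_d + t_u + t_i) + N \cdot (1 - \mathrm{DSP}) \cdot t_d &\le N \cdot t_i \\
\mathrm{DSP} \cdot (t_u + t_i) + t_d &\le t_i
  && \text{(divide by } N\text{; the } \mathrm{DSP} \cdot t_d \text{ terms cancel)} \\
\mathrm{DSP} &\le \frac{t_i - t_d}{t_i + t_u}
\end{align*}
```

As a purely hypothetical illustration (these numbers are not measurements): with $t_i = 100$, $t_d = 1$ and $t_u = 2$ in the same time unit, the bound is $99 / 102 \approx 0.97$, so the approach would pay off as long as no more than about 97% of the states are distinct.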
![Page 9: Chris Murphy, Moses Vaughan, Waseem Ilahi, Gail Kaiser Columbia University](https://reader036.vdocuments.us/reader036/viewer/2022062315/568150fb550346895dbf1a11/html5/thumbnails/9.jpg)
Implementation Issues
How do we define the “state”?
How do we represent the state?
How can we quickly determine whether a state has previously been seen?
Does the previous analysis hold in the real world?
![Page 10: Chris Murphy, Moses Vaughan, Waseem Ilahi, Gail Kaiser Columbia University](https://reader036.vdocuments.us/reader036/viewer/2022062315/568150fb550346895dbf1a11/html5/thumbnails/10.jpg)
Defining “State”
We define “state” as “the values of all variables that are in scope at a given execution point”
For the purposes of In Vivo Testing, this can be further refined to “the values of all variables on which a function depends that are in scope at the start of the function execution”
![Page 11: Chris Murphy, Moses Vaughan, Waseem Ilahi, Gail Kaiser Columbia University](https://reader036.vdocuments.us/reader036/viewer/2022062315/568150fb550346895dbf1a11/html5/thumbnails/11.jpg)
Example
We can statically determine that function f1 depends on parameters p1, p2, p3 and on global variables a and b
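The code listing for this example did not survive in the transcript. As a stand-in, the sketch below shows one way such a dependence could be determined statically, using Python's `ast` module on an invented body for `f1`. The source string, the extra global `c` and the helper `dependencies` are all hypothetical, and the approach itself (direct reads only, no tracking through called functions) is a simplification rather than the analysis used by the authors.

```python
import ast

# Hypothetical program: f1 reads globals a and b, but never c.
SOURCE = '''
a = 4
b = 3
c = 9

def f1(p1, p2, p3):
    local = p1 + a
    return local * p2 - p3 + b
'''


def dependencies(source, func_name):
    """Return (parameters, globals read) for one function, without running it."""
    tree = ast.parse(source)
    func = next(node for node in ast.walk(tree)
                if isinstance(node, ast.FunctionDef) and node.name == func_name)
    params = [arg.arg for arg in func.args.args]
    # Names assigned at module level are treated as the global variables.
    module_globals = {target.id
                      for stmt in tree.body if isinstance(stmt, ast.Assign)
                      for target in stmt.targets if isinstance(target, ast.Name)}
    names = [node for node in ast.walk(func) if isinstance(node, ast.Name)]
    read = {n.id for n in names if isinstance(n.ctx, ast.Load)}
    assigned = {n.id for n in names if isinstance(n.ctx, ast.Store)}
    # A global counts as a dependency if it is read and not shadowed locally.
    globals_read = sorted((read & module_globals) - assigned - set(params))
    return params, globals_read
```

For the invented `f1`, the helper reports the three parameters and the globals `a` and `b`, while `c` is left out because the function never reads it.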
![Page 12: Chris Murphy, Moses Vaughan, Waseem Ilahi, Gail Kaiser Columbia University](https://reader036.vdocuments.us/reader036/viewer/2022062315/568150fb550346895dbf1a11/html5/thumbnails/12.jpg)
Representing States
Given our definition, the “state” is simply a map between variable names and values
We want to avoid element-wise comparison
We want to avoid false positives and false negatives
Example states: (a = 4, b = 3, …), (a = 2, b = 5, …), (a = 1, b = 8, …), (a = 2, b = 7, …)
![Page 13: Chris Murphy, Moses Vaughan, Waseem Ilahi, Gail Kaiser Columbia University](https://reader036.vdocuments.us/reader036/viewer/2022062315/568150fb550346895dbf1a11/html5/thumbnails/13.jpg)
Cantor Function
Goal: Give each state a distinct value
Hashing function that assigns a distinct number to a pair of numbers [Royden 1988]
$f(k_1, k_2) = \frac{1}{2}(k_1 + k_2)(k_1 + k_2 + 1) + k_2$
Can be used recursively over a set of numerical values
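A minimal sketch of the pairing function and its recursive use follows. It assumes non-negative integer values and a fixed number and order of variables per function (without that, keys built from different numbers of values could coincide). The names `cantor_pair` and `state_key` are illustrative. Python integers do not overflow, so the rapid growth of the result, which matters for fixed-width integers in C, is not visible here.

```python
from functools import reduce


def cantor_pair(k1, k2):
    """Map a pair of non-negative integers to a single distinct integer."""
    if k1 < 0 or k2 < 0:
        raise ValueError("the pairing is only defined for non-negative integers")
    s = k1 + k2
    # s * (s + 1) is always even, so integer division is exact.
    return s * (s + 1) // 2 + k2


def state_key(values):
    """Fold the pairing over a fixed-length sequence of values, left to right."""
    return reduce(cantor_pair, values)
```

For instance, `cantor_pair(4, 3)` is 31 and `cantor_pair(2, 5)` is 33, and `state_key([4, 3, 2])` pairs 31 with 2 to give 563.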
![Page 14: Chris Murphy, Moses Vaughan, Waseem Ilahi, Gail Kaiser Columbia University](https://reader036.vdocuments.us/reader036/viewer/2022062315/568150fb550346895dbf1a11/html5/thumbnails/14.jpg)
Tracking Execution States
Even if each state has its own representation, how can we quickly determine whether a state has been seen before?
Hashtable: O(n) in worst case
Bloom filter: O(1), but allows for false positives
Example: seen values 27, 18, 91, 33, 74, 26; query: 18?
![Page 15: Chris Murphy, Moses Vaughan, Waseem Ilahi, Gail Kaiser Columbia University](https://reader036.vdocuments.us/reader036/viewer/2022062315/568150fb550346895dbf1a11/html5/thumbnails/15.jpg)
Judy Array
Scalable, space efficient, speed efficient
Developed by Hewlett-Packard in 2001
Highly-optimized 256-ary prefix tree data structure
Lookups are $O(\log_{256} n)$
Now we can quickly detect whether a state has already been seen
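To illustrate the idea behind a 256-ary prefix tree, here is a deliberately simple Python sketch. It is not the Judy library and has none of its optimizations: each level branches on one byte of a fixed-width key, and children are kept in plain dictionaries. The class name `ByteTrieSet` and the 8-byte key width are assumptions made for the example.

```python
class ByteTrieSet:
    """Set of non-negative integers stored as a 256-ary prefix tree."""

    def __init__(self, key_bytes=8):
        self.key_bytes = key_bytes  # fixed key width, so every path has equal depth
        self.root = {}

    def add_if_new(self, key):
        """Insert key and return True if it had not been seen before."""
        node = self.root
        is_new = False
        # One tree level per key byte, most significant byte first.
        # to_bytes raises OverflowError for negative or oversized keys.
        for byte in key.to_bytes(self.key_bytes, "big"):
            if byte not in node:
                node[byte] = {}
                is_new = True  # a missing branch means the key is new
            node = node[byte]
        return is_new
```

Feeding it the values 27, 18, 91, 33, 74, 26 and then 18 again returns `True` six times and `False` for the repeated 18. Each call walks exactly `key_bytes` levels, regardless of how many keys are stored.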
![Page 16: Chris Murphy, Moses Vaughan, Waseem Ilahi, Gail Kaiser Columbia University](https://reader036.vdocuments.us/reader036/viewer/2022062315/568150fb550346895dbf1a11/html5/thumbnails/16.jpg)
Automated Process
1. Statically analyze the source code to determine which parts of the state the function depends on
2. Create code that uses the Cantor function to represent the state and a Judy Array to determine whether it has already been seen
3. Generate instrumentation as normal, with a call to the function created in Step 2
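The three steps can be sketched together as a Python decorator. This is an illustration under several assumptions, not the generated instrumentation itself: the result of Step 1 is supplied by hand through `read_globals`, a built-in `set` stands in for the Judy Array, all values are non-negative integers, and the names `only_in_unseen_states`, `config`, `tested` and `f1` are hypothetical.

```python
import functools


def cantor_pair(k1, k2):
    """Cantor pairing of two non-negative integers."""
    s = k1 + k2
    return s * (s + 1) // 2 + k2


def only_in_unseen_states(read_globals, run_test):
    """Wrap a function so that run_test fires only in previously-unseen states."""
    seen = set()  # stand-in for the Judy Array

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args):
            # Step 2: represent the state (parameters plus globals) as one number.
            key = functools.reduce(cantor_pair, list(args) + read_globals())
            if key not in seen:
                seen.add(key)
                run_test(func, args)  # Step 3: instrumentation runs only here
            return func(*args)
        return wrapper
    return decorator


# Hypothetical usage: f1 depends on p1, p2, p3 and on the globals a and b.
config = {"a": 4, "b": 3}
tested = []  # records the inputs for which the test was actually run


@only_in_unseen_states(read_globals=lambda: [config["a"], config["b"]],
                       run_test=lambda func, args: tested.append(args))
def f1(p1, p2, p3):
    return p1 + p2 + p3 + config["a"] + config["b"]
```

Calling `f1(1, 2, 3)` twice triggers the test only the first time. Changing `config["a"]` and calling again produces a new key, so the test runs once more.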
![Page 17: Chris Murphy, Moses Vaughan, Waseem Ilahi, Gail Kaiser Columbia University](https://reader036.vdocuments.us/reader036/viewer/2022062315/568150fb550346895dbf1a11/html5/thumbnails/17.jpg)
Evaluation
In practice, is running the instrumentation only some of the time (i.e., only in previously unseen states) really more efficient than always running it?
Target: In Vivo Testing implementation in C
Sieve of Eratosthenes program run with 100 inputs, using varying percentages of distinct states, ranging from 0% (all values are the same) to 100% (all values are different)
![Page 18: Chris Murphy, Moses Vaughan, Waseem Ilahi, Gail Kaiser Columbia University](https://reader036.vdocuments.us/reader036/viewer/2022062315/568150fb550346895dbf1a11/html5/thumbnails/18.jpg)
Results
![Page 19: Chris Murphy, Moses Vaughan, Waseem Ilahi, Gail Kaiser Columbia University](https://reader036.vdocuments.us/reader036/viewer/2022062315/568150fb550346895dbf1a11/html5/thumbnails/19.jpg)
Limitations & Future Work
Memory cost
Upper bound of Cantor function
Representation of complex objects
Coordinating globally-unseen states
![Page 20: Chris Murphy, Moses Vaughan, Waseem Ilahi, Gail Kaiser Columbia University](https://reader036.vdocuments.us/reader036/viewer/2022062315/568150fb550346895dbf1a11/html5/thumbnails/20.jpg)
Automatic Detection of Previously-Unseen Application
States for Deployment Environment Testing and Analysis
Chris Murphy
Columbia University
![Page 21: Chris Murphy, Moses Vaughan, Waseem Ilahi, Gail Kaiser Columbia University](https://reader036.vdocuments.us/reader036/viewer/2022062315/568150fb550346895dbf1a11/html5/thumbnails/21.jpg)
Related Work
Reducing overhead of runtime monitoring:
Static analysis to remove unnecessary instrumentation [Yong & Horwitz, 2005]
Fast cases vs. slow cases [Liblit et al., 2003]
State representation for anomaly detection [Baah et al., 2006] [Hangal & Lam, 2002]
![Page 22: Chris Murphy, Moses Vaughan, Waseem Ilahi, Gail Kaiser Columbia University](https://reader036.vdocuments.us/reader036/viewer/2022062315/568150fb550346895dbf1a11/html5/thumbnails/22.jpg)