Program Analysis, in Industry and Academia
Manu Sridharan
My Journey
This talk
• My personal journey in industrial research / industry / academia
  • For a great overview of CS PhD jobs, see Kathleen Fisher's PLMW@PLDI'19 talk: http://bit.ly/careerOptions
• My focus: program analysis at scale
  • Static analysis
  • Type systems and type inference
Grad School
2002-2007
Ras Bodik (my advisor)
Problem Space: Pointer Analysis
• Finds values for pointer variables
• Crucial building block
  • "What gets called by x.m()?"
  • "What code uses untrustedInput()?"
• Others formulated C pointer analysis as a CFL-reachability problem
• Ras's suggestion (~January 2003): Java pointer analysis using CFL-reachability
Target programs: Benchmarks
• Standard suites used in performance papers
  • SpecJVM98
  • Dacapo
• Tried to use "real-world" clients
  • E.g., proving safety of downcasts
  • Atypical for pointer analysis papers at the time
• "Large" programs
  • Hundreds of thousands of lines (with libraries)
  • I learned later they weren't that big…
Making progress
• Took a while (> 1 year) to ramp up
  • Understanding the (extensive) literature
  • Learning the engineering "tricks"
• A few false starts, paper rejections
  • Spent multiple months on an idea that didn't work
• Insight: Java analysis has a "balanced parentheses" structure
  • Led to a refinement-based technique with improved scalability and precision
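The "balanced parentheses" intuition can be sketched in a few lines of Java. This is only an illustration under an assumption the slide does not spell out: that a write to field f acts as an opening parenthesis and a read of field f as the matching closing one, so a value flows along a path only if the field accesses on it are balanced. The class and method names (`BalancedFieldPath`, `write`, `read`, `isBalanced`) are hypothetical, and the sketch checks a single path rather than performing the full reachability computation.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

// Hypothetical sketch: checks whether the field writes and reads along one
// data-flow path are "balanced", like matching parentheses.
final class BalancedFieldPath {
    // One edge label: a write to a field (open paren) or a read of it (close paren).
    static final class Edge {
        final boolean isWrite;
        final String field;

        Edge(boolean isWrite, String field) {
            this.isWrite = isWrite;
            this.field = field;
        }
    }

    static Edge write(String field) { return new Edge(true, field); }

    static Edge read(String field) { return new Edge(false, field); }

    // True if every read matches the most recent unmatched write of the
    // same field, and no write is left unmatched at the end of the path.
    static boolean isBalanced(List<Edge> path) {
        Deque<String> open = new ArrayDeque<>();
        for (Edge e : path) {
            if (e.isWrite) {
                open.push(e.field);
            } else if (open.isEmpty() || !open.pop().equals(e.field)) {
                return false;
            }
        }
        return open.isEmpty();
    }

    public static void main(String[] args) {
        // x.f = v; y = x.f;  the write and read match, so v may flow to y
        System.out.println(isBalanced(Arrays.asList(write("f"), read("f"))));
        // x.f = v; y = x.g;  the fields differ, so this path carries no flow
        System.out.println(isBalanced(Arrays.asList(write("f"), read("g"))));
    }
}
```

A refinement-based analysis can start by treating some of these matches approximately and only check them precisely where a client needs the extra precision.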
High-Level Example
• Must merge some objects (decidability)
• Too much merging: precision loss
• Too little merging: space explosion (typically exponential)
• Our technique: unmerge through refinement
[Figure: two heap graphs for variables x and y. "Single execution (dynamic)" shows v1: Vector and v2: Vector, backed by a1: Object[] and a2: Object[], holding s1 … sn: String and b1 … bm: Boolean. "All executions (static)" shows the merged nodes a: Object[], s: String, b: Boolean. The casts p = (String)x.get(i); and q = (Boolean)y.get(i); carry the labels "Safe!" and "May Fail".]
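A small Java program makes the downcast client in this example concrete. It is a sketch written for this transcript, not code from the talk; the class and method names (`CastSafetyDemo`, `firstString`, `firstBoolean`) are made up. Both vectors keep their elements in an internal Object[] created inside the Vector class, so an analysis that merges those two arrays into one abstract object would presumably conclude that either cast may see a String or a Boolean.

```java
import java.util.Vector;

// Hypothetical sketch of the cast-safety client from the example.
final class CastSafetyDemo {
    // Safe only if the analysis can show x holds nothing but Strings.
    static String firstString(Vector<Object> x) {
        return (String) x.get(0);
    }

    // Safe only if the analysis can show y holds nothing but Booleans.
    static Boolean firstBoolean(Vector<Object> y) {
        return (Boolean) y.get(0);
    }

    public static void main(String[] args) {
        Vector<Object> x = new Vector<>(); // only ever receives Strings
        Vector<Object> y = new Vector<>(); // only ever receives Booleans
        x.add("hello");
        y.add(Boolean.TRUE);

        String p = firstString(x);
        Boolean q = firstBoolean(y);
        System.out.println(p + " " + q); // prints "hello true"
    }
}
```

At run time both casts succeed. Proving that statically requires keeping the backing arrays of x and y apart, which is the kind of "unmerging" the refinement step is meant to provide.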
Branching out
• Idea: balanced parentheses for slicing
• Implemented with Steve Fink from IBM
  • Ras had long collaborated with IBM
  • Used newly open-source WALA framework
• Wrote a fun paper ("Thin Slicing")
• Realized how much I enjoy close collaboration...
IBM Research
2008-2013
Balance in Industrial Research
• Some work should "pay the bills"
  • Applying research results to a product
  • Lower risk (but still publishable!)
  • Tangible impact
• Other work should be forward looking
  • Speculative, but with a story for eventual impact
  • Sometimes turns into "pay the bills" work
• I really enjoyed this balance!
  • Get to have a portfolio of impact
Information flow vulnerabilities (paying the bills)
• Untrusted data used in sensitive operation
  • E.g., SQL injection
• Leaking of confidential data
  • E.g., showing exception stack traces
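The "untrusted data used in sensitive operation" pattern can be illustrated with a short Java sketch. Everything in it is an assumption for illustration: the table name `Students`, the class `TaintFlowDemo`, and the method names are invented, and no database is contacted. It only shows how untrusted text changes the shape of a query when it is concatenated into it.

```java
// Hypothetical sketch of an information-flow vulnerability: untrusted input
// (the source) reaches the text of a SQL statement (the sink).
final class TaintFlowDemo {
    // Vulnerable shape: the untrusted value becomes part of the SQL text.
    static String unsafeInsert(String studentName) {
        return "INSERT INTO Students VALUES ('" + studentName + "')";
    }

    // Safer shape: the SQL text is fixed and the value is passed separately,
    // e.g. through java.sql.PreparedStatement and its '?' placeholders.
    static String parameterizedInsert() {
        return "INSERT INTO Students VALUES (?)";
    }

    public static void main(String[] args) {
        String untrusted = "Robert'); DROP TABLE Students;--";
        // The attacker-controlled text now contains a second statement.
        System.out.println(unsafeInsert(untrusted));
        // The parameterized text does not depend on the input at all.
        System.out.println(parameterizedInsert());
    }
}
```

A taint analysis for this kind of bug looks for data-flow paths from sources such as request parameters to sinks such as query execution that do not pass through a sanitizer.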
https://www.xkcd.com/327/
Target programs: Customer code
• Much different than what I saw in grad school
• Enormous!
  • Dozens of library jars in classpath
  • Even building class hierarchy had to be optimized
• Often based on complex frameworks
  • No main() method in application
  • Framework calls into app based on XML config
  • Hard to analyze!
Technique: hybrid thin slicing
The real world
• Many heuristics needed for huge programs
  • Cutoffs, timeouts, etc.
  • Hard to build robustly
• Frameworks led to many missed issues
• Tools sell by doing well in "proof-of-concept"
  • Sales engineer tries tool on customer code
  • Find some bugs, don't miss obvious ones
  • "Verifying" the code not relevant
• So, design analysis to work well in PoCs
Eventual approach
• No whole-program pointer analysis!
  • Use simple class hierarchy + local tracking of data flow and aliasing
• Built a sophisticated tool F4F to make modeling of frameworks easier
• Successful product (still sold today)
Shifting gears: JavaScript
• ~2011, heard from Julian Dolby about challenges in analyzing JavaScript web apps
• We started studying challenges in core static analysis for JavaScript
• Forward looking: no particular customer
Reflective code
• Java: x.f = y
  • If no f field, compile error
• JavaScript: x[e] = y
  • Computed field name
  • If field doesn’t exist, create it!
• Disaster for pointer analysis
Used in real world 😐
var e = "blur,focus,load".split(",");
// e is ["blur","focus","load"]
for (var i = 0; i < e.length; i++) {
  var o = e[i];
  jQuery.fn[o] = function() { … };
  jQuery.fn["un"+o] = function() { … };
  jQuery.fn["one"+o] = function() { … };
}
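To see why computed field names hurt a static analysis, it can help to restate the loop above in Java terms. The sketch below is only an analogy and rests on one assumption: a `Map<String, Runnable>` stands in for the JavaScript object `jQuery.fn`, and the class name `ComputedPropertyDemo` is invented. The point is that every key is built from strings at run time, so an analysis that does not track those string values has to assume each write may hit any key.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical Java analogy for JavaScript's x[e] = y with computed names.
final class ComputedPropertyDemo {
    // Builds a handler table whose keys are computed inside the loop.
    static Map<String, Runnable> buildFn() {
        Map<String, Runnable> fn = new LinkedHashMap<>(); // stands in for jQuery.fn
        for (String o : "blur,focus,load".split(",")) {
            fn.put(o, () -> { });         // like jQuery.fn[o] = ...
            fn.put("un" + o, () -> { });  // like jQuery.fn["un"+o] = ...
            fn.put("one" + o, () -> { }); // like jQuery.fn["one"+o] = ...
        }
        return fn;
    }

    public static void main(String[] args) {
        // prints [blur, unblur, oneblur, focus, unfocus, onefocus, load, unload, oneload]
        System.out.println(buildFn().keySet());
    }
}
```

In Java the map makes the dynamism explicit. In JavaScript the same thing happens on ordinary objects, which is why field-based reasoning breaks down there.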
Techniques
• Improvements to traditional pointer analysis
  • Correlation tracking
  • Dynamic determinacy analysis
• Scalable, approximate (unsound) call graphs
• Used to pay the bills! AppScan JS analysis
Samsung Research
2013-2016
Shifting gears: Performance
• A push for running JS / web apps on devices
• Performance was lagging behind
  • Dynamic features hard to optimize
  • Hard to run JIT, GC with resource constraints
• Goal: ahead-of-time compilation for JS
Target programs: web apps
• Often, not much client code
  • Many had < 10,000 LoC
• But, relied on complex frameworks
• And web browser itself is complex!
• When optimizing, can’t be unsound
Technique: type inference
• Defined a type system for a rich subset of JS
  • Enabled efficient code generation
• Defined a (global) type inference algorithm, to handle extant code
• Built a full compiler backend (compiled to C)
Uber
2017-2018
Uber Apps
App reliability: critical
Rider app crash: can’t get home
Driver app crash: can’t earn
Using apps involves payments
Apps take significant time to patch
App reliability: crucial
Target programs: our own code
• We can rewrite code to work well with tool!
• Goal: integrate with development process
  • Fast, modular analysis
  • Understandable error messages
• Can require reasonable annotation burden
Technique: Pluggable Types
• Idea: add extra type checking to compiler
  • Leveraging source code annotations
  • Pioneered by Checker Framework
• NullAway: fast, practical NPE prevention
  • Engineered for speed
  • Soundness tradeoffs to reduce annotations
  • Runs on every Android build at Uber
  • https://github.com/uber/NullAway, https://arxiv.org/abs/1907.02127
Example: Nullability
static void log(Object x) {
  System.out.println(x.toString());
}
static void foo() {
  log(null);
}
Error: cannot pass null to @NonNull parameter x
Example: Nullability
static void log(@Nullable Object x) {
  System.out.println(x.toString());
}
static void foo() {
  log(null);
}
Error: de-referencing x may yield NPE
Example: Nullability
static void log(@Nullable Object x) {
  if (x == null) return;
  System.out.println(x.toString());
}
static void foo() {
  log(null);
}
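The three slides above can be condensed into one self-contained Java file that compiles without any external library. This is a sketch with stated assumptions: the `@Nullable` annotation is declared locally as a stand-in for whichever annotation a real checker recognizes, and the class `NullabilityDemo` and method `describe` are invented (the method returns a String instead of printing, so the behavior is easy to check).

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Target;

// Hypothetical sketch of the final, accepted version of the example.
final class NullabilityDemo {
    // Local stand-in for a real @Nullable annotation (an assumption).
    @Target(ElementType.PARAMETER)
    @interface Nullable { }

    // x is declared nullable, so callers may pass null; the null check
    // before the dereference is what makes x.toString() acceptable.
    static String describe(@Nullable Object x) {
        if (x == null) {
            return "<null>";
        }
        return x.toString();
    }

    public static void main(String[] args) {
        System.out.println(describe(null)); // prints "<null>"
        System.out.println(describe(42));   // prints "42"
    }
}
```

Without the annotation, a checker in this style treats the parameter as non-null and reports the call site. With the annotation but no null check, it reports the dereference instead.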
Other Uber projects
• Type-based detection of multithreading bugs
  • Specialized to stream APIs
  • Also running on every Android build
• Optimization of Swift protocols
  • 12% speedup in app startup
  • Upstreamed to Apple
• Auto-deletion of stale feature flags
  • Stale flags hurt reliability, code readability
  • Hundreds of flags removed
UC Riverside
2019-present
What’s next?
• I still like to pay the bills!
  • Make time for open-source hacking
  • Polish tools, within reason
• Invest in teaching / advising
• Research: greater risk / time horizon
• Find good collaborators!
  • Students, faculty (Riverside / elsewhere), industry
Tips
• Be aware of “adjacent” areas
  • Read broadly (papers, Hacker News, …)
  • Make time for it
• Networking and visibility matter!
  • Talk to people
  • Do open source, give talks (with video)
• Value in different perspectives on a problem
  • The “right”, academic solution
  • Dealing with existing code as-is
  • Working with developers
Thanks!