![Page 1: DEV441 Writing Faster Managed Code Jonathan Hawkins Lead Program Manager Common Language Runtime](https://reader036.vdocuments.us/reader036/viewer/2022070415/5697bf981a28abf838c9131d/html5/thumbnails/1.jpg)
DEV441
Writing Faster Managed Code
Jonathan Hawkins
Lead Program Manager
Common Language Runtime
![Page 2: DEV441 Writing Faster Managed Code Jonathan Hawkins Lead Program Manager Common Language Runtime](https://reader036.vdocuments.us/reader036/viewer/2022070415/5697bf981a28abf838c9131d/html5/thumbnails/2.jpg)
Outline
Introduction and design patterns
Managed code performance issues
Cost model
Tools
Wrap-up
![Page 3: DEV441 Writing Faster Managed Code Jonathan Hawkins Lead Program Manager Common Language Runtime](https://reader036.vdocuments.us/reader036/viewer/2022070415/5697bf981a28abf838c9131d/html5/thumbnails/3.jpg)
Slow Software is Bad: Don’t Ship It
Symptoms:
Locked UI – splash screen, wait cursor
Bad citizenship – paging, CPU utilization
Scalability – server farms
Ultimate causes:
Inattentive engineering
Bad design – bad architecture, interfaces, data structures, algorithms
Waste not
Premature optimization...
![Page 4: DEV441 Writing Faster Managed Code Jonathan Hawkins Lead Program Manager Common Language Runtime](https://reader036.vdocuments.us/reader036/viewer/2022070415/5697bf981a28abf838c9131d/html5/thumbnails/4.jpg)
Design Patterns: Faster Code, Smaller Data
Measure it – time and space
Speedup techniques: cache, batch, precompute, defer
Smarter recalc: incremental, progressive, background
Smaller data:
Don’t hoard; size appropriately
Arrays vs. links; frugal interfaces
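As a rough illustration of "cache, precompute, defer", here is a minimal C# sketch. It is not code from the talk: `SpeedupSketch`, `SlowSquare` and the call counter are hypothetical names, the "expensive" computation is a stand-in, and `Dictionary<,>` and `Lazy<T>` are from later framework versions than the .NET 1.1 the deck discusses.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical example: cache results of an expensive call, and defer
// building a precomputed table until something actually reads it.
public static class SpeedupSketch
{
    public static int SlowCalls;   // counts how often the expensive path runs

    // Not thread-safe; a real cache shared across threads needs synchronization.
    private static readonly Dictionary<int, long> cache = new Dictionary<int, long>();

    // Stand-in for an expensive computation (an assumption for illustration).
    private static long SlowSquare(int n)
    {
        SlowCalls++;
        return (long)n * n;
    }

    // Cache: compute each answer once, then serve repeats from the dictionary.
    public static long CachedSquare(int n)
    {
        long result;
        if (!cache.TryGetValue(n, out result))
        {
            result = SlowSquare(n);
            cache[n] = result;
        }
        return result;
    }

    // Defer: BuildTable runs only on the first access to Table.Value.
    public static readonly Lazy<long[]> Table = new Lazy<long[]>(BuildTable);

    // Precompute: after this runs once, lookups are plain array loads.
    private static long[] BuildTable()
    {
        long[] squares = new long[256];
        for (int i = 0; i < squares.Length; i++) squares[i] = (long)i * i;
        return squares;
    }

    public static void Main()
    {
        CachedSquare(12);
        CachedSquare(12);                          // served from the cache
        Console.WriteLine(SlowCalls);              // expensive calls made so far
        Console.WriteLine(Table.IsValueCreated);   // table not built until read
        Console.WriteLine(Table.Value[12]);
    }
}
```

Whether a cache pays off depends on hit rate and on how long it keeps objects alive, which is why the slide puts "measure it" first.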
![Page 5: DEV441 Writing Faster Managed Code Jonathan Hawkins Lead Program Manager Common Language Runtime](https://reader036.vdocuments.us/reader036/viewer/2022070415/5697bf981a28abf838c9131d/html5/thumbnails/5.jpg)
Performance Anti-Patterns: Think (Twice)
Waiting on remote data
XML
Excessive OOP
Ignorance and Apathy:
Not measuring
Not setting performance goals
![Page 6: DEV441 Writing Faster Managed Code Jonathan Hawkins Lead Program Manager Common Language Runtime](https://reader036.vdocuments.us/reader036/viewer/2022070415/5697bf981a28abf838c9131d/html5/thumbnails/6.jpg)
Perf Process Patterns: “That which gets measured gets done”
Perf budgets, goals, requirements
Perf unit tests, regression tests
Process of “constant” improvement: measuring, tracking, refining, trend lines
Perf culture:
Users: perf as a key feature
Devs: perf as a correctness issue
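One way to read "perf unit tests" is a check that fails when an operation exceeds its budget. The sketch below is an assumption about how that could look, not a practice described in the talk: `PerfBudget`, `SortSample` and the 500 ms budget are hypothetical, and `Stopwatch` arrived after .NET 1.1.

```csharp
using System;
using System.Diagnostics;

public static class PerfBudget
{
    // Runs the action several times and returns the best elapsed time in ms.
    // Taking the best run reduces noise from JIT compilation and other processes.
    public static double BestMilliseconds(Action action, int runs)
    {
        double best = double.MaxValue;
        for (int i = 0; i < runs; i++)
        {
            Stopwatch timer = Stopwatch.StartNew();
            action();
            timer.Stop();
            best = Math.Min(best, timer.Elapsed.TotalMilliseconds);
        }
        return best;
    }

    public static bool WithinBudget(Action action, double budgetMs, int runs)
    {
        return BestMilliseconds(action, runs) <= budgetMs;
    }

    // Hypothetical workload standing in for the feature under test.
    public static void SortSample()
    {
        Random random = new Random(1);   // fixed seed keeps runs comparable
        int[] data = new int[100000];
        for (int i = 0; i < data.Length; i++) data[i] = random.Next();
        Array.Sort(data);
    }

    public static void Main()
    {
        bool ok = WithinBudget(SortSample, 500.0, 5);   // 500 ms is an example budget
        Console.WriteLine(ok ? "within budget" : "BUDGET EXCEEDED");
        if (!ok) Environment.Exit(1);   // non-zero exit fails a build step
    }
}
```

Logging the measured time on every run, not just the pass/fail result, gives the trend line the slide asks for.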
![Page 7: DEV441 Writing Faster Managed Code Jonathan Hawkins Lead Program Manager Common Language Runtime](https://reader036.vdocuments.us/reader036/viewer/2022070415/5697bf981a28abf838c9131d/html5/thumbnails/7.jpg)
Outline
Introduction and design patterns
Managed code performance issues
Cost model
Tools
Wrap-up
![Page 8: DEV441 Writing Faster Managed Code Jonathan Hawkins Lead Program Manager Common Language Runtime](https://reader036.vdocuments.us/reader036/viewer/2022070415/5697bf981a28abf838c9131d/html5/thumbnails/8.jpg)
Why Managed Code?
Programmer productivity:
Goodbye, corrupt heap debugging
Target modern requirements
FX: ++clean, ++consistent, ++streamlined
Better apps sooner
Performance barrier to adoption?
Real – improves with each release
Perceived – “blame it on managed code”
Reality – it’s “pedal to the metal”
![Page 9: DEV441 Writing Faster Managed Code Jonathan Hawkins Lead Program Manager Common Language Runtime](https://reader036.vdocuments.us/reader036/viewer/2022070415/5697bf981a28abf838c9131d/html5/thumbnails/9.jpg)
The Challenge of Writing Fast Managed Code
We’re all newbies!
Learning how
May not be learning how much things cost
Everything is easier...
The Knowledge: Ildasm, debuggers, CLR Profiler, profilers, timing, vadump, events, Rotor
![Page 10: DEV441 Writing Faster Managed Code Jonathan Hawkins Lead Program Manager Common Language Runtime](https://reader036.vdocuments.us/reader036/viewer/2022070415/5697bf981a28abf838c9131d/html5/thumbnails/10.jpg)
Managed Code: Close to the Machine
Not your father’s bytecode interpreter: Source → IL → native (JIT or NGEN)
Optimizing JIT compiler:
Constant folding; constant and copy propagation; common subexpression elimination; code motion of loop invariants; dead store and dead code elimination; register allocation; method inlining; loop unrolling (small loops/small bodies)
.NET Framework 1.1 NGEN does same opts
Disabled when debugging
![Page 11: DEV441 Writing Faster Managed Code Jonathan Hawkins Lead Program Manager Common Language Runtime](https://reader036.vdocuments.us/reader036/viewer/2022070415/5697bf981a28abf838c9131d/html5/thumbnails/11.jpg)
Managed Data: Automatic Storage Management
Fast new; fast garbage collection:
GC traces and compacts reachable object graph
More than 50 million objects per second
Generational GC heaps:
Gen0 – new objects – cache sized; fast GC
Gen1 – objects that survived a GC of gen0
Gen2 – objects that survived a GC of gen1 or gen2
Large object heap
Server GC: cache affinitive; concurrent; ASP.NET/hosted
Managed data costs space & time over its lifetime
![Page 12: DEV441 Writing Faster Managed Code Jonathan Hawkins Lead Program Manager Common Language Runtime](https://reader036.vdocuments.us/reader036/viewer/2022070415/5697bf981a28abf838c9131d/html5/thumbnails/12.jpg)
Managed Data: Best Practices
Often performance == allocation profile:
Short-lived objects are cheap (not free)
Try not to churn old objects
Inspect with CLR Profiler
GC “gotchas”:
Keeping refs to “dead” object graphs
Caches; weak references
Pinning
Boxing
Finalization ...
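Boxing is the easiest of these gotchas to show in code. The following is a small sketch under stated assumptions: `BoxingSketch` is a hypothetical name, and `List<int>` uses generics, which were not available in the .NET 1.1 timeframe of the talk.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

public static class BoxingSketch
{
    // Each conversion of an int to object allocates a new heap object,
    // so two boxes of the same value are two different references.
    public static bool BoxesAreSeparateObjects()
    {
        int value = 42;
        object first = value;    // box
        object second = value;   // another box
        return !object.ReferenceEquals(first, second);
    }

    // ArrayList stores object, so every Add boxes and every read unboxes.
    public static long SumBoxed(int count)
    {
        ArrayList items = new ArrayList();
        for (int i = 0; i < count; i++) items.Add(i);
        long sum = 0;
        foreach (object item in items) sum += (int)item;
        return sum;
    }

    // List<int> keeps the ints inline in its backing array: no per-item allocation.
    public static long SumUnboxed(int count)
    {
        List<int> items = new List<int>(count);
        for (int i = 0; i < count; i++) items.Add(i);
        long sum = 0;
        foreach (int item in items) sum += item;
        return sum;
    }

    public static void Main()
    {
        Console.WriteLine(BoxesAreSeparateObjects());
        Console.WriteLine(SumBoxed(1000) == SumUnboxed(1000));
    }
}
```

Both sums are identical; the difference is the allocation profile, which is what CLR Profiler would show.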
![Page 13: DEV441 Writing Faster Managed Code Jonathan Hawkins Lead Program Manager Common Language Runtime](https://reader036.vdocuments.us/reader036/viewer/2022070415/5697bf981a28abf838c9131d/html5/thumbnails/13.jpg)
Managed Data: Finalization and the Dispose Pattern
Finalization: ~C(): non-deterministic resource cleanup
GC; object unref’d; promote; queue finalizer
Costs – retains object and its objects; finalizer thread; bookkeeping; call
Use rarely; use Dispose Pattern:
Implement IDisposable
Call GC.SuppressFinalize
Hold few obj fields and null them out ASAP
Dispose early; try/finally; C# using
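The bullets above map onto a short class. This is a minimal sketch, not the talk's own code: `NativeBuffer` and `DisposeSketch` are hypothetical names, and the unmanaged allocation is only there to give the finalizer something to release.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical wrapper around a block of unmanaged memory.
public class NativeBuffer : IDisposable
{
    private IntPtr memory;

    public NativeBuffer(int bytes)
    {
        memory = Marshal.AllocHGlobal(bytes);
    }

    public bool IsDisposed
    {
        get { return memory == IntPtr.Zero; }
    }

    // Deterministic cleanup: release now and tell the GC that the
    // finalizer no longer needs to run for this object.
    public void Dispose()
    {
        Release();
        GC.SuppressFinalize(this);
    }

    // Safety net only; runs on the finalizer thread if Dispose was never called.
    ~NativeBuffer()
    {
        Release();
    }

    private void Release()
    {
        if (memory != IntPtr.Zero)
        {
            Marshal.FreeHGlobal(memory);
            memory = IntPtr.Zero;   // makes a second call harmless
        }
    }
}

public static class DisposeSketch
{
    public static bool Run()
    {
        NativeBuffer buffer = new NativeBuffer(64);
        using (buffer)
        {
            // work with the buffer; using expands to try/finally + Dispose
        }
        return buffer.IsDisposed;
    }

    public static void Main()
    {
        Console.WriteLine(Run());
    }
}
```

The class holds a single field and no references to other managed objects, in line with "hold few obj fields": anything a finalizable object references is kept alive until its finalizer has run.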
![Page 14: DEV441 Writing Faster Managed Code Jonathan Hawkins Lead Program Manager Common Language Runtime](https://reader036.vdocuments.us/reader036/viewer/2022070415/5697bf981a28abf838c9131d/html5/thumbnails/14.jpg)
Managed Code: Threading and Synchronization
Use the ThreadPool:
Easy, self tuning, good citizen
QueueUserWorkItem, BeginInvoke
lock() – not cheap:
Granularity trade-off – concurrency vs. cost
Scales much better in .NET 1.1
Consider Interlocked.Exchange, ReaderWriterLock
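A small sketch of the two suggestions together: work queued to the ThreadPool, with a shared counter updated through `Interlocked` instead of `lock()`. The names are hypothetical, the example uses `Interlocked.Increment` where the slide names `Interlocked.Exchange`, and `CountdownEvent` and lambdas are newer than .NET 1.1.

```csharp
using System;
using System.Threading;

public static class PoolSketch
{
    // Queues workItems callbacks; each bumps a shared counter incrementsPerItem times.
    public static int CountWithPool(int workItems, int incrementsPerItem)
    {
        if (workItems <= 0) return 0;

        int total = 0;
        using (CountdownEvent done = new CountdownEvent(workItems))
        {
            for (int i = 0; i < workItems; i++)
            {
                ThreadPool.QueueUserWorkItem(state =>
                {
                    for (int j = 0; j < incrementsPerItem; j++)
                    {
                        // Atomic read-modify-write; no lock object needed.
                        Interlocked.Increment(ref total);
                    }
                    done.Signal();   // this work item is finished
                });
            }
            done.Wait();   // block until every work item has signalled
        }
        return total;
    }

    public static void Main()
    {
        Console.WriteLine(CountWithPool(8, 100000));
    }
}
```

With a plain `total++` in place of `Interlocked.Increment`, updates could be lost under contention. `Interlocked` suits single-variable updates; invariants that span several fields still need a lock.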
![Page 15: DEV441 Writing Faster Managed Code Jonathan Hawkins Lead Program Manager Common Language Runtime](https://reader036.vdocuments.us/reader036/viewer/2022070415/5697bf981a28abf838c9131d/html5/thumbnails/15.jpg)
Managed Code: Reflection
Slower and larger than direct use
Prefer is/as to typeof() ==
Member lookup/enum slow but cached
Reflective invoke is quite slow:
Lookup, overload res., security, stack frame
Activator.CreateInstance too
Prefer MethodInfo.Invoke to Type.InvokeMember
Beware of code that uses reflection:
Late binding in VB.NET; use Option Explicit On and Option Strict On
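To make the lookup and invoke costs concrete, the sketch below resolves a `MethodInfo` once and then calls it two ways. `ReflectionSketch` and `Twice` are hypothetical names. Binding a delegate to the method is an additional technique not mentioned on the slide, and `Func<,>` postdates .NET 1.1.

```csharp
using System;
using System.Reflection;

public static class ReflectionSketch
{
    public static int Twice(int x) { return 2 * x; }

    // Look the member up once and keep it; the lookup is the part worth caching.
    private static readonly MethodInfo TwiceInfo =
        typeof(ReflectionSketch).GetMethod("Twice", BindingFlags.Public | BindingFlags.Static);

    // Late-bound call: allocates the argument array, boxes the int argument
    // and result, and checks arguments on every call.
    public static int CallReflectively(int x)
    {
        return (int)TwiceInfo.Invoke(null, new object[] { x });
    }

    // Bind a strongly typed delegate once; later calls skip the reflective path.
    private static readonly Func<int, int> TwiceDelegate =
        (Func<int, int>)Delegate.CreateDelegate(typeof(Func<int, int>), TwiceInfo);

    public static int CallViaDelegate(int x)
    {
        return TwiceDelegate(x);
    }

    public static void Main()
    {
        Console.WriteLine(CallReflectively(21));
        Console.WriteLine(CallViaDelegate(21));
    }
}
```

Both calls return the same value; timing them in a loop is the way to see how large the gap is on a given runtime.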
![Page 16: DEV441 Writing Faster Managed Code Jonathan Hawkins Lead Program Manager Common Language Runtime](https://reader036.vdocuments.us/reader036/viewer/2022070415/5697bf981a28abf838c9131d/html5/thumbnails/16.jpg)
Managed Code: P/Invoke and COM Interop
Efficient, but frequent calls add up
Costs depend on marshaling:
Primitive types and arrays of same: cheap
Others, not; e.g. Unicode to ANSI strings
COM interop – learn threading models:
Avoid STA threaded components
Avoid calling or being callable via IDispatch
Mitigate interop call costs: chunky interfaces; move to managed code
![Page 17: DEV441 Writing Faster Managed Code Jonathan Hawkins Lead Program Manager Common Language Runtime](https://reader036.vdocuments.us/reader036/viewer/2022070415/5697bf981a28abf838c9131d/html5/thumbnails/17.jpg)
Outline
Introduction and design patterns
Managed code performance issues
Cost model
Tools
Wrap-up
![Page 18: DEV441 Writing Faster Managed Code Jonathan Hawkins Lead Program Manager Common Language Runtime](https://reader036.vdocuments.us/reader036/viewer/2022070415/5697bf981a28abf838c9131d/html5/thumbnails/18.jpg)
C/C++ Cost Models: The Gut-Feel Cost of a Line of Code
C – close to the machine:
WYWIWYG; int = * + call → instructions
C++ (OOP):
C features: same cost
New features: additional, hidden costs
Ctors; SI, MI, VI; virtual; PMs; EH; RTTI
What does a function cost?
![Page 19: DEV441 Writing Faster Managed Code Jonathan Hawkins Lead Program Manager Common Language Runtime](https://reader036.vdocuments.us/reader036/viewer/2022070415/5697bf981a28abf838c9131d/html5/thumbnails/19.jpg)
Towards a Managed Code Cost Model
Optimized native code:
C features: similar cost?
OOP features: similar cost?
Let’s measure it:
Simple timing loops, unrolled some
Modified to prevent CSE/dead code elim.
50 ms each (2^18 to 2^30 iterations)
Measured on 1.1 GHz P-III laptop, Win XP
Disclaimers: uncertainty, subj. to change
![Page 20: DEV441 Writing Faster Managed Code Jonathan Hawkins Lead Program Manager Common Language Runtime](https://reader036.vdocuments.us/reader036/viewer/2022070415/5697bf981a28abf838c9131d/html5/thumbnails/20.jpg)
Costs: Math

| Avg (ns) | Primitive | Avg (ns) | Primitive |
|---|---|---|---|
| 1.0 | int add | 1.3 | float add |
| 1.0 | int sub | 1.4 | float sub |
| 2.7 | int mul | 2.0 | float mul |
| 35.9 | int div | 27.7 | float div |
| 2.1 | int shift | | |
| 2.1 | long add | 1.5 | double add |
| 2.1 | long sub | 1.5 | double sub |
| 34.2 | long mul | 2.1 | double mul |
| 50.1 | long div | 27.7 | double div |
| 5.1 | long shift | | |

Nicely optimized and run at full native code speed
![Page 21: DEV441 Writing Faster Managed Code Jonathan Hawkins Lead Program Manager Common Language Runtime](https://reader036.vdocuments.us/reader036/viewer/2022070415/5697bf981a28abf838c9131d/html5/thumbnails/21.jpg)
Costs: Method Calls
Inlining – !virtual, small, simple, no try
Instance method call-site null this check
Virtual – like C++: (*this->MT[m])(…)
Interface – quadruple indirect: (*this->MT->itfmap[i]->MT[m])(…)
Disclaimers: !inlining, branch prediction, arguments

| Avg (ns) | Primitive | Avg (ns) | Primitive |
|---|---|---|---|
| 0.2 | inlined static call | 5.4 | virtual call |
| 6.1 | static call | 6.6 | interface call |
| 1.1 | inlined instance call | | |
| 6.8 | instance call | | |
![Page 22: DEV441 Writing Faster Managed Code Jonathan Hawkins Lead Program Manager Common Language Runtime](https://reader036.vdocuments.us/reader036/viewer/2022070415/5697bf981a28abf838c9131d/html5/thumbnails/22.jpg)
Costs: Construction
class A { int a; }      // L1
class B : A { int b; }  // L2
class C : B { int c; }  // L3 etc.
Allocation / management / GC cost
Value types: “0”
Ref types: fast but ~proportional to size
Construction cost
All fields 0-initialized
Small ctors can be inlined
Larger ctors incur up to 1 call/level

| Avg (ns) | Primitive | Avg (ns) | Primitive |
|---|---|---|---|
| 2.6 | new valtype L1 | 22.9 | new rt ctor L1 |
| 6.4 | new valtype L3 | 32.7 | new rt ctor L3 |
| 22.0 | new reftype L1 | 28.6 | new rt no-inl L1 |
| 30.2 | new reftype L3 | 50.6 | new rt no-inl L3 |
![Page 23: DEV441 Writing Faster Managed Code Jonathan Hawkins Lead Program Manager Common Language Runtime](https://reader036.vdocuments.us/reader036/viewer/2022070415/5697bf981a28abf838c9131d/html5/thumbnails/23.jpg)
Costs: Casts and IsInsts
Safe, secure, verifiable → type safety
Cast may throw exception
IsInst will not – is and as operators
Up casts always safe and free
Down casts incur a helper function call

| Avg (ns) | Primitive | Avg (ns) | Primitive |
|---|---|---|---|
| 0.4 | cast up 1 | 0.8 | isinst up 1 |
| 0.3 | cast down 0 | 0.8 | isinst down 0 |
| 8.9 | cast down 1 | 6.3 | isinst down 1 |
| 9.8 | cast (up 2) down 1 | 10.7 | isinst (up 2) down 1 |
| 8.7 | cast down 3 | 6.1 | isinst down 3 |
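The difference between a cast and `as` can be shown in a few lines. This is an illustrative sketch; the `Animal`, `Dog` and `Fish` types and the helper names are hypothetical.

```csharp
using System;

public class Animal { }
public class Dog : Animal { public int Legs = 4; }
public class Fish : Animal { }

public static class CastSketch
{
    // "as" compiles to isinst: yields null on a type mismatch, never throws.
    public static bool TryCountLegs(Animal animal, out int legs)
    {
        Dog dog = animal as Dog;
        if (dog != null)
        {
            legs = dog.Legs;
            return true;
        }
        legs = 0;
        return false;
    }

    // A down cast compiles to castclass: throws InvalidCastException on mismatch.
    public static bool DownCastThrows(Animal animal)
    {
        try
        {
            Dog dog = (Dog)animal;
            return false;
        }
        catch (InvalidCastException)
        {
            return true;
        }
    }

    // An up cast needs no runtime check at all.
    public static Animal UpCast(Dog dog)
    {
        return dog;
    }

    public static void Main()
    {
        int legs;
        Console.WriteLine(TryCountLegs(new Dog(), out legs));
        Console.WriteLine(TryCountLegs(new Fish(), out legs));
        Console.WriteLine(DownCastThrows(new Fish()));
    }
}
```

When a mismatch is an expected outcome, `as` plus a null test avoids paying for exception handling on top of the type check.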
![Page 24: DEV441 Writing Faster Managed Code Jonathan Hawkins Lead Program Manager Common Language Runtime](https://reader036.vdocuments.us/reader036/viewer/2022070415/5697bf981a28abf838c9131d/html5/thumbnails/24.jpg)
Costs: Write Barriers
Gen0 GC: trace roots and gen0 only?
Could miss gen0 refs from gen1/gen2
Write barrier notes obj ref field stores:
contact.address = newAddress;
Tracks refs to newer gen objects
Not needed for locals, non-objects
Incurs a helper function call

| Avg (ns) | Primitive |
|---|---|
| 6.4 | write barrier |
![Page 25: DEV441 Writing Faster Managed Code Jonathan Hawkins Lead Program Manager Common Language Runtime](https://reader036.vdocuments.us/reader036/viewer/2022070415/5697bf981a28abf838c9131d/html5/thumbnails/25.jpg)
Costs: Array Bounds Checks

| Avg (ns) | Min (ns) | Primitive |
|---|---|---|
| 1.9 | 1.9 | load int array elem |
| 1.9 | 1.9 | store int array elem |
| 2.5 | 2.5 | load obj array elem |
| 16.0 | 16.0 | store obj array elem |

For productivity and runtime integrity
Checks index against array.Length
Inlined, optimized – inexpensive
Range check elimination:
for (i=0; i < a.Length; i++)…a[i]…
Helper call for store object array elt.:
Bounds check, type check, write barrier
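Two points from this slide can be seen in code: the loop shape that lets the JIT drop per-element range checks, and the type check performed on stores into object arrays. This is a sketch with hypothetical names; whether a particular JIT removes the check in a given loop is not guaranteed.

```csharp
using System;

public static class ArraySketch
{
    // The loop bound is a.Length itself, the pattern the slide cites for
    // range check elimination: i is provably within bounds in the body.
    public static int Sum(int[] a)
    {
        int sum = 0;
        for (int i = 0; i < a.Length; i++)
        {
            sum += a[i];
        }
        return sum;
    }

    // Arrays of reference types are covariant, so a string[] can be viewed
    // as object[]. Every store must therefore check the element type at run time.
    public static bool StoreTypeCheckFires()
    {
        object[] slots = new string[1];
        try
        {
            slots[0] = new object();   // not a string: rejected at run time
            return false;
        }
        catch (ArrayTypeMismatchException)
        {
            return true;
        }
    }

    public static void Main()
    {
        Console.WriteLine(Sum(new int[] { 1, 2, 3 }));
        Console.WriteLine(StoreTypeCheckFires());
    }
}
```

The store into `slots` is the case the table lists at 16 ns: bounds check, type check and write barrier all happen in the helper call.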
![Page 26: DEV441 Writing Faster Managed Code Jonathan Hawkins Lead Program Manager Common Language Runtime](https://reader036.vdocuments.us/reader036/viewer/2022070415/5697bf981a28abf838c9131d/html5/thumbnails/26.jpg)
Summary: A Managed Code Cost Model
Like C/C++, close to the machine:
~1 ns: int, float = * + - * ==
~6 ns: calls (perfectly predicted)
Unlike C++:
~20-40 ns: new small obj + gen0 GC, box
~6-8 ns: casts, write barriers
~16 ns: object[] stores
Reflection? Think 100 times slower
![Page 27: DEV441 Writing Faster Managed Code Jonathan Hawkins Lead Program Manager Common Language Runtime](https://reader036.vdocuments.us/reader036/viewer/2022070415/5697bf981a28abf838c9131d/html5/thumbnails/27.jpg)
(Get Real) Consider Computer Architecture
Cache misses, page faults
1983: 1 MIPS; 300 ns DRAM; 25 ms disk
2003: 10 BOPS; 100 ns DRAM; 10 ms disk
“Branch-predicting out-of-order superscalar trace-cache RISC w/ 3L data caches”
Issue 10,000 ops in 1 μs – or 10 DRAM reads
Full cache miss – 1,000 ops
Page fault – 100 M ops!
100 ns full cache miss >> any CLR op’n
Locality of reference matters
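The slide's figures follow from its 2003 numbers: at 10 billion ops per second, a 100 ns DRAM access costs about 1,000 ops and a 10 ms disk access about 100 million. A quick way to feel the effect is to sum the same array in two traversal orders. The sketch below uses hypothetical names and a 64 MB array chosen to exceed typical cache sizes; `Stopwatch` postdates .NET 1.1, and the actual timings depend on the machine.

```csharp
using System;
using System.Diagnostics;

public static class LocalitySketch
{
    // Walks memory sequentially: each cache line fetched is fully used.
    public static long SumRowMajor(int[] grid, int size)
    {
        long sum = 0;
        for (int row = 0; row < size; row++)
            for (int col = 0; col < size; col++)
                sum += grid[row * size + col];
        return sum;
    }

    // Same elements, but consecutive reads are size ints apart,
    // so on a large grid most reads touch a different cache line.
    public static long SumColumnMajor(int[] grid, int size)
    {
        long sum = 0;
        for (int col = 0; col < size; col++)
            for (int row = 0; row < size; row++)
                sum += grid[row * size + col];
        return sum;
    }

    public static void Main()
    {
        const int size = 4096;                 // 4096 * 4096 ints = 64 MB
        int[] grid = new int[size * size];
        for (int i = 0; i < grid.Length; i++) grid[i] = i & 7;

        Stopwatch timer = Stopwatch.StartNew();
        long a = SumRowMajor(grid, size);
        timer.Stop();
        Console.WriteLine("row-major:    {0} ms", timer.ElapsedMilliseconds);

        timer = Stopwatch.StartNew();
        long b = SumColumnMajor(grid, size);
        timer.Stop();
        Console.WriteLine("column-major: {0} ms", timer.ElapsedMilliseconds);

        Console.WriteLine(a == b);             // same work, different access order
    }
}
```

Both methods execute the same number of adds and array loads; any gap between the two timings comes from the memory system rather than from CLR operations.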
![Page 28: DEV441 Writing Faster Managed Code Jonathan Hawkins Lead Program Manager Common Language Runtime](https://reader036.vdocuments.us/reader036/viewer/2022070415/5697bf981a28abf838c9131d/html5/thumbnails/28.jpg)
Outline
Introduction and design patterns
Managed code performance issues
Cost model
Tools
Wrap-up
![Page 29: DEV441 Writing Faster Managed Code Jonathan Hawkins Lead Program Manager Common Language Runtime](https://reader036.vdocuments.us/reader036/viewer/2022070415/5697bf981a28abf838c9131d/html5/thumbnails/29.jpg)
Tools
Inspectors: Ildasm, debuggers – beware “debug mode”
Measurers: code profilers, perfmon, CLR Profiler, vadump
Simple timing loops:
[...InteropServices.DllImport("KERNEL32")]
private static extern bool QueryPerformanceCounter(ref long lpCount);
[...InteropServices.DllImport("KERNEL32")]
private static extern bool QueryPerformanceFrequency(ref long lpFreq);
Rotor
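A simple timing loop in the spirit of the slide can be sketched as follows. This version assumes `System.Diagnostics.Stopwatch`, which was added after .NET 1.1 and which on Windows is documented as being built on the same high-resolution performance counter, rather than the P/Invoke declarations above. `TimingLoop`, `Sink` and the loop body are hypothetical.

```csharp
using System;
using System.Diagnostics;

public static class TimingLoop
{
    // The result is stored in a static field so the JIT cannot treat the
    // loop body as dead code and remove it.
    public static int Sink;

    public static double NanosecondsPerIteration(int iterations)
    {
        int x = 1;
        Stopwatch timer = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            x = x * 3 + i;   // each iteration depends on the previous one
        }
        timer.Stop();
        Sink = x;

        // ElapsedTicks / Frequency gives seconds, mirroring the
        // counter / frequency pair in the Win32 API.
        double seconds = (double)timer.ElapsedTicks / Stopwatch.Frequency;
        return seconds * 1e9 / iterations;
    }

    public static void Main()
    {
        NanosecondsPerIteration(1 << 18);              // warm-up run so JIT time is excluded
        double ns = NanosecondsPerIteration(1 << 26);
        Console.WriteLine("{0:F2} ns per iteration (includes loop overhead)", ns);
    }
}
```

Build in release mode and run outside the debugger: as the slides note, JIT optimizations are disabled when debugging, so debug-mode numbers say little about shipped code.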
![Page 30: DEV441 Writing Faster Managed Code Jonathan Hawkins Lead Program Manager Common Language Runtime](https://reader036.vdocuments.us/reader036/viewer/2022070415/5697bf981a28abf838c9131d/html5/thumbnails/30.jpg)
Outline
Introduction
Managed code performance issues
Cost model
Tools
Wrap-up
![Page 31: DEV441 Writing Faster Managed Code Jonathan Hawkins Lead Program Manager Common Language Runtime](https://reader036.vdocuments.us/reader036/viewer/2022070415/5697bf981a28abf838c9131d/html5/thumbnails/31.jpg)
In Conclusion… The Secret to Faster Managed Code
“There is no magic faster pixie dust!”
Look in the mirror
You have the power and the responsibility
Mantra: Set goals, measure, understand the platform
![Page 32: DEV441 Writing Faster Managed Code Jonathan Hawkins Lead Program Manager Common Language Runtime](https://reader036.vdocuments.us/reader036/viewer/2022070415/5697bf981a28abf838c9131d/html5/thumbnails/32.jpg)
Resources
Managed code perf papers at MSDN .NET Developer Center [http://msdn.microsoft.com/netframework/]
“GC Basics and Performance Hints”
“Writing Faster Managed Code: Know What Things Cost”
CLR Profiler (same site)
SSCLI [http://msdn.microsoft.com/net/sscli]
Stutz et al., Shared Source CLI Essentials
![Page 33: DEV441 Writing Faster Managed Code Jonathan Hawkins Lead Program Manager Common Language Runtime](https://reader036.vdocuments.us/reader036/viewer/2022070415/5697bf981a28abf838c9131d/html5/thumbnails/33.jpg)
Community Resources
http://www.microsoft.com/communities/default.mspx
Most Valuable Professional (MVP): http://www.mvp.support.microsoft.com/
Newsgroups: converse online with Microsoft Newsgroups, including Worldwide
http://www.microsoft.com/communities/newsgroups/default.mspx
User Groups: meet and learn with your peers
http://www.microsoft.com/communities/usergroups/default.mspx
![Page 34: DEV441 Writing Faster Managed Code Jonathan Hawkins Lead Program Manager Common Language Runtime](https://reader036.vdocuments.us/reader036/viewer/2022070415/5697bf981a28abf838c9131d/html5/thumbnails/34.jpg)
Evaluations
![Page 35: DEV441 Writing Faster Managed Code Jonathan Hawkins Lead Program Manager Common Language Runtime](https://reader036.vdocuments.us/reader036/viewer/2022070415/5697bf981a28abf838c9131d/html5/thumbnails/35.jpg)
© 2003 Microsoft Corporation. All rights reserved.
This presentation is for informational purposes only. MICROSOFT MAKES NO WARRANTIES, EXPRESS OR IMPLIED, IN THIS SUMMARY.