JIT Compilation
Module Overview
Native vs. Managed Compilation
Managed Execution Phases
Assembly Loading & Initialization
JIT Compilation
JIT Optimizations
What’s new in NGEN 4.0?
When to use NGEN?
Running Code
Behavior in Windows 2000
- The legacy entry point mscoree!_CorExeMain is used
Behavior in Windows XP
- The operating system loader checks for managed modules by examining a bit in the common object file format (COFF) header
- The bit being set denotes a managed module
- If the loader detects a managed module, it loads mscoree.dll, and _CorValidateImage and _CorImageUnloading notify the loader when the managed module images are loaded and unloaded
_CorValidateImage performs the following:
- Ensures that the code is valid managed code
- Changes the entry point in the image to an entry point in the runtime
On 64-bit Windows, _CorValidateImage modifies the image that is in memory by transforming it from PE32 to PE32+
Native compile vs Managed compile
Simplified view of native code compilation
.CPP or .C file containing C or C++ code
-> Compile -> .OBJ file (machine language)
-> Link -> .EXE or .DLL file (machine language)
[Diagram labels: Native Code; Type describing information]
Native compile vs Managed compile
Simplified view of managed code compilation
.CS file containing C# code
-> Compile -> Assembly (.EXE or .DLL) containing MSIL and metadata
-> Execute -> Machine language generated in memory at runtime by the JIT compiler
[Diagram label: Managed Code]
Managed Execution Phases
Compile time: .NET source code -> Compile -> IL and metadata
Run time: IL and metadata -> JIT (CLR) -> Native code
JIT Compilation
What does JIT do?
Checks whether the method is being called for the first time
- If so, JIT-compiles the IL code to native code
Stores the native code in memory
Updates the MethodDescriptor field
- The reference is updated to point to the memory location of the native code
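The first-call behavior described on this slide can be observed from managed code. The following is a minimal sketch, not a benchmark: the class `JitFirstCallDemo` and its `Work` method are hypothetical names, and the measured ticks depend on the machine, the runtime version, and whether a native image exists for the assembly.

```csharp
using System;
using System.Diagnostics;
using System.Runtime.CompilerServices;

static class JitFirstCallDemo
{
    // NoInlining keeps Work as a separate method, so it gets its own JIT compilation.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static int Work(int x)
    {
        return x * 2 + 1;
    }

    // Times a single call to Work and returns the elapsed stopwatch ticks.
    static long TimeCall(int arg, out int result)
    {
        Stopwatch sw = Stopwatch.StartNew();
        result = Work(arg);
        sw.Stop();
        return sw.ElapsedTicks;
    }

    static void Main()
    {
        int r1, r2;
        long first = TimeCall(20, out r1);   // first call: includes JIT compilation of Work
        long second = TimeCall(20, out r2);  // later call: runs the already compiled native code
        Console.WriteLine("first call:  {0} ticks (result {1})", first, r1);
        Console.WriteLine("second call: {0} ticks (result {1})", second, r2);
    }
}
```

On a typical run the first call is expected to take noticeably longer than the second, but the exact numbers vary from run to run.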
Managed Execution
JIT Compilation
MyModule::Main
First call?
- Yes:
• CILJit::compileMethod (x86) / PreJit::compileMethod (x64) [verifies and compiles the IL]
• Produces native code, GCInfo, EH data, etc.
• Store native code in memory
• Store the address in MethodDesc
• Execute the native code
- No:
• Retrieve address of native code from MethodDesc
• Execute the native code
MethodDescriptor
Contains implementation of a managed method
Generated as part of the class loading procedure
Initially points to IL Code
Can be determined during debugging
- !SOS.DumpMD <MethodDesc address>
0:004> !dumpmd 009969a8
Method Name: MyApp.MainForm.menu_Click(System.Object, System.EventArgs)
Class: 00cd5c0c
MethodTable: 00996ad4
mdToken: 0600028d
Module: 00992c3c
IsJitted: no
CodeAddr: ffffffff
Transparency: Critical
Managed Execution
Assembly Loading and Initialization
1. Process starts and loads the .NET Framework
2. MSCorEE.dll is loaded
3. The process main thread starts executing by calling _CorExeMain
4. Initializes the CLR
5. Reads the metadata tables
6. Builds the in-memory representation (MethodTable & EEClass)
7. The ClassLoader is called
8. JIT-compiles the Main method
9. Executes Main
Anatomy of a managed non-value instance (very schematic)
[Diagram]
GC Heap
- Instance: MethodTable*, field layout
Loader Heap
- MethodTable: “Hot” metadata, VTables, EEClass*
- EEClass: Full (“Cold”) metadata, MethodDesc*
- MethodDesc: Pointer to
• PreJitted stub
• JITted code
• “IL” stub
* == Pointer
!DumpClass & DumpMT
JIT Optimizations
Summary
Types of Optimization:
- JIT Inlining
- JIT Tail Calls
JIT Optimizations
Tail Calls
When the last thing a function does is call another function
- With optimization, the callee
• will use the same stack space as the caller
• improves data locality, memory usage, and cache usage
static public void Main()
{
    Helper();
}
static public void Helper()
{
    One();
    Two();
    Three();
}
static public void Three()
{
    ...
}
- Calls without optimization: Call One(), Call Two(), Call Three()
- Calls with optimization: Call One(), Call Two(), Jump Three()
JIT Optimizations
Tail Calls
The tail call feature set differs between x86 and x64
- This can lead, for example, to a StackOverflowException in an x86 Debug build while the same code works fine on x64 (on x86 the stack is just large enough in a Release build with tail calls, but overflows without the optimization)
No tail call is possible if:
- The caller doesn't return immediately after the call
- Stack arguments between caller and callee are incompatible in a way that would require shifting things around in the caller's frame before the callee could execute
- Caller and callee return different types
- The call is inlined instead (inlining is much better than tail calling, and opens the door to many more optimizations)
- Security issues
- The debugger / profiler turned off JIT optimizations
- For the full list see: .NET 2.0 Tail limits and .NET 4.0 Tail Limits
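To make the first restriction in the list above concrete, here is a small sketch contrasting a call in tail position with one that is not. The class and method names are hypothetical. The C# compiler does not emit the IL `tail.` prefix, so whether the JIT actually turns the call into a jump depends on the platform, the runtime version and the build configuration; the small recursion depth keeps the sample safe either way.

```csharp
using System;

static class TailCallShapes
{
    // Tail position: the recursive call is the last action and its result is
    // returned unchanged, so the JIT is allowed (not required) to emit a jump.
    public static long SumTail(int n, long acc)
    {
        if (n == 0) return acc;
        return SumTail(n - 1, acc + n);
    }

    // Not a tail call: the caller still has to add n after the callee returns,
    // so it does not "return immediately after the call".
    public static long SumNonTail(int n)
    {
        if (n == 0) return 0;
        return n + SumNonTail(n - 1);
    }

    static void Main()
    {
        // A depth of 1000 stays well within the stack even without tail calls.
        Console.WriteLine(SumTail(1000, 0));  // prints 500500
        Console.WriteLine(SumNonTail(1000));  // prints 500500
    }
}
```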
JIT Optimizations - Inlining
Without inlining
class Test
{
    static int And(int i1, int i2)
    {
        return i1 & i2;
    }
    static int i;
    static public void Main()
    {
        i = And(i, 0);
    }
}
With inlining
class Test
{
    static int i;
    static public void Main()
    {
        i = 0; // xor edx,edx
    }
}
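For experiments like the one on this slide, the inlining decision can be influenced per method with `MethodImplAttribute`. This is a minimal sketch with hypothetical class and method names; `AggressiveInlining` is only a hint (available from .NET Framework 4.5), while `NoInlining` reliably keeps the call.

```csharp
using System;
using System.Runtime.CompilerServices;

static class InlineHints
{
    // Asks the JIT to inline the method when possible.
    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public static int AndInlined(int i1, int i2) { return i1 & i2; }

    // Forbids inlining, so a real call remains visible in the disassembly.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static int AndNotInlined(int i1, int i2) { return i1 & i2; }

    static int i = 42;

    static void Main()
    {
        int a = AndInlined(i, 0);
        int b = AndNotInlined(i, 0);
        Console.WriteLine("{0} {1}", a, b); // prints "0 0", because x & 0 == 0
    }
}
```

Comparing the disassembly of the two call sites in a Release build (for example with SOS) is one way to see the difference the slide describes.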
JIT Optimizations - Inlining
Without Inlining:
Main()
<Setup stack>
mov ecx,dword ptr ds:[183368h] ; setup first argument (i)
xor edx,edx ; setup second argument (0)
call dword ptr ds:[183818h] (Inline.Program+Test.And(Int32, Int32), mdToken: 06000002) ; call And(…)
mov dword ptr [ebp-4],eax ; save return value
mov eax,dword ptr [ebp-4] ; assign result to static
mov dword ptr ds:[00283368h],eax ; assign result to static
<cleanup stack>
ret ; return
And(Int32, Int32)
<Setup stack>
mov eax,dword ptr [ebp-4] ; move arg 1 to eax
and eax,dword ptr [ebp-8] ; AND argument 2 into eax (return register)
<cleanup stack>
ret ; return to caller
JIT Optimizations - Inlining
With Inlining
- And(.,.) is inlined now
- No and reg,reg is emitted, because it is not needed (the argument is 0)
MethodDesc Table
Entry     MethodDesc  JIT     Name
53dda7e0  53bb4934    PreJIT  System.Object.ToString()
53dde2e0  53bb493c    PreJIT  System.Object.Equals(System.Object)
53dde1f0  53bb495c    PreJIT  System.Object.GetHashCode()
53e61600  53bb4970    PreJIT  System.Object.Finalize()
001dc019  001d3828    NONE    Inline.Program+Test..ctor()
001dc011  001d3810    NONE    Inline.Program+Test.And(Int32, Int32)
00270070  001d381c    JIT     Inline.Program+Test.Main()
Main()
xor edx,edx ; generate final result
mov dword ptr ds:[1D3368h],edx ; move result to static
ret ; return
Demo: JIT Compilation
!dumpmt -md
bp cmdStartJit_Click
JIT Optimizations
Additional Config
Instruct the CLR not to optimize the code during JIT, without recompiling the DLL:
- Use an .ini file (and symbols)
• MyDll.ini:
[.NET Framework Debugging Control]
GenerateTrackingInfo=1 (on by default from .NET 2.0 onward)
AllowOptimize=0
• (usable for assemblies in the GAC as well)
Instruct the CLR to ignore the (optimized) NGen image
- Use an environment variable: set COMPLUS_ZapDisable=1
JIT Performance Counters - % Time in Jit
% of elapsed time spent in JIT compilation since the last JIT compilation phase
Updated at the end of every JIT compilation phase.
A JIT compilation phase occurs when a method and its dependencies are compiled.
A value > 5% can indicate a problem
- Is NGen an option?
• http://msdn.microsoft.com/en-us/magazine/cc163610.aspx
- Do you use multiple AppDomains?
• Loading assemblies as domain neutral can help
- Minimize the number of classes and assemblies in the code path
• Use code coverage to determine these components.
- See .NET Framework Usage Performance Rules/DA0009
JIT Performance Counters - Summary
Performance counter: Description
# of IL Bytes JITted: Displays the total number of Microsoft intermediate language (MSIL) bytes compiled by the just-in-time (JIT) compiler since the application started. This counter is equivalent to the Total # of IL Bytes Jitted counter.
# of IL Methods JITted: Displays the total number of methods JIT-compiled since the application started. This counter does not include pre-JIT-compiled methods.
% Time in Jit: Displays the percentage of elapsed time spent in JIT compilation since the last JIT compilation phase. This counter is updated at the end of every JIT compilation phase. A JIT compilation phase occurs when a method and its dependencies are compiled.
IL Bytes Jitted / sec: Displays the number of MSIL bytes that are JIT-compiled per second. This counter is not an average over time; it displays the difference between the values observed in the last two samples divided by the duration of the sample interval.
Standard Jit Failures: Displays the peak number of methods the JIT compiler has failed to compile since the application started. This failure can occur if the MSIL cannot be verified or if there is an internal error in the JIT compiler.
Total # of IL Bytes Jitted: Displays the total MSIL bytes JIT-compiled since the application started. This counter is equivalent to the # of IL Bytes Jitted counter.
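The counters in this table can also be read programmatically. The sketch below is an illustration only, assuming the .NET Framework on Windows, where the counters live in the ".NET CLR Jit" category; the class name `JitCounterSample`, the helper `IsSuspicious` and the one-second sampling delay are hypothetical choices, and the instance lookup assumes that only one process with this name is running.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

static class JitCounterSample
{
    // Threshold from the slides: a value above 5% can indicate a problem.
    public const float WarningThresholdPercent = 5.0f;

    public static bool IsSuspicious(float percentTimeInJit)
    {
        return percentTimeInJit > WarningThresholdPercent;
    }

    static void Main()
    {
        // Assumption: the counter instance is named after the process (without extension).
        string instance = Process.GetCurrentProcess().ProcessName;

        using (var counter = new PerformanceCounter(".NET CLR Jit", "% Time in Jit", instance, true))
        {
            counter.NextValue();   // the first sample only primes the counter
            Thread.Sleep(1000);    // wait so the second sample covers an interval
            float value = counter.NextValue();
            Console.WriteLine("% Time in Jit: {0:F2} (suspicious: {1})", value, IsSuspicious(value));
        }
    }
}
```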
.NET 4.5: Enabling Multi-Core Background JIT
using …
public static void …
/* Sets the application profile path */
/* Starts JIT on multicore systems. Starts the process of recording current method use, which later overwrites the specified profile file. */
.NET 4.5: Multi-Core Background JIT
ProfileOptimization.StartProfile("MyApp.Scenario1");
…
ProfileOptimization.StartProfile("MyApp.Scenario2");
….
ProfileOptimization.StartProfile("MyApp.Scenario3");
http://support.microsoft.com/kb/2715214/en-us
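The calls on these two slides can be put together roughly as follows. This is a sketch under assumptions rather than the original slide code: the profile directory under the temp path and the class name `BackgroundJitSample` are hypothetical, while `ProfileOptimization.SetProfileRoot` and `ProfileOptimization.StartProfile` come from the `System.Runtime` namespace (.NET Framework 4.5 and later).

```csharp
using System;
using System.IO;
using System.Runtime;

static class BackgroundJitSample
{
    // Hypothetical location for the profile files; the directory must exist.
    public static string PrepareProfileRoot()
    {
        string root = Path.Combine(Path.GetTempPath(), "MyAppJitProfiles");
        Directory.CreateDirectory(root);
        return root;
    }

    static void Main()
    {
        // Sets the directory in which profile files are stored.
        ProfileOptimization.SetProfileRoot(PrepareProfileRoot());

        // Starts background JIT from an existing profile (if any) and begins
        // recording method use, which later overwrites that profile file.
        ProfileOptimization.StartProfile("MyApp.Scenario1");

        Console.WriteLine("Startup work for scenario 1 runs here.");
    }
}
```

The first run only records the profile; a benefit, if any, can show up from the second run onward on a multi-core machine.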
.NET 4.6 - RyuJIT
Better throughput (25% for Bing)
Disable:
<configuration>
  <runtime>
    <useLegacyJit enabled="1"/>
  </runtime>
</configuration>
Demo: JIT Optimizations
AllowOptimize
ETW Architecture
[Diagram: Controllers, Providers and Consumers around the Event Tracing Sessions (Session 1, Session 2, … Session 64); events flow from providers through the sessions to consumers and log files]
JIT ETW tracing in .NET Framework 4
Gives information about inlining or tail-calling a certain method
- Register .NET events:
wevtutil im ….\v4.0.21006\clr-etw.man
- Start ETW tracing (JIT events on: 0x1010):
xperf -on base
xperf -start Jit -on e13c0d23-ccbc-4e12-931b-d9cc2eee27e4:0x1010:5 -f JIT.etl
<Start of my Application>
- Stop ETW tracing and view the trace with Xperf:
xperf -stop Jit
xperf -d base.etl
xperf -merge Jit.etl base.etl merge.etl
wpa merge.etl
.NET ETW Events - Summary
Runtime keyword name (Value): Purpose
GCKeyword (0x00000001): Enables the collection of garbage collection events.
LoaderKeyword (0x00000008): Enables the collection of loader events.
JITKeyword (0x00000010): Enables the collection of just-in-time (JIT) events.
NGenKeyword (0x00000020): Enables the collection of events for native image methods (methods processed by the Native Image Generator, Ngen.exe); used with StartEnumerationKeyword and EndEnumerationKeyword.
StartEnumerationKeyword (0x00000040): Enables the enumeration of all the methods in the runtime; used in conjunction with NGenKeyword.
EndEnumerationKeyword (0x00000080): Enables the enumeration of all the methods destroyed in the runtime; used in conjunction with JITKeyword and NGenKeyword.
SecurityKeyword (0x00000400): Enables the collection of security events.
AppDomainResourceManagementKeyword (0x00000800): Enables the collection of resource monitoring events at an application domain level.
JITTracingKeyword (0x00001000): Enables the collection of JIT tracing events.
InteropKeyword (0x00002000): Enables the collection of interop events.
ContentionKeyword (0x00004000): Enables the collection of contention events.
ExceptionKeyword (0x00008000): Enables the collection of exception events.
ThreadingKeyword (0x00010000): Enables the collection of threadpool events.
StackKeyword (0x40000000): Enables the collection of CLR stack trace events.
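The keyword values in this table are bit flags, so the 0x1010 mask used for JIT tracing is simply JITKeyword combined with JITTracingKeyword. The sketch below shows that arithmetic; the enum `ClrKeywords` and the class `KeywordMask` are hypothetical names that mirror a few rows of the table and are not part of any library.

```csharp
using System;

// Hypothetical flags enum mirroring a subset of the keyword table.
[Flags]
enum ClrKeywords : long
{
    GC = 0x1,
    Loader = 0x8,
    Jit = 0x10,
    NGen = 0x20,
    JitTracing = 0x1000
}

static class KeywordMask
{
    // Combines the JIT and JIT tracing keywords into one mask.
    public static long JitWithTracing()
    {
        return (long)(ClrKeywords.Jit | ClrKeywords.JitTracing);
    }

    static void Main()
    {
        Console.WriteLine("0x{0:X}", JitWithTracing()); // prints 0x1010
    }
}
```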
Demo: JIT ETW Events
wevtutil im CLR-ETW.man
xperf
.NET Decompilation
Managed decompilers
- Output source code and IL
- Source output available in multiple languages
Obfuscation can make it more difficult to decompile source code
Many different options
PreJit / NGEN
.NET 2.0 - 3.5
Ngen calls a service (running as LocalSystem).
This service compiles the image in the background.
Supports compilation of all dependent DLLs, and update functionality.
Metadata is now included in the created DLL/EXE.
Ngen MyNiceExe.EXE -> mscorsvw.exe -> JIT -> C:\WINDOWS\assembly\NativeImages_v2.0.50215_32\MyExe\MyExe.exe
NGEN 4.0
Side by Side support
Ngen.exe now compiles assemblies with full trust, CAS policy is no longer evaluated.
Native images that are generated with Ngen.exe can no longer be loaded into applications that are running in partial trust.
Located in folder Framework\v4.0.xxxxx
Supports .NET 4.0 and .NET 2.0 assemblies
- Generates 2.0 image
• ngen.exe install <2.0 assembly>
- Generates 4.0 image
• ngen.exe install <2.0 assembly> /ExeConfig:<Path to a 4.0 EXE>
OR
• ngen.exe install <2.0 EXE with a config file that indicates 4.0 as the preferred runtime>
NGEN 4.0
Target Patching
In .NET 2.0 - 3.5, if assembly Y depends on X, the CLR re-NGens Y for any change in X, because
• Y may inline methods from X
• Y may use fields in X’s classes (layout of classes might change)
• Y may derive from X’s classes (layout of classes might change)
BUT ~half of changes only modify the bodies of large methods
• Large methods are not inlined cross-assembly
• No need to re-NGEN if only function bodies changed (unless a function prototype changed)
• Works great for QFEs and GDRs (small security fixes)
• Unlikely to work for a service pack
NGEN 4.0
Prioritization of NGEN
Priority 1 images compile on all cores
- ngen.exe install /queue:1 <MyImportantAssembly#1>
- ngen.exe install /queue:1 <MyImportantAssembly#2>
Priority 3 images compile at idle time
- ngen.exe install /queue:3 <MyAssembly#N+1>
NGEN 4.5
Starting with Windows 8 and .NET Framework 4.5
Native images are created automatically by the Auto NGen Maintenance Task
Images are created based on “Assembly Usage Logs” written by the application in the AppData Windows directory
The Auto NGen Maintenance Task is based on Automatic Maintenance, which runs in the background when the machine is idle
The Auto NGen Maintenance Task also reclaims native images that are no longer in use
NGEN 4.5 Notes
The assembly must target the .NET Framework 4.5
Auto NGen runs only on Windows 8 and above
For desktop apps, Auto NGen applies only to GAC assemblies
For Modern-style apps, Auto NGen applies to all assemblies
Auto NGen does not remove unused rooted native images (images NGen'ed by developers).
.NET 4.5: Managed Profile Guided Optimization
IL assembly -> MPGO -> IL assembly with embedded training profile -> NGEN -> Optimized precompiled native image
MPGO co-locates frequently used image data within a native image
- Reduces the number of pages loaded from disk
• Fewer page faults
- Reduces the number of copy-on-write pages
Improves startup time and memory usage (all apps)
.NET 4.5: How to MPGO
1. Run the MPGO tool (as an administrator) with the necessary parameters:
MPGO -scenario MyLargeApp.exe -AssemblyList *.* -OutDir C:\Optimized\
The optimized IL assemblies are created in the C:\Optimized folder.
2. Run the NGen tool (as an administrator) with the necessary parameters for each application DLL:
NGEN.exe c:\Optimized\myLargeApp.exe
.NET 4.6 - Ngen
Better throughput (25% for Bing)
Disable for specific assemblies
<configuration>
<runtime>
<disableNativeImageLoad>
<assemblyIdentity name="assembly_one" />
<assemblyIdentity name="assembly_two" />
</disableNativeImageLoad>
</runtime>
</configuration>
When to use Ngen?
If a DLL is used in several loaded processes
If multiple instances of the application are started, e.g. on a Terminal Server
Be sure to set the base address of your assemblies correctly
Rebasing a DLL during load impacts performance and prevents sharing of the image
ASLR
/DYNAMICBASE
Address Space Layout Randomization
- Available from Windows Vista and Windows Server 2008 onward
• Comes with .NET 3.5 SP1
• C++: use /DYNAMICBASE
Rebasing
- Done within the kernel
- Pages are still shareable
- Backed by the image, not the page file
- The base address is no longer an issue
- Rebuild your app with .NET 3.5 SP1!
Review
1. What is the benefit of using NGEN?
2. What is tail call optimization?
3. What tool can be used for viewing ETW files?
Reference
.NET Code Generation Blog
- http://blogs.msdn.com/b/clrcodegeneration/
The Performance Benefits of NGen
- http://msdn.microsoft.com/en-us/magazine/cc163610.aspx
JIT ETW tracing in .NET Framework 4
- http://blogs.msdn.com/b/clrcodegeneration/archive/2009/05/11/jit-etw-tracing-in-net-framework-4.aspx