© 2013, Intel Corporation. All rights reserved. Intel and the Intel logo are trademarks of Intel Corporation in the U.S. and/or other countries. *Other names and brands may be claimed as the property of others.
Parallel Programming: The Ultimate Road to Performance
April 16, 2013
Werner Krotz-Vogel
© 2009 Mathew J. Sottile, Timothy G. Mattson, and Craig E
Getting started with parallel algorithms
• Concurrency is a general concept
– … multiple activities that can occur and make progress at the same time.
• A parallel algorithm is any algorithm that uses concurrency to solve a problem of a given size in less time.
• Scientific programmers have been working with parallelism since the early 1980s
– Hence we have about 30 years of experience to draw on to help us understand parallel algorithms.
Develop & Parallelize Today for Maximum Performance
Use One Software Architecture Today. Scale Forward Tomorrow.
Enabling & Advancing Parallelism: High Performance Parallel Programming
[Diagram: one code base, built with compiler, libraries and parallel models, targets the multicore CPU, the many-core Intel® Xeon Phi™ coprocessor, the multicore cluster, and the multicore & many-core cluster.]
Intel tools, libraries and parallel models extend to multicore, many-core and heterogeneous computing
Intel® Software Development Products Deliver Application Performance
Foundation of Performance, Productivity, and Standards
Advanced Performance Cluster Performance
Intel® Inspector XE, Intel® VTune™ Amplifier XE, Intel® Advisor
Intel® C/C++ and Fortran Compilers w/OpenMP
Intel® MKL, Intel® Cilk™ Plus, Intel® TBB Library, Intel® IPP Library
Intel® Trace Analyzer and Collector
Intel® MPI Library
Intel® Parallel Studio XE
A Family of Parallel Programming Models Developer Choice
Intel® Cilk™ Plus C/C++ language extensions to simplify parallelism
Open sourced
Also an Intel product
Intel® Threading Building Blocks
Widely used C++ template library for parallelism
Open sourced
Also an Intel product
Domain-Specific Libraries
Intel® Integrated Performance Primitives
Intel® Math Kernel Library
Established Standards
Message Passing Interface (MPI)
OpenMP*
Coarray Fortran
OpenCL*
Offload Extensions
Research and Development
Intel® Concurrent Collections
Intel® SPMD Parallel Compiler
Choice of high-performance parallel programming models
Applicable to Multicore and Many-core Programming
Invest in Common Tools and Programming Models
Use One Software Architecture Today. Scale Forward Tomorrow.
• Multicore: Intel® Xeon® processors are designed for intelligent performance and smart energy efficiency
– Continuing to advance Intel® Xeon® processor family and instruction set (e.g., Intel® AVX, etc.)
• Many-core: Intel® Xeon Phi™ coprocessors are ideal for highly parallel computing applications
– Software development platforms ramping now
Go Parallel with High Performance Math Kernel Library
Intel® Math Kernel Library (Intel® MKL)
Intel® Xeon® processor Intel® Xeon Phi™ coprocessor
    void foo() /* Intel® Math Kernel Library */
    {
        float *A, *B, *C; /* Matrices */
        sgemm(&transa, &transb, &N, &N, &N, &alpha,
              A, &N, B, &N, &beta, C, &N);
    }
Implicit automatic offloading requires no code changes; simply link with the offload MKL Library
Intel High Performance Math Kernel Library is Applicable to Multicore and Many-core Programming
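For readers unfamiliar with the BLAS routine called on this slide: sgemm computes C = alpha*op(A)*op(B) + beta*C in single precision on column-major matrices. The sketch below is a plain, unoptimized C++ reference for the square, non-transposed case only. The function name sgemm_reference and its signature are hypothetical and are not part of Intel MKL; it only illustrates the arithmetic that the library call performs.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical reference routine (not an MKL API): C = alpha*A*B + beta*C
// for n-by-n matrices stored column-major, with no transposition.
void sgemm_reference(std::size_t n, float alpha,
                     const std::vector<float>& A,
                     const std::vector<float>& B,
                     float beta, std::vector<float>& C) {
    assert(A.size() == n * n && B.size() == n * n && C.size() == n * n);
    for (std::size_t j = 0; j < n; ++j) {          // column of C
        for (std::size_t i = 0; i < n; ++i) {      // row of C
            float acc = 0.0f;
            for (std::size_t k = 0; k < n; ++k) {
                acc += A[i + k * n] * B[k + j * n]; // A(i,k) * B(k,j)
            }
            C[i + j * n] = alpha * acc + beta * C[i + j * n];
        }
    }
}
```

An optimized library is expected to produce the same result (up to rounding) far faster, which is the point of linking against it rather than writing the loops by hand.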
Go Parallel with Intel® Cilk™ Plus
• Proven Cilk parallel model, teachable in one minute
– Parallelism in Three Key Words:
  • cilk_spawn
  • cilk_sync
  • cilk_for
• Cilk™ Plus: an open specification
– Recently placed into open source by Intel for the advancement of parallel programming
Learn more at http://cilkplus.org
    // Parallel function invocation, in C
    cilk_for (int i=0; i<n; ++i){
        Foo(a[i]);
    }
    // Parallel spawn in a recursive fibonacci
    // computation, in C
    int fib (int n) {
        if (n < 2) return 1;
        else {
            int x, y;
            x = cilk_spawn fib(n-1);
            y = fib(n-2);
            cilk_sync;
            return x + y;
        }
    }
Intel® Cilk™ Plus is Applicable to Multicore and Many-core Programming
Go Parallel with Intel® Cilk™ Plus
• Data and Task Parallelism as first class citizens in C and C++
– Vectorization via intuitive notations that automatically span MMX, SSE, AVX, and wider widths in the future including those in the Intel® Xeon Phi™ coprocessors
  • array notations
  • #pragma SIMD controls
  • elemental functions
Learn more at http://cilkplus.org
    // pragma SIMD: User-mandated vectorization
    #pragma simd
    for (i=0; i<n; i++) {
        A[i] = A[i] + B[i] + C[i];
    }
    // Simplify operation using array notations in C/C++:
    a[:] = b[:] + c[:];
    // Elemental functions, in C, using Cilk Plus:
    __declspec (vector) void saxpy(float a, float x, float &y) {
        y += a * x;
    }
Intel® Cilk™ Plus is Applicable to Multicore and Many-core Programming
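As a rough guide to what the array notation and the elemental function on this slide mean, the sketch below spells out the equivalent element-by-element loops in standard C++. The names saxpy_all and add_arrays are hypothetical helpers introduced only for illustration; a vectorizing compiler would be free to execute several iterations of such loops per instruction, which is what the Cilk Plus notations request explicitly.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Scalar body with the same arithmetic as the elemental saxpy on the slide.
inline void saxpy(float a, float x, float& y) { y += a * x; }

// Hypothetical driver: applies the scalar body to every element pair.
// An elemental function lets the compiler generate a vector version of
// the body so that a loop like this can process several elements at once.
void saxpy_all(float a, const std::vector<float>& x, std::vector<float>& y) {
    assert(x.size() == y.size());
    for (std::size_t i = 0; i < y.size(); ++i) {
        saxpy(a, x[i], y[i]);
    }
}

// Explicit-loop equivalent of the array notation a[:] = b[:] + c[:].
std::vector<float> add_arrays(const std::vector<float>& b,
                              const std::vector<float>& c) {
    assert(b.size() == c.size());
    std::vector<float> a(b.size());
    for (std::size_t i = 0; i < a.size(); ++i) {
        a[i] = b[i] + c[i];
    }
    return a;
}
```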
Go Parallel with Intel® Threading Building Blocks (Intel® TBB)
• A popular parallel abstraction for C++ developers
– A C++ template library
– Scalable memory allocation
– Load-balancing
– Work-stealing task scheduling
– Thread-safe pipeline
– Concurrent containers
– High-level parallel algorithms
– Numerous synchronization primitives
• Intel remains a leading participant and contributor in the TBB open source project as well as a leading supplier of TBB support and supporting tools.
    // Parallel function invocation example, in C++,
    // using TBB:
    parallel_for (0, n, [=](int i) {
        Foo(a[i]);
    });
Learn more at http://threadingbuildingblocks.org
Intel® TBB is Applicable to Multicore and Many-core Programming
Intel® Threading Building Blocks
• Generic Parallel Algorithms: Efficient scalable way to exploit the power of multi-core without having to start from scratch
• Concurrent Containers: Concurrent access, and a scalable alternative to containers that are externally locked for thread-safety
• TBB flow graph
• Task scheduler: The engine that empowers parallel algorithms and employs task-stealing to maximize concurrency
• Synchronization Primitives: User-level and OS wrappers for mutual exclusion, ranging from atomic operations to several flavors of mutexes and condition variables
• Memory Allocation: Per-thread scalable memory manager and false-sharing free allocators
• Threads: OS API wrappers
• Thread Local Storage: Scalable implementation of thread-local data that supports an unlimited number of TLS
• Miscellaneous: Thread-safe timers
TBB Flow Graph Dependence Example
    #include <cstdio>
    #include <string>
    #include "tbb/flow_graph.h"
    using namespace tbb::flow;

    struct body {
        std::string my_name;
        body( const char *name ) : my_name(name) {}
        void operator()( continue_msg ) const {
            printf("%s\n", my_name.c_str());
        }
    };
    int main() {
        graph g;
        broadcast_node< continue_msg > start(g);  // nodes are constructed on the graph
        continue_node< continue_msg > a( g, body("A") );
        continue_node< continue_msg > b( g, body("B") );
        continue_node< continue_msg > c( g, body("C") );
        continue_node< continue_msg > d( g, body("D") );
        continue_node< continue_msg > e( g, body("E") );
        make_edge( start, a );
        make_edge( start, b );
        make_edge( a, c );
        make_edge( b, c );
        make_edge( c, d );
        make_edge( a, e );
        for (int i = 0; i < 3; ++i ) {
            start.try_put( continue_msg() );
            g.wait_for_all();
        }
        return 0;
    }
Dependence graph built by the code: start → A, start → B, A → C, B → C, C → D, A → E
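The dependence rule in this example is that a node runs only after it has received a message from every predecessor (C waits for both A and B). A minimal sequential sketch of that counting rule, in standard C++ without TBB, might look as follows. The function firing_order and its integer node indices are hypothetical and only illustrate the rule; the real flow graph may run independent nodes such as A and B concurrently and in either order.

```cpp
#include <cassert>
#include <cstddef>
#include <queue>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper (not a TBB API): returns one valid firing order for a
// dependence graph. Each node fires once all of its predecessors have fired.
std::vector<std::string> firing_order(
        const std::vector<std::string>& names,
        const std::vector<std::pair<int, int>>& edges,
        int start) {
    assert(start >= 0 && static_cast<std::size_t>(start) < names.size());

    // pending[v] = number of predecessor messages node v still waits for.
    std::vector<int> pending(names.size(), 0);
    for (const auto& e : edges) ++pending[e.second];

    std::vector<std::string> order;
    std::queue<int> ready;
    ready.push(start);
    while (!ready.empty()) {
        const int u = ready.front();
        ready.pop();
        order.push_back(names[u]);
        // "Send a message" along every outgoing edge of u.
        for (const auto& e : edges) {
            if (e.first == u && --pending[e.second] == 0) {
                ready.push(e.second);
            }
        }
    }
    return order;
}
```

With the nodes indexed start=0, A=1, B=2, C=3, D=4, E=5 and the six edges listed in the order of the make_edge calls, this sketch yields start, A, B, E, C, D. That is one of several orders the dependences allow.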
Go Parallel with Message Passing Interface (MPI)
Intel® Message Passing Interface (Intel® MPI)
• Extend your cluster solutions to the Intel® Xeon Phi™ coprocessor
– E.g., Intel® Xeon Phi™ coprocessor in every node of the cluster using Intel® MPI and Intel® Threading Building Blocks and/or Intel® Cilk™ Plus on nodes
– Same model as an Intel® Xeon® processor based cluster.
Intel is a leading vendor of MPI implementations and tools
Learn more at http://intel.com/go/mpi
[Diagram: multicore cluster; clusters with multicore and many-core]
MPI is applicable to Multicore and Many-core Programming
Go Parallel with Coarray Fortran
Intel® Fortran Compiler
• A standard, explicit notation for data decomposition, such as that often used in message-passing models, expressed in a natural Fortran-like syntax.
• For parallel programming on both shared memory and distributed memory systems
    ! Sum in Fortran, using co-array feature:
    REAL SUM[*]
    CALL SYNC_ALL( WAIT=1 )
    DO IMG = 2, NUM_IMAGES()
      IF (IMG == THIS_IMAGE()) THEN
        SUM = SUM + SUM[IMG-1]
      ENDIF
      CALL SYNC_ALL( WAIT=IMG )
    ENDDO
Learn more at http://intel.com/software/products
Coarray Fortran is Applicable to Multicore and Many-core Programming
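The coarray loop on this slide lets exactly one image per step add its left neighbour's value to its own, so after the loop image k holds the running sum of the values on images 1 through k. The sketch below is a sequential C++ simulation of that data flow, assuming one vector element per image. The function coarray_running_sum is a hypothetical name, and the synchronization calls are replaced by ordinary loop ordering.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical simulation: sum[k-1] plays the role of SUM on image k.
// Each loop step corresponds to one iteration of the DO loop, in which
// only image `img` is active and reads SUM[IMG-1] from its neighbour.
std::vector<float> coarray_running_sum(std::vector<float> sum) {
    assert(!sum.empty());  // a coarray program always has at least one image
    const std::size_t num_images = sum.size();
    for (std::size_t img = 2; img <= num_images; ++img) {
        sum[img - 1] += sum[img - 2];  // SUM = SUM + SUM[IMG-1], 1-based images
    }
    return sum;
}
```

Because each step depends on the previous one, this particular pattern is a serial chain across images; the synchronization in the Fortran version is what enforces that order.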
Go Parallel with OpenMP*
Intel® C/C++ and Fortran Compilers (C Example)
Intel® Xeon® processor Intel® Xeon Phi™ coprocessor
    #include <stdio.h>
    /* N, the number of intervals, is assumed to be defined elsewhere */
    int main(void) {
        double pi = 0.0;
        long i;
        #pragma offload target (mic)   /* the one added line for offload */
        #pragma omp parallel for reduction(+:pi)
        for (i=0; i<N; i++) {
            double t = (double)((i+0.5)/N);
            pi += 4.0/(1.0+t*t);
        }
        printf("pi = %f\n", pi/N);
        return 0;
    }
One Line Change to Offload to the Intel® Xeon Phi™ coprocessor
OpenMP* is Applicable to Multicore and Many-core Programming
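The loop in this example is the midpoint rule applied to the integral of 4/(1+x²) over [0, 1], which equals π. A self-contained C++ version of the host-side computation is sketched below. The function name estimate_pi is hypothetical, and the offload pragma is omitted so that the code builds with any compiler. Without OpenMP enabled the pragma is ignored and the loop runs serially with the same result.

```cpp
#include <cassert>

// Hypothetical helper: midpoint-rule estimate of pi using n intervals.
// The reduction clause gives each thread a private partial sum and
// combines the partial sums when the loop ends.
double estimate_pi(long n) {
    assert(n > 0);
    double sum = 0.0;
    #pragma omp parallel for reduction(+:sum)
    for (long i = 0; i < n; ++i) {
        const double t = (i + 0.5) / n;   // midpoint of interval i
        sum += 4.0 / (1.0 + t * t);
    }
    return sum / n;                       // multiply by interval width 1/n
}
```

The reduction is what makes the parallel version safe: without it, all threads would update the same accumulator concurrently.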
Go Parallel with OpenMP*
Intel® C/C++ and Fortran Compilers (Fortran Example)
Intel® Xeon® processor Intel® Xeon Phi™ coprocessor
    !dir$ omp offload target(mic)
    !$omp parallel do
    do i=1,10
      A(i) = B(i) * C(i)
    enddo
    !$omp end parallel do
One Line Change to Offload to the Intel® Xeon Phi™ coprocessor
OpenMP* is Applicable to Multicore and Many-core Programming
Go Parallel with C/C++ Language Extensions
• Simple Keyword Language Extensions to control offloading to Intel® Xeon Phi™ coprocessor
    class _Shared common {
        int data1;
        char *data2;
        class common *next;
        void process();
    };
    _Shared class common obj1, obj2;
    …
    _Cilk_spawn _Offload obj1.process();
    _Cilk_spawn obj2.process();
    …
C/C++ Language Extensions to Multicore and Many-core Programming
Use the Same Code for Execution on Intel® Xeon Phi™ coprocessors by Offloading
• C/C++ Offload Pragma
#pragma offload target (mic)
#pragma omp parallel for reduction(+:pi)
for (i=0; i<count; i++) {
float t = (float)((i+0.5)/count);
pi += 4.0/(1.0+t*t);
}
pi /= count;
• MKL Implicit Offload
//MKL implicit offload requires no source code changes, simply link with the offload MKL Library.
• MKL Explicit Offload
#pragma offload target (mic) \
in(transa, transb, N, alpha, beta) \
in(A:length(matrix_elements)) \
in(B:length(matrix_elements)) \
in(C:length(matrix_elements)) \
out(C:length(matrix_elements) alloc_if(0))
sgemm(&transa, &transb, &N, &N, &N, &alpha,
A, &N, B, &N, &beta, C, &N);
• Fortran Offload Directive
!dir$ omp offload target(mic)
!$omp parallel do
do i=1,10
A(i) = B(i) * C(i)
enddo
!$omp end parallel do
• C/C++ Language Extensions
class _Shared common {
int data1;
char *data2;
class common *next;
void process();
};
_Shared class common obj1, obj2;
…
_Cilk_spawn _Offload obj1.process();
_Cilk_spawn obj2.process();
…
Parallelism with OpenCL*
Intel® OpenCL SDK
• OpenCL* is a framework for writing programs that execute across heterogeneous platforms (e.g., CPUs, GPUs, many-core)
• Intel is a leading participant in the OpenCL* standard efforts, and a vendor of solutions and related tools with early implementations available today.
• OpenCL* addresses the needs of customers in specific segments
    // Simple per element multiplication using OpenCL*:
    kernel void dotprod( global const float *a,
                         global const float *b,
                         global float *c)
    {
        int myid = get_global_id(0);
        c[myid] = a[myid] * b[myid];
    }
Learn more at http://intel.com/go/opencl
OpenCL is applicable to multicore and many-core programming
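In the kernel on this slide, each work-item handles the single element selected by get_global_id(0). The sketch below emulates that execution model on the host in standard C++, with a loop standing in for the one-dimensional range of work-items. The names dotprod_work_item and run_dotprod are hypothetical, and no OpenCL runtime calls are used; a real OpenCL program would create buffers and enqueue the kernel through the OpenCL host API instead.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Kernel body for one work-item; `myid` stands in for get_global_id(0).
void dotprod_work_item(std::size_t myid, const float* a, const float* b,
                       float* c) {
    c[myid] = a[myid] * b[myid];
}

// Hypothetical host-side stand-in for launching a.size() work-items.
// The work-items are independent, so a device may run them in any order
// or all at once; this loop simply runs them one after another.
std::vector<float> run_dotprod(const std::vector<float>& a,
                               const std::vector<float>& b) {
    assert(a.size() == b.size());
    std::vector<float> c(a.size());
    for (std::size_t id = 0; id < c.size(); ++id) {
        dotprod_work_item(id, a.data(), b.data(), c.data());
    }
    return c;
}
```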
Running your Application
Execution on the host and Intel® Xeon Phi™ coprocessor
[Diagram: your application with identified compute-intensive kernels. Intel host processor (multicore): host offload library, message library. Intel® Xeon Phi™ coprocessor(s) (many-core): target offload library, message library.]
Execution Flow
Without: Intel® Xeon Phi™ coprocessor(s) are absent
• Application starts and executes on host
• At each offload, the construct runs on host cores/threads
• Normal program termination on host
With: Intel® Xeon Phi™ coprocessor(s) are present
• Application starts on host and executes portions on Intel® Xeon Phi™ coprocessor(s)
• At runtime, if Intel® Xeon Phi™ coprocessor(s) are available, the target binary is loaded
• At each offload, the construct runs on the Intel® Xeon Phi™ coprocessor(s)
• At program termination, target binary is unloaded
Intel® MPI/Thread Environment Support
• The execution command mpirun of Intel® MPI reads argument sets from the command line:
– Sections between ":" define an argument set (alternatively, a line in a configfile specifies a set)
– Host, number of nodes, and also the environment can be set independently in each argument set
    # mpirun -env I_MPI_PIN_DOMAIN 4 -host myXEON ... : -env I_MPI_PIN_DOMAIN 16 -host myMIC
• Adapt the important environment variables to the architecture
– OMP_NUM_THREADS, KMP_AFFINITY for OpenMP
– CILK_NWORKERS for Intel® Cilk™ Plus
* Although locality issues apply as well, multicore threading runtimes are by far more expressive, richer, and with lower overhead.
Analyzing your Application
Performance Analysis Tools
• Intel® VTune™ Amplifier XE performance profiler
– Analyze your multicore and many-core performance
• Analyze performance of the application in offload mode
• Support for Intel® Xeon Phi™ coprocessors includes:
– A Linux* hosted command line tool that collects events
– The VTune™ Amplifier XE graphical user interface to display the results collected in the previous step, highlighting bottlenecks, time spent and other details of performance.
GDB* on Intel® Xeon Phi™ Coprocessor
• GDB* supports Intel® Xeon Phi™ Coprocessor
• Intel upstreams features and capabilities to GNU* community
• Broad enabling of developers and software tools ecosystem
• Available from Intel at http://software.intel.com
The GNU* Project Debugger and Intel® Xeon Phi™ Coprocessor
• Native and cross-debugger versions of GDB* exist for the Intel® Xeon Phi™ coprocessor
• It is part of the Intel® Manycore Platform Software Stack (Intel® MPSS)
• http://software.intel.com/en-us/articles/intel-manycore-platform-software-stack-mpss
You can debug with it as either root or a user
Intel Confidential – NDA presentation
Native debugging on the Intel® Xeon Phi™ Coprocessor with GDB*
• Run GDB* on the Intel® Xeon Phi™ Coprocessor
    ssh -t mic0 /usr/bin/gdb
– To attach to a running application via the process-id
    (gdb) shell pidof my_application
    42
    (gdb) attach 42
– To run an application directly from GDB*
    (gdb) file /target/path/to/application
    (gdb) start
Remote debugging with GDB* for Intel® Xeon Phi™ Coprocessor
• Run GDB* on your localhost
    /usr/linux-k1om-4.7/bin/x86_64-k1om-linux-gdb
• Start gdbserver on the Intel® Xeon Phi™ Coprocessor
– To remote debug using | ssh
    (gdb) target extended-remote | ssh -T mic0 gdbserver --multi IP:port
– To remote debug using stdio
    (gdb) target extended-remote | ssh -T mic0 gdbserver --multi -
• To attach to a running application via the process-id (pid)
    (gdb) file /local/path/to/application
    (gdb) attach <remote-pid>
• To run an application directly from GDB*
    (gdb) file /local/path/to/application
    (gdb) set remote exec-file /target/path/to/application
Explore Intel® Xeon Phi™ Coprocessor Architecture Features
• List all new vector and mask registers
    (gdb) info registers zmm
    k0 0x0 0
    ⁞
    zmm31 {v16_float = {0x0 <repeats 16 times>}, v8_double = {0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}, v64_int8 = {0x0 <repeats 64 times>}, v32_int16 = {0x0 <repeats 32 times>}, v16_int32 = {0x0 <repeats 16 times>}, v8_int64 = {0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}, v4_uint128 = {0x0, 0x0, 0x0, 0x0}}
• Disassemble Instructions
    (gdb) disassemble $pc, +10
    Dump of assembler code from 0x11 to 0x24:
    0x0000000000000011 <foobar+17>: vpackstorelps %zmm0,-0x10(%rbp){%k1}
    0x0000000000000018 <foobar+24>: vbroadcastss -0x10(%rbp),%zmm0
Intel® Software Tools Roadmap
[Roadmap chart, Q3 ’12 through Q3 ’13, with alpha, beta release window, and gold release milestones. Rows: High Performance Computing / Enterprise; Many-Core; Data Center Tools. Entries: Intel® Cluster Studio XE 2012; Intel® Parallel Studio XE 2013 with support for Intel® Xeon Phi™ coprocessors (Linux); Intel® Cluster Studio XE 2013 with support for Intel® Xeon Phi™ coprocessors (Linux); Intel® Parallel Studio XE NEXT; Intel® Cluster Studio XE NEXT; Intel® Xeon Phi™ coprocessor support for Windows* (Alpha); Intel® Xeon Phi™ coprocessor support for Windows* (Beta); beta release window for Microsoft Windows*.]
Preserve Your Development Investment
Common Tools and Programming Models for Parallelism
[Diagram: multicore, many-core, and heterogeneous computing. C/C++ (Intel® C/C++ Compiler): Intel® Cilk Plus, Intel® TBB, offload pragmas, OpenCL*, OpenMP*. Fortran (Intel® Fortran Compiler): OpenMP*, Coarray, offload directives. Also shown: Intel® MPI, Intel® MKL.]
Develop Using Parallel Models that Support Heterogeneous Computing
Conclusion
• There are many parallel programming models in existence, but only a small number are actually used and standardized across platforms:
– OpenMP
– MPI
– TBB
– Cilk
– Pthreads
– OpenCL
• Everything you do to make applications run well on Intel Xeon Phi coprocessors (vectorization, parallelization) can be done with the models above (OpenMP, MPI, etc.). The same work also carries over to Intel Xeon processors and typically improves performance there too.
Call to Action
• Evaluate the Intel® Software Development Products, including the family of Parallel Programming Models, for your High Performance needs:
http://www.intel.com/software/products/eval
• For product information see:
http://www.intel.com/software/products
Note: The Intel® Parallel Studio XE 2013 and Intel® Cluster Studio XE 2013 products include support for Intel® Xeon Phi™ coprocessors prior to the coprocessors being generally available.
INFORMATION IN THIS DOCUMENT IS PROVIDED “AS IS”. NO LICENSE, EXPRESS OR IMPLIED, BY ESTOPPEL OR OTHERWISE, TO ANY INTELLECTUAL PROPERTY RIGHTS IS GRANTED BY THIS DOCUMENT. INTEL ASSUMES NO LIABILITY WHATSOEVER AND INTEL DISCLAIMS ANY EXPRESS OR IMPLIED WARRANTY, RELATING TO THIS INFORMATION INCLUDING LIABILITY OR WARRANTIES RELATING TO FITNESS FOR A PARTICULAR PURPOSE, MERCHANTABILITY, OR INFRINGEMENT OF ANY PATENT, COPYRIGHT OR OTHER INTELLECTUAL PROPERTY RIGHT. Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products. Copyright © , Intel Corporation. All rights reserved. Intel, the Intel logo, Xeon, Core, Xeon Phi, VTune, and Cilk are trademarks of Intel Corporation in the U.S. and other countries.
Optimization Notice
Intel’s compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to the applicable product User and Reference Guides for more information regarding the specific instruction sets covered by this notice.
Notice revision #20110804
Legal Disclaimer & Optimization Notice
Copyright© 2012, Intel Corporation. All rights reserved. *Other brands and names are the property of their respective owners.