Intel Math Kernel Library (MKL)
Clay P. Breshears, PhD
Intel Software College
NCSA Multi-core Workshop
July 24, 2007
Copyright © 2007, Intel Corporation. All rights reserved.
Performance Libraries: Intel® Math Kernel Library (MKL)
Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. *Other brands and names are the property of their respective owners.
Agenda
Performance Features
The Library Sections
• BLAS
• LAPACK*
• DFTs
• VML
• VSL
SciMark 2.0 Optimization Case Study (from Henry Gabb)
• SciMark 2.0 overview
• Tuning with the Intel compiler
• Tuning with the Intel Math Kernel Library
Intel® Math Kernel Library Purpose
Performance, Performance, Performance!
Intel’s engineering, scientific, and financial math library
Addresses:
• Solvers (BLAS, LAPACK)
• Eigenvector/eigenvalue solvers (BLAS, LAPACK)
• Some quantum chemistry needs (dgemm)
• PDEs, signal processing, seismic, solid-state physics (FFTs)
• General scientific, financial [vector transcendental functions (VML) and vector random number generators (VSL)]
Tuned for Intel® processors – current and future
Intel® Math Kernel Library Purpose – Don’ts
But don’t use the Intel® Math Kernel Library (Intel® MKL) on …
Don’t use Intel® MKL on “small” counts
Don’t call vector math functions on small n
Example of a small count, a geometric transformation: $(X', Y', Z', W')^T = M\,(X, Y, Z, W)^T$, where $M$ is a 4x4 transformation matrix
But you could use Intel® Performance Primitives
Intel® Math Kernel Library Environment
Supports 32-bit and 64-bit Intel® processors
Large set of examples and tests
Extensive documentation
            Windows*            Linux*
Compilers   Intel, Microsoft    Intel, GNU
Libraries   .dll, .lib          .a, .so
Resource Limited Optimization
The goal of all optimization is maximum speed
Resource-limited optimization exhausts one or more resources of the system:
• CPU: Register use, FP units
• Cache: Keep data in cache as long as possible; deal with cache interleaving
• TLBs: Maximally use data on each page
• Memory bandwidth: Minimally access memory
• Computer: Use all the processors/cores available using threading
• System: Use all the nodes available (cluster software)
Threading
Most of Intel® Math Kernel Library could be threaded but:
• Limited resource is memory bandwidth
• Threading level 1 and level 2 BLAS are mostly ineffective ( O(n) )
There are numerous opportunities for threading:
• Level 3 BLAS ( O(n^3) )
• LAPACK* ( O(n^3) )
• FFTs ( O(n log n) )
• VML, VSL: depends on processor and function
All threading is via OpenMP*
All Intel MKL is designed and compiled for thread safety
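One rough way to see why memory bandwidth limits threading for Level 1 BLAS but not for Level 3 is to compare floating-point work with memory traffic. The sketch below is illustrative only: the helper names and the traffic model (each operand streamed from memory once, 8-byte doubles) are assumptions, not part of Intel MKL.

```c
#include <assert.h>

/* Flops per byte for daxpy (y = a*x + y) on vectors of length n,
   assuming x and y are each read once and y is written once. */
double daxpy_intensity(double n)
{
    assert(n > 0.0);
    double flops = 2.0 * n;          /* one multiply and one add per element */
    double bytes = 3.0 * n * 8.0;    /* read x, read y, write y */
    return flops / bytes;
}

/* Flops per byte for dgemm (C = A*B + C) on n x n matrices,
   assuming A, B, and C are each read once and C is written once. */
double dgemm_intensity(double n)
{
    assert(n > 0.0);
    double flops = 2.0 * n * n * n;  /* n multiply-add pairs per output element */
    double bytes = 4.0 * n * n * 8.0;
    return flops / bytes;
}
```

Under this model daxpy stays at 1/12 flop per byte for any n, while dgemm grows as n/16, so additional cores have far more work per byte to share in dgemm than in daxpy.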
SciMark 2.0
Produced by the National Institute of Standards and Technology
ANSI C and Java versions available
Five floating-point-intensive kernels
• FFT: Compute a complex 1D FFT
• SOR: Jacobi successive over-relaxation in 2D
• MC: Compute π by Monte Carlo integration
• MV: Sparse matrix-vector multiplication
• LU: Dense matrix LU factorization
SciMark 2.0 Problem Sizes
Benchmark   Small problem          Large problem
FFT         N = 1024               N = 1048576
SOR         100 x 100              1000 x 1000
MC          Problem size not fixed; no distinction between small and large problems
MV          N = 1000, NZ = 5000    N = 100000, NZ = 1000000
LU          100 x 100              1000 x 1000
Benchmark System
Hardware
• CPU (dual-processor system): 3.6 GHz Xeon (2 MB L2 cache), EM64T
• Motherboard: Intel Server Board SE7520AF2
• Memory: 512 MB DDR2
BIOS
• Version: P06
• Adjacent Cache Line Prefetch: ON
• Hardware Prefetch: ON
• Hyper-Threading Technology: OFF
Software
• Operating system: Red Hat Enterprise Linux AS3
• Linux kernel: 2.4.21-20.EL #1 SMP
• Intel C++ Compiler for Linux: 8.1 (l_cce_pc_8.1.024)
• Intel Cluster MKL: 7.2 (l_cluster_mkl_7.2.008)
• GNU C Compiler: gcc 3.2.3
GNU Performance Baseline
[Figure: two bar charts, "Small Problems" and "Large Problems", showing MFLOPS for the FFT, SOR, MC, MV, and LU kernels and the composite (Comp.) score; series: Default, Optimized]
Aggressive optimization significantly improves performance relative to the default optimization level. The following gcc options were used to establish baseline performance: -O3 -march=nocona -ffast-math -mfpmath=sse
Intel C++ Compiler for Linux
Performance
• Automatic vectorization
• Streaming SIMD Extensions 3
• IPO and PGO
• Automatic parallelization and OpenMP support
• Automatic CPU dispatch
• Much more...
Compatibility
• Source and object compatible with gcc and g++
• Supports GNU inline ASM
• ANSI/ISO C/C++ standards compliance
• Conforms to the C++ ABI standard
• Integrated with the Eclipse IDE
Tuning SciMark 2.0 with the Intel Compiler
[Figure: two bar charts, "Small Problems" and "Large Problems", showing MFLOPS for the FFT, SOR, MC, MV, and LU kernels and the composite (Comp.) score; series: GNU, Intel]
The Intel C++ Compiler for Linux improves SciMark 2.0 performance relative to the GNU baseline. Intel compiler options: -O3 -xP -ipo -fno-alias.
Intel® Math Kernel Library Contents: BLAS
BLAS (Basic Linear Algebra Subprograms)
Level 1 BLAS – vector-vector operations
• 15 function types
• 48 functions
Level 2 BLAS – matrix-vector operations
• 26 function types
• 66 functions
Level 3 BLAS – matrix-matrix operations
• 9 function types
• 30 functions
Extended BLAS – level 1 BLAS for sparse vectors
• 8 function types
• 24 functions
Intel® Math Kernel Library Contents: LAPACK
LAPACK (linear algebra package)
• Solvers and eigensolvers. Many hundreds of routines total!
• There are more than 1000 total user-callable and support routines
DFTs (Discrete Fourier transforms)
• Mixed radix, multi-dimensional transforms
• Multithreaded
VML (Vector Math Library)
• Set of vectorized transcendental functions
• Most of the libm functions, but faster
VSL (Vector Statistical Library)
• Set of vectorized random number generators
Intel® Math Kernel Library Contents
BLAS and LAPACK* both have Fortran interfaces
• Legacy of high performance computation
VSL and VML have Fortran and C interfaces
DFTs have Fortran 95 and C interfaces
cblas interface available
• More convenient for a C/C++ programmer
Intel® Math Kernel Library Optimizations in LAPACK*
Most important LAPACK optimizations:
• Threading – effectively uses multiple cores
• Recursive factorization
  • Reduces scalar time (Amdahl’s law: $t = t_{\mathrm{scalar}} + t_{\mathrm{parallel}}/p$)
  • Extends blocking further into the code
No runtime library support required
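A small worked sketch of the Amdahl's law expression quoted above shows why shrinking the scalar part matters. The function names and the sample timings in the note that follows are illustrative assumptions, not measurements from the talk.

```c
#include <assert.h>

/* Run time on p cores under Amdahl's law: t = t_scalar + t_parallel / p. */
double amdahl_time(double t_scalar, double t_parallel, int p)
{
    assert(p >= 1);
    return t_scalar + t_parallel / (double)p;
}

/* Speedup on p cores relative to one core. */
double amdahl_speedup(double t_scalar, double t_parallel, int p)
{
    return amdahl_time(t_scalar, t_parallel, 1) /
           amdahl_time(t_scalar, t_parallel, p);
}
```

With 1 time unit of scalar work and 9 units of parallel work, four cores give a speedup of about 3.08; if the scalar part is halved to 0.5 units (and the parallel part is 9.5), the speedup rises to about 3.48.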
Tuning the SciMark 2.0 LU Kernel
Replacing the SciMark 2.0 LU kernel with the LAPACK dgetrf function requires attention to detail:
• SciMark 2.0 is written in C
  • LAPACK defines a Fortran interface
• C is call-by-value
  • Fortran is call-by-reference
• C uses row-major ordering
  • Fortran uses column-major ordering
• For best performance, dgetrf requires data to be contiguous in memory
• SciMark 2.0 LU kernel allocates a 2D array as pointers-to-pointers (not necessarily contiguous in memory)
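The layout issues listed above can be handled by copying the pointer-to-pointer matrix into one contiguous column-major buffer before the LAPACK call and copying the result back afterwards. The following is a minimal sketch under that assumption; the helper names to_column_major and from_column_major are hypothetical and are not taken from the SciMark 2.0 or MKL sources.

```c
#include <assert.h>
#include <stdlib.h>

/* Copy an n x n matrix stored as an array of row pointers (C, row-major)
   into one contiguous column-major buffer, the layout a Fortran routine
   expects. The caller frees the returned buffer. Returns NULL on failure. */
double *to_column_major(double **rows, int n)
{
    assert(rows != NULL && n > 0);
    double *a = malloc((size_t)n * (size_t)n * sizeof *a);
    if (a == NULL)
        return NULL;
    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++)
            a[(size_t)j * n + i] = rows[i][j];  /* element (i,j) lands in column j */
    return a;
}

/* Copy a contiguous column-major buffer back into the row-pointer matrix. */
void from_column_major(const double *a, double **rows, int n)
{
    assert(a != NULL && rows != NULL && n > 0);
    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++)
            rows[i][j] = a[(size_t)j * n + i];
}
```

Because Fortran is call-by-reference, the C caller then passes addresses of the scalar arguments, roughly dgetrf(&n, &n, a, &n, ipiv, &info), where ipiv is an integer array of length n supplied by the caller.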
Tuning the SciMark 2.0 LU Kernel
[Figure: bar chart "SciMark 2.0 LU Kernel", MFLOPS (axis 0 to 7000) for the small and large problems; series: GNU baseline, Intel compiler, Intel MKL LAPACK, Intel MKL LAPACK + OpenMP]
The Intel MKL LAPACK significantly improves performance over the original SciMark 2.0 LU source code.
Intel® Math Kernel Library Contents: Discrete Fourier Transforms
One dimensional, two-dimensional, three-dimensional…
Multithreaded
Mixed radix
User-specified scaling, transform sign
Transforms on embedded matrices
Multiple one-dimensional transforms on single call
Strides
C and F90 interfaces; FFTW interface support
Using the Intel® Math Kernel Library DFTs
Basically a 3-step Process
Create a descriptor
Status = DftiCreateDescriptor(MDH, …)
Commit the descriptor (instantiates it)
Status = DftiCommitDescriptor(MDH)
Perform the transform
Status = DftiComputeForward(MDH, X)
Optionally free the descriptor
MDH: MyDescriptorHandle
Tuning the SciMark 2.0 FFT Kernel
#include <mkl.h>

int N = 1024;                           // Size of SciMark 2.0 small FFT problem
double scale = 1.0 / (double)N;

double *x = RandomVector ((2 * N), R);  // SciMark creates a random vector
                                        // of size 2*N to hold real and
                                        // imaginary parts
DFTI_DESCRIPTOR *dftiHandle;            // Structure for MKL DFT descriptor

DftiCreateDescriptor (&dftiHandle,      // Transform descriptor
                      DFTI_DOUBLE,      // Precision
                      DFTI_COMPLEX,     // Complex-to-complex
                      1,                // Number of dimensions
                      N);               // Size of transform

// Apply scaling factor to backward transform
DftiSetValue (dftiHandle, DFTI_BACKWARD_SCALE, scale);

DftiCommitDescriptor (dftiHandle);

DftiComputeForward (dftiHandle, x);     // Apply DFT to array x
DftiComputeBackward (dftiHandle, x);

DftiFreeDescriptor (&dftiHandle);
Tuning the SciMark 2.0 FFT Kernel
[Figure: bar chart "SciMark 2.0 FFT Kernel", MFLOPS (axis 0 to 2000) for the small and large problems; series: GNU baseline, Intel compiler, Intel MKL DFT]
The Intel MKL DFT significantly improves performance over the original SciMark 2.0 FFT source code.
Intel® Math Kernel Library Contents: Vector Math Library (VML)
Vector Math Library: vectorized transcendental functions – like libm but better (faster)
Interface: Have both Fortran and C interfaces
Multiple accuracies
• High accuracy ( < 1 ulp )
• Lower accuracy, faster ( < 4 ulps )
Special value handling √(-a), sin(0), and so on
Error handling – cannot duplicate libm here
VML: Why Does It Matter?
It is important for financial codes (Monte Carlo simulations)
• Exponentials, logarithms
Other scientific codes depend on transcendental functions
Error functions can be big time sinks in some codes
Intel® Math Kernel Library Contents: Vector Statistical Library (VSL)
Set of random number generators (RNGs)
Numerous non-uniform distributions
VML used extensively for transformations
Parallel computation support – some functions
User can supply own BRNG or transformations
Five basic RNGs (BRNGs)
• MCG31, R250, MRG32, MCG59, WH
Non-Uniform RNGs
Gaussian (two methods)
Exponential
Laplace
Weibull
Cauchy
Rayleigh
Lognormal
Gumbel
Using VSL
Basically a 3-step Process
Create a stream pointer
VSLStreamStatePtr stream;
Create a stream
vslNewStream( &stream, VSL_BRNG_MCG31, seed );
Generate a set of random numbers
vsRngUniform( 0, stream, size, out, start, end );
Delete a stream (optional)
vslDeleteStream(&stream);
Calculating Pi by Monte Carlo
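$$\frac{\#\text{ of darts hitting circle}}{\#\text{ of darts in square}} = \frac{\tfrac{1}{4}\pi r^2}{r^2} = \frac{1}{4}\pi$$
$$\pi = 4 \cdot \frac{\#\text{ of darts hitting circle}}{\#\text{ of darts in square}}$$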
Loop I = 1 to N_samples
    x.coor = random [0..1]
    y.coor = random [0..1]
    dist = sqrt (x^2 + y^2)
    if dist <= 1
        hits = hits + 1
Pi = 4 * hits / N_samples
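The pseudocode above translates almost line for line into C. The sketch below is a minimal illustration: next_uniform is a hypothetical stand-in generator (a 64-bit linear congruential generator), not the generator used by SciMark 2.0 or by Intel MKL, and the squared distance is compared against 1 so that no square root is needed.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in generator: a 64-bit linear congruential generator
   that returns a double in [0, 1) built from the top 53 bits of the state. */
static double next_uniform(uint64_t *state)
{
    *state = *state * 6364136223846793005ULL + 1442695040888963407ULL;
    return (double)(*state >> 11) / 9007199254740992.0;  /* divide by 2^53 */
}

/* Estimate pi by throwing n_samples darts at the unit square and counting
   those that land inside the quarter circle of radius 1. */
double estimate_pi(long n_samples, uint64_t seed)
{
    assert(n_samples > 0);
    uint64_t state = seed;
    long hits = 0;
    for (long i = 0; i < n_samples; i++) {
        double x = next_uniform(&state);
        double y = next_uniform(&state);
        if (x * x + y * y <= 1.0)  /* equivalent to sqrt(x^2 + y^2) <= 1 */
            hits++;
    }
    return 4.0 * (double)hits / (double)n_samples;
}
```

The statistical error shrinks only as one over the square root of the sample count, so a million samples typically gives two to three correct digits.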
Tuning the SciMark 2.0 MC Kernel
#include <math.h>   // sqrt
#include <mkl.h>

// BLOCK_SIZE and SEED are assumed to be defined elsewhere in the benchmark source.
double MonteCarlo_integrate (int Num_samples)
{
    int i, j, blocks, under_curve = 0;
    static double rnBuf[2 * BLOCK_SIZE];
    double rnX, rnY;
    VSLStreamStatePtr stream;

    blocks = Num_samples / BLOCK_SIZE;
    vslNewStream (&stream, VSL_BRNG_MCG31, SEED);

    for (i = 0; i < blocks; i++) {
        // Fill the buffer with 2 * BLOCK_SIZE uniform values in [0, 1)
        vdRngUniform (VSL_METHOD_DUNIFORM_STD, stream,
                      (2 * BLOCK_SIZE), rnBuf, 0.0, 1.0);

        // Consume the buffer as BLOCK_SIZE (x, y) pairs
        for (j = 0; j < BLOCK_SIZE; j++) {
            rnX = rnBuf[2*j];
            rnY = rnBuf[2*j+1];
            if (sqrt(rnX*rnX + rnY*rnY) <= 1.0)
                under_curve++;
        }
    }
    vslDeleteStream (&stream);

    return ((double) under_curve / Num_samples) * 4.0;
}
Tuning the SciMark 2.0 MC Kernel
[Figure: bar chart "SciMark 2.0 MC Kernel", MFLOPS (axis 0 to 1000); series: GNU baseline, Intel compiler, Intel MKL VSL, Intel MKL VSL + OpenMP]
The Intel MKL VSL significantly improves performance over the original SciMark 2.0 MC source code.
Best SciMark 2.0 Single Node Performance: Small Problems
[Figure: bar chart of MFLOPS for the FFT, SOR, MC, MV, and LU kernels and the composite (Comp.) score; series: GNU, Intel]
Small Problems (MFLOPS)
         GNU    Intel   Speedup
FFT      510    1817    3.6
SOR      524    1092    2.1
MC       206    1003    4.9
MV       857     832    1.0
LU       884    1827    2.1
Comp.    596    1314    2.2
Best SciMark 2.0 Single Node Performance: Large Problems
[Figure: bar chart of MFLOPS for the FFT, SOR, MC, MV, and LU kernels and the composite (Comp.) score; series: GNU, Intel; the Intel LU bar is labeled 6646]
Large Problems (MFLOPS)
         GNU    Intel   Speedup
FFT       45     600    13.3
SOR      495    1015    2.1
MC       206    1003    4.9
MV       453     457    1.0
LU       392    6646    16.9
Comp.    318    1944    6.1
Intel® Cluster MKL
Intel Cluster MKL is a superset of MKL for solving large linear algebra problems on a cluster
Intel Cluster MKL contains:
• ScaLAPACK (Scalable LAPACK)
• BLACS (Basic Linear Algebra Communication Subprograms)
Supports MPICH and the Intel MPI Library
Data Layout Critical to Parallel Performance
ScaLAPACK uses 2D block-cyclic data distribution
Example layouts of lower triangular matrix for four processes
[Figure: three layouts of a lower triangular matrix over processes 0 to 3: 1D block distribution, 1D block-cyclic distribution, and 2D block-cyclic distribution; load balance ranges from poor to better across the layouts]
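The block-cyclic idea can be sketched as a small ownership calculation: blocks of consecutive indices are dealt round-robin to the processes, first along one dimension and then along both. The function names, the 0-based indexing, and the row-by-row numbering of the process grid below are assumptions made for illustration; they are not ScaLAPACK routines.

```c
#include <assert.h>

/* Owner of global index g (0-based) when blocks of nb consecutive indices
   are dealt round-robin to nprocs processes (1D block-cyclic). */
int block_cyclic_owner(int g, int nb, int nprocs)
{
    assert(g >= 0 && nb > 0 && nprocs > 0);
    return (g / nb) % nprocs;
}

/* Rank of the process owning element (i, j) on a prow x pcol process grid
   numbered row by row, using mb x nb blocks (2D block-cyclic). */
int grid_owner(int i, int j, int mb, int nb, int prow, int pcol)
{
    int r = block_cyclic_owner(i, mb, prow);  /* process row */
    int c = block_cyclic_owner(j, nb, pcol);  /* process column */
    return r * pcol + c;
}
```

Because every process receives blocks from all regions of the matrix, the work left in a triangular or shrinking active submatrix stays spread across the grid, which is the load-balance benefit the slide points to.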
Parallelizing the SciMark 2.0 LU Kernel with Intel® Cluster MKL
1. Initialize the process grid
2. Create a descriptor for each distributed matrix
3. Replace the call to dgetrf with pdgetrf (the ‘p’ is for parallel)
Result: LU factorization of a 40000 x 40000 matrix on an 8-node, dual 3.0 GHz Xeon cluster achieves 46000 MFLOPS.
Performance Libraries: Intel® MKL
What’s Been Covered
Intel® Math Kernel Library is a broad scientific/engineering math library
It is optimized for Intel® processors
It is threaded for effective use on multi-core and SMP machines
The Intel C++ Compiler for Linux improves SciMark 2.0 performance without requiring code modifications
With minor code modifications, Intel MKL dramatically improves the FFT, MC, and LU kernels
Some SciMark 2.0 kernels benefit from parallel computing
Useful Links
Intel Software Products
• http://www.intel.com/software/products/
Intel Software Network
• http://www.intel.com/software/
Intel Software College
• http://www.intel.com/software/college/
SciMark 2.0
• http://math.nist.gov/scimark2/index.html