optimization notice - istep-2015.mc-reg.ru

Copyright © 2015, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others. Optimization Notice


Post on 19-Mar-2022



Smart, Connected Devices are Growing in Complexity and are Everywhere
Increasing the Challenges for System and Embedded Developers

To address these challenges, software developers need tools that:
• Are comprehensive and easy to use
• Quickly help resolve defects in complex systems
• Offer insight into sources of excess power consumption
• Enable and accelerate performance-demanding use cases

Segments shown: Networks & Communication; Transportation; Medical; Industrial; Military, Aerospace, Government; Retail; Imaging; Digital Security Surveillance; Client & Mobile; Cloud / data centers / storage; IoT Devices; Gateways


Deep System-wide Insight for System and Embedded Developers

Accelerate Time to Market

Strengthen System Reliability

Boost Power Efficiency and Performance

Create smarter code — smarter

Intel® System Studio


Intel® System Studio Provides a Comprehensive Suite of Tools
That Provide Deep System-wide Insight for System and Embedded Developers

[Diagram: tool suite and target system]
• Debuggers: Application & System; Debug & Trace
• Analyzers: Power & Performance; Memory & Threading
• Compiler & Libraries: C/C++ Compiler; Image, Signal, Math and Data Processing
• Target system: system and application code running on Linux* (1), Android*, Windows*, FreeBSD* or VxWorks*, on Intel® Architecture-based platforms, with UEFI* (2)
• Connections to the target: JTAG, JTAG over USB, Agent
• Simics* platform simulation

1 Linux*, Embedded Linux, Wind River* Linux*, Yocto Project*
2 UEFI: Unified Extensible Firmware Interface


Intel® System Studio Helps System and Embedded Developers
Address Unique Needs Across Usages and Platforms

• Device Manufacturers: Shorter system bring-up and validation cycles
• System Integrators: Faster software stack integration and optimization
• Embedded Application Developers: Efficiently introduce compelling new device capabilities

Wide-Ranging System and Embedded Platforms: Networks & Communication; Transportation; Medical; Industrial; Military, Aerospace, Government; Retail; Imaging; Digital Security Surveillance; Client & Mobile; Cloud / data centers / storage; IoT Devices; Gateways


OS, IDE Support

• Host OS: developer’s PC
• Target OS: embedded system
• IDE support, tools integrating into: Eclipse*, Wind River* Workbench*, Microsoft* Visual Studio*

Including:
• Intel® System Studio 2016 <Edition> for Linux*
• Intel® System Studio 2016 <Edition> for Windows*
• Intel® System Studio 2016 for FreeBSD*: Intel® C++ Compiler installs on target system; Intel® VTune™ Amplifier installs on Linux* host


Intel® System Studio Helps System and Embedded Developers
Accelerate Time to Market

• 2x Productivity Increase
• 5 minutes vs. 8+ hours Productivity Increase

www.imcorp.com

“The Intel® C++ Compiler, as part of Intel® System Studio for FreeBSD*, is nearly a drop-in replacement for Clang and GCC. Working with a code base of seven million lines, built with Clang and GCC, the effort to integrate Intel System Studio for FreeBSD* took only about three days. This was less than half as long as expected.”

Dell, Eric van Gyzen, Senior Software Development Engineer

“IMCORP pioneers complex signal processing algorithms for power transmission cable diagnostics. Intel® VTune™ Amplifier, as part of Intel® System Studio, allowed us to find critical performance hotspots within 5 minutes that otherwise would take us more than 8 hours.”

IMCORP R&D Software Engineer


Intel® System Studio Helps System and Embedded Developers
Strengthen System Reliability

Code Improvement

“Intel® System Debugger, as part of Intel® System Studio, enabled us to improve sensitive, hardware-dependent code in our industrial automation system software. It helped us to drastically reduce engineering efforts when analyzing processor internal states and execution of time-critical paths in our software.”

Dr. Henning Zabel, Beckhoff Automation


Intel® System Studio Helps System and Embedded Developers
Boost Power Efficiency and Performance

“By using Intel System Studio, we could improve the performance of our Intel® architecture-based network video recorder systems by 50%.”

Cai Jian Feng, Product Line Manager
Zhejiang Dahua Technology Co.

“Intel System Studio drastically improved the user experience of our recently launched Android*-based tablet, Tolino Tab* 8” (optimized for eReading), by a factor of 3x (200ms vs. 500-700ms), which reduced the CPU workload and the resulting power consumption by at least the same factor.”

Dirk Hofmann, Chief Product Owner
Deutsche Telekom

• 50% Better Performance
• 3x Better Power Efficiency
• 4x Better Performance

www.imcorp.com

“Between Intel® System Studio’s compiler optimizations, Intel® Math Kernel Library’s fully featured list of vector operations, and the easy-to-use Intel® Cilk™ Plus implementation, our code has reached its lowest execution time by 4x while maintaining a small footprint.”

IMCORP R&D Software Engineer

Intel® System Studio 2016: Latest Advancements in the New Release


Enable and Optimize Compelling System and Application Usages
Highly Optimized Compilers and Libraries

• Performance gain for embedded applications for Windows*: Intel® C++ Compiler for Windows* vs. Microsoft* Compiler
• Performance gain for embedded applications for Linux*/Android*: Intel® C++ Compiler for Linux*/Android* vs. GCC*
• Performance gain for demanding image, signal, data processing: Intel® C++ Compiler code offload to Intel® Graphics Technology, Intel® Integrated Performance Primitives, Intel® Math Kernel Library

Headline figures shown on the slide: up to 2x, 1.5x, 4x

For tests and configurations see subsequent benchmark slides


What’s included in Intel® System Studio

Composer Edition: Build & Optimize (Systems, Embedded Applications)
• Intel® C++ Compiler, incl. Intel® Graphics Technology offload
• Intel® Integrated Performance Primitives
• Intel® Math Kernel Library
• Intel® Threading Building Blocks
• Intel-enhanced GDB*
• IDE support: Eclipse*-based, Visual Studio*

Professional Edition: Analyze (Performance, Power, Correctness)
• Intel® VTune™ Amplifier: code performance on CPU, time- and event-based
• Intel® Energy Profiler: system-wide power efficiency (wake-up, sleep-state, frequency, temp.)
• Intel® Frame Analyzer: graphics performance (OpenGL ES, DirectX)
• Intel® System Analyzer: CPU/GPU workloads in real time
• Intel® Platform Analyzer: CPU/GPU workloads, offline and detailed
• Intel® Inspector: application robustness, memory leaks

Ultimate Edition: Debug & Trace (System Software)
• Intel® System Debugger: UEFI, OS, drivers through JTAG
• Intel® Debug Extensions for WinDbg*: Windows* stack, WinDbg* over JTAG


Composer Edition: Build & Optimize
Systems, Embedded Applications

Build performance optimized code

• Intel® C++ Compiler: IA-optimized compiler, incl. Intel® Graphics Technology offload
• Intel® Integrated Performance Primitives: IA-optimized libraries for image, signal, data processing
• Intel® Math Kernel Library: 1D, 2D, 3D FFT, and others
• Intel® Threading Building Blocks: threading library, unified templates for Windows, Linux targets
• Intel®-enhanced GDB debugger: application debugger for Linux*, Android*
• Eclipse* IDE, Workbench*, Visual Studio*: IDE support (Eclipse, Workbench for Linux* target OS; Visual Studio for Windows* target OS)

• Great code performance
• Application remote debug for robust code
• Libraries for performance demanding code routines
• Unified threading methodology across target OS platforms
• Integrates into common IDEs


Professional Edition: Analyze
Includes Composer Edition

Analyze performance, power efficiency and code correctness

• Performance: Intel® VTune™ Amplifier (code performance on CPU, time- and event-based); Intel® Frame Analyzer (graphics performance: OpenGL ES, DirectX); Intel® System Analyzer (CPU/GPU workloads in real time); Intel® Platform Analyzer (CPU/GPU workloads, offline and detailed)
• Power: Intel® Energy Profiler (system-wide power efficiency: wake-up, sleep-state, frequency, temp.)
• Correctness: Intel® Inspector (application robustness, memory leaks)

• Workload analysis to understand system behavior
• Code analysis for more responsive systems
• Frame analysis for fast graphics
• System-wide analysis to optimize energy efficiency
• Threading and memory leak analysis to improve system robustness


Intel® GPA - the app to optimize your games!

[Figure: frame-rate callouts (5 fps, 10 fps, 30 fps, 30+ fps, 60+ fps) for Desktop / AIO, High-End GPU and Mainstream Graphics]


Intel® GPA analysis workflow

1. Run with the tool: run the game with Intel® GPA
2. Find your bottlenecks: game with HUD / System Analyzer for real-time in-game analysis and experiments (CPU limited or GPU limited?)
3. Do detailed analysis:
   • Frame Analyzer (capture frame): deep frame-level analysis and experiments
   • Platform Analyzer (capture PA trace): GPU / Gfx API / CPU tasks visual timeline


Analyze Application Performance on Preemptive Real-Time Linux*
Performance Analyzer Supports Real-Time Linux* System Profiling

• Quickly and accurately pinpoint performance hotspots in preemptive Linux* systems
  o Because data collectors can be interrupted at any time by high-priority tasks, precise performance profiling is a challenge in preemptive systems
• Intel® VTune™ Amplifier continues to collect data through low-overhead sampling
• Provides concurrency, waits and locks analysis and context switch information

[Diagram: Intel® VTune™ data collector and embedded real-time applications running on preemptive RT Linux*]


Analyze Application Performance in Virtualized Environments
Performance Analyzer Supports Virtualized Environment Performance Profiling

• Observe and analyze performance behavior of embedded applications running on guest OS instances
• Optimize the performance of multiple OSes and applications in a virtualized environment on a single platform to save hardware cost

[Diagram: three virtual machines (VM1, VM2, VM3) (1), each running a guest OS and an embedded application, on top of a hypervisor]

1 KVM, Wind River* Open Virtualization http://www.windriver.com/announces/open_virtualization/


Repeat: Overview of Remote/Attached Collection Procedure/Architecture for Android*

• Uses “adb” protocol/binary for collection and data transfer (must be in path)
• Flexible collection configuration and control (pause/resume/stop)
• VTune collector binary runs on target and stores result on target
• Transfers data collected remotely back to host automatically, together with stripped application modules for symbol resolution
• Data is opened in GUI and symbols are resolved using modules stored in result dir
• User can specify search dir with separate debug files if needed
• Some collection types require signed drivers accessed from rooted device

[Diagram: on the host, the VTune GUI (GUI collector control) and amplxe-cl control the collection and hold the VTune result; on the target device, amplxe-runss and the driver produce the VTune result; adb carries collection control and the transfer of data/modules between host and target]
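The slide notes that the adb binary "must be in path" for remote collection. As a minimal sketch of how a wrapper script might verify that before starting a collection, the Python snippet below looks the executable up on the search path. The helper names `find_tool` and `adb_ready` are hypothetical and are not part of VTune or the Android SDK; only the standard-library call `shutil.which` is used.

```python
import shutil
from typing import Optional


def find_tool(name: str, search_path: Optional[str] = None) -> Optional[str]:
    """Return the full path of an executable, or None if it is not found.

    search_path=None means "use the PATH environment variable".
    """
    return shutil.which(name, path=search_path)


def adb_ready(search_path: Optional[str] = None) -> bool:
    """True when an executable named 'adb' is found on the search path."""
    return find_tool("adb", search_path) is not None


if __name__ == "__main__":
    print("adb found" if adb_ready() else "adb is not on PATH")
```

Running such a check first gives a clearer error than a failed collection would.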


Increase Power Efficiency
Minimize Wake-ups from Timers and Interrupts

Traditional optimization: Race to Idle
– Perform operation faster, then sleep

Achieved by
– Use of new instructions
– Increased core parallelism
– Use of standard performance optimization tools, like Intel® VTune™ Amplifier for performance analysis

New optimization: Increase uninterrupted idle time
– Use SoC components as needed

Achieved by
– Reducing the frequency of activity
– Consolidating activities
– Running code on the appropriate SoC block
– Turning off components (or system) when not in use
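To illustrate why consolidating activities lengthens uninterrupted idle time, the following sketch counts distinct wake-up instants for a set of periodic timers before and after their start offsets are aligned to a common grid. It is a toy model under stated assumptions: the timer periods, offsets and the 10 ms grid are invented for illustration, and real timer coalescing happens in the OS rather than in application code like this.

```python
def wakeup_instants(timers, horizon_ms):
    """Return the set of instants (ms) at which any timer fires.

    timers: list of (period_ms, offset_ms) pairs.
    Timers firing at the same instant share one wake-up.
    """
    instants = set()
    for period_ms, offset_ms in timers:
        t = offset_ms
        while t < horizon_ms:
            instants.add(t)
            t += period_ms
    return instants


def align_offsets(timers, grid_ms):
    """Round each timer's offset down to a multiple of grid_ms."""
    return [(period, (offset // grid_ms) * grid_ms) for period, offset in timers]


# Hypothetical workload: three 10 ms timers started at different offsets.
timers = [(10, 0), (10, 3), (10, 7)]
before = len(wakeup_instants(timers, 100))
after = len(wakeup_instants(align_offsets(timers, 10), 100))
print(before, after)  # 30 wake-ups before alignment, 10 after
```

The work done is the same in both cases. Only the number of times the CPU has to leave a sleep state changes.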


CPU Sleep States
Flexible C-States to Select Idle Power Level vs. Responsiveness

[Table: core voltage*, core clock, PLL, L1 caches, L2 cache, wake-up time* and idle power* for C0 (active state), C1, C3, C4 and C6; cell values on the slide include active, off, flushed and partial flush]

* Rough approximation


Correlate CPU Frequency, Sleep State, Wake-up Objects, etc...


New Architecture Diagram for GPU analysis


Ultimate Edition: Debug & Trace
Includes Professional Edition

System-wide debug and trace for more robustness

• Intel® System Debugger: system software; UEFI, OS, driver debug and trace through JTAG; Linux*, Android* target OS
• Intel® Debug Extensions for WinDbg*: Windows* stack; WinDbg* over JTAG; Windows* target OS

• Holistic system-wide debug and trace
• For UEFI, OS, drivers, middleware
• Identify tricky bugs faster through event tracing
• Supports a variety of JTAG hardware interfaces
• OS awareness for Linux*, Android*, VxWorks* for more efficient debug cycles
• Full-stack debug for Windows* integrators


Debug vs. Trace

Debug = Program Flow
• Actual source code line
• Variable status
• HW register contents

Trace = System History
• Hardware and software events with time stamps
• Mapping to service routines

Trace precisely shows the history of hardware and software events to identify and isolate complex bugs faster

[Diagram: timelines of hardware events (Bluetooth, GPS) and software events (OS, firmware) for platform software up to a STOP point, contrasting what debug and trace capture over the recording time]
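The idea of trace as a time-stamped system history can be sketched in a few lines of Python: two already-sorted event streams (hardware and software) are merged by time stamp, and the events leading up to a stop point are extracted. This is only an illustration of time-stamp correlation, not how Intel® System Debugger stores or presents trace data. The `Event` type, the event names and the time stamps are all made up.

```python
import heapq
from typing import List, NamedTuple


class Event(NamedTuple):
    timestamp_us: int
    source: str  # "hw" or "sw"
    name: str


def merge_traces(*streams) -> List[Event]:
    """Merge per-source event streams (each sorted by time) into one history."""
    return list(heapq.merge(*streams, key=lambda e: e.timestamp_us))


def history_before(trace: List[Event], stop_us: int, count: int) -> List[Event]:
    """Return the last `count` events at or before the stop point."""
    earlier = [e for e in trace if e.timestamp_us <= stop_us]
    return earlier[-count:]


# Hypothetical events, for illustration only.
hardware = [Event(100, "hw", "gps_fix"), Event(250, "hw", "bluetooth_irq")]
software = [Event(120, "sw", "os_tick"), Event(260, "sw", "firmware_handler")]

trace = merge_traces(hardware, software)
for event in history_before(trace, stop_us=255, count=2):
    print(event.timestamp_us, event.source, event.name)
```

A debugger stopped at 255 µs would show only the current state. The merged history also shows that an OS tick and a Bluetooth interrupt preceded it.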


Quickly Isolate Complex System Issues
Comprehensive System-wide Hardware and Software Event Tracing

• Efficiently pinpoint issues with time-stamp correlated trace information
• Analyze complex interactions between software and hardware

[Screenshot: event trace with time-stamp information; filter dialog to focus on specific events]

Available with 6th generation Intel® Core™ processor family (formerly code-named Skylake)


USB Closed Chassis Debug
Cost-effective access to isolate defects system-wide: low-cost JTAG-based debug and trace solution over USB

[Diagram: JTAG over USB closed chassis debug via the Intel® SVT Closed Chassis Adapter and the USB DbC stack, across the boot phases from CPU reset through BIOS/UEFI to OS/RTOS/kernel/drivers; the USB connection is established as part of UEFI and is OS independent]


System-wide Closed Chassis Debugging
JTAG-based Debug and Trace over Low-cost USB Connection

• Flexibility: alleviates requirement for an accessible hardware JTAG port
• Low-cost: debug over standard USB connection instead of expensive JTAG probe

[Diagram: two setups connecting Intel® System Debugger to the target system. With the Intel® SVT Closed Chassis Adapter (1): debug and trace from CPU reset. With a USB cable carrying JTAG data over the physical USB port: debug and trace OS boot]

(1) SVT = Silicon View Technology; more details: https://designintools.intel.com/product_p/itpxdpsvt.htm

Available with 6th generation Intel® Core™ processor family (formerly code-named Skylake)


Extended Insight into Windows* System to Strengthen Reliability
System Debug and Trace Extensions for Microsoft* WinDbg* Kernel Debugger

• Simplify platform bring-up and Windows* driver validation, now available with Microsoft* WinDbg* over JTAG
• Debug a completely halted Windows* system including drivers and interrupts
• Isolate complex run-time issues faster with Intel® Processor Trace

[Screenshot: JTAG and Intel® Processor Trace enhanced Microsoft* WinDbg* kernel debugger showing Intel® Processor Trace information; diagram layers: hardware, firmware]

Available with 6th generation Intel® Core™ processor family (formerly code-named Skylake)


Effectively Debug Compute Intensive Code Offloaded to Graphics Cores
Debugger for Offloaded Code

• Cooperatively execute compute intensive code across processor and graphics cores
• Use simple compiler directives (#pragma) to mark code for offload
• Debugger now available to debug code running on graphics cores

Hardware: Intel® Core™ Processors and Intel® Xeon® Processors with Intel® HD or Intel® Iris™ Pro Graphics

[Diagram: compiler generated code offloaded from CPU cores to graphics cores; the debug client shows the source code of the application that executes on graphics cores]


Enhanced Developer Productivity
Improved Out-of-the-Box Experience, IDE and Samples Included

• Enhanced out-of-the-box experience
  o Get started without actual target hardware using Wind River* Simics* platform simulation
• Eclipse* IDE included
  o Improved tools integration
• More samples for a quicker start
• Enhanced documentation


Product Line Differences

Intel® INDE – Native C++ Apps
• Editions: Starter, Professional, Ultimate
• Native C++ app development: Windows*, Android*, OS X*
• OpenCL Builder
• Graphics acceleration through media packs

Intel® Parallel Studio XE – High Performance Computing, Cluster
• Editions: Composer, Professional, Cluster
• Native host development: Windows*, Linux*, OS X*
• Native performance profiling, parallel processing and threading analysis
• MPI Library

Intel® System Studio – System Software, Embedded Applications
• Editions: Composer, Professional, Ultimate
• Cross development, multiple embedded OSes, heterogeneous computing
• Remote analysis of power efficiency, performance and code correctness
• System Debug & Trace

Also shown: Intel® Compiler and Libraries


Intel® System Studio 2016 Summary
Deep System-wide Insight for System and Embedded Developers

• Increases performance with expertly optimized compiler and libraries
• Enhances power efficiency and performance with enhanced analyzers
• Eases isolation of complex defects with new debug and trace capabilities
• Extends support to the newest Intel platforms and operating systems
• Improves developer productivity with expanded usability and capabilities

Create smarter code, smarter, with Intel System Studio
Learn more at: http://intel.ly/system-studio


CPU C-States / P-States

P-states (CPU active)
• P0 - CPU active at highest frequency (HFM)
• Pn - CPU active at lowest frequency (LFM)

C-states
• C0 - CPU active (in any P-state)
• C1 - Core clock is off
• C3/C4 - Reduced voltage, partial L2 cache flush
• C6 - Core off, L2 cache flush, state saved to SRAM

The deeper the sleep state, the more power is saved, but the longer it takes to wake up.

[Diagram: P-states P0, P1 ... Pn (CPU active) and C-states C0 to C6 (CPU sleep), with arrows labeled "Higher power" and "Greater latency"]
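The trade-off on this slide (deeper sleep states save more power but take longer to wake up) can be expressed as a small selection rule: pick the lowest-power state whose wake-up latency still fits the latency the workload can tolerate. The sketch below assumes invented numbers throughout. The exit latencies and relative power values are placeholders, not measurements for any Intel processor, and `deepest_allowed` is a hypothetical helper rather than an OS API.

```python
from typing import List, NamedTuple


class CState(NamedTuple):
    name: str
    exit_latency_us: int   # placeholder wake-up latency
    relative_power: float  # placeholder idle power, C0 = 1.0


# Hypothetical values for illustration only.
C_STATES = [
    CState("C0", 0, 1.00),
    CState("C1", 2, 0.60),
    CState("C3", 50, 0.30),
    CState("C6", 200, 0.05),
]


def deepest_allowed(states: List[CState], latency_budget_us: int) -> CState:
    """Pick the lowest-power state whose wake-up latency fits the budget."""
    allowed = [s for s in states if s.exit_latency_us <= latency_budget_us]
    return min(allowed, key=lambda s: s.relative_power)


print(deepest_allowed(C_STATES, 10).name)    # C1
print(deepest_allowed(C_STATES, 1000).name)  # C6
```

A latency-sensitive task with a 10 µs budget stays in a shallow state, while a background task that tolerates a millisecond can use the deepest one.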


Support Newest Platforms
Added Support for New Intel Processors and Target Operating Systems

• Support for recently launched versions of Intel® processors
  o Intel® Atom™ x3 processors, formerly code-named SoFIA
  o Intel® Atom™ x5, x7 processors, formerly code-named Cherry Trail
  o 6th Generation Intel® Core™ processors, formerly code-named Skylake
• Microsoft* Windows* 10
• FreeBSD*

[Items on the slide are tagged "Expanded" or "New"]


Intel® System Studio: Editions, Components, and Operating Systems

Target operating systems, editions, host operating systems and IDEs
• Linux* (1, 2): Composer, Professional and Ultimate Editions; host OS: Linux*, Windows*; IDE: Eclipse*, Workbench*
• Android* (2): Composer, Professional and Ultimate Editions; host OS: Linux*, Windows*; IDE: Eclipse*
• Windows*: Composer, Professional and Ultimate Editions; host OS: Windows*; IDE: Visual Studio*
• VxWorks* (3): VxWorks* Edition; host OS: Linux*, Windows*; IDE: Workbench*
• FreeBSD*: FreeBSD Edition; host OS: Linux*, FreeBSD*; IDE: Eclipse*

Components by category
• Compiler & Libraries: Intel® C++ Compiler; Intel® Integrated Performance Primitives; Intel® Math Kernel Library; Intel® Threading Building Blocks
• System & Application Debuggers: Intel® System Debugger (4, 7); Intel® Debug Extensions for WinDbg* (4); Intel®-enhanced GDB* Application Debugger; Intel® Debugger for Heterogeneous Compute
• Performance, Power & Correctness Analyzers: Intel® VTune™ Amplifier (6); Intel® Energy Profiler; Intel® Inspector; System Analyzer; Platform Analyzer (5); Frame Analyzer (5)

[Table cells marking availability per edition, and the "New" tags, are not shown]

1 Linux*, Embedded Linux, Wind River* Linux*, Yocto Project*
2 Linux* and Android* target support available in a single product
3 Available from Wind River* with VxWorks*
4 Via Intel® ITP-XDP3 probe, OpenOCD*, Intel® SVT Closed Chassis Adapter* and EDKII* for UEFI*
5 Available for Windows* host
6 Also available for OS X* host as a separate download
7 Intel® System Debugger provides VxWorks* OS awareness, available with Ultimate Editions


Compiler options
• Intel System Studio XE 2016: -O3 -ipo -xATOM_SSE4.2 -ansi-alias -prec-div- -static
• GCC 5.1: -m32 -Ofast -mfpmath=sse -flto -march=native -funroll-loops -ffat-lto-objects (-m64 for Coremark Intel64)

Hardware configurations
• Intel(R) Atom(TM) CPU C2750 @ 2.41GHz, 32 GB RAM
• Red Hat Enterprise Linux Server release 7.0 (Maipo), kernel 3.10.0-123.el7.x86_64

Benchmarks
• EEMBC sources have been taken from common repository for GCC, LLVM, IC teams
• Metric is iterations per second, scaled according to EEMBC publishing requirements (http://eembc.org/benchmark/index.php)

Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products.

Intel® C++ Compiler Performance on EEMBC* Benchmarks

AutoBench 1.1, TeleBench 1.1, DenBench 2.0, IpMark, and TCPMark Benchmarks (EEMBC) - Best Option Set
Performance gain (higher is better):

Benchmark (geomean)   GCC 5.1   Intel System Studio 2016
AutoBench 1.1         100%      132%
TeleBench 1.1         100%      208%
DenBench 2.0          100%      156%
IpMark                100%      127%
TCPMark               100%      104%
EEMBC                 100%      141%
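The chart on this slide lists five per-suite gains (132%, 208%, 156%, 127%, 104%) and an overall EEMBC figure of 141%. As a quick consistency check, the sketch below computes the geometric mean of the five per-suite values. It assumes the overall figure is the unweighted geometric mean of those five, and that the values pair with the suites in the order they appear on the chart axis. The pairing does not affect the result, since a geometric mean is order-independent.

```python
from statistics import geometric_mean

# Per-suite gains relative to GCC 5.1 = 100%, as read from the chart.
suite_gains_percent = {
    "AutoBench 1.1": 132,
    "TeleBench 1.1": 208,
    "DenBench 2.0": 156,
    "IpMark": 127,
    "TCPMark": 104,
}

overall = geometric_mean(list(suite_gains_percent.values()))
print(f"EEMBC geomean: {overall:.1f}%")
```

Under these assumptions the result rounds to 141%, matching the overall bar in the chart.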


Intel® C++ Compiler Benchmarks on Windows* Targets
Estimated Performance Difference

Compilers
• Intel® C++ Compiler for IA-32 applications, Version 16.0 Build 20150423
• Intel® C++ Intel(R) 64 Compiler for Intel(R) 64 applications, Version 16.0 Build 20150423
• Microsoft* C/C++ Optimizing Compiler Version 18.00.21005.1 for x86
• Microsoft* C/C++ Optimizing Compiler Version 18.00.21005.1 for x64

Platform: Microsoft* Windows 8.1 Enterprise
Hardware: Intel(R) Core(TM) i7-4770K CPU @ 3.50GHz, HyperThreading is off
RAM: 16GB
HDD: 1TB

Benchmarks
• CINT2006 geomean
• CFP2006 C/C++ geomean
• SPEC2006 C/C++ geomean

NOTE: 32-bit compilers for CINT2006 in RATE mode were used, as in SPEC publications

Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance test, such as CINT2006*, CFP2006 C/C++*, SPEC2006 C/C++*, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products. Benchmark Source: Intel Corporation. For more complete information about compiler optimizations, see our Optimization Notice.

CINT2006, CFP2006 C/C++, SPEC2006 C/C++ RATE Benchmarks - Best Option Set
Performance gain (higher is better):

Benchmark                 MS Visual Studio* 2013   Intel C++ Compiler 2016
CINT2006 Geomean          100%                     151%
CFP2006 C/C++ Geomean     100%                     129%
SPEC2006 C/C++ Geomean    100%                     143%


Intel® C++ Compiler Benchmark – Code Offload to Intel® Graphics Technology

Platform: Microsoft* Windows 8/Server
Hardware: Intel® Core™ i7-4770 CPU @ 3.50GHz
Graphics: Intel® HD Graphics 4600
RAM: 16GB
HDD: 1TB

[Chart: Performance When Offloaded to Graphics Cores. Speed-up (0.0 to 3.5) versus CPU share (0% to 100%, where 100% is all CPU) for NBodyLocals, MoonLight, MoonLight_struct, MatmultLocalsAN, BoxBlur_Vec, BoxBlurFloat, BoxBlurFloatLocal, FDTD_3d, FishEye, Mandelbrot, Mandelbrot_bw, MatmultLocalsAN_d, NBody and the geomean]

Performance gain up to 3x by offloading code to graphics cores, at a load balance of 30% on CPU and 70% on graphics cores


Intel® Integrated Performance Primitives (Intel® IPP) – Signal Processing

[Chart: Intel® IPP signal processing functions, speedup (0% to 800%) for Single-Rate FIR, Linear Convolution, Cross-Correlation and Forward FFT, comparing Intel® SSE2, Intel® SSE4.x and Intel® AVX2]

System configuration: Intel® Integrated Performance Primitives (Intel® IPP) 9.0. Hardware: Intel® Core™ i5-4300U processor, 3 MB Intel® Smart Cache, 8 GB RAM. Operating system: Windows* 8 64-bit, single-threaded benchmark

Optimization Notice: Intel's compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to the applicable product User and Reference Guides for more information regarding the specific instruction sets covered by this notice. Notice revision #20110804


[Figure: Intel® IPP data compression performance boost by using Intel® IPP vs. open source libraries. Y-axis: 0% to 200%. Libraries: BZIP2 v. 1.0.6; ZLIB v. 1.2.8, level 6. Series: Intel® Xeon® E5-2680, Intel® Core™ i7-4770K, Intel® Quark™ SoC X1000]

System configuration: Intel® Integrated Performance Primitives (Intel® IPP) 9.0.
Hardware:
Intel® Xeon® E5-2680, 20 MB cache, 2.7 GHz, 64 GB RAM. OS: RH EL Server 6.4, 64-bit
Intel® Core™ i7-4770K, 8 MB cache, 3.9 GHz, 32 GB RAM. OS: RH EL Server 6.4, 64-bit
Intel® Quark™ SoC X1000, 16 KB cache, 400 MHz, 2 GB RAM. OS: Yocto Linux 3.8.7, 32-bit
Data sets: Calgary and Canterbury corpuses


Intel® Integrated Performance Primitives (Intel® IPP) – Data Compression
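To make the open source baseline concrete, here is a minimal sketch that times zlib at compression level 6 using Python's standard `zlib` module. It does not call Intel® IPP, and it uses a synthetic buffer rather than the Calgary and Canterbury data sets; `zlib_throughput` and the sample data are hypothetical and exist only for illustration.

```python
import time
import zlib


def zlib_throughput(data: bytes, level: int = 6, repeats: int = 5):
    """Return (MB/s, compression ratio) for zlib at the given level.

    The best of `repeats` timed runs is used to reduce timer noise.
    """
    best = float("inf")
    compressed = b""
    for _ in range(repeats):
        start = time.perf_counter()
        compressed = zlib.compress(data, level)
        best = min(best, time.perf_counter() - start)
    # Round-trip check so a throughput number is never reported for bad output.
    if zlib.decompress(compressed) != data:
        raise RuntimeError("round trip failed")
    best = max(best, 1e-9)  # guard against a zero reading from the timer
    return len(data) / best / 1e6, len(data) / len(compressed)


if __name__ == "__main__":
    # Synthetic, highly repetitive text; real corpora compress far less well.
    sample = b"Smart, connected devices are growing in complexity. " * 20000
    mb_per_s, ratio = zlib_throughput(sample)
    print(f"zlib level 6: {mb_per_s:.1f} MB/s, ratio {ratio:.1f}:1")
```

A percentage boost like the one charted would then be the ratio of two such throughput figures measured on the same data and machine, one per library.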


[Figure: Intel® IPP data decompression performance boost by using Intel® IPP vs. open source libraries. Y-axis: 0% to 200%. Libraries: BZIP2 v. 1.0.6; ZLIB v. 1.2.8, level 6. Series: Intel® Xeon® E5-2680, Intel® Core™ i7-4770K, Intel® Pentium® J2900]

System configuration: Intel® Integrated Performance Primitives (Intel® IPP) 9.0.
Hardware:
Intel® Xeon® E5-2680, 20 MB cache, 2.7 GHz, 64 GB RAM. OS: RH EL Server 6.4, 64-bit
Intel® Core™ i7-4770K, 8 MB cache, 3.9 GHz, 32 GB RAM. OS: RH EL Server 6.4, 64-bit
Intel® Pentium® J2900, 2 MB cache, 2.7 GHz, 8 GB RAM. OS: RH EL Server 7.0, 64-bit
Data sets: Calgary and Canterbury corpuses


Intel® Integrated Performance Primitives (Intel® IPP) – Data Decompression


[Figure: Intel® IPP Cryptography Function Performance. Y-axis: GBytes/s (0.00 to 6.00). Functions: AES-128-ECB Encrypt, AES-128-CBC Encrypt, AES-128-CBC Decrypt, SHA-1, SHA-256. Series: Intel® IPP, OpenSSL 1.0.2c]

System configuration: Intel® Integrated Performance Primitives (Intel® IPP) 9.0. Hardware: Intel® Core™ i7-4770K, 8 MB cache, 3.9 GHz, 32 GB RAM. OS: RH EL Server 6.4, 64-bit


Intel® Integrated Performance Primitives (Intel® IPP) – Cryptography


[Figure: Intel® IPP Image Processing Filters Performance In Multi-Thread Mode. Y-axis: frames per second, 1920x1080x1 (0 to 4000). Filters: Box Filter, Median Filter. Series: 1 thread, 2 threads, 4 threads]

System configuration: Intel® Integrated Performance Primitives (Intel® IPP) 9.0. Hardware: Intel® Core™ i5-4300U processor, 3 MB Intel® Smart Cache, 8 GB RAM. Operating system: Windows* 8 64-bit, multi-threaded benchmark


Intel® Integrated Performance Primitives (Intel® IPP) – Image Processing


FFT Performance Boost using Intel® MKL vs. FFTW*
Single Precision Complex 2D and 3D FFT on Intel® Core™ Processor i7-6700K


Configuration Info - Versions: Intel® Math Kernel Library (Intel® MKL) 11.3, FFTW* 3.3.4; Hardware: Intel® Core™ Processor i7-6700K, Quad-core CPU (8MB LLC, 4.0 GHz), 32GB of RAM; Operating System: RHEL 6.5 x86_64;

Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products. * Other brands and names are the property of their respective owners. Benchmark Source: Intel Corporation


Single Precision Complex 2D & 3D FFT Performance Boost using Intel® MKL vs. FFTW*

[Figure: 3D FFT. Y-axis: performance (GFlops), 0 to 100. X-axis: transform size (power of two). Series: Intel MKL with 1, 2, and 4 threads; FFTW with 1, 2, and 4 threads]

[Figure: 2D FFT. Y-axis: performance (GFlops), 0 to 120. X-axis: transform size (power of two). Series: Intel MKL with 1, 2, and 4 threads; FFTW with 1, 2, and 4 threads]
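The charts report GFlops rather than run time. FFT benchmarks, including FFTW's own benchFFT, conventionally derive that number from a nominal count of 5·N·log2(N) floating-point operations for a complex transform of N total points. That these charts use the same convention is an assumption. A minimal sketch of the conversion follows; the function name `fft_gflops` and the timing value are hypothetical.

```python
import math


def fft_gflops(dims, seconds):
    """Nominal GFlops for one complex FFT over a grid with the given dimensions.

    Uses the conventional 5 * N * log2(N) operation count, where N is the
    total number of points (the product of all dimensions). This is a
    normalized score, not a count of instructions actually executed.
    """
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    n = math.prod(dims)  # requires Python 3.8 or later
    return 5.0 * n * math.log2(n) / seconds / 1e9


if __name__ == "__main__":
    # Hypothetical timing: a 1024 x 1024 transform completing in 1 ms.
    print(f"{fft_gflops((1024, 1024), 1e-3):.1f} GFlops")
```

Because the operation count is fixed by the transform size, two libraries compared at the same size differ in GFlops by exactly the inverse ratio of their run times.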


Simulate processors, peripherals, and networks.

Simulate any size target system.

Run unmodified target binaries.

Use unique and powerful debugging techniques.

Record, save, and restore your simulation runs.

Wind River Simics Simulation Technology
Simics is a full system simulator used by software developers to simulate the hardware of complex electronic systems.

[Figure: diagram labeled "SIMICS" and "Target Hardware"]

Access – eliminate hardware availability bottlenecks from the software development process

Collaboration – share, communicate and exchange with saved simulation environments

Automation – automate any debug, test, profile or tracing function

A free trial simulator is provided with Intel System Studio. Simics is licensed and sold by Wind River (an Intel company).


Legal Disclaimer & Optimization Notice

INFORMATION IN THIS DOCUMENT IS PROVIDED “AS IS”. NO LICENSE, EXPRESS OR IMPLIED, BY ESTOPPEL OR OTHERWISE, TO ANY INTELLECTUAL PROPERTY RIGHTS IS GRANTED BY THIS DOCUMENT. INTEL ASSUMES NO LIABILITY WHATSOEVER AND INTEL DISCLAIMS ANY EXPRESS OR IMPLIED WARRANTY, RELATING TO THIS INFORMATION INCLUDING LIABILITY OR WARRANTIES RELATING TO FITNESS FOR A PARTICULAR PURPOSE, MERCHANTABILITY, OR INFRINGEMENT OF ANY PATENT, COPYRIGHT OR OTHER INTELLECTUAL PROPERTY RIGHT.

Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products.

Copyright © 2015, Intel Corporation. All rights reserved. Intel, Pentium, Xeon, Xeon Phi, Core, VTune, Cilk, and the Intel logo are trademarks of Intel Corporation in the U.S. and other countries.

Optimization Notice

Intel’s compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to the applicable product User and Reference Guides for more information regarding the specific instruction sets covered by this notice.

Notice revision #20110804
