Developing Yocto Linux* targeted Apps using Intel Dev Tools

Feilong Huang

Technical Consulting Engineer

DPD/SSG


Legal Information INFORMATION IN THIS DOCUMENT IS PROVIDED IN CONNECTION WITH INTEL PRODUCTS. NO LICENSE, EXPRESS OR IMPLIED, BY ESTOPPEL OR OTHERWISE, TO ANY INTELLECTUAL PROPERTY RIGHTS IS GRANTED BY THIS DOCUMENT. EXCEPT AS PROVIDED IN INTEL'S TERMS AND CONDITIONS OF SALE FOR SUCH PRODUCTS, INTEL ASSUMES NO LIABILITY WHATSOEVER AND INTEL DISCLAIMS ANY EXPRESS OR IMPLIED WARRANTY, RELATING TO SALE AND/OR USE OF INTEL PRODUCTS INCLUDING LIABILITY OR WARRANTIES RELATING TO FITNESS FOR A PARTICULAR PURPOSE, MERCHANTABILITY, OR INFRINGEMENT OF ANY PATENT, COPYRIGHT OR OTHER INTELLECTUAL PROPERTY RIGHT.

A "Mission Critical Application" is any application in which failure of the Intel Product could result, directly or indirectly, in personal injury or death. SHOULD YOU PURCHASE OR USE INTEL'S PRODUCTS FOR ANY SUCH MISSION CRITICAL APPLICATION, YOU SHALL INDEMNIFY AND HOLD INTEL AND ITS SUBSIDIARIES, SUBCONTRACTORS AND AFFILIATES, AND THE DIRECTORS, OFFICERS, AND EMPLOYEES OF EACH, HARMLESS AGAINST ALL CLAIMS COSTS, DAMAGES, AND EXPENSES AND REASONABLE ATTORNEYS' FEES ARISING OUT OF, DIRECTLY OR INDIRECTLY, ANY CLAIM OF PRODUCT LIABILITY, PERSONAL INJURY, OR DEATH ARISING IN ANY WAY OUT OF SUCH MISSION CRITICAL APPLICATION, WHETHER OR NOT INTEL OR ITS SUBCONTRACTOR WAS NEGLIGENT IN THE DESIGN, MANUFACTURE, OR WARNING OF THE INTEL PRODUCT OR ANY OF ITS PARTS.

Intel may make changes to specifications and product descriptions at any time, without notice. Designers must not rely on the absence or characteristics of any features or instructions marked "reserved" or "undefined". Intel reserves these for future definition and shall have no responsibility whatsoever for conflicts or incompatibilities arising from future changes to them. The information here is subject to change without notice. Do not finalize a design with this information. The products described in this document may contain design defects or errors known as errata which may cause the product to deviate from published specifications. Current characterized errata are available on request. Contact your local Intel sales office or your distributor to obtain the latest specifications and before placing your product order. Copies of documents which have an order number and are referenced in this document, or other Intel literature, may be obtained by calling 1-800-548-4725, or go to: http://www.intel.com/design/literature.htm

Intel, VTune, Cilk, Atom and the Intel logo are trademarks of Intel Corporation in the U.S. and other countries.

*Other names and brands may be claimed as the property of others

Copyright© 2011 Intel Corporation. All rights reserved.


Executive Summary

• Yocto Linux* requires performance tools in addition to baseline development tools

• Intel® Embedded Software Development Tool Suite 2.3 for Intel® Atom™ Processor integrates the Intel® Compiler with the Yocto* Application Development Toolkit 1.1 (ADT)

• It provides the Sampling Collector for Intel® VTune™ Amplifier XE, validated for use on Yocto Linux*

• System and Application Debuggers permit debugging of all layers of the software stack


Agenda

• Intel® Embedded Software Development Tool Suite for Intel® Atom™ Processor

• Integrating Intel® C++ Compiler into ADT

• Using Intel® VTune™ Amplifier XE with Yocto Linux*

• Debugging Yocto Linux* Applications

• Summary


The Development Cycle

Intel® C++ Compiler

• SSSE3 Vectorization

• In-order scheduler

• Memory access optimization

Intel® Integrated Performance Primitives

• Intel® Application Debugger

• Intel® JTAG Debugger

• Intel® Flash Memory Tool

• Intel® VTune™ Amplifier XE

• Sampling Collector for Intel® VTune™ Amplifier XE (SEP)

• Intel® Debuggers

Tool Suite for all Phases of Development: Design through Validation


Intel® Embedded Software Development Tool Suite for Intel® Atom™ Processor

Target OS: Linux*

Kernel debug; On-Chip trace & SMP run control

Identify optimization opportunities

Thread Specific Run Control & Thread Grouping

Broad Processor coverage CE4xxx, Z6xx, E6xx series


Intel® C++ Compiler & Intel® Atom™ Processor

• Optimization switch -xSSE3_ATOM
– In-order scheduler
– IDIV DIVB expansion
– Arithmetic operations feeding addresses turned into LEAs
– All stack adjusts done using LEAs
– Support for the movbe instruction
– Intel® Streaming SIMD Extensions 3 (SSE3) instruction support

• Compiler-based vectorization and automatic processor dispatch -ax[?]
– Single executable containing code optimized for Intel® Atom™ processors and generic code that runs on all IA32 processors
– For each target processor it uses processor-specific instructions and vectorization
– Low overhead, some increase in code size

Dedicated performance optimizations for the Intel® Atom™ Processor


Build Support for Cross-Build Environments

Embedded cross-build environments for Linux* tend to vary in:

• Preprocessor defines

• GNU tools paths and names

• GNU startup files, C++ includes/runtime

• Location of target system headers and libraries

• The list of default libraries

Intel® C++ Compiler supports:

• --sysroot

• Chroot/jailroot installs

• Detailed build environment definition via -platform=<name> (where name is the name of a user-editable environment file)

• Tested against Poky Linux*, MADDE*, CE Linux* SDK

• Yocto Linux* Application Development Toolkit support

The compiler is flexible in meeting embedded build environment needs


Common Optimization Switches

                                                 Windows*      Linux*
Disable optimization                             /Od           -O0
Optimize for speed (no code size increase)       /O1           -O1
Optimize for speed (default)                     /O2           -O2
High-level loop optimization                     /O3           -O3
Create symbols for debugging                     /Zi           -g
Multi-file inter-procedural optimization         /Qipo         -ipo
Profile guided optimization (multi-step build)   /Qprof-gen    -prof-gen
                                                 /Qprof-use    -prof-use
Optimize for speed across the entire program     /fast         -fast
OpenMP 3.0 support                               /Qopenmp      -openmp
Automatic parallelization                        /Qparallel    -parallel

/fast is the same as: /O3 /Qipo /Qprec-div- /QxHost
-fast is the same as: -ipo -O3 -no-prec-div -static -xHost


High-Level Optimizer (HLO)

• Compiler switches: /O2, /O3 (Windows*), -O2, -O3 (Linux*)

• Loop-level optimizations
– Loop unrolling, cache blocking, prefetching

• More aggressive dependency analysis
– Determines whether or not it's safe to reorder or parallelize statements

• Scalar replacement
– Goal is to reduce memory references by replacing them with register references
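To make scalar replacement concrete, here is a minimal hand-written sketch of what the transformation amounts to. It is an illustration only, not output of the Intel® compiler; the function names sum_naive and sum_scalar_replaced are hypothetical.

```c
#include <stddef.h>

/* Before: the running total lives in memory. Unless the compiler can
   prove that out does not alias a, every iteration has to load and
   store out[0]. */
void sum_naive(const float *a, size_t n, float *out) {
    out[0] = 0.0f;
    for (size_t i = 0; i < n; i++)
        out[0] += a[i];
}

/* After scalar replacement: the total is kept in a local variable that
   can stay in a register, and memory is written once at the end.
   This is only equivalent to sum_naive when out does not point into a,
   which is what the dependency analysis has to establish. */
void sum_scalar_replaced(const float *a, size_t n, float *out) {
    float acc = 0.0f;
    for (size_t i = 0; i < n; i++)
        acc += a[i];
    out[0] = acc;
}
```

Both functions return the same total for non-overlapping arguments; the second simply spells out by hand what the optimizer tries to derive on its own at the higher optimization levels.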


SIMD: Single Instruction Multiple Data

• Scalar processing

– traditional mode

– one operation produces one result

• SIMD processing

– one instruction produces multiple results

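[Figure: scalar addition, X + Y, where one operation produces one result; SIMD addition, (x3, x2, x1, x0) + (y3, y2, y1, y0) = (x3+y3, x2+y2, x1+y1, x0+y0), where one instruction produces four results]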
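The difference between the two modes can be sketched in plain C. The names add_scalar and add_4lanes are hypothetical, and whether a compiler actually turns the four-lane loop into a single SIMD instruction depends on the compiler and the switches used; the code only shows the shape of the computation.

```c
#include <stddef.h>

/* Scalar processing: one operation produces one result. */
float add_scalar(float x, float y) {
    return x + y;
}

/* Four independent lane additions, (x3..x0) + (y3..y0). The iterations
   do not depend on each other, which is what allows a vectorizing
   compiler to map the loop to one 4-wide SIMD add. */
void add_4lanes(const float x[4], const float y[4], float out[4]) {
    for (size_t i = 0; i < 4; i++)
        out[i] = x[i] + y[i];
}
```

Calling add_4lanes with x = {0, 1, 2, 3} and y = {10, 20, 30, 40} fills out with {10, 21, 32, 43}, the same values four calls to add_scalar would produce.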


Interprocedural Optimizations (IPO)

• Interprocedural optimization performs a static, topological analysis of your application

• ip: Enables interprocedural optimizations within the current source file compilation

• ipo: Enables interprocedural optimizations across files
– Can inline functions in separate files
– Applications with many small utility functions benefit especially from IPO

Enabled optimizations:

• Procedure inlining (reduced function call overhead)

• Interprocedural dead code elimination, constant propagation and procedure reordering

• Enhances optimization when used in combination with other compiler features

Windows*    Linux*
/Qip        -ip
/Qipo       -ipo
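As an illustration of the kind of code that benefits, the sketch below shows a small utility function called from a hot loop. The names clamp_to_byte and sum_clamped are hypothetical, and both functions are shown in one listing for brevity; the comments mark where they would sit in a multi-file project.

```c
#include <stddef.h>

/* Imagine this small utility function living in util.c. */
int clamp_to_byte(int v) {
    if (v < 0)
        return 0;
    if (v > 255)
        return 255;
    return v;
}

/* Imagine this caller living in main.c. When the two files are compiled
   separately, the call below stays a real function call on every
   iteration. Cross-file interprocedural optimization gives the compiler
   the chance to inline clamp_to_byte into the loop. */
long sum_clamped(const int *v, size_t n) {
    long total = 0;
    for (size_t i = 0; i < n; i++)
        total += clamp_to_byte(v[i]);
    return total;
}
```

The result is the same with or without inlining; only the call overhead inside the loop changes.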


Interprocedural Optimizations (IPO)

Compiling (Pass 1), Linux*: icc -c -ipo main.c func1.c func2.c → mock object

Linking (Pass 2), Linux*: icc -ipo main.o func1.o func2.o → executable


Interprocedural Optimizations

Without IPO: file1.c, file2.c, file3.c and file4.c are each compiled and optimized separately

With IPO: file1.c, file2.c, file3.c and file4.c are compiled and optimized together

-ip: Only between modules of one source file

-ipo: Modules of multiple files/whole application


Profile-Guided Optimizations (PGO)

• Static analysis leaves many questions open for the optimizer, like:
– How often is x > y?
– What is the size of count?
– Which code is touched how often?

if (x > y) do_this(); else do_that();

for (i = 0; i < count; ++i)
    do_work();

• Use execution-time feedback to guide (final) optimization

• Enhancements with PGO:
– More accurate branch prediction
– Basic block movement to improve instruction cache behavior
– Better decision of functions to inline (helps IPO)
– Can optimize function ordering
– Switch-statement optimization
– Better vectorization decisions


PGO Usage: Three Step Process

Step 1: Compile + link to add instrumentation
icc -prof-gen prog.c
Result: instrumented executable prog.exe

Step 2: Execute instrumented program
prog.exe (on a typical dataset)
Result: dynamic profile 12345678.dyn

Step 3: Compile + link using feedback
icc -prof-use prog.c
Result: merged .dyn files (pgopti.dpi) and optimized executable prog.exe
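A minimal sketch of code for which the three steps pay off is shown below. The function count_above is hypothetical; it stands for any loop whose branch behavior depends on the input data and cannot be determined by static analysis.

```c
#include <stddef.h>

/* How often the comparison is true depends entirely on the data.
   Running the instrumented build on a typical dataset records that
   frequency, and the feedback build can then lay out the code for the
   common case. */
size_t count_above(const int *v, size_t n, int threshold) {
    size_t hits = 0;
    for (size_t i = 0; i < n; i++) {
        if (v[i] > threshold)
            hits++;
    }
    return hits;
}
```

The function computes the same result in every build; the profile only influences how the compiler arranges and optimizes the generated code, so the dataset used in Step 2 should resemble real input.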


Integrating Intel® C++ Compiler into ADT

• Fully automated via toolsuite installation script

• Warning message if insufficient access rights (root, sudo)

• Warning message if ADT installation in /opt/poky/1.1/ not present:

“Yocto* ADT has not been detected or its directory is not writable”

“The Yocto Project* Application Development Toolkit has not been detected on your system or its directory “/opt/poky/1.1” is not writable. This toolkit is required if you want to use the Intel® Composer XE for building Yocto Project* targeted applications. For automatic Intel® Composer XE integration with the Application Development Toolkit during installation, please make sure that the toolkit is installed and its directory is writable, and then re-check the prerequisites. For manual integration after installation, please consult the product Release Notes.”


Integrating Intel® C++ Compiler into ADT

/opt/intel/atom/composerxe/bin/ia32/yocto.env

*platform: yocto
*yocto_sdk_toolchain: %$(YOCTO_TOOLCHAIN)
*sysroot: %$(YOCTO_SYSROOT)
*target_root: %(sysroot)
*gcc_install: %(sysroot)/usr/lib/gcc/i586-poky-linux/4.5.1
*intel_include: %(intel_root)/../compiler/include
*intel_lib: %(intel_root)/../compiler/lib/ia32
*exec_path: %(yocto_sdk_toolchain)/i586-poky-linux
*exec_prefix: i586-poky-linux-
*gxx_include: %(sysroot)/usr/include/c++/i586-poky-linux/bits
*link_lib_path: %(intel_lib)%(path_separator)%(gcc_install)%(path_separator)%(sysroot)/lib%(path_separator)%(sysroot)/usr/lib:%(sysroot)/usr/lib/i586-poky-linux/4.5.1
*link_start_files: %(sysroot)/usr/lib/i586-poky-linux/4.5.1/crtbegin.o %(sysroot)/usr/lib/crti.o %(sysroot)/usr/lib/crtn.o %(sysroot)/usr/lib/crt1.o
*link_end_files: %(sysroot)/usr/lib/i586-poky-linux/4.5.1/crtend.o


Integrating Intel® C++ Compiler into ADT (cont.)

/opt/intel/atom/composerxe/bin/ia32/yocto.env

*link_default_libs: %{!static?%{i-dynamic|shared?-Bdynamic;-Bstatic}} -lsvml -limf \
    %{!static?-Bdynamic} -lm \
    %{!static?%{i-dynamic|shared?-Bdynamic;-Bstatic}} -lipgo -ldecimal \
    %{i_cxxlink? \
    %{cxxlib-gcc? \
    %{!static?%{i-static|static-libcxa?-Bstatic;-Bdynamic}} -lcxaguard}} \
    %{openmp-stubs?%{!static?%{i-static?-Bstatic;-Bdynamic}} -lompstub} \
    %{!static?%{i-dynamic|shared?-Bdynamic;-Bstatic}} %{pic-libirc?-lirc_pic;-lirc} \
    %{!static?-Bdynamic} -lc \
    %{cxxlib-gcc? \
    %{!cxxlib-nostd?%{!static?-Bdynamic} -lstdc++;%{!static?-Bdynamic} -lsupc++} \
    %{static|static-libgcc? \
    %{!static?-Bstatic} -lgcc -lgcc_eh; \
    %{!shared?%{!static?%{static-libgcc?-Bstatic;-Bdynamic}} -lgcc -lgcc_s}} \
    %{!static?-Bdynamic} -ldl -lc}


Integrating Intel® C++ Compiler into ADT (cont.)

/opt/poky/1.1/environment-setup-i586-poky-linux

export PATH=/opt/poky/1.1/sysroots/i686-pokysdk-linux/usr/bin:/opt/poky/1.0/sysroots/i686-pokysdk-linux/usr/bin/i586-poky-linux:$PATH
export PKG_CONFIG_SYSROOT_DIR=/home/roboima/test-yocto/x86
export PKG_CONFIG_PATH=/home/roboima/test-yocto/x86/usr/lib/pkgconfig
export CONFIG_SITE=/opt/poky/1.1/site-config-i586-poky-linux
export CC=icc
export CXX=icpc
export GDB=i586-poky-linux-gdb
export TARGET_PREFIX=i586-poky-linux-
export CONFIGURE_FLAGS="--target=i586-poky-linux --host=i586-poky-linux --build=i686-linux"
export CFLAGS="-march=i586 -platform=yocto"
export CXXFLAGS="-march=i586"
export LDFLAGS="--sysroot=/home/roboima/test-yocto/x86"
export CPPFLAGS=""
export POKY_NATIVE_SYSROOT="/opt/poky/1.1/sysroots/i686-pokysdk-linux"
export POKY_TARGET_SYSROOT="/home/roboima/test-yocto/x86"
export POKY_DISTRO_VERSION="1.1"
export POKY_SDK_VERSION="1.1"
export POKY_ACLOCAL_OPTS="-I /opt/poky/1.1/sysroots/i686-pokysdk-linux/usr/share/aclocal"

Remove --sysroot references since they are covered in the *.env file


Integrating Intel® C++ Compiler into ADT (cont.)

~/.bashrc

source /opt/intel/composerxe/bin/compilervars.sh ia32
source /opt/poky/1.1/environment-setup-i586-poky-linux
export YOCTO_TOOLCHAIN=/opt/poky/1.1/sysroots/i686-pokysdk-linux/usr/bin
export YOCTO_SYSROOT=/home/roboima/test-yocto/x86
export POKY_ACLOCAL_OPTS="-I /opt/poky/1.1/sysroots/i686-pokysdk-linux/usr/share/aclocal"


Intel® VTune™ Amplifier XE in Embedded

Where is my application…

Spending Time? Wasting Time?

• Focus tuning on functions taking time

• See time on source

• See cache misses on your source

• See functions sorted by # of cache misses

• Linux host and targets

• Low overhead

• No special recompiles

Advanced Profiling for Scalable Performance


Intel® VTune™ Amplifier XE in Embedded

[Diagram: the Sampling Collector on the target produces a .TB5 file, which Intel® VTune™ Amplifier XE analyzes on the host]

Features

• Statistical analysis

• Low overhead sampling

• No instrumentation required

• Monitor processor events like cache misses etc.

• View results in source or assembly

Usage Model

• Two components
– Intel® VTune™ Amplifier XE on host
– Sampling Collector on the target

• Collect data on target and analyze it on the host

The Intel® VTune™ Amplifier XE helps identify optimization opportunities in modules, functions or routines


Using Intel® VTune™ Amplifier XE Sampling Collector

1. Copy the Sampling Collector for Intel® VTune™ Amplifier XE, located at ~/l_MID_DBG_p_2.2.xxx_amplifier_xe/rpm/sep34_linux_ia32.tar.gz, onto the Intel® Atom™ Processor based target device running Linux*

2. On the target device, unpack the sampling collector using the following command:

# tar -xvzf sep34_linux_ia32.tar.gz

3. Install, customize, and rebuild the sampling driver following the instructions in Release_Install_All.pdf

=> The new customizable script cc-sep3-driver covers SEP cross-build requirements for reduced-feature Linux* target OSs (CE Linux, Embedded Linux without core package, etc.). This script is only part of the Embedded Software Development Tool Suite SEP drop.

Build a custom sampling collector for Yocto Project* on the host, then deploy it to the target device


Sampling - How To Find Hotspots

• Pick an event to sample and configure the PMU
– Cache misses, branch mispredictions, dependency/pipeline stalls

• Start the SEP sampling routine and the application

• The Performance Monitoring Unit (PMU) periodically interrupts the processor
– Time-based sampling
– Event-based sampling

[Diagram: PMU counter registers (Event Counters 1 to 5, each labeled "<0", grouped into General Purpose Event Registers and Dedicated Event Registers) raise an IRQ that SEP services as the interrupt service routine (ISR)]

Collect

• Execution address in memory (CS:IP)

• OS process and thread ID

• Executable module loaded at that address

Write

• Information into *.TB5 file

• Numbers in counters define sampling rate
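The mechanism can be illustrated with a small software model. This is a sketch under stated assumptions, not the SEP driver: it assumes a counter is preloaded with a negative value and raises an interrupt when it reaches zero, so the preload value sets the sampling rate. All names (event_counter, sample, counter_arm, counter_tick, run_workload) are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one PMU event counter (assumption: it counts up
   from a negative preload and overflows at zero). */
typedef struct {
    int64_t value;        /* current count, negative until overflow */
    int64_t sample_after; /* number of events between two samples */
} event_counter;

/* What the interrupt handler records for each sample. */
typedef struct {
    uintptr_t ip; /* execution address when the counter overflowed */
    int pid;      /* OS process ID */
    int tid;      /* OS thread ID */
} sample;

/* The number loaded into the counter defines the sampling rate. */
void counter_arm(event_counter *c, int64_t sample_after) {
    c->sample_after = sample_after;
    c->value = -sample_after;
}

/* Count one event. Returns 1 when the counter overflows, which stands
   for the interrupt, and re-arms the counter for the next interval. */
int counter_tick(event_counter *c) {
    c->value++;
    if (c->value < 0)
        return 0;
    c->value = -c->sample_after;
    return 1;
}

/* Replay a stream of events (one execution address per event) and
   store a sample on every simulated interrupt. Returns the number of
   samples written to out (at most cap). */
size_t run_workload(event_counter *c, const uintptr_t *ips, size_t n_events,
                    int pid, int tid, sample *out, size_t cap) {
    size_t n = 0;
    for (size_t i = 0; i < n_events; i++) {
        if (counter_tick(c) && n < cap) {
            out[n].ip = ips[i];
            out[n].pid = pid;
            out[n].tid = tid;
            n++;
        }
    }
    return n;
}
```

With a sample-after value of 3, every third event produces a sample, so addresses that generate many events show up proportionally more often in the collected data. A real collector would write these records to the results file instead of an in-memory array.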


Intel® Debugger for Linux*

Cross-Debug Solution with advanced thread awareness


Menu & Toolbars (installation default)

Start / Stop / Run / Step control

Evaluation Windows

ASM, Register, Memory, Vector Register Windows Breakpoint, Call Stack, Thread Windows


Two ways of selecting an application to debug

Option 1: LOAD. The application is loaded and the arguments are passed to it as specified in the dialog.

Option 2: ATTACH to a running process by selecting it from the list.


Vector registers


Vector evaluation

Select and

drag & drop.


Assembly window

The ASM window is displayed automatically if no source code is available.

Intel style assembly

AT&T style assembly


Breakpoint dialog

Code breakpoint

Data breakpoint


Multi threading support


Control which thread to debug

In the Threads window you can select which thread you would like to debug. The context menu allows you to ‘freeze’ and ‘thaw’ individual threads. When you stop or hit a breakpoint, all threads are stopped, and when you continue, all threads run.


Summary / Call to Action

• Intel® Embedded Software Development Tool Suite already includes everything needed to support Yocto Linux*

• Integration into ADT is simple and automated

• First-class performance and debug tools from Intel are already available for the Yocto Project*

• Get a license and give it a try


Optimization Notice


3/23/2012