programming tutorial · april 4-7, 2016 | silicon valley thierry lepley, april 4th 2016 programming...

92
April 4-7, 2016 | Silicon Valley Thierry Lepley, April 4 th 2016 PROGRAMMING TUTORIAL

Upload: others

Post on 31-Jul-2020

1 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: PROGRAMMING TUTORIAL · April 4-7, 2016 | Silicon Valley Thierry Lepley, April 4th 2016 PROGRAMMING TUTORIAL

April 4-7, 2016 | Silicon Valley

Thierry Lepley, April 4th 2016

PROGRAMMING TUTORIAL

Page 2: PROGRAMMING TUTORIAL · April 4-7, 2016 | Silicon Valley Thierry Lepley, April 4th 2016 PROGRAMMING TUTORIAL

2 NVIDIA CONFIDENTIAL. DO NOT DISTRIBUTE.

TUTORIAL GOAL

Intermediate Tutorial for Developers

Understand philosophy of the API

Understand main features of the API

Start developing with VisionWorks

Extra Credit

Come and ask more questions at the VisionWorks hangout (H6115)

Page 3: PROGRAMMING TUTORIAL · April 4-7, 2016 | Silicon Valley Thierry Lepley, April 4th 2016 PROGRAMMING TUTORIAL

3

INTRODUCTION

Page 4: PROGRAMMING TUTORIAL · April 4-7, 2016 | Silicon Valley Thierry Lepley, April 4th 2016 PROGRAMMING TUTORIAL

4 NVIDIA CONFIDENTIAL. DO NOT DISTRIBUTE.

VISIONWORKS API

Core data objects: images, arrays, pyramids, etc.

Execution Framework : graphs, nodes, delays, etc.

Computer Vision primitives

Image filtering functions

Image arithmetic and analysis

Geometric transformations

Feature extraction and tracking

Depth and Flow

User extensibility : user kernels

CUDA Interop

What It Gives Access To ?

color

convert

Gaussian

pyramid

pyrLK optical flow

pyr 0 pyr -1 pts 0 pts -1

Page 5: PROGRAMMING TUTORIAL · April 4-7, 2016 | Silicon Valley Thierry Lepley, April 4th 2016 PROGRAMMING TUTORIAL

5 NVIDIA CONFIDENTIAL. DO NOT DISTRIBUTE.

VISIONWORKS SOFTWARE STACK

Tegra K1/X1, Kepler/Maxwell GPU

CUDA Acceleration Framework

Computer Vision Application

NVIDIA

Khronos

User

Extended

OpenVXTM

API

Low level

NVXCU

API

(alpha)

OpenVX Framework and Primitives

VisionWorks

Framework and Primitive Extensions

Cuda Interop

Page 6: PROGRAMMING TUTORIAL · April 4-7, 2016 | Silicon Valley Thierry Lepley, April 4th 2016 PROGRAMMING TUTORIAL

6 NVIDIA CONFIDENTIAL. DO NOT DISTRIBUTE.

Open consortium creating royalty-free, open standard

Main OpenVX goals

1. Define a subset of relevant primitives and image/data format

2. Enable acceleration on modern heterogeneous architectures

3. Provide portability and target performance portability across systems

Overview

Page 7: PROGRAMMING TUTORIAL · April 4-7, 2016 | Silicon Valley Thierry Lepley, April 4th 2016 PROGRAMMING TUTORIAL

7 NVIDIA CONFIDENTIAL. DO NOT DISTRIBUTE.

Timeline

Early 2012 OpenVX

Working group formed

October 2014

OpenVX 1.0 released

June 2015 OpenVX 1.0.1

released

Jan 2015 First conformant implementation

Nov 2015 First public

implementation

Page 8: PROGRAMMING TUTORIAL · April 4-7, 2016 | Silicon Valley Thierry Lepley, April 4th 2016 PROGRAMMING TUTORIAL

8

AGENDA

Programming Basics

Efficient IO

Graph and Delay

Page 9: PROGRAMMING TUTORIAL · April 4-7, 2016 | Silicon Valley Thierry Lepley, April 4th 2016 PROGRAMMING TUTORIAL

9

AGENDA

Programming Basics

General Philosophy

Primitives

Data Objects

Code Example

Page 10: PROGRAMMING TUTORIAL · April 4-7, 2016 | Silicon Valley Thierry Lepley, April 4th 2016 PROGRAMMING TUTORIAL

10

AGENDA

Programming Basics

General Philosophy

Primitives

Data Objects

Code Example

Page 11: PROGRAMMING TUTORIAL · April 4-7, 2016 | Silicon Valley Thierry Lepley, April 4th 2016 PROGRAMMING TUTORIAL

11 NVIDIA CONFIDENTIAL. DO NOT DISTRIBUTE.

VISIONWORKS : C API

Can interop with any language

Application

Implementation of the API

No portability issue across compilers

Java Application

C++ application

C application

C implementation

C++ Implementation

C API

Page 12: PROGRAMMING TUTORIAL · April 4-7, 2016 | Silicon Valley Thierry Lepley, April 4th 2016 PROGRAMMING TUTORIAL

12 NVIDIA CONFIDENTIAL. DO NOT DISTRIBUTE.

CONTEXT

Need to be created first

Objects are created in a context

An OpenVX World

vx_context context = vxCreateContext();

vx_image img = vxCreateImage(context, 640, 480, VX_DF_IMAGE_RGB);

Page 13: PROGRAMMING TUTORIAL · April 4-7, 2016 | Silicon Valley Thierry Lepley, April 4th 2016 PROGRAMMING TUTORIAL

13 NVIDIA CONFIDENTIAL. DO NOT DISTRIBUTE.

OBJECTS Reference Counted

VisionWorks World (context)

Application World

Image object

Graph

Object

vx_image img (reference)

vx_image img = vxCreateImage(context, ...);

vx_graph graph = vxCreateGraph(context); vxBox3x3Node(graph, img, out);

vxReleaseImage(&img);

(reference) The Application gets object references

Object not destroyed until ref_count == 0

Safe Memory Management

vx_graph graph (reference)

Page 14: PROGRAMMING TUTORIAL · April 4-7, 2016 | Silicon Valley Thierry Lepley, April 4th 2016 PROGRAMMING TUTORIAL

14 NVIDIA CONFIDENTIAL. DO NOT DISTRIBUTE.

OBJECT REFERENCES

One reference type per object type : vx_image, vx_array, etc.

Some functions work on any object reference : down-cast to vx_reference

vx_reference

vx_status status = vxGetStatus((vx_reference)array);

vxSetParameterByIndex(node, 0, (vx_reference)input_image);

vx_array array = vxCreateArray(context, ...); vx_image img = vxCreateImage(array, ...); // Compile time error

Page 15: PROGRAMMING TUTORIAL · April 4-7, 2016 | Silicon Valley Thierry Lepley, April 4th 2016 PROGRAMMING TUTORIAL

15 NVIDIA CONFIDENTIAL. DO NOT DISTRIBUTE.

ERROR MANAGEMENT

Most of API calls : a vx_status code returned

Object creation : use vxGetStatus to check the object

Status Code

if (vxuColorConvert(context, input, output) != VX_SUCCESS) /* Error */

vx_image img = vxCreateImage(context, 640, 480, VX_DF_IMAGE_RGB); if (vxGetStatus((vx_reference)img) != VX_SUCCESS) /* Error */

Page 16: PROGRAMMING TUTORIAL · April 4-7, 2016 | Silicon Valley Thierry Lepley, April 4th 2016 PROGRAMMING TUTORIAL

16 NVIDIA CONFIDENTIAL. DO NOT DISTRIBUTE.

ERROR MANAGEMENT

Registered in a context

Called each time an error occurs

Textual Information : Log Callback

void logCallback(vx_context c, vx_reference r, vx_status s, const vx_char string[] m) /* Do something */ vxRegisterLogCallback(context, logCallback, vx_false_e);

Page 17: PROGRAMMING TUTORIAL · April 4-7, 2016 | Silicon Valley Thierry Lepley, April 4th 2016 PROGRAMMING TUTORIAL

17 NVIDIA CONFIDENTIAL. DO NOT DISTRIBUTE.

THREAD SAFETY

Functions: Same API function can be concurrently called from multiple thread

Objects: A context and its objects can be shared across threads

The application must ensure there is no ‘data race’ (e.g. with synchro)

Read Image

Context

image Read Image

Write Image Write Image

T1 T2

!

Page 18: PROGRAMMING TUTORIAL · April 4-7, 2016 | Silicon Valley Thierry Lepley, April 4th 2016 PROGRAMMING TUTORIAL

18

ANY QUESTION SO FAR ?

Page 19: PROGRAMMING TUTORIAL · April 4-7, 2016 | Silicon Valley Thierry Lepley, April 4th 2016 PROGRAMMING TUTORIAL

19

AGENDA

Programming Basics

General Philosophy

Primitives

Data Objects

Code Example

Page 20: PROGRAMMING TUTORIAL · April 4-7, 2016 | Silicon Valley Thierry Lepley, April 4th 2016 PROGRAMMING TUTORIAL

20 NVIDIA CONFIDENTIAL. DO NOT DISTRIBUTE.

COMPUTER VISION PRIMITIVES IMAGE ARITHMETIC Absolute Difference

Accumulate Image

Accumulate Squared

Accumulate Weighted

Add / Subtract/ Multiply +

Channel Combine

Channel Extract

Color Convert +

CopyImage

Convert Depth

Magnitude

Not / Or / And / Xor

Phase

Table Lookup

Threshold

FLOW & DEPTH

Median Flow

Optical Flow (LK) +

Semi-Global Matching

Stereo Block Matching

IME Create Motion Field

IME Refine Motion Field

IME Partition Motion Field

GEOMETRIC

TRANSFORMS Warp Affine +

Warp Perspective +

Flip Image

Remap

Scale Image +

FILTERS BoxFilter

Convolution

Dilation Filter

Erosion Filter

Gaussian Filter

Gaussian Pyramid

Laplacian 3x3

Median 3x3

Scharr 3x3

Sobel 3x3

FEATURES Canny Edge Detector

Fast Corners +

Fast Track

Harris Corners +

Harris Track

Hough Circles

Hough Lines

ANALYSIS Histogram

Histogram Equalization

Integral Image

Mean Std Deviation

Min Max Locations

+ Standard with NVIDIA Extensions

NVIDIA Proprietary

Page 21: PROGRAMMING TUTORIAL · April 4-7, 2016 | Silicon Valley Thierry Lepley, April 4th 2016 PROGRAMMING TUTORIAL

21 NVIDIA CONFIDENTIAL. DO NOT DISTRIBUTE.

PRIMITIVES EXECUTION

Immediate mode

Graph mode

2 options

Primitive

Page 22: PROGRAMMING TUTORIAL · April 4-7, 2016 | Silicon Valley Thierry Lepley, April 4th 2016 PROGRAMMING TUTORIAL

22 NVIDIA CONFIDENTIAL. DO NOT DISTRIBUTE.

PRIMITIVES EXECUTION

Blocking calls similar to OpenCV usage model

Prefixed with ‘vxu’

Useful for fast prototyping

Immediate Mode

// 3x3 box filter vxuBox3x3(context, src0, tmp); // Absolute Difference of two images vxuAbsDiff(context, tmp, src, dest);

Page 23: PROGRAMMING TUTORIAL · April 4-7, 2016 | Silicon Valley Thierry Lepley, April 4th 2016 PROGRAMMING TUTORIAL

23 NVIDIA CONFIDENTIAL. DO NOT DISTRIBUTE.

PRIMITIVES EXECUTION

Workload given ahead-of-time

More optimization opportunities

Good fit with video stream processing

Graph Mode

vx_graph graph = vxCreateGraph(context);

// Create nodes and check the graph ahead of time (errors detected here) vxBox3x3Node(graph, src0, tmp); vxAbsDiffNode(graph, tmp, src1, dest); vxVerifyGraph(graph);

// Execute the graph at runtime vxProcessGraph(graph);

Page 24: PROGRAMMING TUTORIAL · April 4-7, 2016 | Silicon Valley Thierry Lepley, April 4th 2016 PROGRAMMING TUTORIAL

24 NVIDIA CONFIDENTIAL. DO NOT DISTRIBUTE.

BORDER MANAGEMENT Supported Modes

? ? ?

?

?

3x3 box filter

A A B

A A B

C C

Replicate

n n n

n A B

n C

Constant (n)

Undefined

(default)

? ?

?

?

?

Page 25: PROGRAMMING TUTORIAL · April 4-7, 2016 | Silicon Valley Thierry Lepley, April 4th 2016 PROGRAMMING TUTORIAL

25 NVIDIA CONFIDENTIAL. DO NOT DISTRIBUTE.

BORDER MODES

Enum : VX_BORDER_MODE_[UNDEFINED | CONSTANT | REPLICATE]

Immediate execution: context attribute (state)

Graph : node attribute

API

vx_border_mode_t mode = VX_BORDER_MODE_CONSTANT, 0; vxSetContextAttribute(context, VX_CONTEXT_ATTRIBUTE_IMMEDIATE_BORDER_MODE, &mode, sizeof(mode));

vxuBox3x3(context, src, dest);

vx_border_mode_t mode = VX_BORDER_MODE_CONSTANT, 0; vx_node node = vxBox3x3Node(graph, src, tmp);

vxSetNodeAttribute(node, VX_NODE_ATTRIBUTE_BORDER_MODE, &mode, sizeof(mode));

Page 26: PROGRAMMING TUTORIAL · April 4-7, 2016 | Silicon Valley Thierry Lepley, April 4th 2016 PROGRAMMING TUTORIAL

26 NVIDIA CONFIDENTIAL. DO NOT DISTRIBUTE.

TARGET COMPUTE DEVICE

Most primitives have both CPU and GPU implementations

Functionality

CPU

GPU GPU 2 GPU 1 Primitive

Execution

Target controllable with the API

Default: automatic assignment

Context

GPU device ID

controllable

Page 27: PROGRAMMING TUTORIAL · April 4-7, 2016 | Silicon Valley Thierry Lepley, April 4th 2016 PROGRAMMING TUTORIAL

27 NVIDIA CONFIDENTIAL. DO NOT DISTRIBUTE.

TARGET COMPUTE DEVICE API

Options: NVX_DEVICE_GPU, NVX_DEVICE_CPU, NVX_DEVICE_ANY

Immediate execution: context attribute (state)

Graph : node setter function

nvx_device_type_e target = NVX_DEVICE_GPU; vxSetContextAttribute(context, VX_CONTEXT_ATTRIBUTE_IMMEDIATE_TARGET_DEVICE, &target, sizeof(target));

vxuBox3x3(context, src, dest);

vx_node node = vxBox3x3Node(graph, src, tmp);

nvxSetNodeTargetDevice(node, NVX_DEVICE_GPU);

Page 28: PROGRAMMING TUTORIAL · April 4-7, 2016 | Silicon Valley Thierry Lepley, April 4th 2016 PROGRAMMING TUTORIAL

28

ANY QUESTION SO FAR ?

Page 29: PROGRAMMING TUTORIAL · April 4-7, 2016 | Silicon Valley Thierry Lepley, April 4th 2016 PROGRAMMING TUTORIAL

29

AGENDA

Programming Basics

General Philosophy

Primitives

Data Objects a) Data Object philosophy

b) Focus: Images

c) Focus: Pyramids

d) Focus: Arrays

Code Example

Page 30: PROGRAMMING TUTORIAL · April 4-7, 2016 | Silicon Valley Thierry Lepley, April 4th 2016 PROGRAMMING TUTORIAL

30

DATA OBJECTS

Images

Image: vx_image +

Image Pyramid : vx_pyramid +

Arrays

Array : vx_array +

Distribution : vx_distribution +

Look-up-table : vx_lut +

Matrices

Matrix: vx_matrix +

Convolution : vx_convolution +

Remap : vx_remap +

Scalars

Scalar : vx_scalar +

Threshold : vx_threshold +

Object + Standard OpenVX with NVIDIA Extensions (ex: access from CUDA)

Page 31: PROGRAMMING TUTORIAL · April 4-7, 2016 | Silicon Valley Thierry Lepley, April 4th 2016 PROGRAMMING TUTORIAL

31 NVIDIA CONFIDENTIAL. DO NOT DISTRIBUTE.

DATA OBJECT ACCESS

No permanent pointer to data content

Semi-opaque Objects

VisionWorks World (context)

Application World

Pixels vx_image img

(reference)

vx_uint8 *ptr

VisionWorks optimizes the memory management

Synchronize data with application only when needed

Minimize data synchronization between CPU and GPU

vxAccessImagePatch(img, &rect, 0, &addr, &ptr, VX_READ_AND_WRITE);

// Access data at address ‘ptr’

vxCommitImagePatch(img, &rect, 0, &addr, ptr);

// ‘ptr’ is now invalid

vxBox3x3(img, out_img);

Page 32: PROGRAMMING TUTORIAL · April 4-7, 2016 | Silicon Valley Thierry Lepley, April 4th 2016 PROGRAMMING TUTORIAL

32 NVIDIA CONFIDENTIAL. DO NOT DISTRIBUTE.

DATA OBJECT ACCESS

MAP mode (direct access)

• Host memory (CPU)

• CUDA memory

Copy Mode

• Host memory (CPU)

• CUDA memory

Access Modes

VisionWorks World (context)

Application World

vx_image img (reference)

Pixels

(Host mem)

vx_uint8 *ptr

Host Buffer

Pixels

(CUDA mem)

vx_uint8 *dev_ptr

CUDA Buffer

Page 33: PROGRAMMING TUTORIAL · April 4-7, 2016 | Silicon Valley Thierry Lepley, April 4th 2016 PROGRAMMING TUTORIAL

33 NVIDIA CONFIDENTIAL. DO NOT DISTRIBUTE.

DATA OBJECT ACCESS

vxAccess<Object>(…) : access the content

HOST: VX_READ_ONLY, VX_WRITE_ONLY, VX_READ_AND_WRITE

CUDA: NVX_READ_ONLY_CUDA, NVX_WRITE_ONLY_CUDA, NVX_READ_AND_WRITE_CUDA

vxCommit<Object>(…) : release the access and commit changes

API

vx_uint8 * pLut_cu = NULL; // NULL means ‘map’, non-NULL means ‘copy’ vxAccessLUT(lut_, (void **)&pLut_cu, VX_READ_ONLY_CUDA);

// ‘pLut_cu’ is a CUDA dev pointer that can be used by CUDA kernels

vxCommitLUT(lut, pLut_cu);

// ‘pLut_cu’ is now invalid

Page 34: PROGRAMMING TUTORIAL · April 4-7, 2016 | Silicon Valley Thierry Lepley, April 4th 2016 PROGRAMMING TUTORIAL

34 NVIDIA CONFIDENTIAL. DO NOT DISTRIBUTE.

FOCUS: IMAGES

Mono or Multiplanar

Color formats: RGB, RGBX, RGB16, NV12, NV21, UYVY, YUYV, IYUV, YUV4

‘Gray’ scale: U8, U16, S16, 2S16, U32, S32, F32, 2F32

Formats

vx_image img = vxCreateImage(context, 640, 480, VX_DF_IMAGE_RGB);

Page 35: PROGRAMMING TUTORIAL · April 4-7, 2016 | Silicon Valley Thierry Lepley, April 4th 2016 PROGRAMMING TUTORIAL

35 NVIDIA CONFIDENTIAL. DO NOT DISTRIBUTE.

FOCUS: IMAGES

Constant image

All pixels have the same value (no allocation needed)

Enables performance optimizations without duplicating the primitive API

Uniform Image

vx_uint8 pix[3] = 0x0, 0x33, 0xCC; vx_image img = vxCreateUniformImage(context, 640, 480, VX_DF_IMAGE_RGB, pix);

Page 36: PROGRAMMING TUTORIAL · April 4-7, 2016 | Silicon Valley Thierry Lepley, April 4th 2016 PROGRAMMING TUTORIAL

36 NVIDIA CONFIDENTIAL. DO NOT DISTRIBUTE.

FOCUS: IMAGES

Rectangular sub-image

Same format as the parent image

Share pixels with the parent image (same memory)

Region of Interest (ROI)

Parent Image

start : inside

ROI

rectangle

end: outside

struct vx_rectangle_t

vx_uint32 start_x The Start X coordinate.

vx_uint32 start_y The Start Y coordinate.

vx_uint32 end_x The End X coordinate.

vx_uint32 end_y The End Y coordinate.

Page 37: PROGRAMMING TUTORIAL · April 4-7, 2016 | Silicon Valley Thierry Lepley, April 4th 2016 PROGRAMMING TUTORIAL

37 NVIDIA CONFIDENTIAL. DO NOT DISTRIBUTE.

FOCUS: IMAGES ROI Example: Stereo Images

vx_rectangle_t left_rect = 0, 0, width, height; vx_image leftROI = vxCreateImageFromROI(inputRGB, &left_rect);

vx_rectangle_t right_rect = width, 0, 2*width, height; vx_image rightROI = vxCreateImageFromROI(inputRGB, &right_rect);

Input images from the Middlebury stereo dataset (http://vision.middlebury.edu/stereo/data)

2*width

height

Page 38: PROGRAMMING TUTORIAL · April 4-7, 2016 | Silicon Valley Thierry Lepley, April 4th 2016 PROGRAMMING TUTORIAL

38 NVIDIA CONFIDENTIAL. DO NOT DISTRIBUTE.

FOCUS: IMAGES

Map : VisionWorks returns address and memory layout

Copy : The application provides address and memory layout

Access

void *ptr = NULL; // NULL means ‘map’ vx_imagepatch_addressing_t addr; // The memory layout will be returned here vx_rectangle_t rect = 0u, 0u, width, height ; vxAccessImagePatch(img, &rect, 0, &addr, &ptr, VX_READ_AND_WRITE);

// Access data at address ‘ptr’ with layout specified in ‘addr’

vxCommitImagePatch(img, &rect, 0, &addr, ptr);

void *ptr = &my_image[0]; // non NULL means a ‘copy’ vx_imagepatch_addressing_t src_addr = /* to fill */ ; vx_rectangle_t rect = 0u, 0u, width, height ; vxAccessImagePatch(vx_src, &rect, 0, &addr, &ptr, VX_READ_AND_WRITE);

// Access/modify data in my_buffer

vxCommitImagePatch(vx_src, &rect, 0, &addr, ptr);

Page 39: PROGRAMMING TUTORIAL · April 4-7, 2016 | Silicon Valley Thierry Lepley, April 4th 2016 PROGRAMMING TUTORIAL

39 NVIDIA CONFIDENTIAL. DO NOT DISTRIBUTE.

FOCUS: IMAGES Memory Layout

stride_y

stride_x

typedef struct vx_uint32 dim_x; vx_uint32 dim_y; vx_int32 stride_x; vx_int32 stride_y; vx_uint32 scale_x; vx_uint32 scale_y; vx_uint32 step_x; vx_uint32 step_y; vx_imagepatch_addressing_t;

Row Major Ordering / Pitch Linear

Sub-sampled plans:

1 physical pixel every

‘step’ logical pixel

Nb (logical) pixels Row 0

Row 1

Row H-1

Col 0 Col 1 Col W-1 Col 2

WxH Image

Page 40: PROGRAMMING TUTORIAL · April 4-7, 2016 | Silicon Valley Thierry Lepley, April 4th 2016 PROGRAMMING TUTORIAL

40 NVIDIA CONFIDENTIAL. DO NOT DISTRIBUTE.

FOCUS: PYRAMIDS

Formats : same as images

Configurable number of levels

Predefined scales : VX_SCALE_PYRAMID_HALF, VX_SCALE_PYRAMID_ORB

Multi-Resolution Image

vx_pyramid pyr = vxCreatePyramid(context, 5, 0.6f, 640, 480, VX_DF_IMAGE_RGB);

Level 0 Level 1 Level 2 Level 3 Level 4

scale

scale

scale scale

Page 41: PROGRAMMING TUTORIAL · April 4-7, 2016 | Silicon Valley Thierry Lepley, April 4th 2016 PROGRAMMING TUTORIAL

41 NVIDIA CONFIDENTIAL. DO NOT DISTRIBUTE.

FOCUS: PYRAMIDS

Pyramid ‘levels’ are image objects

Pyramid used for tracking currently:

GaussianPyramid : generate a pyramid from an image

OpticalFlowPyLK : Lucas Kanade tracking

More About Pyramids

vx_image level1 = vxGetPyramidLevel(pyr, 1); // Increment the ref count

vxuBox3x3(context, level1, out_img);

vxReleaseImage(&level1); // Decrement de ref count

Page 42: PROGRAMMING TUTORIAL · April 4-7, 2016 | Silicon Valley Thierry Lepley, April 4th 2016 PROGRAMMING TUTORIAL

42 NVIDIA CONFIDENTIAL. DO NOT DISTRIBUTE.

FOCUS: ARRAYS

Fix capacity (used by primitives to avoid overflow)

Variable number of items

Item types: rectangles, keypoints, coordinates 2D/3D

Array Creation

vx_array array = vxCreateArray(context, VX_TYPE_KEYPOINT, 1000);

! A too large capacity can negatively impact the performance

Page 43: PROGRAMMING TUTORIAL · April 4-7, 2016 | Silicon Valley Thierry Lepley, April 4th 2016 PROGRAMMING TUTORIAL

43 NVIDIA CONFIDENTIAL. DO NOT DISTRIBUTE.

FOCUS: ARRAYS

Array of rectangles for dynamic ROIs (ex: object bounding box)

Image created from ROI for static ROI (stereo image, ROI for static surveillance cameras)

Usage Example : Dynamic ROIs

vx_array array = vxCreateArray(context, VX_TYPE_RECTANGLE, 50);

Page 44: PROGRAMMING TUTORIAL · April 4-7, 2016 | Silicon Valley Thierry Lepley, April 4th 2016 PROGRAMMING TUTORIAL

44 NVIDIA CONFIDENTIAL. DO NOT DISTRIBUTE.

FOCUS: ARRAYS

Map : VisionWorks returns address and stride

Copy : The application provides address and stride

Access Example

void *base = NULL; // NULL means ‘map’ vx_size stride; vxAccessArrayRange(array, 2, 10, &stride, &base, VX_READ_AND_WRITE); // Access data of range [2, 10[ at address base vxCommitArrayRange(array, 2, 10, base);

void *base = &my_buffer[0]; // non NULL means a ‘copy’ void my_stride = sizeof(element_type); vxAccessArrayRange(array, 0, 10, &stride, &base, VX_READ_AND_WRITE); // Access data of range [2, 10[ in my_buffer vxCommitArrayRange(array, 2, 10, base);

Page 45: PROGRAMMING TUTORIAL · April 4-7, 2016 | Silicon Valley Thierry Lepley, April 4th 2016 PROGRAMMING TUTORIAL

45 NVIDIA CONFIDENTIAL. DO NOT DISTRIBUTE.

FOCUS: ARRAYS Memory Layout

array range

stride

start index (2) :

inside

end index (10) :

outside

vxAccessArrayRange(array, 2, 10, &stride, &base, VX_READ_AND_WRITE);

Page 46: PROGRAMMING TUTORIAL · April 4-7, 2016 | Silicon Valley Thierry Lepley, April 4th 2016 PROGRAMMING TUTORIAL

46

AGENDA

Programming Basics

General Philosophy

Primitives

Data Objects

Code Example

Page 47: PROGRAMMING TUTORIAL · April 4-7, 2016 | Silicon Valley Thierry Lepley, April 4th 2016 PROGRAMMING TUTORIAL

47 NVIDIA CONFIDENTIAL. DO NOT DISTRIBUTE.

FEATURE DETECTION

Page 48: PROGRAMMING TUTORIAL · April 4-7, 2016 | Silicon Valley Thierry Lepley, April 4th 2016 PROGRAMMING TUTORIAL

48 NVIDIA CONFIDENTIAL. DO NOT DISTRIBUTE.

FEATURE DETECTION

Feature detector primitives

HarrisCorner : strongest Harris points in the image

FastCorner : strongest FAST points in the image

HarrisTrack : balanced (per cell) Harris (re)detection

FastTrack : balanced (per cell) FAST (re)detection

Keypoint structures

vx_keypoint_t : int coordinates, strength, tracking error & status, …

nvx_keypointf_t : same as vx_keypoint_t except float coordinates

nvx_point2f_t : lightweight structure, only float coordinates

What is in the Toolkit ?

Page 49: PROGRAMMING TUTORIAL · April 4-7, 2016 | Silicon Valley Thierry Lepley, April 4th 2016 PROGRAMMING TUTORIAL

49 NVIDIA CONFIDENTIAL. DO NOT DISTRIBUTE.

FEATURE DETECTION Processing

color

convert

Harris

corner

points (array of vx_keypoint_t)

frame_gray (U8 image)

frame (RGB image)

Page 50: PROGRAMMING TUTORIAL · April 4-7, 2016 | Silicon Valley Thierry Lepley, April 4th 2016 PROGRAMMING TUTORIAL

50 NVIDIA CONFIDENTIAL. DO NOT DISTRIBUTE.

FEATURE DETECTION Prepare Data

color

convert

Harris

corner

// RGB image data at (compact layout) // address = pImage, width = W, height = H, compact memory layout // Create data objects vx_image frame = vxCreateImage(context, W, H, VX_DF_IMAGE_RGB); vx_image frame_gray = vxCreateImage(context, W, H, VX_DF_IMAGE_U8); vx_array points = vxCreateArray(context, VX_TYPE_KEYPOINT, 1000); // Copy the input data into the vx_image object vx_imagepatch_addressing_t addr; addr.stride_x = 3*sizeof(vx_uint8); // R + G + B addr.stride_y = addr.stride_x * W; void *p = pImage; // Non NULL pointer means a ‘copy’ vx_rectangle rect = 0, 0, W, H; // Entire image vxAccessImagePatch(frame, &rect, 0, &addr, &p, VX_WRITE_ONLY); vxCommitImagePatch(frame, &rect, 0, &addr, p);

points (array of vx_keypoint_t)

frame_gray (U8 image)

frame (RGB image)

frame (RGB image)

Page 51: PROGRAMMING TUTORIAL · April 4-7, 2016 | Silicon Valley Thierry Lepley, April 4th 2016 PROGRAMMING TUTORIAL

51 NVIDIA CONFIDENTIAL. DO NOT DISTRIBUTE.

FEATURE DETECTION Computation and Get Outputs

// RGB to U8 conversion vxuColorConvert(context, frame, frame_gray);

// Keypoint detection : Harris corner vxuHarrisCorners(context, frame_gray, s_strength_thresh, min_dist, k_sensivity, gradientSize, blockSize, points, 0);

// Access keypoints vx_size nb_kp; vxQueryArray(points, VX_ARRAY_ATTRIBUTE_NUMITEMS, &nb_kp, sizeof(nb_kp));

vx_size stride; // Returned by the access function vx_keypoint_t *base = NULL; // NULL means ‘map’ (direct access) vxAccessArrayRange(points, 0, nb_kp, &stride, (void **)&base, VX_READ_ONLY);

// Access keypoints starting from address ‘p’ with ‘stride’ vx_keypoint_t *p = base; for(vx_size i = 0; i < nb_kp; i++, p = (vx_keypoint_t *)((char*)p + stride) ) // ...

vxCommitArrayRange(points, 0, nb_kp, p);

color

convert

Harris

corner

color

convert

Harris

corner

points (array of vx_keypoint_t)

frame_gray (U8 image)

frame (RGB image)

frame_gray (U8 image)

points (array of vx_keypoint_t)

Page 52: PROGRAMMING TUTORIAL · April 4-7, 2016 | Silicon Valley Thierry Lepley, April 4th 2016 PROGRAMMING TUTORIAL

52 NVIDIA CONFIDENTIAL. DO NOT DISTRIBUTE.

STANDARD KEYPOINT vx_keypoint_t Structure

Data Fields Detector Tracker

vx_int32 x The x coordinate. X X

vx_int32 y The y coordinate. X X

vx_float32 strength The strength of the keypoint. Its definition is specific to the

corner detector. X

vx_float32 scale Initialized to 0 by (current) corner detectors. (x)

vx_float32 orientation Initialized to 0 by (current) corner detectors. (x)

vx_int32 tracking_status A zero indicates a lost point. Initialized to 1 by corner

detectors. X X

vx_float32 error A tracking method specific error. Initialized to 0 by corner

detectors. X

Page 53: PROGRAMMING TUTORIAL · April 4-7, 2016 | Silicon Valley Thierry Lepley, April 4th 2016 PROGRAMMING TUTORIAL

53 NVIDIA CONFIDENTIAL. DO NOT DISTRIBUTE.

MORE PRECISION NEEDED ?

nvx_keypointf_t : same as vx_keypoint_t, with float coordinates

nvx_point2f_t : lightweight, no error computation

The rest of the code is unchanged

Simply Change the Output Array

vx_array points = vxCreateArray(context, NVX_TYPE_KEYPOINTF, 1000);

vx_array points = vxCreateArray(context, NVX_TYPE_POINT2F, 1000);

Data Fields

vx_float32 x The X coordinate (-1.0f when tracking lost)

vx_float32 y The Y coordinate (-1.0f when tracking lost)

Page 54: PROGRAMMING TUTORIAL · April 4-7, 2016 | Silicon Valley Thierry Lepley, April 4th 2016 PROGRAMMING TUTORIAL

54

ANY QUESTION SO FAR ?

Page 55: PROGRAMMING TUTORIAL · April 4-7, 2016 | Silicon Valley Thierry Lepley, April 4th 2016 PROGRAMMING TUTORIAL

55

AGENDA

Programming Basics

Efficient IO

Graph and Delay

Page 56: PROGRAMMING TUTORIAL · April 4-7, 2016 | Silicon Valley Thierry Lepley, April 4th 2016 PROGRAMMING TUTORIAL

56 NVIDIA CONFIDENTIAL. DO NOT DISTRIBUTE.

EFFICIENT IO BY AVOIDING COPIES

Processing

GStreamer

CSI / GMSL

Camera **

Image

in CUDA

memory

USB Camera Image

in Host

memory OpenCV

Images

created

from handle

NVMedia

** NVIDIA Embedded platforms support CSI Cameras; NVIDIA Automotive platforms support GMSL Cameras

Page 57: PROGRAMMING TUTORIAL · April 4-7, 2016 | Silicon Valley Thierry Lepley, April 4th 2016 PROGRAMMING TUTORIAL

57 NVIDIA CONFIDENTIAL. DO NOT DISTRIBUTE.

IMAGE CREATED FROM HANDLE

VisionWorks World

Application World

Application

Pixel

Buffer

vx_image img (reference)

vx_uint8 *handle

Moves from the Application to the VisionWorks World

The handle must NOT be used

directly by the application

after the image object creation

!

Principle

Page 58: PROGRAMMING TUTORIAL · April 4-7, 2016 | Silicon Valley Thierry Lepley, April 4th 2016 PROGRAMMING TUTORIAL

58 NVIDIA CONFIDENTIAL. DO NOT DISTRIBUTE.

IMAGE CREATED FROM HANDLE

CPU or CUDA memory : VX_IMPORT_TYPE_HOST, NVX_IMPORT_TYPE_CUDA

Useful for both input and output images

Creation API

vx_image img = vxCreateImageFromHandle( context, VX_DF_IMAGE_RGB, &addr[0], // Plane layouts &ptrs[0], // Plane handles NVX_IMPORT_TYPE_HOST );

Page 59: PROGRAMMING TUTORIAL · April 4-7, 2016 | Silicon Valley Thierry Lepley, April 4th 2016 PROGRAMMING TUTORIAL

59 NVIDIA CONFIDENTIAL. DO NOT DISTRIBUTE.

IMAGE CREATED FROM HANDLE

Image access like other images : Map/Copy, Host/CUDA

Access with Standard Access/Commit

Mapped at its original address / memory layout

Property of memory back to the application at image destruction

Page 60: PROGRAMMING TUTORIAL · April 4-7, 2016 | Silicon Valley Thierry Lepley, April 4th 2016 PROGRAMMING TUTORIAL

60 NVIDIA CONFIDENTIAL. DO NOT DISTRIBUTE.

IMAGE CREATED FROM HANDLE EXAMPLE OpenCV Interop

OpenCV USB

webcam

Image

In Host

memory

Import a Webcam image into VisionWorks

directly from the Host memory

Page 61: PROGRAMMING TUTORIAL · April 4-7, 2016 | Silicon Valley Thierry Lepley, April 4th 2016 PROGRAMMING TUTORIAL

61 NVIDIA CONFIDENTIAL. DO NOT DISTRIBUTE.

OPENCV INTEROP EXAMPLE Import a webcam image

// Create a Video Capture from OpenCV cv::VideoCapture inputVideo; inputVideo.open(0); // Grab data from the default webcam

// VideoCapture always returns a BGR image, transform it into RGB cv::Mat cv_bgr, cv_rgb; inputVideo.read(cv_bgr); cv::cvtColor(cv_bgr, cv_rgb, cv::COLOR_BGR2RGB);

// Import into VisionWorks vx_imagepatch_addressing_t addr; addr.dim_x = cv_rgb.cols; addr.dim_y = cv_rgb.rows; addr.stride_x = 3*sizeof(vx_uint8); addr.stride_y = cv_rgb.step; void *ptrs[] = cv_rgb.data ;

vx_image img = vxCreateImageFromHandle(context, VX_DF_IMAGE_RGB, &addr, ptrs, VX_IMPORT_TYPE_HOST);

Page 62: PROGRAMMING TUTORIAL · April 4-7, 2016 | Silicon Valley Thierry Lepley, April 4th 2016 PROGRAMMING TUTORIAL

62 NVIDIA CONFIDENTIAL. DO NOT DISTRIBUTE.

OPENCV INTEROP EXAMPLE Refresh an Image

// Mapping an image created from handle will map at the // exact same address and with the same memory layout void *base = NULL; // NULL means ‘map’ vx_imagepatch_addressing_t addr; vx_rectangle_t rect = 0u, 0u, cv_rgb.cols, cv_rgb.rows; vxAccessImagePatch(img, &rect, 0, &addr, &base, VX_WRITE_ONLY);

// Refresh the OpenCV image inputVideo.read(cv_src_bgr); cv::cvtColor(cv_src_bgr, cv_src_rgb, cv::COLOR_BGR2RGB);

// Commit back changes vxCommitImagePatch(img, &rect, 0, &src_addr, base);

Page 63: PROGRAMMING TUTORIAL · April 4-7, 2016 | Silicon Valley Thierry Lepley, April 4th 2016 PROGRAMMING TUTORIAL

63

ANY QUESTION SO FAR ?

Page 64: PROGRAMMING TUTORIAL · April 4-7, 2016 | Silicon Valley Thierry Lepley, April 4th 2016 PROGRAMMING TUTORIAL

64

AGENDA

Programming Basics

Efficient IO

Graph and Delay

User Kernel

Low-Level CUDA API (alpha)

Graphs

Delay Object

Code Example

Page 65: PROGRAMMING TUTORIAL · April 4-7, 2016 | Silicon Valley Thierry Lepley, April 4th 2016 PROGRAMMING TUTORIAL

65

AGENDA

Programming Basics

Efficient IO

Graph and Delay

User Kernel

Low-Level CUDA API (alpha)

Graphs

Delay Object

Code Example

Page 66: PROGRAMMING TUTORIAL · April 4-7, 2016 | Silicon Valley Thierry Lepley, April 4th 2016 PROGRAMMING TUTORIAL

66 NVIDIA CONFIDENTIAL. DO NOT DISTRIBUTE.

WHAT IS A GRAPH ?

Computer vision pipeline specified ahead-of-time

Adapted to processing of video streams

Best for performance : enables global optimizations

Node

n1

data2

Node

n2 Node

n3

data3 data4

data5

Node

n4

data1

Page 67: PROGRAMMING TUTORIAL · April 4-7, 2016 | Silicon Valley Thierry Lepley, April 4th 2016 PROGRAMMING TUTORIAL

67 NVIDIA CONFIDENTIAL. DO NOT DISTRIBUTE.

ENSURE BEST PERFORMANCE Im

media

te M

ode

Gra

ph M

ode

Page 68: PROGRAMMING TUTORIAL · April 4-7, 2016 | Silicon Valley Thierry Lepley, April 4th 2016 PROGRAMMING TUTORIAL

68 NVIDIA CONFIDENTIAL. DO NOT DISTRIBUTE.

GRAPH LIFE CYCLE

Graph

Creation

Graph

Verification

Graph

Execution

Ahead of time / Boot time

Runtime

Graph

Release

Shutdown time

Page 69: PROGRAMMING TUTORIAL · April 4-7, 2016 | Silicon Valley Thierry Lepley, April 4th 2016 PROGRAMMING TUTORIAL

69 NVIDIA CONFIDENTIAL. DO NOT DISTRIBUTE.

A. GRAPH CREATION

Any number of graphs can be created

Node = instance of vision primitive with well defined parameters

Graph and Nodes

vx_graph graph = vxCreateGraph(context);

vx_image frameRGB, frameYUV; // Already created

vx_node color = vxColorConvertNode(graph, frameRGB, frameYUV);

Color

Convert

frameRGB

frameYUV

Page 70: PROGRAMMING TUTORIAL · April 4-7, 2016 | Silicon Valley Thierry Lepley, April 4th 2016 PROGRAMMING TUTORIAL

70 NVIDIA CONFIDENTIAL. DO NOT DISTRIBUTE.

A. GRAPH CREATION

Implicit (no edge object)

Determined from nodes parameters (write → read relationships)

Object hierarchy considered in the dependency analysis (pyramid levels, ROIs)

Edges

vxColorConvertNode(graph, frameRGB, frameGray); vxGaussianPyramidNode(graph, frameGray, pyramid);

Color

Convert

frameRGB

frameGray

Gaussian

Pyramid

pyramid

Box3x3

level1

box1

vx_image level1 = vxGetPyramidLevel(pyramid, 1); vxBox3x3Node(graph, level1, box1);

Page 71: PROGRAMMING TUTORIAL · April 4-7, 2016 | Silicon Valley Thierry Lepley, April 4th 2016 PROGRAMMING TUTORIAL

71 NVIDIA CONFIDENTIAL. DO NOT DISTRIBUTE.

B. GRAPH VERIFICATION

Error checking (parameter consistency, graph cycles)

Memory allocation

Node or device related initialization

Optimizations

Better done at setup time

What it does ?

Page 72: PROGRAMMING TUTORIAL · April 4-7, 2016 | Silicon Valley Thierry Lepley, April 4th 2016 PROGRAMMING TUTORIAL

72 NVIDIA CONFIDENTIAL. DO NOT DISTRIBUTE.

B. GRAPH VERIFICATION

Explicitly by the application (preferred)

Implicitly by VisionWorks when needed

Must be done before execution

vx_status status = vxVerifyGraph(graph);

vx_status status = vxProcessGraph(graph); // The graph will be automatically // verified if needed at that time

Page 73: PROGRAMMING TUTORIAL · April 4-7, 2016 | Silicon Valley Thierry Lepley, April 4th 2016 PROGRAMMING TUTORIAL

73 NVIDIA CONFIDENTIAL. DO NOT DISTRIBUTE.

C. GRAPH EXECUTION

Synchronous

Asynchronous

Accessing objects used by the graph concurrently to the execution is forbidden

Two Modes

vx_status status = vxProcessGraph(graph);

vx_status status = vxScheduleGraph(graph);

// Do something on CPU

status = vxWaitGraph(graph);

Page 74: PROGRAMMING TUTORIAL · April 4-7, 2016 | Silicon Valley Thierry Lepley, April 4th 2016 PROGRAMMING TUTORIAL

74 NVIDIA CONFIDENTIAL. DO NOT DISTRIBUTE.

VIRTUAL DATA OBJECTS

The application not allowed to access the object content

Images, arrays and pyramids can be virtual

Enables more optimizations (example : kernel fusion)

Intermediate Graph Data

vx_image in = vxCreateImage(context, 1920, 1080, VX_DF_IMAGE_NV12); vx_image out = vxCreateImage(context, 1920, 1080, VX_DF_IMAGE_U8); vx_image tmp = vxCreateVirtualImage(context, 1920, 1080, VX_DF_IMAGE_U8);

// Create, verify the graph and execute the graph vx_node nExtract = vxChannelExtract(graph, in, VX_CHANNEL_Y, tmp); vx_node nBox = vxBox3x3Node(graph, tmp, out);

Channel

Extract

in

tmp

Box3x3

out

Page 75: PROGRAMMING TUTORIAL · April 4-7, 2016 | Silicon Valley Thierry Lepley, April 4th 2016 PROGRAMMING TUTORIAL

75

AGENDA

Programming Basics

Efficient IO

Graph and Delay

User Kernel

Low-Level CUDA API (alpha)

Graphs

Delay Object

Code Example

Page 76: PROGRAMMING TUTORIAL · April 4-7, 2016 | Silicon Valley Thierry Lepley, April 4th 2016 PROGRAMMING TUTORIAL

76 NVIDIA CONFIDENTIAL. DO NOT DISTRIBUTE.

DELAY Collection of Data Objects

Slot –2 Slot –3 Slot -1 Slot 0

4 slots delay containing images

Page 77: PROGRAMMING TUTORIAL · April 4-7, 2016 | Silicon Valley Thierry Lepley, April 4th 2016 PROGRAMMING TUTORIAL

77 NVIDIA CONFIDENTIAL. DO NOT DISTRIBUTE.

DELAY Rolling Buffer

Delay Aging

Slot –2 Slot –3 Slot -1 Slot 0

Page 78: PROGRAMMING TUTORIAL · April 4-7, 2016 | Silicon Valley Thierry Lepley, April 4th 2016 PROGRAMMING TUTORIAL

78 NVIDIA CONFIDENTIAL. DO NOT DISTRIBUTE.

DELAY Rolling Buffer

Delay Aging

Slot –2 Slot –3 Slot -1 Slot 0

Page 79: PROGRAMMING TUTORIAL · April 4-7, 2016 | Silicon Valley Thierry Lepley, April 4th 2016 PROGRAMMING TUTORIAL

79 NVIDIA CONFIDENTIAL. DO NOT DISTRIBUTE.

DELAY API

Exemplar object replicated in each delay slot (meta-data)

The exemplar object:

• Can be any data object (image, array, pyramid, …)

• Not affected by the delay creation, not link to the delay

• No memory allocation if created and released just for the delay

Creation

vx_image exemplar = vxCreateImage(context, 640, 480, VX_DF_IMAGE_RGB);

// Create a 3 slot delay containing 640x480 VX_DF_IMAGE_RGB images vx_delay delay = vxCreateDelay(context, (vx_reference)exemplar, 3);

// The exemplar can now be released immediately vxReleaseImage(&exemplar);

Page 80: PROGRAMMING TUTORIAL · April 4-7, 2016 | Silicon Valley Thierry Lepley, April 4th 2016 PROGRAMMING TUTORIAL

80 NVIDIA CONFIDENTIAL. DO NOT DISTRIBUTE.

DELAY API

The delay slot can be used as any object

Exception: the delay slot must not be released

Get a Reference of Objects in Slots

// Get references to delay slots vx_image prev_img = (vx_image)vxGetReferenceFromDelay(delay, -1); vx_image cur_img = (vx_image)vxGetReferenceFromDelay(delay, 0); // Add a node to the graph vxAbsDiffNode(context, prev_img, cur_img, out)

Page 81: PROGRAMMING TUTORIAL · April 4-7, 2016 | Silicon Valley Thierry Lepley, April 4th 2016 PROGRAMMING TUTORIAL

81 NVIDIA CONFIDENTIAL. DO NOT DISTRIBUTE.

DELAY API

Aging a delay moves data from object in slot n to object in slot n-1

Only data content shift, not objects

Zero copy : internal handle switch

Aging

// Age the delay vxAgeDelay(delay);

Pixel

Buffer

Pixel

Buffer

Pixel

Buffer

Image Object

Image Object

Image Object

After aging

Before aging

Slot 0 Slot -1 Slot -2

Page 82: PROGRAMMING TUTORIAL · April 4-7, 2016 | Silicon Valley Thierry Lepley, April 4th 2016 PROGRAMMING TUTORIAL

82

AGENDA

Programming Basics

Efficient IO

Graph and Delay

User Kernel

Low-Level CUDA API (alpha)

Graphs

Delay Object

Code Example

Page 83: PROGRAMMING TUTORIAL · April 4-7, 2016 | Silicon Valley Thierry Lepley, April 4th 2016 PROGRAMMING TUTORIAL

83 NVIDIA CONFIDENTIAL. DO NOT DISTRIBUTE.

FEATURE TRACKING

Page 84: PROGRAMMING TUTORIAL · April 4-7, 2016 | Silicon Valley Thierry Lepley, April 4th 2016 PROGRAMMING TUTORIAL

84 NVIDIA CONFIDENTIAL. DO NOT DISTRIBUTE.

FEATURE TRACKING

Feature detector primitives

HarrisCorner : strongest Harris points in the image FastCorner : strongest FAST points in the image HarrisTrack : balanced (per cell) Harris (re)detection FastTrack : balanced (per cell) FAST (re)detection

OpticalFlow / tracking primitives

OpticalFlowPyrLK : sparse pyramidal Lucas-Kanade optical flow GaussianPyramid : generate a pyramid from an image

Keypoint structures

vx_keypoint_t : int coordinates, strength, tracking error & status, … nvx_keypoint_t : same as vx_keypoint_t except float coordinates nvx_point2f_t : lightweight structure, only float coordinates

What is in the Toolkit ?

Page 85: PROGRAMMING TUTORIAL · April 4-7, 2016 | Silicon Valley Thierry Lepley, April 4th 2016 PROGRAMMING TUTORIAL

85 NVIDIA CONFIDENTIAL. DO NOT DISTRIBUTE.

FEATURE TRACKING PROCESSING

color

convert

Harris

corner

array of vx_keypoint_t

De

tec

tio

n

.

.

. . .

. . .

.

.

.

. .

. . . . . . .

. . . .

.

. . .

. . . . . . . . . . . . . . . .

.

. .

. .

.

color

convert

Gaussian

pyramid

pyrLK optical flow

pyr 0

array of vx_keypoint_t

pyramids

pyr -1 pts 0 pts -1

Tra

ck

ing

color

convert

Gaussian

pyramid

pyrLK optical flow

pyr 0

array of vx_keypoint_t

pyramids

pyr -1 pts 0 pts -1

Tra

ck

ing

Page 86: PROGRAMMING TUTORIAL · April 4-7, 2016 | Silicon Valley Thierry Lepley, April 4th 2016 PROGRAMMING TUTORIAL

86 NVIDIA CONFIDENTIAL. DO NOT DISTRIBUTE.

TRACKING GRAPH Data Objects

pyr 0

pyramid_delay

pyr -1 pts 0 pts -1

keypoint_delay

// Import the input data into VisionWorks vx_image inputRGB = vxCreateImageFromHandle(context, VX_DF_IMAGE_RGB, &addr, ptrs, NVX_IMPORT_TYPE_CUDA);

// Create the intermediate image vx_image inputGray = vxCreateImage(context, width, height, VX_DF_IMAGE_U8); // Image pyramids for two successive frames (2 slots delay object) vx_pyramid pyramid_exemplar = vxCreatePyramid(context, 4, VX_SCALE_PYRAMID_HALF, width, height, VX_DF_IMAGE_U8); vx_delay pyramid_delay = vxCreateDelay(context, (vx_reference)pyramid_exemplar, 2); vxReleasePyramid(&pyramid_exemplar); // Tracked points need to be stored for two successive frames (2 slot delay object) vx_array keypoint_exemplar = vxCreateArray(context, VX_TYPE_KEYPOINT, 2000); vx_delay keypoint_delay = vxCreateDelay(context,(vx_reference)keypoint_exemplar, 2); vxReleaseArray(&keypoint_exemplar);

inputRGB

array of vx_keypoint_t

inputGray

color

convert

Gaussian

pyramid

pyrLK optical flow

Page 87: PROGRAMMING TUTORIAL · April 4-7, 2016 | Silicon Valley Thierry Lepley, April 4th 2016 PROGRAMMING TUTORIAL

87 NVIDIA CONFIDENTIAL. DO NOT DISTRIBUTE.

TRACKING GRAPH Nodes and Verification

vx_graph graph = vxCreateGraph (context);

// RGB to Y conversion nodes vxColorConvertNode (graph, inputRGB, inputGray);

// Pyramid node vx_node pyr_node = vxGaussianPyramidNode (graph, inputGray, (vx_pyramid) vxGetReferenceFromDelay(pyramid_delay, 0));

// Lucas-Kanade optical flow node, previous keypoints given as new estimates vxOpticalFlowPyrLKNode (graph, (vx_pyramid) vxGetReferenceFromDelay(pyramid_delay, -1), // previous pyramid (vx_pyramid) vxGetReferenceFromDelay(pyramid_delay, 0), // current pyramid (vx_array) vxGetReferenceFromDelay(keypoint_delay, -1), // previous keypoints (vx_array) vxGetReferenceFromDelay(keypoint_delay, -1), // new keypoints estimate (vx_array) vxGetReferenceFromDelay(keypoint_delay, 0), // new keypoints VX_TERM_CRITERIA_BOTH, s_lk_epsilon, s_lk_num_iters, s_lk_use_init_est, 10);

// Graph verification status = vxVerifyGraph (graph);

pyr 0

pyramid_delay

pyr -1 pts 0 pts -1

keypoint_delay

inputRGB

array of vx_keypoint_t

inputGray

color

convert

Gaussian

pyramid

pyrLK optical flow

color

convert

Gaussian

pyramid

pyrLK optical flow

Page 88: PROGRAMMING TUTORIAL · April 4-7, 2016 | Silicon Valley Thierry Lepley, April 4th 2016 PROGRAMMING TUTORIAL

88 NVIDIA CONFIDENTIAL. DO NOT DISTRIBUTE.

TRACKING EXECUTION

pyr 0

pyramid_delay

pyr -1 pts 0 pts -1

keypoint_delay

inputRGB

array of vx_keypoint_t

inputGray

color

convert

Gaussian

pyramid

pyrLK optical flow

color

convert

Gaussian

pyramid

pyrLK optical flow

// Data objects creation

// <…>

// Graph creation & verification

// <…>

// Process the first frame (keypoints detection)

// <…>

// Main processing loop

for (;;)

void *devptr = 0;

vx_rectangle_t rect = 0, width, 0, height;

vxAccessImagePatch (inputRGB, &rect, &addr, 0, &devptr, NVX_WRITE_ONLY_CUDA);

// Get next frame data into ‘devptr’ here

// <…>

vxCommitImagePatch (inputRGB, &rect, 0, &addr, devptr);

// Process graph

vxProcessGraph (graph);

// ‘Age’ delay objects for the next frame processing

vxAgeDelay (pyramid_delay);

vxAgeDelay (keypoint_delay);

new_frame

Page 89: PROGRAMMING TUTORIAL · April 4-7, 2016 | Silicon Valley Thierry Lepley, April 4th 2016 PROGRAMMING TUTORIAL

89

QUESTIONS ?

Page 90: PROGRAMMING TUTORIAL · April 4-7, 2016 | Silicon Valley Thierry Lepley, April 4th 2016 PROGRAMMING TUTORIAL

90 NVIDIA CONFIDENTIAL. DO NOT DISTRIBUTE.

INTERESTED IN OTHER DEVELOPMENT ASPECTS ?

Samples

Documentation

Debug and profiling

VisionWorks Toolkit Hands-on lab

(L6129)

Page 91: PROGRAMMING TUTORIAL · April 4-7, 2016 | Silicon Valley Thierry Lepley, April 4th 2016 PROGRAMMING TUTORIAL

91 NVIDIA CONFIDENTIAL. DO NOT DISTRIBUTE.

MORE QUESTIONS ?

VisionWorks Hangout (H6115)

Page 92: PROGRAMMING TUTORIAL · April 4-7, 2016 | Silicon Valley Thierry Lepley, April 4th 2016 PROGRAMMING TUTORIAL

April 4-7, 2016 | Silicon Valley

THANK YOU

JOIN THE NVIDIA DEVELOPER PROGRAM AT developer.nvidia.com/join