bridging the semantic gap - university of illinois at ...ece417/lecturenotes/ece417 spring...

Bridging the Semantic Gap

ECE 417 Spring 2013

Mert Dikmen

Semantic Gap

Computer Representation <-> Semantic Gap <-> Natural Language Representation

Example labels: Green, Corner, Roof, Ski Slope, Resort, Fun, Holiday, Beautiful…

Semantic Gap in Multimedia

Retrieval: Given a description, retrieve all “relevant” content from a database

Parsing: Given an input, formulate a natural language description

Subtasks

  Detection (find “things”)

  Segmentation (find the boundaries of “things”)

  Recognition (assign category)




Multimedia Analysis Competitions and Evaluations

Moderate size dataset

  Training set with labels

  Evaluation set without labels

Constrained problem

  Detect well-defined actions

  Detect words or concepts

Well-defined metric

Challenges

  Algorithm design

  Computation

Star Challenge

PART I: visual data processing

What is Star Challenge?

Competition to Develop World's Next-Generation Multimedia Search Technology

Hosted by the Agency for Science, Technology and Research (A*STAR), Singapore

A real-world computer vision task which requires large amounts of computation power




But low rewards

56 teams from 17 countries -> Round 1 -> 8 teams -> Round 2 -> 7 teams -> Round 3 -> 5 teams -> Grand Final in Singapore

No rewards along the way. Only one team can win: US$100,000

But we have a team with no fears…

Xiaodan Lyon Paritosh Mark Tom Mandar Sean Jui-Ting Zhen Huazhong Xi Vong Xu Mert Dennis Jason Andrey Yuxiao

Let's go over our experience and stories…

Outline

Problems of Visual Retrieval

Data

Features

Algorithms

Results (first 3 rounds)


3 Audio Retrieval Tasks

AT1
  Query: IPA sequence
  Target: segments that contain the query IPA sequence, regardless of its languages

AT2
  Query: an utterance spoken by different speakers
  Target: all segments that contain the query word/phrase/sentence, regardless of its spoken languages

AT3
  Query: no queries
  Target: extract all recurrent segments which are at least 1 second in length

Metric: Mean Average Precision (listed with AT1), F-measure (listed with AT3)

Data Set: 25 hours monolingual database in Round 1, 13 hours multilingual database in Round 3

Xiaodan will talk about this part…

3 Video Retrieval Tasks

VT1
  Query: single image (20 queries)
  Target: (short) video segs
  Criteria: all the similar segs, “visually similar”
  Data Set: 20 categories, multiple labels possible

VT2
  Query: short video shot (<10 s) (20 queries)
  Target: (long) video segs
  Criteria: all the similar segs, perceptually similar
  Data Set: 10 categories, multiple labels possible

VT3
  Query: videos with sound (3~10 s), order of 10K
  Target: category number
  Criteria: learning the common visual characteristics
  Metric: classification accuracy
  Data Set: 10 (20) categories, including one “others” category

20 VT1 Categories

100 Not-Applicable: none of the labels
101 Crowd (>10 people)
102 Building with sky as backdrop, clearly visible
103 Mobile devices, including handphone/PDA
104 Flag
105 Electronic chart, e.g. stock charts, airport departure chart
106 TV chart overlay, including graphs, text, PowerPoint style
107 Person using computer, both visible
108 Track and field, sports
109 Company trademark, including billboard, logo
110 Badminton court, sports
111 Swimming pool, sports
112 Close-up of hand, e.g. using mouse, writing, etc.
113 Business meeting (>2 people), mostly seated down, table visible
114 Natural scene, e.g. mountain, trees, sea, no people
115 Food on dishes, plates
116 Face close-up, occupying about 3/4 of screen, frontal or side
117 Traffic scene, many cars, trucks, road visible
118 Boat/ship, over sea, lake
119 PC webpages, screen of PC visible
120 Airplane

10 Categories for VT2

201 People entering/exiting door/car
202 Talking face with introductory caption
203 Fingers typing on a keyboard
204 Inside a moving vehicle, looking outside
205 Large camera movement, tracking an object (person, car, etc.)
206 Static or minute camera movement, people(s) walking, legs visible
207 Large camera movement, panning left/right, top/down of a scene
208 Movie ending credit
209 Woman monologue
210 Sports celebratory hug






Video+Audio Tasks in Round 3

1) Audio search (AT1 or AT2)

5 queries will be given, either in the form of an IPA sequence or a waveform, and the participants are required to solve 4.

2) Video search (VT1)

5 queries will be given and the participants are required to solve 4.

3) Audio + Video search (AT1 + VT2)

The search queries for this task are a combination of an IPA sequence/waveform and a video category. The participants are required to retrieve segments of data which contain sound and video corresponding to the given IPA sequence/waveform and video category, respectively. 3 queries will be given and the participants are required to solve 2.







Examples of Images

More samples

Evaluation Video Data of Round 2

31 MPEG videos, ~20 hours

17289 frames for VT1 in total

32508 pseudo key frames, 8486 real key frames

Video Files: 27 MPEG-1 files (13 hours of video/audio in total)

Key frames for VT1: 10580 jpg files

Key frames for VT2: 64546 files in total, including 10580 jpg files (true key frames) + 53966 jpg files (pseudo key frames)

Video: 352×288

Computation Power

Workstations in IFP

  10 servers, 2~4 CPUs each, 36 CPUs in total

  IFP-32 cluster: 32 dual-core 2.8 GHz 64-bit CPUs

CSL Cluster

  Trusted-ILLIAC: 256 nodes with dual 2.2 GHz Opterons, 2 GB of RAM and 73 GB SCSI Ultra320 disks

  Monolith: 128-node cluster with dual Pentium III CPUs at 1 GHz with 1.5 GB of RAM per node

TeraGrid

Time Cost for Video Tasks

Data Decompression: 15 minutes

Video Format Conversion: 2 hours

Video Segmentation (for VT2): 40 minutes

Sound Track Extraction: 30 minutes

Feature Extraction

  Global Feature 2: 2 hours (C)

  Patch-based Feature 1: 2 hours (C)

  Patch-based Feature 2: 5 hours (Matlab)

  Semantic Feature 1: 24 hours (Matlab)

  Semantic Feature 2: 3 hours (C)

  Motion Feature 1: 24 hours (Matlab)

  Motion Feature 2: 3 hours on t-Illiac

Classifier Training

  Classifier 1: 1 hour (on IFP cluster, 25 CPUs, Matlab)

  Classifier 2: 20 minutes

  Classifier 3: less than 10 minutes




Possible Accelerations for Video

Matlab code to C

Parallel computing

GPU acceleration

  Patch-based features

  Load time is the major issue

  Extracting all the features after one load




Features for Round 2 - VT1

Image Features

  SIFT

  HOG

  GIST

  APC

  LBP

  Color, Texture, etc.

Semantic Feature

Features for Round 2 - VT2

Character Detector

  Harris corner

  Morphological operations

Optical Flow

  Lucas-Kanade on spatial intensity gradient

Gender recognition

  SODA-boost based

Motion History Image

Spatial interest points



GUFE: Grand Unified Feature Extractor

Designed by Dennis

Collects features generated by team members into one standard format

Retrieval by Query Expansion based on NN

Feature Normalization/Combination

Result Visualization





Observations

1. Samples under the same category are more semantically similar to each other

2. The shot boundaries are not well defined

3. Some of the key frames are not labeled correctly, e.g. VT1 101 103(26-141)

Algorithms

Input: a query image and its category number

0. Preprocessing: compute the matching between the evaluation and the development data

Query Expansion

1. Expand the query image by retrieving all the images from the development data set with the same category

2. Search the evaluation set with the expanded query

Output: return the top 50/20 results
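The query expansion steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: each image is represented by a plain feature vector, the distance is Euclidean, and an evaluation item is scored by its distance to the nearest member of the expanded query. The function names and the toy data are hypothetical and are not taken from the team's code.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def expanded_search(query_feature, category, dev_set, eval_set, top_k):
    """Nearest-neighbor retrieval with query expansion (hypothetical sketch).

    dev_set:  list of (feature, category) pairs with known labels
    eval_set: list of (item_id, feature) pairs without labels
    """
    # Step 1: expand the query with every development image of the same category.
    expanded = [query_feature] + [f for f, c in dev_set if c == category]
    # Step 2: score each evaluation item by its distance to the closest
    # member of the expanded query (an assumed scoring rule).
    scored = []
    for item_id, feature in eval_set:
        best = min(euclidean(feature, q) for q in expanded)
        scored.append((best, item_id))
    scored.sort()
    # Output: the top_k closest evaluation items.
    return [item_id for _, item_id in scored[:top_k]]


dev = [([0.0, 0.0], 101), ([1.0, 1.0], 101), ([10.0, 10.0], 102)]
evaluation = [("e1", [0.1, 0.0]), ("e2", [9.0, 9.5]),
              ("e3", [1.2, 0.9]), ("e4", [5.0, 5.0])]
result = expanded_search([0.5, 0.5], 101, dev, evaluation, top_k=2)
print(result)
```

In this toy run the two evaluation items closest to the category 101 examples are returned, even though the query vector alone sits between them.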


Motivation: using a GMM to model the distribution of patches

1. Train a UBM (Universal Background Model) based on patches from all training images

2. MAP estimation of the distribution of the patches belonging to one image, given the UBM

3. Compute pair-wise image distance based on patch kernel and within-class covariance normalization

4. Retrieve images based on the normalized distance

VT1 Performance (2 in 8)

Category: MAP

• 101 Crowd (>10 people): 0.8419
• 102 Building with sky as backdrop, clearly visible: 0.977
• 103 Mobile devices, including handphone/PDA: 0.028
• 107 Person using computer, both visible: 0.2281
• 109 Company trademark, including billboard, logo: 0.96
• 112 Close-up of hand, e.g. using mouse, writing, etc.: 0.4584
• 113 Business meeting (>2 people), mostly seated down, table visible: 0.0644
• 115 Food on dishes, plates: 0.2285
• 116 Face close-up, occupying about 3/4 of screen, frontal or side: 0.9783
• 117 Traffic scene, many cars, trucks, road visible: 0.2901

VT2 Performance (1 in 8)

Category: MAP

• 202 Talking face with introductory caption: 0.8432
• 206 Static or minute camera movement, people(s) walking, legs visible: 0.0581
• 207 Large camera movement, panning left/right, top/down of a scene: 0.7789
• 208 Movie ending credit: 0.2782
• 209 Woman monologue Zhen: 0.9756

Performance of Round 3 (1 in 7)

Task 2 (VT1)

Target: Estimated MAP (R=20)

101 Crowd (>10 people): 0.64
102 Building with sky as backdrop, clearly visible: 1
107 Person using computer, both visible: 0.7
112 Close-up of hand, e.g. using mouse, writing, etc.: 0.527
116 Face close-up, occupying about 3/4 of screen, frontal or side: 1

Task 3 (AT1 + VT2)

Retrieval Target (Video)                VT2 only (R=20)   AT1 + VT2
202 face with introductory caption      1                 0.03
209 women monolog                       0.35              0.1
201 People entering door                NA

We are

2nd in Audio search

4th in Video search

2nd in AV search

1st overall

TRECVid

TREC: Text REtrieval Conference
TRECVid: Video Retrieval Workshop

Tasks: Shot Boundary Detection, Copy Detection, Video Search, Summarization, High Level Feature Extraction, Surveillance Event Detection

Our Task

Surveillance Event Detection

The Dataset

  Surveillance footage from London Gatwick Airport
  5 stationary cameras
  Training set: 100 hours
  Testing set: 44 hours
  Frame size: 700x540 px
  Dataset size: ~350 GB
  Frames: ~12 million

List of Events

  1. Cell to ear
  2. Embrace
  3. Object put
  4. Opposing flow
  5. Pointing
  6. Taking picture
  7. Running
  8. People meeting
  9. People splitting
  10. Person not entering elevator

Regional Averaging

Door Open/Close Information

Event Detections

Thresholding Rule

Detection of Opposing Flow Event

Vision Video Library (ViVid): Utilizing GPUs in Computer Vision

Research tool

  Rapid development

  Fast execution

Python glue layer

  CUDA, C/C++

  Integrates libraries

Data flow

  Lazy pull

  Per-frame referencing

  Caches (lots of them)



Motivations

Most operations are highly local

Applications with real-time (or faster) performance requirements

  Surveillance

  Soft biometrics

  Multimedia Indexing

Visual Computing is here

  Imaging and Photogrammetry

  Pattern Recognition and Statistical Learning

  Object Detection and Recognition

  Dynamic Vision

  Interactive and Internet Vision


Working with ViVid

Why parallel?

Massive amounts of data

  20 hours of video uploaded to YouTube every minute

  15 billion photos on Facebook

Most operations are local and independent in the (x, y, t) space

Already available (GPUs)

If individual frames were to be counted as images, YouTube replicates the entire Facebook image DB every ~5 days
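The "~5 days" figure can be checked with a short back-of-the-envelope calculation. The upload rate and photo count come from the slide; the frame rate of 30 frames per second is an assumption made here for illustration (at 25 frames per second the result is closer to 6 days).

```python
# Back-of-the-envelope check of the "~5 days" claim.
HOURS_UPLOADED_PER_MINUTE = 20        # from the slide
FACEBOOK_PHOTOS = 15_000_000_000      # from the slide
ASSUMED_FPS = 30                      # assumption, not stated on the slide

# Frames of video arriving during one minute of wall-clock time.
frames_per_minute = HOURS_UPLOADED_PER_MINUTE * 3600 * ASSUMED_FPS

# Wall-clock time until the uploaded frame count matches the photo count.
minutes_needed = FACEBOOK_PHOTOS / frames_per_minute
days_needed = minutes_needed / (60 * 24)

print(f"{frames_per_minute:,} frames uploaded per minute")
print(f"{days_needed:.1f} days to match the photo count")
```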

Image / Video Processing

  Video Decoder

  2D/3D Convolution

  2D/3D Fourier Transform

  Optical Flow

Feature Extraction

  Motion Descriptor (Efros et al.)

  Motion History Descriptor

  Random Video Interest Points

  Histograms of Oriented Gradients / Optical Flow

Analysis

  Vector Quantization

  SVM Classifier Evaluation

ViVid – Video Computer Vision on Graphics Processors

Download: http://github.com/mertdikmen/ViVid

TRECVid 2008 System

TRECVid 2009 System

[Figure: system diagram with the blocks Video, Interest Points, Features (Motion, Shape), Vector Quantization, Histogram, Classifier, and Event Label]

Event labels

• Running
• Pointing
• Object Put
• Cell To Ear

Video Interest Point Detectors

Laptev

  3D Harris Corner Detector

  Corners

Dollar

  Space Time Gabor

  Corners

  Periodic Motion

RSMB

  Random Sampling of the Motion Boundary

  Motion

More is Good

Interest Point Detection Rates

Video Features

Descriptors of information relevant to the task

  Motion

  Shape

  Appearance

Computationally intensive

  Development

  Application

Averaging Flow (Efros et al. 2003)

Motion History Images (Bobick & Davis 2001)

$$
H_\tau(x, y, t) =
\begin{cases}
\tau & \text{if } D(x, y, t) = 1 \\
\max\bigl(0,\; H_\tau(x, y, t-1) - 1\bigr) & \text{otherwise}
\end{cases}
$$
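The motion history update above can be sketched for one time step as follows. It assumes the motion mask D is already available as a binary grid (for example from frame differencing, which is not shown) and stores images as nested Python lists; the function name is hypothetical.

```python
def update_mhi(prev_h, motion_mask, tau):
    """One time step of the motion history image update.

    prev_h:      H_tau(., ., t-1) as a list of rows
    motion_mask: D(., ., t) as a list of rows of 0/1 values
    tau:         value assigned to pixels that moved in the current frame
    """
    return [
        # Moving pixels are reset to tau; all others decay by 1, floored at 0.
        [tau if d == 1 else max(0, h - 1) for h, d in zip(h_row, d_row)]
        for h_row, d_row in zip(prev_h, motion_mask)
    ]


# A single bright spot moving left to right across a 1x3 image.
history = [[0, 0, 0]]
for mask in ([[1, 0, 0]], [[0, 1, 0]], [[0, 0, 1]]):
    history = update_mhi(history, mask, tau=3)
print(history)
```

After three frames the most recent position holds the largest value and older positions have decayed, so the image encodes both where and how recently motion occurred.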

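[Figure: bar chart of milliseconds per frame (axis 0 to 300) for C, CUDA (distance), CUDA (distance + argmin), and CUDA (feature + distance + argmin). Bar labels in the source: 133, 438, 51, 251; decimal points and bar assignment are not recoverable.]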

Histograms of Oriented Gradients / Optical Flow

• Partition the image window into local regions
• Histogram the image gradient / optical flow based on the direction and magnitude
• Normalize over neighboring regions
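The three steps above can be illustrated with a small pure-Python sketch for the gradient case. The specific choices here are assumptions made for illustration, not values from the slides: central-difference gradients, 9 unsigned orientation bins over 0 to 180 degrees, square cells, and L2 normalization over 2x2 blocks of neighboring cells.

```python
import math


def cell_histogram(image, top, left, size, bins):
    """Magnitude-weighted histogram of gradient direction for one cell."""
    hist = [0.0] * bins
    h, w = len(image), len(image[0])
    for y in range(top, top + size):
        for x in range(left, left + size):
            # Central differences, clamped at the image border.
            gx = image[y][min(x + 1, w - 1)] - image[y][max(x - 1, 0)]
            gy = image[min(y + 1, h - 1)][x] - image[max(y - 1, 0)][x]
            magnitude = math.hypot(gx, gy)
            angle = math.degrees(math.atan2(gy, gx)) % 180.0  # unsigned direction
            hist[int(angle / (180.0 / bins)) % bins] += magnitude
    return hist


def hog(image, cell=4, bins=9):
    """Partition into cells, histogram each, normalize over 2x2 cell blocks."""
    h, w = len(image), len(image[0])
    cells = [[cell_histogram(image, r, c, cell, bins)
              for c in range(0, w - cell + 1, cell)]
             for r in range(0, h - cell + 1, cell)]
    descriptor = []
    for r in range(len(cells) - 1):
        for c in range(len(cells[0]) - 1):
            block = (cells[r][c] + cells[r][c + 1]
                     + cells[r + 1][c] + cells[r + 1][c + 1])
            norm = math.sqrt(sum(v * v for v in block)) + 1e-6
            descriptor.extend(v / norm for v in block)
    return descriptor


# An 8x8 horizontal intensity ramp: every gradient points along the x axis.
ramp = [[x for x in range(8)] for _ in range(8)]
descriptor = hog(ramp)
print(len(descriptor), [round(v, 3) for v in descriptor[:9]])
```

The same histogramming applies to optical flow by substituting the flow components for (gx, gy).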






K-Means Clustering

Vector quantization

  Turns high dimensional features into a discrete number of points

Given data, find representative “centers”

Lloyd's algorithm

  For each data point, find the closest center

  Update the center to be the mean of the associated data points
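The two alternating steps of Lloyd's algorithm can be sketched as follows. This is a minimal illustration assuming squared Euclidean distance, a fixed number of iterations, and caller-supplied initial centers; centers that attract no points are simply left unchanged.

```python
def squared_distance(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def lloyd(points, initial_centers, iterations=10):
    """Alternate assignment and mean-update steps for a fixed number of rounds."""
    centers = [list(c) for c in initial_centers]
    for _ in range(iterations):
        # Assignment step: each point goes to its closest center.
        groups = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)),
                          key=lambda k: squared_distance(p, centers[k]))
            groups[nearest].append(p)
        # Update step: each center becomes the mean of its assigned points.
        for k, members in enumerate(groups):
            if members:
                centers[k] = [sum(col) / len(members) for col in zip(*members)]
    return centers


points = [(0, 0), (0, 2), (10, 10), (10, 12)]
centers = lloyd(points, [(0, 0), (10, 10)])
print(centers)
```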







K-Means

Relies heavily on pairwise distance

Large data sets

  1 million features with 100-200 dimensions

  1000 centers

Cannot fit output in GPU memory

  Will need to reduce as computation proceeds

Need efficient reduction operator
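To see why the output does not fit: 1 million features times 1000 centers is 10^9 distances, about 4 GB if each is stored in single precision (the precision is an assumption). One way around this, sketched below on the CPU in plain Python, is to compute the distances a block at a time and immediately reduce each row to the index of its closest center, so only one block exists at any moment. The function name and chunking scheme are illustrative and are not ViVid's implementation.

```python
def nearest_centers(features, centers, chunk_size):
    """Index of the closest center for every feature, computed chunk by chunk.

    Only a chunk_size x len(centers) block of distances is held at a time;
    each block is reduced with an argmin and then discarded.
    """
    assignments = []
    for start in range(0, len(features), chunk_size):
        chunk = features[start:start + chunk_size]
        # Distance block for this chunk only (squared Euclidean).
        block = [[sum((a - b) ** 2 for a, b in zip(f, c)) for c in centers]
                 for f in chunk]
        # Reduction: keep just the argmin of each row.
        assignments.extend(min(range(len(centers)), key=row.__getitem__)
                           for row in block)
    return assignments


features = [[0.0], [9.0], [4.0], [6.0], [10.0]]
centers = [[0.0], [10.0]]
labels = nearest_centers(features, centers, chunk_size=2)
print(labels)
```

The result does not depend on the chunk size, which can therefore be chosen to match the available memory.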






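Clustering Helps

Pairwise Distance Implementation

[Figure: bar chart comparing CUDA and C, axis 0 to 6000]

Given

$$
A = \{a_1, \dots, a_m\}, \qquad B = \{b_1, \dots, b_n\}
$$

Compute

$$
\begin{bmatrix}
d(a_1, b_1) & d(a_1, b_2) & \cdots & d(a_1, b_n) \\
d(a_2, b_1) & \ddots & & \vdots \\
\vdots & & & \\
d(a_m, b_1) & \cdots & & d(a_m, b_n)
\end{bmatrix}
$$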
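A small sketch of the distance matrix defined above, assuming d is the squared Euclidean distance (the slides do not name the metric). It uses the expansion ||a - b||^2 = ||a||^2 + ||b||^2 - 2 a·b, which turns the bulk of the work into inner products between rows of A and rows of B, the same access pattern as a matrix multiplication.

```python
def pairwise_sq_dist(A, B):
    """m x n matrix of squared Euclidean distances between rows of A and B."""
    # Squared norms are computed once per vector and reused.
    a_norms = [sum(x * x for x in a) for a in A]
    b_norms = [sum(x * x for x in b) for b in B]
    return [
        [a_norms[i] + b_norms[j] - 2.0 * sum(x * y for x, y in zip(A[i], B[j]))
         for j in range(len(B))]
        for i in range(len(A))
    ]


A = [[0.0, 0.0], [3.0, 4.0]]
B = [[0.0, 0.0], [3.0, 0.0], [0.0, 4.0]]
D = pairwise_sq_dist(A, B)
print(D)
```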

CPU vs GPU

Algorithmic properties that map well to GPUs

1. Independent and highly data-local computations

2. Compute bound

3. Little branch divergence

Pairwise Distance Computation on the GPU

[Figure: diagram with the labels Shared Memory, A, B, and C]

Pairwise Distance Computation

[Figure: diagrams with the labels A, B, and C]

Timings on TRECVid 2008 System

[Figure: log-scale bar chart (1 to 10000 milliseconds) comparing GPU + CPU against CPU for the stages Fetch Frame, Optical Flow, Transfer to GPU, Feature Extraction, and Pairwise Distance. Bar labels in the source: 53, 79, 23, 240, 150, 53, 79, 3030, 4947.]

Benchmark

Dictionary Building Strategies

Rate of Detection

Dictionary Size   Histogramming Method   Low     Medium   High
1000              Raw                    0.681   0.804    0.844
1000              Norm                   0.708   0.799    0.840
1000              Mt Inf                 0.594   0.804    0.848
500               Raw                    0.675   0.792    0.833
500               Norm                   0.701   0.791    0.825
500               Mt Inf                 0.626   0.783    0.819
200               Raw                    0.671   0.772    0.811
200               Norm                   0.701   0.779    0.818
200               Mt Inf                 0.614   0.720    0.756
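The table compares ways of turning quantized features into a histogram. As a rough illustration, the sketch below builds a histogram of dictionary word indices in two variants. Reading "Raw" as plain counts and "Norm" as counts scaled to sum to 1 is an assumption; the source does not define them, and the "Mt Inf" variant is left out because its definition is not given.

```python
def bag_of_words(word_ids, dictionary_size, normalize):
    """Histogram of dictionary word indices for one video segment.

    word_ids:        index of the nearest dictionary center for each feature
    dictionary_size: number of centers in the dictionary
    normalize:       False for plain counts, True to scale the bins to sum to 1
    """
    hist = [0.0] * dictionary_size
    for w in word_ids:
        hist[w] += 1.0
    if normalize:
        total = sum(hist)
        if total > 0:
            hist = [v / total for v in hist]
    return hist


words = [0, 2, 2, 3]
raw = bag_of_words(words, 4, normalize=False)
norm = bag_of_words(words, 4, normalize=True)
print(raw, norm)
```

Normalizing removes the dependence on how many interest points a segment happens to contain, which is one plausible reason the two variants behave differently at low detection rates.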

Results (2009)

Event         True Positives   False Alarm   Miss   Min DCR
Pointing      13 (57)          225 (2505)    1050   1.006
Cell To Ear   0 (8)            58 (4005)     194    1.060
Person Runs   1 (0)            38 (314)      106    0.997
Object Put    1 (21)           190 (2703)    620    1.020

Values in parentheses are the 2008 results.

Conclusions

Some practical problems are very hard to solve

  Fusion of many different approaches

Take advantage of all available hardware

  Cloud

  GPUs

Contests/Evaluations Experience

  Working with realistic data

  Engineering, Programming

  Tight schedule, streamlined development





Examples of Evaluations

TRECVid 2012

  Task - Semantic indexing (SIN)

  Task - Known-item search (KIS)

  Task - Interactive surveillance event detection (SED)

  Task - Instance search (INS)

  Task - Multimedia event detection (MED)

  Task - Multimedia event recounting (MER)







Pascal Visual Object Classes

  Classification/detection

  Segmentation

  Person Layout

  Action Classification

ImageNet

  Large Scale Visual Recognition

  10000 Classes


Computer

Representation

Semantic Gap

Natural

Language

Representation


Green


Corner


Roof


Ski Slope


Resort


Fun

Holiday

Beautifulhellip



database


Subtasks





database


Subtasks





Evaluations


Evaluations




Constrained problem



Well defined metric

Challenges

Algorithm design

Computation




Constrained problem



Well defined metric

Challenges

Algorithm design

Computation











56 teams

from 17 countries

Round 1

8 teams

Round 2

7 teams

Round 3

5 teams Grand Final

in Singapore


No rewards

Only one team

can win

US$100000








storieshellip


storieshellip

Outlines Outlines


Data

Features

Algorithms



Data

Features

Algorithms




query IPA sequence



25 hours

monolingual

database in

round1

13 hours

multilingual

database in

round3





languages



length

F-measure




VT1 Single

Image

20

queries

(short)

Video

Segs

All the

similar

Segs

ldquovisually

similarrdquo


20 categories

multiple labels

possible

VT2 Short

Video

Shot

(lt10s)

20

queries

(long)

Video

Segs

All the

similar

Segs

Perceptually

Similar

10 categories

multiple labels

possible

VT3 Videos

with

sound

(3~10s)

Order

of 10K

Category

number

learning the

common

visual

characteristics


categories

including one

ldquoothersrdquo

category






104 Flag
















120 Airplane





104 Flag
















120 Airplane










209 Woman monologue










209 Woman monologue










































Video 352288





Video 352288





CSL Cluster



TeraGrid




CSL Cluster



TeraGrid






Feature Extraction










Classifier Training








Feature Extraction










Classifier Training





Matlab codes to C

Parallel computing

GPU Acceleration




Matlab codes to C

Parallel computing

GPU Acceleration





Image Features

SIFT

HOG

GIST

APC

LBP


Semantic Feature

Image Features

SIFT

HOG

GIST

APC

LBP


Semantic Feature


Character Detector

Harris corner


Optical Flow


Gender recognition

SODA-boost based



Character Detector

Harris corner


Optical Flow


Gender recognition

SODA-boost based




Extractor


Extractor

Designed by Dennis





Designed by Dennis









eg VT1 101 103(26-141)





eg VT1 101 103(26-141)




development data

Query Expansion







patches









Category MAP












Category MAP









Task 2 (VT1)







Task3 (AT1 + VT2)


Video R=20




We are

2nd in Audio search

4th in Video search

2nd in AV search

1st overall

TRECVid



Our Task





Regional Averaging


Event Detections

Thresholding Rule




Research tool

Rapid development

Fast execution

Python glue layer

CUDA CC++


Data flow

Lazy pull



Research tool

Rapid development

Fast execution

Python glue layer

CUDA CC++


Data flow

Lazy pull






Surveillance

Soft biometrics

Multimedia Indexing





Dynamic Vision




Surveillance

Soft biometrics

Multimedia Indexing





Dynamic Vision
















Video Decoder

2D3D Convolution


Optical Flow

Feature Extraction








Download


TRECVid 2008 System


Features Video

Motion

Shape

Classifier

Event Label

bull Running

bull Pointing

bull Object Put

bull Cell To Ear

Vector

Quantization

Histogram

Interest Points


Laptev


Corners

Dollar

Space Time Gabor

Corners

Periodic Motion

RSMB


Motion

Laptev


Corners

Dollar

Space Time Gabor

Corners

Periodic Motion

RSMB


Motion

Laptev

Dollar

RSMB





Motion

Shape

Appearance


Development

Application


Motion

Shape

Appearance


Development

Application






otherwise

1t)yD(xif

1)1)ty(xHmax(0

τt)y(xH

τ

τ

133

438

51

251

0 50 100 150 200 250 300



CUDA (distance)

C


Histograms of

Oriented Gradients

Optical Flow

Histograms of

Oriented Gradients

Optical Flow





Histograms of

Oriented Gradients

Optical Flow

Histograms of

Oriented Gradients

Optical Flow





Histograms of

Oriented Gradients

Optical Flow

Histograms of

Oriented Gradients

Optical Flow






Vector quantization






Vector quantization






K-Means K-Means


Large data sets


1000 centers





Large data sets


1000 centers






0 2000 4000 6000

CUDA

C

)bd(a)bd(a

)bd(a

)bd(a)bd(a)bd(a

nm1m

12

n11111

n1

m1

bbB

aaA

Compute Given




computations

2Compute bound



Shared

Memory

A B

C


A B

C


A B

C


53 79

23

240 150

53 79

3030 4947

1

10

100

1000

10000

Fetch

Frame

Optical

Flow

Transfer to

GPU

Feature

Extraction

Pairwise

Distance

millise

co

nd

s

GPU + CPU CPU

Benchmark Benchmark


Dictionary Size


Low Medium High

1000 Raw 0681 0804 0844

Norm 0708 0799 0840

Mt Inf 0594 0804 0848

500 Raw 0675 0792 0833

Norm 0701 0791 0825

Mt Inf 0626 0783 0819

200 Raw 0671 0772 0811

Norm 0701 0779 0818

Mt Inf 0614 0720 0756



Alarm

Miss Min DCR

Pointing 13 225 1050 1006

Cell To Ear 0 58 194 1060


Object Put 1 190 620 1020



Alarm

Miss Min DCR

Pointing 13 (57) 225 (2505) 1050 1006

Cell To Ear 0 (8) 58 (4005) 194 1060

Person Runs 1 (0) 38 (314) 106 0997

Object Put 1 (21) 190 (2703) 620 1020

(2008 Results)





Cloud

GPUs








Cloud

GPUs




















Segmentation

Person Layout



Segmentation

Person Layout


ImageNet


ImageNet


10000 Classes


Green


Corner


Roof


Ski Slope


Resort


Fun

Holiday

Beautifulhellip



database


Subtasks





database


Subtasks





Evaluations


Evaluations




Constrained problem



Well defined metric

Challenges

Algorithm design

Computation




Constrained problem



Well defined metric

Challenges

Algorithm design

Computation











56 teams

from 17 countries

Round 1

8 teams

Round 2

7 teams

Round 3

5 teams Grand Final

in Singapore


No rewards

Only one team

can win

US$100000








storieshellip


storieshellip

Outlines Outlines


Data

Features

Algorithms



Data

Features

Algorithms




query IPA sequence



25 hours

monolingual

database in

round1

13 hours

multilingual

database in

round3





languages



length

F-measure




VT1 Single

Image

20

queries

(short)

Video

Segs

All the

similar

Segs

ldquovisually

similarrdquo


20 categories

multiple labels

possible

VT2 Short

Video

Shot

(lt10s)

20

queries

(long)

Video

Segs

All the

similar

Segs

Perceptually

Similar

10 categories

multiple labels

possible

VT3 Videos

with

sound

(3~10s)

Order

of 10K

Category

number

learning the

common

visual

characteristics


categories

including one

ldquoothersrdquo

category






104 Flag
















120 Airplane





104 Flag
















120 Airplane










209 Woman monologue










209 Woman monologue










































Video 352288





Video 352288





CSL Cluster



TeraGrid




CSL Cluster



TeraGrid






Feature Extraction










Classifier Training








Feature Extraction










Classifier Training





Matlab codes to C

Parallel computing

GPU Acceleration




Matlab codes to C

Parallel computing

GPU Acceleration





Image Features

SIFT

HOG

GIST

APC

LBP


Semantic Feature

Image Features

SIFT

HOG

GIST

APC

LBP


Semantic Feature


Character Detector

Harris corner


Optical Flow


Gender recognition

SODA-boost based



Character Detector

Harris corner


Optical Flow


Gender recognition

SODA-boost based




Extractor


Extractor

Designed by Dennis





Designed by Dennis









eg VT1 101 103(26-141)





eg VT1 101 103(26-141)




development data

Query Expansion







patches









Category MAP












Category MAP









Task 2 (VT1)







Task3 (AT1 + VT2)


Video R=20




We are

2nd in Audio search

4th in Video search

2nd in AV search

1st overall

TRECVid



Our Task





Regional Averaging


Event Detections

Thresholding Rule




Research tool

Rapid development

Fast execution

Python glue layer

CUDA CC++


Data flow

Lazy pull



Research tool

Rapid development

Fast execution

Python glue layer

CUDA CC++


Data flow

Lazy pull






Surveillance

Soft biometrics

Multimedia Indexing





Dynamic Vision




Surveillance

Soft biometrics

Multimedia Indexing





Dynamic Vision
















Video Decoder

2D3D Convolution


Optical Flow

Feature Extraction








Download


TRECVid 2008 System


Features Video

Motion

Shape

Classifier

Event Label

bull Running

bull Pointing

bull Object Put

bull Cell To Ear

Vector

Quantization

Histogram

Interest Points


Laptev


Corners

Dollar

Space Time Gabor

Corners

Periodic Motion

RSMB


Motion

Laptev


Corners

Dollar

Space Time Gabor

Corners

Periodic Motion

RSMB


Motion

Laptev

Dollar

RSMB





Motion

Shape

Appearance


Development

Application


Motion

Shape

Appearance


Development

Application






otherwise

1t)yD(xif

1)1)ty(xHmax(0

τt)y(xH

τ

τ

133

438

51

251

0 50 100 150 200 250 300



CUDA (distance)

C


Histograms of

Oriented Gradients

Optical Flow

Histograms of

Oriented Gradients

Optical Flow





Histograms of

Oriented Gradients

Optical Flow

Histograms of

Oriented Gradients

Optical Flow





Histograms of

Oriented Gradients

Optical Flow

Histograms of

Oriented Gradients

Optical Flow






Vector quantization






Vector quantization






K-Means K-Means


Large data sets


1000 centers





Large data sets


1000 centers






0 2000 4000 6000

CUDA

C

)bd(a)bd(a

)bd(a

)bd(a)bd(a)bd(a

nm1m

12

n11111

n1

m1

bbB

aaA

Compute Given




computations

2Compute bound



Shared

Memory

A B

C


A B

C


A B

C


53 79

23

240 150

53 79

3030 4947

1

10

100

1000

10000

Fetch

Frame

Optical

Flow

Transfer to

GPU

Feature

Extraction

Pairwise

Distance

millise

co

nd

s

GPU + CPU CPU

Benchmark Benchmark


Dictionary Size


Low Medium High

1000 Raw 0681 0804 0844

Norm 0708 0799 0840

Mt Inf 0594 0804 0848

500 Raw 0675 0792 0833

Norm 0701 0791 0825

Mt Inf 0626 0783 0819

200 Raw 0671 0772 0811

Norm 0701 0779 0818

Mt Inf 0614 0720 0756



Alarm

Miss Min DCR

Pointing 13 225 1050 1006

Cell To Ear 0 58 194 1060


Object Put 1 190 620 1020



Alarm

Miss Min DCR

Pointing 13 (57) 225 (2505) 1050 1006

Cell To Ear 0 (8) 58 (4005) 194 1060

Person Runs 1 (0) 38 (314) 106 0997

Object Put 1 (21) 190 (2703) 620 1020

(2008 Results)





Cloud

GPUs








Cloud

GPUs




















Segmentation

Person Layout



Segmentation

Person Layout


ImageNet


ImageNet


10000 Classes


Corner


Roof


Ski Slope


Resort


Fun

Holiday

Beautifulhellip



database


Subtasks





database


Subtasks





Evaluations


Evaluations




Constrained problem



Well defined metric

Challenges

Algorithm design

Computation




Constrained problem



Well defined metric

Challenges

Algorithm design

Computation











56 teams

from 17 countries

Round 1

8 teams

Round 2

7 teams

Round 3

5 teams Grand Final

in Singapore


No rewards

Only one team

can win

US$100000








storieshellip


storieshellip

Outlines Outlines


Data

Features

Algorithms



Data

Features

Algorithms




query IPA sequence



25 hours

monolingual

database in

round1

13 hours

multilingual

database in

round3





languages



length

F-measure




VT1 Single

Image

20

queries

(short)

Video

Segs

All the

similar

Segs

ldquovisually

similarrdquo


20 categories

multiple labels

possible

VT2 Short

Video

Shot

(lt10s)

20

queries

(long)

Video

Segs

All the

similar

Segs

Perceptually

Similar

10 categories

multiple labels

possible

VT3 Videos

with

sound

(3~10s)

Order

of 10K

Category

number

learning the

common

visual

characteristics


categories

including one

ldquoothersrdquo

category






104 Flag
















120 Airplane





104 Flag
















120 Airplane










209 Woman monologue










209 Woman monologue










































Video 352288





Video 352288





CSL Cluster



TeraGrid




CSL Cluster



TeraGrid






Feature Extraction










Classifier Training








Feature Extraction










Classifier Training





Matlab codes to C

Parallel computing

GPU Acceleration




Matlab codes to C

Parallel computing

GPU Acceleration





Image Features

SIFT

HOG

GIST

APC

LBP


Semantic Feature

Image Features

SIFT

HOG

GIST

APC

LBP


Semantic Feature


Character Detector

Harris corner


Optical Flow


Gender recognition

SODA-boost based



Character Detector

Harris corner


Optical Flow


Gender recognition

SODA-boost based




Extractor


Extractor

Designed by Dennis





Designed by Dennis









eg VT1 101 103(26-141)





eg VT1 101 103(26-141)




development data

Query Expansion







patches









Category MAP












Category MAP









Task 2 (VT1)







Task3 (AT1 + VT2)


Video R=20




We are

2nd in Audio search

4th in Video search

2nd in AV search

1st overall

TRECVid



Our Task





Regional Averaging


Event Detections

Thresholding Rule




Research tool

Rapid development

Fast execution

Python glue layer

CUDA CC++


Data flow

Lazy pull



Research tool

Rapid development

Fast execution

Python glue layer

CUDA CC++


Data flow

Lazy pull






Surveillance

Soft biometrics

Multimedia Indexing





Dynamic Vision




Surveillance

Soft biometrics

Multimedia Indexing





Dynamic Vision
















Video Decoder

2D3D Convolution


Optical Flow

Feature Extraction








Download


TRECVid 2008 System


Features Video

Motion

Shape

Classifier

Event Label

bull Running

bull Pointing

bull Object Put

bull Cell To Ear

Vector

Quantization

Histogram

Interest Points


Laptev


Corners

Dollar

Space Time Gabor

Corners

Periodic Motion

RSMB


Motion

Laptev


Corners

Dollar

Space Time Gabor

Corners

Periodic Motion

RSMB


Motion

Laptev

Dollar

RSMB





Motion

Shape

Appearance


Development

Application


Motion

Shape

Appearance


Development

Application






otherwise

1t)yD(xif

1)1)ty(xHmax(0

τt)y(xH

τ

τ

133

438

51

251

0 50 100 150 200 250 300



CUDA (distance)

C


Histograms of

Oriented Gradients

Optical Flow

Histograms of

Oriented Gradients

Optical Flow





Histograms of

Oriented Gradients

Optical Flow

Histograms of

Oriented Gradients

Optical Flow





Histograms of

Oriented Gradients

Optical Flow

Histograms of

Oriented Gradients

Optical Flow






Vector quantization

K-Means

Large data sets

1000 centers

[Figure: bar chart comparing CUDA and C; axis ticks 0, 2000, 4000, 6000]
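Vector quantization with K-Means can be sketched as follows: learn a codebook of centers from local descriptors, then describe a video or image by the histogram of nearest-center assignments. This is a small illustrative NumPy version, not the CUDA implementation benchmarked in the slides. The function names and the choice to pass the initial centers explicitly (for reproducibility) are assumptions.

```python
import numpy as np


def assign(X: np.ndarray, centers: np.ndarray) -> np.ndarray:
    """Index of the nearest center (squared Euclidean) for each row of X."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)


def kmeans(X: np.ndarray, init_centers: np.ndarray, iters: int = 10) -> np.ndarray:
    """Plain Lloyd iterations starting from the given centers."""
    centers = init_centers.astype(float).copy()
    for _ in range(iters):
        labels = assign(X, centers)
        for k in range(len(centers)):
            members = X[labels == k]
            if len(members) > 0:          # an empty cluster keeps its old center
                centers[k] = members.mean(axis=0)
    return centers


def bow_histogram(X: np.ndarray, centers: np.ndarray) -> np.ndarray:
    """Count how many descriptors fall into each codebook cell."""
    return np.bincount(assign(X, centers), minlength=len(centers))


if __name__ == "__main__":
    X = np.array([[0, 0], [0, 1], [10, 10], [10, 11], [11, 10]], dtype=float)
    centers = kmeans(X, X[[0, 2]])
    print(centers)
    print(bow_histogram(X, centers))
```

With 1000 centers and large data sets, the assignment step dominates the cost, since every descriptor must be compared against every center.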

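Given $A = [a_1, \dots, a_m]$ and $B = [b_1, \dots, b_n]$, compute

$$
\begin{bmatrix}
d(a_1, b_1) & d(a_1, b_2) & \cdots & d(a_1, b_n) \\
d(a_2, b_1) & \ddots & & \vdots \\
\vdots & & \ddots & \vdots \\
d(a_m, b_1) & \cdots & \cdots & d(a_m, b_n)
\end{bmatrix}
$$

This takes $m \times n$ distance computations.

Compute bound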
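The pairwise distance matrix over two sets A and B can be sketched in NumPy as below. The slides do not say which distance d is used, so squared Euclidean distance is assumed here, and the function name `pairwise_sq_dists` is hypothetical. The sketch uses the expansion ||a - b||^2 = ||a||^2 + ||b||^2 - 2 a·b, which turns most of the work into one matrix product.

```python
import numpy as np


def pairwise_sq_dists(A: np.ndarray, B: np.ndarray) -> np.ndarray:
    """A has shape (m, d), B has shape (n, d); returns the (m, n) matrix
    of squared Euclidean distances between every row of A and every row of B."""
    a2 = (A ** 2).sum(axis=1)[:, None]      # (m, 1) squared norms of rows of A
    b2 = (B ** 2).sum(axis=1)[None, :]      # (1, n) squared norms of rows of B
    D = a2 + b2 - 2.0 * (A @ B.T)           # (m, n) via one matrix product
    return np.maximum(D, 0.0)               # clip tiny negatives from rounding


if __name__ == "__main__":
    A = np.array([[0.0, 0.0], [1.0, 1.0]])
    B = np.array([[0.0, 0.0], [3.0, 4.0], [1.0, 0.0]])
    print(pairwise_sq_dists(A, B))
```

Written this way, the computation has the same access pattern as a matrix multiplication, which is the kind of workload that maps well onto a GPU.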



[Figure: shared memory diagram with blocks labeled A, B, and C]


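[Figure: per-stage running time in milliseconds, log-scale axis from 1 to 10000, GPU + CPU versus CPU. Stages: Fetch Frame, Optical Flow, Transfer to GPU, Feature Extraction, Pairwise Distance. Bar labels as extracted: 53, 79; 23; 240, 150; 53, 79; 3030, 4947]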

Benchmark


Dictionary Size            Low     Medium   High
1000             Raw       0.681   0.804    0.844
                 Norm      0.708   0.799    0.840
                 Mt Inf    0.594   0.804    0.848
500              Raw       0.675   0.792    0.833
                 Norm      0.701   0.791    0.825
                 Mt Inf    0.626   0.783    0.819
200              Raw       0.671   0.772    0.811
                 Norm      0.701   0.779    0.818
                 Mt Inf    0.614   0.720    0.756



Event          (unlabeled)   Alarm        Miss   Min DCR
Pointing       13 (57)       225 (2505)   1050   1.006
Cell To Ear    0 (8)         58 (4005)    194    1.060
Person Runs    1 (0)         38 (314)     106    0.997
Object Put     1 (21)        190 (2703)   620    1.020

(2008 Results in parentheses)





Cloud

GPUs




























Segmentation

Person Layout





ImageNet




10000 Classes



Outlines

Data

Features

Algorithms




query IPA sequence



25 hours monolingual database in round 1

13 hours multilingual database in round 3





languages



length

F-measure




VT1: Single Image; 20 queries (short); Video Segs; All the similar Segs; "visually similar"; 20 categories, multiple labels possible

VT2: Short Video Shot (<10s); 20 queries (long); Video Segs; All the similar Segs; Perceptually Similar; 10 categories, multiple labels possible

VT3: Videos with sound (3~10s); Order of 10K; Category number; learning the common visual characteristics; categories including one "others" category






104 Flag

120 Airplane

209 Woman monologue










































Video 352×288





CSL Cluster



TeraGrid










Feature Extraction










Classifier Training













MATLAB code to C

Parallel computing

GPU Acceleration





Image Features

SIFT

HOG

GIST

APC

LBP


Semantic Feature

























56 teams

from 17 countries

Round 1

8 teams

Round 2

7 teams

Round 3

5 teams Grand Final

in Singapore


No rewards

Only one team

can win

US$100000








storieshellip


storieshellip

Outlines

Data

Features

Algorithms




query IPA sequence



25 hours monolingual database in round 1

13 hours multilingual database in round 3





languages



length

F-measure
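
The slide names F-measure as the evaluation metric but does not show how it was computed. A minimal sketch, assuming the balanced form (beta = 1) over sets of retrieved and relevant items; the function name `f_measure` and the `beta` parameter are this sketch's own and are not taken from the challenge rules:

```python
def f_measure(retrieved, relevant, beta=1.0):
    """F-measure of a retrieved set against a ground-truth relevant set.

    beta = 1 weights precision and recall equally (an assumption here).
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    if hits == 0:
        return 0.0
    precision = hits / len(retrieved)   # fraction of returned items that are correct
    recall = hits / len(relevant)       # fraction of correct items that were returned
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Four segments returned, two of the three relevant ones among them.
score = f_measure([1, 2, 3, 4], [2, 4, 6])
```

Here precision is 2/4 and recall is 2/3, so the balanced score is 4/7.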




VT1 | Single Image | 20 queries (short) | Video Segs | All the similar Segs | "visually similar" | 20 categories, multiple labels possible

VT2 | Short Video Shot (<10s) | 20 queries (long) | Video Segs | All the similar Segs | Perceptually Similar | 10 categories, multiple labels possible

VT3 | Videos with sound (3~10s) | Order of 10K | Category number | learning the common visual characteristics | categories including one "others" category






104 Flag
















120 Airplane















209 Woman monologue




















































Video: 352×288





CSL Cluster



TeraGrid










Feature Extraction










Classifier Training













Matlab codes to C

Parallel computing

GPU Acceleration









Image Features

SIFT

HOG

GIST

APC

LBP


Semantic Feature



Character Detector

Harris corner


Optical Flow


Gender recognition

SODA-boost based







Extractor

Designed by Dennis









e.g. VT1 101 103(26-141)




development data

Query Expansion







patches









Category MAP
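
MAP here is read as mean average precision, the usual ranked-retrieval score. A minimal sketch under that assumption; the function names are hypothetical, and the slides do not say whether the ranked lists were truncated before scoring:

```python
def average_precision(ranked, relevant):
    """Average precision of one ranked result list against a relevant set."""
    relevant = set(relevant)
    if not relevant:
        return 0.0
    hits = 0
    total = 0.0
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            total += hits / rank      # precision at each rank where a hit occurs
    return total / len(relevant)


def mean_average_precision(runs):
    """Mean of average precision over (ranked_list, relevant_set) pairs."""
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)


ap = average_precision(["a", "b", "c", "d"], {"a", "c"})
m_ap = mean_average_precision([(["a", "b"], {"b"}), (["x"], {"x"})])
```

In the first call the hits fall at ranks 1 and 3, giving (1/1 + 2/3) / 2. Averaging per category instead of per query only changes what each `(ranked_list, relevant_set)` pair stands for.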





















Task 2 (VT1)







Task 3 (AT1 + VT2)


Video R=20




We are

2nd in Audio search

4th in Video search

2nd in AV search

1st overall

TRECVid



Our Task





Regional Averaging


Event Detections

Thresholding Rule




Research tool

Rapid development

Fast execution

Python glue layer

CUDA, C/C++

Data flow

Lazy pull
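
One way to picture a lazy-pull data flow behind a Python glue layer is a chain of generators, where an upstream stage runs only when a downstream stage asks for its next item. The sketch below is an illustration under that assumption and not the tool's actual design; the stage names `decode`, `frame_diff` and `energy` are hypothetical, and plain frame differencing stands in for the real processing:

```python
from itertools import islice


def decode(frames, log):
    """Source stage: hands out frames one at a time, only on request."""
    for i, frame in enumerate(frames):
        log.append(f"decode {i}")
        yield frame


def frame_diff(stream, log):
    """Absolute difference of consecutive frames (stand-in for a motion stage)."""
    prev = None
    for frame in stream:
        if prev is not None:
            log.append("diff")
            yield [abs(a - b) for a, b in zip(frame, prev)]
        prev = frame


def energy(stream, log):
    """Sink-side stage: one scalar feature per difference image."""
    for diff in stream:
        log.append("energy")
        yield sum(diff)


log = []
video = [[0, 0], [1, 0], [1, 3], [5, 3]]          # four toy two-pixel frames
pipeline = energy(frame_diff(decode(video, log), log), log)

first = list(islice(pipeline, 1))                  # pull a single result
```

After pulling one result, `log` holds only `decode 0`, `decode 1`, `diff`, `energy`: frames 2 and 3 are never decoded until something downstream asks for more.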






Surveillance

Soft biometrics

Multimedia Indexing





Dynamic Vision




















Video Decoder

2D/3D Convolution


Optical Flow

Feature Extraction








Download


TRECVid 2008 System


Features Video

Motion

Shape

Classifier

Event Label

• Running

• Pointing

• Object Put

• Cell To Ear

Vector Quantization

Histogram

Interest Points


Laptev


Corners

Dollar

Space Time Gabor

Corners

Periodic Motion

RSMB


Motion


Laptev

Dollar

RSMB





Motion

Shape

Appearance


Development

Application








\[
H_\tau(x,y,t) =
\begin{cases}
\tau & \text{if } D(x,y,t) = 1 \\
\max\bigl(0,\; H_\tau(x,y,t-1) - 1\bigr) & \text{otherwise}
\end{cases}
\]
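
The update rule on this slide reads as a motion history image recurrence: a pixel is set to τ wherever the binary motion mask D fires, and otherwise its previous value decays by one, floored at zero. A minimal sketch of one update step, assuming images are stored as nested Python lists; `update_mhi` is a hypothetical name:

```python
def update_mhi(mhi, motion_mask, tau):
    """One time step of the motion history image H_tau.

    mhi         -- previous H_tau(x, y, t-1), a list of rows
    motion_mask -- D(x, y, t), 1 where motion was detected, else 0
    tau         -- value assigned to pixels that moved in this frame
    """
    return [
        [tau if d else max(0, h - 1) for h, d in zip(h_row, d_row)]
        for h_row, d_row in zip(mhi, motion_mask)
    ]


# A blob moving left to right across a 1x3 image, with tau = 3.
mhi = [[0, 0, 0]]
for mask in ([[1, 0, 0]], [[0, 1, 0]], [[0, 0, 1]]):
    mhi = update_mhi(mhi, mask, tau=3)
```

After the three frames the image is `[[1, 2, 3]]`: the most recent motion is brightest and older motion fades, which is how the image encodes the direction of movement.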

[Figure: bar chart comparing CUDA (distance) and C; bar labels 133, 438, 51, 251; axis ticks 0 to 300]


Histograms of Oriented Gradients

Optical Flow






Vector quantization
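
Vector quantization in a pipeline like this usually means replacing each local descriptor with the index of its nearest codebook entry and then counting the indices into a histogram. A minimal sketch under that reading, assuming Euclidean distance; the function names and the `normalize` flag are this sketch's own choices and are not taken from the slides:

```python
def quantize(descriptor, codebook):
    """Index of the codebook entry closest to the descriptor (squared Euclidean)."""
    return min(
        range(len(codebook)),
        key=lambda k: sum((x - c) ** 2 for x, c in zip(descriptor, codebook[k])),
    )


def bag_of_words(descriptors, codebook, normalize=True):
    """Histogram of codeword counts for one clip or image."""
    hist = [0.0] * len(codebook)
    for d in descriptors:
        hist[quantize(d, codebook)] += 1.0
    total = sum(hist)
    if normalize and total > 0:
        hist = [h / total for h in hist]    # counts become frequencies
    return hist


codebook = [[0, 0], [10, 10]]                       # two toy codewords
descriptors = [[1, 0], [9, 9], [0, 2], [11, 10]]
hist = bag_of_words(descriptors, codebook)
```

Two descriptors land on each codeword, so the normalized histogram is `[0.5, 0.5]`. The histogram, not the raw descriptors, is what a classifier would then consume.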












K-Means

Large data sets

1000 centers
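
For reference, a minimal CPU sketch of K-means (Lloyd's algorithm) in plain Python. It is not the CUDA implementation the slides benchmark; seeding the centers with the first k points and the fixed iteration count are assumptions made to keep the example deterministic:

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=20):
    """Lloyd's algorithm; returns (centers, labels)."""
    centers = [list(p) for p in points[:k]]          # assumed seeding: first k points
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: every point goes to its nearest center.
        labels = [min(range(k), key=lambda j: sq_dist(p, centers[j])) for p in points]
        # Update step: every center moves to the mean of its members.
        for j in range(k):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return centers, labels


points = [[0, 0], [10, 10], [0, 1], [10, 11], [1, 0], [11, 10]]
centers, labels = kmeans(points, k=2)
```

The assignment step computes one distance per point and center pair, so with 1000 centers and a large data set it dominates the run time; that step is independent per point, which makes it the natural part to move to a GPU.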






[Figure: bar chart comparing CUDA and C; axis ticks 0, 2000, 4000, 6000]

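Given

\[ A = \{a_1, \dots, a_m\}, \qquad B = \{b_1, \dots, b_n\} \]

Compute

\[
\begin{bmatrix}
d(a_1,b_1) & d(a_1,b_2) & \cdots & d(a_1,b_n) \\
d(a_2,b_1) & \cdots & & \vdots \\
\vdots & & & \\
d(a_m,b_1) & \cdots & & d(a_m,b_n)
\end{bmatrix}
\]

$m \times n$ distance computations, quadratic in the set size when $m \approx n$

Compute bound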
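
The pairwise-distance problem on this slide can be written as a short CPU sketch: given sets A (m vectors) and B (n vectors), fill an m-by-n matrix with d(a_i, b_j). The distance d is assumed to be Euclidean here, since the slide does not name it, and `pairwise_distances` is a hypothetical name:

```python
import math


def pairwise_distances(A, B):
    """Return the m x n matrix D with D[i][j] = d(A[i], B[j]) (Euclidean)."""
    return [[math.dist(a, b) for b in B] for a in A]


A = [[0, 0], [3, 4]]                 # m = 2
B = [[0, 0], [3, 0], [0, 4]]         # n = 3
D = pairwise_distances(A, B)
```

Every entry is independent of the others, so the m × n entries can be computed in parallel; each vector of A is reused n times and each vector of B is reused m times.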



[Figure: Shared Memory diagram with matrices A, B and C]


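[Figure: bar chart on a logarithmic time axis (1 to 10000 milliseconds) comparing GPU + CPU against CPU for the stages Fetch Frame, Optical Flow, Transfer to GPU, Feature Extraction and Pairwise Distance. Bar labels as extracted: 53, 79; 23; 240, 150; 53, 79; 3030, 4947.]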

Benchmark

Dictionary Size             Low      Medium   High

1000             Raw        0.681    0.804    0.844
                 Norm       0.708    0.799    0.840
                 Mt Inf     0.594    0.804    0.848

500              Raw        0.675    0.792    0.833
                 Norm       0.701    0.791    0.825
                 Mt Inf     0.626    0.783    0.819

200              Raw        0.671    0.772    0.811
                 Norm       0.701    0.779    0.818
                 Mt Inf     0.614    0.720    0.756



Event          (unlabeled)   Alarm         Miss    Min DCR

Pointing       13 (57)       225 (2505)    1050    1.006
Cell To Ear    0 (8)         58 (4005)     194     1.060
Person Runs    1 (0)         38 (314)      106     0.997
Object Put     1 (21)        190 (2703)    620     1.020

Values in parentheses: 2008 Results





Cloud

GPUs




























Segmentation

Person Layout





ImageNet




10000 Classes








































































Video 352288





Video 352288





CSL Cluster



TeraGrid




CSL Cluster



TeraGrid






Feature Extraction










Classifier Training








Feature Extraction










Classifier Training





Matlab codes to C

Parallel computing

GPU Acceleration




Matlab codes to C

Parallel computing

GPU Acceleration





Image Features

SIFT

HOG

GIST

APC

LBP


Semantic Feature

Image Features

SIFT

HOG

GIST

APC

LBP


Semantic Feature


Character Detector

Harris corner


Optical Flow


Gender recognition

SODA-boost based



Character Detector

Harris corner


Optical Flow


Gender recognition

SODA-boost based




Extractor


Extractor

Designed by Dennis





Designed by Dennis









eg VT1 101 103(26-141)





eg VT1 101 103(26-141)




development data

Query Expansion







patches









Category MAP












Category MAP









Task 2 (VT1)







Task3 (AT1 + VT2)


Video R=20




We are

2nd in Audio search

4th in Video search

2nd in AV search

1st overall

TRECVid



Our Task





Regional Averaging


Event Detections

Thresholding Rule




Research tool

Rapid development

Fast execution

Python glue layer

CUDA CC++


Data flow

Lazy pull



Research tool

Rapid development

Fast execution

Python glue layer

CUDA CC++


Data flow

Lazy pull






Surveillance

Soft biometrics

Multimedia Indexing





Dynamic Vision




Surveillance

Soft biometrics

Multimedia Indexing





Dynamic Vision
















Video Decoder

2D/3D Convolution


Optical Flow

Feature Extraction








Download


TRECVid 2008 System


Features Video

Motion

Shape

Classifier

Event Label

• Running
• Pointing
• Object Put
• Cell To Ear

Vector Quantization

Histogram

Interest Points


Laptev    Corners
Dollar    Space Time Gabor: Corners, Periodic Motion
RSMB      Motion





Motion

Shape

Appearance


Development

Application








$$H_\tau(x, y, t) = \begin{cases} \tau & \text{if } D(x, y, t) = 1 \\ \max\bigl(0,\; H_\tau(x, y, t-1) - 1\bigr) & \text{otherwise} \end{cases}$$
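The rule above reads as a motion-history update: a pixel is set to τ wherever the binary motion mask D is 1, and otherwise its previous value decays by 1, never dropping below 0. A minimal sketch of one update step on nested lists follows; the function name and the toy 2x2 inputs are illustrative assumptions.

```python
def update_motion_history(previous, motion_mask, tau):
    """One time step of H_tau: reset to tau where motion is seen, else decay."""
    return [
        [tau if d == 1 else max(0, h - 1) for h, d in zip(h_row, d_row)]
        for h_row, d_row in zip(previous, motion_mask)
    ]


previous = [[0, 3],
            [1, 5]]
motion_mask = [[1, 0],
               [0, 1]]
updated = update_motion_history(previous, motion_mask, tau=5)
```

Recent motion therefore appears bright and older motion fades over τ frames, which encodes the direction of movement in a single image.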

[Figure: bar chart comparing CUDA (distance) and C; axis from 0 to 300; bar labels as extracted: 133, 438, 51, 251]


Histograms of Oriented Gradients

Optical Flow






Vector quantization

K-Means

Large data sets

1000 centers
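Vector quantization with k-means centers can be sketched as follows: each descriptor is assigned to its nearest center, and the assignments are counted into a histogram. The sketch assumes squared Euclidean distance and centers that were already learned (the slide mentions 1000; the example uses two). All names are hypothetical.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quantize(descriptor, centers):
    """Index of the center closest to the descriptor."""
    return min(range(len(centers)),
               key=lambda k: squared_distance(descriptor, centers[k]))


def codeword_histogram(descriptors, centers):
    """Count how many descriptors fall on each center."""
    histogram = [0] * len(centers)
    for descriptor in descriptors:
        histogram[quantize(descriptor, centers)] += 1
    return histogram


centers = [(0.0, 0.0), (10.0, 10.0)]
descriptors = [(1.0, 1.0), (9.0, 8.0), (0.0, 2.0)]
histogram = codeword_histogram(descriptors, centers)
```

The nearest-center search is a pairwise distance computation between descriptors and centers, which is the step that dominates the cost on large data sets.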






[Figure: bar chart comparing CUDA and C; axis from 0 to 6000]

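Given

$$A = [a_1, \dots, a_m], \qquad B = [b_1, \dots, b_n]$$

Compute

$$\begin{bmatrix} d(a_1, b_1) & d(a_1, b_2) & \cdots & d(a_1, b_n) \\ \vdots & & & \vdots \\ d(a_m, b_1) & \cdots & & d(a_m, b_n) \end{bmatrix}$$

$m \times n$ distance computations

Compute bound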
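The pairwise distance problem above, given A with m vectors and B with n vectors, fills an m-by-n table with d(a_i, b_j). A plain CPU reference is sketched below, assuming Euclidean distance for d; the slides do not say which distance was used. The function name is hypothetical.

```python
import math


def pairwise_distances(A, B, d=math.dist):
    """Return the m-by-n table whose entry [i][j] is d(A[i], B[j])."""
    return [[d(a, b) for b in B] for a in A]


A = [(0.0, 0.0), (3.0, 4.0)]
B = [(0.0, 0.0), (0.0, 4.0), (3.0, 0.0)]
D = pairwise_distances(A, B)
```

Every entry is independent of the others, which is what makes this computation a natural fit for a GPU kernel that stages blocks of A and B in shared memory.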



[Diagram: Shared Memory; matrices A, B and C]


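[Figure: running time per stage in milliseconds (log scale, 1 to 10000), GPU + CPU versus CPU. Stages: Fetch Frame, Optical Flow, Transfer to GPU, Feature Extraction, Pairwise Distance. Bar labels as extracted, in order of appearance: 53 79; 23; 240 150; 53 79; 3030 4947]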

Benchmark


Dictionary Size           Low     Medium  High
1000            Raw       0.681   0.804   0.844
                Norm      0.708   0.799   0.840
                Mt Inf    0.594   0.804   0.848
500             Raw       0.675   0.792   0.833
                Norm      0.701   0.791   0.825
                Mt Inf    0.626   0.783   0.819
200             Raw       0.671   0.772   0.811
                Norm      0.701   0.779   0.818
                Mt Inf    0.614   0.720   0.756




































Category MAP












Category MAP









Task 2 (VT1)







Task3 (AT1 + VT2)


Video R=20




We are

2nd in Audio search

4th in Video search

2nd in AV search

1st overall

TRECVid



Our Task





Regional Averaging


Event Detections

Thresholding Rule




Research tool

Rapid development

Fast execution

Python glue layer

CUDA C/C++

Data flow

Lazy pull
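One common way to get lazy, pull-driven data flow in Python is to chain generators, so that a stage runs only when a downstream consumer asks for the next item. The sketch below assumes that reading of "Lazy pull". The stage names are hypothetical, and this is not the API of the tool described on the slide.

```python
log = []  # records which stage ran, and in what order


def decode(frame_ids):
    """Hypothetical source stage: yields one 'frame' per request."""
    for f in frame_ids:
        log.append(f"decode {f}")
        yield f


def extract(stream):
    """Hypothetical downstream stage: pulls frames from its input on demand."""
    for f in stream:
        log.append(f"extract {f}")
        yield f * 2  # stand-in for a real feature computation


# Building the pipeline does no work yet.
pipeline = extract(decode(range(1, 1000)))
assert log == []

# Pulling one result drives exactly one item through every stage.
first = next(pipeline)
print(first, log)
```

Because nothing upstream runs until a result is requested, a long video can be processed without materializing every intermediate frame in memory.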






Surveillance

Soft biometrics

Multimedia Indexing





Dynamic Vision




















Video Decoder

2D/3D Convolution


Optical Flow

Feature Extraction








Download


TRECVid 2008 System


Video

Features

• Motion

• Shape

Vector Quantization

Histogram

Classifier

Event Label

• Running

• Pointing

• Object Put

• Cell To Ear
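The vector quantization and histogram stages above can be sketched as follows, assuming the usual bag-of-words reading: each local feature descriptor is replaced by the index of its nearest codebook entry, and the indices are counted into a fixed-length histogram that the classifier consumes. The function names, the two-entry codebook, and the toy descriptors are illustrative assumptions.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((pi - qi) ** 2 for pi, qi in zip(p, q))


def quantize(descriptor, codebook):
    """Index of the codebook entry nearest to the descriptor."""
    return min(range(len(codebook)), key=lambda j: sq_dist(descriptor, codebook[j]))


def bow_histogram(descriptors, codebook, normalize=False):
    """Count how many descriptors fall on each codeword.

    With normalize=True the counts are divided by the number of
    descriptors, so clips with different numbers of features are comparable.
    """
    counts = [0] * len(codebook)
    for d in descriptors:
        counts[quantize(d, codebook)] += 1
    if normalize and descriptors:
        return [c / len(descriptors) for c in counts]
    return counts


# Hypothetical toy data: a dictionary of size 2 and three descriptors.
codebook = [(0.0, 0.0), (10.0, 10.0)]
descriptors = [(1.0, 0.0), (0.0, 2.0), (9.0, 9.0)]
raw = bow_histogram(descriptors, codebook)
norm = bow_histogram(descriptors, codebook, normalize=True)
print(raw, norm)
```

The histogram length equals the dictionary size, so the dictionary size directly sets the dimensionality of the classifier input.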

Interest Points

Laptev: Corners

Dollar: Space Time Gabor, Corners, Periodic Motion

RSMB: Motion





Histograms of

Oriented Gradients

Optical Flow

Histograms of

Oriented Gradients

Optical Flow





Histograms of

Oriented Gradients

Optical Flow

Histograms of

Oriented Gradients

Optical Flow






Vector quantization






Vector quantization






K-Means K-Means


Large data sets


1000 centers





Large data sets


1000 centers






0 2000 4000 6000

CUDA

C

)bd(a)bd(a

)bd(a

)bd(a)bd(a)bd(a

nm1m

12

n11111

n1

m1

bbB

aaA

Compute Given




computations

2Compute bound



Shared

Memory

A B

C


A B

C


A B

C


53 79

23

240 150

53 79

3030 4947

1

10

100

1000

10000

Fetch

Frame

Optical

Flow

Transfer to

GPU

Feature

Extraction

Pairwise

Distance

millise

co

nd

s

GPU + CPU CPU

Benchmark Benchmark


Dictionary Size


Low Medium High

1000 Raw 0681 0804 0844

Norm 0708 0799 0840

Mt Inf 0594 0804 0848

500 Raw 0675 0792 0833

Norm 0701 0791 0825

Mt Inf 0626 0783 0819

200 Raw 0671 0772 0811

Norm 0701 0779 0818

Mt Inf 0614 0720 0756



Alarm

Miss Min DCR

Pointing 13 225 1050 1006

Cell To Ear 0 58 194 1060


Object Put 1 190 620 1020



Alarm

Miss Min DCR

Pointing 13 (57) 225 (2505) 1050 1006

Cell To Ear 0 (8) 58 (4005) 194 1060

Person Runs 1 (0) 38 (314) 106 0997

Object Put 1 (21) 190 (2703) 620 1020

(2008 Results)





Cloud

GPUs








Cloud

GPUs




















Segmentation

Person Layout



Segmentation

Person Layout


ImageNet


ImageNet


10000 Classes

Histograms of

Oriented Gradients

Optical Flow

Histograms of

Oriented Gradients

Optical Flow






Vector quantization






Vector quantization






K-Means K-Means


Large data sets


1000 centers





Large data sets


1000 centers






0 2000 4000 6000

CUDA

C

)bd(a)bd(a

)bd(a

)bd(a)bd(a)bd(a

nm1m

12

n11111

n1

m1

bbB

aaA

Compute Given




computations

2Compute bound



Shared

Memory

A B

C


A B

C


A B

C


53 79

23

240 150

53 79

3030 4947

1

10

100

1000

10000

Fetch

Frame

Optical

Flow

Transfer to

GPU

Feature

Extraction

Pairwise

Distance

millise

co

nd

s

GPU + CPU CPU

Benchmark Benchmark


Dictionary Size


Low Medium High

1000 Raw 0681 0804 0844

Norm 0708 0799 0840

Mt Inf 0594 0804 0848

500 Raw 0675 0792 0833

Norm 0701 0791 0825

Mt Inf 0626 0783 0819

200 Raw 0671 0772 0811

Norm 0701 0779 0818

Mt Inf 0614 0720 0756



Alarm

Miss Min DCR

Pointing 13 225 1050 1006

Cell To Ear 0 58 194 1060


Object Put 1 190 620 1020



Alarm

Miss Min DCR

Pointing 13 (57) 225 (2505) 1050 1006

Cell To Ear 0 (8) 58 (4005) 194 1060

Person Runs 1 (0) 38 (314) 106 0997

Object Put 1 (21) 190 (2703) 620 1020

(2008 Results)





Cloud

GPUs








Cloud

GPUs




















Segmentation

Person Layout



Segmentation

Person Layout


ImageNet


ImageNet


10000 Classes


Vector quantization






Vector quantization






K-Means K-Means


Large data sets


1000 centers





Large data sets


1000 centers






0 2000 4000 6000

CUDA

C

)bd(a)bd(a

)bd(a

)bd(a)bd(a)bd(a

nm1m

12

n11111

n1

m1

bbB

aaA

Compute Given




computations

2Compute bound



Shared

Memory

A B

C


A B

C


A B

C


53 79

23

240 150

53 79

3030 4947

1

10

100

1000

10000

Fetch

Frame

Optical

Flow

Transfer to

GPU

Feature

Extraction

Pairwise

Distance

millise

co

nd

s

GPU + CPU CPU

Benchmark Benchmark


Dictionary Size


Low Medium High

1000 Raw 0681 0804 0844

Norm 0708 0799 0840

Mt Inf 0594 0804 0848

500 Raw 0675 0792 0833

Norm 0701 0791 0825

Mt Inf 0626 0783 0819

200 Raw 0671 0772 0811

Norm 0701 0779 0818

Mt Inf 0614 0720 0756



Alarm

Miss Min DCR

Pointing 13 225 1050 1006

Cell To Ear 0 58 194 1060


Object Put 1 190 620 1020



Alarm

Miss Min DCR

Pointing 13 (57) 225 (2505) 1050 1006

Cell To Ear 0 (8) 58 (4005) 194 1060

Person Runs 1 (0) 38 (314) 106 0997

Object Put 1 (21) 190 (2703) 620 1020

(2008 Results)





Cloud

GPUs








Cloud

GPUs




















Segmentation

Person Layout



Segmentation

Person Layout


ImageNet


ImageNet


10000 Classes

K-Means K-Means


Large data sets


1000 centers





Large data sets


1000 centers






0 2000 4000 6000

CUDA

C

)bd(a)bd(a

)bd(a

)bd(a)bd(a)bd(a

nm1m

12

n11111

n1

m1

bbB

aaA

Compute Given




computations

2Compute bound



Shared

Memory

A B

C


A B

C


A B

C


53 79

23

240 150

53 79

3030 4947

1

10

100

1000

10000

Fetch

Frame

Optical

Flow

Transfer to

GPU

Feature

Extraction

Pairwise

Distance

millise

co

nd

s

GPU + CPU CPU

Benchmark Benchmark


Dictionary Size


Low Medium High

1000 Raw 0681 0804 0844

Norm 0708 0799 0840

Mt Inf 0594 0804 0848

500 Raw 0675 0792 0833

Norm 0701 0791 0825

Mt Inf 0626 0783 0819

200 Raw 0671 0772 0811

Norm 0701 0779 0818

Mt Inf 0614 0720 0756



Alarm

Miss Min DCR

Pointing 13 225 1050 1006

Cell To Ear 0 58 194 1060


Object Put 1 190 620 1020



Alarm

Miss Min DCR

Pointing 13 (57) 225 (2505) 1050 1006

Cell To Ear 0 (8) 58 (4005) 194 1060

Person Runs 1 (0) 38 (314) 106 0997

Object Put 1 (21) 190 (2703) 620 1020

(2008 Results)





Cloud

GPUs








Cloud

GPUs




















Segmentation

Person Layout



Segmentation

Person Layout


ImageNet


ImageNet


10000 Classes



0 2000 4000 6000

CUDA

C

)bd(a)bd(a

)bd(a

)bd(a)bd(a)bd(a

nm1m

12

n11111

n1

m1

bbB

aaA

Compute Given




computations

2Compute bound



Shared

Memory

A B

C


A B

C


A B

C


53 79

23

240 150

53 79

3030 4947

1

10

100

1000

10000

Fetch

Frame

Optical

Flow

Transfer to

GPU

Feature

Extraction

Pairwise

Distance

millise

co

nd

s

GPU + CPU CPU

Benchmark Benchmark


Dictionary Size


Low Medium High

1000 Raw 0681 0804 0844

Norm 0708 0799 0840

Mt Inf 0594 0804 0848

500 Raw 0675 0792 0833

Norm 0701 0791 0825

Mt Inf 0626 0783 0819

200 Raw 0671 0772 0811

Norm 0701 0779 0818

Mt Inf 0614 0720 0756



Alarm

Miss Min DCR

Pointing 13 225 1050 1006

Cell To Ear 0 58 194 1060


Object Put 1 190 620 1020



Alarm

Miss Min DCR

Pointing 13 (57) 225 (2505) 1050 1006

Cell To Ear 0 (8) 58 (4005) 194 1060

Person Runs 1 (0) 38 (314) 106 0997

Object Put 1 (21) 190 (2703) 620 1020

(2008 Results)





Cloud

GPUs








Cloud

GPUs




















Segmentation

Person Layout



Segmentation

Person Layout


ImageNet


ImageNet


10000 Classes




computations

2Compute bound



Shared

Memory

A B

C


A B

C


A B

C


53 79

23

240 150

53 79

3030 4947

1

10

100

1000

10000

Fetch

Frame

Optical

Flow

Transfer to

GPU

Feature

Extraction

Pairwise

Distance

millise

co

nd

s

GPU + CPU CPU

Benchmark Benchmark


Dictionary Size


Low Medium High

1000 Raw 0681 0804 0844

Norm 0708 0799 0840

Mt Inf 0594 0804 0848

500 Raw 0675 0792 0833

Norm 0701 0791 0825

Mt Inf 0626 0783 0819

200 Raw 0671 0772 0811

Norm 0701 0779 0818

Mt Inf 0614 0720 0756



Alarm

Miss Min DCR

Pointing 13 225 1050 1006

Cell To Ear 0 58 194 1060


Object Put 1 190 620 1020



Alarm

Miss Min DCR

Pointing 13 (57) 225 (2505) 1050 1006

Cell To Ear 0 (8) 58 (4005) 194 1060

Person Runs 1 (0) 38 (314) 106 0997

Object Put 1 (21) 190 (2703) 620 1020

(2008 Results)





Cloud

GPUs








Cloud

GPUs




















Segmentation

Person Layout



Segmentation

Person Layout


ImageNet


ImageNet


10000 Classes


Shared

Memory

A B

C


A B

C


A B

C


53 79

23

240 150

53 79

3030 4947

1

10

100

1000

10000

Fetch

Frame

Optical

Flow

Transfer to

GPU

Feature

Extraction

Pairwise

Distance

millise

co

nd

s

GPU + CPU CPU

Benchmark Benchmark


Dictionary Size


Low Medium High

1000 Raw 0681 0804 0844

Norm 0708 0799 0840

Mt Inf 0594 0804 0848

500 Raw 0675 0792 0833

Norm 0701 0791 0825

Mt Inf 0626 0783 0819

200 Raw 0671 0772 0811

Norm 0701 0779 0818

Mt Inf 0614 0720 0756



Alarm

Miss Min DCR

Pointing 13 225 1050 1006

Cell To Ear 0 58 194 1060


Object Put 1 190 620 1020



Alarm

Miss Min DCR

Pointing 13 (57) 225 (2505) 1050 1006

Cell To Ear 0 (8) 58 (4005) 194 1060

Person Runs 1 (0) 38 (314) 106 0997

Object Put 1 (21) 190 (2703) 620 1020

(2008 Results)





Cloud

GPUs








Cloud

GPUs




















Segmentation

Person Layout



Segmentation

Person Layout


ImageNet


ImageNet


10000 Classes


A B

C


A B

C


53 79

23

240 150

53 79

3030 4947

1

10

100

1000

10000

Fetch

Frame

Optical

Flow

Transfer to

GPU

Feature

Extraction

Pairwise

Distance

millise

co

nd

s

GPU + CPU CPU

Benchmark Benchmark


Dictionary Size


Low Medium High

1000 Raw 0681 0804 0844

Norm 0708 0799 0840

Mt Inf 0594 0804 0848

500 Raw 0675 0792 0833

Norm 0701 0791 0825

Mt Inf 0626 0783 0819

200 Raw 0671 0772 0811

Norm 0701 0779 0818

Mt Inf 0614 0720 0756



Alarm

Miss Min DCR

Pointing 13 225 1050 1006

Cell To Ear 0 58 194 1060


Object Put 1 190 620 1020



Alarm

Miss Min DCR

Pointing 13 (57) 225 (2505) 1050 1006

Cell To Ear 0 (8) 58 (4005) 194 1060

Person Runs 1 (0) 38 (314) 106 0997

Object Put 1 (21) 190 (2703) 620 1020

(2008 Results)





Cloud

GPUs








Cloud

GPUs




















Segmentation

Person Layout



Segmentation

Person Layout


ImageNet


ImageNet


10000 Classes


A B

C


53 79

23

240 150

53 79

3030 4947

1

10

100

1000

10000

Fetch

Frame

Optical

Flow

Transfer to

GPU

Feature

Extraction

Pairwise

Distance

millise

co

nd

s

GPU + CPU CPU

Benchmark Benchmark


Dictionary Size


Low Medium High

1000 Raw 0681 0804 0844

Norm 0708 0799 0840

Mt Inf 0594 0804 0848

500 Raw 0675 0792 0833

Norm 0701 0791 0825

Mt Inf 0626 0783 0819

200 Raw 0671 0772 0811

Norm 0701 0779 0818

Mt Inf 0614 0720 0756



Alarm

Miss Min DCR

Pointing 13 225 1050 1006

Cell To Ear 0 58 194 1060


Object Put 1 190 620 1020



Alarm

Miss Min DCR

Pointing 13 (57) 225 (2505) 1050 1006

Cell To Ear 0 (8) 58 (4005) 194 1060

Person Runs 1 (0) 38 (314) 106 0997

Object Put 1 (21) 190 (2703) 620 1020

(2008 Results)





Cloud

GPUs








Cloud

GPUs




















Segmentation

Person Layout



Segmentation

Person Layout


ImageNet


ImageNet


10000 Classes


53 79

23

240 150

53 79

3030 4947

1

10

100

1000

10000

Fetch

Frame

Optical

Flow

Transfer to

GPU

Feature

Extraction

Pairwise

Distance

millise

co

nd

s

GPU + CPU CPU

Benchmark Benchmark


Dictionary Size


Low Medium High

1000 Raw 0681 0804 0844

Norm 0708 0799 0840

Mt Inf 0594 0804 0848

500 Raw 0675 0792 0833

Norm 0701 0791 0825

Mt Inf 0626 0783 0819

200 Raw 0671 0772 0811

Norm 0701 0779 0818

Mt Inf 0614 0720 0756



Alarm

Miss Min DCR

Pointing 13 225 1050 1006

Cell To Ear 0 58 194 1060


Object Put 1 190 620 1020



Alarm

Miss Min DCR

Pointing 13 (57) 225 (2505) 1050 1006

Cell To Ear 0 (8) 58 (4005) 194 1060

Person Runs 1 (0) 38 (314) 106 0997

Object Put 1 (21) 190 (2703) 620 1020

(2008 Results)





Cloud

GPUs








Cloud

GPUs




















Segmentation

Person Layout



Segmentation

Person Layout


ImageNet


ImageNet


10000 Classes

Benchmark Benchmark


Dictionary Size


Low Medium High

1000 Raw 0681 0804 0844

Norm 0708 0799 0840

Mt Inf 0594 0804 0848

500 Raw 0675 0792 0833

Norm 0701 0791 0825

Mt Inf 0626 0783 0819

200 Raw 0671 0772 0811

Norm 0701 0779 0818

Mt Inf 0614 0720 0756



Alarm

Miss Min DCR

Pointing 13 225 1050 1006

Cell To Ear 0 58 194 1060


Object Put 1 190 620 1020



Alarm

Miss Min DCR

Pointing 13 (57) 225 (2505) 1050 1006

Cell To Ear 0 (8) 58 (4005) 194 1060

Person Runs 1 (0) 38 (314) 106 0997

Object Put 1 (21) 190 (2703) 620 1020

(2008 Results)





Cloud

GPUs








Cloud

GPUs




















Segmentation

Person Layout



Segmentation

Person Layout


ImageNet


ImageNet


10000 Classes


Dictionary Size


Low Medium High

1000 Raw 0681 0804 0844

Norm 0708 0799 0840

Mt Inf 0594 0804 0848

500 Raw 0675 0792 0833

Norm 0701 0791 0825

Mt Inf 0626 0783 0819

200 Raw 0671 0772 0811

Norm 0701 0779 0818

Mt Inf 0614 0720 0756



Alarm

Miss Min DCR

Pointing 13 225 1050 1006

Cell To Ear 0 58 194 1060


Object Put 1 190 620 1020



Alarm

Miss Min DCR

Pointing 13 (57) 225 (2505) 1050 1006

Cell To Ear 0 (8) 58 (4005) 194 1060

Person Runs 1 (0) 38 (314) 106 0997

Object Put 1 (21) 190 (2703) 620 1020

(2008 Results)





Cloud

GPUs








Cloud

GPUs




















Segmentation

Person Layout



Segmentation

Person Layout


ImageNet


ImageNet


10000 Classes



Alarm

Miss Min DCR

Pointing 13 225 1050 1006

Cell To Ear 0 58 194 1060


Object Put 1 190 620 1020



Alarm

Miss Min DCR

Pointing 13 (57) 225 (2505) 1050 1006

Cell To Ear 0 (8) 58 (4005) 194 1060

Person Runs 1 (0) 38 (314) 106 0997

Object Put 1 (21) 190 (2703) 620 1020

(2008 Results)





Cloud

GPUs








Cloud

GPUs




















Segmentation

Person Layout



Segmentation

Person Layout


ImageNet


ImageNet


10000 Classes



Alarm

Miss Min DCR

Pointing 13 (57) 225 (2505) 1050 1006

Cell To Ear 0 (8) 58 (4005) 194 1060

Person Runs 1 (0) 38 (314) 106 0997

Object Put 1 (21) 190 (2703) 620 1020

(2008 Results)





Cloud

GPUs








Cloud

GPUs




















Segmentation

Person Layout



Segmentation

Person Layout


ImageNet


ImageNet


10000 Classes





Cloud

GPUs








Cloud

GPUs




















Segmentation

Person Layout



Segmentation

Person Layout


ImageNet


ImageNet


10000 Classes
















Segmentation

Person Layout



Segmentation

Person Layout


ImageNet


ImageNet


10000 Classes

ImageNet


ImageNet


10000 Classes
