Miodrag Bolic: ARCHITECTURES FOR EFFICIENT IMPLEMENTATION OF PARTICLE FILTERS. Department of Electrical and Computer Engineering, Stony Brook University. Advisor: Prof. Petar M. Djuric. Dissertation Defense.


Page 1:

Miodrag Bolic

ARCHITECTURES FOR EFFICIENT IMPLEMENTATION OF PARTICLE FILTERS

Department of Electrical and Computer Engineering
Stony Brook University

Advisor: Prof. Petar M. Djuric

Dissertation Defense

Page 2:

Outline

PART I: Introduction
• Motivation and goals
• Challenges

PART II: Theory of PFs
• Dynamic model
• Monte Carlo sampling
• Importance sampling
• Resampling
• Bearings-only tracking example
• Steps and complexity

PART III: Implementation of PFs
• VLSI signal processing architectures
• Methodology
• Non-parallel implementation: algorithm characteristics, modifications of the PF, new resampling algorithms, architecture, implementation results
• Parallel implementation: propagation of particles, parallel resampling, architectures for parallel resampling, space exploration
• Gaussian PFs

Conclusions and future work

Page 3:

Introduction – Motivations and Goals

[Figure: a sensor delivers the observed signal (plotted over time t) to a particle filter chip, which produces the estimation (plotted over time t)]

Goal: Increase speed of particle filters

Page 4:

Introduction – Challenges

Challenges
• Reducing computational complexity
• Randomness: difficult to exploit regular structures in VLSI
• Exploiting temporal and spatial concurrency

Contributions
• First hardware implementation of particle filters (50 times improvement in speed in comparison with DSP)
• New resampling algorithms suitable for hardware implementation
• Fast particle filtering algorithms that do not use memories
• First distributed algorithms and architectures for particle filters

Page 6:

Theory of PFs – Dynamic model

General dynamic model
• State equation: $\mathbf{x}_k = f_x(\mathbf{x}_{k-1}, \mathbf{u}_k)$, where $f_x$ is the state transition function and $\mathbf{u}_k$ is the process noise
• Observation equation: $z_k = f_z(\mathbf{x}_k, v_k)$, where $f_z$ is the measurement function and $v_k$ is the observation noise

Example: Bearings-only tracking
• States: position and velocity, $\mathbf{x}_k = [x_k, V_{x,k}, y_k, V_{y,k}]^T$
• Observations: angle $z_k$
• State equation: $\mathbf{x}_k = F\mathbf{x}_{k-1} + G\mathbf{u}_k$
• Observation equation: $z_k = \operatorname{atan}(y_k / x_k) + v_k$

[Figure: trajectory in the x–y plane with consecutive positions $(x_k, y_k)$ and $(x_{k+1}, y_{k+1})$ and the corresponding bearings $z_k$ and $z_{k+1}$]
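The bearings-only model can be simulated in a few lines. The sketch below is illustrative only: the slides do not give F and G, so the usual constant-velocity matrices for a sampling period T are assumed, and the noise levels, initial state and the name `simulate_bearings_only` are hypothetical.

```python
import math
import random


def simulate_bearings_only(steps, T=1.0, sigma_u=0.01, sigma_v=0.005, seed=0):
    """Simulate x_k = F x_{k-1} + G u_k and z_k = atan(y_k / x_k) + v_k.

    Assumption: F and G are the constant-velocity matrices for sampling
    period T. State layout is [x, Vx, y, Vy].
    """
    rng = random.Random(seed)
    x, vx, y, vy = 1.0, 0.05, 1.0, -0.02  # hypothetical initial state
    states, bearings = [], []
    for _ in range(steps):
        ux, uy = rng.gauss(0.0, sigma_u), rng.gauss(0.0, sigma_u)
        # F x_{k-1}: position advances by velocity * T.
        # G u_k: noise acts as a random acceleration (T^2/2 on position, T on velocity).
        x, vx = x + T * vx + 0.5 * T * T * ux, vx + T * ux
        y, vy = y + T * vy + 0.5 * T * T * uy, vy + T * uy
        # atan2(y, x) equals atan(y / x) for x > 0 and avoids division by zero.
        z = math.atan2(y, x) + rng.gauss(0.0, sigma_v)
        states.append((x, vx, y, vy))
        bearings.append(z)
    return states, bearings


states, bearings = simulate_bearings_only(steps=20)
```

Each entry of `bearings` is the only quantity the filter would observe; the four-dimensional state in `states` has to be inferred from it.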

Page 7:

Theory of PFs – Bayesian approach

Objective in Bayesian approach: estimate the posterior distribution $p(x_{0:k} \mid z_{1:k})$ of the unknown state $x_k$

Use of knowing the posterior: all kinds of estimates can be calculated

• Gaussian processes and linear model: Kalman filter
• Non-Gaussian processes and/or non-linear model: Particle filter

State space model

Problem                                            Solution
Estimate posterior; integrals are not tractable    Monte Carlo sampling
Difficult to draw samples                          Importance sampling

Page 8:

Theory of PFs – Monte Carlo Sampling

Densities can be approximated by discrete random measures composed of particles and weights:

• χ approximates the density p(x)
• Integrals simplify to summations
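A small numerical illustration of "integrals simplify to summations" follows. It is a sketch under assumptions: the density p(x) is taken to be a Gaussian with mean 2 and unit variance, and the helper name `expectation` is hypothetical.

```python
import random


def expectation(g, particles, weights):
    """Approximate the integral of g(x) p(x) dx by a weighted sum over particles."""
    total = sum(weights)
    return sum(w * g(x) for x, w in zip(particles, weights)) / total


rng = random.Random(1)
M = 10000
particles = [rng.gauss(2.0, 1.0) for _ in range(M)]  # draws from the assumed p(x)
weights = [1.0 / M] * M                               # equal weights for direct draws
mean_estimate = expectation(lambda x: x, particles, weights)
second_moment = expectation(lambda x: x * x, particles, weights)
```

With particles drawn directly from p(x) all weights are equal; unequal weights appear once the particles come from a different density, which is the importance sampling case.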

Page 9:

Theory of PFs – Importance Sampling

Objective: approximate a density p(x) by a discrete random measure

Steps:
1. Generation of particles (drawn from a proposal density)
2. Updating of the weights (using Bayes' theorem)
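The two steps can be sketched for a static, one-dimensional case. This is an illustration rather than the filter from the slides: the target N(1, 1), the proposal N(0, 2²) and the function names are assumptions chosen so that the weights stay bounded.

```python
import math
import random


def normal_pdf(x, mean, std):
    """Density of a Gaussian with the given mean and standard deviation."""
    return math.exp(-0.5 * ((x - mean) / std) ** 2) / (std * math.sqrt(2.0 * math.pi))


def importance_sample(M, seed=0):
    rng = random.Random(seed)
    # Step 1: generate particles from the proposal q(x) = N(0, 2^2).
    particles = [rng.gauss(0.0, 2.0) for _ in range(M)]
    # Step 2: weight each particle by p(x) / q(x) with target p(x) = N(1, 1).
    raw = [normal_pdf(x, 1.0, 1.0) / normal_pdf(x, 0.0, 2.0) for x in particles]
    total = sum(raw)
    weights = [w / total for w in raw]  # normalize so the weights sum to one
    return particles, weights


particles, weights = importance_sample(20000)
mean_estimate = sum(w * x for x, w in zip(particles, weights))
```

In a particle filter the same ratio is evaluated recursively, with the likelihood of the newest observation entering the weight update.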

Page 10:

Theory of PFs – Resampling

Problems:
• Weight degeneration
• Wastage of computational resources

Solution: RESAMPLING
• Replicate particles in proportion to their weights

[Figure: particles before and after resampling at successive time instants along the time axis]
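One common way to replicate particles in proportion to their weights is systematic resampling. The sketch below uses that standard technique as an example; the slides do not say which resampling scheme is meant here, and the function name is hypothetical.

```python
import random


def systematic_resample(weights, rng):
    """Return replication counts: how many copies of each particle survive."""
    M = len(weights)
    total = sum(weights)
    counts = [0] * M
    u = rng.random() / M  # single uniform draw in [0, 1/M)
    cumulative = 0.0
    i = 0
    for m in range(M):
        cumulative += weights[m] / total
        # Count the evenly spaced points u + i/M that fall below the running sum.
        while i < M and u + i / M < cumulative:
            counts[m] += 1
            i += 1
    counts[-1] += M - i  # guard against floating-point round-off in the last bin
    return counts


counts = systematic_resample([0.1, 0.6, 0.05, 0.25], random.Random(0))
```

The counts always sum to the number of particles, and a particle with weight w receives either the floor or the ceiling of M·w copies, so low-weight particles tend to be dropped while heavy ones are duplicated.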

Page 11:

Theory of PFs – Bearings-Only Tracking Example

Page 12:

Theory of PFs - Bearings-Only Tracking Example (Cont.)

• Blue – True trajectory

• Red – Estimates

Page 13:

Theory of PFs – Steps and Complexity

Steps (flowchart):
1. Initialize particles
2. New observation
3. Particle generation (particles 1, 2, ..., M)
4. Weight computation (particles 1, 2, ..., M)
5. Normalize weights
6. Output: output estimates
7. Resampling
8. More observations? Yes: return to step 2. No: exit.

Complexity (bearings-only tracking problem, number of particles M = 1000):
• Particle generation: 4M random number generations
• Weight computation: M exponential and arctangent functions
• Resampling: propagation of the particles
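The steps can be put together in one iteration of a sampling-importance-resampling loop. This is a minimal sketch, not the implementation described in the slides: the constant-velocity motion model, the noise levels, the multinomial resampling via `random.choices` and the name `sir_step` are all assumptions.

```python
import math
import random


def sir_step(particles, z, rng, T=1.0, sigma_u=0.01, sigma_v=0.05):
    """One SIR iteration for bearings-only tracking; returns (estimate, particles)."""
    M = len(particles)
    # Particle generation: push every particle through the state equation.
    generated = []
    for x, vx, y, vy in particles:
        ux, uy = rng.gauss(0.0, sigma_u), rng.gauss(0.0, sigma_u)
        generated.append((x + T * vx + 0.5 * T * T * ux, vx + T * ux,
                          y + T * vy + 0.5 * T * T * uy, vy + T * uy))
    # Weight computation: one arctangent and one exponential per particle.
    weights = []
    for x, vx, y, vy in generated:
        err = z - math.atan2(y, x)
        err = math.atan2(math.sin(err), math.cos(err))  # wrap the angle error
        weights.append(math.exp(-0.5 * (err / sigma_v) ** 2))
    # Normalize weights (fall back to uniform if every weight underflowed).
    total = sum(weights)
    weights = [w / total for w in weights] if total > 0.0 else [1.0 / M] * M
    # Output estimate: weighted mean of each state component.
    estimate = tuple(sum(w * p[d] for p, w in zip(generated, weights))
                     for d in range(4))
    # Resampling: multinomial draw in proportion to the weights.
    resampled = rng.choices(generated, weights=weights, k=M)
    return estimate, resampled


rng = random.Random(0)
particles = [(rng.gauss(1.0, 0.1), rng.gauss(0.0, 0.05),
              rng.gauss(1.0, 0.1), rng.gauss(0.0, 0.05)) for _ in range(1000)]
estimate, particles = sir_step(particles, z=math.atan2(1.0, 1.0), rng=rng)
```

The two per-particle loops have no dependencies between particles, while the normalization and the resampling draw need all weights at once.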

Page 15:

Implementation of PFs – VLSI Signal Processing Architectures

Types of architectures
• Programmable digital signal processors
• Application-domain specific processors
• Application specific processors

Application specific processors
• Speed is the main goal
• Functionality of the system does not change

Approach
• Temporal and spatial concurrency
• One-to-one mapping between operations and hardware blocks
• FPGA implementation

Page 16:

Implementation of PFs – Methodology

[Figure: design abstraction levels (system level, algorithmic level, architecture level, RT level, gate level) shown against the impact of a design decision and the complexity]

Joint algorithmic and architectural design

To increase performance, algorithms must be matched to architectures

Page 17:

Implementation of PFs – Algorithm Characteristics

[Flowchart: Start, New observation, Particle generation (1, 2, ..., M), Weight computation (1, 2, ..., M), Resampling, Propagation of particles, Exit]

Particle generation and weight computation
• High computational complexity
• No data dependencies among particles
• Complexity depends on the state space model
• Suitable for parallel and pipelined implementation

Resampling
• Data dependent algorithm
• Low complexity operations
• Propagation of particles: random
• Algorithm does not depend on the state space model

Page 18:

Implementation of PFs – Modifications of the PF

[Figure: timing diagrams of the PF steps (generation of particles, weight computation, resampling, output calculation) over one sampling period T_sirf between time instants T and T+1, with latency labels L_S and L_I and cycle marks M and 2M-1]

Modifications (architecture and algorithm)
• Fine-grain pipelining
• Avoiding normalization
• Loop transformations
• Finite precision arithmetic
• Spatial concurrency
• Dedicated hardware
• Addressing schemes

Parameter        Current      Limits
Sample period    ~2MTclk      ~MTclk
Memories         (2N+1)M      (N+1)M

Page 19:

Implementation of PFs – New Resampling Algorithms

[Figure: two timing diagrams, one per algorithm, showing generation of particles, weight computation, resampling and output calculation over one sampling period T_sirf between time instants T and T+1, with latency label L and cycle marks M, M and M, L_R]

Parameter        Algorithm 1                  Algorithm 2
Sample period    ~2MTclk                      ~MTclk
Memories         Particle memory: (N+1)M      Particle memory: (N+1)M
                 Index memory: 2M             Index memory: 4M
Performances     Same                         Worse (deterministic algorithm)

Page 20:

Implementation of PFs – Architecture

[Figure: block diagram. Blocks: particle generation, weight computation, resampling, index memory, replication factor memory, PMEM control, particle memory (PMEM). Signals: addr, addr, data]

Page 21:

Implementation of PFs – Implementation results

Percentage of utilization of the PF blocks

                 Particle generation    Weight computation    Resampling
Logic blocks     16%                    75%                   9%
Block RAMs       67%                    11%                   22%

Resources
• Logic blocks: 4%
• Memories: 3%

Sampling frequency
• DSP: ~1 kHz
• FPGA: ~50 kHz

Hardware platform is Xilinx Virtex-II Pro

Clock period is 10 ns

The PF is applied to the bearings-only tracking problem

1000 particles are used
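The reported FPGA figure is consistent with the sample period of roughly 2·M·Tclk listed for the current design on the "Modifications of the PF" slide. The check below assumes that this is the relation behind the number; the function name is hypothetical.

```python
def sample_period_seconds(num_particles, clock_period_s, cycles_per_particle=2):
    """Sample period ~ cycles_per_particle * M * Tclk (assumed relation)."""
    return cycles_per_particle * num_particles * clock_period_s


period = sample_period_seconds(1000, 10e-9)  # 2 * 1000 * 10 ns = 20 microseconds
frequency_khz = 1.0 / period / 1e3           # roughly 50 kHz
limit_khz = 1.0 / sample_period_seconds(1000, 10e-9, cycles_per_particle=1) / 1e3
```

Under the same assumption, the limit of roughly M·Tclk from that slide would correspond to about 100 kHz for 1000 particles at a 10 ns clock.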

Page 22:

Implementation of PFs – Parallelism

• Universal architecture with a central unit

[Figure: processing elements 1 to 4 connected to a central unit, next to the PF flowchart (Start, New observation, Particle generation 1...M, Weight computation 1...M, Resampling, Propagation of particles, Exit)]

Processing elements (PE)
• Particle generation
• Weight computation

Central unit
• Algorithm for particle propagation
• Resampling

Page 23:

Implementation of PFs – Propagation of Particles

[Figure: processing elements 1 to 4 connected to a central unit; particles after resampling are exchanged among PE 1, PE 2, PE 3 and PE 4 over time t]

Disadvantages of the particle propagation step
• Random communication pattern
• Decision about connections is not known before the run time
• Requires dynamic type of a network
• Speed-up is significantly affected

Page 24:

Implementation of PFs – Parallel Resampling

[Figure: three examples of particle exchange among four PEs, labelled with per-PE particle counts: N=13, N=0, N=0, N=3; N=8, N=0, N=0, N=8; N=4, N=4, N=4, N=4]

Solution
• The way in which Monte Carlo sampling is performed is modified

Advantages
• Propagation is only local
• Propagation is controlled in advance by a designer
• Performances are the same as in the sequential applications

Result
• Speed-up is almost equal to the number of PEs (up to 8 PEs)
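One way to keep propagation local is to let every processing element resample only its own particles and carry its group weight forward. The sketch below illustrates that idea; it is an assumption about how such a scheme can work, not the specific algorithm from the slides, and all names are hypothetical.

```python
import random


def local_resample(pe_particles, pe_weights, rng):
    """Resample inside one PE, keeping its particle count fixed."""
    n = len(pe_particles)
    group_weight = sum(pe_weights)  # assumed to be positive for every PE
    survivors = rng.choices(pe_particles, weights=pe_weights, k=n)
    # Each survivor carries an equal share of the PE's total weight, so the
    # weight held by this PE is unchanged and no particle leaves the PE.
    return survivors, [group_weight / n] * n


def distributed_resample(particles, weights, num_pes, rng):
    """Split the particles evenly among PEs and resample each group locally."""
    size = len(particles) // num_pes  # assumes M is divisible by the number of PEs
    out_particles, out_weights = [], []
    for k in range(num_pes):
        lo, hi = k * size, (k + 1) * size
        p, w = local_resample(particles[lo:hi], weights[lo:hi], rng)
        out_particles += p
        out_weights += w
    return out_particles, out_weights


out_p, out_w = distributed_resample(
    list(range(8)), [0.05, 0.05, 0.1, 0.2, 0.3, 0.1, 0.1, 0.1], 2, random.Random(0))
```

The price of avoiding communication is that weights are no longer equal across PEs after resampling, so a PE holding little weight keeps spending cycles on unlikely particles unless some particles are exchanged on a schedule fixed by the designer.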

Page 25:

Implementation of PFs – Architectures for Parallel Resampling

• Controlled particle propagation after resampling

Architecture that allows adaptive connection among the processing elements

[Figure: PE1, PE2, PE3 and PE4 connected through a central unit]

Page 26:

Implementation of PFs – Space exploration

[Figure: Virtex-II Pro design space. Sample period (µs, logarithmic axis from 1 to 1000) versus number of PEs (logarithmic axis, marked at 1, 2, 4, 8, 16 and 32), with one curve per number of particles M = 500, 1000, 5000, 10000 and 50000. The design space is bounded by "Limit: Available memory" and "Limit: Logic blocks"]

Hardware platform is Xilinx Virtex-II Pro

Clock period is 10 ns

PFs are applied to the bearings-only tracking problem

Page 27:

Implementation of PFs – Gaussian PFs

[Flowchart: Start, Drawing conditioning particles, Particle generation (1, 2, ..., M), Weight computation (1, 2, ..., M), Computing the mean and the covariance matrix, New observation? (Yes/No), Exit]

Functionality
• Approximates densities by Gaussians
• Propagates only first two moments
• No need for resampling

Advantages
• Sampling period is minimal: ~MTclk
• No need for memories for storing particles
• Simple communication in parallel implementation

Disadvantages
• Higher computational complexity
• Limited scope of applications
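A scalar sketch shows why a Gaussian PF needs neither resampling nor particle storage: only a mean and a variance survive each step. The model x_k = 0.9·x_{k-1} + u_k, z_k = x_k + v_k, the noise levels and the name `gpf_step` are assumptions made for illustration.

```python
import math
import random


def gpf_step(mean, var, z, rng, M=2000, sigma_u=0.5, sigma_v=0.5):
    """One Gaussian PF step for a hypothetical scalar model; returns (mean, var)."""
    # Drawing conditioning particles from the current Gaussian approximation.
    conditioning = [rng.gauss(mean, math.sqrt(var)) for _ in range(M)]
    # Particle generation: propagate through x_k = 0.9 x_{k-1} + u_k.
    particles = [0.9 * x + rng.gauss(0.0, sigma_u) for x in conditioning]
    # Weight computation: Gaussian likelihood of z_k = x_k + v_k.
    weights = [math.exp(-0.5 * ((z - x) / sigma_v) ** 2) for x in particles]
    total = sum(weights)
    # Computing the mean and the (here scalar) covariance of the weighted particles.
    new_mean = sum(w * x for x, w in zip(particles, weights)) / total
    new_var = sum(w * (x - new_mean) ** 2 for x, w in zip(particles, weights)) / total
    return new_mean, new_var


new_mean, new_var = gpf_step(mean=0.0, var=1.0, z=1.0, rng=random.Random(0))
```

Because this example is linear and Gaussian, the result can be compared with a Kalman filter update (mean about 0.81, variance about 0.20); for non-linear models the same steps apply, but the Gaussian fit is only an approximation.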

Page 28:

Implementation of PFs – Gaussian PFs (cont.)

[Figure: Minimum sampling period versus number of PEs of parallel GPFs and SIRs. Sample period (µs, logarithmic axis from 1 to 1000) versus number of processing elements (0 to 35), with curves for SIRF and GPF at M = 500, M = 5000 and M = 50000]

Page 29:

Conclusions and Future Work

Summary
• Modification of the algorithms to be suitable for hardware implementation
• Development of parallel algorithms and architectures
• Implementation of the particle filter in FPGA
• Analysis of the other types of particle filtering algorithms

Future work
• Simplifying floating to fixed-point conversion
• Developing application-domain specific processor for PFs
• Developing reconfigurable architectures for PFs

Page 30:

Miodrag Bolic

ARCHITECTURES FOR EFFICIENT IMPLEMENTATION OF PARTICLE FILTERS

Department of Electrical and Computer Engineering
Stony Brook University

Advisor: Prof. Petar M. Djuric

Dissertation Defense