Fast Computational Methods for Visually Guided Robots
Maryam Mahdaviani, Nando de Freitas, Bob Fraser and Firas Hamze Department of Computer Science, University of British Columbia, CANADA
We apply semi-supervised and active learning algorithms (Zhu et al, 2003) for interactive object recognition in visually guided robots.
These algorithms cost O(M³), but we will show that the cost can be reduced to O(M). We will also reduce storage from O(M²) to O(M).
Object recognition with semi-supervised data and simple color features
Aibo is able to identify objects in different settings.
Aibo can learn and classify several objects at the same time.
Semi-supervised Learning
We have: input data x and a few user-provided labels $y_l$ (e.g. $y_l = 1$, $y_l = 0$).
We want: a full labeling of the data, i.e. the unknown labels $y_u = ?$ of the remaining points $x_i$.

Each pair of points $x_i$, $x_j$ is joined by a Gaussian edge weight

$w_{ij} = e^{-\|x_i - x_j\|^2 / 2\sigma^2}$

and a candidate labeling is scored by the quadratic error function

$\mathrm{Error} = \frac{1}{2} \sum_{i=1}^{M} \sum_{j=1}^{M} w_{ij} \, (y_i - y_j)^2$
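To make the setup concrete, here is a minimal numpy sketch of the edge weights and error function above. The feature vectors (e.g. per-pixel colours) and the bandwidth sigma are assumptions; the slides do not fix either.

```python
import numpy as np

def gaussian_affinity(X, sigma=1.0):
    """Pairwise Gaussian weights w_ij = exp(-||x_i - x_j||^2 / (2 sigma^2)).

    X: (M, d) array of feature vectors (e.g. pixel colours); sigma is a
    hypothetical bandwidth, not specified in the slides.
    """
    sq = np.sum(X**2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T   # squared distances
    return np.exp(-d2 / (2.0 * sigma**2))

def labeling_error(W, y):
    """Error(y) = 1/2 * sum_ij w_ij (y_i - y_j)^2."""
    return 0.5 * np.sum(W * (y[:, None] - y[None, :]) ** 2)
```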
Semi-supervised Learning Leads to a Linear System of Equations

Differentiating the Error function and equating it to zero gives the solution in terms of a linear system of equations (Zhu et al, 2003):

$(D_{uu} - W_{uu})\, y_u = W_{ul}\, y_l$

where W is the adjacency matrix, partitioned into labeled (l) and unlabeled (u) blocks,

$W = \begin{bmatrix} W_{ll} & W_{lu} \\ W_{ul} & W_{uu} \end{bmatrix}$

and D is the diagonal degree matrix

$D = \mathrm{diag}\!\left(\sum_{j} w_{1j}, \; \sum_{j} w_{2j}, \; \ldots, \; \sum_{j} w_{Mj}\right)$
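A direct numpy translation of this system follows: a minimal sketch, with illustrative names, of the naive O(M³) dense solve that the rest of the talk accelerates.

```python
import numpy as np

def propagate_labels(W, labeled_idx, y_l):
    """Solve (D_uu - W_uu) y_u = W_ul y_l for the unlabeled points.

    W: (M, M) affinity matrix; labeled_idx: indices of labeled points;
    y_l: their labels (0 or 1).
    """
    M = W.shape[0]
    u = np.setdiff1d(np.arange(M), labeled_idx)   # unlabeled indices
    D = np.diag(W.sum(axis=1))                    # diagonal degree matrix
    A = D[np.ix_(u, u)] - W[np.ix_(u, u)]         # D_uu - W_uu
    b = W[np.ix_(u, labeled_idx)] @ y_l           # W_ul y_l
    return np.linalg.solve(A, b)                  # dense O(M^3) solve
```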
The big computational bottleneck is O(M³)

What is a large M?

1955: M = 20
1965: M = 200
1980: M = 2000
1995: M = 20000
2005: M = 200000

Solving the linear system of equations

$(D_{uu} - W_{uu})\, y_u = W_{ul}\, y_l$

costs O(M³), where M is the (large) number of unlabeled features.

So over the course of 50 years, M has increased by a factor of 10⁴, while the speed of computers has increased by a factor of 10¹². Since (10⁴)³ = 10¹², the cost of an O(M³) algorithm has grown exactly as fast as the hardware: faster computers alone buy no net speedup, and the problematic O(M³) bottleneck is evident.
From O(M³) to O(M²): Krylov Iterative Methods (MINRES)

• Using iterative methods (among which Krylov subspace methods are well known to work best), the cost can be reduced to O(M²) times the number of iterations. The expensive step in each iteration is the matrix-vector multiplication

$v = (D_{uu} - W_{uu})\, q$

• This matrix-vector multiplication can be written as two O(M²) Gaussian kernel estimates:

$d_i = \sum_{j=1}^{M} w_{ij}, \qquad g_i = \sum_{j=1}^{M} q_j\, w_{ij}$

• These kernel estimates can be evaluated in O(M) operations using the Fast Gauss Transform, as sketched below.
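Here is a sketch of this step using scipy's stock MINRES, supplying the matrix only through its matrix-vector product. In this sketch the product is still formed densely in O(M²); the Fast Gauss Transform replaces exactly this step. Function and variable names are illustrative.

```python
import numpy as np
from scipy.sparse.linalg import LinearOperator, minres

def solve_with_minres(W, labeled_idx, y_l):
    """Iterative solve of (D_uu - W_uu) y_u = W_ul y_l with MINRES."""
    M = W.shape[0]
    u = np.setdiff1d(np.arange(M), labeled_idx)
    W_uu = W[np.ix_(u, u)]
    W_ul = W[np.ix_(u, labeled_idx)]
    d_u = W.sum(axis=1)[u]                 # kernel sums d_i = sum_j w_ij

    def matvec(q):
        # g_i = sum_j w_ij q_j is the second kernel estimate: O(M^2)
        # here, O(M) once the Fast Gauss Transform takes over.
        return d_u * q - W_uu @ q

    A = LinearOperator((len(u), len(u)), matvec=matvec)
    y_u, info = minres(A, W_ul @ y_l)      # info == 0 on convergence
    return y_u
```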
From O(M²) to O(M): The Fast Gauss Transform*

• Intuition: rather than summing the contribution of every individual Gaussian at every target, group the source points about a few expansion centers and truncate the series expansion of the Gaussian to p terms, so that all sources contribute through only p accumulated moments per center (see the pseudocode later in the deck).

* L. Greengard and V. Rokhlin, 1987

Storage requirement is also reduced from O(M²) to O(M)!
Training in Real Time
Predicting Pixel Labels
• Once we have labels for the M points in our training data, we use a classical kernel discriminant to label the N test pixels:

$\hat{y}_k = \frac{\sum_{i=1}^{M} w_{ki}\, y_i}{\sum_{i=1}^{M} w_{ki}}$

• The cost is O(NM)!
• By applying the Fast Gauss Transform, the cost can be reduced to O(N + M); a sketch follows below.
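A minimal numpy sketch of this kernel discriminant with Gaussian weights, evaluated naively in O(NM) (the bandwidth sigma is again an assumption):

```python
import numpy as np

def predict_pixels(X_test, X_train, y_train, sigma=1.0):
    """Kernel discriminant y_hat_k = sum_i w_ki y_i / sum_i w_ki."""
    d2 = (np.sum(X_test**2, axis=1)[:, None]
          + np.sum(X_train**2, axis=1)[None, :]
          - 2.0 * X_test @ X_train.T)
    W = np.exp(-d2 / (2.0 * sigma**2))      # w_ki, an (N, M) matrix
    return (W @ y_train) / W.sum(axis=1)    # numerator over denominator
```

The Fast Gauss Transform evaluates the same two sums (numerator and denominator) without ever forming the (N, M) weight matrix, which is where the O(N + M) time and O(M) storage come from.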
Predicting Pixel Labels in Real Time
Active Learning
• Labeling data is an expensive process. We use active learning to automatically choose which pixels should be labeled.
• Active learning calls the semi-supervised learning subroutine at each iteration.
Active learning: asking the right questions
Aibo recognizes the ball without a problem.
Since the orange ring is close to the ball in colour space, Aibo gets confused and decides to prompt the user for labels.
• We want the robot to ask the right questions: it prompts the user for the labels that will improve its performance the most (a minimal selection sketch follows below).
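The slides do not spell out the selection criterion, so the sketch below substitutes plain uncertainty sampling, querying the pixel whose propagated label is closest to the decision boundary, as a stand-in; `solve_with_minres` refers to the earlier sketch.

```python
import numpy as np

def query_next_pixel(y_u):
    """Uncertainty-sampling stand-in (an assumption, not the talk's
    stated criterion): query the unlabeled point whose propagated
    label is closest to the 0.5 decision boundary."""
    return int(np.argmin(np.abs(y_u - 0.5)))

# Active-learning loop: the semi-supervised solver is re-run after
# every query, which is why the fast solver matters.
# while budget_remains:
#     y_u = solve_with_minres(W, labeled_idx, y_l)
#     k = query_next_pixel(y_u)
#     ... ask the user for pixel k's label, move it to the labeled set ...
```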
• We managed to reduce the computational cost from O(M³) to O(M) and the storage requirement from O(M²) to O(M).
• Currently we are using more sophisticated features (SIFT) and dual KD-tree recursion methods to deal with high dimensions.
• These methods can be applied to other problems such as SLAM, segmentation, ranking and Gaussian processes.
Thank You!
Questions?
• One solution: the Power Method, iterating

$y_u^{(t+1)} = D_{uu}^{-1} \left[ W_{uu}\, y_u^{(t)} + W_{ul}\, y_l \right]$

• Each iteration costs O(M²) (one matrix-vector product).
• But it might take TOO MANY iterations to converge.
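A minimal sketch of this fixed-point iteration; `d_u` holds the diagonal of D_uu and the names are illustrative:

```python
import numpy as np

def power_method_labels(W_uu, W_ul, y_l, d_u, n_iter=1000, tol=1e-8):
    """Iterate y_u <- D_uu^{-1} [ W_uu y_u + W_ul y_l ].

    Each sweep is one O(M^2) matrix-vector product, but the number of
    sweeps needed to converge can be very large.
    """
    y_u = np.zeros(W_uu.shape[0])
    for _ in range(n_iter):
        y_new = (W_uu @ y_u + W_ul @ y_l) / d_u
        if np.max(np.abs(y_new - y_u)) < tol:   # simple stopping test
            return y_new
        y_u = y_new
    return y_u
```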
The Fast Gauss Transform: Reduction of Complexity

Straightforward (nested loop) evaluation of the kernel sums

$g(t_j) = \sum_{i=1}^{N} q_i\, w_{ji}$

for j = 1, ..., M
    g_j = 0
    for i = 1, ..., N
        g_j = g_j + w_ji q_i
    end
end

costs O(MN) operations.
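The same nested loop transcribed into runnable numpy (it is equivalent to one dense matrix-vector product):

```python
import numpy as np

def naive_weighted_sum(q, W):
    """Direct O(MN) evaluation of g_j = sum_i q_i w_ji.

    W: (M, N) matrix of Gaussian weights; q: (N,) vector of weights.
    """
    M, N = W.shape
    g = np.zeros(M)
    for j in range(M):
        for i in range(N):
            g[j] += W[j, i] * q[i]
    return g                                 # equivalently: W @ q
```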
The Fast Gauss Transform: Reduction of Complexity

With a truncated series expansion of order p << N about a center x*, the same sums are computed in two passes. First accumulate p moments from the sources:

for m = 0, ..., p-1
    c_m = 0
    for i = 1, ..., N
        c_m = c_m + a_m(x_i - x*) q_i
    end
end

Then evaluate the truncated expansion at the targets:

for j = 1, ..., N
    g_j = 0
    for m = 0, ..., p-1
        g_j = g_j + c_m f_m(y_j - x*_B)
    end
end

Each pass is a single loop over the data, so the total cost is O(p(N + M)) instead of O(NM).
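A runnable single-center sketch of this scheme in one dimension, using the Hermite expansion of the Gaussian, $e^{-(x-u)^2} = \sum_m \frac{u^m}{m!} h_m(x)$. The full Fast Gauss Transform partitions sources and targets into boxes with many expansion centers; the single center, the order p and the bandwidth delta here are simplifying assumptions.

```python
import numpy as np

def fgt_1d(sources, targets, q, delta, p=10):
    """Approximate g_j = sum_i q_i exp(-(t_j - s_i)^2 / delta) with a
    p-term Hermite expansion about one center: O(p(N + M)) work
    instead of O(N M)."""
    s_star = sources.mean()                  # single expansion center
    u = (sources - s_star) / np.sqrt(delta)
    x = (targets - s_star) / np.sqrt(delta)

    # Accumulation pass: moments C_m = sum_i q_i u_i^m / m!   -- O(pN)
    C = np.empty(p)
    term = q.astype(float).copy()
    for m in range(p):
        C[m] = term.sum()
        term = term * u / (m + 1)

    # Evaluation pass: g_j = sum_m C_m h_m(x_j), with the Hermite
    # functions built by h_{m+1}(x) = 2x h_m(x) - 2m h_{m-1}(x) -- O(pM)
    h_prev = np.zeros_like(x)
    h = np.exp(-x**2)                        # h_0
    g = C[0] * h
    for m in range(1, p):
        h, h_prev = 2.0 * x * h - 2.0 * (m - 1) * h_prev, h
        g += C[m] * h
    return g
```

The approximation is accurate when the scaled distances |u| stay moderate, which is what the box partitioning of the full algorithm guarantees.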
Training Set Time Comparison

Computational time (seconds):

M       Naïve       MINRES      MINRES-FGT
60      0.521703    0.119514    0.312126
120     4.23732     0.250050    0.518589
240     78.7864     0.729464    0.791181
480     501.46      2.56246     1.165930
960     --------    63.9487     2.02537
1920    --------    497.59      3.97674
Test Set Time Comparison

Computational time (seconds):

N       Naïve       FGT
260     0.0036083   0.035944
520     0.128507    0.113086
1040    0.458446    0.178275
2080    1.69306     0.321210
4160    6.62728     0.682747
8320    20.56953    0.858313
Krylov Subspace Methods: MINRES Algorithm

For t = 1, 2, 3, ...

$v = (D_{uu} - W_{uu})\, q^{(t)}$

$v^{(t+1)} = v - q^{(t)} (q^{(t)})^{T} v - q^{(t-1)} (q^{(t-1)})^{T} v$

$q^{(t+1)} = v^{(t+1)} / \|v^{(t+1)}\|$

Find $c$ to minimize $\|\tilde{H}_t\, c - \beta e^{(1)}\|$ and set $y_u^{(t)} = Q^{(t)} c$.

The cost can be reduced to O(M²) times the number of iterations.
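A textbook-faithful sketch of this loop: a minimal implementation that re-solves the small least-squares problem at every step (production MINRES updates the solution progressively with short recurrences). `A_mv` stands for the matrix-vector product q -> (D_uu - W_uu) q.

```python
import numpy as np

def minres_sketch(A_mv, b, max_t=100, tol=1e-8):
    """MINRES via Lanczos: at step t, minimize ||H~_t c - beta e1||
    over the Krylov basis Q_t and set y = Q_t c."""
    beta0 = np.linalg.norm(b)
    Q = [b / beta0]                       # Lanczos basis q^(1), q^(2), ...
    alphas, betas = [], []
    q_old, beta_old = np.zeros_like(b), 0.0
    y = np.zeros_like(b)
    for t in range(1, max_t + 1):
        v = A_mv(Q[-1])                   # v = (D_uu - W_uu) q^(t)
        alpha = Q[-1] @ v
        v = v - alpha * Q[-1] - beta_old * q_old   # orthogonalize
        beta = np.linalg.norm(v)
        alphas.append(alpha)
        betas.append(beta)

        # (t+1) x t tridiagonal H~_t from the Lanczos coefficients.
        H = np.zeros((t + 1, t))
        for k in range(t):
            H[k, k] = alphas[k]
            H[k + 1, k] = betas[k]
            if k + 1 < t:
                H[k, k + 1] = betas[k]
        e1 = np.zeros(t + 1)
        e1[0] = beta0
        c, *_ = np.linalg.lstsq(H, e1, rcond=None)
        y = np.column_stack(Q) @ c        # y^(t) = Q^(t) c
        if np.linalg.norm(A_mv(y) - b) <= tol * beta0 or beta < 1e-14:
            break
        q_old, beta_old = Q[-1], beta
        Q.append(v / beta)                # q^(t+1)
    return y
```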