dan feldman - collège de france...dan feldman [email protected] our robotics & big data lab...

92
Learning streaming and distributed big data using core-sets Dan Feldman Robotics & BigData Lab University of Haifa 1

Upload: others

Post on 05-Jul-2020

2 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Dan Feldman - Collège de France...Dan Feldman dannyf@csail.mit.edu Our Robotics & Big Data lab • Toy-drones, no sensors or tiny analog camera • Good: - Safe for indoor navigation,

Learning streaming and distributed big data using core-sets

1

Dan Feldman

Robotics & BigData LabUniversity of Haifa 1

Page 2: Dan Feldman - Collège de France...Dan Feldman dannyf@csail.mit.edu Our Robotics & Big Data lab • Toy-drones, no sensors or tiny analog camera • Good: - Safe for indoor navigation,

2

Forge links between:

•Computational Geometry

•Core-sets

•Machine Learning of Big Data

•Robotics

Challenges of this talk

Page 3: Dan Feldman - Collège de France...Dan Feldman dannyf@csail.mit.edu Our Robotics & Big Data lab • Toy-drones, no sensors or tiny analog camera • Good: - Safe for indoor navigation,

3

Forge links between:

•Approximated CaratheodoryTheorem

•Core-sets for mean queries

•Google’s PageRank

•Real time pose estimation

Challenges of this talk

Page 4: Dan Feldman - Collège de France...Dan Feldman dannyf@csail.mit.edu Our Robotics & Big Data lab • Toy-drones, no sensors or tiny analog camera • Good: - Safe for indoor navigation,

Big Data

• Volume: huge amount of data points

• Variety: huge number of sensors

• Velocity: data arrive in real-time streaming

Need:

• Streaming algorithms (use logarithmic memory)

• Parallel algorithms (use networks, clouds)

• Simple computations (use GPUs)

• No assumption on order of points

Page 5: Dan Feldman - Collège de France...Dan Feldman dannyf@csail.mit.edu Our Robotics & Big Data lab • Toy-drones, no sensors or tiny analog camera • Good: - Safe for indoor navigation,

Big Data Computation model• = Streaming + Parallel computation

• Input: infinite stream of vectors

• 𝑛 = vectors seen so far

• ~log 𝑛 memory

• M processors

• ~log (n)/M insertion time per point

(Embarrassingly parallel)

5

Page 6: Dan Feldman - Collège de France...Dan Feldman dannyf@csail.mit.edu Our Robotics & Big Data lab • Toy-drones, no sensors or tiny analog camera • Good: - Safe for indoor navigation,

Challenge:Find RIGHT data from Big Data

6

Given data D and Algorithm A with A(D)intractable, can we efficiently reduce D to C so that A(C) fast and A(C)~A(D)?

Provable guarantees on approximation with respect to the size of C

Page 7: Dan Feldman - Collège de France...Dan Feldman dannyf@csail.mit.edu Our Robotics & Big Data lab • Toy-drones, no sensors or tiny analog camera • Good: - Safe for indoor navigation,

Naïve Uniform Sampling (RANSAC)

7

Page 8: Dan Feldman - Collège de France...Dan Feldman dannyf@csail.mit.edu Our Robotics & Big Data lab • Toy-drones, no sensors or tiny analog camera • Good: - Safe for indoor navigation,

8

Naïve Uniform Sampling

Small cluster is missed

Sample a set U of m points uniformly

Page 9: Dan Feldman - Collège de France...Dan Feldman dannyf@csail.mit.edu Our Robotics & Big Data lab • Toy-drones, no sensors or tiny analog camera • Good: - Safe for indoor navigation,

9

Coreset for Image Denoising

• Existing de-noising algorithms works only on small (low-definition) images off-line

• For HD or real-time streaming: Use random sampling (RANSAC)

[F, Feigin , Sochen [SSVM’13]

Page 10: Dan Feldman - Collège de France...Dan Feldman dannyf@csail.mit.edu Our Robotics & Big Data lab • Toy-drones, no sensors or tiny analog camera • Good: - Safe for indoor navigation,

10

RANSAC will not find rare but important parts

Page 11: Dan Feldman - Collège de France...Dan Feldman dannyf@csail.mit.edu Our Robotics & Big Data lab • Toy-drones, no sensors or tiny analog camera • Good: - Safe for indoor navigation,

From Big Data to Small Data

Suppose that we can compute such a corset 𝐶 of size 1

𝜖for every set 𝑃 of n points

• in time 𝑛3,• off-line, non-parallel, non-streaming algorithm

1 2 3 4 5 6 7 8 9 10 11t

10

9

5

11

y

1 2 3 4 5 6 7 8 9 10 11t

10 11p

10 10|| (10) ||p f

9

5

11

y

(10)f

~

1 ± 𝜖

Page 12: Dan Feldman - Collège de France...Dan Feldman dannyf@csail.mit.edu Our Robotics & Big Data lab • Toy-drones, no sensors or tiny analog camera • Good: - Safe for indoor navigation,

Read the first 2

𝜖streaming points and reduce them

into 1

𝜖weighted points in time

2

𝜖

5

1 + 𝜖 corset for 𝑃1

Page 13: Dan Feldman - Collège de France...Dan Feldman dannyf@csail.mit.edu Our Robotics & Big Data lab • Toy-drones, no sensors or tiny analog camera • Good: - Safe for indoor navigation,

Read the next 2

𝜖streaming point and reduce them

into 1

𝜖weighted points in time

2

𝜖

5

1 + 𝜖 corset for 𝑃21 + 𝜖 corset for 𝑃1

Page 14: Dan Feldman - Collège de France...Dan Feldman dannyf@csail.mit.edu Our Robotics & Big Data lab • Toy-drones, no sensors or tiny analog camera • Good: - Safe for indoor navigation,

Merge the pair of 𝜖-coresets into an 𝜖-corset

of 2

𝜖weighted points

1 + 𝜖-corset for 𝑃1 ∪ 𝑃2

Page 15: Dan Feldman - Collège de France...Dan Feldman dannyf@csail.mit.edu Our Robotics & Big Data lab • Toy-drones, no sensors or tiny analog camera • Good: - Safe for indoor navigation,

Delete the pair of original coresets from memory

1 + 𝜖-corset for 𝑃1 ∪ 𝑃2

Page 16: Dan Feldman - Collège de France...Dan Feldman dannyf@csail.mit.edu Our Robotics & Big Data lab • Toy-drones, no sensors or tiny analog camera • Good: - Safe for indoor navigation,

Reduce the2

𝜖weighted points into

1

𝜖weighted

points by constructing their coreset

1 + 𝜖-corset for 𝑃1 ∪ 𝑃2

1 + 𝜖-corset for

Page 17: Dan Feldman - Collège de France...Dan Feldman dannyf@csail.mit.edu Our Robotics & Big Data lab • Toy-drones, no sensors or tiny analog camera • Good: - Safe for indoor navigation,

Reduce the2

𝜖weighted points into

1

𝜖weighted

points by constructing their coreset

1 + 𝜖-corset for 𝑃1 ∪ 𝑃2

1 + 𝜖-corset for

= 1 + 𝜖 2-corset for 𝑃1 ∪ 𝑃2

Page 18: Dan Feldman - Collège de France...Dan Feldman dannyf@csail.mit.edu Our Robotics & Big Data lab • Toy-drones, no sensors or tiny analog camera • Good: - Safe for indoor navigation,

1 + 𝜖 2-corset for 𝑃1 ∪ 𝑃2

1 + 𝜖 -corset for 𝑃3

Page 19: Dan Feldman - Collège de France...Dan Feldman dannyf@csail.mit.edu Our Robotics & Big Data lab • Toy-drones, no sensors or tiny analog camera • Good: - Safe for indoor navigation,

1 + 𝜖 2-corset for 𝑃1 ∪ 𝑃2

1 + 𝜖 -corset for 𝑃3 1 + 𝜖 -corset for 𝑃4

Page 20: Dan Feldman - Collège de France...Dan Feldman dannyf@csail.mit.edu Our Robotics & Big Data lab • Toy-drones, no sensors or tiny analog camera • Good: - Safe for indoor navigation,

1 + 𝜖 2-corset for 𝑃1 ∪ 𝑃2 1 + 𝜖 -corset for 𝑃3 ∪ 𝑃4

Page 21: Dan Feldman - Collège de France...Dan Feldman dannyf@csail.mit.edu Our Robotics & Big Data lab • Toy-drones, no sensors or tiny analog camera • Good: - Safe for indoor navigation,

1 + 𝜖 2-corset for 𝑃1 ∪ 𝑃2 1 + 𝜖 2-corset for 𝑃3 ∪ 𝑃4

Page 22: Dan Feldman - Collège de France...Dan Feldman dannyf@csail.mit.edu Our Robotics & Big Data lab • Toy-drones, no sensors or tiny analog camera • Good: - Safe for indoor navigation,
Page 23: Dan Feldman - Collège de France...Dan Feldman dannyf@csail.mit.edu Our Robotics & Big Data lab • Toy-drones, no sensors or tiny analog camera • Good: - Safe for indoor navigation,

1 + 𝜖 2-coreset for

𝑃1 ∪ 𝑃2 ∪ 𝑃3 ∪ 𝑃4

Page 24: Dan Feldman - Collège de France...Dan Feldman dannyf@csail.mit.edu Our Robotics & Big Data lab • Toy-drones, no sensors or tiny analog camera • Good: - Safe for indoor navigation,

1 + 𝜖 3-coreset for

𝑃1 ∪ 𝑃2 ∪ 𝑃3 ∪ 𝑃4

Page 25: Dan Feldman - Collège de France...Dan Feldman dannyf@csail.mit.edu Our Robotics & Big Data lab • Toy-drones, no sensors or tiny analog camera • Good: - Safe for indoor navigation,
Page 26: Dan Feldman - Collège de France...Dan Feldman dannyf@csail.mit.edu Our Robotics & Big Data lab • Toy-drones, no sensors or tiny analog camera • Good: - Safe for indoor navigation,

Parallel Computation

Page 27: Dan Feldman - Collège de France...Dan Feldman dannyf@csail.mit.edu Our Robotics & Big Data lab • Toy-drones, no sensors or tiny analog camera • Good: - Safe for indoor navigation,

Parallel Computation

Page 28: Dan Feldman - Collège de France...Dan Feldman dannyf@csail.mit.edu Our Robotics & Big Data lab • Toy-drones, no sensors or tiny analog camera • Good: - Safe for indoor navigation,

Parallel ComputationRun off-line algorithm on corset using single computer

Page 29: Dan Feldman - Collège de France...Dan Feldman dannyf@csail.mit.edu Our Robotics & Big Data lab • Toy-drones, no sensors or tiny analog camera • Good: - Safe for indoor navigation,

29

Parallel+ Streaming Computation

Page 30: Dan Feldman - Collège de France...Dan Feldman dannyf@csail.mit.edu Our Robotics & Big Data lab • Toy-drones, no sensors or tiny analog camera • Good: - Safe for indoor navigation,

30ICRA’14 (With Rus, Paul and Newman)

Page 31: Dan Feldman - Collège de France...Dan Feldman dannyf@csail.mit.edu Our Robotics & Big Data lab • Toy-drones, no sensors or tiny analog camera • Good: - Safe for indoor navigation,
Page 32: Dan Feldman - Collège de France...Dan Feldman dannyf@csail.mit.edu Our Robotics & Big Data lab • Toy-drones, no sensors or tiny analog camera • Good: - Safe for indoor navigation,
Page 33: Dan Feldman - Collège de France...Dan Feldman dannyf@csail.mit.edu Our Robotics & Big Data lab • Toy-drones, no sensors or tiny analog camera • Good: - Safe for indoor navigation,

33

Example Coresets

• Graph/Vector Summarization [F, Rus, Ozer]• LSA/PCA/SVD [F, Rus, and Volkob, NIPS’16]• k-Means [F, Barger, SDM’16]• Non-Negative Matrix Factorization [F, Tassa, KDD15]• Robots Localization [F, Cindy, Rus, ICRA’15]• Robots Coverage [F, Gil, Rus, ICRA’13]• Segmentation [F, Rosman, Rus, Volkob, NIPS’14]• Dictionary Learning and Image Denoising

[F, Sochen, J. of Math. Image & Vision, 12]• Mixture of Gaussians [F Krause, NIPS’11]• k-Line Means [F, Fiat, Sharir, FOCS’06]• …

Page 34: Dan Feldman - Collège de France...Dan Feldman dannyf@csail.mit.edu Our Robotics & Big Data lab • Toy-drones, no sensors or tiny analog camera • Good: - Safe for indoor navigation,

34

Coreset for robotics (video)

Page 35: Dan Feldman - Collège de France...Dan Feldman dannyf@csail.mit.edu Our Robotics & Big Data lab • Toy-drones, no sensors or tiny analog camera • Good: - Safe for indoor navigation,

M ean Q ueries² Input: P in Rd

Page 36: Dan Feldman - Collège de France...Dan Feldman dannyf@csail.mit.edu Our Robotics & Big Data lab • Toy-drones, no sensors or tiny analog camera • Good: - Safe for indoor navigation,

M ean Q ueries² Input: P in Rd

² Query: a point q 2 Rd

q

Page 37: Dan Feldman - Collège de France...Dan Feldman dannyf@csail.mit.edu Our Robotics & Big Data lab • Toy-drones, no sensors or tiny analog camera • Good: - Safe for indoor navigation,

M ean Q ueries² Input: P in Rd

² Query: a point q 2 Rd

q

Page 38: Dan Feldman - Collège de France...Dan Feldman dannyf@csail.mit.edu Our Robotics & Big Data lab • Toy-drones, no sensors or tiny analog camera • Good: - Safe for indoor navigation,

¡dist (p; q)

¢2= kp ¡ qk

2

= kpk2

+ kqk2

¡ 2p ¢q

²

X

p2 P

¡dist (p; q)

¢2=

X

p2 P

kpk2

+X

p2 P

kqk2

¡ 2q ¢X

p2 P

p

=X

p2 P

kpk2

+ jPj kqk2

²

Coreset For M ean Q ueries

nמn

n

Page 39: Dan Feldman - Collège de France...Dan Feldman dannyf@csail.mit.edu Our Robotics & Big Data lab • Toy-drones, no sensors or tiny analog camera • Good: - Safe for indoor navigation,

¡dist (p; q)

¢2= kp ¡ qk

2

= kpk2

+ kqk2

¡ 2p ¢q

²

X

p2 P

¡dist (p; q)

¢2=

X

p2 P

kpk2

+X

p2 P

kqk2

¡ 2q ¢X

p2 P

p

=X

p2 P

kpk2

+ jPj kqk2

²

Coreset For M ean Q ueries

nמn

n

Problem: compute a small weighted subset deterministically. [ICML’17, with Rus and Ozer]

Page 40: Dan Feldman - Collège de France...Dan Feldman dannyf@csail.mit.edu Our Robotics & Big Data lab • Toy-drones, no sensors or tiny analog camera • Good: - Safe for indoor navigation,

40

Relation to Google’s PageRank• Input: Binary adjacency matrix 𝐺 of a graph.• Scale every column to have sum of 1• (𝐺 is now a stochastic matrix)

• Let 𝑑 = 0.85 to get a positive stochastic matrix:𝐴 = 𝑑 ∗ 𝐺 + 1 − 𝑑 ⋅ 𝟏

• There is a distribution 𝑥 such that 𝐴𝑥 = 𝑥(Perron–Frobenius theorem)

• 𝐵𝑥 = 0 for 𝐵 = 𝐴 − 𝐼• Output: 𝑥 (PageRank vector)

Page 41: Dan Feldman - Collège de France...Dan Feldman dannyf@csail.mit.edu Our Robotics & Big Data lab • Toy-drones, no sensors or tiny analog camera • Good: - Safe for indoor navigation,

41

Relation to Google’s PageRank• Input: Binary adjacency matrix 𝐺 of a graph.• Scale every column to have sum of 1• (𝐺 is now a stochastic matrix)

• Let 𝑑 = 0.85 to get a positive stochastic matrix:𝐴 = 𝑑 ∗ 𝐺 + 1 − 𝑑 ⋅ 𝟏

• There is a distribution 𝑥 such that 𝐴𝑥 = 𝑥(Perron–Frobenius theorem)

• 𝐵𝑥 = 0 for 𝐵 = 𝐴 − 𝐼• Output: 𝑥 (PageRank vector)

• Core-set: a sparse 𝑥′ such that 𝐵𝑥′ < 𝜖

Page 42: Dan Feldman - Collège de France...Dan Feldman dannyf@csail.mit.edu Our Robotics & Big Data lab • Toy-drones, no sensors or tiny analog camera • Good: - Safe for indoor navigation,

[email protected] Feldman

Common Localization of quadcopter

• Many sensors: GPS, Kinect, GoPro, LiDAR, IMU, Sonar

• Good:Easy to hover and navigate

• Bad:- Dangerous, expensive, heavy- Hard to compare & analyze

Page 43: Dan Feldman - Collège de France...Dan Feldman dannyf@csail.mit.edu Our Robotics & Big Data lab • Toy-drones, no sensors or tiny analog camera • Good: - Safe for indoor navigation,

[email protected] Feldman

Our Robotics & Big Data lab

• Toy-drones, no sensors or tiny analog camera

• Good:- Safe for indoor navigation, and low-cost- Easy to model

• Bad:- Unstable- Need ~ 30 location updates per second

Page 44: Dan Feldman - Collège de France...Dan Feldman dannyf@csail.mit.edu Our Robotics & Big Data lab • Toy-drones, no sensors or tiny analog camera • Good: - Safe for indoor navigation,

Expensive Tracking System

Page 45: Dan Feldman - Collège de France...Dan Feldman dannyf@csail.mit.edu Our Robotics & Big Data lab • Toy-drones, no sensors or tiny analog camera • Good: - Safe for indoor navigation,

Challenge: use weak hardware

Using stronger algorithms

Page 46: Dan Feldman - Collège de France...Dan Feldman dannyf@csail.mit.edu Our Robotics & Big Data lab • Toy-drones, no sensors or tiny analog camera • Good: - Safe for indoor navigation,

𝑡

𝑝4

𝑝2

𝑝3

𝑝1

𝑃

𝑞1𝑞2

𝑞3𝑞4

𝑄

Exact Translation Recovery

Page 47: Dan Feldman - Collège de France...Dan Feldman dannyf@csail.mit.edu Our Robotics & Big Data lab • Toy-drones, no sensors or tiny analog camera • Good: - Safe for indoor navigation,

𝑄 = 𝑃 + 𝑡

𝑞1 = 𝑝1 + 𝑡𝑞2 = 𝑝2 + 𝑡

𝑞3 = 𝑝3 + 𝑡𝑞4=𝑝4 + 𝑡

𝑝2

𝑝3𝑝4

𝑝1

𝑃

𝑡

Exact Translation Problem

Page 48: Dan Feldman - Collège de France...Dan Feldman dannyf@csail.mit.edu Our Robotics & Big Data lab • Toy-drones, no sensors or tiny analog camera • Good: - Safe for indoor navigation,

𝑄 = 𝑃 + 𝑡

𝑞1 = 𝑝1 + 𝑡𝑞2 = 𝑝2 + 𝑡

𝑞3 = 𝑝3 + 𝑡𝑞4=𝑝4 + 𝑡

𝑡 = 𝑞1 − 𝑝1

𝑝2

𝑝3𝑝4

𝑝1

𝑃

Exact Translation Recovery

𝑡 = 𝑞1 − 𝑝1Solution:

Page 49: Dan Feldman - Collège de France...Dan Feldman dannyf@csail.mit.edu Our Robotics & Big Data lab • Toy-drones, no sensors or tiny analog camera • Good: - Safe for indoor navigation,

𝑝2

𝑝3𝑝4

𝑝1

𝑃

𝑞1𝑞2

𝑞3𝑞4

𝑄

Noisy Observations

Added Gaussian noise due to:- Low resolution- Few Frames Per

Second (FPS)- Latency (delay)- Communication errors- Camera Tilting

Page 50: Dan Feldman - Collège de France...Dan Feldman dannyf@csail.mit.edu Our Robotics & Big Data lab • Toy-drones, no sensors or tiny analog camera • Good: - Safe for indoor navigation,

𝑡

𝑝4

𝑝2

𝑝3

𝑝1

𝑃

𝑞1𝑞2

𝑞3𝑞4

𝑄

Added Gaussian noise due to:- Low resolution- Few Frames Per

Second (FPS)- Latency (delay)- Communication errors- Camera Tilting

𝑃 + 𝑡 ~ 𝑄

Translation Estimation

Page 51: Dan Feldman - Collège de France...Dan Feldman dannyf@csail.mit.edu Our Robotics & Big Data lab • Toy-drones, no sensors or tiny analog camera • Good: - Safe for indoor navigation,

𝑡

𝑝4 +𝑡

𝑝2 +𝑡

𝑝3 + 𝑡

𝑝1 + 𝑡

𝑃 + 𝑡

𝑞1𝑞2

𝑞3𝑞4

𝑄Compute a translation 𝑡 of 𝑃 that minimizes the sum of squared distances to Q

min𝑡

𝑖=1

𝑛

dist2 𝑝𝑖 + 𝑡, 𝑞𝑖

Translation Estimation

Page 52: Dan Feldman - Collège de France...Dan Feldman dannyf@csail.mit.edu Our Robotics & Big Data lab • Toy-drones, no sensors or tiny analog camera • Good: - Safe for indoor navigation,

𝑝2

𝑝3𝑝4

𝑝1

𝑃

The object not only moves, but also rotates in space

𝑄 = Translation & Rotation of 𝑃

Page 53: Dan Feldman - Collège de France...Dan Feldman dannyf@csail.mit.edu Our Robotics & Big Data lab • Toy-drones, no sensors or tiny analog camera • Good: - Safe for indoor navigation,

𝑝2

𝑝3𝑝4

𝑝1

𝑃

𝑞1

𝑞2

𝑞3

𝑞4 𝑄

𝑅 ⋅ 𝑃 𝑅 ⋅ 𝑃 + 𝑡

The Pose-Estimation Problem

A rotation corresponds to a

rotation matrix 𝑅 in ℝ𝑑×𝑑:

𝑞𝑖 = 𝑅𝑝𝑖 + 𝑡

𝑅𝑜𝑡𝑎𝑡𝑖𝑜𝑛𝑀𝑎𝑡𝑟𝑖𝑥

𝑹

Page 54: Dan Feldman - Collège de France...Dan Feldman dannyf@csail.mit.edu Our Robotics & Big Data lab • Toy-drones, no sensors or tiny analog camera • Good: - Safe for indoor navigation,

translation 𝑡

𝑅 ⋅ 𝑝4 +𝑡

𝑅 ⋅ 𝑝2 +𝑡

𝑅 ⋅ 𝑝3 + 𝑡

𝑅 ⋅ 𝑝1 + 𝑡

𝑅 ⋅ 𝑃 + 𝑡

𝑞1𝑞2

𝑞3𝑞4

𝑄

𝑅𝑜𝑡𝑎𝑡𝑖𝑜𝑛𝑀𝑎𝑡𝑟𝑖𝑥

𝑹

min𝑡 ,𝑅

𝑖=1

𝑛

dist2 𝑅 ⋅ 𝑝𝑖 + 𝑡, 𝑞𝑖

The Pose-Estimation Problem

Compute Rotation & Translation of P that minimizes its sum of squared distances to 𝑄 :

Page 55: Dan Feldman - Collège de France...Dan Feldman dannyf@csail.mit.edu Our Robotics & Big Data lab • Toy-drones, no sensors or tiny analog camera • Good: - Safe for indoor navigation,

55

Matching & Pose-Estimation

- Matching of each 𝑝𝑖 to its 𝑞𝑖 is also unknown.

- Needs to compute a permutation 𝜋: 1,⋯ , 𝑛 → 1,⋯ , 𝑛where 𝑝𝑖 is assigned to 𝑞𝜋 𝑖

𝑅 ⋅ 𝑝4 +𝑡

𝑅 ⋅ 𝑝2 +𝑡

𝑅 ⋅ 𝑝3 + 𝑡

𝑅 ⋅ 𝑝1 + 𝑡

𝑞?𝑞?

𝑞?𝑞?

𝑄

Page 56: Dan Feldman - Collège de France...Dan Feldman dannyf@csail.mit.edu Our Robotics & Big Data lab • Toy-drones, no sensors or tiny analog camera • Good: - Safe for indoor navigation,

56

Matching & Pose-Estimation

Compute Permutation, Rotation & Translation of Pthat minimizes its sum of squared distances to 𝑄 :

min𝜋,𝑡 ,𝑅

𝑖=1

𝑛

dist2 𝑅 ⋅ 𝑝𝑖 + 𝑡, 𝑞𝜋(𝑖)

𝑅 ⋅ 𝑝4 +𝑡

𝑅 ⋅ 𝑝2 +𝑡

𝑅 ⋅ 𝑝3 + 𝑡

𝑅 ⋅ 𝑝1 + 𝑡

𝑞𝜋(1)𝑞𝜋(2)

𝑞𝜋(3)

𝑞𝜋(4)

𝑄

Page 57: Dan Feldman - Collège de France...Dan Feldman dannyf@csail.mit.edu Our Robotics & Big Data lab • Toy-drones, no sensors or tiny analog camera • Good: - Safe for indoor navigation,

• Optimal Translation is simply the mean

• Let 𝑈𝐷𝑉𝑇 be a Singular Value Decomposition (SVD)

of the matrix 𝑃𝑇𝑄. That is:

𝑈𝐷𝑉𝑇 = PTQ

• Theorem 1 (𝑲𝒂𝒃𝒔𝒄𝒉 𝒂𝒍𝒈𝒐𝒓𝒊𝒕𝒉𝒎 ).

• The matrix 𝑅∗ = 𝑉𝑈𝑇 is the optimal rotation

and can be computed in 𝑂 𝑛𝑑2 time.

Existing Solutions

Page 58: Dan Feldman - Collège de France...Dan Feldman dannyf@csail.mit.edu Our Robotics & Big Data lab • Toy-drones, no sensors or tiny analog camera • Good: - Safe for indoor navigation,

𝑝2

𝑝3𝑝4

𝑝1

𝑃

Ordered set 𝑃 of 𝑛 markers.Initial position of object.

Observed ordered set 𝑄 (now) of 𝑛markers

Core-set For Pose Estimation

Page 59: Dan Feldman - Collège de France...Dan Feldman dannyf@csail.mit.edu Our Robotics & Big Data lab • Toy-drones, no sensors or tiny analog camera • Good: - Safe for indoor navigation,

Core-set For Pose Estimation

A weight vector 𝑤1, ⋯ ,𝑤𝑛 ≥ 0 whose most entries are zeroes and for every 𝑅 and 𝑡:

𝑅 ⋅ 𝑝2 +𝑡

𝑅 ⋅ 𝑝3 + 𝑡

𝑞2

𝑞3𝑄

𝑖=1

𝑛

dist2 𝑅 ⋅ 𝑝𝑖 + 𝑡, 𝑞𝑖 =

𝑖=1

𝑛

widist2 𝑅 ⋅ 𝑝𝑖 + 𝑡, 𝑞𝑖

Page 60: Dan Feldman - Collège de France...Dan Feldman dannyf@csail.mit.edu Our Robotics & Big Data lab • Toy-drones, no sensors or tiny analog camera • Good: - Safe for indoor navigation,

The Pose-Estimation Problem

“Full version”

Matching. Assuming 𝑃 is an initial set of 𝑛 markers (points in 𝑅𝑑),

and 𝑄 is the observed set of markers, we need to match each point in

𝑃 to it’s corresponding point in 𝑄.

𝑂 𝑛! 𝑃𝑒𝑟𝑚𝑢𝑡𝑎𝑡𝑖𝑜𝑛𝑠

Page 61: Dan Feldman - Collège de France...Dan Feldman dannyf@csail.mit.edu Our Robotics & Big Data lab • Toy-drones, no sensors or tiny analog camera • Good: - Safe for indoor navigation,

Every set of 𝑛 points has a core-set of size 𝑂 𝑑2 that

can be computed in 𝑂 𝑛𝑑 time.

Main Theorem [S. Nasser, I. Jubran, F]

𝑅 ⋅ 𝑝2 +𝑡

𝑅 ⋅ 𝑝3 + 𝑡

𝑞2

𝑞3𝑄

𝑖=1

𝑛

dist2 𝑅 ⋅ 𝑝𝑖 + 𝑡, 𝑞𝑖 =

𝑖=1

𝑛

widist2 𝑅 ⋅ 𝑝𝑖 + 𝑡, 𝑞𝑖

Page 62: Dan Feldman - Collège de France...Dan Feldman dannyf@csail.mit.edu Our Robotics & Big Data lab • Toy-drones, no sensors or tiny analog camera • Good: - Safe for indoor navigation,

Off-line solution

Optimal rotation:

𝑅 = 𝑉𝑈𝑇 𝑤ℎ𝑒𝑟𝑒 𝑆𝑉𝐷

𝑖=1

𝑛

𝑝𝑖𝑇𝑞𝑖 = 𝑈𝐷𝑉𝑇

𝐴3𝑥3 =

𝑖=1

𝑛

𝑝𝑖𝑇𝑞𝑖 =

𝑖=1

𝑛

𝐴𝑖3𝑥3 =

𝑖=1

𝑛

𝑎𝑖1𝑥9

62

Page 63: Dan Feldman - Collège de France...Dan Feldman dannyf@csail.mit.edu Our Robotics & Big Data lab • Toy-drones, no sensors or tiny analog camera • Good: - Safe for indoor navigation,

Solving the Problem cont.

𝐴3𝑥3 = 𝐴1𝑥9 = σ𝑖=1𝑛 𝑎𝑖1𝑥9 = σ𝑖=1

𝑘 𝜔𝑖𝑎𝑖1𝑥9

Coreset𝑎𝑖3𝑥3 = 𝑎𝑖1𝑥9

63

Page 64: Dan Feldman - Collège de France...Dan Feldman dannyf@csail.mit.edu Our Robotics & Big Data lab • Toy-drones, no sensors or tiny analog camera • Good: - Safe for indoor navigation,

64

Matrix Approximation by rows subset

For every matrix A there is a diagonal matrix W

of only 𝒅𝟐 non-zeros entries such that for every

𝒙 ∈ 𝑅𝒅

𝑨𝒙 = | 𝑾𝑨𝒙 |

Proof: 𝐴𝑥2= 𝑥𝑇 𝐴𝑇𝐴 𝑥 = 𝑥𝑇 σ𝑖 𝑎𝑖𝑎𝑖

𝑇 𝑥

= 𝑥𝑇 σ𝑖𝑤𝑖𝑎𝑖𝑎𝑖𝑇 𝑥

=𝑥𝑇 𝐴𝑇𝑊𝑇𝑊𝐴 𝑥 = 𝑊𝐴𝑥2

Page 65: Dan Feldman - Collège de France...Dan Feldman dannyf@csail.mit.edu Our Robotics & Big Data lab • Toy-drones, no sensors or tiny analog camera • Good: - Safe for indoor navigation,

Intuition (𝑑 = 2)

65

Page 66: Dan Feldman - Collège de France...Dan Feldman dannyf@csail.mit.edu Our Robotics & Big Data lab • Toy-drones, no sensors or tiny analog camera • Good: - Safe for indoor navigation,

Caratheodory’s

Theorem

If a point 𝑥 lies in the convex hull of a

set, there is a subset consisting of at

most 𝑑 + 1 points such that 𝑥 lies in the

convex hull of 𝑃′.

𝑑 = 2

Page 67: Dan Feldman - Collège de France...Dan Feldman dannyf@csail.mit.edu Our Robotics & Big Data lab • Toy-drones, no sensors or tiny analog camera • Good: - Safe for indoor navigation,

Caratheodory’s

Theorem

(Illustration)

𝑑 = 2

Y

X

𝜔6 = 1/𝑛

𝜔7 = 1/𝑛𝜔1 = 1/𝑛

𝜔2 = 1/𝑛

𝜔3 = 1/𝑛

𝜔4 = 1/𝑛

𝜔5 = 1/𝑛

(320,249)

Page 68: Dan Feldman - Collège de France...Dan Feldman dannyf@csail.mit.edu Our Robotics & Big Data lab • Toy-drones, no sensors or tiny analog camera • Good: - Safe for indoor navigation,

Caratheodory’s

Theorem

(Illustration)

𝑑 = 2

Y

X

𝜔6 = 1.8657

𝜔5 = 0

𝜔1 = 2.1218

𝜔2 = 0

𝜔3 = 0

𝜔4 = 1.2824

𝝎𝟕 = 𝟎

(320,249)

Page 69: Dan Feldman - Collège de France...Dan Feldman dannyf@csail.mit.edu Our Robotics & Big Data lab • Toy-drones, no sensors or tiny analog camera • Good: - Safe for indoor navigation,
Page 70: Dan Feldman - Collège de France...Dan Feldman dannyf@csail.mit.edu Our Robotics & Big Data lab • Toy-drones, no sensors or tiny analog camera • Good: - Safe for indoor navigation,
Page 71: Dan Feldman - Collège de France...Dan Feldman dannyf@csail.mit.edu Our Robotics & Big Data lab • Toy-drones, no sensors or tiny analog camera • Good: - Safe for indoor navigation,

Example

71

Page 72: Dan Feldman - Collège de France...Dan Feldman dannyf@csail.mit.edu Our Robotics & Big Data lab • Toy-drones, no sensors or tiny analog camera • Good: - Safe for indoor navigation,

1) Initialize

72

Page 73: Dan Feldman - Collège de France...Dan Feldman dannyf@csail.mit.edu Our Robotics & Big Data lab • Toy-drones, no sensors or tiny analog camera • Good: - Safe for indoor navigation,

1) Initialize

2) Farthest Point

73

Page 74: Dan Feldman - Collège de France...Dan Feldman dannyf@csail.mit.edu Our Robotics & Big Data lab • Toy-drones, no sensors or tiny analog camera • Good: - Safe for indoor navigation,

1) Initialize

2) Farthest Point

3) New Center

74

Page 75: Dan Feldman - Collège de France...Dan Feldman dannyf@csail.mit.edu Our Robotics & Big Data lab • Toy-drones, no sensors or tiny analog camera • Good: - Safe for indoor navigation,

1) Initialize

2) Farthest Point

3) New Center

4) Repeat

75

Page 76: Dan Feldman - Collège de France...Dan Feldman dannyf@csail.mit.edu Our Robotics & Big Data lab • Toy-drones, no sensors or tiny analog camera • Good: - Safe for indoor navigation,

1) Initialize

2) Farthest Point

3) New Center

4) Repeat

76

Page 77: Dan Feldman - Collège de France...Dan Feldman dannyf@csail.mit.edu Our Robotics & Big Data lab • Toy-drones, no sensors or tiny analog camera • Good: - Safe for indoor navigation,

1) Initialize

2) Farthest Point

3) New Center

4) Repeat

In the 𝑖th iteration

The error is < 1/𝑖

77

Page 78: Dan Feldman - Collège de France...Dan Feldman dannyf@csail.mit.edu Our Robotics & Big Data lab • Toy-drones, no sensors or tiny analog camera • Good: - Safe for indoor navigation,

Open Problems

• More Coresets

- Deep learning, Topological Data, Sparse data

- 3D Navigation and Mapping, Robotics

• Sensor Fusion (GPS+Video+Audio+Text+..)

• Private Coresets, [STOC’11, with Fiat et al.]

• For biometric face database (with R. Osadchy)

• Coresets for Cybersecurity (with S. Goldwasser)

• Generic software library

- Coresets on Demand on the cloud78

Page 79: Dan Feldman - Collège de France...Dan Feldman dannyf@csail.mit.edu Our Robotics & Big Data lab • Toy-drones, no sensors or tiny analog camera • Good: - Safe for indoor navigation,

[email protected] Feldman

• More Coresets

- Deep learning, Decision trees, Sparse data

- 3D Navigation and Mapping, Robotics

• Sensor Fusion (GPS+Video+Audio+Text+..)

• Private Coresets, [STOC’11, with Fiat et al.]

• Coresets for Cybersecurity (with A. Akavia, S. Goldwasser)

• Generic software library

- Coresets on Demand on the cloud

79

Thank you !

Page 80: Dan Feldman - Collège de France...Dan Feldman dannyf@csail.mit.edu Our Robotics & Big Data lab • Toy-drones, no sensors or tiny analog camera • Good: - Safe for indoor navigation,

Theorem [Feldman, Langberg, STOC’11]

A sample 𝐶 ⊆ 𝑃 from the distribution

sensitivity p = max𝑞∈𝑄

𝑑𝑖𝑠𝑡(𝑝, 𝑞)

σ𝑝′𝑑𝑖𝑠𝑡(𝑝′, 𝑞)

is a coreset if 𝐶 ≥dimension of 𝑄

𝜖2⋅ σ𝑝 sensitibity(𝑝)

Suppose that

cost 𝑃, 𝑞 ≔

𝑝∈𝑃

𝑤 𝑝 dist 𝑝, 𝑞

where dist: 𝑃 × 𝑄 → 0,∞ .

Page 81: Dan Feldman - Collège de France...Dan Feldman dannyf@csail.mit.edu Our Robotics & Big Data lab • Toy-drones, no sensors or tiny analog camera • Good: - Safe for indoor navigation,

Theorem [Feldman, Langberg, STOC’11]

A sample 𝐶 ⊆ 𝑃 from the distribution

sensitivity p = max𝑞∈𝑄

𝑑𝑖𝑠𝑡(𝑝, 𝑞)

σ𝑝′𝑑𝑖𝑠𝑡(𝑝′, 𝑞)

is a coreset if 𝐶 ≥dimension of 𝑄

𝜖2⋅ σ𝑝 sensitibity(𝑝)

Suppose that

cost 𝑃, 𝑞 ≔

𝑝∈𝑃

𝑤 𝑝 dist 𝑝, 𝑞

where dist: 𝑃 × 𝑄 → 0,∞ .

Page 82: Dan Feldman - Collège de France...Dan Feldman dannyf@csail.mit.edu Our Robotics & Big Data lab • Toy-drones, no sensors or tiny analog camera • Good: - Safe for indoor navigation,

[email protected] Feldman82

Surprising Applications

1.(1-epsilon) approximations:Heuristics work better on coresets

2.Running constant factor on epsilon-coresets helps

3.Coreset for one problem is good for a lot of unrelated problems

4.Coreset for O(1) points

Page 83: Dan Feldman - Collège de France...Dan Feldman dannyf@csail.mit.edu Our Robotics & Big Data lab • Toy-drones, no sensors or tiny analog camera • Good: - Safe for indoor navigation,

83

Implementation

• The worst case and sloppy (constant) analysis is not so relevant

• In Thoery:a random sample of size 1/𝜖 yields (1 + 𝜖)approximation with probability at least 1 − 𝛿.In Practice: Sample s points, output the approximation 𝜖 and its distribution

• Never implement the algorithm as explained in the paper.

Page 84: Dan Feldman - Collège de France...Dan Feldman dannyf@csail.mit.edu Our Robotics & Big Data lab • Toy-drones, no sensors or tiny analog camera • Good: - Safe for indoor navigation,

84

Page 85: Dan Feldman - Collège de France...Dan Feldman dannyf@csail.mit.edu Our Robotics & Big Data lab • Toy-drones, no sensors or tiny analog camera • Good: - Safe for indoor navigation,

Coreset for k-means[Feldman, Sohler, Monemizadeh, SoCG’07]

Coreset for 𝑘-means can be computed by choosing points from the distribution:

sensitivity(𝑝) = 𝑑𝑖𝑠𝑡(𝑝,𝑞∗)

σ𝑝′ 𝑑𝑖𝑠𝑡(𝑝′,𝑞∗)+

1

𝑛𝑝

𝑛𝑝 = number of points in the cluster of p

𝑞∗ = k-means of P

|C|=𝑘⋅𝑑

𝜖2

Page 86: Dan Feldman - Collège de France...Dan Feldman dannyf@csail.mit.edu Our Robotics & Big Data lab • Toy-drones, no sensors or tiny analog camera • Good: - Safe for indoor navigation,

Coreset for k-means[Feldman, Sohler, Monemizadeh, SoCG’07]

Coreset for 𝑘-means can be computed by choosing points from the distribution:

sensitivity(𝑝) = 𝑑𝑖𝑠𝑡(𝑝,𝑞∗)

σ𝑝′ 𝑑𝑖𝑠𝑡(𝑝′,𝑞∗)+

1

𝑛𝑝

𝑛𝑝 = number of points in the cluster of p

𝑞∗ = k-means of P

|C|=𝑘⋅𝑑

𝜖2

Or approximation [SoCg07, Feldma, Sharir, Fiat]

Page 87: Dan Feldman - Collège de France...Dan Feldman dannyf@csail.mit.edu Our Robotics & Big Data lab • Toy-drones, no sensors or tiny analog camera • Good: - Safe for indoor navigation,

Coreset for k-means[Feldman, Sohler, Monemizadeh, SoCG’07]

Coreset for 𝑘-means can be computed by choosing points from the distribution:

sensitivity(𝑝) = 𝑑𝑖𝑠𝑡(𝑝,𝑞∗)

σ𝑝′ 𝑑𝑖𝑠𝑡(𝑝′,𝑞∗)+

1

𝑛𝑝

𝑛𝑝 = number of points in the cluster of p

𝑞∗ = k-means of P

|C|=𝑘⋅𝑑

𝜖2

𝑘⋅𝑘

𝜀

𝜖2

Or approximation [SoCg07, Feldma, Sharir, Fiat]

[SODA’13, Feldman, Schmidt, ..]

Page 88: Dan Feldman - Collège de France...Dan Feldman dannyf@csail.mit.edu Our Robotics & Big Data lab • Toy-drones, no sensors or tiny analog camera • Good: - Safe for indoor navigation,

Coreset for k-means[Feldman, Sohler, Monemizadeh, SoCG’07]

Coreset for 𝑘-means can be computed by choosing points from the distribution:

sensitivity(𝑝) = 𝑑𝑖𝑠𝑡(𝑝,𝑞∗)

σ𝑝′ 𝑑𝑖𝑠𝑡(𝑝′,𝑞∗)+

1

𝑛𝑝

𝑛𝑝 = number of points in the cluster of p

𝑞∗ = k-means of P

|C|=𝑘⋅𝑑

𝜖2

Page 89: Dan Feldman - Collège de France...Dan Feldman dannyf@csail.mit.edu Our Robotics & Big Data lab • Toy-drones, no sensors or tiny analog camera • Good: - Safe for indoor navigation,

Coreset for k-means[Feldman, Sohler, Monemizadeh, SoCG’07]

Coreset for 𝑘-means can be computed by choosing points from the distribution:

sensitivity(𝑝) = 𝑑𝑖𝑠𝑡(𝑝,𝑞∗)

σ𝑝′ 𝑑𝑖𝑠𝑡(𝑝′,𝑞∗)+

1

𝑛𝑝

𝑛𝑝 = number of points in the cluster of p

𝑞∗ = k-means of P

|C|=𝑘⋅𝑑

𝜖2

Or approximation [SoCg07, Feldma, Sharir, Fiat]

Page 90: Dan Feldman - Collège de France...Dan Feldman dannyf@csail.mit.edu Our Robotics & Big Data lab • Toy-drones, no sensors or tiny analog camera • Good: - Safe for indoor navigation,

Coreset for k-means[Feldman, Sohler, Monemizadeh, SoCG’07]

Coreset for 𝑘-means can be computed by choosing points from the distribution:

sensitivity(𝑝) = 𝑑𝑖𝑠𝑡(𝑝,𝑞∗)

σ𝑝′ 𝑑𝑖𝑠𝑡(𝑝′,𝑞∗)+

1

𝑛𝑝

𝑛𝑝 = number of points in the cluster of p

𝑞∗ = k-means of P

|C|=𝑘⋅𝑑

𝜖2

Or approximation [SoCg07, Feldma, Sharir, Fiat]

Page 91: Dan Feldman - Collège de France...Dan Feldman dannyf@csail.mit.edu Our Robotics & Big Data lab • Toy-drones, no sensors or tiny analog camera • Good: - Safe for indoor navigation,

Coreset for k-means[Feldman, Sohler, Monemizadeh, SoCG’07]

Coreset for 𝑘-means can be computed by choosing points from the distribution:

sensitivity(𝑝) = 𝑑𝑖𝑠𝑡(𝑝,𝑞∗)

σ𝑝′ 𝑑𝑖𝑠𝑡(𝑝′,𝑞∗)+

1

𝑛𝑝

𝑛𝑝 = number of points in the cluster of p

𝑞∗ = k-means of P

|C|=𝑘⋅𝑑

𝜖2

𝑘⋅𝑘

𝜀

𝜖2

Or approximation [SoCg07, Feldma, Sharir, Fiat]

[SODA’13, Feldman, Schmidt, ..]

Page 92: Dan Feldman - Collège de France...Dan Feldman dannyf@csail.mit.edu Our Robotics & Big Data lab • Toy-drones, no sensors or tiny analog camera • Good: - Safe for indoor navigation,

92

The chicken-and-egg problem

1. We need approximation to compute thecoreset

2. We compute coreset to get a fast approximation to a problem

Lee-ways: I. Bi-criteria approximationII. HeuristicsIII. polynomial time reduced to linear time by the merge-reduce tree