
Property of Natali Ruchansky

Matrix Completion with Queries

Natali Ruchansky, Mark Crovella, Evimaria Terzi


Can you guess the picture?


What about now?

And now?

Salvador Domingo Felipe Jacinto Dalí i Domènech

How did you do it?

[Figure: Input Image, Available Information, Our Estimate]

• For most there is too little information to recognize shapes or patterns. Estimate: "I'm not sure. Arbitrary guess."
• Recognize human features (ear, eyebrow shape, and facial contour). Estimate: "I know, a human face. (not Van Gogh)"
• "I know this mustache! My friend Salvador Dali!"

How Much and Which Information?

So the question is, if we start at this image:

How much and which information do I need to add so that my particular algorithm can infer the image?

abracadabra

If we can answer "How much and which information do I need to add so that my particular algorithm can infer the image?", then we can:

1. Choose which information to add, tailored to the particular reconstruction algorithm.
2. Reconstruct based on this information.

The example of reconstructing Dali is an instance of the problem of Matrix Completion:

Given a partially-observed matrix M, fill in the missing entries.

In particular, the version applied to real-world data is Low Rank Matrix Completion:

Given a partially-observed matrix M of low rank r, fill in the missing entries.
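To make the problem statement concrete, here is a minimal sketch of an input to low-rank matrix completion. The helper names (`low_rank_matrix`, `hide_entries`) and the sizes are assumptions chosen for illustration and do not come from the talk; unobserved entries are marked with `None`.

```python
import random

def low_rank_matrix(n, m, r, seed=0):
    """Build an n x m matrix of rank at most r as a product of two factors."""
    rng = random.Random(seed)
    X = [[rng.uniform(-1, 1) for _ in range(r)] for _ in range(n)]
    Y = [[rng.uniform(-1, 1) for _ in range(m)] for _ in range(r)]
    return [[sum(X[i][k] * Y[k][j] for k in range(r)) for j in range(m)]
            for i in range(n)]

def hide_entries(M, keep_fraction, seed=0):
    """Return a copy of M in which unobserved entries are replaced by None."""
    rng = random.Random(seed)
    return [[v if rng.random() < keep_fraction else None for v in row]
            for row in M]

# Hypothetical sizes: a 6 x 5 matrix of rank at most 2, roughly half observed.
M = low_rank_matrix(n=6, m=5, r=2)
observed = hide_entries(M, keep_fraction=0.5)
num_known = sum(v is not None for row in observed for v in row)
```

The completion task is then to recover the `None` positions of `observed` using only the visible values and the assumption that the full matrix has rank at most r.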

Completion of what?

• Yelp users rate restaurants. But a given user has not visited all restaurants, so the matrix (users × restaurants) is partially observed.
• Traffic counters measure traffic on roads. But counters do not exist on all roads, so the matrix (source × destination) is partially observed.
• Biologists measure interaction of proteins. But they cannot exhaustively run all experiments, so the matrix (protein × protein) is partially observed.

https://www.telegeography.com/telecom-maps/global-traffic-map.1.html

And many more instances of partially observed data…

Statistical Matrix Completion

Traditional approaches assume:

1. A random distribution of observations
2. At least n r log(n) observations

With (at least) these assumptions, statistical matrix completion methods pose the problem as an optimization and find the best solution to match the visible information.

[Figure: input meets assumptions → reconstruction]

The challenge with these assumptions is that in real data:

1. The distribution is often not random
2. Very few entries are actually known.

[Figure: required n r log(n): 2.5e8; known ratings: 9e7; ≈160,000,000 fewer entries]

[Figure: real observed data → best guess; match on Ω, not elsewhere]
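The second assumption can be checked with simple arithmetic. The sketch below compares a count of known entries against n r log(n); the function names and the sizes are hypothetical, the natural logarithm is an assumption, and constant factors are ignored, so this is only an order-of-magnitude check.

```python
import math

def required_observations(n, r):
    """Order-of-magnitude count n * r * log(n) (natural log, constants ignored)."""
    return n * r * math.log(n)

def enough_observations(num_known, n, r):
    """True if the number of known entries reaches n * r * log(n)."""
    return num_known >= required_observations(n, r)

# Hypothetical sizes chosen only for illustration.
n, r = 10_000, 10
needed = required_observations(n, r)
print(f"needed ≈ {needed:,.0f}")
print(enough_observations(500_000, n, r))    # fewer known entries than needed
print(enough_observations(1_000_000, n, r))  # more known entries than needed
```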

Our Question

How can we design one querying and matrix completion algorithm that minimizes (1) the reconstruction error and (2) the number of queries?

We call this the Active Completion problem.

In practice, objective (1) is minimized while the number of queries is fixed to a budget b.

With great power…

Many data owners are in the powerful position to add observations:

• Yelp can ask some users to rate some restaurants
• Cities can install traffic counters
• Biologists can experiment with a particular protein pair

How can we make the best use of the limited budget of queries?

The Answer

We construct an algorithm called Order&Extend that is the first to integrate a querying strategy into its matrix completion algorithm.

It is able to select a small number of queries needed to find an accurate completion.

Our Approach

The key to our approach is viewing matrix completion through a sequence of linear systems.

This allows us to identify:

1. Parts of the matrix that can be recovered given the observations
2. Other parts that cannot due to insufficient information
3. The additional entries needed to recover those areas.

Note this means our algorithm will not do this [figure]. It will only estimate the parts it can.

MC as Linear Systems

Write the data $M = XY$ as a product of factors, where $M$ is $n \times m$, $X$ is $n \times r$, and $Y$ is $r \times m$.

For rank 2, two entries $M_{ij}$ and $M_{i'j}$ in the same column $j$ give:

$M_{ij} = x_{i1} y_{1j} + x_{i2} y_{2j}$
$M_{i'j} = x_{i'1} y_{1j} + x_{i'2} y_{2j}$

Here $M_{ij}$, $M_{i'j}$, $x_i$, and $x_{i'}$ are known and $y_j$ is unknown: two equations in two variables, so we can solve for $y_j$.

Iteratively solve systems of this form to fill in $X$ and $Y$, then multiply to get the estimate $\tilde{M} = XY$.
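A single rank-2 step of this kind can be sketched as follows. This is an illustrative solver using Cramer's rule with exact fractions, not the implementation from the paper; the function name `solve_for_y` and the example values are assumptions.

```python
from fractions import Fraction as F

def solve_for_y(x_i, x_ip, m_ij, m_ipj):
    """Solve  x_i . y = m_ij  and  x_ip . y = m_ipj  for y = (y_1j, y_2j)."""
    det = x_i[0] * x_ip[1] - x_i[1] * x_ip[0]
    if det == 0:
        raise ValueError("rows x_i and x_i' are linearly dependent")
    y1 = (m_ij * x_ip[1] - x_i[1] * m_ipj) / det
    y2 = (x_i[0] * m_ipj - x_ip[0] * m_ij) / det
    return y1, y2

# Hypothetical known rows of X and the two observed entries of column j.
x_i, x_ip = (F(1), F(2)), (F(3), F(4))
m_ij, m_ipj = F(17), F(39)      # generated from y_j = (5, 6)
y_j = solve_for_y(x_i, x_ip, m_ij, m_ipj)
```

Once $y_j$ is known, it can in turn serve as a known quantity in systems that solve for further rows of $X$, which is what makes the iteration possible.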

How do we know when and what we need to query?

Incomplete Systems

$M_{ij} = x_{i1} y_{1j} + x_{i2} y_{2j}$
$M_{i'j} = x_{i'1} y_{1j} + x_{i'2} y_{2j}$

When both $M_{ij}$ and $M_{i'j}$ are known, these are two equations in two variables.

If $M_{i'j}$ was not observed in the input data, they become two equations in three variables.

Query: what is the value of $M_{i'j}$?

After the query we again have two equations in two unknowns, so we can solve for $y_j$.
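The check for an incomplete system can be sketched as a count of usable equations. The function below is a hypothetical helper, not the selection rule from the paper: it assumes a set `observed` of (row, column) index pairs and a list `solved_rows` of rows of X that are already known, and it simply picks the first missing entries as queries.

```python
def entries_to_query(observed, solved_rows, j, r):
    """Return rows i' whose entry M[i'][j] should be queried so that
    column j has r equations for its r unknowns y_1j .. y_rj."""
    usable = [i for i in solved_rows if (i, j) in observed]
    missing = r - len(usable)
    if missing <= 0:
        return []  # the system is already complete
    candidates = [i for i in solved_rows if (i, j) not in observed]
    return candidates[:missing]  # arbitrary choice among candidates

# Hypothetical rank-2 example: rows 0 and 1 of X are known.
observed = {(0, 0), (1, 0), (0, 1)}
solved_rows = [0, 1]
print(entries_to_query(observed, solved_rows, 0, 2))  # column 0 is complete
print(entries_to_query(observed, solved_rows, 1, 2))  # column 1 lacks M[1][1]
```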

Unstable systems

Consider two systems $X y = M$ with the same $X$ and slightly different right-hand sides:

$$\begin{pmatrix} 1 & 1/2 \\ 1/2 & 1/3 \end{pmatrix} y = \begin{pmatrix} 3/2 \\ 1 \end{pmatrix} \quad\Rightarrow\quad y = \begin{pmatrix} 0 \\ 3 \end{pmatrix}$$

$$\begin{pmatrix} 1 & 1/2 \\ 1/2 & 1/3 \end{pmatrix} y' = \begin{pmatrix} 3/2 \\ 5/6 \end{pmatrix} \quad\Rightarrow\quad y' = \begin{pmatrix} 1 \\ 1 \end{pmatrix}$$

A small change in $M$ (from 1 to 5/6 in one entry) produces a large change in the solution.
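The two systems on this slide can be verified with exact arithmetic. The sketch below uses Cramer's rule on the 2×2 matrix from the slide; the function name `solve_2x2` is an assumption made for illustration.

```python
from fractions import Fraction as F

def solve_2x2(X, M):
    """Solve the 2x2 system X y = M exactly with Cramer's rule."""
    (a, b), (c, d) = X
    det = a * d - b * c
    if det == 0:
        raise ValueError("X is singular")
    return [(M[0] * d - b * M[1]) / det, (a * M[1] - c * M[0]) / det]

X = [[F(1), F(1, 2)], [F(1, 2), F(1, 3)]]
y = solve_2x2(X, [F(3, 2), F(1)])           # right-hand side (3/2, 1)
y_prime = solve_2x2(X, [F(3, 2), F(5, 6)])  # right-hand side (3/2, 5/6)
print(y, y_prime)
```

The right-hand sides differ by only 1/6 in one entry, yet the solutions are (0, 3) and (1, 1), which is the sensitivity the slide illustrates.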

Unstable systems

In the paper…

1. How can we detect unstable systems?
2. How can we mitigate unstable systems?

Minimizing Queries

Encountering an incomplete or unstable system → the algorithm needs to query.

How can we also keep the number of queries asked to a minimum?

By manipulating the order in which we solve the systems. (Hence the ‘order’ in Order&Extend)

Takeaway

Observed data is typically:

- not random
- sparse

…But we can query! (minimally!)

Option 1: Independent

Decide what to query independently of how you complete.

Query Limit = 1

Option 2: Integrated

Decide what to query based on how you complete.

Who is guessing? A normal person or an artist?

Our algorithm Order&Extend is the first one composed of

1. a querying strategy, tailored to
2. a completion algorithm

This integrated nature enables Order&Extend to carefully select a small number of queries, so that the completion algorithm can recover the matrix with high accuracy.

It also allows it to output partial completions under strict limits on the number of allotted queries.

A Flavor

For full and accurate completion (of internet traffic data), Order&Extend asks 13k queries…

…while other algorithms do not achieve comparable error even with <40k queries.

Read the paper!

Deeper discussion of:

• Matrix completion as a sequence of linear systems
• Sequence of linear systems as graph propagation
• Predicting unstable systems
  • distinction from ill-conditioning
• Efficient computation of stability checks
• Finding a good solving-order
  • through the lens of graph propagation

Experiments:

• Comparison with Matrix Completion algorithms
  • extended with a querying ability
• Approximate low-rank
• Exact low-rank


Thank you.

from the book Dali’s Mustache

(and read the paper)