
Decision-theoretic Clustering of Strategies

Nolan Bard, Deon Nicholas, Csaba Szepesvári, and Michael Bowling

AAAI Spring Symposium 2015 on Applied Computational Game Theory

March 23rd, 2015

(appearing in AAMAS 2015)

University of Alberta, Computer Poker Research Group

Motivation

Given: Knowledge of agents/entities

Goal: Maximize utility by exploiting data

Problem: Limited response personalization

• Resource constrained

• Online learning cost

[Figure: diagram labeled "Portfolio P" and "Utility".]

Solution

• Cluster agents/entities into groups

• Tailor responses to the aggregate clusters

• But…


One of these things…

[Figure: diagram with corners labeled Rock, Paper, and Scissors, showing three entities E1, E2, and E3; a second build adds the labels P, S, and R.]

Objective

[Figure: utility matrix (entries 4, 7, 2, 9, 3, 5) with (E)ntities as rows and (R)esponses as columns.]

k-element partition of rows:

$$\arg\max_{P \in \mathrm{Part}_k(E)} \sum_{C \in P} \max_{r \in R} \sum_{e \in C} u(e, r)$$

where $P = \{C_1, \ldots, C_k\} \in \mathrm{Part}_k(E)$.
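As a minimal sketch of this objective (not the authors' code), the snippet below scores a given partition of rows on a small utility matrix. The 3×3 matrix `U` is the one that appears in the Nemhauser's Greedy example later in the talk; the function name `partition_utility` and the 0-based entity and response indices are assumptions made for illustration.

```python
# Utility matrix: rows are entities, columns are responses.
U = [
    [1, 2, 5],  # entity 0
    [5, 4, 1],  # entity 1
    [3, 4, 1],  # entity 2
]

def partition_utility(U, partition):
    """Value of a partition: sum over clusters C of max_r sum_{e in C} u(e, r)."""
    n_responses = len(U[0])
    total = 0
    for cluster in partition:
        # Each cluster gets the single response that maximizes its summed utility.
        total += max(sum(U[e][r] for e in cluster) for r in range(n_responses))
    return total

one_cluster = partition_utility(U, [[0, 1, 2]])   # everyone shares one response
two_clusters = partition_utility(U, [[0], [1, 2]])  # entity 0 gets its own response
```

Splitting off entity 0 raises the total because its preferred response differs from the one that suits entities 1 and 2.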


Segmentation Problems

Cluster based on actionability. [Kleinberg et al.]

Maximum Coverage

k-element subset of columns:

$$\arg\max_{R' \subseteq R,\ |R'| = k} \sum_{e \in E} \max_{r \in R'} u(e, r)$$
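To make the column-subset objective concrete, here is a small sketch that evaluates it and finds the best k-element subset by exhaustive search. The matrix `U` is the 3×3 example from the Nemhauser's Greedy slides; the names `coverage_utility` and `best_k_subset` are hypothetical, and the exhaustive search is only meant for tiny inputs, since the exact problem is NP-hard.

```python
from itertools import combinations

# Utility matrix: rows are entities, columns are responses.
U = [
    [1, 2, 5],
    [5, 4, 1],
    [3, 4, 1],
]

def coverage_utility(U, responses):
    """Each entity is served by its best response within the chosen subset."""
    return sum(max(row[r] for r in responses) for row in U)

def best_k_subset(U, k):
    """Try every k-element subset of columns; exponential in general."""
    columns = range(len(U[0]))
    return max(combinations(columns, k),
               key=lambda subset: coverage_utility(U, subset))

best = best_k_subset(U, 2)
best_value = coverage_utility(U, best)
```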

Segmentation Problems

$$\arg\max_{R' \subseteq R,\ |R'| = k} \sum_{e \in E} \max_{r \in R'} u(e, r)$$

$$\arg\max_{P \in \mathrm{Part}_k(E)} \sum_{C \in P} \max_{r \in R} \sum_{e \in C} u(e, r)$$

$$\arg\max_{l_{e_1}, \ldots, l_{e_m} :\ \left|\bigcup_{e \in E} l_e\right| = k} \sum_{e \in E} u(e, r_{l_e})$$

[Figure sequence: step-by-step animation over the utility matrix with (E)ntities as rows and (R)esponses as columns, ending with the cost below.]

O(|E||R|)

[Figure sequence: a second step-by-step animation over the same utility matrix, ending with the cost below.]

O(k|E|)
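One plausible reading of the two costs above, stated here as an assumption because the animation frames did not survive extraction: given a partition, the best response for every cluster can be found in O(|E||R|) by summing each cluster's rows, and given k responses, each entity can be assigned to its best one in O(k|E|). The sketch below illustrates both directions on the 3×3 matrix from the Nemhauser's Greedy slides; the function names are hypothetical.

```python
# Utility matrix: rows are entities, columns are responses.
U = [
    [1, 2, 5],
    [5, 4, 1],
    [3, 4, 1],
]

def responses_for_partition(U, partition):
    """One best response (column index) per cluster; O(|E||R|) in total."""
    n_responses = len(U[0])
    chosen = []
    for cluster in partition:
        totals = [sum(U[e][r] for e in cluster) for r in range(n_responses)]
        # Ties are broken in favor of the lowest column index.
        chosen.append(max(range(n_responses), key=totals.__getitem__))
    return chosen

def partition_for_responses(U, responses):
    """Group entities by their best response among the k given; O(k|E|)."""
    groups = {r: [] for r in responses}
    for e, row in enumerate(U):
        best = max(responses, key=lambda r: row[r])
        groups[best].append(e)
    # Drop responses that no entity prefers.
    return [group for group in groups.values() if group]

partition = partition_for_responses(U, [1, 2])
responses = responses_for_partition(U, partition)
```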

Maximum Coverage

$$\arg\max_{R' \subseteq R,\ |R'| = k} \sum_{e \in E} \max_{r \in R'} u(e, r)$$

Exact: NP-hard

Approximation [Nemhauser et al.]

• Greedy submodular
• (1 − 1/e)-approximation
• Complexity: O(k|E||R|)

Nemhauser’s Greedy

Utility matrix, with (E)ntities as rows and (R)esponses as columns:

      r1  r2  r3
e1     1   2   5
e2     5   4   1
e3     3   4   1

$$\arg\max_{R' \subseteq R,\ |R'| = k} \sum_{e \in E} \max_{r \in R'} u(e, r)$$

Start: R’ = {}, per-entity utilities 0, 0, 0 (total 0).

Marginal gain of each response:

• r1: 1, 5, 3 (gain 9)
• r2: 2, 4, 4 (gain 10)
• r3: 5, 1, 1 (gain 7)

Select r2: R’ = {r2}, per-entity utilities 2, 4, 4 (total 10).

Marginal gain of each remaining response:

• r1: utilities become 2, 5, 4 (gain 1)
• r3: utilities become 5, 4, 4 (gain 3)

Select r3: R’ = {r2, r3}, per-entity utilities 5, 4, 4 (total 13).
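The walkthrough above can be reproduced with a short sketch of the greedy selection loop. This is an illustrative implementation rather than the authors' code: it assumes non-negative utilities (so that starting every entity at 0 is valid), uses 0-based column indices (so r2 and r3 are columns 1 and 2), and requires k to be at most the number of responses.

```python
# Utility matrix from the slides: rows are entities, columns are responses.
U = [
    [1, 2, 5],
    [5, 4, 1],
    [3, 4, 1],
]

def greedy_max_coverage(U, k):
    """Repeatedly add the response with the largest marginal gain."""
    n_responses = len(U[0])
    chosen = []
    current = [0] * len(U)  # best utility each entity gets from `chosen`
    for _ in range(k):
        best_r, best_gain = None, None
        for r in range(n_responses):
            if r in chosen:
                continue
            # Marginal gain: total improvement if r were added to `chosen`.
            gain = sum(max(cur, row[r]) - cur for cur, row in zip(current, U))
            if best_gain is None or gain > best_gain:
                best_r, best_gain = r, gain
        chosen.append(best_r)
        current = [max(cur, row[best_r]) for cur, row in zip(current, U)]
    return chosen, sum(current)

chosen, value = greedy_max_coverage(U, 2)
```

Each of the k rounds scans every response against every entity, which matches the O(k|E||R|) cost quoted for the greedy approximation.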


Problem

Infinitely/exponentially large response space?

• Nemhauser et al.’s greedy is infeasible

Structured Utility

• May be able to exploit structure in utility

$$\arg\max_{P \in \mathrm{Part}_k(E)} \sum_{C \in P} \max_{r \in R} \sum_{e \in C} u(e, r)$$

Response Oracle

$$f(C) \equiv \arg\max_{r \in R} \sum_{e \in C} u(e, r)$$

Example: Sequence-form games

• # Responses: at least exponential in infosets

• Best response: linear in infosets
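The gap between the size of the response space and the cost of a best response can be illustrated with a deliberately simplified toy, which is not a real sequence-form game. It assumes a response picks one action at each of n independent decision points and that utility is additive across those points; the `payoff` table and all function names are hypothetical. Under those assumptions there are |A|^n responses, yet the oracle only needs one pass over the decision points.

```python
from itertools import product

# payoff[e][i][a]: assumed utility against entity e when the response
# plays action a at decision point i (additive across decision points).
payoff = [
    [[1, 0], [0, 2], [3, 1]],  # entity 0
    [[0, 2], [1, 1], [0, 4]],  # entity 1
]
N_POINTS = len(payoff[0])
N_ACTIONS = len(payoff[0][0])

def utility(e, response):
    """u(e, r) for a response given as one action index per decision point."""
    return sum(payoff[e][i][a] for i, a in enumerate(response))

def oracle(cluster):
    """f(C): choose the best action at each point independently (linear in points)."""
    return tuple(
        max(range(N_ACTIONS), key=lambda a: sum(payoff[e][i][a] for e in cluster))
        for i in range(N_POINTS)
    )

def brute_force(cluster):
    """Same answer by enumerating all N_ACTIONS ** N_POINTS responses."""
    return max(product(range(N_ACTIONS), repeat=N_POINTS),
               key=lambda resp: sum(utility(e, resp) for e in cluster))

best_for_both = oracle((0, 1))
best_for_first = oracle((0,))
```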

Greedy Heuristic

Greedy agglomerative (“bottom up”) clustering

Initialize: singletons

Iteration: merge $C_i, C_j \in P$ with min marginal loss

Clustering Loss:

$$\sum_{e \in E} \max_{r^* \in R} u(e, r^*) - \sum_{C \in P} \max_{r \in R} \sum_{e \in C} u(e, r)$$

[Figure: Rock, Paper, Scissors diagram with the three entities 1, 2, and 3.]

• $O(|E|^2)$ oracle calls (using memoization)
• Feasible given efficient oracle
• k not needed in advance
• Lazy evaluations and parallelizable
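A minimal sketch of the agglomerative loop, under stated assumptions: the response oracle is replaced by a direct maximum over the columns of an explicit matrix (the 3×3 example from the Nemhauser's Greedy slides), memoization is done with `functools.lru_cache`, and the loop stops at a target k for convenience even though the heuristic itself does not need k in advance. The names `greedy_agglomerative` and `cluster_value` are hypothetical.

```python
from functools import lru_cache
from itertools import combinations

# Utility matrix: rows are entities, columns are responses.
U = [
    [1, 2, 5],
    [5, 4, 1],
    [3, 4, 1],
]

def greedy_agglomerative(U, k):
    """Merge the pair of clusters with minimum marginal loss until k remain."""
    n_responses = len(U[0])

    @lru_cache(maxsize=None)
    def cluster_value(cluster):
        # Stand-in for the response oracle: utility of the best shared response.
        # Caching means each distinct cluster is evaluated only once.
        return max(sum(U[e][r] for e in cluster) for r in range(n_responses))

    def marginal_loss(pair):
        a, b = pair
        return cluster_value(a) + cluster_value(b) - cluster_value(a | b)

    clusters = {frozenset([e]) for e in range(len(U))}  # start from singletons
    while len(clusters) > k:
        # Sort for a deterministic order before enumerating candidate merges.
        ordered = sorted(clusters, key=sorted)
        a, b = min(combinations(ordered, 2), key=marginal_loss)
        clusters -= {a, b}
        clusters.add(a | b)
    return clusters, sum(cluster_value(c) for c in clusters)

clusters, total = greedy_agglomerative(U, 2)
```

Recording the sequence of merges instead of stopping at k would give the clustering for every k from a single run.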


Results


Worst-case Approximation Bounds

Lower:

$$u(G_k) \ge \max\left(\frac{1}{k}, \frac{k}{m}\right) u^*_k \ge \frac{1}{\sqrt{m}}\, u^*_k$$

Upper:

$$u(G_k) - \varepsilon \le \frac{2}{\sqrt{m}}\, u^*_k$$
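The second inequality in the lower bound follows from a short argument, sketched here as a worked step. It assumes $1 \le k \le m$ and $u^*_k \ge 0$, and reads $m$ as the number of entities $|E|$, which the slide does not state explicitly. The larger of two positive numbers is at least their geometric mean:

```latex
\max\left(\frac{1}{k}, \frac{k}{m}\right)
  \;\ge\; \sqrt{\frac{1}{k} \cdot \frac{k}{m}}
  \;=\; \frac{1}{\sqrt{m}}
```

Equality holds when $1/k = k/m$, that is $k = \sqrt{m}$, so the guarantee is weakest for $k$ near $\sqrt{m}$ and improves toward both $k = 1$ and $k = m$.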


Experimental Design

• Sampled 200 static strategies uniformly

• Compared to k-means (Lloyd’s algorithm)

• k-means++ seeding [Arthur and Vassilvitskii]

• Feature vectors: sequence-form

• 50 random restarts


Qualitative: Kuhn Poker


K-means

[Figure: plot with both axes running from 0.0 to 1.0; one axis is labeled ξ.]

Greedy

[Figure: plot with both axes running from 0.0 to 1.0; one axis is labeled ξ.]

Quantitative


Kuhn Poker

[Figure: mean loss vs. singleton responses (mbb/g) against k (number of clusters); x-axis ticks from 2 to 12, y-axis ticks from 0 to 60; series: Greedy, k-means, Optimal.]

Leduc Hold’em

[Figure: mean loss vs. singleton responses (mbb/g) against k (number of clusters); x-axis ticks from 10 to 60, y-axis ticks from 0 to 350; series: Greedy, k-means.]

Questions?

University of Alberta, Computer Poker Research Group

poker.cs.ualberta.ca

Contact: [email protected]