Decision-theoretic Clustering of Strategies

Nolan Bard, Deon Nicholas, Csaba Szepesvári, and Michael Bowling

AAAI Spring Symposium 2015 on Applied Computational Game Theory

March 23rd, 2015

(appearing in AAMAS 2015)

University of Alberta, Computer Poker Research Group

Motivation

Given: Knowledge of agents/entities

Goal: Maximize utility by exploiting data

Problem: Limited response personalization

• Resource constrained

• Online learning cost

Motivation

[Figure: a portfolio P of responses and the utility it achieves]

Solution

• Cluster agents/entities into groups

• Tailor responses to the aggregate clusters

• But…

One of these things…

Paper Scissors

Objective

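k-element partition of the rows of the utility matrix (rows: entities E; columns: (R)esponses R):

$$\operatorname*{argmax}_{P \in \mathrm{Part}_k(E)} \sum_{C \in P} \max_{r \in R} \sum_{e \in C} u(e, r)$$

where $P = \{C_1, \ldots, C_k\} \in \mathrm{Part}_k(E)$.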
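
As a rough illustration of how this objective scores a partition, here is a minimal Python sketch. It assumes the utilities fit in an explicit matrix, which the talk does not require. The matrix `U` (a Rock, Paper, Scissors payoff table) and the helper name `partition_value` are hypothetical and chosen only for this example.

```python
import numpy as np

# Hypothetical utility matrix: rows are entities (opponent strategies),
# columns are responses. U[e, r] is the payoff of response r against entity e.
U = np.array([
    [0, 1, -1],   # against Rock: Rock ties, Paper wins, Scissors loses
    [-1, 0, 1],   # against Paper
    [1, -1, 0],   # against Scissors
])

def partition_value(u, partition):
    """Sum over clusters of the total utility of the best single response."""
    return sum(u[list(cluster)].sum(axis=0).max() for cluster in partition)

print(partition_value(U, [{0}, {1}, {2}]))  # one tailored response per entity
print(partition_value(U, [{0, 1}, {2}]))    # Rock and Paper share one response
```

With this matrix the singleton partition scores 3 and the two-cluster partition scores 2, so forcing Rock and Paper to share a response costs one unit of utility.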

Segmentation Problems

Cluster based on actionability. [Kleinberg et al.]

Maximum Coverage

k-element subset of the columns ((R)esponses) of the utility matrix:

$$\operatorname*{argmax}_{R' \subseteq R,\ |R'| = k} \sum_{e \in E} \max_{r \in R'} u(e, r)$$

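Segmentation Problems

Maximum coverage (k-element subset of columns):

$$\operatorname*{argmax}_{R' \subseteq R,\ |R'| = k} \sum_{e \in E} \max_{r \in R'} u(e, r)$$

Segmentation (k-element partition of rows):

$$\operatorname*{argmax}_{P \in \mathrm{Part}_k(E)} \sum_{C \in P} \max_{r \in R} \sum_{e \in C} u(e, r)$$

Common form (one response label $l_e$ per entity, $k$ distinct labels in total):

$$\operatorname*{argmax}_{l_{e_1}, \ldots, l_{e_m} :\ \left|\bigcup_{e \in E} \{l_e\}\right| = k} \sum_{e \in E} u(e, r_{l_e})$$

[Figure: the utility matrix with (R)esponses as columns, annotated with costs $O(|E||R|)$ and $O(k|E|)$]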

Maximum Coverage

$$\operatorname*{argmax}_{R' \subseteq R,\ |R'| = k} \sum_{e \in E} \max_{r \in R'} u(e, r)$$

Exact: NP-hard

Approximation [Nemhauser et al.]

• Greedy submodular

• $(1 - 1/e)$-approximation

• Complexity: $O(k|E||R|)$

Nemhauser’s Greedy

$$\operatorname*{argmax}_{R' \subseteq R,\ |R'| = k} \sum_{e \in E} \max_{r \in R'} u(e, r)$$

[Animation over the utility matrix with (R)esponses as columns: starting from $R' = \{\}$, the MarginalGain of each candidate response is evaluated and $r_2$ is added, giving $R' = \{r_2\}$. The MarginalGain of the remaining responses is then evaluated and $r_3$ is added, giving $R' = \{r_2, r_3\}$.]
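
The greedy selection stepped through above might be sketched as follows. This is a minimal illustration and not the authors' implementation. It assumes an explicit, non-negative utility matrix (the usual setting for the $(1 - 1/e)$ guarantee). The function name `greedy_cover` and the matrix `U` are hypothetical.

```python
import numpy as np

def greedy_cover(u, k):
    """Greedily pick k columns to maximize sum_e max_{r in chosen} u[e, r]."""
    n_entities, _ = u.shape
    chosen = []
    best = np.full(n_entities, -np.inf)  # best utility so far for each entity
    for _ in range(k):
        # Portfolio value if each candidate column were added next. Picking the
        # largest value is the same as picking the largest marginal gain.
        values = np.maximum(best[:, None], u).sum(axis=0)
        for r in chosen:
            values[r] = -np.inf  # never pick the same response twice
        r = int(np.argmax(values))
        chosen.append(r)
        best = np.maximum(best, u[:, r])
    return chosen, float(best.sum())

# Hypothetical 4-entity, 3-response utility matrix.
U = np.array([
    [3, 0, 1],
    [0, 3, 1],
    [0, 2, 1],
    [1, 0, 4],
])
print(greedy_cover(U, 2))
```

Each of the k rounds scans the whole matrix, which matches the $O(k|E||R|)$ cost quoted for the greedy approximation.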

Problem

Infinitely/exponentially large response space?

• Nemhauser et al.’s greedy is infeasible

Structured Utility

• May be able to exploit structure in utility

$$\operatorname*{argmax}_{P \in \mathrm{Part}_k(E)} \sum_{C \in P} \max_{r \in R} \sum_{e \in C} u(e, r)$$

Response Oracle

$$f(C) \equiv \operatorname*{argmax}_{r \in R} \sum_{e \in C} u(e, r)$$

Example: Sequence-form games

• # Responses: at least exponential in infosets

• Best response: linear in infosets

Greedy Heuristic

Greedy agglomerative (“bottom up”) clustering

Initialize: singletons

Iteration: merge the pair $C_i, C_j \in P$ with min marginal loss

Clustering Loss

$$\sum_{e \in E} \max_{r^* \in R} u(e, r^*) - \sum_{C \in P} \max_{r \in R} \sum_{e \in C} u(e, r)$$

[Figure: Rock, Paper, Scissors example]

Greedy Heuristic

• $O(|E|^2)$ oracle calls (using memoization)

• Feasible given efficient oracle

• k not needed in advance

• Lazy evaluations and parallelizable
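
A minimal sketch of the bottom-up merging loop, under stated assumptions: the response oracle is replaced by a brute-force scan over an explicit utility matrix, whereas the talk assumes an efficient oracle such as a best-response computation. The names `oracle_value` and `greedy_agglomerative` and the matrix `U` are hypothetical.

```python
import itertools
import numpy as np

def oracle_value(u, cluster):
    """Total utility of the best single response to a cluster (brute force)."""
    return u[list(cluster)].sum(axis=0).max()

def greedy_agglomerative(u, k):
    """Start from singletons and repeatedly apply the merge with least loss."""
    clusters = [frozenset([e]) for e in range(u.shape[0])]
    cache = {}  # memoized oracle values, keyed by cluster

    def value(c):
        if c not in cache:
            cache[c] = oracle_value(u, c)
        return cache[c]

    while len(clusters) > k:
        # Marginal loss of merging a and b: utility given up by forcing the
        # two clusters to share one response.
        a, b = min(
            itertools.combinations(clusters, 2),
            key=lambda p: value(p[0]) + value(p[1]) - value(p[0] | p[1]),
        )
        clusters = [c for c in clusters if c not in (a, b)] + [a | b]
    return clusters

# Hypothetical utilities: entities 0 and 1 prefer response 0, 2 and 3 prefer 1.
U = np.array([[2, 0], [2, 0], [0, 2], [0, 2]])
print(greedy_agglomerative(U, 2))
```

Because every merge passes through all intermediate cluster counts, running the loop down to k = 1 and recording each step would yield a clustering for every k, which is one way to read the point that k is not needed in advance. The cache means each distinct cluster is sent to the oracle only once.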

Results

Worst-case Approximation Bounds

Lower:

$$u(G_k) \ge \max(\ldots)\, u^*_k \ge \frac{1}{\sqrt{m}}\, u^*_k$$

Upper:

$$u(G_k) - \varepsilon \le \frac{2}{\sqrt{m}}\, u^*_k$$

Experimental Design

• Sampled 200 static strategies uniformly

• Compared to k-means/Lloyd’s

• k-means++ seeding [Arthur and Vassilvitskii]

• Feature vectors: sequence-form

• 50 random restarts

Qualitative: Kuhn Poker

[Figure: two panels, K-means and Greedy, each with axes running from 0.0 to 1.0]

Quantitative

[Figure: Kuhn Poker. x-axis: k (number of clusters), 2 to 12. Series: Greedy, k-means, Optimal]

[Figure: Leduc Hold’em. x-axis: k (number of clusters), 10 to 60. Series: Greedy, k-means]

Questions?

University of Alberta, Computer Poker Research Group

poker.cs.ualberta.ca

Contact: nolan@cs.ualberta.ca
