introduction to auction design via machine learningeviterci/slides/adviaml.pdfintroduction to...
TRANSCRIPT
Introduction to Auction Design via Machine Learning
Ellen Vitercik
December 4, 2017 CMU 10-715
Advanced Introduction to Machine Learning
Ad auctions contribute a huge portion of large internet companiesβ revenue.
Ad revenue in 2016 Total revenue in 2016
Google $79 billion $89.46 billion
Facebook $27 billion $27.64 billion
Leonardo da Vinci painting sells for $450.3m, shattering auction highs [NY Times β17]
Diamond fetches $33.7m at Christieβs auction in Geneva [BBC β17]
The FCC spectrum auction is sending $10 billion to broadcasters [NiemanLab β17]
Very-large-scale generalized combinatorial multi-attribute auctions: Lessons from conducting $60 billion of sourcing [Sandholm β13]
How can we design revenue-maximizing auctions using
machine learning?
1. Introduction
2. Auction design background
3. Revenue-maximizing auctions
4. Learning-theoretic auction design
5. Tools
6. Case study: Second price auctions with reserve prices
7. Related work
Common misconception: Thereβs only one way to hold an auction.
There are infinitely many ways to design auctions.
Ascending price auction: Auctioneer calls out the opening bid. Bidders call out bids until there is no new bid within a period of time. Last bidder to bid wins the item and pays their bid.
Ascending price auction: Auctioneer calls out the opening bid. Bidders call out bids until there is no new bid within a period of time. Last bidder to bid wins the item and pays their bid. Online auctions can only take a fraction of a second, so we need faster auction formats.
Second price auction for a single good :
Each bidder submits a bid. Highest bidder wins. They
pay the second highest bid.
2
1 pays 1
The second price auction is incentive compatible (every bidder will maximize their utility by bidding truthfully). Why not bid above value ? If winner, will stay winner and price wonβt change. If loser, might become winner, but will have to pay a price larger than value , which will result in negative utility.
value β payment β’ 1 wins item
The second price auction is incentive compatible (every bidder will maximize their utility by bidding truthfully). Why not bid below value ? If winner, might become loser, and will shift from non-negative to zero utility. If loser, will still be loser, so utility will still be zero.
value β payment β’ 1 wins item
Second price auction with a reserve price π β β for a
single good :
Each bidder submits a bid. Highest bidder wins if their
bid is above π. They pay maximum of π and the
second highest bid.
Optimizing reserve prices can lead to higher revenue. But how should we choose the reserve price?
Second price auction with a reserve price of 1.5 for a
single good :
Each bidder submits a bid. Highest bidder wins if their bid is above 1.5. They pay maximum of 1.5 and the
second highest bid.
pays 1.5
2
1
1. Introduction
2. Auction design background
3. Revenue-maximizing auctions
4. Learning-theoretic auction design
5. Tools
6. Case study: Second price auctions with reserve prices
7. Related work
In 1981, Roger Myerson discovered the revenue-maximizing single-item, multi-bidder auction. He won the 2007 Nobel prize in economics for his work on mechanism design. R. Myerson. Optimal auction design. Mathematics of Operations Research, 6(1):58β73, 1981.
Standard assumption A buyer's value for a bundle of goods is defined by a probability distribution over all the possible valuations she might have for that bundle.
value ~ Uniform(1, 3)
For simplicity, suppose all bidders have the same valuation distribution, with PDF π, CDF πΉ, and support in [0, 1].
Standard assumption A buyer's value for a bundle of goods is defined by a probability distribution over all the possible valuations she might have for that bundle.
Myersonβs optimal auction
Let π π₯ = π₯ β1βπΉ(π₯)
π(π₯). Run a second price auction with a reserve
price of πβ1 0 .
Myerson also analyzed the case where the biddersβ values are not identically distributed.
The revenue-maximizing multi-item auction is unknown!
This requires that we know the distribution over biddersβ values. Where does this knowledge come from? In multi-item cases, the support of the distribution might be doubly exponential!
Myersonβs optimal auction
Let π π₯ = π₯ β1βπΉ(π₯)
π(π₯). Run a second price auction with a reserve
price of πβ1 0 .
1. Introduction
2. Auction design background
3. Revenue-maximizing auctions
4. Learning-theoretic auction design
5. Tools
6. Case study: Second price auctions with reserve prices
7. Related work
Thereβs an unknown distribution over valuations. Use a sample to
learn a mechanism with high expected revenue.
value ~
Research papers:
[Likhodedov and Sandholm β04, β05, Elkind β07, Dhangwatnotai et al. β10, Medina and
Mohri β14, Cole and Roughgarden β14, Huang et al. β15, Morgenstern and Roughgarden β15, β16, Roughgarden and Schrijvers β16, Devanur
et al. β16, Balcan, Sandholm, and V β16, β17, Hartline and Taggart β17, Syrgkanis β17, Alon
et al. β17, Cai and Daskalakis β17, Gonczarowski and Nisan β17, Medina and
Vassilvitskii β17, DΓΌtting et al. β17]
value ~
Nina Balcan Connections between Learning, Game Theory, and Optimization
Georgia Tech
Fall 2010
Constantinos Daskalakis and Vasilis Syrgkanis
Algorithmic Game Theory and Data Science
MIT Spring 2017
Classes:
value ~
Workshops and Tutorials:
value ~
2016 EC Tutorial on Algorithmic Game Theory and Data Science
2015, 2016, 2017
EC Workshop on Algorithmic Game Theory and Data Science
2017 NIPS Workshop on Learning in Presence of Strategic Behavior
2017 Dagstuhl Workshop on Game Theory Meets Computational Learning Theory
Central question: Is the resulting auction nearly optimal?
Fix an auction class β³
Look at the samples
Optimize over β³
Field auction
Central question: Does the resulting auction approximately maximize revenue in expectation over the unknown distribution?
Fix an auction class β³
Look at the samples
Optimize over β³
Field auction
Depends on the class β³. Let πβ be the optimal auction and let π β β³ be the auction returned by the learning algorithm.
πΌ Rev πβ β πΌ Rev π
= Rev πβ β maxπββ³
Rev π + maxπββ³
Rev(π) β Rev π
Central question: Does the resulting auction approximately maximize revenue in expectation over the unknown distribution?
Total revenue loss
Approximation loss
Depends on the class β³. Let πβ be the optimal auction and let π β β³ be the auction returned by the learning algorithm.
πΌ Rev πβ β πΌ Rev π
= πΌ Rev πβ β maxπββ³
πΌ Rev π + maxπββ³
πΌ Rev π β πΌ Rev π
Central question: Does the resulting auction approximately maximize revenue in expectation over the unknown distribution?
Estimation loss
Depends on the class β³.
Central question: Does the resulting auction approximately maximize revenue in expectation over the unknown distribution?
Simple class
Bad approximation loss Good estimation loss
Good approximation loss Bad estimation loss
Complex class
These are the same problems we face in machine learning.
Simple class
Bad approximation loss Good estimation loss
Good approximation loss Bad estimation loss
Complex class
Say β³ is the class of second price auctions with reserve prices. β³ has zero approximation loss when the biddersβ values are i.i.d.! This talk: What is the estimation loss of β³?
Myersonβs optimal auction for i.i.d. bidders
Let π π₯ = π₯ β1βπΉ(π₯)
π(π₯). Run a second price auction with a reserve
price of πβ1 0 .
1. Introduction
2. Auction design background
3. Revenue-maximizing auctions
4. Learning-theoretic auction design
5. Tools
6. Case study: Second price auctions with reserve prices
7. Related work
VC dimension: A complexity measure for binary-valued functions β:π³ β 0,1
Example.
For π, π β β, π < π, let βπ,π π₯ = 1, π₯ β (π, π)&0, π₯ β (π, π)
Let βint = βπ,π: π, π β β, π < π .
No set π β β of size 3 can be labeled in all 23 ways by βint.
1 1 0
1 0
1 0
0
1
0
1
The VC dimension of a class β is the size of the largest set π that can be labeled in all 2|π| ways by functions in β.
VC dimension
A set π = π₯1, β¦ , π₯π is shattered by β if for all π β {0,1}|π|, there exists a function β β β such that for all π β [π], β π₯π = ππ. The VC dimension of β is the size of the largest set π that can be shattered by β.
1 0 0 0 1 0 1 1
The VC dimension of a class β is the size of the largest set π that can be labeled in all 2|π| ways by functions in β.
1 0 0 0 1 0 1 1
The VC dimension of βint is at least 2.
1 0
1 0
0
1
0
1
The VC dimension of βint is at most 2. 1 1 0
The VC dimension of a class β is the size of the largest set π that can be labeled in all 2|π| ways by functions in β.
1 0 0 0 1 0 1 1
1 0
1 0
0
1
0
1
1 1 0
The VC dimension of βint is 2.
Pseudo-dimension: A complexity measure for real-valued functions β:π³ β β
Set π = π₯1, β¦ , π₯π Class of functions β = β:π³ β β
π = π§(1), β¦ , π§(π) β βπ witnesses the shattering of π&by β if for all π β {0,1}π, there exists β β β such that β π₯π β€ π§(π) if ππ = 0 and β π₯π > π§(π) if ππ = 1
ππ π ππ β€ π(π) π
ππ π ππ β€ π(π) π
ππ π ππ > π(π) π
ππ π ππ β€ π(π) π
ππ π ππ > π(π) π
ππ π ππ β€ π(π) π
ππ π ππ > π(π) π
ππ π ππ > π(π) π
ππ π ππ β€ π(π) π
Set π = π₯1, β¦ , π₯π Class of functions β = β:π³ β β
π = π§(1), β¦ , π§(π) β βπ witnesses the shattering of π&by β if for all π β {0,1}π, there exists β β β such that β π₯π β€ π§(π) if ππ = 0 and β π₯π > π§(π) if ππ = 1
π₯1 β π₯1 β€ π§(1) 0
π₯2 β π₯2 β€ π§(2) 0
π₯3 β π₯3 > π§(3) 1
π₯4 β π₯4 β€ π§(4) 0
π₯5 β π₯5 > π§(5) 1
π₯6 β π₯6 β€ π§(6) 0
π₯7 β π₯7 > π§(7) 1
π₯8 β π₯8 > π§(8) 1
β π₯9 β€ π§(9) 0
Set π = π₯1, β¦ , π₯π Class of functions β = β:π³ β β
π = π§(1), β¦ , π§(π) β βπ witnesses the shattering of π&by β if for all π β {0,1}π, there exists β β β such that β π₯π β€ π§(π) if ππ = 0 and β π₯π > π§(π) if ππ = 1
π₯1 β π₯1 β€ π§(1) 0
π₯2 β π₯2 β€ π§(2) 0
π₯3 β π₯3 > π§(3) 1
π₯4 β π₯4 β€ π§(4) 0
π₯5 β π₯5 > π§(5) 1
π₯6 β π₯6 β€ π§(6) 0
π₯7 β π₯7 > π§(7) 1
π₯8 β π₯8 > π§(8) 1
Set π = π₯1, β¦ , π₯π Class of functions β = β:π³ β β
π = π§(1), β¦ , π§(π) β βπ witnesses the shattering of π&by β if for all π β {0,1}π, there exists β β β such that β π₯π β€ π§(π) if ππ = 0 and β π₯π > π§(π) if ππ = 1
π₯1 β π₯1 β€ π§(1) 0
π₯2 β π₯2 β€ π§(2) 0
π₯3 β π₯3 > π§(3) 1
π₯4 β π₯4 β€ π§(4) 0
π₯5 β π₯5 > π§(5) 1
π₯6 β π₯6 β€ π§(6) 0
π₯7 β π₯7 > π§(7) 1
π₯8 β π₯8 > π§(8) 1
Set π = π₯1, β¦ , π₯π Class of functions β = β:π³ β β
π = π§(1), β¦ , π§(π) β βπ witnesses the shattering of π&by β if for all π β {0,1}π, there exists β β β such that β π₯π β€ π§(π) if ππ = 0 and β π₯π > π§(π) if ππ = 1
π₯1 β π₯1 β€ π§(1) 0
π₯2 β π₯2 β€ π§(2) 0
π₯3 β π₯3 > π§(3) 1
π₯4 β π₯4 β€ π§(4) 0
π₯5 β π₯5 > π§(5) 1
π₯6 β π₯6 β€ π§(6) 0
π₯7 β π₯7 > π§(7) 1
π₯8 β π₯8 > π§(8) 1
The pseudo-dimension of β is the cardinality of the largest set that can be shattered by β.
Set π = π₯1, β¦ , π₯π Class of functions β = β:π³ β β
π = π§(1), β¦ , π§(π) β βπ witnesses the shattering of π&by β if for all π β {0,1}π, there exists β β β such that β π₯π β€ π§(π) if ππ = 0 and β π₯π > π§(π) if ππ = 1
π₯1 β π₯1 β€ π§(1) 0
π₯2 β π₯2 β€ π§(2) 0
π₯3 β π₯3 > π§(3) 1
π₯4 β π₯4 β€ π§(4) 0
π₯5 β π₯5 > π§(5) 1
π₯6 β π₯6 β€ π§(6) 0
π₯7 β π₯7 > π§(7) 1
π₯8 β π₯8 > π§(8) 1
The pseudo-dimension of β is the VC dimension of the set of βbelow the graphβ functions (π₯, π§) βΌ πβ π₯ βπ§>0|&β β β .
Theorem [Haussler 1992] Suppose β = β:π³ β [βπ,π]
For every π, πΏ > 0, and distribution π·, if π = π π
π
2β Pdim(β) ,
then with probability 1 β πΏ over the draw of π = π₯1, β¦ , π₯π ~π·π, for all β β β,
1
π β π₯π β πΌπ₯~π·[β π₯ ]
π
π=1
< π
Corollary [Haussler 1992] Suppose β = β:π³ β [βπ,π] Given a sample π = π₯1, β¦ , π₯π , let βπ = max
βββ β π₯π
ππ=1 .
For all π, πΏ > 0, π = π π
π
2Pdim(β) samples are sufficient to
ensure that w.p. 1 β πΏ over the draw of π = π₯1, β¦ , π₯π ~π·π,
maxβββ
πΌπ₯~π· β π₯ β πΌπ₯~π· βπ π₯ β€ π
Approximation loss
Depends on the class β³. Let πβ be the optimal auction and let π β β³ be the auction returned by the learning algorithm.
πΌ Rev πβ β πΌ Rev π
= πΌ Rev πβ β maxπββ³
πΌ Rev π + maxπββ³
πΌ Rev π β πΌ Rev π
Central question: Does the resulting auction approximately maximize revenue in expectation over the unknown distribution?
Estimation loss
Corollary [Haussler 1992] Suppose β = β:π³ β [βπ,π] Given a sample π = π₯1, β¦ , π₯π , let βπ = max
βββ β π₯π
ππ=1 .
For all π, πΏ > 0, π = π π
π
2Pdim(β) samples are sufficient to
ensure that w.p. 1 β πΏ over the draw of π = π₯1, β¦ , π₯π ~π·π,
maxβββ
πΌπ₯~π· β π₯ β πΌπ₯~π· βπ π₯ β€ π
Pseudo-dimension allows us to bound estimation loss.
Example. Let β be the set of all constant functions
β = β βΆ βπ₯ β π³, β π₯ = π&for&some&π β β
Example. For π = π₯1, π₯2 β π³, let π§(1) and π§(2) be two potential witnesses. WLOG, suppose π§(1) β€ π§(2). There is no constant function β π₯ = π where π§(2) < β π₯2 = π = β π₯1 β€ π§(1). β π is not shatterable.
π₯1 π₯2
π§(1)
π§(2)
Example. For π = π₯1, π₯2 β π³, let π§(1) and π§(2) be two candidate witnesses. WLOG, suppose π§(1) β€ π§(2). There is no constant function β π₯ = π where π§(2) < β π₯2 = π = β π₯1 β€ π§(1). β π is not shatterable. Clearly, a single point is shatterable. Therefore, the pseudo-dimension of β is 1.
1. Introduction
2. Auction design background
3. Revenue-maximizing auctions
4. Learning-theoretic auction design
5. Tools
6. Case study: Second price auctions with reserve prices a) Recap
b) Result
7. Related work
Central question: Does the resulting auction approximately maximize revenue in expectation over the unknown distribution?
Fix an auction class β³
Look at the samples
Optimize over β³
Field auction
Approximation loss
Depends on the class β³. Let πβ be the optimal auction and let π β β³ be the auction returned by the learning algorithm.
πΌ Rev πβ β πΌ Rev π
= πΌ Rev πβ β maxπββ³
πΌ Rev π + maxπββ³
πΌ Rev π β πΌ Rev π
Central question: Does the resulting auction approximately maximize revenue in expectation over the unknown distribution?
Estimation loss
Suppose β³ is the class of second price auctions with reserve prices. β³ has zero approximation loss when the bidders valuations are i.i.d.! What is the estimation loss of π?
Myersonβs optimal auction for i.i.d. bidders
Let π π₯ = π₯ β1βπΉ(π₯)
π(π₯). Run a second price auction with a reserve
price of πβ1 0 .
1. Introduction
2. Auction design background
3. Revenue-maximizing auctions
4. Learning-theoretic auction design
5. Tools
6. Case study: Second price auctions with reserve prices a) Recap
b) Result
7. Related work
Notation: π = π£1, β¦ , π£π ~π·, where π£π is the value of bidder π
Let π£(1) = max π£1, β¦ , π£π and π£(2) = max π£1, β¦ , π£π β π£ 1
For a reserve price π, revπ π = max π, π£(2) β π π£ 1 β₯ π .
Define the class β = revπ π βΆ π& β₯ 0 . Weβve seen that the approximation loss for learning over β is zero when the bidders have i.i.d. valuations.
Notation: π = π£1, β¦ , π£π ~π·, where π£π is the value of bidder π
Let π£(1) = max π£1, β¦ , π£π and π£(2) = max π£1, β¦ , π£π β π£ 1
For a reserve price π, revπ π = max π, π£(2) β π π£ 1 β₯ π .
Define the class β = revπ π βΆ π& β₯ 0 . What is the estimation loss for learning over β? To answer this, weβll bound the pseudo-dimension of β.
*Special case of a theorem by Morgenstern and Roughgarden [2016]
Theorem* The pseudo-dimension of the class β = revπ π βΆ π& β₯ 0 is 2.
Proof.
Suppose π = π(1), β¦ , π(π) is shatterable by β = revπ π βΆ π β₯ 0 .
There is a set of witnesses π§(1), β¦ , π§(π) such that for every π β 0,1 π, there exists ππ β₯ 0 such that revππ
π(π) β€ π§(π) if ππ = 0 and revππ
π(π) > π§(π) if ππ = 1.
Proof. Let π β π and let π§ be its witness. Write revπ π = revπ π .
π£(2)
revπ π
π
π£(2)
π£(1)
Proof. Suppose π§ < π£(2).
When π β€ π£(1), revπ π > π§ and when π > π£(1), revπ π β€ π§.
revπ π
π
π£(2)
π§
π£(1)
Proof. Suppose π§ β₯ π£(2).
When π β 0, π§ or π β π£ 1 , β , revπ π β€ π§.
When π β π§, π£ 1 , revπ π > π§.
π£(1)
revπ π
π π§
π§
Proof. Notation: Let π(π) β π and let π§(π) be its witness.
π
revπ(π) π
Second highest value of
π(π)
π£(2)(π)
π£(1)(π)
Highest value of
π(π)
Proof.
Sort π§ 1 , π£ 11, β¦ , π§ π , π£ 1
π . For all π(π) β π, between these sorted
values, either revπ(π) π > π§(π) or revπ(π) π β€ π§(π).
revπ(π) π
π
π£(2)(π)
π§ π
π£(1)(π)
Proof.
Sort π§ 1 , π£ 11, β¦ , π§ π , π£ 1
π . For all π(π) β π, between these sorted
values, either revπ(π) π > π§(π) or revπ(π) π β€ π§(π).
revπ(π) π
π§ π
π£(1)(π)
π§ π
Proof. Recall:
Suppose π = π(1), β¦ , π(π) is shatterable by β = revπ π βΆ π β₯ 0 .
There is a set of witnesses π§(1), β¦ , π§(π) such that for every π β 0,1 π, there exists ππ β₯ 0 such that revππ
π(π) β€ π§(π) if ππ = 0 and revππ
π(π) > π§(π) if ππ = 1. Let π β = ππ βΆ π β 0,1 π .
Proof.
Sort π§ 1 , π£ 11, β¦ , π§ π , π£ 1
π . For all π(π) β π, between these sorted
values, either revπ(π) π > π§(π) or revπ(π) π β€ π§(π).
revπ(π) π
π
π£(2)(π)
π§ π
π£(1)(π)
Proof.
Sort π§ 1 , π£ 11, β¦ , π§ π , π£ 1
π . For all π(π) β π, between these sorted
values, either revπ(π) π > π§(π) or revπ(π) π β€ π§(π).
revπ(π) π
π§ π
π£(1)(π)
π§ π
Proof.
Sort π§ 1 , π£ 11, β¦ , π§ π , π£ 1
π . For all π(π) β π, between these sorted
values, either revπ(π) π > π§(π) or revπ(π) π β€ π§(π).
At most one π β π β is from each interval between sorted values. Therefore, 2π = π β β€ 2π + 1, so π β€ 2.
Theorem The pseudo-dimension of the class β = revπ π βΆ π& β₯ 0 is 2.
Corollary Suppose the biddersβ valuations are i.i.d with support in [0,1]. Let β be the class of second price auctions with reserves. Approximation loss: 0 Estimation loss: π 1
|π|2
Corollary Suppose the biddersβ valuations are i.i.d with support in [0,1]. Let β be the class of second price auctions with reserves. For a sample π = π(1), β¦ , π(π) ~π·, let ππ be the reserve price that maximizes revπ π ππ
π=1 . π
1
π2 samples are sufficient to ensure that with probability at
least 1 β πΏ, πΌπ~π· revπππ β₯ max
πβ₯0πΌπ~π· revπ π β ν.
This is tight [Cesa-Bianchi et al. 2015]!
1. Introduction
2. Auction design background
3. Revenue-maximizing auctions
4. Learning-theoretic auction design
5. Tools
6. Case study: Second price auctions with reserve prices
7. Related work
Other single-item auction batch learning papers: Elkind β07, Dhangwatnotai et al. β10, Medina and Mohri β14, Cole and Roughgarden β14, Huang et al. β15, Morgenstern and Roughgarden β15, Roughgarden and Schrijvers β16, Devanur et al. β16, Hartline and Taggart β17, Alon et al. β17, Gonczarowski and Nisan β17 Multi-item auction batch learning: Likhodedov and Sandholm β04, β05, Morgenstern and Roughgarden β16, Balcan, Sandholm, and V β16, β17, Syrgkanis β17, Cai and Daskalakis β17, Medina and Vassilvitskii β17 Online auction design: Kleinberg and Leighton β03, Blum and Hartline β05, Cesa-Bianchi et al. β15, DudΓk et al. β17, Balcan, Dick and V β17
Auction design via deep learning: DΓΌtting et al. β17
Solid regions represent allocation probability learned by neural network when thereβs a single bidder with π£1, π£2~π[0,1]. The optimal mechanism of Manelli and Vincent [β06] is described by the regions separated by the dashed orange lines.