machine learning for display advertising - nyu
TRANSCRIPT
1
©Provost2009,2010
MachineLearningforDisplayAdvertising
©Provost2009,2010
The Idea: Social Targeting for Online Advertising
Perf
orm
ance
2
©Provost2009,2010
Currentadspendingseemsdisproportionate…
©Provost2009,2010
OnlineAdvertisingSpendingBreakdown(2009)
Source: IAB Internet Advertising Revenue Report Conducted by PricewaterhouseCoopers and Sponsored by the Interactive Advertising Bureau (IAB) April 2010
3
©Provost2009,2010
Twodifferentgoalsfordisplayadvertising
• Driveconversions(shortterm)• Brandadvertising(longerterm)
©Provost2009,2010
OnlineBrandAdvertising
– goal:todeliverbrandmessagetoselectedaudience– contrastwith“directmarketing”onlineadvertising
• forbrandadvertising,goalisnotnecessarilyclicksoronlineconversions
– key:selectingaudience• examplestrategy(traditional):findaudiencebasedonpublishedcontent(tvshows,magazines)orlocation(billboards,etc.)
– traditionalbrandadvertisingstrategyappliesonline:• premiumdisplayslotsorremnants(e.g.,onespn.com,etc.)• contextualtargeting(e.g.,GoogleAdSense)
– alternativestrategy:identifymembersofthetargetaudienceandtargetthemanywhereontheweb(e.g.,bidforthemonadexchanges–thenon‐premiumdisplaymarket)
4
©Provost2009,2010
• Non‐premiumdisplayadmarketpredictedtogrowsignificantlyfasterthantherestofonlineadvertising(e.g.,sponsoredsearch,premiumdisplay,contextual)
– (Coolbrith2007)– largelyduetothestabilizationofthetechnicalad‐servinginfrastructurebasedontheconsolidationintoasmallnumberofadexchanges(e.g.,DoubleClick,RightMedia)
• Thereisevidencethatdisplaybrandadvertisingincreasespurchases(onlineandoffline),andimprovessearchadvertisingaswell(Manchandaetal.2006,Comscore2008,AtlasInstitute2007,Fayyadpersonalcommunication,Klaassen2009,Lewis&Reiley2009)
– other(older)workshowsdisplayadsleadtoincreasedadawareness,brandawareness,purchaseintention,andsitevisits(seecitesinManchanda)
©Provost2009,2010
Twodifferentgoalsfordisplayadvertising
• Driveconversions(shortterm)• Brandadvertising(longerterm)
• Bothareimportant– (intheoff‐lineworldmostadspendingisonbrandads)– OurKDD‐2009paperfocusedononlinebrandadvertising– TodayI’llmeldthetwotogether
• WhatI’mnotinterestedinisclicksondisplayads– we’llreturntothatlater
5
©Provost2009,2010
Mainpointsforthismorning
1. Machinelearningcanbeusedasthebasisforeffective,privacyfriendlytargetingforonlineadvertising
2. Importanttoconsidercarefullythetargetvariableusedfortraining
3. Question:shouldmachinelearningresearchersbespendingmoretimeconsideringtheeffectivenessofadvertising?
©Provost2009,2010
Hill, Provost, and Volinsky. “Network-based Marketing: Identifying likely adopters via consumer networks. ” Statistical Science 21 (2) 256–276, 2006.
Priorwork:
Socialnetworktargeting
• DefinedSocialNetworkTargeting‐‐>crossbetweenviralmarketingandtraditional
• target“networkneighbors”ofexistingcustomers• basedondirectcommunicationbetweenconsumers• thiscouldexpand“virally”throughthenetworkwithoutanyword‐of‐mouthadvocacy,
orcouldtakeadvantageofit.
• Exampleapplication:– Product:newcommunicationsservice– Firmwithlongexperiencewithtargetedmarketing– Sophisticatedsegmentationmodelsbasedondata,experience,andintuition
• e.g.,demographic,geographic,loyaltydata• e.g.,intuitionregardingthetypesofcustomersknownorthoughttohaveaffinityfor
thistypeofservice
• Results:tremendousliftinresponserate(2‐5x)1
4.82
2.96
0.4
Non-NN 1-21 NN 1-21 NN 22 NN nottargeted
6
©Provost2009,2010
1
4.82
2.96
0.4
Non-NN 1-21 NN 1-21 NN 22 NN nottargeted
(0.28%)
(1.35%)
(0.83%)
(0.11%)
RelativeSalesRatesforNetworkNeighbors(NN)andothers
Salesratesaresubstantiallyhigherfornetworkneighbors(Hill,Provost,VolinskyStat.Sci.2006)
1‐21aretargetedmarketingsegments;22comprisesNNsnotdeemedgoodtargetsbytraditionalmodel
©Provost2009,2010
Thanks to (McPherson, et al., 2001)
• Birds of a feather, flock together – attributed to Robert Burton (1577-1640)
• (People) love those who are like themselves -- Aristotle, Rhetoric and Nichomachean Ethics
• Similarity begets friendship -- Plato, Phaedrus
• Hanging out with a bad crowd will get you into trouble
-- Fosterʼs Mom
Issuch“guilt‐by‐association”targetingjustifiedtheoretically?
7
©Provost2009,2010
©Provost2009,2010
“Privacy”online?
Aretherepointsbetweentheextremesthatgiveusacceptabletradeoffsbetween“privacy”andefficacy?
“Wecandowhateverwewantwithwhateverdatawecangetourhandson.”
“Youcan’tdoanythingwithMYdata!”
Wherewouldwelikefirmstooperateonthespectrumbetweenthetwounacceptableextremes:
8
18 © 2010 Sinan Aral. All rights Reserved.
Lift in adoption of neighbors of adopters
Seeming adoption influence between network neighbors can be largely explained by homophily
from (Aral et al. PNAS 2009)
Social connections reveal deep similarity - profiles, interests, attitudes,…
©Provost2009,2010
privacy‐friendlysocialtargetingisdifferent…Doubly‐anonymizedbipartitecontent‐affinitynetwork
contentvisited(socialmediapages)
browsers
9
©Provost2009,2010
contentvisited(e.g.,socialmediapages)
browsers
amongbrowsers
content‐affinitynetwork
network neighbors
target the strong network neighbors in ad exchanges
Fromdoubly‐anonymizedbipartitecontent‐affinitynetworktoquasi‐socialnetwork
©Provost2009,2010
Somebrandproximitymeasures
• POSCNT– numberofuniquecontentpiecesconnecting
browsertoB+
• MATL– maximumnumberofcontentpiecesthrough
whichpathsconnectbrowsertosomeparticularactiontaker(i.e.,seednodeinB+)
• minEUD– minimumEuclideandistanceofnormalized
contentvectortoaseednode
• maxCos– maximumcosinesimilaritytoaseednode
• ATODD– “odds”ofaneighborbeinganactiontaker(i.e.,seed
nodeinB+).
B+
A
B
CD
E
See: Audience Selection for On-line Brand Advertising: Privacy-friendly Social Network Targeting. Provost, F., B. Dalessandro, R. Hook, X. Zhang, and A. Murray.. In Proceedings of the Fifteenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD 2009).
Plus, multivariate statistical models using these as features
10
©Provost2009,2010
Multivariatemodel
• Foreachbrowserbi,afeaturevectorφbicanbecomposedofthevariousbrandproximitymeasures
• Thedifferentevidencecanbecombinedviaarankingfunctionf(φbi)
• Weletf(.)beamultivariatelogisticfunction,trainedviastandardMLElogisticregression(notregularized)
• Trainingisbasedonaheld‐outtrainingset
©Provost2009,2010
InitialStudy:Data
• asampleofabout10millionanonymizedbrowsers• alloftheirobservedvisitstosocialmediacontentover90days(here:fromseveralofthelargestSNsites)
• bipartitegraph:– 107x108with~2.5x108non‐zeroentries
• quasi‐socialnetwork:– 107nodeswith20‐40neighborseach(onaverage)
• morethanadozenwell‐knownbrands:– HotelA,HotelB,ModelingAgency,CellPhone,CreditReport,AutoInsurance,Parenting,VOIPA,VOIPB,Airline,ElectronicsA,ElectronicsB,Apparel:Athletic,Apparel:Women’s,Apparel:HipHop
– onaverage~100Kseednodesperbrand
KDD-2009
11
©Provost2009,2010
Exampleresultfrominitialstudy:Liftfortop10%ofNNsBased on study of about 10^7 browsers 10^8 social network pages 15 (mostly) well-known brands
KDD-2009
©Provost2009,2010
Networkneighborsoftenshowsimilardemographics Foronecampaign(CellPhone)weaskedQuantcast.comfordemographicprofilesoftheseedbrowsersandtheirclosenetworkneighbors:
Demographic Seeds Neighbors
Gender Female Female
Ethnicity Hispanic Hispanic
Age Young Young
Income Low Low
Education NoCollege NoCollege
but this belies the advantage of network neighbor targeting: all people who share demographics don’t share interests AND all people who share interests don’t share demographics
12
©Provost2009,2010
Socialvs.Quasi‐Social
Brand F‐AUConallBF‐AUConN
onlyHotel A 0.96 0.79 Modeling Agency 0.98 0.84 Credit Report 0.93 0.79 Parenting 0.94 0.80 Auto Insurance 0.97 0.81 …
15 Brand Average 0.96 0.81
Thecontent‐affinitynetworkembedsafriendshipnetwork?
• estimateeachbrowser’shomepagebasedontechniquesanalogoustoauthoridbasedoncitations(Hill&Provost,2003)• estimate“friends”tobethosewhovisiteachother’shomepage• Ask:dobrandproximitymeasuresrankbrandactors’friendshighly?
• F‐AUCmeasuresprobabilitythataknown‐friendisrankedhigherthanabrowsernot‐known‐to‐be‐friend
Airline
©Provost2009,2010
Exampleresultfrominitialstudy:Liftfortop10%ofNNsBased on study of about 10^7 browsers 10^8 social network pages 15 (mostly) well-known brands
KDD-2009
13
x
2x
4x
6x
8x
10x
12x
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
Performan
ceOverBa
selin
e
Each Bar = Lift for one client during the month of December
Baseline
Lift = freq. that targeted browsers visit site / freq. that baseline browsers visit site
Fastforward1.5years…“Invivo”performanceLiftforstrongnetworkneighborsacrosssampleofcampaigns
median lift = 5x
©Provost2009,2010
100%50%10%
• Shows lift for a particular size targeted population (%) • Left-to-right decreases targeting threshold
Performanceimprovessubstantiallywithmoredata..
14
©Provost2009,2010
• The orange horizontal line intersects the curves at the % of the population we can reach to get a 2x lift.
• The maroon vertical line intersects the curves at the lift multiple for the best 10% of each population.
100%50%10%
Performanceimprovessubstantiallywithmoredata..
©Provost2009,2010
Potentialstumblingblock:…whatdothoseredcirclesrepresentagain?
contentvisited(socialmediapages)
browsers
If there are not very many conversions, how can we build effective predictive models?
15
©Provost2009,2010
Whatnottodo:useclicksasasurrogateforconversions
0.4
2.4
4.4
6.4
8.4
10.4
12.4
14.4
16.4
0.4 2.4 4.4 6.4 8.4 10.4 12.4 14.4
CR
C T
ran
SV: E
val P
urch
CRC Train Purch : Eval Purch
Click vs. Purchase Training: Lift @ 5% Li
ft:
Tra
in C
lick
; Eva
l P
urc
hase
Lift: Train Purchase; Eval Purchase
©Provost2009,2010
Butthereisagood,easy‐to‐obtainsurrogate
0.4
2.4
4.4
6.4
8.4
10.4
12.4
14.4
16.4
0.4 2.4 4.4 6.4 8.4 10.4 12.4 14.4
CR
C T
ran
SV
: Eva
l P
urc
h
CRC Train Purch : Eval Purch
SV vs. Purchase Training: Lift @ 5%
:sitevisits
Lift
: Tra
in S
V;
Eva
l P
urc
hase
Lift: Train Purchase; Eval Purchase
16
©Provost2009,2010
Generally,sitevisitsisbetterthanconversionswhentherearerelativelyfewconversions
-20%
-15%
-10%
-5%
0%
5%
10%
15%
0 1 2 3 4 5 6 7 8 9
(AU
C p
-A
UC
sv)
/ A
UC
p
N umber of Purchasers - Log Scale
Bubble Size represents #SV/#Purchasers
Purch better
SV better
How could that be?
©Provost2009,2010
Summaryofmainpoints
1. Machine learning can be the basis for effective privacy friendly targeting for online advertising – effective from several different angles, clearly improves with more data
2. Important to consider carefully the target used for training – conversions are good if you can get them; site visits can be a surprisingly good surrogate; clicks generally are not a good surrogate
3. Question: should machine learning researchers be spending more time considering the effectiveness of advertising? – initial evidence shows surprisingly strong influence of seeing an online advertising impression. This deserves more study.