![Page 1: Challenges doing Personalization at Web-scalegotocon.com/dl/jaoo-melbourne-2011/slides/... · where C is the scale-unit CPUs to deliver 1 query in time T, without personalization](https://reader033.vdocuments.us/reader033/viewer/2022042220/5ec62498467b83320007c15e/html5/thumbnails/1.jpg)
CHALLENGES DOING
PERSONALIZATION
AT WEB-SCALE
Peter Bailey (pbailey)
Bing Contextual Relevance, Microsoft
YOW! Australia :: Dec 2011
Meet Bruce
He’s an Australian
expatriate …
YOW! Australia 2011 © Microsoft Bing : Peter Bailey 2
Meet Bruce
From Sydney …
… who still likes to read his hometown newspaper
Meet Bruce
He lives in Sarasota,
Florida, and has a long
term history of health
related queries. Must be
all that sun …
… so it’s lucky the
Sarasota Memorial
Hospital has a full array
of skin cancer services
Meet Bruce
He traveled to Seattle
for work this week,
where he is depressed
by the bleak weather.
He’s searched on
Seasonal Affective
Disorder and similar
maladies.
Meet Bruce
Bruce’s most recent
searches however
include {BBH} and {PPH}
… which happen to be
stock codes for some
HOLDRS ETFs from
Merrill Lynch
Meet Bruce
Bruce’s next query is:
{smh}
What do we do?!
WHAT IS THIS TALK ABOUT? It’s all about the users …
User history in context: Peter Bailey
Search work:
• ANU (twice)
• NUIX
• Synop
• CSIRO
• Microsoft
Other work:
• Object Technology
International (IBM)
Products:
• Parallel programming
software systems (for
Fujitsu AP1000)
• VisualAge for Smalltalk
• Panoptic (Funnelback)
• Sytadel CMS
• IE Suggested Sites
• Bing
• 4/5th of a CORBA ORB
Focus areas
Kinds of personalization
Measurement
Experimentation
User feedback systems
Performance
Another thing
© Andrew Sieber
© Todd Lappin
KINDS OF
PERSONALIZATION
Several core types of online systems
• Pull information, e.g. search engines
• Push information, e.g. advertisements
• Browse information, e.g. web sites
• Interaction, e.g. games
• Transactional, e.g. travel bookings
Kinds of Personalization

| Level | Short name | Description | Prerequisites |
|---|---|---|---|
| 0 | None | No personalization | |
| 1 | Context-only | Contextualization: all people who belong to some easily identified cohort receive the same experience, but this experience is different across cohorts (e.g. your location) | The system can identify that you belong to a particular cohort |
| 2 | Different data | You see different things because you have a different "view" onto some underlying data than someone else (e.g. your bank account) | You have an "account" of sorts with the system; it knows who "you" are with respect to other users, that is, an online identity exists |
| 3 | ACLs | You can do different things or receive different information because of how the system maps certain access rights associated with your identity (e.g. your security groups) | Some form of "role-based" access controls, typically hard-coded through business rules |
| 4 | Static user models | You experience different results and interactions with the system based on pre-computed user models about you (e.g. your credit history) | User models are computed in batch mode, and are used to adjust the interaction algorithms of the system |
| 5 | Dynamic user models | You experience different results and interactions with the system based on your real-time interaction with it (e.g. your searches and browsing patterns) | Dynamic behavior analysis and user model updates are used to adjust the interaction algorithms of the system |
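One way to make the level scheme concrete is to encode it as an ordered enumeration and ask which level a system can reach given the prerequisites it has in place. This is a minimal sketch, not something from the talk: the capability flag names are hypothetical, and treating the levels as strictly cumulative (each level requires all earlier prerequisites) is an assumption the slide does not state.

```python
from enum import IntEnum


class PersonalizationLevel(IntEnum):
    """The six levels from the slide, using its short names."""
    NONE = 0
    CONTEXT_ONLY = 1
    DIFFERENT_DATA = 2
    ACLS = 3
    STATIC_USER_MODELS = 4
    DYNAMIC_USER_MODELS = 5


def highest_supported_level(
    knows_cohort: bool,
    has_identity: bool,
    has_access_rules: bool,
    has_batch_models: bool,
    has_realtime_models: bool,
) -> PersonalizationLevel:
    """Return the highest level whose prerequisites are all met.

    Assumption (not from the slide): levels are cumulative, so the walk
    stops at the first missing prerequisite.
    """
    checks = [
        (knows_cohort, PersonalizationLevel.CONTEXT_ONLY),
        (has_identity, PersonalizationLevel.DIFFERENT_DATA),
        (has_access_rules, PersonalizationLevel.ACLS),
        (has_batch_models, PersonalizationLevel.STATIC_USER_MODELS),
        (has_realtime_models, PersonalizationLevel.DYNAMIC_USER_MODELS),
    ]
    level = PersonalizationLevel.NONE
    for satisfied, candidate in checks:
        if not satisfied:
            break
        level = candidate
    return level
```

Under the cumulative assumption, a system with cohort detection and accounts but no access rules tops out at "Different data", even if it already computes user models.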
MEASUREMENT
Measure driven development (MDD)
Measurement first
• Know what you’re trying
to improve for the user
• Know how you’re going
to tell if you succeed
Antonym
“shipping and hoping”
Challenges:
• Changing mindsets
• Finding or designing
user-sensitive measures
• Finding repeatable
measures
User-centric behavior
Measurement of user
behavior is fundamental
Measures posit some
model of user task
achievement
• Typical user behavior is
not unambiguous
Challenges:
• HTTP is a stateless
protocol
• User-centric systems
require additional
infrastructure
• Different basis to
logging activity
• More data to log
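Because HTTP itself carries no notion of a user session, user-centric logging usually has to reconstruct sessions from individual events. The following is a minimal sketch of that idea, assuming each logged event already carries a user id; the 30-minute inactivity gap, the event tuple shape, and the function name are illustrative assumptions, not details from the talk.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Assumed inactivity timeout; a common convention, not a value from the talk.
SESSION_GAP_SECONDS = 30 * 60

Event = Tuple[str, float, str]  # (user_id, timestamp_seconds, action)


def sessionize(events: Iterable[Event]) -> Dict[str, List[List[str]]]:
    """Group stateless request events into per-user sessions."""
    by_user = defaultdict(list)
    for user_id, ts, action in events:
        by_user[user_id].append((ts, action))

    sessions: Dict[str, List[List[str]]] = {}
    for user_id, items in by_user.items():
        items.sort(key=lambda item: item[0])  # order each user's events by time
        user_sessions: List[List[str]] = []
        last_ts = None
        for ts, action in items:
            # Start a new session on the first event or after a long gap.
            if last_ts is None or ts - last_ts > SESSION_GAP_SECONDS:
                user_sessions.append([])
            user_sessions[-1].append(action)
            last_ts = ts
        sessions[user_id] = user_sessions
    return sessions
```

Note that the grouping key is the user, not the request, which is the "different basis to logging activity" the slide points at.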
User data may be sensitive
Personally identifiable
information (PII)
Most individual PII and
user-associated
behavioral data is not
computationally
interesting other than in
relation to aggregated
data
Challenges:
• Storage
• Access
• Retention
• Data mining
• PII to non-PII boundaries
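One common way to handle the PII to non-PII boundary is to replace raw user identifiers with keyed hashes before behavioral data reaches the analysis pipeline. The sketch below uses Python's standard `hmac` module; the function name and the idea that a secret key is held only on the PII side of the boundary are assumptions for illustration. This is pseudonymization, not anonymization: behavior patterns can still identify people.

```python
import hashlib
import hmac


def pseudonymize(user_id: str, secret_key: bytes) -> str:
    """Map a raw user id to a stable pseudonym using HMAC-SHA256.

    The same id and key always give the same pseudonym, so aggregation
    still works, but the raw id cannot be recovered without the key.
    """
    return hmac.new(secret_key, user_id.encode("utf-8"), hashlib.sha256).hexdigest()
```

Rotating the key breaks linkage between old and new pseudonyms, which is one possible lever for retention policies.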
EXPERIMENTATION
Scale of experimentation data
• In-person user study
• Crowd-sourced investigation
• Web-scale user behavior
Culture of experimentation
Measurement matters!
• And an organizational culture that respects it
Good user-based experimentation processes
• These are real people
Failure is acceptable
• Indeed, it’s expected!
• But fail fast, otherwise you’ll lose your users
Challenges:
• Infrastructure complexity
• E.g. 2 versions of your user models
• Testing pre-production systems involving users
• Measurement, evaluation, and statistical expertise
• Fast failure monitoring and recovery systems
Experimental design with users
Typical experimental
design seeks to
eliminate variance due
to users
• What happens when
that’s the whole point?
Determine that your
user sample will reflect
behavior at large
Challenges
• User sampling
• Sufficient user scale
• Interaction effects unless
experiment is completely
naturalistic
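User sampling for online experiments is often done by hashing a user id into a bucket, so that each user sees a consistent experience and separate experiments are assigned independently. The sketch below illustrates that general technique; it is not a description of how Bing assigns users, and the function name, experiment label, and 50/50 default split are all assumptions.

```python
import hashlib


def assign_bucket(user_id: str, experiment: str, treatment_fraction: float = 0.5) -> str:
    """Deterministically assign a user to 'treatment' or 'control'.

    Including the experiment name in the hash input makes assignments
    in different experiments effectively independent of each other.
    """
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode("utf-8")).digest()
    # Interpret the first 8 bytes as an integer in [0, 2**64).
    point = int.from_bytes(digest[:8], "big")
    threshold = int(treatment_fraction * 2**64)
    return "treatment" if point < threshold else "control"
```

Deterministic assignment keeps the experience stable for each person across visits, but it does not by itself guarantee that the sample reflects behavior at large; that still has to be checked.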
Efficiency
Online
• Online experimentation is really expensive!
• Real users
• Not directly repeatable
• Production quality code
• Failure responsibilities
• Worse user experience if experiment not successful
• Wasted finite user “bandwidth” if experiment wrongly designed or has bugs
• But ultimately you have to test online because it’s all about the users …
Offline
• Find simple surrogates for online user behavior
• Use a preliminary version of any new system yourself, along with many others
• Use representative scenarios, not just hand-picked ones
• Build offline experimentation platforms that are repeatable and scalable, with a low barrier to entry for non-developers
USER FEEDBACK SYSTEMS
Explicit
Explicit feedback from
users should provide
discernible value to users
Explicit feedback should
be:
• Non-invasive
• Painless (one-click)
• Obvious
• Progressive for the
enthusiastic
Challenges:
• Low volume (high value)
• Spam-able
• Explicability of
personalization choices
• Potential for “creepiness”
Implicit
Implicit feedback can be an invaluable source of information from your users
Probably the most vital data source for personalization
• The more users, the more implicit feedback, the better
Challenges:
• High volume
• High noise
• Interpretability of actions
• Interpretability of changes in actions
• Additional code, storage, and tooling
PERFORMANCE
Amdahl’s Law redux
$$\frac{1}{(1 - P) + \frac{P}{S}}$$
$$qps = \frac{C}{T}$$
where C is the scale-unit CPUs to deliver 1 query in time T, without personalization
$$uqps = \frac{C}{T - U + U \cdot n}$$
where U is the personalization time within T (akin to 1-P in Amdahl’s Law), and n is the number of users making requests at any time
With simplifying assumptions, sustaining your original qps requires scaling C by
$$\frac{qps}{uqps} \cong 1 + \frac{n \cdot U}{T}$$
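A quick numeric check of the scaling factor may help. Expanding the ratio from the slide's own definitions gives qps/uqps = (T − U + U·n)/T = 1 + (n − 1)·U/T, which is close to 1 + n·U/T once n is large; reading the "simplifying assumptions" this way is our interpretation, not something the slide spells out. The function names and the sample numbers (T = 100, U = 1, n = 50, in arbitrary time units) are hypothetical.

```python
def amdahl_speedup(p: float, s: float) -> float:
    """Classic Amdahl's Law: p is the fraction sped up, s the speedup of that fraction."""
    return 1.0 / ((1.0 - p) + p / s)


def qps(c: float, t: float) -> float:
    """Throughput without personalization, as defined on the slide."""
    return c / t


def uqps(c: float, t: float, u: float, n: int) -> float:
    """Throughput with personalization time u inside t, for n concurrent users."""
    return c / (t - u + u * n)


def scale_factor_exact(t: float, u: float, n: int) -> float:
    """Exact ratio qps / uqps; C cancels out."""
    return (t - u + u * n) / t


def scale_factor_approx(t: float, u: float, n: int) -> float:
    """The slide's approximation, 1 + n*U/T."""
    return 1.0 + n * u / t


if __name__ == "__main__":
    t, u, n = 100.0, 1.0, 50  # hypothetical values
    print(scale_factor_exact(t, u, n), scale_factor_approx(t, u, n))
```

With these sample values, personalization that takes only 1% of the request time still calls for roughly 1.5 times the capacity, which is why the per-request personalization cost U matters so much at scale.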
Amdahl’s law redux
$$1 + \frac{n \cdot U}{T}$$
Challenges:
• It’s all about what you can’t speed up
• Personalization breaks most of your preconceptions about using caching for performance optimization
• Be ultra-focused on optimizing personalization code if you have a large number of users
• Milliseconds matter!
(At) page load time
Fast sites get more use
Personalization makes page load slower
Well engineered design can mask additional latencies
Challenges:
• Get user id and data as quickly as possible
• Separate personalized vs non-personalized content for progressive loading
• Avoid caching of other people’s personalized information at any point in the server-to-client delivery chain
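One standard lever for the last point is the HTTP `Cache-Control` response header: `private` tells shared caches (proxies, CDNs) not to store a response, and `no-store` asks every cache not to keep it. The sketch below shows how a server might choose headers for personalized versus shared fragments; the function name and the 300-second default are assumptions for illustration, and real deployments need to check how each intermediary in their delivery chain honors these directives.

```python
from typing import Dict


def cache_headers(personalized: bool, shared_max_age: int = 300) -> Dict[str, str]:
    """Pick Cache-Control headers for a response fragment.

    Personalized fragments must never be stored by shared caches;
    non-personalized fragments can be cached and served to everyone.
    """
    if personalized:
        return {"Cache-Control": "private, no-store"}
    return {"Cache-Control": f"public, max-age={shared_max_age}"}
```

Splitting a page into a cacheable shell and a separately fetched personalized fragment lets the two kinds of content carry different headers, which also fits the progressive loading point above.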
ANOTHER THING
Thoughts on the “filter bubble”
My thoughts are not about whether search engines have an
ethical responsibility to deliver diversity of opinion
• Though that’s a great debate to have
We should acknowledge that people with an ability to
interact with global search engines have (speed of) access
to information unparalleled in the history of humanity
• Though there are many who still do not have access
But people have no more time than they ever did
Thoughts on the “filter bubble”
It’s all about trust
• Do your users grow to believe you are trying to do the best things
possible for them?
• How do you demonstrate this?
• How do you let them tell you that you got it wrong?
• And how do you learn continuously?
• And do it, and know that you’re doing it, better the next time?
Thoughts on the “filter bubble”
In the end, all non-trivial information system interactions
will be personalized to some degree
• I encourage you to start thinking about what that may mean for
your development practices sooner rather than later!
© Daniel Agostini
QUESTIONS? And thank you for listening!