inform: targeting the interest graph
DESCRIPTION
Personalization of content and ad selection using the Inform Service
TRANSCRIPT
Targeting the Interest Graph:
Marc Hadfield, CTO, Inform
Semantic Technology Conference, 2011
Personalization of content and ad selection using the Inform Service
Introduction
Marc Hadfield is CTO of Inform Technologies.
Interests: Natural Language Processing, Semantics, Life Science, Graph Algorithms, Machine Learning, Big Data
Inform Technologies is a semantic technology company.
Inform provides semantic technology (NLP and analytics) to publishers, and operates the user-generated forum site Yuku.com.
We at Inform have been extending our technology into the user-generated content space. We’ve adapted it to different kinds of content such as informal text, photos, videos, and questions.
We’ve recently addressed Ad Selection, Video Selection, and Personalization.
I’ll discuss some of our results with the Interest Graph.
Inform Service
Semantic Software-as-a-Service for Publishers
Advantage: ~30% boost in engagement in “traditional” publisher websites.
Tracks 4,000+ Subjects and 320,000+ Entities: Inform Topics
Inform Service: – In-Article links to Topics Pages – Related Articles from the Archive – Related Articles around the Web – Related Photos – Related Videos – Topic Pages including mix of content sources – Tools (Publishing Tools, etc.)
Inform Publisher Customers
Yuku Forums Forum Content
– “Old School” user generated content – ~40,000 forums – Top 100 forums account for about 50% of traffic – ~1 Billion short form content pieces – ~1 Million monthly unique users – ~150K new content objects per day – ~1 Million Page Views per Day
Subscription / Advertising Revenue
Inform is adapting / integrating our Semantic Tech
Great laboratory for testing algorithms / theories – results apply more broadly than the Yuku platform
Nice A/B testing environment – testing new algorithms on our ForumFind search engine and in embedded widgets in Yuku
Good reason to improve Ad Selection
Occam
Today: Personalization for Enhanced Targeting
• Capturing the Interest Graph
• Personalized experience – Help people find interesting content – Make ads relevant
Inform Content & Analytics Platform
Publisher site Widgets Yuku
Licensed / Crawled Content
3rd Party / Activity Data
Core Engine “Occam”
Content Distribution
Content / Data Ingestion
Text Analysis
Categorization / Personalization
Algorithms
Inform “Occam” Architecture
Receive Message
• REST Webservice Call • Queue
Extract
• Get URL • Extract Document Features • Extract Text
NLP
• NLP Features (Machine Learning) • Inference Engine (Prolog / Frame Logic) • Discourse / Behavior / Sentiment Models (Prolog / Frame Logic) (New)
Analysis
• Trend Analysis (incremental data) • Graph Analysis (incremental data)
Reply
• Store in Semantic Repository (if needed) • Send Reply Message (via Queue or Webservice)
Example Workflow:
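The five stages listed above (Receive Message, Extract, NLP, Analysis, Reply) can be sketched as a small queue-driven pipeline. This is only an illustration under stated assumptions, not Occam's actual code: the Message shape, the function names, and the toy keyword lookup standing in for the NLP stage are all hypothetical.

```python
from collections import Counter
from dataclasses import dataclass, field
from queue import Queue


@dataclass
class Message:
    """Hypothetical message shape; the real Occam message format is not given."""
    url: str
    text: str
    topics: dict = field(default_factory=dict)


# Toy stand-in for the NLP stage: a keyword lookup instead of ML / inference.
TOPIC_KEYWORDS = {"golf": "Golf", "fashion": "Fashion"}


def extract(msg):
    """Extract stage: normalise the text into lowercase tokens."""
    return [tok.strip(".,!?").lower() for tok in msg.text.split()]


def nlp(tokens):
    """NLP stage: score each topic by its share of matched tokens."""
    hits = Counter(TOPIC_KEYWORDS[t] for t in tokens if t in TOPIC_KEYWORDS)
    total = sum(hits.values())
    return {topic: count / total for topic, count in hits.items()}


def analyse(trend_counts, topics):
    """Analysis stage: incrementally update global per-topic counts."""
    trend_counts.update(topics.keys())


def run_workflow(inbox, outbox, trend_counts):
    """Drain the inbox queue, process each message, and queue a reply."""
    while not inbox.empty():
        msg = inbox.get()
        msg.topics = nlp(extract(msg))
        analyse(trend_counts, msg.topics)
        outbox.put(msg)


inbox, outbox, trends = Queue(), Queue(), Counter()
inbox.put(Message("http://example.com/1", "Golf, golf and fashion."))
run_workflow(inbox, outbox, trends)
reply = outbox.get()
```

Each stage is a plain function, so a real deployment could swap the keyword lookup for a trained model or run several workers against the same queue.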
Inform API
• REST based
• Queue for high-volume content exchange
• Returns data in RDF, XML, or JSON
• All Content has a URI
• All Inform Topics have URIs (can be dereferenced)
• Insert Content, Update Content, Delete Content
• Login / Logout
• Change Status of Content (Published, Unpublished)
• Content can be “GET”
– Associated Topics (Subjects and Entities) returned
– Include scores
• Search Inform Topics
• Semantic Search
– Simplified queries (not full SPARQL)
– Typical Query: Get Content of Type “Article” about “Barack Obama” ranked by score
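The typical query above (content of type “Article” about “Barack Obama”, ranked by score) can be illustrated over in-memory records. The slides do not give the API's endpoints or parameters, so nothing below is the Inform API itself: the Content record, the query function, and the URIs and scores are all made up for the sketch.

```python
from dataclasses import dataclass


@dataclass
class Content:
    uri: str            # every content item has a URI
    content_type: str   # e.g. "Article", "Photo", "Video"
    topic_scores: dict  # topic name -> relevance score


def query(items, content_type, topic):
    """Return items of one type about one topic, highest score first."""
    matches = [
        c for c in items
        if c.content_type == content_type and topic in c.topic_scores
    ]
    return sorted(matches, key=lambda c: c.topic_scores[topic], reverse=True)


# Made-up records; URIs and scores are illustrative only.
ITEMS = [
    Content("urn:example:a1", "Article", {"Barack Obama": 0.4}),
    Content("urn:example:a2", "Article", {"Barack Obama": 0.9, "Golf": 0.2}),
    Content("urn:example:p1", "Photo", {"Barack Obama": 0.8}),
    Content("urn:example:a3", "Article", {"Golf": 0.7}),
]

results = query(ITEMS, "Article", "Barack Obama")
print([c.uri for c in results])
```

The query filters on type and topic and then sorts by the per-topic score, which is why it needs no general graph query language.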
Inform API (2)
Related Content
– Articles, Messages, Photos, Videos, Questions, Web
AdContext™ (new) – URL → IAB Topics + Inform Topics
VideoContext™ (new) – URL → Inform Topics – Related Videos
InterestGraph (new) – Parameters: user-id / session-id → Inform Topics
Personalized AdContext™ (new) – URL + session-id / user-id (anonymized) → IAB Topics + Inform Topics
AdContext™: IAB Ad Standards
The IAB (Interactive Advertising Bureau) standard returns a set of metadata about a website, webpage, or section of a webpage to assist advertising within web content.
Defines how a Topic may be associated with web content.
Defines a set of standard upper level Topics such as “Science”, “Sports”, and “Business”, and mid-level Topics such as “Golf” and “Fashion”. These are tier-1 and tier-2.
Inform has aligned the IAB Topics with Inform’s Topics. Inform can deliver more specific Topics (the full set of Inform Topics) as “tier-3” IAB Topics.
The AdContext™ service returns this metadata. Ad Networks may use the service to assist in ad selection.
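The tier alignment described above can be pictured as a lookup from an Inform Topic to its IAB tier-1 and tier-2 Topics, with the more specific Inform Topic carried as tier-3. The sketch below assumes a tiny alignment table; the AdContextTopic record, the ad_context function, and the example topics and scores are hypothetical and do not reflect the service's actual response format.

```python
from typing import NamedTuple, Optional


class AdContextTopic(NamedTuple):
    tier1: str
    tier2: Optional[str]
    tier3: Optional[str]  # the more specific Inform Topic, if any
    score: float


# Hypothetical alignment table: Inform Topic -> (IAB tier-1, IAB tier-2).
ALIGNMENT = {
    "Tiger Woods": ("Sports", "Golf"),
    "Golf": ("Sports", "Golf"),
    "Sports": ("Sports", None),
}


def ad_context(inform_topics):
    """Map scored Inform Topics onto IAB tiers, best score first."""
    out = []
    for topic, score in inform_topics.items():
        if topic not in ALIGNMENT:
            continue  # no IAB alignment known for this topic
        tier1, tier2 = ALIGNMENT[topic]
        # Only topics more specific than their IAB tiers become tier-3.
        tier3 = topic if topic not in (tier1, tier2) else None
        out.append(AdContextTopic(tier1, tier2, tier3, score))
    return sorted(out, key=lambda t: t.score, reverse=True)


context = ad_context({"Tiger Woods": 0.9, "Golf": 0.6, "Knitting": 0.3})
```

An ad network that only understands tier-1 and tier-2 can ignore the tier-3 field, while one that wants finer targeting can use it.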
Semantic Ad Selection may improve yield 2X – 5X (as per various external studies).
Aside: rNews RDFa Standard
rNews: embedding metadata in online news
rNews is a proposed standard for using RDFa to annotate news-specific metadata in HTML documents. The rNews proposal has been developed by the IPTC, a consortium of the world's major news agencies, news publishers and news industry vendors. rNews is currently in draft form and the IPTC welcomes feedback on how to improve the standard in the rNews Forum.
http://dev.iptc.org/rNews
Why? SEO, Rich Snippets, reduced “scraper” error, better metadata.
The Inform API returns rNews metadata ready to embed in news articles (in testing).
Publisher Customer Example:
Inform automatically tags entities (people, places, companies, and organizations) and provides related topics, articles, and media
The Related News Widget pulls in the most relevant and recent articles from within the New York Daily News Archive
Customer Example:
Inform’s tags can be brought together in numerous ways to create a richer experience for consumers
Inform also generates highly engaging and relevant slideshows
Demo Inform API w/Facebook
How to connect Inform to the social graph?
Inform Topics mapped to Wikipedia Pages and thus to other Concepts – including the Facebook “Like” Graph
Interest Graph
• Inform Topics
– 4,000+ Subjects in Hierarchy (SKOS)
– 320,000+ Entities
– Wikipedia Pages
– Wikipedia Categories
– Inform “same-as” links to Wikipedia
• 1 Million+ Monthly Unique Users
• ~1 Billion content pieces total (Forum Messages, Replies, Photos, Videos)
• 150K new content pieces per day
• 1 Million+ Page Views per Day
• ~5 Million ads served per Day
Goal: Link Users to Topics for selection of content and ads
Personalization Signals
• Content is “about” a Topic (subject or entity)
• User submits Content (“write”): Message, Reply, Photo, Video, Question, …
• User reads Content (“view”): Message, Reply, Photo, Video, Question
Trends / Global Aggregation:
• Importance Metric
• Bursty / Velocity
• Sentiment (“:-)”, “LOL”, …): “Like” the topic? “Dislike” the topic? Context?
– e.g. a user dislikes a Football Team, so “likes” to hear when they lose (negative sentiment)
• Other features…
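The “write” and “view” signals above link a user to Topics through the content involved. A minimal sketch of that accumulation follows; the relative weights for write and view, the event tuple layout, and the function name are assumptions for illustration, not values from the talk.

```python
from collections import defaultdict

# Assumed weights: writing about a topic counts more than viewing it.
SIGNAL_WEIGHT = {"write": 3.0, "view": 1.0}


def user_topic_weights(events, content_topics):
    """Accumulate user -> topic weights from (user, action, content_id) events.

    content_topics maps content_id -> {topic: "aboutness" score}.
    """
    weights = defaultdict(lambda: defaultdict(float))
    for user, action, content_id in events:
        for topic, about in content_topics.get(content_id, {}).items():
            weights[user][topic] += SIGNAL_WEIGHT[action] * about
    return weights


content_topics = {
    "msg1": {"Gymnastics": 1.0},
    "msg2": {"Gymnastics": 0.5, "Nike": 0.5},
}
events = [
    ("jjb", "write", "msg1"),
    ("jjb", "view", "msg2"),
    ("jjb", "view", "msg2"),
]
weights = user_topic_weights(events, content_topics)
```

Sentiment or burstiness could enter the same loop as extra multipliers on each contribution.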
Interest Graph Algorithms
Criteria:
• Near Real-Time
• Highly parallel to allow for scaling
• Fuzzy Data, Flexible data model
Implementation:
• General Graph Representation: Node Weights, Edge Weights, Node Types, Edge Types
• Graph walk to extract a User’s Interest Graph
• Parallel Message-Passing Algorithms for Graph Analysis: Importance, PageRank, Centrality, Spreading Activation; Pregel-like implementation (Signal/Collect)
• Add Graph Analytics to Workflow
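Spreading activation over a typed, weighted graph can be sketched in a few lines. The version below runs sequentially and only mimics the signal and collect phases of a Pregel-like system; the graph contents, the decay factor, and the step count are assumptions, and the production implementation described in the talk is parallel.

```python
from collections import defaultdict

# Edges: source -> list of (target, edge_type, weight). All values are made up.
GRAPH = {
    "user:jjb": [("topic:Gymnastics", "writes_about", 1.0),
                 ("topic:Nike", "views", 0.5)],
    "topic:Gymnastics": [("topic:Olympics", "related", 0.8)],
    "topic:Nike": [("topic:Sneakers", "related", 0.6)],
}


def spread_activation(graph, start, steps=2, decay=0.5):
    """Spread activation outward from `start` for a fixed number of steps.

    Each step has a signal phase (active nodes emit along out-edges) and a
    collect phase (targets sum what they receive), as in Signal/Collect.
    """
    total = defaultdict(float)
    frontier = {start: 1.0}
    for _ in range(steps):
        incoming = defaultdict(float)
        for node, activation in frontier.items():          # signal
            for target, _edge_type, weight in graph.get(node, []):
                incoming[target] += activation * weight * decay
        for node, value in incoming.items():               # collect
            total[node] += value
        frontier = incoming
    return dict(total)


interest = spread_activation(GRAPH, "user:jjb")
ranked = sorted(interest, key=interest.get, reverse=True)
```

Because each node only reads its incoming signals and writes its own total, the signal and collect phases can be distributed across workers without shared state.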
Neighborhood around JJB User
Niketalk User Interest Graph (local)
Without global importance metric:
Niketalk User Interest Graph (global)
With global importance metric: Recommendations can be made reflecting the shifting interests of the global community.
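One way to picture the difference between the local and global views is a weighted blend of a user's own topic weights with a community-wide importance score. The talk does not state how the two are combined, so the linear blend, the alpha parameter, and the example scores below are assumptions.

```python
def blend(local, global_importance, alpha=0.7):
    """Rank topics by alpha * local interest + (1 - alpha) * global importance.

    Both inputs map topic -> score in [0, 1]; missing entries count as 0.
    """
    topics = set(local) | set(global_importance)
    scores = {
        t: alpha * local.get(t, 0.0) + (1 - alpha) * global_importance.get(t, 0.0)
        for t in topics
    }
    # Sort by descending score, then by name so ties are deterministic.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


local = {"Sneakers": 0.9, "Basketball": 0.4}
global_importance = {"Basketball": 0.8, "Olympics": 1.0}

local_only = blend(local, global_importance, alpha=1.0)
blended = blend(local, global_importance, alpha=0.5)
```

With alpha set to 1.0 the ranking reflects only the user's own history; lowering it lets topics that are currently important to the whole community surface even if the user has not touched them yet.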
Example Yuku Forum - Gymnastics
ForumFind – “laboratory”
ForumFind – Topic, Ad, Content
ForumFind – MyForumFind (user: jjb2 )
Interest Graph – User Insights
• “Everybody Lies” (“House” TV Show)
– The only way to know the user’s interests is to have an implicit channel that detects interests without affecting user behavior
• People have broad / dynamic interests
• People read “trash” – e.g. everyone reads Celebrity Gossip, if convenient / no one is looking
• Global Data can be used to make recommendations – no surprise, but nice to have confirmation
• People move on – “Likes” need to expire
• Recommendations for content and ads can be implemented in a highly dynamic and parallel fashion, running in real time with reasonable resources, using graph analysis
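The point that “Likes” need to expire can be illustrated with exponential decay of each interest signal. The talk gives no expiry rule, so the half-life, the cut-off threshold, and the signal tuple layout here are assumptions chosen only to make the idea concrete.

```python
import math

HALF_LIFE_DAYS = 30.0   # assumed; the talk does not give an expiry period
MIN_WEIGHT = 0.05       # assumed cut-off below which an interest is dropped


def decayed(weight, age_days, half_life=HALF_LIFE_DAYS):
    """Halve the weight for every `half_life` days since the signal."""
    return weight * math.pow(0.5, age_days / half_life)


def current_interests(signals, now_day):
    """Sum decayed signals per topic and drop topics that have faded out.

    signals is a list of (topic, weight, day_observed) tuples.
    """
    totals = {}
    for topic, weight, day in signals:
        totals[topic] = totals.get(topic, 0.0) + decayed(weight, now_day - day)
    return {t: w for t, w in totals.items() if w >= MIN_WEIGHT}


signals = [("Gymnastics", 1.0, 0), ("Nike", 1.0, 170), ("Gymnastics", 0.5, 60)]
interests = current_interests(signals, now_day=180)
```

In this example the older Gymnastics signals have decayed below the cut-off by day 180, while the recent Nike signal still carries most of its weight.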
Interest Graph – Conclusion
• Using a User’s Graph of Interests can dramatically improve the user’s engagement – data is still being gathered within Inform as to the percentage increase, but so far very encouraging numbers!
• The Inform Service can be used to implement a more personalized content and ad experience with minimal implementation effort.
• Talk to me about using our API!
Example CMS Integration
Published Article: