danbri 2nd review wp3 challenges
Post on 08-Apr-2018
8/6/2019 Danbri 2nd Review WP3 Challenges
WP3
Challenges & Hybrid models
Dan Brickley, VUA
Pro-netics & BBC
-
Overview
Challenges for TV in Social Web
theory and practice of our hybrid approach
3 Interconnected problems: Privacy, Sparsity and Heterogeneity
What we built (and why)
different kinds of recommender, and ways of integrating them
Plans and options for final developments
-
Theory and Practice
89 05 2 9
00 88 8 6
23 97 9 8
-
More likely ...
09 00 0 9
00 88 0 0
00 97 0 8
-
TV preference data is very sparse!
Even for a single service (e.g. Netflix), data is overwhelmingly sparse
For NoTube's open systems, challenges multiply:
often no global view, only per-user data
many ways of identifying the same content item
many ways of identifying the same user
never mind other entities (actors, directors, ...)
-
Challenges: Sparsity, Fragmentation
Content identifiers (WP1)
Wikipedia/DBpedia URLs? Freebase?
RottenTomatoes.com, IMDB.com, broadcaster IDs
Social Web interoperability
Bob's on Facebook, Charlie's on Twitter
negotiating access to non-public data (OAuth)
reconciling metadata models, rating models
-
Fragmentation by site
-
A hybrid approach to sparsity
Find patterns and paths in factual data
Collaborative filtering - from bulk rating data
Experiments with big data (e.g. Twitter crawl)
Models for combining recommenders
Strategies for inferring sameAs links
...or grouping items together (by series, brand)
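To illustrate the idea of inferring sameAs links, here is a minimal sketch. It assumes that pairwise equivalences between identifiers have already been found by some matching step, and only shows how such pairs collapse into groups that each stand for one content item. The identifiers and the `group_same_as` helper are hypothetical and are not taken from the NoTube code.

```python
from collections import defaultdict


def group_same_as(pairs):
    """Group identifiers connected by sameAs-style links (union-find)."""
    parent = {}

    def find(x):
        # Register unseen identifiers as their own root.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps trees shallow
            x = parent[x]
        return x

    for a, b in pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a  # merge the two groups

    groups = defaultdict(set)
    for ident in parent:
        groups[find(ident)].add(ident)
    return sorted(sorted(group) for group in groups.values())


# Hypothetical identifiers for two shows seen on three sites.
PAIRS = [
    ("dbpedia:Show_A", "imdb:tt0000001"),
    ("imdb:tt0000001", "broadcaster:a1"),
    ("dbpedia:Show_B", "broadcaster:b2"),
]

for group in group_same_as(PAIRS):
    print(group)
```

The same grouping step could be reused for coarser keys such as series or brand, by feeding it pairs of episodes that share one.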
-
Challenge: Privacy
TV preferences are very personal data
Relevant standards (OAuth) are new
deployed widely in Social Web during NoTube
slower adoption in TV and broadcast world
We can use OAuth to request permission to read a user's closed data (e.g. Facebook Likes)
this limits the ability to find general trends across an entire audience (except public data - Twitter?)
-
Diversity and Fragmentation
Diversity of the Web
reading lists: bookcrossing, librarything, amazon
music on last.fm, spotify, ...
news sites, social networks, blogs ...
How to integrate while respecting privacy?
Good news: OAuth deployment growing & social sites expose their recommendations
Bad news: user-by-user data makes large-scale analysis of trends harder
-
OAuth? RDFa?
OAuth lets sites negotiate access with users
e.g., Facebook knows lots of movies I like.
NoTube can use OAuth to ask me to share that data with TV services
RDFa data from movie pages (IMDB, Rotten Tomatoes) is consumed at Facebook
This makes certain pages attractive as content identifiers, a taste graph alongside the social graph
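As a rough illustration of the "ask me to share that data" step, the sketch below builds the URL a TV service would send the user to. It assumes an OAuth 2.0 authorization-code style request; the slides do not say which OAuth version or provider endpoints NoTube used, so the endpoint, client id, redirect URI and scope shown here are all hypothetical.

```python
import secrets
from urllib.parse import parse_qs, urlencode, urlparse


def build_authorization_url(authorize_endpoint, client_id, redirect_uri, scope):
    """Return (url, state) for sending a user to a provider's consent page."""
    # Random state value, checked again when the user is redirected back.
    state = secrets.token_urlsafe(16)
    query = urlencode({
        "response_type": "code",
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "scope": scope,
        "state": state,
    })
    return f"{authorize_endpoint}?{query}", state


# All values below are placeholders, not real endpoints or credentials.
url, state = build_authorization_url(
    "https://social.example/oauth/authorize",
    "tv-service-demo",
    "https://tv.example/callback",
    "user_likes",
)
print(url)
```

The user grants or refuses access on the provider's own page, which is why the data stays per-user rather than giving a global view.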
-
RDFa in IMDB and RottenTomatoes HTML
Aggregated by Facebook (and then, by us...)
-
What we built
Main WP3 work: beancounter and pattern recommender
Aggregate, normalize and merge social Web activity streams, then match against enriched TV metadata to produce recommendations
We also have a Mahout-based collaborative filtering recommender, with item-to-item recommendations based on bulk ratings data
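For readers unfamiliar with item-to-item collaborative filtering, here is a minimal sketch of the general technique. It is plain Python using cosine similarity over co-rating users, not Mahout's implementation, and the ratings data is invented for illustration.

```python
from math import sqrt

# Hypothetical bulk ratings: user -> {item: rating}.
RATINGS = {
    "u1": {"A": 5, "B": 4},
    "u2": {"A": 4, "B": 5, "C": 1},
    "u3": {"C": 5, "D": 4},
}


def item_vectors(ratings):
    """Invert user -> item ratings into item -> user rating vectors."""
    vectors = {}
    for user, items in ratings.items():
        for item, rating in items.items():
            vectors.setdefault(item, {})[user] = rating
    return vectors


def cosine(v, w):
    """Cosine similarity between two sparse rating vectors."""
    dot = sum(v[u] * w[u] for u in v.keys() & w.keys())
    norm = sqrt(sum(x * x for x in v.values())) * sqrt(sum(x * x for x in w.values()))
    return dot / norm if norm else 0.0


def similar_items(item, ratings, top_n=3):
    """Rank other items by similarity to `item`, dropping zero scores."""
    vectors = item_vectors(ratings)
    scores = [
        (other, cosine(vectors[item], vec))
        for other, vec in vectors.items()
        if other != item
    ]
    scores = [(other, score) for other, score in scores if score > 0]
    return sorted(scores, key=lambda pair: (-pair[1], pair[0]))[:top_n]


print(similar_items("A", RATINGS))
```

Items with no raters in common score zero and are dropped, which is exactly where sparse data hurts this approach.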
-
LOD challenges
Linked Open Data for TV is new
datasets evolving, changing
quality varies
modelling styles vary
lumpy, uneven coverage
Pattern recommender finds paths
from items in user profile to new content
handles variation between Linked Data sources
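The path-finding idea can be sketched as a bounded breadth-first search over triples. This is only an illustration of the general approach, not the NoTube pattern recommender itself; the triples, the `prog:` and `person:` prefixes and the hop limit are assumptions made for the example.

```python
from collections import deque

# Hypothetical triples: (subject, predicate, object).
TRIPLES = [
    ("prog:A", "lmdb:actor", "person:X"),
    ("prog:B", "lmdb:actor", "person:X"),
    ("prog:B", "lmdb:director", "person:Y"),
    ("prog:C", "lmdb:director", "person:Y"),
]


def build_index(triples):
    """Index triples so that links can be followed in both directions."""
    index = {}
    for s, p, o in triples:
        index.setdefault(s, []).append((p, o))
        index.setdefault(o, []).append((p, s))
    return index


def find_paths(triples, start, is_target, max_hops=2):
    """Return paths [node, predicate, node, ...] from start to target nodes."""
    index = build_index(triples)
    paths = []
    queue = deque([(start, [start])])
    while queue:
        node, path = queue.popleft()
        if len(path) // 2 == max_hops:  # path holds nodes and predicates
            continue
        for predicate, neighbour in index.get(node, []):
            if neighbour in path:
                continue  # never revisit a node on the same path
            new_path = path + [predicate, neighbour]
            if is_target(neighbour):
                paths.append(new_path)  # targets are recorded, not expanded
            else:
                queue.append((neighbour, new_path))
    return paths


for path in find_paths(TRIPLES, "prog:A", lambda n: n.startswith("prog:")):
    print(" -> ".join(path))
```

Because the search only follows whatever predicates are present, it does not depend on every source modelling the data the same way.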
-
Content Pattern-based Recommendations
Paths in Linked Open Data
Diversity & Serendipity measures
-
Participation Pattern
Person X played role Y in TV program Z
194,649 lmdb:actor triples
53,180 lmdb:director triples
28,549 lmdb:writer triples
1,262 lmdb:film_story_contributor triples
-
Influence Pattern
Person X influenced by person Y (direct)
Person X and Y influenced by person Z (indirect)
6,562 dbpedia:influenced triples
11,776 dbpedia:influencedBy triples
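The direct and indirect cases can be sketched as two small lookups over normalised pairs. This is an illustration under stated assumptions: the people are invented, and treating dbpedia:influenced as the inverse of dbpedia:influencedBy is an assumption made here for the example rather than something the slides state.

```python
# Hypothetical triples using the two predicates named on the slide.
TRIPLES = [
    ("person:X", "dbpedia:influencedBy", "person:Z"),
    ("person:Z", "dbpedia:influenced", "person:Y"),
    ("person:W", "dbpedia:influencedBy", "person:X"),
]


def normalise(triples):
    """Reduce both predicates to (person, influencer) pairs."""
    pairs = set()
    for s, p, o in triples:
        if p == "dbpedia:influencedBy":
            pairs.add((s, o))
        elif p == "dbpedia:influenced":
            pairs.add((o, s))  # assumed inverse direction
    return pairs


def direct(pairs, person):
    """Direct pattern: everyone who influenced `person`."""
    return sorted(inf for p, inf in pairs if p == person)


def indirect(pairs, person):
    """Indirect pattern: others who share an influencer with `person`."""
    influencers = {inf for p, inf in pairs if p == person}
    return sorted({p for p, inf in pairs if inf in influencers and p != person})


pairs = normalise(TRIPLES)
print(direct(pairs, "person:X"))
print(indirect(pairs, "person:X"))
```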
-
Analysis of Patterns in Dataset
Dataset (BBC EPG metadata):
12,777 programmes (7,756 with title enrichment)
1,260 brands, i.e. unique titles (401 enriched)
35,227 person names in metadata (19,394 enriched)
9,315 unique person names in metadata (4,590 enriched)

                                           # items
recommendations                              1,266
- Individual brands                            411
paths                                       17,001
- with linkedmdb:actor                      15,257
- with linkedmdb:director                    1,155
- with linkedmdb:writer                        569
- with linkedmdb:film_story_contributor         20

                                           # items
recommendations                                222
- Individual brands                            100
paths
- influencedBy (all)                         1,202
- influencedBy (unique)                        521
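A quick way to read these figures is as enrichment coverage. The sketch below only recomputes ratios from the numbers on the slide; reading "enriched" as "successfully matched to external data" is an interpretation, not something the slide states.

```python
# (total, enriched) counts copied from the slide.
DATASET = {
    "programmes": (12777, 7756),
    "brands": (1260, 401),
    "person names": (35227, 19394),
    "unique person names": (9315, 4590),
}

# Path counts per predicate, also from the slide.
PATHS_BY_PREDICATE = {
    "linkedmdb:actor": 15257,
    "linkedmdb:director": 1155,
    "linkedmdb:writer": 569,
    "linkedmdb:film_story_contributor": 20,
}


def coverage(total, enriched):
    """Percentage of entries that were enriched, to one decimal place."""
    return round(100 * enriched / total, 1)


for label, (total, enriched) in DATASET.items():
    print(f"{label}: {coverage(total, enriched)}% enriched")

# The per-predicate breakdown accounts for every reported path.
print(sum(PATHS_BY_PREDICATE.values()) == 17001)
```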
-
Collaborative filtering
(item similarity measures from bulk ratings data)
-
Hybrid models:
factual paths and statistical similarity
(and not to mention @wossy is on Twitter with 1 million followers...)
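One simple way to combine factual paths with statistical similarity is a weighted blend of the two scores. The slides do not say how NoTube combines them, so the sketch below is only one possible model: it assumes both recommenders emit scores normalised to the range 0 to 1, and the weight and example scores are hypothetical.

```python
def hybrid_scores(path_scores, cf_scores, weight=0.5):
    """Blend path-based and collaborative-filtering scores per item.

    `weight` is the share given to the path-based score; an item missing
    from one recommender contributes 0 for that side.
    """
    items = path_scores.keys() | cf_scores.keys()
    blended = {
        item: weight * path_scores.get(item, 0.0)
        + (1 - weight) * cf_scores.get(item, 0.0)
        for item in items
    }
    return sorted(blended.items(), key=lambda pair: (-pair[1], pair[0]))


# Hypothetical normalised scores from the two recommenders.
PATH_SCORES = {"prog:B": 1.0, "prog:C": 0.5}
CF_SCORES = {"prog:C": 0.8, "prog:D": 0.6}

for item, score in hybrid_scores(PATH_SCORES, CF_SCORES):
    print(item, round(score, 2))
```

An item supported by both kinds of evidence can outrank one that is strong on a single side, which is the practical appeal of a hybrid when either source alone is sparse.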
-
Status
We can show a standards-based system that
integrates TV preference data from diverse Web
matches this with enriched TV metadata
finds graph patterns linking users to content
integrates with classic recommender approaches
builds on open source (ClioPatria, Mahout)
supports real-time multi-screen exploration
-
Plans and challenges
Richer integration between components
currently this occurs in the application; can we exploit LOD patterns prior to Mahout analysis?
Polish & packaging; more patterns and rules
Track and influence evolving standards (W3C)
Work-in-progress with big data analysis - what kinds of TV links are shared by the kind of people who follow @stephenfry on Twitter?