making sense of users' web activities
DESCRIPTION
Keynote at the Personal Semantic Data (PSD) workshop, collocated with EKAW 2010
TRANSCRIPT
![Page 1: Making sense of users' Web activities](https://reader035.vdocuments.us/reader035/viewer/2022070317/5561fc34d8b42a25488b4f16/html5/thumbnails/1.jpg)
Making sense of Users’ Web activities
Mathieu d'Aquin
Knowledge Media Institute, The Open University, UK
![Page 2: Making sense of users' Web activities](https://reader035.vdocuments.us/reader035/viewer/2022070317/5561fc34d8b42a25488b4f16/html5/thumbnails/2.jpg)
A bit of sci-fi to start with
“… from people who are afraid that someone else knows information that they don’t and is gaining an unfair advantage by it. For all the claims one hears about the liberating impact of the data-net, the truth is that it wished on most of us a brand-new reason for paranoia”
John Brunner,
The Shockwave Rider, 1975
![Page 3: Making sense of users' Web activities](https://reader035.vdocuments.us/reader035/viewer/2022070317/5561fc34d8b42a25488b4f16/html5/thumbnails/3.jpg)
What we don’t know that they know
Simple important things:
What are all the websites that know my e-mail address?
And more complex important things…
What does amazon.co.uk or the website of my favorite airline know about me?
![Page 4: Making sense of users' Web activities](https://reader035.vdocuments.us/reader035/viewer/2022070317/5561fc34d8b42a25488b4f16/html5/thumbnails/4.jpg)
Is this Personal Information Management?
• Yes, but…
• Looking at individual user’s information exchange and more generally activities on the Web
• This is:
– Big
– Heterogeneous
– Distributed
– Fragmented
– Sometimes implicit
• And hard to collect!
![Page 5: Making sense of users' Web activities](https://reader035.vdocuments.us/reader035/viewer/2022070317/5561fc34d8b42a25488b4f16/html5/thumbnails/5.jpg)
So, what do we do?
Unrestricted monitoring of information exchange on the Web by an individual user
![Page 6: Making sense of users' Web activities](https://reader035.vdocuments.us/reader035/viewer/2022070317/5561fc34d8b42a25488b4f16/html5/thumbnails/6.jpg)
[Diagram: Local Web Agents (e.g., browser) ⇄ Local Logging Proxy ⇄ External Web Sites, with HTTP Requests and HTTP Responses exchanged on both sides; the proxy produces Web Exchange RDF Logs]
![Page 7: Making sense of users' Web activities](https://reader035.vdocuments.us/reader035/viewer/2022070317/5561fc34d8b42a25488b4f16/html5/thumbnails/7.jpg)
<REQUEST RDF:ABOUT="#REQUEST-1257949232709-1257949233757"> <STARTEDAT>1257949232709</STARTEDAT> <ENDEDAT>1257949233757</ENDEDAT> <ORIGIN RDF:RESOURCE="127.0.0.1" /> <ONPORT>80</ONPORT> <TOHOST RDF:RESOURCE="API.FACEBOOK.COM" /> <METHOD RDF:RESOURCE="POST"/> <TOURL RDF:RESOURCE="HTTP://API.FACEBOOK.COM/RESTSERVER.PHP" /> <HTTPVERSION RDF:RESOURCE="HTTP-1.1" /> <HOST RDF:RESOURCE="API.FACEBOOK.COM" /> <CONTENT-TYPE RDF:RESOURCE="APPLICATION--X-WWW-FORM-URLENCODED" /> <USER-AGENT RDF:RESOURCE="MOZILLA--5.0_(MACINTOSH;_U;_INTEL_MAC_OS_X;_EN)_APPLEWEBKIT--526.9+_(KHTML._LIKE_GECKO)_ADOBEAIR--1.5.2" /> <REFERER RDF:RESOURCE="APP:--TWEETDECK.SWF" /> <X-FLASH-VERSION RDF:RESOURCE="10.0.32.18" /> <ACCEPT RDF:RESOURCE="*--*" /> <ACCEPT-LANGUAGE RDF:RESOURCE="EN-US" /> <ACCEPT-ENCODING RDF:RESOURCE="GZIP._DEFLATE" /> <COOKIE RDF:RESOURCE= "__QCA=1239783354-42963995-12118014;___UTMA=87286159.357565716.1239892196.1252686326.1257582307.16;___UTMZ=87286159.1257582307.16.16.UTMCCN= (REFERRAL)|UTMCSR=FACEBOOK.COM|UTMCCT=--TOS.PHP|UTMCMD=REFERRAL;_C_USER=605559235;_CUR_MAX_LAG=2;_DATR=1239398136-0711BF1215821A9C58848BF0FFD0020EC8450CFA7154B9E228C29;_LSD=P3ZPN;_LXE=METM.DAQUIN%40VIRGIN.NET;_LXS=3;_S_VSN_FACEBOOKPOC_1=9874874320812" /> <CONTENT-LENGTH RDF:RESOURCE="984" /> <CONNECTION RDF:RESOURCE="KEEP-ALIVE" /> <PROXY-CONNECTION RDF:RESOURCE="KEEP-ALIVE" /> <DATA RDF:RESOURCE="DATA_C22B691F691DABD5AE893B9CB2F8ADD7" /> <RESPONSE> <RESPONSE RDF:ABOUT="#RESPONSE-1257949232709--1257949233757"> <HTTPVERSION RDF:RESOURCE="HTTP--1.0" /> <RESPONSECODE RDF:RESOURCE="200_OK" /> <CACHE-CONTROL RDF:RESOURCE="PRIVATE._NO-STORE._NO-CACHE._MUST-REVALIDATE._POST-CHECK=0._PRE-CHECK=0" /> <CONTENT-TYPE RDF:RESOURCE="APPLICATION--JSON" /> <EXPIRES RDF:RESOURCE="MON._26_JUL_1997_05:00:00_GMT" /> <PRAGMA RDF:RESOURCE="NO-CACHE" /> <CONTENT-ENCODING RDF:RESOURCE="GZIP" /> <CONTENT-LENGTH RDF:RESOURCE="5943" /> <X-CACHE RDF:RESOURCE="MISS_FROM_ROEBURN.OPEN.AC.UK" /> <PROXY-CONNECTION 
RDF:RESOURCE="KEEP-ALIVE" /> <DATA RDF:RESOURCE="DATA_5CCF6054FD0FBA3EE7EB444E178EAF19" /> </RESPONSE></RESPONSE></REQUEST>
2.5 months = 3 million HTTP requests = 100 million RDF triples
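The figures above work out to roughly 33 triples per logged request. As a rough illustration of how one exchange might be flattened into triples, here is a minimal Python sketch. It is not the proxy's actual code: the function name `exchange_to_triples` and the input dictionary layout are assumptions made for this example, while the subject pattern and the property names `startedAt`, `endedAt`, `toHost`, `method` and `toURL` follow the log excerpt.

```python
def exchange_to_triples(exchange):
    """Flatten one logged HTTP exchange into (subject, predicate, object) triples.

    `exchange` is a plain dict (an assumed layout, not the proxy's real format).
    """
    # Subject identifier built from start and end timestamps, as in the log excerpt.
    subject = "#request-{startedAt}-{endedAt}".format(**exchange)
    triples = [
        (subject, "rdf:type", "Request"),
        (subject, "startedAt", exchange["startedAt"]),
        (subject, "endedAt", exchange["endedAt"]),
        (subject, "toHost", exchange["toHost"]),
        (subject, "method", exchange["method"]),
        (subject, "toURL", exchange["toURL"]),
    ]
    # Each HTTP header becomes one more triple, keyed by the lower-cased header name.
    for name, value in sorted(exchange.get("headers", {}).items()):
        triples.append((subject, name.lower(), value))
    return triples


example = {
    "startedAt": 1257949232709,
    "endedAt": 1257949233757,
    "toHost": "api.facebook.com",
    "method": "POST",
    "toURL": "http://api.facebook.com/restserver.php",
    "headers": {"Accept-Language": "en-us", "Connection": "keep-alive"},
}
triples = exchange_to_triples(example)
```

A real log would also attach the response as a nested resource, as the excerpt does; that part is left out to keep the sketch short.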
![Page 8: Making sense of users' Web activities](https://reader035.vdocuments.us/reader035/viewer/2022070317/5561fc34d8b42a25488b4f16/html5/thumbnails/8.jpg)
What this talk is about
Using ontologies and external datasets to
– Generate abstractions of this low level data
– Enrich it with external knowledge and models
– Interpret to give back useful information to the user
![Page 9: Making sense of users' Web activities](https://reader035.vdocuments.us/reader035/viewer/2022070317/5561fc34d8b42a25488b4f16/html5/thumbnails/9.jpg)
HTTP Ontology
Web Site Information
Location Information
Online Activities Ontology
Parameters and Website info.
Personal Information
Trust Model
![Page 10: Making sense of users' Web activities](https://reader035.vdocuments.us/reader035/viewer/2022070317/5561fc34d8b42a25488b4f16/html5/thumbnails/10.jpg)
HTTP Ontology
• Built bottom-up from the data
• Can help infer simple things from it
• And answer questions through SPARQL queries
[Diagram: HTTP ontology]
Classes and attributes:
– Request: time: DateTime, toURL: URL, referer: URL
– Response: time: DateTime, responseCode: int
– InternetPoint: time: DateTime
– WebHost: domain: String
– WebAgent: ID: String
– DataFile: ID: String
– DataFormat: MineID: String
Relations between classes: hasResponse, origine, toHost, User-Agent, Content (shown twice), Content-Type (shown twice)
![Page 11: Making sense of users' Web activities](https://reader035.vdocuments.us/reader035/viewer/2022070317/5561fc34d8b42a25488b4f16/html5/thumbnails/11.jpg)
Simple examples
Requests per User Agent
Requests per time of day
Requests per Host
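The three charts above are simple aggregations over the logged requests. The sketch below shows one way such counts could be computed in Python. It assumes the requests have already been loaded into dictionaries; the keys `userAgent`, `toHost` and `startedAt` are illustrative names modelled on the log excerpt, with `startedAt` in milliseconds since the epoch, and the hour is taken in UTC as a simplifying assumption.

```python
from collections import Counter
from datetime import datetime, timezone


def summarize(requests):
    """Count requests per user agent, per host and per hour of day (UTC)."""
    per_agent = Counter(r["userAgent"] for r in requests)
    per_host = Counter(r["toHost"] for r in requests)
    per_hour = Counter(
        # startedAt is in milliseconds, so divide by 1000 before converting.
        datetime.fromtimestamp(r["startedAt"] / 1000, tz=timezone.utc).hour
        for r in requests
    )
    return per_agent, per_host, per_hour


# Hypothetical sample records, for illustration only.
LOG = [
    {"userAgent": "TweetDeck", "toHost": "api.facebook.com", "startedAt": 1257949232709},
    {"userAgent": "TweetDeck", "toHost": "api.facebook.com", "startedAt": 1257949233757},
    {"userAgent": "Browser", "toHost": "www.youtube.com", "startedAt": 1257949232709 + 3600000},
]
per_agent, per_host, per_hour = summarize(LOG)
```

In the talk these questions are answered with SPARQL queries over the RDF logs; the Python version only illustrates the shape of the aggregation.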
![Page 12: Making sense of users' Web activities](https://reader035.vdocuments.us/reader035/viewer/2022070317/5561fc34d8b42a25488b4f16/html5/thumbnails/12.jpg)
Integrating basic info
Domain name
IP
Location
“What!? What requests have I made to websites in Nigeria? What Data did I send?”
Can be answered in a SPARQL query
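The question above is a join across three kinds of facts: which host a request went to, where that host is located, and what data the request carried. The following Python sketch shows that join over an in-memory list of triples. The property name `toHost` comes from the HTTP ontology slide; `locatedIn` and `content`, the function name and the sample data are assumptions made for this illustration.

```python
def requests_to_country(triples, country):
    """Return (request, host, data) for every request sent to a host in `country`."""
    # Hosts whose resolved location is the given country.
    hosts = {s for s, p, o in triples if p == "locatedIn" and o == country}
    # Data attached to each request (at most one item per request in this sketch).
    data = {s: o for s, p, o in triples if p == "content"}
    return sorted(
        (s, o, data.get(s))
        for s, p, o in triples
        if p == "toHost" and o in hosts
    )


# Hypothetical sample triples, for illustration only.
TRIPLES = [
    ("req1", "toHost", "shop.example.ng"),
    ("req1", "content", "data_c22b"),
    ("req2", "toHost", "www.example.co.uk"),
    ("req2", "content", "data_5ccf"),
    ("req3", "toHost", "shop.example.ng"),
    ("shop.example.ng", "locatedIn", "Nigeria"),
    ("www.example.co.uk", "locatedIn", "United Kingdom"),
]
matches = requests_to_country(TRIPLES, "Nigeria")
```

A request without logged data, such as `req3` here, is still reported, with `None` in the data position.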
![Page 13: Making sense of users' Web activities](https://reader035.vdocuments.us/reader035/viewer/2022070317/5561fc34d8b42a25488b4f16/html5/thumbnails/13.jpg)
More information about websites
• The linked data cloud is full of it.
• Using the domain name to address this information.

CONSTRUCT {<domain_name> ?p ?y}
WHERE {
  {{?x dbpedia:homepage <http://domain_name>}.
   {?x ?p ?y}}
  UNION
  {{?x owl:sameAs ?z}.
   {?x dbpedia:homepage <http://domain_name>}.
   {?x ?p ?y}}
}
![Page 14: Making sense of users' Web activities](https://reader035.vdocuments.us/reader035/viewer/2022070317/5561fc34d8b42a25488b4f16/html5/thumbnails/14.jpg)
Examples
[Figure: graph of linked data retrieved from DBpedia and Freebase for www.youtube.com, www.google-analytics.com and www.google.com. Node labels: Google Services, Entertainment Websites, Video Hosting, Video sharing, Company, Web Analytics, Internet Search Engine, Search Engine, Web Search Engine. Edge labels: subsidiaryOf, type, subject/category, parent, developer, owner.]
![Page 15: Making sense of users' Web activities](https://reader035.vdocuments.us/reader035/viewer/2022070317/5561fc34d8b42a25488b4f16/html5/thumbnails/15.jpg)
Activities
• Can we now understand the user activities?
• Based on website categories and on their parameters:
GET http://uk.search.yahoo.com/beacon/module?p=idiocracy&url=http%3A%2F%2Fwww.imdb.com%2Ftitle%2Ftt0387808%2F
POST format=JSON&method=fql%2Emultiquery&api%5Fkey=51d350e8d92da1f5623512a9e801da2b&v =1%2E0&queries=%7B%22query2%22%3A%22SELECT%20app%5Fid%2C%20display%5Fname%20FROM %20application%20WHERE%20app%5Fid%20IN%20%28SELECT%20app%5Fid%20FROM%20%23query1 %29%22%2C%22query1%22%3A%22SELECT%20post%5Fid%2C%20source%5Fid%2C%20created%5Ftime%2C%20updated%5Ftime%2C%20actor%5Fid%2C%20target%5Fid%2C%20app%5Fid%2C%20message%2C%20attachment%2C%20comments%2C%20likes%2C%20permalink%2C%20attribution%2C%20type%20FROM%20stream%20WHERE%20filter%5Fkey%20IN%20%28SELECT%20filter%5Fkey%20FROM%20stream%5Ffilter%20WHERE%20uid%20%3D%20605559235%20AND%20type%20%3D%20%27newsfeed%27%29%20AND%20%28created%5Ftime%20%3E%3D%201257443596%29%20AND%20%28%28created%5Ftime%20%3E%201257945423%29%20OR%20%28updated%5Ftime%20%21%3D%20created%5Ftime%29%29%20ORDER%20BY%20created%5Ftime%20DESC%20LIMIT%20200%22%7D&call%5Fid=12565739074246102&sig=01a13a72825ed83ed6d23bdf2791ad1a&session%5Fkey=be312ffdf9b9e1a5ec6c5768%2D605559235
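The parameters in requests like the two above are URL-encoded, so the first step in interpreting them is to decode them. Here is a minimal Python sketch using the standard library's `urllib.parse`, applied to the GET example; the function name `request_parameters` is an assumption made for this illustration.

```python
from urllib.parse import parse_qs, urlsplit


def request_parameters(url):
    """Split a logged URL into host, path and decoded query parameters."""
    parts = urlsplit(url)
    # parse_qs decodes percent-escapes and returns a list of values per name.
    return parts.hostname, parts.path, parse_qs(parts.query)


GET_URL = (
    "http://uk.search.yahoo.com/beacon/module"
    "?p=idiocracy&url=http%3A%2F%2Fwww.imdb.com%2Ftitle%2Ftt0387808%2F"
)
host, path, params = request_parameters(GET_URL)
```

Here `params["p"]` holds the search keyword and `params["url"]` the result that was followed. A POST body such as the one above can be decoded the same way by passing it directly to `parse_qs`.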
![Page 16: Making sense of users' Web activities](https://reader035.vdocuments.us/reader035/viewer/2022070317/5561fc34d8b42a25488b4f16/html5/thumbnails/16.jpg)
Activities in an Ontology
• Derived in a bottom-up way from categories of activities/request
• Can be used to characterize overall activities, individual activities or correlations between activities
[Diagram: class hierarchy rooted at ActivityBasedRequest. Classes shown: ExplicitActivity, ImplicitActivity, ReportToAnalytics, CheckStatusFeed, AutoCheckStatusFeed, ManualCheckStatusFeed, Search, SearchVideo, SearchImage, FollowLink, FollowSearchResult]
![Page 17: Making sense of users' Web activities](https://reader035.vdocuments.us/reader035/viewer/2022070317/5561fc34d8b42a25488b4f16/html5/thumbnails/17.jpg)
Example Activity: Search
Search keywords
![Page 18: Making sense of users' Web activities](https://reader035.vdocuments.us/reader035/viewer/2022070317/5561fc34d8b42a25488b4f16/html5/thumbnails/18.jpg)
Example Activity: Search
inverseOf(link-followed, referer)
InformationalSearch = SearchRequest and min 2 link-followed
NavigationalSearch = SearchRequest and =1 link-followed
Prominence of Navigational Searches
IndexedSite = exists referer NavigationalSearch
IndexedSite(?x), NavigationalSearch(?y), referer(?x, ?y), searchTerm(?y, ?z) → IndexedWithKeyword(?x, ?z)
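The definitions above can be read procedurally: count how many requests name a given search as their referer, then label the search accordingly. The Python sketch below does this over a list of request dictionaries. The dictionary keys (`id`, `is_search`, `term`, `referer`, `host`) and the function names are assumptions for this illustration, and `referer` is assumed to be already resolved to the identifier of the referring request rather than a raw URL.

```python
from collections import Counter


def classify_searches(requests):
    """Label searches by the number of links followed from them."""
    # link-followed is the inverse of referer: count requests per referring id.
    followed = Counter(r["referer"] for r in requests if r.get("referer"))
    labels = {}
    for r in requests:
        if not r.get("is_search"):
            continue
        count = followed[r["id"]]
        if count >= 2:
            labels[r["id"]] = "InformationalSearch"
        elif count == 1:
            labels[r["id"]] = "NavigationalSearch"
        # Searches with no followed link match neither definition.
    return labels


def indexed_with_keyword(requests, labels):
    """Apply the rule: a site reached from a navigational search is indexed with its term."""
    terms = {
        r["id"]: r["term"]
        for r in requests
        if labels.get(r["id"]) == "NavigationalSearch"
    }
    return {
        (r["host"], terms[r["referer"]])
        for r in requests
        if r.get("referer") in terms
    }


# Hypothetical sample requests, for illustration only.
REQUESTS = [
    {"id": "s1", "is_search": True, "term": "semantic web", "referer": None, "host": "search.example"},
    {"id": "r1", "referer": "s1", "host": "a.example"},
    {"id": "r2", "referer": "s1", "host": "b.example"},
    {"id": "s2", "is_search": True, "term": "ekaw 2010", "referer": None, "host": "search.example"},
    {"id": "r3", "referer": "s2", "host": "ekaw.example"},
    {"id": "s3", "is_search": True, "term": "unused", "referer": None, "host": "search.example"},
]
labels = classify_searches(REQUESTS)
indexed = indexed_with_keyword(REQUESTS, labels)
```

In the talk this classification is obtained by ontology reasoning rather than by procedural code; the sketch only mirrors the logic of the axioms.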
![Page 19: Making sense of users' Web activities](https://reader035.vdocuments.us/reader035/viewer/2022070317/5561fc34d8b42a25488b4f16/html5/thumbnails/19.jpg)
Example Activity: Search
Search Keywords → OpenCalais → Topics of interest
![Page 20: Making sense of users' Web activities](https://reader035.vdocuments.us/reader035/viewer/2022070317/5561fc34d8b42a25488b4f16/html5/thumbnails/20.jpg)
Personal data exchange
Request Parameters
Personal Information (Profile)
Trust Model
![Page 21: Making sense of users' Web activities](https://reader035.vdocuments.us/reader035/viewer/2022070317/5561fc34d8b42a25488b4f16/html5/thumbnails/21.jpg)
Tool used to create mappings between the data sent to websites (from the logs, on the right) and the user profile (on the left), effectively reconstructing the profile from the data.
![Page 22: Making sense of users' Web activities](https://reader035.vdocuments.us/reader035/viewer/2022070317/5561fc34d8b42a25488b4f16/html5/thumbnails/22.jpg)
User profile re-constructed from Web activities
• 36 attributes, 1,080 values, sent to 123 domains
• A model of what piece of personal information was sent where (can answer the questions)
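Such a model can be pictured as an index from profile attributes to the domains that received them. The Python sketch below builds that index by matching decoded request parameters against known profile values. It is a simplified stand-in for the mapping tool, not a description of it: the function name `who_knows_what`, the exact-match strategy and the sample data are all assumptions made for this illustration.

```python
from collections import defaultdict
from urllib.parse import parse_qsl


def who_knows_what(profile, exchanges):
    """Map each profile attribute to the set of domains that received its value.

    profile: attribute name -> value
    exchanges: iterable of (domain, URL-encoded parameter string)
    """
    # Compare case-insensitively, since sites may normalise values differently.
    value_to_attribute = {value.lower(): attr for attr, value in profile.items()}
    sent = defaultdict(set)
    for domain, encoded in exchanges:
        for _name, value in parse_qsl(encoded):
            attribute = value_to_attribute.get(value.lower())
            if attribute is not None:
                sent[attribute].add(domain)
    return dict(sent)


# Hypothetical profile and exchanges, for illustration only.
PROFILE = {"email": "jane@example.org", "city": "Milton Keynes"}
EXCHANGES = [
    ("shop.example", "email=jane%40example.org&lang=en"),
    ("news.example", "q=weather&user=Jane%40Example.org"),
    ("maps.example", "near=Milton+Keynes"),
]
sent = who_knows_what(PROFILE, EXCHANGES)
```

With this index, the opening question "What are all the websites that know my e-mail address?" becomes a lookup of `sent["email"]`.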
![Page 23: Making sense of users' Web activities](https://reader035.vdocuments.us/reader035/viewer/2022070317/5561fc34d8b42a25488b4f16/html5/thumbnails/23.jpg)
What that tells us about trust
Taking the point of view of an external observer, we can derive an observed model of trust and criticality of data
– If this piece of data is critical to you and you give it to Bob, you must trust Bob
– If you give this piece of data to many untrusted people, you probably don’t consider it critical
![Page 24: Making sense of users' Web activities](https://reader035.vdocuments.us/reader035/viewer/2022070317/5561fc34d8b42a25488b4f16/html5/thumbnails/24.jpg)
Formally
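• Trust in a domain = the maximum criticality of the data it received:

$$\mathrm{trust}(w) = \max_{d \,\in\, \mathrm{received}(w)} \mathrm{criticality}(d)$$

• Criticality of a piece of data:

$$\mathrm{criticality}(d) = \frac{1}{1 + \sum_{w \,\in\, \mathrm{recipients}(d)} \bigl(1 - \mathrm{trust}(w)\bigr)}$$

• These two formulas are interdependent, so they are treated as a sequence, with initial values of 0.5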
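The two interdependent definitions can be computed by iterating them from the initial value 0.5. The Python sketch below is one possible reading, not the talk's implementation: the function name `observed_trust`, the fixed number of iterations, and the choice to recompute trust from the freshly updated criticality values within each round are assumptions.

```python
def observed_trust(sent, iterations=20, initial=0.5):
    """Iterate the trust and criticality definitions.

    sent: data item -> set of domains that received it
    Returns (trust per domain, criticality per data item).
    """
    domains = {domain for recipients in sent.values() for domain in recipients}
    trust = {domain: initial for domain in domains}
    criticality = {item: initial for item in sent}
    for _ in range(iterations):
        # Criticality drops as the item is given to more, less trusted domains.
        criticality = {
            item: 1.0 / (1.0 + sum(1.0 - trust[d] for d in recipients))
            for item, recipients in sent.items()
        }
        # A domain is trusted as much as the most critical item it received.
        trust = {
            domain: max(
                criticality[item]
                for item, recipients in sent.items()
                if domain in recipients
            )
            for domain in domains
        }
    return trust, criticality


# Hypothetical example: the e-mail address went to three sites, the card number to one.
SENT = {
    "email": {"a.example", "b.example", "c.example"},
    "card": {"a.example"},
}
trust, criticality = observed_trust(SENT)
```

In this example the card number, given to a single site, ends up more critical than the widely shared e-mail address, and the site that received it ends up more trusted than the others.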
![Page 25: Making sense of users' Web activities](https://reader035.vdocuments.us/reader035/viewer/2022070317/5561fc34d8b42a25488b4f16/html5/thumbnails/25.jpg)
Interacting with the model
Expose the user to his own behavior as observed, so that he can try to align it with his intended behavior
![Page 26: Making sense of users' Web activities](https://reader035.vdocuments.us/reader035/viewer/2022070317/5561fc34d8b42a25488b4f16/html5/thumbnails/26.jpg)
Demo
![Page 27: Making sense of users' Web activities](https://reader035.vdocuments.us/reader035/viewer/2022070317/5561fc34d8b42a25488b4f16/html5/thumbnails/27.jpg)
Conclusion
• A first set of tools exploiting logs of personal Web activity
• Demonstrates the need for ways to abstract and interpret activity data, to support Web users
• Demonstrates the ability of semantic technologies, ontologies and enrichment through external data to provide such support
![Page 28: Making sense of users' Web activities](https://reader035.vdocuments.us/reader035/viewer/2022070317/5561fc34d8b42a25488b4f16/html5/thumbnails/28.jpg)
So much more to do
Can I collect this tweet? From HTTPS? From my mobile phone?
Can I link it to where I am?
To what I’m doing? To what I have been doing?
To the abstract of the presentation? To the slides on SlideShare.net? To blogs mentioning it?
Can I cope with the scale of all this information? Can I decide what to share? Can I store all this securely? Can I get usable access to it? Can I learn something from it?