20 years of web search – where to next?
DESCRIPTION
I've given this talk several times over the last year and it's time to retire it and move on to something else. Ambitious title, but it has certainly been fun to talk to general audiences about how, contrary to popular belief, search isn't a solved problem.
TRANSCRIPT
![Page 1: 20 years of web search – where to next?](https://reader033.vdocuments.us/reader033/viewer/2022060201/5599b3f61a28abd30b8b46ae/html5/thumbnails/1.jpg)
20 years of Web search – where to next?
Mark Sanderson
![Page 2: 20 years of web search – where to next?](https://reader033.vdocuments.us/reader033/viewer/2022060201/5599b3f61a28abd30b8b46ae/html5/thumbnails/2.jpg)
Who am I?
•Professor at RMIT University, Melbourne
•Before
–Professor at University of Sheffield
–Researcher at UMass Amherst
–Researcher at University of Glasgow
•Online
–@IR_oldie
–http://www.seg.rmit.edu.au/mark/
![Page 3: 20 years of web search – where to next?](https://reader033.vdocuments.us/reader033/viewer/2022060201/5599b3f61a28abd30b8b46ae/html5/thumbnails/3.jpg)
Overview of talk
•A bit of history
![Page 4: 20 years of web search – where to next?](https://reader033.vdocuments.us/reader033/viewer/2022060201/5599b3f61a28abd30b8b46ae/html5/thumbnails/4.jpg)
A bit of history
Early IR
![Page 5: 20 years of web search – where to next?](https://reader033.vdocuments.us/reader033/viewer/2022060201/5599b3f61a28abd30b8b46ae/html5/thumbnails/5.jpg)
Before IR systems
• There were libraries
–The search engine of the day
• Organise information using a subject catalogue
–Sort cards by author
–Sort cards by title
–Sort cards by subject
– How to do this?
![Page 6: 20 years of web search – where to next?](https://reader033.vdocuments.us/reader033/viewer/2022060201/5599b3f61a28abd30b8b46ae/html5/thumbnails/6.jpg)
At the same time…
•While librarians were coping with the information explosion
–Could machines help?
–Could computers help?
•Very brief history of machines and computers for search
![Page 7: 20 years of web search – where to next?](https://reader033.vdocuments.us/reader033/viewer/2022060201/5599b3f61a28abd30b8b46ae/html5/thumbnails/7.jpg)
Machines doing IR
CS&IT - ISAR
![Page 8: 20 years of web search – where to next?](https://reader033.vdocuments.us/reader033/viewer/2022060201/5599b3f61a28abd30b8b46ae/html5/thumbnails/8.jpg)
As we may think – Bush 1945
–http://www.youtube.com/watch?v=c539cK58ees
![Page 9: 20 years of web search – where to next?](https://reader033.vdocuments.us/reader033/viewer/2022060201/5599b3f61a28abd30b8b46ae/html5/thumbnails/9.jpg)
Computers doing IR
•Holmstrom 1948
![Page 10: 20 years of web search – where to next?](https://reader033.vdocuments.us/reader033/viewer/2022060201/5599b3f61a28abd30b8b46ae/html5/thumbnails/10.jpg)
Information Retrieval
•Calvin Mooers, 1950
![Page 11: 20 years of web search – where to next?](https://reader033.vdocuments.us/reader033/viewer/2022060201/5599b3f61a28abd30b8b46ae/html5/thumbnails/11.jpg)
The web arrived
•1993
–JumpStation
–Jonathon Fletcher, University of Stirling
•Steinberg, Wired, 1996
–“Information retrieval is really only a problem for people in library science - if some computer scientists were to put their heads together, they'd probably have it solved before lunchtime.”
![Page 12: 20 years of web search – where to next?](https://reader033.vdocuments.us/reader033/viewer/2022060201/5599b3f61a28abd30b8b46ae/html5/thumbnails/12.jpg)
Where are we now
Google/Bing
![Page 13: 20 years of web search – where to next?](https://reader033.vdocuments.us/reader033/viewer/2022060201/5599b3f61a28abd30b8b46ae/html5/thumbnails/13.jpg)
Where we are now
•Google/Bing
–Text matching
–Fields, anchor
–PageRank
–Query logs
–…
–Massive machine learning
–Evaluation
–Continual tuning
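One ingredient from the list above, PageRank, can be illustrated with a small power-iteration sketch. This is a toy version under stated assumptions, not how Google or Bing compute it: the function name `pagerank`, the `toy_web` graph, the damping value of 0.85 (the commonly cited default) and the fixed iteration count are all chosen here for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Toy PageRank by power iteration.

    `links` maps each page to the list of pages it links to.
    All names and parameter values here are illustrative assumptions.
    """
    # Collect every page that appears as a source or a target.
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}

    for _ in range(iterations):
        # Rank held by pages with no outgoing links is shared evenly.
        dangling = sum(rank[p] for p in pages if not links.get(p))
        new_rank = {
            p: (1.0 - damping) / n + damping * dangling / n for p in pages
        }
        # Each page passes an equal share of its rank to every page it links to.
        for page, targets in links.items():
            if not targets:
                continue
            share = damping * rank[page] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical four-page web: "c" is linked to by every other page.
toy_web = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
ranks = pagerank(toy_web)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 3))
```

In this toy graph the page with the most incoming links ends up ranked highest, and the page nobody links to ends up lowest. A production engine combines a signal like this with the other items on the slide rather than using it alone.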
![Page 14: 20 years of web search – where to next?](https://reader033.vdocuments.us/reader033/viewer/2022060201/5599b3f61a28abd30b8b46ae/html5/thumbnails/14.jpg)
Search is solved?
•Common perception
![Page 15: 20 years of web search – where to next?](https://reader033.vdocuments.us/reader033/viewer/2022060201/5599b3f61a28abd30b8b46ae/html5/thumbnails/15.jpg)
Favourable conditions
•Most content wants to be found
•Most content is redundant
•Huge income
•Queries often repeated
•Users can read & write
![Page 16: 20 years of web search – where to next?](https://reader033.vdocuments.us/reader033/viewer/2022060201/5599b3f61a28abd30b8b46ae/html5/thumbnails/16.jpg)
Where to next?
• Immediate problems
• Immediate opportunities
•Medium term challenges
•Longer term challenges
![Page 17: 20 years of web search – where to next?](https://reader033.vdocuments.us/reader033/viewer/2022060201/5599b3f61a28abd30b8b46ae/html5/thumbnails/17.jpg)
Immediate
Problems/opportunities
![Page 18: 20 years of web search – where to next?](https://reader033.vdocuments.us/reader033/viewer/2022060201/5599b3f61a28abd30b8b46ae/html5/thumbnails/18.jpg)
Problematic summaries
![Page 19: 20 years of web search – where to next?](https://reader033.vdocuments.us/reader033/viewer/2022060201/5599b3f61a28abd30b8b46ae/html5/thumbnails/19.jpg)
Less favourable?
•People struggle to search
•People miss retrieved documents
–Fine for redundant content; what if just one?
![Page 20: 20 years of web search – where to next?](https://reader033.vdocuments.us/reader033/viewer/2022060201/5599b3f61a28abd30b8b46ae/html5/thumbnails/20.jpg)
Problem searching
•Limited redundancy
–Little money
–Enterprise search
–Refinding
–Content doesn’t want to be found
–Patent search
–Legal document search (e-Discovery)
![Page 21: 20 years of web search – where to next?](https://reader033.vdocuments.us/reader033/viewer/2022060201/5599b3f61a28abd30b8b46ae/html5/thumbnails/21.jpg)
Enterprise search
•Many problems in this space
•Each collection is different
–Each search engine needs to be different
•No money
•“Why doesn’t it work like Google?”
![Page 22: 20 years of web search – where to next?](https://reader033.vdocuments.us/reader033/viewer/2022060201/5599b3f61a28abd30b8b46ae/html5/thumbnails/22.jpg)
Significant problem
•Think carefully before including search in your user interface
![Page 23: 20 years of web search – where to next?](https://reader033.vdocuments.us/reader033/viewer/2022060201/5599b3f61a28abd30b8b46ae/html5/thumbnails/23.jpg)
At RMIT
•Trying to scope the problem
–If we find a search solution that works on one set of documents, does it work on others?
–Not as much as was thought
–A lot worse than was thought
![Page 24: 20 years of web search – where to next?](https://reader033.vdocuments.us/reader033/viewer/2022060201/5599b3f61a28abd30b8b46ae/html5/thumbnails/24.jpg)
Major immediate challenge
•Do search as well as Google no matter what the collection, and do it without all their money
![Page 25: 20 years of web search – where to next?](https://reader033.vdocuments.us/reader033/viewer/2022060201/5599b3f61a28abd30b8b46ae/html5/thumbnails/25.jpg)
Favourable conditions
•Most content wants to be found
•Most content is redundant
•Huge income
•Queries often repeated
•Users can read & write
![Page 26: 20 years of web search – where to next?](https://reader033.vdocuments.us/reader033/viewer/2022060201/5599b3f61a28abd30b8b46ae/html5/thumbnails/26.jpg)
Refinding
• Interviewed 45 searchers about common retrieval tasks
–70% relate to refinding
•Starting funded investigation in this area.
![Page 27: 20 years of web search – where to next?](https://reader033.vdocuments.us/reader033/viewer/2022060201/5599b3f61a28abd30b8b46ae/html5/thumbnails/27.jpg)
Ephemeral & archival content
•Archival
–Traditional web search
–Web pages, news, documents
–Coarse grained
•Ephemeral
–Social media
–Blogs, social networks, micro-blogs
–Fine grained
![Page 28: 20 years of web search – where to next?](https://reader033.vdocuments.us/reader033/viewer/2022060201/5599b3f61a28abd30b8b46ae/html5/thumbnails/28.jpg)
Interface of the two
•Summarising ephemeral content
–Only just starting
–Lots of opportunities to specialise
•How can ephemeral content aid search of archival content?
–RMIT is changing the representation of archival content based on ephemeral data.
–Early days, but promising
![Page 29: 20 years of web search – where to next?](https://reader033.vdocuments.us/reader033/viewer/2022060201/5599b3f61a28abd30b8b46ae/html5/thumbnails/29.jpg)
Medium term
![Page 30: 20 years of web search – where to next?](https://reader033.vdocuments.us/reader033/viewer/2022060201/5599b3f61a28abd30b8b46ae/html5/thumbnails/30.jpg)
Diffuse information
![Page 31: 20 years of web search – where to next?](https://reader033.vdocuments.us/reader033/viewer/2022060201/5599b3f61a28abd30b8b46ae/html5/thumbnails/31.jpg)
Harder information needs
•Entertain me
•Contextual search
•SWIRL 2012
–http://www.cs.rmit.edu.au/swirl12/
![Page 32: 20 years of web search – where to next?](https://reader033.vdocuments.us/reader033/viewer/2022060201/5599b3f61a28abd30b8b46ae/html5/thumbnails/32.jpg)
Longer term
![Page 33: 20 years of web search – where to next?](https://reader033.vdocuments.us/reader033/viewer/2022060201/5599b3f61a28abd30b8b46ae/html5/thumbnails/33.jpg)
Longer term
•Long queries
•Spoken search
•The internet for everyone
![Page 34: 20 years of web search – where to next?](https://reader033.vdocuments.us/reader033/viewer/2022060201/5599b3f61a28abd30b8b46ae/html5/thumbnails/34.jpg)
Users have complex needs
•Poorly expressed in short queries
–Experts issue multiple short queries
–Experts use search engine operators
•Can we build search engines to handle complex queries?
![Page 35: 20 years of web search – where to next?](https://reader033.vdocuments.us/reader033/viewer/2022060201/5599b3f61a28abd30b8b46ae/html5/thumbnails/35.jpg)
New application area?
•Speech search
–Hands free
–Eyes free
•Seen in the movies, but really?
![Page 36: 20 years of web search – where to next?](https://reader033.vdocuments.us/reader033/viewer/2022060201/5599b3f61a28abd30b8b46ae/html5/thumbnails/36.jpg)
Users?
•Visually impaired
–Together they could form a country
•Other potential uses
–In car searching
–Walking in a city
![Page 37: 20 years of web search – where to next?](https://reader033.vdocuments.us/reader033/viewer/2022060201/5599b3f61a28abd30b8b46ae/html5/thumbnails/37.jpg)
Internet for everyone
– http://www.onbile.com/info/how-many-people-use-smartphones-in-the-world/
![Page 38: 20 years of web search – where to next?](https://reader033.vdocuments.us/reader033/viewer/2022060201/5599b3f61a28abd30b8b46ae/html5/thumbnails/38.jpg)
Internet users?
•2013
–2 billion now
•2015
–4 billion mostly on mobiles (Baird Equity Research)
![Page 39: 20 years of web search – where to next?](https://reader033.vdocuments.us/reader033/viewer/2022060201/5599b3f61a28abd30b8b46ae/html5/thumbnails/39.jpg)
Implications?
•More languages
•More users who struggle with literacy
–Search engines assume you can read and write
![Page 40: 20 years of web search – where to next?](https://reader033.vdocuments.us/reader033/viewer/2022060201/5599b3f61a28abd30b8b46ae/html5/thumbnails/40.jpg)
Search engines
There is a lot still to do