Post on 18-Dec-2015
Open Search
David Wolber
Overview
Proliferation of Digital Libraries
Metasearch and Fixed Lists of Sources
Open Search Architecture
PublishMe for P2P Knowledge Sharing
Webtop Metasearch Clients
Contributors
Michael Kepe
Igor Ranitovic
Iman Sadreddin
Senior Team ’03
Ken Chong
Rudd Stevens
Colin Bean
Tim Chan
Julian Chan
Pooja Garg
Information Source Explosion
Google, Amazon APIs
Internet Archive
Technorati
– The World Live Web
Domain Specific:
– ACM Digital Library for CS
– Lexis-Nexis for law
– MLA for literature
End-User Created Digital Libraries
Personal Web (shared Google desktop)
Personal Web Neighborhood
Topic-Specific Personal Crawlers
Ordinary people creating search engines as easily as web pages
[Diagram labels: PersonalWeb, 1st Degree, 2nd Degree, Nth Degree]
Subsets of the Web
Motivation for Small, Independent Subsets of the Web
Avoid information being channeled through a single portal: Googleopoly
Google does no evil, but…
– Censorship in China
– Creeping level of commercialization
– Unregulated manipulation of secret ranking algorithms (see PageKing case)
Other media are lost; this is the last frontier
Little support for using multiple search engines
Metasearch
Help users discover and use digital libraries
Send queries to multiple, selected search engines
Filter, process, and unify results
A9.com – Amazon’s metasearch
Web Services Basis
[Diagram comparing the Web Page Model (server, html) with the Web Service Model (server, software, xml)]
How does metasearch evolve?
New Digital library
Metasearch clients discover it
Metasearch programmers write adaptor/scraper
User can access within metasearch
SLOWLY…
Goal: Automate the Process
Metasearch engines should provide users with up-to-date lists of existing digital libraries
Digital libraries should be able to register and be made immediately available to all Metasearch clients.
Metasearch and library development are independent.
What is Necessary?
Standard Search API
– So metasearch clients can use polymorphism to access sources.

for each source s in sourceList {
    searchEngine.endPointUrl = s.endPointUrl;
    resultList += searchEngine.keywordSearch(keywords)
}

Search API Registry
– Metasearch clients can get dynamic list
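The polymorphic loop above can be sketched in Python. This is a minimal illustration, not the project's code: `SearchEngine`, `InMemorySource`, and `metasearch` are hypothetical names, and the in-memory source stands in for a remote library reached through its endpoint URL.

```python
from dataclasses import dataclass
from typing import Protocol


class SearchEngine(Protocol):
    """Hypothetical standard search API that every source implements."""

    def keyword_search(self, keywords: str) -> list[str]: ...


@dataclass
class InMemorySource:
    """Toy source standing in for a remote digital library (assumption)."""

    end_point_url: str
    documents: list[str]

    def keyword_search(self, keywords: str) -> list[str]:
        # A document matches when it contains every query term as a substring.
        terms = keywords.lower().split()
        return [d for d in self.documents if all(t in d.lower() for t in terms)]


def metasearch(sources: list[SearchEngine], keywords: str) -> list[str]:
    """Send one query to every source and concatenate the results."""
    results: list[str] = []
    for source in sources:
        # The client never needs to know which concrete source it is calling.
        results += source.keyword_search(keywords)
    return results


source_list = [
    InMemorySource("http://example.org/a", ["Open search protocol", "Metasearch clients"]),
    InMemorySource("http://example.org/b", ["Search registry design"]),
]
print(metasearch(source_list, "registry"))
```

Because the client depends only on the shared operation, adding a source means appending to `source_list`; no client code changes.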
Web Service Standards
WSDL – Web Services Description Language
SOAP – Simple Object Access Protocol
UDDI – Universal Description, Discovery, and Integration
Standards on top of Web Services
WSDL, SOAP, and UDDI are the basis for standards in many domains.
– e.g., a standard MS initiated for securities information providers
Businesses agree on a standard, then client applications can use polymorphism and new businesses can register services.
In this case, we want a cross-domain standard.
Open Search Architecture
Open Search Protocol (OSP)
– Cross-Domain: Search-related services
– Not just keyword search, but citations, authorOf, etc.
Open Search Registry
– Based on UDDI
– Can add customization, e.g., parsing to find out which search operations are implemented.
– Web and web service access
Open Search Architecture
[Diagram: OSP-conforming libraries register their services with the OS Registry; OSP metasearch clients obtain the source list from it.]
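The register-then-list interaction between libraries, registry, and clients might look like the following sketch. It is an in-memory stand-in, not a UDDI client; `ServiceEntry`, `OpenSearchRegistry`, and the operation names are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class ServiceEntry:
    """Hypothetical registration record for one OSP-conforming library."""

    name: str
    end_point_url: str
    operations: frozenset[str]  # which search operations the library implements


@dataclass
class OpenSearchRegistry:
    """In-memory stand-in for the registry; the real one is described as UDDI-based."""

    _entries: dict[str, ServiceEntry] = field(default_factory=dict)

    def register(self, entry: ServiceEntry) -> None:
        # Re-registering the same endpoint replaces the earlier entry.
        self._entries[entry.end_point_url] = entry

    def source_list(self, operation: Optional[str] = None) -> list[ServiceEntry]:
        """Return all sources, or only those implementing `operation`."""
        entries = list(self._entries.values())
        if operation is not None:
            entries = [e for e in entries if operation in e.operations]
        return sorted(entries, key=lambda e: e.name)


registry = OpenSearchRegistry()
registry.register(ServiceEntry("Library A", "http://example.org/a",
                               frozenset({"keywordSearch", "citations"})))
registry.register(ServiceEntry("Library B", "http://example.org/b",
                               frozenset({"keywordSearch"})))
print([e.name for e in registry.source_list("citations")])
```

A client that only needs keyword search can take the whole list, while one that draws citation trails can filter for libraries that implement that operation.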
User Can Choose Sources
Open Search Protocol
Keyword search
Citations (inward links, outward links)
AuthorOf and other associative operations…
Metadata object results based on Dublin Core
Restriction object for “advanced search” stuff
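One way to picture the operations listed above is as an abstract interface plus two small data types. This is a sketch under assumptions: the method names, the `Restriction` fields, and `TinyLibrary` are invented for illustration and are not taken from the OSP WSDL; `Metadata` uses a few Dublin Core element names (identifier, title, creator, date).

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Metadata:
    """Result record using a subset of Dublin Core element names."""
    identifier: str
    title: str
    creator: str
    date: str


@dataclass(frozen=True)
class Restriction:
    """Hypothetical 'advanced search' filter; fields are illustrative."""
    creator: Optional[str] = None
    max_results: int = 10


class OpenSearchService(ABC):
    """Assumed shape of the protocol's operations."""

    @abstractmethod
    def keyword_search(self, keywords: str, restriction: Restriction) -> list[Metadata]: ...

    @abstractmethod
    def inward_links(self, identifier: str) -> list[Metadata]: ...

    @abstractmethod
    def outward_links(self, identifier: str) -> list[Metadata]: ...

    @abstractmethod
    def author_of(self, creator: str) -> list[Metadata]: ...


class TinyLibrary(OpenSearchService):
    """Toy in-memory library implementing every operation."""

    def __init__(self, records: list[Metadata], links: set[tuple[str, str]]):
        self._records = {r.identifier: r for r in records}
        self._links = links  # (citing id, cited id) pairs

    def keyword_search(self, keywords, restriction=Restriction()):
        terms = keywords.lower().split()
        hits = [r for r in self._records.values()
                if all(t in r.title.lower() for t in terms)
                and restriction.creator in (None, r.creator)]
        return hits[:restriction.max_results]

    def inward_links(self, identifier):
        return [self._records[s] for s, t in sorted(self._links) if t == identifier]

    def outward_links(self, identifier):
        return [self._records[t] for s, t in sorted(self._links) if s == identifier]

    def author_of(self, creator):
        return [r for r in self._records.values() if r.creator == creator]


paper_a = Metadata("a", "Paper A on search", "Author X", "2001")
paper_b = Metadata("b", "Paper B on search", "Author Y", "2003")
library = TinyLibrary([paper_a, paper_b], {("b", "a")})  # b cites a
print(library.inward_links("a"))
```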
Publishing a Library
• Access OSP WSDL Specification from webtop.cs.usfca.edu
• Generate code in language of choice
• Implement the search operations for the digital library
• Deploy the service
• Register with Open Search registry
Deploying an Open Search Lib.
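[Diagram: the programmer, wsdl2java, the library server, and the Open Search Registry, connected by five numbered steps]
1. OS wsdl
2. wsdl (input to wsdl2java)
3. skeleton code
4. deployed service (on the library server)
5. registration info (to the Open Search Registry)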
Wrapping a Library
Custom search API, e.g., Google API
Open Search Wrapper (located on 3rd party server)
Metasearch Client
1. OSP query (client to wrapper)
2. Custom query (wrapper to custom API)
3. Custom result (custom API to wrapper)
4. OSP result (wrapper to client)
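The four numbered steps amount to an adapter. Here is a minimal sketch: `CustomSearchApi` is a made-up stand-in with its own request and result shape (it is not the Google API), and `OpenSearchWrapper` and `Metadata` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Metadata:
    """Simplified OSP-style result record (assumption)."""
    identifier: str
    title: str


class CustomSearchApi:
    """Stand-in for a proprietary search API with its own conventions."""

    def do_search(self, q: str, start: int, count: int) -> dict:
        pages = [
            {"URL": "http://example.org/1", "pageTitle": "Open search"},
            {"URL": "http://example.org/2", "pageTitle": "Closed search"},
        ]
        hits = [p for p in pages if q.lower() in p["pageTitle"].lower()]
        return {"items": hits[start:start + count]}


class OpenSearchWrapper:
    """Translates OSP-style calls into the custom API and back."""

    def __init__(self, backend: CustomSearchApi):
        self._backend = backend

    def keyword_search(self, keywords: str) -> list[Metadata]:
        # Steps 1 and 2: receive the OSP query and issue the custom query.
        raw = self._backend.do_search(q=keywords, start=0, count=10)
        # Steps 3 and 4: convert the custom result into OSP result records.
        return [Metadata(identifier=item["URL"], title=item["pageTitle"])
                for item in raw["items"]]


wrapper = OpenSearchWrapper(CustomSearchApi())
print(wrapper.keyword_search("open"))
```

The metasearch client sees only `keyword_search`, so a wrapped source is indistinguishable from a library that implements the protocol natively.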
Wrappers Developed at USF
Google
Amazon (sort of)
Internet Archive
Technorati
Feedster
PublishMe
Like Google Desktop, but shared.
Periodically updates inverted index and linkbase on PC
Deploys Web Service on User’s PC
Auto-Registers with Open Search Registry
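The indexing step could work roughly as follows. This is a sketch of a generic inverted index, not PublishMe's implementation; the function names and sample file paths are hypothetical, and the linkbase is omitted.

```python
import re
from collections import defaultdict


def build_inverted_index(documents: dict[str, str]) -> dict[str, set[str]]:
    """Map each lowercase word to the set of document paths containing it."""
    index: dict[str, set[str]] = defaultdict(set)
    for path, text in documents.items():
        for word in re.findall(r"\w+", text.lower()):
            index[word].add(path)
    return dict(index)


def keyword_search(index: dict[str, set[str]], keywords: str) -> list[str]:
    """Return the paths that contain every query word, in sorted order."""
    postings = [index.get(w, set()) for w in re.findall(r"\w+", keywords.lower())]
    if not postings:
        return []
    return sorted(set.intersection(*postings))


# Hypothetical files on the user's PC.
documents = {
    "notes/osp.txt": "Open Search Protocol notes",
    "notes/p2p.txt": "P2P knowledge sharing and search",
}
index = build_inverted_index(documents)
print(keyword_search(index, "open search"))
```

Rebuilding the index on a schedule and exposing `keyword_search` through the deployed web service is what would let a personal collection behave like any other registered source.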
Metasearch with P2P Knowledge Sharing
WEBTOP
Integrating Global and Personal Libraries
Motivation for Sharing Personal Webs
People create knowledge every day when they bookmark, annotate, link, organize, and synthesize.
Communication is a separate step that often doesn’t happen.
Motivation for Sharing Personal Webs
Experts
Collaborative Work
Computers are designed using our brains for a model
Knowledge creation and dissemination separate
Explicit effort required to communicate
Just as we model our word processors on paper.
Additions to OSP for P2P
GetFile
OnLine(ip)
– Handles user starting up
– Dynamic IPs
OffLine
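The OnLine and OffLine additions suggest that the registry tracks where each peer can currently be reached. A sketch under that assumption follows; `PeerRegistry`, the peer identifiers, and the `/opensearch` path are hypothetical, and GetFile is not modeled.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class PeerRegistry:
    """Tracks which peers are reachable and at which address (assumption)."""

    _online: dict[str, str] = field(default_factory=dict)  # peer id -> current IP

    def on_line(self, peer_id: str, ip: str) -> None:
        # Called when a user's machine starts up; with dynamic IPs the
        # address may differ from the previous session, so overwrite it.
        self._online[peer_id] = ip

    def off_line(self, peer_id: str) -> None:
        # Removing an unknown peer is a no-op.
        self._online.pop(peer_id, None)

    def end_point(self, peer_id: str) -> Optional[str]:
        """Return the peer's current service URL, or None when offline."""
        ip = self._online.get(peer_id)
        return None if ip is None else f"http://{ip}/opensearch"


peers = PeerRegistry()
peers.on_line("alice", "192.0.2.10")
peers.on_line("alice", "192.0.2.27")  # restarted with a new dynamic IP
print(peers.end_point("alice"))
peers.off_line("alice")
print(peers.end_point("alice"))
```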
But What About PRIVACY?
The Big Question:
How much of the information hidden within your personal web is hidden due to privacy concerns?
I Want you to be a Search Engine!
Goal: Implement Vannevar Bush’s Association Trails
View a document/thing in context
History of an idea
Thinkmap-like Interface
Association Types
Outward links
Inward links
Similar-Content links
People Links
– author, people referenced in paper
Domain-Specific links
– law citations
– movie-actor
Associations specified by Annotators
Webtop Tree View webtop.cs.usfca.edu
Expanding a Tree
• Bird’s Eye View
• Local/Web files integrated
• Follow different Associative Trails
• Ins of Outs of Ins, etc.
• Siblings
• Weird though, as ins and outs both expand right
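Following a trail such as "ins of outs" can be read as composing link lookups, one per level of the tree. The sketch below assumes a tiny hard-coded link set; `LINKS`, `ins`, `outs`, and `follow` are illustrative names, not Webtop's API.

```python
# Hypothetical link set: each pair is (source page, target page).
LINKS = {("a", "b"), ("a", "c"), ("d", "b")}


def outs(node: str) -> list[str]:
    """Pages that `node` links to (outward links)."""
    return sorted(t for s, t in LINKS if s == node)


def ins(node: str) -> list[str]:
    """Pages that link to `node` (inward links)."""
    return sorted(s for s, t in LINKS if t == node)


def follow(start: str, trail: list) -> list[str]:
    """Apply each association in `trail` in order, one tree level per step."""
    frontier = [start]
    for step in trail:
        next_frontier: list[str] = []
        for node in frontier:
            for neighbor in step(node):
                if neighbor not in next_frontier:  # keep first occurrence only
                    next_frontier.append(neighbor)
        frontier = next_frontier
    return frontier


# "Ins of outs of a": take a's outward links, then everything linking to them.
print(follow("a", [outs, ins]))
```

Each call to `follow` with a longer trail corresponds to expanding the tree one level further to the right.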
Webtop Side Panel View
Project Status
Too many bugs, Dad
Future Work
Open Search Protocol
– In-depth study of existing search APIs
– Provide REST alternative to SOAP
Metasearch development
– Complete and refine existing clients
– Dream up new ones
Thinkmap Graph
Automated Source Selection and Reputation System
Page Ranking
Initiate grass-roots involvement
Future Work: Documents and Things
[Diagram: a resource has associations and annotations; resource types shown are document (html, word, pdf), person, and creative work (film, book).]
Stop talking about Webtop daddy!
webtop.cs.usfca.edu