Post on 21-Dec-2015
Research Problems in Semantic Web Search
Varish Mulwad
Agenda
• Introduction
• Swoogle
• Swoogle’s Competition –
  • Sindice
  • Semantic Web Search Engine (SWSE)
  • Watson
  • Falcon
• Research Problems and Issues with Swoogle
• References
Introduction
Web
Dr. Finin’s FOAF Profile
Your Agent
Possible because the data is in a machine-understandable form such as RDF or OWL
But how will the agent find all this data? Search engines?
Introduction
Figure: Traditional Search Engine Results vs. Semantic Web Search Engine Results
Swoogle
• Swoogle is a crawler-based indexing and retrieval system for the Semantic Web
• Swoogle crawls and discovers documents written in RDF and OWL
• Swoogle classifies a Semantic Web Document (SWD) as –
  • Semantic Web Ontology (SWO) – defines new terms
  • Semantic Web Database (SWDB) – makes assertions about individuals
Swoogle
SWOOGLE DEMO
Swoogle Architecture
Swoogle Architecture
SWD Discovery Component
• Google crawler using the Google web service
  • File types with extensions “.rdf”, “.owl”, “.n3”
  • Google returns at most 1000 results per query
• A focussed crawler
  • Crawls documents within a given website
  • Extension and focus constraints
• A Swoogle crawler
  • Jena-based crawler
  • Explores semantic links between SWDs
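The extension and focus constraints above can be illustrated with a small URL filter. This is a minimal sketch, not Swoogle's actual crawler code: the function names `has_swd_extension`, `within_focus`, and `select_candidates` are hypothetical, and treating "focus" as "same host" is an assumption.

```python
from urllib.parse import urlparse

# File extensions the slides list as candidate Semantic Web documents.
SWD_EXTENSIONS = (".rdf", ".owl", ".n3")


def has_swd_extension(url: str) -> bool:
    """Extension constraint: keep URLs whose path ends in a known SWD extension."""
    return urlparse(url).path.lower().endswith(SWD_EXTENSIONS)


def within_focus(url: str, site: str) -> bool:
    """Focus constraint (assumed here to mean: hosted on the given website)."""
    return urlparse(url).netloc.lower() == site.lower()


def select_candidates(urls, site):
    """Return the URLs that satisfy both constraints, preserving input order."""
    return [u for u in urls if has_swd_extension(u) and within_focus(u, site)]
```

A real focussed crawler would apply such a filter to the links extracted from each fetched page before queueing them.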
Swoogle Architecture
Metadata Creation
• Basic Metadata
  • Encoding – “RDF/XML”, “N-Triple”, “N3”
  • Language – RDF, RDFS, OWL, DAML+OIL
  • OWL Species – OWL-LITE, OWL-DL, OWL-FULL
• Relations among SWDs
  • Reference relationships among SWDs
  • Inter-ontology relationships
Swoogle Architecture
Data analysis component
• Classification of SWD as SWO or SWDB
• Compute rank of SWD
Web-based interface
• Human user interface – http://swoogle.umbc.edu
• Web services using a REST interface
• Agent service
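The slides do not say how the data analysis component decides between SWO and SWDB. One plausible approach, sketched below, is to compare how many terms (classes and properties) a document defines with how many individuals it describes. The function names and the `threshold` value are illustrative assumptions, not values taken from Swoogle.

```python
def ontology_ratio(num_classes: int, num_properties: int, num_individuals: int) -> float:
    """Fraction of defined terms (classes and properties) among everything defined."""
    total = num_classes + num_properties + num_individuals
    if total == 0:
        return 0.0
    return (num_classes + num_properties) / total


def classify_swd(num_classes, num_properties, num_individuals, threshold=0.5):
    """Label a document "SWO" if it mostly defines terms, otherwise "SWDB".

    The threshold is a hypothetical default chosen for illustration only.
    """
    ratio = ontology_ratio(num_classes, num_properties, num_individuals)
    return "SWO" if ratio > threshold else "SWDB"
```

For example, a document that defines one class and populates it with three individuals has a ratio of 0.25 and would be labelled SWDB under this sketch.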
Sindice
• Created at the Digital Enterprise Research Institute (DERI)
• Key features of Sindice include –
  • Sindice collects SWDs and indexes them on resource URIs, Inverse Functional Properties (IFPs) and keywords
  • Sindice uses the Hadoop parallel architecture
Sindice
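Inverse Functional Property (IFP) – an OWL property characteristic: the property’s value uniquely identifies its subject, so the inverse property has a cardinality of at most one (for example, foaf:mbox)
Sindice uses three indexes –
• URI index
• IFP index
• Keyword index
Benefit – faster retrieval of data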
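The three indexes can be pictured as three lookup tables from a key to the documents that mention it. The sketch below is a toy in-memory version under stated assumptions: the class `LookupIndex`, its method names, and the rule "objects starting with http are URIs, everything else is a literal" are hypothetical simplifications, not Sindice's implementation.

```python
from collections import defaultdict


class LookupIndex:
    """Toy in-memory URI / IFP / keyword index over documents."""

    def __init__(self):
        self.by_uri = defaultdict(set)      # resource URI -> document URLs
        self.by_ifp = defaultdict(set)      # (property, value) -> document URLs
        self.by_keyword = defaultdict(set)  # lowercase token -> document URLs

    def add_document(self, doc_url, triples, ifps):
        """Index one document given its (subject, predicate, object) triples.

        `ifps` is the set of predicates to treat as inverse functional.
        """
        for s, p, o in triples:
            self.by_uri[s].add(doc_url)
            if p in ifps:
                # Same (property, value) pair implies the same real-world entity.
                self.by_ifp[(p, o)].add(doc_url)
            if o.startswith(("http://", "https://")):
                self.by_uri[o].add(doc_url)
            elif p not in ifps:
                # Treat remaining objects as literals and index their words.
                for token in o.lower().split():
                    self.by_keyword[token].add(doc_url)
```

With such a structure, two documents that use different URIs for the same person can still be found together through a shared `foaf:mbox` value in the IFP index.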
Sindice
Hadoop architecture is used in the following manner –
• Sindice employs Hadoop/Nutch to distribute the crawling job across multiple machines
• Collected data is stored in HBase, a distributed column-based store
• Large datasets are handled efficiently across the cluster using a MapReduce implementation
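To make the MapReduce idea concrete, the sketch below builds a URI-to-documents index with explicit map, shuffle, and reduce steps. It runs in a single Python process and does not use the Hadoop API; all function names are hypothetical. On a cluster, the framework would run the map and reduce functions on different machines and perform the shuffle itself.

```python
from collections import defaultdict
from itertools import chain


def map_document(doc_url, resource_uris):
    """Map step: emit one (resource URI, document URL) pair per mention."""
    return [(uri, doc_url) for uri in resource_uris]


def shuffle(pairs):
    """Group emitted values by key, as the framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_postings(uri, doc_urls):
    """Reduce step: collapse grouped values into a sorted, de-duplicated posting list."""
    return uri, sorted(set(doc_urls))


def build_uri_index(documents):
    """Run map, shuffle, and reduce in-process over {doc_url: [resource URIs]}."""
    mapped = chain.from_iterable(
        map_document(doc, uris) for doc, uris in documents.items()
    )
    return dict(reduce_postings(k, v) for k, v in shuffle(mapped).items())
```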
Sindice
SINDICE DEMO
SWSE
• Semantic Web Search Engine (SWSE) was also created at the Digital Enterprise Research Institute (DERI)
• SWSE uses a “Multicrawler” – a pipelined architecture for crawling
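The slides give no detail on the pipeline stages, so the following is only a generic illustration of what a pipelined crawl looks like, using Python generators. The stage names (`fetch`, `keep_rdf`, `summarize`) and the RDF/XML check are assumptions, and the stages here run one after another in a single thread, whereas a real pipelined crawler would typically run them concurrently.

```python
def fetch(urls, fetcher):
    """Stage 1: retrieve each URL with the supplied fetcher callable."""
    for url in urls:
        yield url, fetcher(url)


def keep_rdf(pages):
    """Stage 2: pass along only pages whose content looks like RDF/XML."""
    for url, body in pages:
        if "<rdf:RDF" in body:
            yield url, body


def summarize(pages):
    """Stage 3: emit a small record per surviving page."""
    for url, body in pages:
        yield {"url": url, "size": len(body)}


def run_pipeline(urls, fetcher):
    """Chain the stages; each page flows through every stage before the next is fetched."""
    return list(summarize(keep_rdf(fetch(urls, fetcher))))
```

Because each stage only consumes the output of the previous one, a stage can be swapped or moved to another machine without changing the others.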
Watson
• Created at the Knowledge Media Institute (KMi) at the Open University, UK
• Major Design Principles –
• Considers explicit and implicit relations between Ontologies
• Ranking of Ontologies with focus on quality over popularity
Watson
WATSON DEMO
Falcon
• Falcon is a Semantic Web search engine created at the Institute of Web Science in China
• Falcon allows keyword-based queries on:
  • Objects
  • Concepts
  • Documents
• Falcon performs class subsumption reasoning
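Class subsumption reasoning means that a query for a class also matches instances of its subclasses. A minimal sketch, assuming the class hierarchy is given as a plain dictionary of subclass-to-superclasses links, is shown below. The function names and example data are hypothetical and this is not Falcon's implementation.

```python
def superclasses(cls, subclass_of):
    """Return every class that subsumes `cls`, following subclass links transitively."""
    seen = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        for parent in subclass_of.get(current, ()):
            if parent not in seen:  # also guards against cycles in the hierarchy
                seen.add(parent)
                stack.append(parent)
    return seen


def instances_of(target, types, subclass_of):
    """Find objects whose asserted type is `target` or any subclass of it."""
    return sorted(
        obj
        for obj, cls in types.items()
        if cls == target or target in superclasses(cls, subclass_of)
    )
```

For instance, if `Professor` is a subclass of `Faculty` and `Faculty` is a subclass of `Person`, a search for `Person` objects also returns objects typed only as `Professor`.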
Falcon
FALCON DEMO
Summary
Swoogle
• Keyword-based search
• Searches ontologies and instance data
Others
Sindice
• Indexes on URI, IFP, keywords
• Use of Hadoop Architecture
SWSE
• Pipelined Architecture for Crawling
Watson
• Implicit relations between SWDs
Falcon
• Class Subsumption Reasoning
Issues
Crawling
• Swoogle’s crawler runs as a single thread on one machine
• This limits the number of SWDs discovered and revisited
Possible Solutions
• Use of Hadoop architecture
• Use of Grub
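The limitation of a single-threaded crawler can be illustrated by fetching several documents concurrently. The sketch below uses a thread pool on one machine, so it is neither Hadoop nor Grub; it only shows the general idea of parallelising fetches. The function name `crawl_parallel` and the `fetcher` callable are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def crawl_parallel(urls, fetcher, max_workers=4):
    """Fetch URLs concurrently and return {url: result}.

    `fetcher` is any callable taking a URL. Failures are recorded as None
    so that one bad document does not stop the whole crawl.
    """
    urls = list(urls)  # materialise so the input can be iterated twice

    def safe_fetch(url):
        try:
            return fetcher(url)
        except Exception:
            return None

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map returns results in the same order as the input URLs.
        results = pool.map(safe_fetch, urls)
        return dict(zip(urls, results))
```

A distributed setup would go further and spread the URL list across several machines, which is what the Hadoop-based option above refers to.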
Other Issues
• Crawling large structured datasets like DBpedia
• More reasoning
• More services
References
• Li Ding et al., "Swoogle: A Search and Metadata Engine for the Semantic Web", Proceedings of the Thirteenth ACM Conference on Information and Knowledge Management, November 2004.
• P. Mika, G. Tummarello, "Web Semantics in the Clouds", IEEE Intelligent Systems, 23(5), September 2008.
• E. Oren, R. Delbru, M. Catasta, R. Cyganiak, H. Stenzhorn, G. Tummarello, "Sindice.com: A document-oriented lookup index for open linked data", International Journal of Metadata, Semantics and Ontologies, 3(1), 2008.
• Mathieu d’Aquin et al., "Watson: A Gateway for the Semantic Web", Poster session of the European Semantic Web Conference (ESWC), 2007.
• Gong Cheng, Weiyi Ge, Honghan Wu, Yuzhong Qu, "Searching Semantic Web Objects Based on Class Hierarchies", WWW 2008 Workshop on Linked Data on the Web, 2008.
Questions?