keyword searching and browsing in databases using banks seoyoung ahn mar 3, 2005 the university of...
TRANSCRIPT
Keyword Searching and Browsing in Databases using
BANKS
Seoyoung Ahn
Mar 3, 2005
The University of Texas at Arlington
Seoyoung Ahn Keyword Searching and Browsing in Databases using BANKS
Outline
Introduction Database and Query Model Searching for the best answers Browsing features of BANKS Experiment Conclusion
Seoyoung Ahn Keyword Searching and Browsing in Databases using BANKS
Introduction
Search engines on Web have popularized an unstructured querying and browsing Simple and user-friendly Users just type in keywords and follow hyperlink
Relational databases are commonly searched using structured query language Users need to know the schema
Keyword searching techniques cannot be used on data stored in databases It often splits across the tables/tuples due to normalization
Seoyoung Ahn Keyword Searching and Browsing in Databases using BANKS
Introduction(cond..)
BANKS (Browsing And Keyword Searching) a system which enables keyword-based search on
relational databases, together with data and schema browsing
User BANKSsystem
DatabaseHTTP JDBC
Seoyoung Ahn Keyword Searching and Browsing in Databases using BANKS
Introduction(cond..)
BANKS (Browsing And Keyword Searching) a framework for keyword querying of relational
database a novel and efficient heuristic algorithm for executing
keyword queries key features of BANKS system
Seoyoung Ahn Keyword Searching and Browsing in Databases using BANKS
Outline
Introduction Database and Query Model
Informal Model Formal Model Query and Answer Model
Searching for the best answers Browsing features of BANKS Experiment Conclusion
Seoyoung Ahn Keyword Searching and Browsing in Databases using BANKS
Database and Query Model
Informal Model
Model Description
each tuple in db
fk-pk-Link
database
node in the graph
directed edge
directed graph
Seoyoung Ahn Keyword Searching and Browsing in Databases using BANKS
Database and Query Model
The Schema
Seoyoung Ahn Keyword Searching and Browsing in Databases using BANKS
Database and Query Model
A Fragment of the Database
Seoyoung Ahn Keyword Searching and Browsing in Databases using BANKS
Database and Query Model
Informal Model(cond.) An answer to a query should be a subgraph
connecting nodes matching the keywords. The importance of a link depends upon the type of
the link i.e. what relations it connects and on its semantics
Ignoring directionality would cause problems because of “hubs” which are connected to a large numbers of nodes.
Seoyoung Ahn Keyword Searching and Browsing in Databases using BANKS
Database and Query Model
Formal Database ModelNodes and edges
Node Weight : N(u)
Depends on the prestige
Set the node prestige = the indegree of the node
Nodes that have multiple pointers to them get a higher prestige
Node score N = root node weight
+ ∑ leaf node weight
Seoyoung Ahn Keyword Searching and Browsing in Databases using BANKS
Database and Query Model
Formal Database Model (Cond.) Edge Weights
Some pupluar tuples can be connected many other tuples Edge with forward and backward edge weights
Weight of a forward link = the strength of the proximity relationship between two tuples
(set to 1 by default)
Weight of a backward link = indegree of edges pointing to the node
Total edge weight = ∑ edge weights
Edge score E = 1 / Total edge weight
Seoyoung Ahn Keyword Searching and Browsing in Databases using BANKS
Database and Query Model
Formal Database Model (Cond.) Overall relevance score
= Node weights + Edge Weight
Normalize in the range [0,1]
Combine using weighting factor Additive: (1- ) E + N;
multiplicative: E * N
Seoyoung Ahn Keyword Searching and Browsing in Databases using BANKS
Database and Query Model
Query and Answer Model Query
A set of keywords e.g.{k1,k2,…kn}
A set of nodes Si = {S1,S2,…Sn}
Locate nodes matching search terms t1,t2,…tn
Answer Model A rooted directed tree connecting keyword nodes
Relevance score of an answer tree
Relevance scores of it nodes and its edge weight
Seoyoung Ahn Keyword Searching and Browsing in Databases using BANKS
Database and Query Model Answer Model
A rooted directed tree connecting keyword nodes
Multiple answersRanked by proximity + prestige
Proximity edges weights
Prestige indegree of nodes
Relevance score of an answer treeRelevance scores of it nodes and its edge weight
Seoyoung Ahn Keyword Searching and Browsing in Databases using BANKS
Database and Query Model Result of query “sudarshan soumen”
Seoyoung Ahn Keyword Searching and Browsing in Databases using BANKS
Outline
Introduction Database and Query Model Searching for the best answers
Backward expanding search algorithm
Browsing features of BANKS Experiment Conclusion
Seoyoung Ahn Keyword Searching and Browsing in Databases using BANKS
Searching for the best answers Backward expanding search algorithm
Offers a heuristic solution for incrementally computing query results.
Assume that the graph fits in memory Start at leaf nodes each containing a query keyword Run concurrent single source shortest path algorithm from each
such node Traverses the graph edges backwards Confluence of backward paths identify answer tree roots
Output a node whenever it is on the intersection of the sets of nodes reached from each keyword
Answer trees may not be generated in relevance orderInsert answers to a small buffer (heap)
Output highest ranked answer from buffer to user when buffer is full
Seoyoung Ahn Keyword Searching and Browsing in Databases using BANKS
Searching for the best answers
Model (Query : Charuta Sudarshan Roy )
S. Sudarshan
Prasan Roy
writes
author
paper
Charuta
BANKS: Keyword search…
Seoyoung Ahn Keyword Searching and Browsing in Databases using BANKS
Outline
Introduction Database and Query Model Searching for the best answers Browsing features of BANKS Experiment Conclusion
Seoyoung Ahn Keyword Searching and Browsing in Databases using BANKS
Browsing
BANKS system provides A rich interface to browse data stored in a relational
database Automatically generates browsable views of database
relations and query results Schema browsing and data browsing A hyperlink to the referenced tuple Templates for several predefined ways of displaying
data
Seoyoung Ahn Keyword Searching and Browsing in Databases using BANKS
Browsing Data browsing
Seoyoung Ahn Keyword Searching and Browsing in Databases using BANKS
Browsing Schema browsing
Seoyoung Ahn Keyword Searching and Browsing in Databases using BANKS
Outline
Introduction Database and Query Model Searching for the best answers Browsing features of BANKS Experiment Conclusion
Seoyoung Ahn Keyword Searching and Browsing in Databases using BANKS
Error scores vs parameter choices
The rankings are
relatively stable across different choices of parameter values
= 0.2 coupled
with log scaling of
edges weights
does best
Seoyoung Ahn Keyword Searching and Browsing in Databases using BANKS
Outline
Introduction Database and Query Model Searching for the best answers Browsing features of BANKS Experiment Conclusion
Seoyoung Ahn Keyword Searching and Browsing in Databases using BANKS
Conclusion
BANKS system provides an integrated browsing and keyword
querying system for relational databases allows users with no knowledge of database systems
or schema to query and browse relational database with ease