spring 2008mmdb--mirs1 multimedia information retrieval systems
TRANSCRIPT
Spring 2008 MMDB--MIRS 2
MIRS Architecture Should be flexible and extensible
to support diverse applications, query types and contents (features).
MIRSs consist functional modules - added to extend, deleted or repl
aced.
Another characteristic of MIRSs is that they are normally distributed consisting of servers and clients results from the large size of multimedia data
Spring 2008 MMDB--MIRS 3
MIRS Architecture Figure
User Interface
Feature Extractor
Communication Manager
Indexing and Search Engine
Storage Manager
Spring 2008 MMDB--MIRS 4
MIRS architecture user interface
insertion of new multimedia items and retrieval. These items can be stored files or input from variant devices
feature extractor The contents or features of multimedia items are extracted either automa
tically or semi-automatically
communication managers These features and the original items are sent to the server or servers
indexing and retrieval engine At the server(s), the features are organized according to a certain indexi
ng scheme for efficient retrieval
storage manager The indexing information and the original items are stored appropriately.
Spring 2008 MMDB 6
MMDB Architectures The Principle of Autonomy: each media type (e.g.
image, video, etc.) is organized in a media-specific manner suitable for that media type.
The Principle of Uniformity: we represent the content of all the different media objects within a single data structure (“unified index”).
The Principle of Hybrid Organization: certain media types use their own indexes, while others use the “unified” index.
Spring 2008 MMDB--Indexing 11
Multidimensional Data Structures A geographic information system (GIS) stores inf
ormation about some physical region of the world. A map is just viewed as a 2-d image, and certain “
points” on the map are considered to be interest. Specialized data structures:
k-d Trees Point Quadtrees MX-Quadtrees R-Trees
Spring 2008 MMDB--Indexing 12
k-D Trees Used to store k dimensional point data. A 2-d tree stores 2-dimensional point data; a 3-d
tree stores 3-dimensional point data, and so on. Node structure:
nodetype = record INFO: infotype; // any user-defined type XVAL: real; // coordinate YVAL: real; // coordinate LLINK: nodetype; RLINK: nodetype;end
Spring 2008 MMDB--Indexing 13
2-d Tree2-d Tree
Def: A 2-d tree is a binary tree satisfying the following condition. (with root a level 0) For node N with level(N) is even, then every node M
under N.LLINK has the property that M.XVAL < N.XVAL, and every node P under N.RLINK has the property that P.XVAL N.XVAL.
For node N with level(N) is odd, then every node M under N.LLINK has the property that M.YVAL < N.YVAL, and every node P under N.RLINK has the property that P.YVAL N.YVAL.
Spring 2008 MMDB--Indexing 14
Example: a map Example: a map
A (19, 45)
D (54, 40)C (38, 38)
B (40, 50)
E (4, 4)
Spring 2008 MMDB--Indexing 15
Example 2-d TreeExample 2-d Tree
INFO (XVAL, YVAL) A (19, 45) B (40, 50) C (38, 38) D (54, 40) E (4, 4)
A(19, 45)
D(54, 40)
C(38, 38)
B(40, 50)E(4, 4)
RLINKLLINK
Level 0
Level 1
Level 2
Level 3
Spring 2008 MMDB--Indexing 16
Split Regions Split Regions
A (19, 45)
D (54, 40)C (38, 38)
B (40, 50)
E (4, 4)
Spring 2008 MMDB--Indexing 17
Suppose T is a 2-d tree, and we wish to delete N(x, y).
If N is a leaf node, then set link to NIL and delete N.
Otherwise, Step 1: Find a “candidate replacement” node R that occurs
either in Ti for i {, r}. (Ti under N)
Step 2: Replace all of N’s non-link fields by those of R.
Step 3: Recursively delete R from Ti.
Deletion in 2-d TreesDeletion in 2-d Trees
Spring 2008 MMDB--Indexing 18
If we find a replacement node from left subtree, it might be violated the definition of 2-d tree. (Two nodes have equal maximal values.)
Find the replacement node R from right subtree (Tr).
Every node M in Tr is such that M.XVAL R.XVAL if level(N) is even, and M.YVAL R.YVAL if level(N) is odd.
If Tr is not empty, and level(N) is even (odd), then any node in Tr that has the smallest possible XVAL (YVAL) field in Tr is a candidate replacement node.
Candidate Replacement NodeCandidate Replacement Node
Spring 2008 MMDB--Indexing 19
If Tr is empty –
Find the node R’ in left subtree T with the smallest XVAL if level(N) is even, or the smallest YVAL if level(N) is odd.
Replace all of N’s nonlink fields by those of R’.
Set N.RLINK = N.LLINK and set N.LLINK = NIL.
Recursively delete R’.
Candidate Replacement NodeCandidate Replacement Node
Spring 2008 MMDB--Indexing 20
Find all the points within distance r of point (xc, yc). Each node N in a 2-d tree implicitly represents a reg
ion RN. If the circle specified in a query has no intersection
with RN, then there is no point searching the subtree rooted at node N.
Range Queries in 2-d TreesRange Queries in 2-d Trees
Spring 2008 MMDB--Indexing 21
Example of Range QueryExample of Range Query
A (19, 45)
D (54, 40)C (38, 38)
B (40, 50)
E (4, 4)
(51, 43)
r=5
Query:
point (51, 43)
r = 5
Spring 2008 MMDB--Indexing 22
k-d Tree for k 2k-d Tree for k 2
For each node N Suppose level(N) mod k = i. For each node M in N’s left subtree, M.VAL[i] <
N.VAL[i]. For each node P in N’s right subtree, P.VAL[i]
N.VAL[i].
Note: when k=1, we get a standard binary search tree.
Spring 2008 MMDB--Indexing 23
Point Quadtrees Point quardtrees split regions into four parts. These four parts are called the NW (northwest),
SW (southwest), NE (northeast), and SE (southeast) quadrants determined by node N.
Node structure in a point quadtree:qtnodetype = record INFO: infotype; XVAL: real; YVAL: real; XLB, XUB, YLB, YUB: real {+, - } NW, SW, NE, SE: qtnodetype;end
Spring 2008 MMDB--Indexing 24
Point Quadtrees XLB, XUB, YLB, and YUB are upper bounds and lo
wer bounds of X and Y, respectively. Suppose N is a root node of a tree or subtree of quadt
ree: Every node P1 in N.NW is such that P1.XVAL < N.XVAL a
nd P1.YVAL N.YVAL. Every node P2 in N.SW is such that P2.XVAL < N.XVAL a
nd P2.YVAL < N.YVAL. Every node P3 in N.NE is such that P3.XVAL N.XVAL a
nd P3.YVAL N.YVAL. Every node P4 in N.SE is such that P4.XVAL N.XVAL a
nd P4.YVAL < N.YVAL.
Spring 2008 MMDB--Indexing 26
Example of Point Quadtree
INFO (XVAL, YVAL) A (19, 45) B (40, 50) C (38, 38) D (54, 40) E (4, 4)
A (19, 45)
NW SW NE SE
E (4, 4)
NW SW NE SE
B (40, 50)
NW SW NE SE
C (38, 38)
NW SW NE SE
D (54, 40)
NW SW NE SE
Spring 2008 MMDB--Indexing 28
Deletion in Point Quadtrees
A node N to be deleted -- If N is a leaf node, the link from parent to N is set
to NIL then releases node N. Otherwise, try to find a replacement node R from
subtrees such that: Every other node R’ in N.NW (N.SW / N.NE / N.SE) i
s to the north west (south west / north east / south east) of R.
Spring 2008 MMDB--Indexing 29
Can’t Find Replacement Node
Unfortunately, replacement node may not be found. (i.e. deleting A)
Thus, in the worst case, deletion of an interior node N may require reinsertion of some nodes in the subtrees pointed to by N.NE, N.SE, N.NW, and N.SW.
Spring 2008 MMDB--Indexing 30
Range queries in Point Quadtrees
Do not search regions that do not intersect the circle defined by the query.
Search procedure:ProcRangeQueryPQtree(T:newqtnodetype, C:circle) if region(T) C = Ø then Halt else if (T.XVAL, T.YVAL) C then print (T.XVAL, T.YVAL); ProcRangeQueryPQtree(T.NW, C); ProcRangeQueryPQtree(T.SW, C); ProcRangeQueryPQtree(T.NE, C); ProcRangeQueryPQtree(T.SE, C); end proc
Spring 2008 MMDB--Indexing 31
MX-Quadtree MX-quadtrees attempt to: ensure that the shape of
the tree are independent of the number of nodes present in the tree, as well as the order of insertion of these nodes.
MX-quadtrees also attempt to provide efficient deletion and search algorithms.
The map being represented is “split up” into a grid of size (2k 2k) for some k.
All physical data are performed at leaf nodes. Node Structure: Exactly the same as for point qua
dtrees.
Spring 2008 MMDB--Indexing 32
Example of MX-quadtree
INFO (XVAL, YVAL) A (1, 3) B (3, 3) C (3, 1) D (3, 0)
A B
C
D
NW SW NE SE
NW SW NE SE NW SW NE SE NW SW NE SE
A
NW SW NE SE
B
NW SW NE SE
C
NW SW NE SE
D
NW SW NE SE
Spring 2008 MMDB--Indexing 33
Deletion in MX-Quadtrees Deletion in an MX-quadtree is a fairly simple op
eration, because all points are represented at the leaf level.
When a node is deleted, we have to keep trace its parent node to see whether it contains no child node? If so, continue to delete the parent nodes recursively.
Spring 2008 MMDB--Indexing 34
Range Queries in MX-QuadtreesHandled in exactly the same way as for point quar
dtrees. But there are two differences: The content of XLB, XUB, YLB, YUB, fields is
different from that in the case of point quadtrees. As points are stored at the leaf level, checking to
see if a point is in the circle defined by the range query needs to be performed only at the leaf level.
Spring 2008 MMDB--Indexing 35
R-Trees Used to store rectangular regions of an image or
a map. R-trees are particularly useful in storing very lar
ge amounts of data on disk. They provide a convenient way of minimizing th
e number of disk accesses. Each R-tree has an associated order, integer k. Each nonleaf R-tree node contains a set of at mo
st k rectangles and at least k/2 rectangles. Each disk access brings back a page containing a
t least k/2 rectangles.
Spring 2008 MMDB--Indexing 37
Example R-TreeR-Tree of order 4
G1 G2 G3
R1 R2 R3 R4 R5 R6 R7 R8 R9
Structure of R-tree node:
rtnodetype = record Rec1, …, Reck: rectangle; P1, …, Pk: rtnodetype;end
Spring 2008 MMDB--Indexing 38
Insertion into a R-TreeInsertion into a R-Tree
R5R1
R2 R3R4
R7 R6
R9
R8
G1 G2
G3R10 R11
Spring 2008 MMDB--Indexing 40
Option 2 : Preferred Option 2 : Preferred
R5R1
R2 R3R4
R7 R6
R9
R8
G1 G2
G3R10 R11
G4
*** Total area of the group rectangles is smallest.
Spring 2008 MMDB--Indexing 41
Option 3 : Incorrect Option 3 : Incorrect
R5R1
R2 R3R4
R7 R6
R9
R8
G1 G2
G3R10 R11
G4
*** G4 is underflow.
Spring 2008 MMDB--Indexing 42
Deletion in R-Trees Deletion of objects from R-trees may cause a
node in the R-tree to “underflow” in which contains less than k/2 rectangles. Such as delete R9 from our example.
If delete R9, we must create a new logical grouping. One possibility is as follows:
G1 G2 G3
R1 R2 R3 R4 R6 R7 R5 R8