Unit 06 : Index and Distributed Caching COMP 5323 Web Database Technologies and Applications 2014


Doctrine of Fair Use

• This PowerPoint is prepared for educational purposes and is strictly for use in classroom lecturing.

• We have adopted the "Fair Use" doctrine in this PowerPoint, which allows limited copying of copyrighted works for educational and research purposes.

Learning Objectives

• Understand different path index techniques for improving query performance.

• Learn about a distributed memory caching system that improves the performance of web database applications.

Outline

1. Index for Semi-structured Data
2. Distributed Caching

1 Index for Semi-structured Data

Why is Indexing Needed?

• Allows fast access to data by replicating portions of the data in special-purpose structures.

• Despite their additional cost (storage, maintenance and complexity), indexes have proven useful in evaluating queries.

Index Types

• Structural index
  – Accessing all elements of a given name
  – Ancestor-descendant and parent-child relationships between elements

• Content index
  – Accessing elements containing given keywords
  – Supporting most text search functionalities

Classical Content Index

• Classically based on inverted lists
  – For each term, gives the doc ID + localization
• Several variations allow different search types
  – Offset, Relative, Proximity
• Generally stored in a B+-tree to optimize search for a given word
• Size is an important issue
  – Memory and disk
• (word, localization)
  – Fixed-length entry (word repeated)
• (word, frequency, (localization)*)
  – Variable-length entry

Words and their localizations:

- t1 : doc1-100, doc1-300, doc3-200, …

- t2 : doc2-30, doc4-70, …

- t3 : doc4-87, doc5-754, …

Short Reference: http://www.igi-global.com/dictionary/inverted-index/15654
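
A minimal sketch of the (word, (localization)*) layout, assuming whitespace tokenization and word positions as offsets (the slide's offsets are not specified further). The names `build_inverted_index` and `proximity_search` and the sample documents are illustrative only.

```python
from collections import defaultdict

def build_inverted_index(docs):
    """Map each term to a list of (doc_id, offset) postings.

    `docs` is a hypothetical dict of doc_id -> text; offsets are word
    positions, a simplification chosen for this sketch.
    """
    index = defaultdict(list)
    for doc_id, text in docs.items():
        for offset, word in enumerate(text.lower().split()):
            index[word].append((doc_id, offset))
    return index

def proximity_search(index, w1, w2, max_gap):
    """Return the doc ids where w1 and w2 occur within max_gap positions."""
    hits = set()
    for d1, o1 in index.get(w1, []):
        for d2, o2 in index.get(w2, []):
            if d1 == d2 and abs(o1 - o2) <= max_gap:
                hits.add(d1)
    return hits

docs = {
    "doc1": "XML index for path queries",
    "doc2": "path index and content index",
}
index = build_inverted_index(docs)
```

Storing offsets alongside the doc ID is what makes the proximity variant possible; a plain (word, doc ID) list could only answer "which documents contain this word".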

Problem with XML

• Support of element addressing
  – Doc ID should include NodeId (XPath) + offset
• Index size becomes very large
  – XPaths are long
• Support of typed data
  – Integer, float, simple types of XML Schema
  – Requires classical indexes for certain elements
• Query processing
  – Structural joins
  – Text search
  – Exact search
• Support of updates
  – Incremental updates would be a plus

Path-based approach

• Represent the XML document as a tree or graph structure
• Index the XML document directly
  – Without the support of a DTD
• Mainly use memory as the index storage
• Properties
  – Keeps the structural information to improve query performance
  – Easy to support queries with regular path expressions

Different Approaches

• Patricia Trie
  – Cooper et al., 2001
• DataGuides
  – J. McHugh et al., 1997
• T-Index
  – Tova Milo and Dan Suciu
• APEX (Adaptive Path Index for XML Data)
  – C. W. Chung et al., 2002
• Dewey Structure
• K-ary Table
• Path Table
• OrdPath

Patricia Tries

• A compact representation of a trie in which any node that is an only child is merged with its parent.

• Also known as a radix tree

Patricia Tries

• Cooper et al., 2001
• Idea:
  – Partitioned Patricia tries to index strings
  – Encode XPath expressions as strings (encode names, encode atomic values)

<book>
  <author>Whoever</author>
  <author>Not me</author>
  <title>No Kidding</title>
</book>

B A 1 Whoever
B A 2 Not me
B T No Kidding
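
One way to produce strings like the three above is sketched here. It assumes each element name is encoded as its first letter and that a position number is added only when a name repeats under the same parent; both are guesses read off the example, not the encoding defined by Cooper et al. The function name `encode_paths` is illustrative.

```python
import xml.etree.ElementTree as ET
from collections import Counter

def encode_paths(xml_text):
    """Return one encoded string per leaf element: designators, then the value."""
    keys = []

    def walk(elem, prefix, position=None):
        # Designator: first letter of the tag name (an assumption for this sketch).
        path = prefix + [elem.tag[0].upper()]
        if position is not None:
            path.append(str(position))
        children = list(elem)
        if not children:
            keys.append(" ".join(path + [(elem.text or "").strip()]))
            return
        totals = Counter(child.tag for child in children)
        seen = Counter()
        for child in children:
            seen[child.tag] += 1
            # Number a child only if its name occurs more than once here.
            pos = seen[child.tag] if totals[child.tag] > 1 else None
            walk(child, path, pos)

    walk(ET.fromstring(xml_text), [])
    return keys

book = ("<book><author>Whoever</author><author>Not me</author>"
        "<title>No Kidding</title></book>")
keys = encode_paths(book)
```

Because every root-to-leaf path becomes a plain string, the strings can be inserted into a Patricia trie, and a path query turns into a prefix lookup.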

DataGuides

• The World-Wide Web demonstrates that much of the information available online is semistructured.

• DataGuides are defined over a graph-based data model called OEM (Object Exchange Model)

A sample OEM database

Lore Language (Lorel Query)

• Select Restaurant.Entree
  – Returns all entrees served by any restaurant: the set of objects {6, 10, 11}

• Select Restaurant.Name where Restaurant.Entree = “Burger”
  – The answer to the query is the single object 5.

DataGuide

T-index

1-Index

1-Index DataGuides

2-Index

APEX

Representation of XML Data Structure

DataGuide

APEX

HAPEX GAPEX

Dewey - Structure

• Each node is assigned a label that represents the path from the document’s root to the node.
• Each component of the label represents the local order of an ancestor node.
• Nodes with the same number of delimiters (“.”) in their label are at the same level.

[Figure: example tree with Dewey labels]
Bib (0)
  book (0.0)
    author (0.0.0)
      Tim (0.0.0.0)
  paper (0.1)
  paper (0.2)
    author (0.2.0)
      Sarah (0.2.0.0)

Dewey – Supported Queries (1/3)

• Ancestors / Descendants
  – Node “X” is an ancestor of node “Y” if the label of node “X” is a proper prefix of the label of node “Y”, ending at a component boundary. A plain substring test is not enough: 0.1 occurs inside 0.0.1 but is not its ancestor.

Dewey – Supported Queries (2/3)

• Parent / Child
  – Node “X” is the parent of node “Y” if:
    - The label of node “X” is a prefix of the label of node “Y”
    - And frags(X) = frags(Y) – 1, where frags(X) is the number of delimiters in the label of node X and frags(Y) is the number of delimiters in the label of node Y.

Dewey – Supported Queries (3/3)

• Siblings
  – Nodes “X” and “Y” are siblings if:
    - They have the same number of delimiters in their labels
    - And X.prefix = Y.prefix, where prefix is the label of the node without its positional identifier
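
The three Dewey relationship tests can be sketched as follows. This is an illustration only: labels are assumed to be dot-separated strings such as "0.2.0", and they are compared component by component rather than character by character, so that 0.1 is not mistaken for an ancestor of 0.10 or 0.0.1. The function names are not from the slides.

```python
def components(label):
    """Split a Dewey label such as '0.2.0' into integer components."""
    return [int(part) for part in label.split(".")]

def is_ancestor(x, y):
    """X is an ancestor of Y if X's components are a proper prefix of Y's."""
    cx, cy = components(x), components(y)
    return len(cx) < len(cy) and cy[:len(cx)] == cx

def is_parent(x, y):
    """X is the parent of Y if it is an ancestor exactly one level up."""
    return is_ancestor(x, y) and len(components(x)) == len(components(y)) - 1

def are_siblings(x, y):
    """Distinct nodes at the same level that share the same prefix."""
    cx, cy = components(x), components(y)
    return x != y and len(cx) == len(cy) and cx[:-1] == cy[:-1]
```

With the example tree, `is_parent("0.2", "0.2.0")` holds (paper and its author), while `are_siblings("0.0.0", "0.2.0")` does not, because the two author nodes have different parents.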

Dewey – Updates

• Insertion of a new node
  – The labels of the nodes in the subtrees rooted at the following siblings need to be updated
  – O(n) nodes need relabeling, where n is the number of nodes of the XML file

[Figure: a new paper node is inserted at (0.2); the former (0.2) subtree is relabeled (0.3), (0.3.0), (0.3.0.0)]
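
A small sketch of the relabeling cost, assuming the tree is held as a flat list of Dewey labels (a hypothetical representation chosen for brevity). The helper `insert_child` shifts every label in the subtrees of the following siblings and reports how many labels changed.

```python
def insert_child(labels, parent, position):
    """Insert a new child of `parent` at `position`.

    Returns (new_labels, relabeled_count). `labels` is a hypothetical flat
    list of Dewey labels covering the whole tree.
    """
    p = parent.split(".")
    depth = len(p)
    result, relabeled = [], 0
    for label in labels:
        parts = label.split(".")
        # A node lies in a following sibling's subtree if it extends the
        # parent's label and its ordinal at this level is >= position.
        if (len(parts) > depth and parts[:depth] == p
                and int(parts[depth]) >= position):
            parts[depth] = str(int(parts[depth]) + 1)
            relabeled += 1
        result.append(".".join(parts))
    result.append(".".join(p + [str(position)]))
    result.sort(key=lambda l: [int(c) for c in l.split(".")])
    return result, relabeled

tree = ["0", "0.0", "0.0.0", "0.0.0.0", "0.1", "0.2", "0.2.0", "0.2.0.0"]
new_tree, changed = insert_child(tree, "0", 2)
```

Inserting one node before the last paper changes three existing labels here; inserting as the first child of the root would change every label except the root's, which is the O(n) worst case.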

Dewey

• Not efficient for dynamic XML files with many updates
  – Need to re-label many nodes

• As the depth of the tree increases:
  – Label size of a node increases rapidly
    • Storage size increases rapidly
  – It becomes more costly to infer the supported queries between any two nodes (the string prefix matching becomes longer)

• Overflow problem
  – The original fixed length of bits assigned to store the size of the label is not enough.

Document Tree

• Lee et al., ACM DL 1996.
• Represent each document as a k-ary complete tree and assign a UID to each node

[Figure: a document tree drawn as a complete 3-ary tree, distinguishing real nodes from virtual nodes]

K-ary table

• Each document is assigned k, which is the maximum number of siblings in the document tree.

• Each element has an entry (row) in the K-ary table

• When a query is issued, the result set has pointers to the K-ary table.

Level and Element Type Number

• Level
  – Level means the level in the document tree
  – It indicates how many times the parent function must be applied to reach a target element
• Element type number
  – A unique number is assigned to each element type in the DTD
  – It makes it possible to filter out unnecessary elements and accumulate the correct frequencies
• Element location
  – The unique position of an element instance in a document tree

Element Labeling

[Figure: document tree with element type numbers: Document(1), Title(2), Abstract(3), Chapter(4), Para(5), Section(6), Para(7)]

UID

• Unique element identifier
• parent(i) = ⌊(i − 2)/k + 1⌋

Result of assigning UIDs (3-ary tree):

element  UID
D1       1
C1       2
C2       3
S1       4
S2       5
S3       8
S4       9
P1       14
P2       15
P3       16
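
The parent formula, and its inverse for child slots, can be sketched as below, assuming the root has UID 1 and UIDs are assigned level by level across the complete k-ary tree (virtual nodes included). The `children` helper is derived from the parent formula and is not shown on the slide.

```python
def parent(i, k):
    """UID of the parent of node i in a complete k-ary tree (root has UID 1)."""
    if i <= 1:
        raise ValueError("the root has no parent")
    return (i - 2) // k + 1

def children(i, k):
    """UIDs of the k child slots (real or virtual) of node i."""
    first = k * (i - 1) + 2
    return list(range(first, first + k))
```

For the table above with k = 3, `parent(14, 3)` gives 5, so P1 sits under S2, and `children(3, 3)` gives 8, 9, 10, which covers S3 and S4. No pointers are stored: structure is recovered by arithmetic on the UID alone.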

XML Index Path Table (Oracle)

BaseRid  Path          OrderKey  Value  Locator  NumValue
Rid1     po
Rid1     po.data       1                11
Rid1     po.data.item  1.1       "foo"  17
Rid1     po.data.pkg   1.2       "123"  32       123
Rid1     po.data.item  1.3       "bar"  46

<po>
  <data>
    <item>foo</item>
    <pkg>123</pkg>
    <item>bar</item>
  </data>
</po>

Some Typos

OrdPath

• ORDPATHs: Insert-Friendly XML Node Labels
  – Patrick O’Neil, Elizabeth O’Neil, Shankar Pal, Istvan Cseri, Gideon Schaller, Nigel Westbury
  – SIGMOD 2004
  – SQL Server 2005 implementation

OrdPath

• Aims to provide efficient insertion at any position of an XML tree, and also supports extremely high performance query plans for native XML queries.

• Tree modifications
  – New nodes may be inserted
  – Sub-trees may be deleted
  – Sub-trees may be moved around within the tree

OrdPath

• Encodes the parent-child relationship by extending the parent’s ORDPATH label with a component for the child.
  – E.g.: 1.5.3.9 might be the parent ORDPATH, 1.5.3.9.1 the child.
• The various child components reflect the children’s relative sibling order, so that byte-by-byte comparison of the ORDPATH labels of two nodes yields the proper document order.
• A new node (possibly a root node of a sub-tree) can be inserted under any designated parent node in an existing tree.
  – Its label is generated using an additional intermediate “careting” component that falls between the components of its left and right siblings.

OrdPath

• At the beginning
  – Only positive, odd integers are assigned during an initial load; even-numbered and negative integer component values are reserved for later insertions into an existing tree.
• Inserting in the middle
  – Even numbers are used as carets only. They do not count as components that increase the depth of a node.
  – E.g. new nodes in between 3.5.5 and 3.5.7
    • New siblings: 3.5.6.1, 3.5.6.3, …
    • A subtree: 3.5.6.1, 3.5.6.1.1, 3.5.6.3, 3.5.6.3.1, 3.5.6.3.3, 3.5.6.3.3.1, 3.5.6.3.3.3, 3.5.6.3.5, 3.5.6.5, 3.5.6.5.1
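
A rough sketch of how caret components affect depth, parent lookup and ordering. It works on dotted-string labels and compares integer tuples, which stands in for the byte-wise comparison of the compressed binary encoding that ORDPATH actually uses; the function names are illustrative.

```python
def parse(label):
    """Turn '3.5.6.1' into the tuple (3, 5, 6, 1)."""
    return tuple(int(c) for c in label.split("."))

def depth(label):
    """Only odd components count toward depth; even components are carets."""
    return sum(1 for c in parse(label) if c % 2 != 0)

def parent(label):
    """Drop the node's own final odd component and any carets before it."""
    comps = list(parse(label))
    comps.pop()
    while comps and comps[-1] % 2 == 0:
        comps.pop()
    return ".".join(map(str, comps))

def in_document_order(labels):
    """Sort labels into document order by component-wise comparison."""
    return sorted(labels, key=parse)
```

A node inserted as 3.5.6.1 has the same depth and the same parent (3.5) as its neighbours 3.5.5 and 3.5.7, and sorts between them, so no existing label has to change.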

XML

ORDPATH Label of Nodes

BOOK 1
  @ISBN 1.1
  Section 1.3
    Title 1.3.1
    Nobody… 1.3.3
    Figure 1.3.5
      CAPTION 1.3.5.1
  Section 1.5
    Title 1.5.1
    All right… 1.5.3
    Figure 1.5.5
    tree frogs 1.5.7

Infoset Table

2 Memcached

What is memcached briefly?

• memcached is a high-performance, distributed memory object caching system, generic in nature

• It is a key-based cache daemon that stores data and objects wherever dedicated or spare RAM is available for very quick access

• It is a dumb distributed hash table. It does not provide redundancy, failover or authentication. If needed the client has to handle that.

Why was memcached made?

• It was originally developed by Danga Interactive to enhance the speed of LiveJournal.com

• It dropped the database load to almost nothing, yielding faster page load times for users, better resource utilization, and faster access to the databases on a memcache miss

• http://www.danga.com/memcached/

Memcached

Where does memcached reside?

• Memcache is not part of the database but sits outside it on the server(s).

• Over a pool of servers

Architecture

When should I use memcached?

• When your database is optimized to the hilt and you still need more out of it.
  – Lots of SELECTs are using resources that could be better used elsewhere in the DB.
  – Locking issues keep coming up

• When table listings in the query cache are torn down so often it becomes useless

• To get maximum “scale out” of minimum hardware

Hit-rate Management

• Anything that is more expensive to fetch from elsewhere, and has a sufficient hit rate, can be placed in memcached
  – How often will the object or data be used?
  – How expensive is it to generate the data?
  – What is the expected hit rate?
  – Will the application invalidate the data itself, or will a TTL be used?
  – How much development work has to be done to embed it?

Why use memcached?

• To reduce the load on the database by caching data BEFORE it hits the database

• Can be used for more than just holding database results (objects) and can improve the entire application response time

• Feel the need for speed
  – Memcache is in RAM - much faster than hitting the disk or the database

Why not use memcached?

• Memcache is held in RAM. This is a finite resource.

• Adding complexity to a system just for complexity's sake is a waste. If the system can respond within the requirements without it - leave it alone

What are the limits of memcached?

• Keys can be no more than 250 characters
• Stored data cannot exceed 1 MB (largest typical slab size)
• There are generally no limits to the number of nodes running memcache
• There are generally no limits to the amount of RAM used by memcache over all nodes
  – 32-bit machines do have a limit of 4 GB though

Platform

• You can build and install memcached from the source code directly, or you can use an existing operating system package or installation.
  – on a RedHat, Fedora or CentOS host, use yum:
    • root-shell> yum install memcached
  – on a Debian or Ubuntu host, use apt-get:
    • root-shell> apt-get install memcached
  – on a Gentoo host, use emerge:
    • root-shell> emerge install memcached
  – on OpenSolaris, use the pkg for SUNWmemcached:
    • root-shell> pkg install SUNWmemcached

Port 11211

• Get the source from the website: http://www.danga.com/memcached/download.bml
  – Memcache has a dependency on libevent, so make sure you have that also.

• Decompress, cd into the dir
• ./configure; make; make install
• Memcached listens on port 11211 by default; this can be changed with the -p option.

How do I start memcached?

• Memcached can be run as a non-root user if it will not be on a restricted port (<1024) - though the user cannot have a memory limit restriction

• shell> memcached
• Default configuration - Memory: 64MB, all network interfaces, port: 11211, max simultaneous connections: 1024

Memcached options

• You can change the default configuration with various options.
  – -u <user> : run as user if started as root
  – -m <num> : maximum <num> MB memory to use for items
    • If more than available RAM - will use swap
    • Don’t forget the 4G limit on 32-bit machines
  – -d : Run as a daemon
  – -l <ip_addr> : Listen on <ip_addr>; defaults to INADDR_ANY
  – -p <num> : port

How can I connect to memcached?

• Memcached uses a protocol that many languages implement with an API.

• Languages that implement it:
  – Perl, PHP, Python, Ruby, Java, C#, C, Lua, Postgres, MySQL, Chicken Scheme

• And yes - because it is a protocol you can even use telnet
  – shell> telnet localhost 11211

Memcached protocol

• Three types of commands
  – Storage - ask the server to store some data identified by a key
    • set, add, replace, append, prepend and cas
  – Retrieval - ask the server to retrieve data corresponding to a set of keys
    • get, gets

Memcached protocol (cont’d)

  – All others that don’t involve unstructured data
    • Deletion: delete
    • Statistics: stats
    • flush_all: always succeeds; invalidates all existing items immediately (by default) or after the expiration specified
    • version, verbosity, quit
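
To make the storage and retrieval commands concrete, the sketch below builds the bytes a client would send in the text protocol: a command line `set <key> <flags> <exptime> <bytes>` followed by the data block, each terminated by CRLF. The helper names are illustrative, and the code only constructs requests; it does not open a connection.

```python
def build_set_command(key, value, flags=0, exptime=0):
    """Build the bytes for a text-protocol `set` request."""
    if len(key) > 250 or any(ch.isspace() for ch in key):
        raise ValueError("keys must be at most 250 characters with no whitespace")
    data = value.encode("utf-8")
    # The length field counts the data bytes only, not the trailing CRLF.
    header = f"set {key} {flags} {exptime} {len(data)}\r\n".encode("utf-8")
    return header + data + b"\r\n"

def build_get_command(*keys):
    """Build the bytes for a `get` request for one or more keys."""
    return ("get " + " ".join(keys) + "\r\n").encode("utf-8")

request = build_set_command("mykey", "abcdef", flags=0, exptime=60)
```

Here `request` is the same text one would type in a telnet session: `set mykey 0 60 6`, then `abcdef` on the next line. The server answers a successful store with `STORED`.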

PHP and Memcached

• Make sure you have a working Apache/PHP install

• PHP has a memcached extension available through pecl.

• Installation:
  – shell> pecl install memcache

• Make sure PEAR is installed (debian: apt-get install php-pear)

• Make sure that you also have php5-dev installed for phpize.
  – shell> apt-get install php5-dev

PHP Script Example

• Information about the PHP API at http://www.php.net/memcache

<?php
// make a memcache object
$memcache = new Memcache;
// connect to memcache
$memcache->connect('localhost', 11211) or die("Could not connect");
// get the memcache version
$version = $memcache->getVersion();
echo "Server's version: " . $version . "<br/>\n";

PHP Script (cont’d)

// test data
$tmp_object = new stdClass;
$tmp_object->str_attr = 'test';
$tmp_object->int_attr = 123;
// set the test data in memcache; the third argument is the flag
// (false here, or MEMCACHE_COMPRESSED to compress the stored value)
$memcache->set('key', $tmp_object, false, 10) or die("Failed to save data at the server");
echo "Store data in the cache (data will expire in 10 seconds)<br/>\n";
// get the data
$get_result = $memcache->get('key');
echo "Data from the cache:<br/>\n";
echo '<pre>', var_dump($get_result), '</pre>';

PHP Script (cont’d)

// modify the data
$tmp_object->str_attr = 'boo';
$memcache->replace('key', $tmp_object, false, 10) or die("Failed to save new data to the server<br/>\n");
echo "Stored data in the cache changed<br/>\n";
// get the new data
$get_result = $memcache->get('key');
echo "New data from the cache:<br/>\n";
echo '<pre>', var_dump($get_result), "</pre>\n";
// delete the data
$memcache->delete('key') or die("Data not deleted<br/>\n");

MySQL’s memcached

• The API is consistent with the other APIs
  – Connect: mysql> SELECT memc_servers_set('192.168.0.1:11211, 192.168.0.2:11211');
    • The list of servers used by the memcached UDFs is not persistent over restarts of the MySQL server.
  – Set: mysql> SELECT memc_set('myid', 'myvalue');
  – Retrieve: mysql> SELECT memc_get('myid');

Possible ways to secure memcached

• It has no authentication system - so protection is important

• Run as a non-privileged user to minimize potential damage

• Specify the IP address to listen on using -l
  – 127.0.0.1, 192.168.0.1, specific IP address

• Use a non-standard port
• Use a firewall

Memcached

Memcached Memory Management

• Slab Allocation: When you start to store data into the cache, memcached does not allocate the memory for the data on an item by item basis. Instead, a slab allocation is used to optimize memory usage and prevent memory fragmentation when information expires from the cache.

• Lazy Expiration + LRU
  – Lazy Expiration: When an item is requested (a get request), memcached checks the expiration time to see if the item is still valid before returning it to the client.
  – LRU (least recently used): Memcached is LRU per slab class, but not globally LRU.
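
Lazy expiration and LRU eviction can be illustrated with a toy cache. This is a simplification: it keeps one global LRU list rather than one per slab class, does no slab allocation, and takes an injectable `clock` callable so the behaviour can be shown without waiting. The class name `LazyLRUCache` is made up for this sketch.

```python
from collections import OrderedDict

class LazyLRUCache:
    def __init__(self, capacity, clock):
        self.capacity = capacity
        self.clock = clock               # callable returning the current time
        self.items = OrderedDict()       # key -> (value, expires_at), oldest first

    def set(self, key, value, ttl):
        self.items.pop(key, None)
        if len(self.items) >= self.capacity:
            self.items.popitem(last=False)   # evict the least recently used item
        self.items[key] = (value, self.clock() + ttl)

    def get(self, key):
        entry = self.items.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if expires_at <= self.clock():       # expiry is only noticed on access
            del self.items[key]
            return None
        self.items.move_to_end(key)          # mark as most recently used
        return value

now = [0]
cache = LazyLRUCache(capacity=2, clock=lambda: now[0])
cache.set("a", 1, ttl=10)
cache.set("b", 2, ttl=10)
cache.get("a")               # touching "a" makes "b" the eviction candidate
cache.set("c", 3, ttl=10)    # cache is full, so "b" is evicted
```

Nothing scans for expired items in the background: an expired entry keeps occupying its slot until a get finds it or LRU eviction pushes it out.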

Slab Allocation

Memcached Distributed Architecture

// Obtain a server ID based on the key value
int getServerId(char *key, int serverTotal) {
    int c, hash = 0;
    while ((c = *key++)) {
        hash += c;
    }
    return hash % serverTotal;
}

// a list of servers
node[0] => 192.168.0.1:11211
node[1] => 192.168.0.2:11211
node[2] => 192.168.0.3:11211

// get id
int id = getServerId("test", 3);

// get ip address and port number
node[id] == node[1]
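
The same modulo placement, re-expressed as a Python sketch, makes it easy to see what happens when the pool changes size. `moved_fraction` and the sample key names are assumptions added for illustration.

```python
def get_server_id(key, server_total):
    """Sum the key's byte values and take the remainder, as in the C snippet."""
    return sum(key.encode("utf-8")) % server_total

def moved_fraction(keys, before, after):
    """Fraction of keys whose server changes when the pool is resized."""
    moved = sum(
        1 for k in keys
        if get_server_id(k, before) != get_server_id(k, after)
    )
    return moved / len(keys)

keys = [f"key{i}" for i in range(1000)]
fraction = moved_fraction(keys, 3, 4)   # pool grows from 3 to 4 servers
```

The client alone decides where a key lives; the servers never talk to each other. The price of plain modulo placement is that adding or removing a server remaps a large share of the keys, and each remapped key is a cache miss until it is stored again on its new server.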

Memcached Distributed Architecture

SQL Server Cache

• The SQL Server cache mechanism caches:
  – Query plans
  – Pages from the database files

• but it does NOT cache:
  – Exact results from a query

Reference: http://searchsqlserver.techtarget.com/tip/SQL-Server-memory-configurations-for-procedure-cache-and-buffer-cache

Reference

• DataGuides: Enabling Query Formulation and Optimization in Semistructured Databases
• Index Structures for Path Expressions
• APEX: An Adaptive Path Index for XML Data

Reference Only

• https://blog.couchbase.com/memcached-144-windows-32-bit-binary-now-available
• Memcached should run on Linux. It may not work on some versions of Windows.

Reference

• 0: <flag>
• 60: timeout
• 6: length