scalable data stores
Post on 16-Jan-2022
3 Views
Preview:
TRANSCRIPT
SCALABLE DATA STORES Data Management in the Cloud
34
Overview
• New systems have emerged to address requirements of data management in the cloud – so-called “NoSQL” data stores
– scalable SQL databases
• Horizontal Scaling – shared nothing
– replicating and partitioning data over thousands of servers
– distribute “simple operation” workload over thousands of servers
• Simple Operations – key lookups
– read and writes of one or a small number of records
– no complex queries or joins
35
Defining “NoSQL”
• No agreed upon definition – “not only SQL”
– “not relational”
– …
• Six key features 1. ability to horizontally scale simple operation throughput over many
servers
2. ability to replicate and distribute (partition) data over many servers
3. simple call level interface or protocol (in contrast to a SQL binding)
4. weaker concurrency model than ACID transactions of most relational (SQL) database systems
5. efficient use of distributed indexes and RAM for data storage
6. ability to dynamically add new attributes to data records
36 Based on: “Scalable SQL and NoSQL Data Stores”, 2010
Data Models
• Terminology – tuple: row in a relational table, where attribute names and types are
defined by a schema, and values must be scalar
– document: supports both scalar values and nested documents, and the attributes are dynamically defined for each document
– column family: groups key/value pairs (columns) into families to partition and replicate them; one column family is similar to a document as new (nested, list-valued) attributes can be added
– object: analogous to objects in programming languages, but without procedural methods
• Relational – data is stored in relations (tables) of tuples (rows) of scalar values
– queries expressed over arbitrary (combinations of) attributes
– indexes defined over arbitrary (combinations of) attributes
37
Based on: “Scalable SQL and NoSQL Data Stores”, 2010
Key/Value Data Model
• Interface – put(key, value)
– get(key): value
• Data storage – values (data) are stored based on programmer-defined keys
– system is agnostic as to the structure (semantics) of the value
• Queries are expressed in terms of keys
• Indexes are defined over keys – some systems support secondary indexes over (part of) the value
38
k1 v1
k2 v2
k3 v3
…
kn vn
Document Data Model
• Interface – set(key, document)
– get(key): document
– set(key, name, value)
– get(key, name): value
• Data storage – documents (data) is stored based on programmer-defined keys
– system is aware of the (arbitrary) document structure
– support for lists, pointers and nested documents
• Queries expressed in terms of key (or attribute, if index exists)
• Support for key-based indexes and secondary indexes
39
k1 “name”:“fred”
k2 “name”:“mary”;“age”:“25”
k3
…
kn “name”:“john”;“address”:“k3”
“name”:“oak st”
Private Public
Column Family Data Model
• Interface – define(family)
– insert(family, key, columns)
– get(family, key): columns
• Data storage – <name, value, timestamp> triples (so-called columns) are stored based
on a column family and key; a column family is similar to a document
– system is aware of (arbitrary) structure of column family
– system uses column family information to replicate and distribute data
• Queries are expressed based on key and column family
• Secondary indexes per column family are typically supported
40
k1 “name”:“fred”
k2 “name”:“mary”
k3
…
kn “name”:“john”
“name”:“oak st”
“title”:“Mr”
“age”:“25”
Graph Data Model
• Interface – create: id
– get(id)
– connect(id1, id2): id
– addAttribute(id, name, value)
– getAttribute(id, name): value
• Data storage – data is stored in terms of nodes and (typed) edges
– both nodes and edges can have (arbitrary) attributes
• Queries are expressed based on system ids (if no indexes exist)
• Secondary indexes for nodes and edges are supported – retrieve nodes by attributes and edges by type, start and/or end node,
and/or attributes
41
n1 n2
n3
“name”:“fred”
“name”:“mary”;“age”:“25”
“name”:“oak st”
LIKES
LIKES
“weight”:“-1”
Array Data Model
• Nested multi-dimensional arrays – cells can be tuples or other
arrays – can have non-integer
dimensions
• Additional “History” dimension on updatable arrays
• Ragged arrays allow each row or column to have a different length
• Supports multiple flavors of “null” – array cells can be “EMPTY” – user-definable treatment of
special values
SciDB DDL
CREATE ARRAY Test_Array
< A: integer NULLS,
B: double,
C: USER_DEFINED_TYPE >
[I=0:99999,1000, 10, J=0:99999,1000, 10 ]
PARTITION OVER ( Node1, Node2, Node3 )
USING block_cyclic();
Attribute names A, B, C
Index names I, J
Chunk size 1000
Overlap 10
Object Data Model
• Interface – set(object)
– get(query): object
• Data storage – typed programming language objects (plus referenced objects) stored
– attribute can be collection-valued
– database is aware of the type (schema) of objects
• Objects are retrieved using queries or by traversal from “roots”
• Specialized indexes can be expressed based on schema
44
“mary”
25
Person
“fred”
27
Person LIKES
LIKES
“oak st”
Address
LIVES_AT
APPLICATION SCENARIO Data Management in the Cloud
45
An Application Scenario
• As an example application scenario, we will use graph data management and processing throughout the course
• Graph data applications – social networking
– Semantic Web (i.e. RDF graphs)
– data provenance
– Web site ranking (i.e. Page Rank)
– …
• No (mature) graph databases exist – graph data stores are available (Neo4j, OrientDB, …)
• Use existing (mature) non-graph database – graph data model must be mapped to data model of database
– algorithms must be specified in database language
46
Graph Data
• A graph 𝐺 = 𝑉, 𝐸 consists of a set of nodes 𝑉 and a set of edges 𝐸
• Edges are pairs (𝑢, 𝑣) that can either be ordered or unordered – if edges are ordered, the graph is directed; we sometimes refer to
directed edges as arcs
– if edges are unordered, the graph is undirected
• Each edge can be associated with a weight 𝑤 – if edges have a weight, the graph is weighted
– if edges have no weight, the graph is unweighted
• A path from 𝑣1 to 𝑣𝑛 is a sequence of nodes ⟨𝑣1, 𝑣2, … , 𝑣𝑛⟩ such that ∀ 1 ≤ 𝑖 < 𝑛: ∃ 𝑣𝑖 , 𝑣𝑖+1 ∈ 𝐸
• A graph is (fully) connected if there is a path between every pair of nodes
47
Graph Processing
• Classical graph algorithms – shortest path
– bridges
– transitive closure
• “Web 2.0” – friend of a friend
– who follows who?
– who might know who?
• Social network analysis – degree centrality
– closeness centrality
– betweenness centrality
48
Classical Graph Algorithms
• Shortest Path – find the shortest path(s) between two nodes
– e.g. Dijkstra’s algorithm 𝑂 𝐸 + 𝑉 log 𝑉
– weighted graphs: minimize total weight
– unweighted graphs: minimize number of edges
• Bridges – find edges that connect to graph components
– Tarjan’s algorithm 𝑂 𝑉 + 𝐸
– parallel algorithms exist
• Transitive Closure – inserts all transitive paths as edges
– Floyd-Warshall’s algorithm 𝑂 𝑉3
49
Social Network Analysis
• The degree centrality of a node 𝑣 in a graph 𝐺 = (𝑉, 𝐸) is defined as the number of edges incident to 𝑣, normalized by the number of other nodes in 𝐺
𝐶𝐷 𝑣 =deg (𝑣)
𝑉 − 1
• The closeness centrality of a node 𝑣 in a graph 𝐺 = 𝑉, 𝐸 is given by the inverse of the sum of its geodesic distance (weight or length of the shortest path) to all other nodes in 𝐺
𝐶𝐶 𝑣 = 1 𝛿(𝑣,𝑢)𝑢∈𝑉\v
50
Social Network Analysis
• The betweenness centrality of a node 𝑣 in a graph 𝐺 = 𝑉,𝐸 is defined as
𝐶𝐵 𝑣 = |𝑃 𝑢,𝑤 𝑣 |
|𝑃 (𝑢, 𝑤)|𝑢,𝑤∈𝑉∧𝑢≠𝑣≠𝑤
where 𝑃 (𝑢,𝑤) denotes the set of all shortest paths from 𝑢 to 𝑤 and 𝑃 𝑢,𝑤 𝑣 denotes the set of all shortest paths from 𝑢 to 𝑤 that pass through 𝑣
51
Usage Profiles
• Database updates and consistency – data changes frequently, results need to be accurate
– data changes infrequently, results need to be accurate
– query results may not reflect the latest state of the database
• Different types of queries – point-based, bound: a node, a node and its neighbors, friends of a
friend
– point-based, unbound: a node and all its reachable nodes, shortest path between two nodes
– graph-based, per node: centralities
– graph-based, all: bridges, transitive closure
52
Case Study: SQL Server
This slide is intentionally left blank.
53
Case Study: Versant
This slide is intentionally left blank.
54
Looking Good, SQL Server!
55
NetScience Geom Erdos
SQL Server 0.054 0.077 0.151
Versant 0.340 2.240 1.930
0.000
0.500
1.000
1.500
2.000
2.500
Tim
e (m
s)
NetScience Geom Erdos
SQL Server 0.074 0.486 35.313
Versant 11.800 131.400 13875.900
0.000
2000.000
4000.000
6000.000
8000.000
10000.000
12000.000
14000.000
16000.000
Tim
e (m
s)
Shortest Paths Closeness Centrality
Oh, no!
56
NetScience Geom Erdos
SQL Server 2710.000 48092.000 129945.000
Versant 4.160 26.020 12.110
0.000
20000.000
40000.000
60000.000
80000.000
100000.000
120000.000
140000.000
Tim
e (m
s)
Bridges
Course Project
• The course will be accompanied by a project that is based on the scenario of graph data management and processing – Task 1: Study question that will take you through a “dry run” of
mapping the graph data model to a NoSQL data model and make you think about how to answer some simple queries.
– Task 2: Groups of 4-5 students will pick a NoSQL system and compile a systems profile, based on papers and documentation. These profiles will be presented in class.
– Task 3: Groups of 4-5 students will design a graph management and processing system based on the previously chosen NoSQL system. This time for real!
– Task 4: Groups of 4-5 students implement a prototype of the graph data management and processing system.
– Task 5: Each student will write a 3 page report and essay.
57
Task 1: Application Design “Dry Run”
• The goal of this project is to complete the following tasks – Pick a NoSQL data model and map graph data model into that model
– In words or pseudo-code, describe how you would do the queries below
• find a node properties based on its identifier
• find the neighbors of a given node
• find the "friends of a friend" of a given node
• transitive closure
• bridges
– Discuss how different usage profiles presented in the lecture (Slide #51) affect the processing of these queries
• Deliverable is a written report
• Students will conduct this part of the project individually
58
Task 2: Systems Profile
• Horizontally scalable data management systems – Riak (http://wiki.basho.com/)
– Project Voldemort (http://project-voldemort.com/)
– CouchDB (http://couchdb.apache.org/)
– SimpleDB (http://www.amazon.com/simpledb/)
– HBase (http://www.mongodb.org/)
– Cassandra (http://cassandra.apache.org/)
– OrientDB (http://www.neo4j.org/)
• Groups of 4-5 students collaborate on a systems profile – groups will be formed on October 4
– decide in advance with who you would like to work on what
59
Task 2: Systems Profile
• Data Model – Precise description of the data model, especially in terms of differences
from the "standard" models presented in the lecture
– Detailed summary of the basic data manipulation API, i.e. features to create, retrieve, update and delete data items.
• Query Support – Supported query types, i.e. point, range, navigation, and/or arbitrary?
– What is the query language of the system? Is it declarative, functional, algebraic and/or imperative?
– Are queries automatically optimized?
• Indexes – What index structures are available?
– What can be indexed? What can be a key? What can be a value?
– How are indexes managed, i.e. manually or automatically?
60
Task 2: Systems Profile
• Storage – Disk or file storage
– In-memory (RAM)
– Flash or SSD
– Traditional database
– Cloud Storage (GFS, HDFS, S3)
• Transactions and Concurrency Control – Does the system support transactions?
– How are transactions implemented, i.e. locks, OCC, MVCC, etc.?
– What consistency guarantees are given?
61
Task 2: Systems Profile
• Scalability and Replication – What types of replication are supported, i.e. synchronous or
asynchronous?
– ...
• Platform/Deployment – What cloud infrastructures are supported?
– What deployment scenarios are supported, i.e. embedded, client/server, multi-core CPU, cloud, etc.?
– Language bindings?
– Communication protocols, i.e. JSON, REST, etc.?
62
Task 3: Application Design
• Design the example graph data management in the previously profiled system – similar to Tart 1, but more technical as it is based on a concrete system
– consider only "friends of a friend", transitive closure and bridges query
– insights for other queries optional, but highly welcome and appreciated
• Deliverable is a ten minute presentation in class – November 15
– discuss final design and motivate the design choices made w.r.t. the requirements of the application and the capabilities of the system
– give details on the mapping of data structures, planned indexes, and query implementation strategies
• Same groups of 4-5 students will continue to collaborate
63
Task 4: Prototype Implementation
• Implement a small prototype based on previous design – data model (with data loading capabilities)
– three queries mentioned before
• The goal of this part of the project is to realize of a small application and experience its performance in practice
• Alternative Option: If you feel you lack implementation experience to complete this task, you may contribute to "benchmarking" the systems built by your peers
• Deliverable is the developed source code by the students
• The same groups of 4-5 students continue to collaborate
• The team with the best solution in terms of design and performance will win a price!
64
Task 5: Report and Essay
• Final task is to write a 3 page paper
• Report (1 page) – summarize major decisions and choices made throughout the course of
the projects
– show an understanding of and active involvement in the project
• Essay (2 pages) – “cloud-scale data management systems in today's world of computing”
– demonstrate that they can position these systems within the IT landscape
– show awareness of advantages and disadvantages
– conclude with a personal opinion about/view on cloud-scale data management systems.
• Deliverable is the final report
• This part of the project is conducted individually
65
top related