dbsystems distributed database systems university of manitoba asper school of business 3500 dbms bob...

24
D D B B S S Y Y S S T T E E M M S S Distributed Database Systems University of Manitoba Asper School of Business 3500 DBMS Bob Travica Updated 2013 1 of 24

Upload: kathleen-watkins

Post on 01-Jan-2016

216 views

Category:

Documents


3 download

TRANSCRIPT

DDBB

SSYYSSTTEEMMSS

Distributed Database

Systems

University of ManitobaAsper School of Business

3500 DBMSBob Travica

Updated 2013

1 of 24

DDBB

SSYYSSTTEEMMSS

Distributed Databases (DDB) Concept

Britain

Germany

France

Italy

SELECT SalesFROM Britain.SalesUNIONSELECT SalesFROM France.SalesUNIONSELECT SalesFROM Italy.Sales

Multiple independent systems

Each has DBMS engine, queries, locking,

transactinos, etc.

Usually on different machines & locations

Can be different hardware, OS, software.

2 of 24

DDBB

SSYYSSTTEEMMSS

DatabaseZeus

DatabaseApollo

DatabaseAthena

United States

England

France

• There must be different databases as opposed to a central database (teleprocessing)

• Network is a part of DDB system

Distributed Database System (DDBS) Concept

3 of 24

DDBB

SSYYSSTTEEMMSS

Teleprocessing vs. DDBS

DB

Loc 1, central

Loc 2 Loc 3

DBLoc1

DBLoc2

Teleprocessing DB System

Distributed DB System

4 of 24

DDBB

SSYYSSTTEEMMSS

DDBS Rules

Golden Rule - Transparency: the user should not know or care that the database is distributed and how. User is independent of any constraint.

Specifically: No reliance on a central site. Continuous operation. Location independence. Fragmentation (Partitioning)

independence Replication independence.

Support to distributed query processing.

Support to distributed transaction management.

Hardware independence. Operating system

independence. Network independence. DBMS independence.

(C.J. Date)

5 of 24

DDBB

SSYYSSTTEEMMSS

DDBS Features and Benefits

Each database can continue to run even if a portion fails.

Performance gains in contrast to teleprocessing.

Data and hardware can be moved without affecting

operations or users.

Operations expansion possible

System expansion and upgrades possible

6 of 24

DDBB

SSYYSSTTEEMMSS

More on DDBS Benefits

Support to operations’ dispersion Work and data are

segmented on departments. Work and data are

geographically segmented. Improved local performance

Most updates and queries are performed locally.

Local control and responsibility over data.

Can still combine data across the system.

Scalability and expansion support

Localtransactions

Futureexpansion

Combinedata

7 of 24

DDBB

SSYYSSTTEEMMSS

DDBS Designs

Two typical designs

PartitionedReplicatedPartitioned-

Replicated HybridTeleprocessing

Technical & economic requirements highlow

PartitionReplication C

B

A

C

B

AC

B

A

C

B

A

C

B

A

Loc 1 Loc 2 Loc 3

8 of 24

Loc 1

Loc 2

Loc 3

DDBB

SSYYSSTTEEMMSS

Choose hardware and DBMS vendor, and network.

Set up network and DBMS connections.

Choose locations for data segments.

Create backup plan and strategy.

Design local views (customized queries).

Design stress tests (max. loads, failure cases).

Creating Distributed Database System

9 of 24

DDBB

SSYYSSTTEEMMSS

Query Processing in Partitioned DDBS

Networks (LANs, WANs) infleunce the

speed of data transfer.

Goal is to minimize data transfers

Each system must be capable of

evaluating queries.

Results depend heavily on how the

system joins tables.

Disk driveLAN

WAN

10 of 24

DDBB

SSYYSSTTEEMMSS

Query Optimization in Partitioned DBS

Example NY: tbl.Customers: 1 M rows (Marketing) LA: tbl.Product: 10 M rows (Production) Chicago: tbl.Sale: 20 M rows (main Retail op’s) Query in Chicago: List customer and product data for

blue products sold on 1-Mar-2001. Must join the 3 tables.

Database is partitioned according to spatial dispersion of core operations (above). Only sales data are at the locale where query is run (Chicago).

Bad query design : Transfer entire tables Product and Customer to

Chicago

11 of 24

DDBB

SSYYSSTTEEMMSS

Customer(C#, …) 1,000,000 rows

NY

Product(P#, Color…) 10,000 rows

Sale(S#, C#, Date) 20,000,000 rowsSaleItem(S#, P#,…) 50,000,000 rows

Chicago

LA

returns requestedcustomerrecords

Better query design

qry1:SELECT C#FROM SaleWHERE Date=3/1/2006;

qry2:SELECT P#FROM SaleItem INNER JOIN Sale ON SaleItem.S#=Sale.S# WHERE Date=3/1/2006;

qry3:SELECT * FROM Customer WHERE C# in qry1’s output;

qry4:SELECT * FROM ProductWHERE P# in qry2’s outputAND Color like “blue”;

requests customerrecords for certain

customer IDs

requests product records

for certain Product IDs

returns requestedproductrecords

12 of 24

DDBB

SSYYSSTTEEMMSS

Replicated DDBS Design

Rule of thumb: Keep the data close to the location at which they are used.

Goals: Minimize transmissions Improve performance

Replication Manager

Britain: Customers& Sales

Spain: Customers& Sales

Britain

Britain: Customers& Sales

Spain: Customers& Sales

Spain

Periodic (batch) updates

Decision support systems:

Use replicated data warehouse.

13 of 24

DDBB

SSYYSSTTEEMMSS

Design: Partitioned database

The concurrency issues that apply to centralized database

become even more difficult in a partitioned database

Challenges: Slow network traffic Site unavailable Concurrency (more people trying to change the same data at the

same time) => locks management

DDBS and Concurrent Access

14 of 24

DDBB

SSYYSSTTEEMMSS

Two-Phase Commit Ensuring data consistency across the

system. In updating data, DBMS initiating

update becomes Coordinator. Two phases:

1. Get Ready: Coordinator sends new data and a request for update to all DDB parts.

DB partitions store new data in logs, and report update status.

2. Execute: 2.1 If all DDB parts ready for update,

Coordinator requests COMMIT. 2.2 If any DDB part not ready,

Coordinator requests ROLLBACK, and the update procedure repeats.

DB Partition 1 (Master price list)Initiate Transaction (Reduce price of

cookies for 5%)

DB Partition 2(Store A Product tbl) Partition 3

(Store B Product tbl)

1. Prepare to update.All agree?

2. CommitOK?

OK!

Go!

Coordinator

15 of 24

DDBB

SSYYSSTTEEMMSS

Client-Server DDB

Server performs database tasks at request of clients.

Clients handle front-end tasks and small data tables that are not shared.

Database Server

Client Front-end

(Forms, Queries,Reports)

QueryRequest

Query Output

Database ServerUpdateRequest

Update Response

16 of 24

DDBB

SSYYSSTTEEMMSS

Three-Tier Client-Server

Clients

Middleware

DatabaseServers

Front-end

DatabasesTransactionsLegacy systems

Database linksBusiness rulesCode

17 of 24

DDBB

SSYYSSTTEEMMSS

Client-Server Architecture for Internet

Client Browsers

DBS Middleware

DatabaseServers

Web Server

Application Server

Other Systems

18 of 24

DDBB

SSYYSSTTEEMMSS

Technologies for DDB Used in E-Commerce

- Open Database Connectivity (ODBC)

- Microsoft’s Active Data Objects (ADO)

- Client--Web Server Connectors (CGI)

- Data transfer (XML, SOAP)

19 of 24

DDBB

SSYYSSTTEEMMSS

Open Database Connectivity: ODBC

Since 1992, SQL Access Group; Microsoft’s decisive support

Provides SQL access to different DBMSes

ODBC handles: Login to database. Send query. Interpret result. Exchange data.

Oracle Database

Application

ODBC driver

CLIENT

ODBC driver

SE

LEC

T …

O

utpu

t

SQLDatabase

ODBC driver

ODBC driver

SERVER

SERVER

20 of 24

DDBB

SSYYSSTTEEMMSS

Active Data Objects (ADO)

Relational Database

Server

Front end (report)

ADO objects

ClientS

ELE

CT

Out

put

• ADO - Microsoft object-based technology for accessing data

• Data access at records level (“cursor programming”) via classes:

• recordset - similar to table• rowset - looser structure

Non-relationalDatabase

Server

• Builds on concepts of OO and cursor (the space for storing records in DB systems)

• Used in Active Server Pages (ASP)*

21 of 24

DDBB

SSYYSSTTEEMMSS

Client—Web Server Connectors

- HTML input needs to be translated for servers down the stream

- Output needs to be put back into HTML format

- Common Gateway Interface (e.g., CGI programming in PERL)

22 of 24

DDBB

SSYYSSTTEEMMSS

Communication technology -Extensible Markup Language (XML) &

SOAP

- SQL can extract DB data into XML documents. So, any database schema can be described => document transfer, validation and format support (e.g., supplier documents). - XML requires additional software for extracting data (parser) and formatting display (style sheets).

- Note: Document Type Definition (Declaration)

23 of 24

DDBB

SSYYSSTTEEMMSS

Extensible Markup Language (XML) & SOAP

- SOAP (Simple Object Access Protocol) - protocol for exchanging structured (marked up) data in distributed environment

- SOAP uses XML technologies to define an extensible

messaging framework, defining message that can

be exchanged over a variety of underlying protocols

(started with HTTP, hence “Simple”; complex indeed).

(See: http://www.w3.org/TR/2003/REC-soap12-part1-20030624/#encapsulation )

- SOAP and XML provide foundations for Web services – distributed application software, stored across nets, running on different operating systems and devices, written in different programming languages, and made by different vendors (e.g., .NET)

( More... )

24 of 24