distributed database design 3rd assignment
TRANSCRIPT
7/28/2019 Distributed Database Design 3rd Assignment
http://slidepdf.com/reader/full/distributed-database-design-3rd-assignment 1/22
DISTRIBUTED DATABASE
DESIGN
7/28/2019 Distributed Database Design 3rd Assignment
http://slidepdf.com/reader/full/distributed-database-design-3rd-assignment 2/22
2
DISTRIBUTED DATABASE DESIGN
The organization of distributed systems can be investigated along
three orthogonal dimensions:
1. Level of sharing
2. Behavior of access patterns
3. Level of knowledge on access pattern behavior
7/28/2019 Distributed Database Design 3rd Assignment
http://slidepdf.com/reader/full/distributed-database-design-3rd-assignment 3/22
3
DISTRIBUTED DATABASE DESIGN
Level of sharing
– no sharing - each application and its data execute at one site,
– data sharing - all the programs are replicated at all the sites, but data
files are not,
– data plus program sharing - both data and programs may be shared.
Behavior of access patterns
– static - access patterns of user requests do not change over time,
– dynamic - access patterns of user requests change over time.
Level of knowledge on access pattern behavior – complete information - the access patterns can reasonably be
predicted and do not deviate significantly from the predictions,
– partial information - there are deviations from the predictions.
7/28/2019 Distributed Database Design 3rd Assignment
http://slidepdf.com/reader/full/distributed-database-design-3rd-assignment 4/22
4
ALTERNATIVE DESIGN STRATEGIES
Two major strategies that have been identified for designing
distributed databases are:
•
the top-down approach
• the bottom-up approach
7/28/2019 Distributed Database Design 3rd Assignment
http://slidepdf.com/reader/full/distributed-database-design-3rd-assignment 5/22
5
ALTERNATIVE DESIGN STRATEGIES
TOP-DOWN DESIGN PROCESS
7/28/2019 Distributed Database Design 3rd Assignment
http://slidepdf.com/reader/full/distributed-database-design-3rd-assignment 6/22
6
ALTERNATIVE DESIGN STRATEGIES
TOP-DOWN DESIGN PROCESS• Activity begins with the requirement analysis that defines
environment of the system and elicits both the data and
processing needs of database users.
• The requirement study also specifies where the final system is
expected to stand with respect to the objectives of the
distributed DBMS.
• The requirement document is input to two parallel activities:
view design and conceptual design.
7/28/2019 Distributed Database Design 3rd Assignment
http://slidepdf.com/reader/full/distributed-database-design-3rd-assignment 7/22
Continued..
• view design - defining the interfaces for end users,
• conceptual design - is the process by which the enterprise isexamined to determine entity types and relationships among
these entities. One can possibly divide this process into torelated activity groups:
– entity analysis - is concerned with determining theentities, their attributes, and the relationships amongthese entities,
– functional analysis - is concerned with determining thefundamental functions with which the modeled enterpriseis involved.
7/28/2019 Distributed Database Design 3rd Assignment
http://slidepdf.com/reader/full/distributed-database-design-3rd-assignment 8/22
Continued..
– View integration should be used to ensure that entity and
relationship requirements for all the views are covered in
the conceptual schema.
– In conceptual design and view design activities the user
needs to specify the data entities and must determine the
applications that will run on the database as well as
statistical information about these applications.
– Statistical information includes the specification of the
frequency of user applications, the volume of variousinformation.
7/28/2019 Distributed Database Design 3rd Assignment
http://slidepdf.com/reader/full/distributed-database-design-3rd-assignment 9/22
9
ALTERNATIVE DESIGN STRATEGIES
TOP-DOWN DESIGN PROCESS• The GCS and acess pattern information collected as a result of
view design are inputs to the distribution design step.
• distributions design - design the local conceptual schemas by
distributing the entities over the sites of the distributed system. The
distribution design activity consists of two steps: – fragmentation
– allocation
• physical design - is the process, which maps the local conceptual
schemas to the physical storage devices available at the correspondingsites,
• observation and monitoring - the results is some form of feedback,
which may result in backing up to one of the earlier steps in the design.
7/28/2019 Distributed Database Design 3rd Assignment
http://slidepdf.com/reader/full/distributed-database-design-3rd-assignment 10/22
10
ALTERNATIVE DESIGN STRATEGIES
BOTTOM-UP DESIGN PROCESS
Top-down design is a suitable approach when a database
system is being designed from scratch.
If a number of databases already exist, and the design task
involves integrating them into one database - the bottom-up
approach is suitable for this type of environment. The starting
point of bottom-up design is the individual local conceptual
schemas. The process consists of integrating local schemas into
the global conceptual schema.
7/28/2019 Distributed Database Design 3rd Assignment
http://slidepdf.com/reader/full/distributed-database-design-3rd-assignment 11/22
11
DISTRIBUTION DESIGN ISSUESREASONS FOR FRAGMENTATION
• The important issue is the appropriate unit of distribution. For a number
of reasons it is only natural to consider subsets of relations as distribution
units.
• If the applications that have views defined on a given relation reside at
different sites, two alternatives can be followed, with the entire relation
being the unit of distribution. The relation is not replicated and is stored
at only one site, or it is replicated at all or some of the sites where the
applications reside.
• The fragmentation of relations typically results in the parallel execution of
a single query by dividing it into a set of subqueries that operate on
fragments. Thus, fragmentation typically increases the level of
concurrency and therefore the system throughput.
7/28/2019 Distributed Database Design 3rd Assignment
http://slidepdf.com/reader/full/distributed-database-design-3rd-assignment 12/22
12
DISTRIBUTION DESIGN ISSUESREASONS FOR FRAGMENTATION
• There are also the disadvantages of fragmentation:
– if the application have conflicting requirements which
prevent decomposition of the relation into mutually
exclusive fragments, those applications whose views are
defined on more than one fragment may suffer
performance degradation,
– the second problem is related to semantic data control,
specifically to integrity checking.
7/28/2019 Distributed Database Design 3rd Assignment
http://slidepdf.com/reader/full/distributed-database-design-3rd-assignment 13/22
13
DISTRIBUTION DESIGN ISSUESFRAGMENTATION ALTERNATIVES
• The are clearly two alternatives:
– horizontal fragmentation
– vertical fragmentation
• The fragmentation may, of course, be nested. If the nestings
are of different types, one gets hybrid fragmentation.
7/28/2019 Distributed Database Design 3rd Assignment
http://slidepdf.com/reader/full/distributed-database-design-3rd-assignment 14/22
14
DISTRIBUTION DESIGN ISSUESDEGREE OF FRAGMENTATION
• The extent to which the database should be fragmented is an
important decision that affects the performance of query
execution.
• The degree of fragmentation goes from one extreme, that is,
not to fragment at all, to the other extreme, to fragment to
the level of individual tuples (in the case of horizontal
fragmentation) or to the level of individual attributes (in the
case of vertical fragmentation).
7/28/2019 Distributed Database Design 3rd Assignment
http://slidepdf.com/reader/full/distributed-database-design-3rd-assignment 15/22
15
DISTRIBUTION DESIGN ISSUESCORRECTNESS RULES OF FRAGMENTATION
Completeness
If a relation instance R is decomposed into fragments R1 ,R2 , ..., Rn , each
data item that can be found in R can also be found in one or more of Ri ’s.
This property is also important in fragmentation since it ensures that the
data in a global relation is mapped into fragments without any loss.Reconstruction
If a relation R is decomposed into fragments R1 ,R2 , ..., Rn , it should be
possible to define a relational operator such that:
R =Ri , Ri F R
The reconstructability of the relation from its fragments ensures that
constraints defined on the data in the form of dependencies are
preserved.
7/28/2019 Distributed Database Design 3rd Assignment
http://slidepdf.com/reader/full/distributed-database-design-3rd-assignment 16/22
16
DISTRIBUTION DESIGN ISSUESCORRECTNESS RULES OF FRAGMENTATION
Disjointness
If a relation R is horizontally decomposed into fragments R1 ,R2 , ..., Rn and
data item d i is in R j , it is not in any other fragment Rk (k j ). This criterion
ensures that the horizontal fragments are disjoint. If relation R is vertically
decomposed, its primary key attributes are typically repeated in all itsfragments. Therefore, in case of vertical partitioning, disjointness is
defined only on the nonprimary key attributes of a relation.
7/28/2019 Distributed Database Design 3rd Assignment
http://slidepdf.com/reader/full/distributed-database-design-3rd-assignment 17/22
17
DISTRIBUTION DESIGN ISSUESALLOCATION ALTERNATIVES
• The reasons for replication are reliability and efficiency of
read-only queries.
• Read-only queries that access the same data items can be
executed in parallel since copies exist on multiple sites.
• The execution of update queries cause trouble since the
system has to ensure that all the copies of the data are
updated properly.
• The decisions regarding replication is a trade-off which
depends on the ratio of the read-only queries to the update
queries.
7/28/2019 Distributed Database Design 3rd Assignment
http://slidepdf.com/reader/full/distributed-database-design-3rd-assignment 18/22
18
DISTRIBUTION DESIGN ISSUESALLOCATION ALTERNATIVES
• A nonreplicated database (commonly called a partitioned
database) contains fragments that are allocated to sites, and
there is only one copy of any fragment on the network.
• In case of replication, either the database exists in its entirety
at each site ( fully replicated database), or fragments are
distributed to the sites in such a way that copies of a fragment
may reside in multiple sites ( partially replicated database).
7/28/2019 Distributed Database Design 3rd Assignment
http://slidepdf.com/reader/full/distributed-database-design-3rd-assignment 19/22
19
DISTRIBUTION DESIGN ISSUESALLOCATION ALTERNATIVES
7/28/2019 Distributed Database Design 3rd Assignment
http://slidepdf.com/reader/full/distributed-database-design-3rd-assignment 20/22
20
DISTRIBUTION DESIGN ISSUESINFORMATION REQUIREMENTS
• The information needed for distribution design can be divided
into four categories:
– database information: local organization of database
– application information: location of application
– communication network information: access
characteristics of applications to database
– computer system information.
7/28/2019 Distributed Database Design 3rd Assignment
http://slidepdf.com/reader/full/distributed-database-design-3rd-assignment 21/22
21
DISTRIBUTION DESIGN ISSUESFRAGMENTATION
• Horizontal fragmentation partitions a relation along its tuples
• Two versions of horizontal fragmentation
– Primary horizontal fragmentation of relation is performed using
predicates that are defined on that relation
– Derived fragmentation is the partitioning of relation that results from
predicates being defined on another relation
7/28/2019 Distributed Database Design 3rd Assignment
http://slidepdf.com/reader/full/distributed-database-design-3rd-assignment 22/22
22
DISTRIBUTION DESIGN ISSUESFRAGMENTATION
• Vertical fragmentation partitions a relation into a set
of smaller relations so that many of users aplications
will run on only one fragment
• Vertical fragmentation is inherently more
complicated than horizontal partitioning