properties of data replication : replication can increase read performance replication can be used...
TRANSCRIPT
Properties of Data Replication :•Replication can increase read performance
•Replication can be used to integrate heterogeneous systems that uses different databases.
• Replication can be used for both failure and disaster recovery
• The costs of using replication is decreased update performance and the costs of managing consistency problems. (Relaxed ACID properties may be used in order to minimize inconsistency).
Cases where replication designs and extended transactions are recommended:
• ERP systems • E-commerce• Electronic health records (Elektronisk patientjournal)• Logistics• Airline reservation systems• CSCW systemer• Distributed calender systems• Mobile integrated database applications• Supply chain management• Banking systems• Library systems
Overview of different types of replication designs:
Suppose all replication designs have n physical copies of the same logical table.
In the n-safe replication design, all n copies are consistent and up to date.
In the quorum-safe design, a quorum of the n copies are consistent and up to date.
In the 1-safe design, only one of the n copies is consistent and up to date.
In the 0-safe design, none of the copies are consistent and up to date. The inconsistencies can be managed by using countermeasures.
Logical table
........................
..Physical tablesPhysical tablesPhysical tables1 to n relationship
Rules for using Data Replication in Distributed Databases with Relaxed ACID Properties
1. Use only replicated data when it is necessary for availability or economical reasons.
2. The 0-safe design with local commit is recommended when it is important to update in disconnected mode and it is possible to implement sufficient local countermeasures against the isolation anomalies.
3. The 0-safe design with deferred commit is recommended when it is important to update in disconnected mode and it is not possible to implement sufficient local countermeasures against the isolation anomalies.
4. The 1-safe designs are recommended in situations when real time update is not important in disconnected mode. Therefore, updates must wait until the primary copy location has committed the updates.Please notice that the basic 1-safe designs is the cheapest replication method.
The Basic 1-safe Design:
The 1-safe designs are recommended in situations when real time update is not important in disconnected mode. Therefore, updates may wait until the primary copy location has committed the updates.The basic 1-safe designs is the cheapest replication method.
Primary location Secondary location
Client
Commit Log records
Commit
Customers Orders Products Orderlines
Product-stocks
E-Commerce System
Customers Orders Products Orderlines
Product-stocks
ERP System
Suppose the ERP system and B2B E-commerce system are heterogeneous. Which ERP tables would you recommend to replicate by using the Basic 1-safe design?
Replication Example:The basic 1-safe design is recommended when it does not make major problems if updates are delayed for hours or even days. In the “Danish medicine card” system, all hospitals and private physicians must transfer their medicine prescriptions to a central database by using the 0-safe design with local commit. What replication design would you recommend for the Medicine types?
Patients_____ Patient IDNameAddress
Diagnoses/diseases
Disease typesMedicine prescriptions
Medicine types
The basic 1-safe Design
The 1-safe Design with commutative updates
Primary location Secondary location
Client
Commit Log records
Commit
Nearest location Remote location Client
Commit Trans. record drecord
In the basic 1-safe replication design, lost transaction may occur when the secondary location takes over after a primary copy failure.
Why can lost transactions not occur in the 1-safe design with commutative updates?
The Basic 1-safe Design Example:
The basic 1-safe design is recommended when it does not make major problems if updates are delayed for hours or even days.
Example:
Patients_____ Patient IDNameAddress
Diagnoses/diseases
Disease types
Which of the 1-safe designs would you recommend for the Disease types table?Where would you recommend to store the back up of this table?
Entities with different versions of 0-safe design
Which of the1-safe designswould you recommend?
Health records
TreatmentsDiagnoses/diseases
Patient admitsSympthoms
and test results
Employees
Prescriptions
Prescription lines
Patient discharges
...
Conseptual hospital entites in general are below the dottet line
Basic Health records are above the dottet line
...Patient admit type
Health record subtypes
Figure 2. Generalized ER diagram of a local hospital database
Medical tests subtypes
Sympthom types
Disease types
Treatment types
Patient discharges type
Medicin types
Medicin productsMedicin
companies
Patients_____ Patient IDNameAddress
The 0-safe design with local commit:
Nearest location Remote location Client
Trans. record drecord
The 0-safe design with local commit is recommended when it is important to update in disconnected mode and it is possible to implement sufficient local countermeasures against the isolation anomalies.
Example.
Patients_____ Patient IDNameAddress
SympthomsSymptom
types
The 0-safe design with deferred commit:
The 0-safe design with deferred commit is recommended when it is important to update in disconnected mode and it is not possible to implement sufficient local countermeasures against the isolation anomalies.
Example.
Nearest location Primary location Client
Compensate?
Trans. record drecord
Compensate?
Patients_____ Patient IDNameAddress
Diagnoses/diseases
Disease types
0-safe Design Example:In the “Danish medicine card” system, all hospitals and private physicians must transfer their medicine prescriptions to a central database by using the 0-safe design with local commit. Which type of 0-safe replication design would you recommend for the Medicine prescriptions?
Patients_____ Patient IDNameAddress
Diagnoses/diseases
Disease typesMedicine prescriptions
Medicine types
Where would you recommend the different types of 0-safe design?
Entities with the1-safe design
Health records
TreatmentsDiagnoses/diseases
Patient admitsSympthoms
and test results
Employees
Prescriptions
Prescription lines
Patient discharges
...
Conseptual hospital entites in general are below the dottet line
Basic Health records are above the dottet line
...Patient admit type
Health record subtypes
Figure 2. Generalized ER diagram of a local hospital database
Medical tests subtypes
Sympthom types
Disease types
Treatment types
Patient discharges type
Medicin types
Medicin productsMedicin
companies
Patients_____ Patient IDNameAddress
Electronic Health Records
Patients
Healthrecords
Hospitaladmis-sions
Diag-noses
Pre-scrip-tions
Surge-ries
Hospitaldis-charges
- Patient - ID- Name - Address- Balance
…….
Deffered commit Local commit
3. The 0-safe design with local commit is recommended when it is important to operate in disconnected mode, and it is possible to implement sufficient local countermeasures against the isolation anomalies.
4. The 0-safe design with deferred commit is recommended when it is important to operate in disconnected mode, and when it is not possible to implement sufficient local countermeasures against the isolation anomalies.
Customers Orders Products Orderlines
Product-stocks
E-Commerce System
Customers Orders Products Orderlines
Product-stocks
ERP System
Suppose the ERP system and B2B E-commerce system are heterogeneous. Which tables would you recommend to replicate by using one the 0-safe designs?
The basic N-safe Design
The basic 1-safe Design
The basic 0-safe Design
Primary location Secondary location
Client
Commit Log records
Commit
Nearest location Remote location Client
Commit Trans. record drecord
Commit
Coordinator location Participant tion Client
Commit
Log records
Commit
Evaluation of replication designsReplication methods
Properties
n-safe design with the ROWA protocol
n-safe design with the quorum protocol
1-safe design. Basic solution
1-safe design with com-mutative updates
0-safe design with local commit
0-safe designs with deferred commit
No- replica-tion design
Read per-formance/ capacity
Best Worst Average Average Best Best Below average
Write per-formance
Worst Above worst
Average Average Best Below best
Average
Ease of failure recovery
Average Average Worst Best Best Below best
Best
Ease of disaster recovery
Best Below best
Above worst
Average Average Below average
Worst
The probability of lost data
Best. pn
Below best pn/2
Worst p
Average
Average
Average
Worst p
Logging of the update transaction
Optional Optional Optional Not optional
Not optional
Not optional
Optional
Availability Below average
Below average
Average in theory
Best Best Below best
Worst
Atomicity Best Best Worst Best Best Best Best Consistency Best Best Average Average Worst Worst Best Isolation Best Best Average Average Worst Worst Best Durability Best Best Worst Best Best Best Best Develop-ment costs
Best Best Best Average Worst Worst Best
Electronic Health Records (EHR)
Patients
Healthrecords
Hospitaladmis-sions
Diag-noses
Pre-scrip-tions
Surge-ries
Hospitaldis-charges
- Patient - ID- Name - Address- Balance
…….
Architectures for integrating electronic health records
Evaluation criteria
Local auto-nomy
Read perfor-mance
Soft-ware costs
Anomaly pro-blems
1. Integration by using SOA health services
Best Worst Worst Worst
2. Central database solution Worst Best Best Best
3. Central database solution mixed with SAO integration
Average Above worst
Average Above worst
4. Distributed subscriber solution Best Best Worst Average
5. Central subscriber offering SOA services to others
Best Average Worst Average
6. Central database solution with central subscription and SOA services to others
Average Average Average Average
7. Central database solution mixed with distributed subscription on top of central subscription
Average Best Average Average
Overview of the most important EHR integr. architectures
Architectures for integrating electronic health records
Evaluation criteria
Local auto-nomy
Read perfor-mance
Software costs
Anomaly problems
1. Integration by using SOA health services
Best Worst Worst Worst
2. Central database solution Worst Best Best Best
3. Central database solution mixed with SAO integration
Average Above worst Average Above worst
Read
Read
Hospital of residence Hospital of admission
Any other hospital
Data
Data Data
Architectures for integrating electronic health records
Evaluation criteria
Local auto-nomy
Read perfor-mance
Software costs
Anomaly problems
1. Integration by using SOA health services
Best Worst Worst Worst
2. Central database solution Worst Best Best Best
3. Central database solution mixed with SAO integration
Average Above worst Average Above worst
UpdateHospital of admission
Any other hospital
Central database with all functions
Update
Read
Hospital of residence
Data
Architectures for integrating electronic health records
Evaluation criteria
Local auto-nomy
Read perfor-mance
Software costs
Anomaly problems
1. Integration by using SOA health services
Best Worst Worst Worst
2. Central database solution Worst Best Best Best
3. Central database solution mixed with SAO integration
Average Above worst Average Above worst
UpdateHospital of admission
Any other hospital
Central database with only new develloped functionsData
UpdateRead
Hospital of residence
Read
ReadData
DataData
Architectures for integrating electronic health records
Evaluation criteria
Local auto-nomy
Read perfor-mance
Software costs
Anomaly problems
4. Distributed subscriber solution Best Best Worst Average
5. Central subscriber offering SOA services to others
Best Average Worst Average
6. Central solution with central sub-scription and SOA services to others
Average Average Average Average
Hospital of residence Hospital of admission
Any other hospital
Update
Update
Data
DataData
Architectures for integrating electronic health records
Evaluation criteria
Local auto-nomy
Read perfor-mance
Software costs
Anomaly problems
4. Distributed subscriber solution Best Best Worst Average
5. Central subscriber offering SOA services to others
Best Average Worst Average
6. Central solution with central sub-scription and SOA services to others
Average Average Average Average
UpdateHospital of admission
Any other hospital
Update
Read
Hospital of residence
Central read distribution of redundant dataData
DataData
Data
Architectures for integrating electronic health records
Evaluation criteria
Local auto-nomy
Read perfor-mance
Software costs
Anomaly problems
4. Distributed subscriber solution Best Best Worst Average
5. Central subscriber offering SOA services to others
Best Average Worst Average
6. Central solution with central sub-scription and SOA services to others
Average Average Average Average
UpdateHospital of admission
Any other hospital
Central database with new develloped functionsData
UpdateRead
Hospital of residence
Read
ReadData
DataData
Architectures for integrating electronic health records
Evaluation criteria
Local auto-nomy
Read perfor-mance
Software costs
Anomaly problems
5. Central subscriber offering SOA services to others
Best Average Worst Average
6. Central database solution with central subscription and SOA services to others
Average Average Average Average
7. Central solution mixed with distributed subscription on top of central subscription
Average Best Average Average
UpdateHospital of admission
Any other hospital
UpdateHospital of residence
Update Update
Data
Central update distribution of both redundant and central data
Data Data
Data
Architectures for integrating electronic health records
Evaluation criteria
Local auto-nomy
Read perfor-mance
Soft-ware costs
Anomaly pro-blems
1. Integration by using SOA health services
Best Worst Worst Worst
2. Central database solution Worst Best Best Best
3. Central database solution mixed with SAO integration
Average Above worst
Average Above worst
4. Distributed subscriber solution Best Best Worst Average
5. Central subscriber offering SOA services to others
Best Average Worst Average
6. Central database solution with central subscription and SOA services to others
Average Average Average Average
7. Central database solution mixed with distributed subscription on top of central subscription
Average Best Average Average
Which EHR integration architecture would you recommend and why?
Architectures for integrating electronic health records
Evaluation criteria
Local auto-nomy
Read perfor-mance
Soft-ware costs
Anomaly pro-blems
1. Integration by using SOA health services
Best Worst Worst Worst
2. Central database solution Worst Best Best Best
3. Central database solution mixed with SAO integration
Average Above worst
Average Above worst
4. Distributed subscriber solution Best Best Worst Average
5. Central subscriber offering SOA services to others
Best Average Worst Average
6. Central database solution with central subscription and SOA services to others
Average Average Average Average
7. Central database solution mixed with distributed subscription on top of central subscription
Average Best Average Average
Overview of the most important EHR integr. architectures
Concept definitions used in logistic exercise:
Pallet = wooden skeleton where packages may be stored in such a way that they all can be moved by a truck.
Collie = alle the packages that are stored on a pallet(palle).
Leg = Route or subroute where the transportation does not have stops
ER-diagram of a logistics management system
Orders Customers
Transport medias like ships, airplanes, and trucks.
Physical containers
Scheduled routes and legs
Orderlines
Packages and Collies
Loca-tions
Route-leg hierarchy
Package- Collie hierarchy
from
to
Routes and legs
from
to
Damagerelationship
from to
Container-routes
Transport operator
How should the transport orders and sub-orders be replicated in order to optimize the transports?
Global ER-diagram of integrated logistics management
Orders Customers
Transport medias like ships, airplanes, and trucks.
Physical containers
Scheduled routes and legs
Orderlines
Packages, Collies and Containers
Loca-tions
Route-leg hierarchy
Package- Collie hierarchy
from
to
Routes and legs
from
to
Damagerelationship
from to
Container-routes
Transport operator
Describe the local databases in the central location of the transport company, the locations of the integrated transport suppliers, and the mobile locations of the Transport medias. Design a workflow with focus on the integration of the local database locations.
Horizontal fragmentation:
Global table
Fragment 1
Fragment 2
Fragments = The non-redundant and non-overlapping parts of a global distributed table.
Fragments may be allocated in many different locations.
Vertikal Fragmentation:
Example:In an employ table some attribytes/fragments may be confidential and stored in a secure location.
Fragmentation rules:
Vertical fragmentation supports distribution by function where different functions use different attributes.
Horizintal fragmentation supports geografical distribution where different locations use different rows.
Mixed Fragmentation:
Horizontal fragmentation on a vertical fragmentation.
Horizontal fragmentation on a vertical fragmentation.
Entities with different versions of 0-safe design
Entities with 1-safe design
Describe your recommendations for distributed table fragmentation?Health records
TreatmentsDiagnoses/diseases
Patient admitsSympthoms
and test results
Employees
Prescriptions
Prescription lines
Patient discharges
...
Conseptual hospital entites in general are below the dottet line
Basic Health records are above the dottet line
...Patient admit type
Health record subtypes
Figure 2. Generalized ER diagram of a local hospital database
Medical tests subtypes
Sympthom types
Disease types
Treatment types
Patient discharges type
Medicin types
Medicin productsMedicin
companies
Patients_____ Patient IDNameAddress
A distributed ERP system =A set of local ERP systems integrated in such a way that each local system can use the resources/stocks managed by the other local ERP systems.
Orders Orderlines
Customers
Locations Stocks
Products
Would you recommend the distributed ERP architecture for a mobile salesman?
Which tables in Distributed ERP system would you recommend to replicate?
Customers Orders Products Orderlines
Product-stocks Locations
1. Use only replicated data when it is necessary or convenient for economical reasons. 2. The 0-safe design with local commit is recommended when it is important to update in
disconnected mode and it is possible to implement sufficient local countermeasures against the isolation anomalies.
3. The 0-safe design with deferred commit is recommended when it is important to update in disconnected mode and it is not possible to implement sufficient local countermeasures against the isolation anomalies.
4. The 1-safe designs are recommended in situations when real time update is not important in disconnected mode. Therefore, updates may wait until the primary copy location has committed the updates.Please notice that the basic 1-safe designs is the cheapest replication method.
Allokation of fragments:
Customers Orders Products Orderlines
Product-stocks Locations
Is it possible to optimize the Product and Product stock tables by integrating them to a single table locally?
Allocation is the physical placement of fragments in different locations. Allocated fragments may be replicated.
Exercise:How are the tables fragmented and allocated in a distributed ERP system?
Replication in a distributed ERP system:
Orders are fragmented and without replication
Fragmented and 0-safe with primary copy commit.
Fragmented and without replication
Customers Orders Products Orderlines
Product-stocks Locations
Not fragmented and basic 1-safe.
Local customers are fragmented and has the basic 1-safe design.
Global customers are fragmented and has the 0-safe with primary copy commit.
Should all attributes in a table have the same replication design? (Analyze the Orderlines or Products ).
Exercise:
Customers Orders Products Orderlines
Product-stocks Locations
Describe and design the local databases for a distributed brewery with many different production, sale and depot locations.
Can the earlier described distributed ERP system be used?
Design a Distributed Airline Database
Design an integrated distributed database that integrate the databases of different airline companies in a way that optimize performance, availability and consistency of a common distributed airline system with local databases in the airline companies, airports, and “sale offices” at e.g. travel agents, hotels and e-commerce servers.
Flight routes
Subroutes
Departures
Airports
Tickets
Travel arrangement
Customers
Airline companies
Plains
Plain types
Exercise:
Hotels
Rooms
Room reservations
Services/ tours/ car rentals Check-in
periods
Customers Customer groups
Hotel chains Design an integrated
distributed database that integrate the databases of different airline companies and hotel chains in a way that optimize performance, availability and consistency of a common distributed system with local databases in the airline companies, hotel chains, airports, and “sale offices” at e.g. travel agents, hotels and e-commerce servers.
End of session
Thank you !!!Thank you !!!
The Basic 1-safe Design
The Basic 0-safe Design
Primary location Secondary location
Client
Commit Log records
Commit
Nearest location Remote location Client
Commit Trans. record drecord
Commit
The x-safe replication design:
Suppose n is the number of replicated tables andx is an integer in the interval [0,n] where n is an integer greater than one.
In the x-safe designx out of n replicated tables are consistent and and up to date.
Countermeasures against missing isolation property
Patients
Healthrecords
Hospitaladmit-ments
Diag-noses
Pre-scrip-tions
Surge-ries
Hospitaldis-charges
- Patient - ID- Name - Address- Balance
…….
Version file countermeasure
Commutative update countermeasure
Version file countermeasure
Pessimistic view countermeasure for related resources
Properties of different EHR database design methods Evaluation criteria Database designs for EHR systems.
Traditional normalized database design
XML based storing of variable health attributes
Generalized subtypes are used for storing variable health attributes
Flexibility towards new health record types
Worst Best Best
Performance of overview queries
Worst Best Best
Performance of queries that need variable health attributes
Best Average Worst
Storage consumption Best Average Worst
Development costs for table driven applications
Worst Best Best
Flexibility towards data analyses
Average Worst Best
Is normalization used? Yes No No
EHR Datawarehouse:
Departments
The most important attributes of Electronic Patient Record eventsFact table
Dimensions
Conformed dimension hierarchies
Hospitals
Regions
Contries
Attributes special for ERP event type 1
Dimensions for event type 1
Time dimensionPatientsEHR event
types
Attributes special for ERP event type n.........
One-to-one relationship One-to-one relationship
Dimensions for event type 1 .........
Dimension hierarchies special for the different event types
Properties of different EHR database design methods Evaluation criteria Database designs for EHR systems.
Traditional normalized database design
XML based storing of variable health attributes
Generalized subtypes are used for storing variable health attributes
Flexibility towards new health record types
Worst Best Best
Performance of overview queries
Worst Best Best
Performance of queries that need variable health attributes
Best Average Worst
Storage consumption Best Average Worst
Development costs for table driven applications
Worst Best Best
Flexibility towards data analyses
Average Worst Best
Is normalization used? Yes No No
”Health replication” methods0-safe with primary copy in the patients home location.
Patients
Patientrecords
Hospitaladmit-ments
Diag-noses
Pre-scrip-tions
Surge-ries
Hospitaldis-charges
- Patient - ID- Name- Address- Balance
…….
0-safe withlocal commit
0-safe withprimary copycommit in thepatients homelocation
0-safe with local commit.
Relations to resourceshave primary copy commitin the locations of theresources.