Data and Applications Security Developments and Directions
Dr. Bhavani Thuraisingham
The University of Texas at Dallas
Lecture #2
Supporting Technologies: Data Management
January 13, 2005
Objective of the Unit
This unit will provide an overview of the concepts and developments in data management
Reference: Data Management Systems: Evolution and Interoperation, Thuraisingham, CRC Press, 1997
Outline of the Unit
Concepts in database systems
Types of database systems
Distributed Data Management
Heterogeneous database integration
Federated data management
Concepts in Database Systems
Definition of a Database system
Early systems
Metadata
Architectural Issues
- Schema, Functional
DBMS Design Issues
Other Issues
- Database design, Administration
Database System
Consists of database, hardware, Database Management System (DBMS), and users
Database is the repository for persistent data
Hardware consists of secondary storage volumes, processors, and main memory
DBMS handles all users’ access to the database
Users include application programmers, end users, and the Database Administrator (DBA)
Need: Reduced redundancy, avoids inconsistency, ability to share data, enforce standards, apply security restrictions, maintain integrity, balance conflicting requirements
We have used the definition of a database management system given in C. J. Date’s Book (Addison Wesley, 1990)
An Example Database System
[Figure: Users and Application Programs access the Database through the Database Management System]
Adapted from C. J. Date, Addison Wesley, 1990
Metadata
Metadata describes the data in the database
- Example: Database D consists of a relation EMP with attributes SS#, Name, and Salary
Metadatabase stores the metadata
- Could be physically stored with the database
Metadatabase may also store constraints and administrative information
Metadata is also referred to as the schema or data dictionary
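As a minimal sketch of the EMP example above, the following uses Python's built-in sqlite3 module, where SQLite's own catalog plays the role of the metadatabase. The choice of SQLite, the column types, and the primary key are assumptions made for illustration; the slide only names the relation and its attributes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Database D with relation EMP(SS#, Name, Salary); column types are assumed.
conn.execute('CREATE TABLE EMP ("SS#" INTEGER PRIMARY KEY, Name TEXT, Salary INTEGER)')

# The metadata is itself queryable: PRAGMA table_info lists one row per column,
# with the column name in position 1.
columns = [row[1] for row in conn.execute("PRAGMA table_info(EMP)")]
print(columns)

# sqlite_master stores the schema definition alongside the data.
schema_sql = conn.execute(
    "SELECT sql FROM sqlite_master WHERE type = 'table' AND name = 'EMP'"
).fetchone()[0]
print(schema_sql)
```

Here the metadata is physically stored with the database, which is one of the options the slide mentions.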
Three-level Schema Architecture: Details
[Figure: Users A1, A2, and A3 work with External Model A (External Schema A); Users B1 and B2 work with External Model B (External Schema B). External/Conceptual Mapping A and External/Conceptual Mapping B connect the external models to the Conceptual Model (Conceptual Schema). The Conceptual/Internal Mapping connects the Conceptual Model to the Stored Database (Internal Model, Internal Schema).]
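One way to picture the three levels is with views in a relational system: the base table stands for the conceptual schema, each view stands for an external schema, and the view definition is the external/conceptual mapping. The sketch below uses sqlite3; the table, view names, and rows are hypothetical, and the internal level is simply whatever storage layout SQLite chooses.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Conceptual schema: one base relation (hypothetical EMP table and rows).
conn.execute("CREATE TABLE EMP (SSN INTEGER PRIMARY KEY, Name TEXT, Salary INTEGER)")
conn.executemany("INSERT INTO EMP VALUES (?, ?, ?)", [(1, "John", 20), (2, "Paul", 30)])

# External schema A: these users see names and salaries.
conn.execute("CREATE VIEW PAYROLL_VIEW AS SELECT Name, Salary FROM EMP")

# External schema B: these users see names only.
conn.execute("CREATE VIEW DIRECTORY_VIEW AS SELECT Name FROM EMP")

payroll_rows = conn.execute("SELECT * FROM PAYROLL_VIEW ORDER BY Name").fetchall()
directory_rows = conn.execute("SELECT * FROM DIRECTORY_VIEW ORDER BY Name").fetchall()
print(payroll_rows)
print(directory_rows)
```

Both groups of users query the same stored data, but each sees only what its external schema exposes.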
Functional Architecture
Data Management
- User Interface Manager
- Query Manager
- Transaction Manager
- Schema (Data Dictionary) Manager (metadata)
- Security/Integrity Manager
Storage Management
- File Manager
- Disk Manager
DBMS Design Issues
Query Processing
- Optimization techniques
Transaction Management
- Techniques for concurrency control and recovery
Metadata Management
- Techniques for querying and updating the metadatabase
Security/Integrity Maintenance
- Techniques for processing integrity constraints and enforcing access control rules
Storage management
- Access methods and index strategies for efficient access to the database
Other Issues
Database design
- Generally a two-step process
  - Semantic data model to capture the entities of the application and the relationships between the entities
  - Generate the conceptual schema; theory of normal forms for relational databases
- Research on object-oriented approaches for database design
Database Administration
- Creating and deleting databases; backup and recovery, enforcing policies, auditing, etc.
Types of Database Systems
Relational Database Systems
Object Database Systems
Deductive Database Systems
Other
- Real-time, Secure, Parallel, Scientific, Temporal, Wireless, Functional, Entity-Relationship, Sensor/Stream Database Systems, etc.
Relational Database: Informal Overview
Collection of tables also called relations
Table has one or more columns also called attributes
Each table has zero or more rows also called tuples
Elements of a row take values from a pool of legal values
The values of one or more columns in a row uniquely identify the row. These columns form an identifier (also called key)
One identifier is designated as the unique identifier (also called primary key)
Querying relational databases using language called SQL (Structured Query Language)
Relational Database: Example
Relation S:
S#  SNAME  STATUS  CITY
S1  Smith  20      London
S2  Jones  10      Paris
S3  Blake  30      Paris
S4  Clark  20      London
S5  Adams  30      Athens
Relation P:
P#  PNAME  COLOR  WEIGHT  CITY
P1  Nut    Red    12      London
P2  Bolt   Green  17      Paris
P3  Screw  Blue   17      Rome
P4  Screw  Red    14      London
P5  Cam    Blue   12      Paris
P6  Cog    Red    19      London
Relation SP:
S#  P#  QTY
S1  P1  300
S1  P2  200
S1  P3  400
S1  P4  200
S1  P5  100
S1  P6  100
S2  P1  300
S2  P2  400
S3  P2  200
S4  P2  200
S4  P4  300
S4  P5  400
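To make the example concrete, the three relations can be loaded into an in-memory SQLite database and queried with SQL. This is a sketch only: the column types and key declarations are assumptions, and the two queries are illustrative choices, not part of the lecture.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE S  ("S#" TEXT PRIMARY KEY, SNAME TEXT, STATUS INTEGER, CITY TEXT);
CREATE TABLE P  ("P#" TEXT PRIMARY KEY, PNAME TEXT, COLOR TEXT, WEIGHT INTEGER, CITY TEXT);
CREATE TABLE SP ("S#" TEXT, "P#" TEXT, QTY INTEGER, PRIMARY KEY ("S#", "P#"));
""")

conn.executemany("INSERT INTO S VALUES (?, ?, ?, ?)", [
    ("S1", "Smith", 20, "London"), ("S2", "Jones", 10, "Paris"),
    ("S3", "Blake", 30, "Paris"), ("S4", "Clark", 20, "London"),
    ("S5", "Adams", 30, "Athens"),
])
conn.executemany("INSERT INTO P VALUES (?, ?, ?, ?, ?)", [
    ("P1", "Nut", "Red", 12, "London"), ("P2", "Bolt", "Green", 17, "Paris"),
    ("P3", "Screw", "Blue", 17, "Rome"), ("P4", "Screw", "Red", 14, "London"),
    ("P5", "Cam", "Blue", 12, "Paris"), ("P6", "Cog", "Red", 19, "London"),
])
conn.executemany("INSERT INTO SP VALUES (?, ?, ?)", [
    ("S1", "P1", 300), ("S1", "P2", 200), ("S1", "P3", 400), ("S1", "P4", 200),
    ("S1", "P5", 100), ("S1", "P6", 100), ("S2", "P1", 300), ("S2", "P2", 400),
    ("S3", "P2", 200), ("S4", "P2", 200), ("S4", "P4", 300), ("S4", "P5", 400),
])

# Names of suppliers who supply part P2 (a join of S and SP on S#).
p2_suppliers = [row[0] for row in conn.execute(
    'SELECT S.SNAME FROM S JOIN SP ON S."S#" = SP."S#" '
    'WHERE SP."P#" = ? ORDER BY S."S#"', ("P2",))]
print(p2_suppliers)

# Total quantity shipped by each supplier that appears in SP.
totals = dict(conn.execute(
    'SELECT "S#", SUM(QTY) FROM SP GROUP BY "S#" ORDER BY "S#"'))
print(totals)
```

Supplier S5 has no rows in SP, so it does not appear in either result.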
Concepts in Object Database Systems
Objects - every entity is an object
- Example: Book, Film, Employee, Car
Class
- Objects with common attributes are grouped into a class
Attributes or Instance Variables
- Properties of an object class inherited by the object instances
Class Hierarchy
- Parent-Child class hierarchy
Composite objects
- Book object with paragraphs, sections etc.
Methods
- Functions associated with a class
Example Class Hierarchy
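[Figure: Document Class with attributes ID, Name, Author, and Publisher, instances D1 and D2, and methods Method1: Print-doc-att(ID) and Method2: Print-doc(ID). Book Subclass (instance B1; # of Chapters) and Journal Subclass (instance J1; Volume #) are children of Document Class.]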
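The hierarchy in this example can be sketched with ordinary Python classes. The attribute and method names follow the figure labels, but several details are assumptions: that "# of Chapters" belongs to Book and "Volume #" to Journal, what each method returns, and all sample values.

```python
class Document:
    """Parent class: attributes shared by every document."""

    def __init__(self, doc_id, name, author, publisher):
        self.doc_id = doc_id
        self.name = name
        self.author = author
        self.publisher = publisher

    def print_doc_att(self):
        # Stand-in for Print-doc-att(ID): describe the document's attributes.
        return f"{self.doc_id}: {self.name}, {self.author}, {self.publisher}"

    def print_doc(self):
        # Stand-in for Print-doc(ID): the figure does not say what is printed.
        return f"<contents of {self.doc_id}>"


class Book(Document):
    """Subclass: inherits Document's attributes and methods, adds its own."""

    def __init__(self, doc_id, name, author, publisher, num_chapters):
        super().__init__(doc_id, name, author, publisher)
        self.num_chapters = num_chapters


class Journal(Document):
    def __init__(self, doc_id, name, author, publisher, volume_number):
        super().__init__(doc_id, name, author, publisher)
        self.volume_number = volume_number


# Sample instances with made-up values.
b1 = Book("B1", "Sample Book", "A. Author", "Some Press", num_chapters=12)
j1 = Journal("J1", "Sample Journal", "B. Editor", "Some Society", volume_number=7)
print(b1.print_doc_att())
print(j1.print_doc())
```

Neither subclass defines the two methods; both inherit them from the parent class.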
Example Composite Object
[Figure: Composite Document Object composed of Section 1 Object and Section 2 Object, with Paragraph 1 Object and Paragraph 2 Object as lower-level components]
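A composite object is an object built from other objects. A minimal sketch using dataclasses follows; the class names mirror the figure, while the field names, the sample text, and the choice to place both paragraphs under Section 1 are assumptions.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Paragraph:
    text: str


@dataclass
class Section:
    title: str
    paragraphs: List[Paragraph] = field(default_factory=list)


@dataclass
class CompositeDocument:
    title: str
    sections: List[Section] = field(default_factory=list)

    def paragraph_count(self) -> int:
        # Walk the component objects to answer a question about the whole.
        return sum(len(section.paragraphs) for section in self.sections)


doc = CompositeDocument(
    title="Composite Document",
    sections=[
        Section("Section 1", [Paragraph("Paragraph 1"), Paragraph("Paragraph 2")]),
        Section("Section 2"),
    ],
)
print(doc.paragraph_count())
```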
Deductive Database Systems
Database systems augmented with inference engines to deduce new data from existing data and rules
Example
- Rule: parent of a parent is a grandparent
- Data: John is Jane’s parent; Jane is Robert’s parent
- From the above, infer John is Robert’s grandparent
Loose and tight coupling architectures between the database system and inference engine
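The grandparent example can be sketched in a few lines of Python: the stored facts are a set of (parent, child) pairs, and the rule is applied by joining the set with itself. The function name and data layout are illustrative; a real deductive database would use a rule language and an inference engine rather than hand-written code.

```python
# Data: John is Jane's parent; Jane is Robert's parent.
parent_facts = {("John", "Jane"), ("Jane", "Robert")}


def infer_grandparents(parents):
    """Rule: if X is a parent of Y and Y is a parent of Z, X is a grandparent of Z."""
    return {
        (grandparent, grandchild)
        for (grandparent, middle_1) in parents
        for (middle_2, grandchild) in parents
        if middle_1 == middle_2
    }


grandparent_facts = infer_grandparents(parent_facts)
print(grandparent_facts)
```

The inferred fact is not stored anywhere in the original data; it is derived from the data and the rule.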
A Definition of a Distributed Database System
A collection of database systems connected via a network
The software that is responsible for interconnection is a Distributed Database Management System (DDBMS)
Each DBMS executes local applications and should be involved in at least one global application (Ceri and Pelagatti)
Homogeneous environment
Architecture
[Figure: Site 1, Site 2, and Site 3 connected by a Communication Network. Each site i has Distributed Processor i, DBMS i, and Database i.]
Distributed Processor
Distributed Query/Update Processor
Distributed Transaction Manager
Distributed Metadata Management
Network Interface
Local DBMS Interface
Integrity/Security Manager
Data Distribution
SITE 1
EMP1
SS#  Name   Salary  D#
1    John   20      10
2    Paul   30      20
3    James  40      20
4    Jill   50      20
5    Mary   60      10
6    Jane   70      20
DEPT1
D#  Dname    MGR
10  C. Sci.  Jane
30  English  David
40  French   Peter
SITE 2
EMP2
SS#  Name    Salary  D#
9    Mathew  70      50
7    David   80      30
8    Peter   90      40
DEPT2
D#  Dname    MGR
50  Math     John
20  Physics  Paul
Distributed Database Functions
Distributed Query Processing
- Optimization techniques across the databases
Distributed Transaction Management
- Techniques for distributed concurrency control and recovery
Distributed Metadata Management
- Techniques for managing the distributed metadata
Distributed Security/Integrity Maintenance
- Techniques for processing integrity constraints and enforcing access control rules across the databases
Query Processing Example (Concluded)
[Figure: DBMS 1, DBMS 2, and DBMS 3, each with a DQP (Distributed Query Processor), connected by a Network]
Site 1: EMP1 (20)
Site 2: EMP2 (30), DEPT2 (20)
Site 3: EMP1 (20), EMP3 (50), DEPT3 (30)
Query at site 1: Join EMP and DEPT on D#
- Move EMP2 to site 3; Merge EMP1, EMP2, EMP3 to form EMP
- Move DEPT2 to site 3; Merge DEPT2 and DEPT3 to form DEPT
- Join EMP and DEPT; Move result to site 1
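The execution plan in this example can be traced with a small simulation. The sketch below assumes that EMP1 is already available at site 3 (the plan never moves it), uses made-up tuples instead of the fragment sizes shown on the slide, and counts tuples shipped over the network as a crude cost measure. All names in the code are hypothetical.

```python
# Hypothetical fragments per site. EMP tuples are (SS#, Name, D#);
# DEPT tuples are (D#, Dname).
sites = {
    1: {"EMP1": [(1, "John", 10)]},
    2: {"EMP2": [(2, "Paul", 20)], "DEPT2": [(20, "Physics")]},
    3: {"EMP1": [(1, "John", 10)], "EMP3": [(3, "Mary", 10)],
        "DEPT3": [(10, "C. Sci.")]},
}
shipped = 0  # tuples sent across the network


def move(fragment, src, dst):
    """Copy a fragment from one site to another and count the tuples shipped."""
    global shipped
    sites[dst][fragment] = list(sites[src][fragment])
    shipped += len(sites[src][fragment])


# Step 1: move EMP2 to site 3 and merge the EMP fragments.
move("EMP2", 2, 3)
emp = sites[3]["EMP1"] + sites[3]["EMP2"] + sites[3]["EMP3"]

# Step 2: move DEPT2 to site 3 and merge the DEPT fragments.
move("DEPT2", 2, 3)
dept = sites[3]["DEPT2"] + sites[3]["DEPT3"]

# Step 3: join EMP and DEPT on D# at site 3, then ship the result to site 1.
result = [(name, dname)
          for (_, name, emp_dno) in emp
          for (dept_dno, dname) in dept
          if emp_dno == dept_dno]
sites[1]["RESULT"] = result
shipped += len(result)

print(result)
print(shipped)
```

A distributed query optimizer would compare plans like this one by their estimated cost, for example by the amount of data each plan ships between sites.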
Transaction Processing Example
[Figure: Transaction Tj is issued at Site 1 (Coordinator). Subtransactions Tj2, Tj3, and Tj4 execute at Site 2, Site 3, and Site 4 (Participants).]
Issues: Concurrency control, Recovery, Data Replication
Two-phase commit: Coordinator queries participants whether they are ready to commit. If all participants agree, then coordinator sends request for the participants to commit
DTM (Distributed Transaction Manager) responsible for executing the distributed transaction
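The two-phase commit exchange described above can be sketched as follows. This is a simplified, single-process illustration: the class and method names are hypothetical, there is no logging, timeout, or failure handling, and the abort path (taken when any participant votes no) comes from the standard protocol rather than from the slide, which only describes the commit case.

```python
class Participant:
    """A site running one subtransaction of the distributed transaction."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "active"

    def prepare(self):
        # Phase 1: vote on whether this site is ready to commit.
        self.state = "ready" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self):
        self.state = "committed"

    def abort(self):
        self.state = "aborted"


def two_phase_commit(participants):
    """Coordinator logic: commit only if every participant votes yes."""
    # Phase 1: ask every participant whether it is ready to commit.
    votes = [p.prepare() for p in participants]

    # Phase 2: send the global decision to every participant.
    if all(votes):
        for p in participants:
            p.commit()
        return "committed"
    for p in participants:
        p.abort()
    return "aborted"


all_ready = [Participant("Site 2"), Participant("Site 3"), Participant("Site 4")]
outcome_ok = two_phase_commit(all_ready)

one_refuses = [Participant("Site 2"), Participant("Site 3", can_commit=False),
               Participant("Site 4")]
outcome_fail = two_phase_commit(one_refuses)

print(outcome_ok, outcome_fail)
```

The key property is atomicity across sites: either every participant commits or none does.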
Interoperability of Heterogeneous Database Systems
[Figure: Database System A (Relational), Database System B (Object-Oriented), and Database System C (Legacy) connected by a Network]
Transparent access to heterogeneous databases - both users and application programs; Query, Transaction processing
Technical Issues on the Interoperability of Heterogeneous Database Systems
Heterogeneity with respect to data models, schema, query processing, query languages, transaction management, semantics, integrity, and security policies
Interoperability based on client-server architectures
Federated database management
- Collection of cooperating, autonomous, and possibly heterogeneous component database systems, each belonging to one or more federations
Different Data Models
[Figure: Node A (database with Relational Model), Node B (database with Network Model), Node C (database with Object-Oriented Model), and Node D (database with Hierarchical Model) connected by a Network]
Developments: Tools for interoperability; commercial products
Challenges: Global data model
Schema Integration and Transformation: An approach
[Figure: The schemas describing the relational, network, hierarchical, and object-oriented databases are each transformed into a generic schema describing the same database. The generic schemas are integrated into a Global Schema, over which External Schema I, External Schema II, and External Schema III are defined.]
Challenges: Selecting appropriate generic representation; maintaining consistency during transformations; schema evolution
Semantic Heterogeneity
Semantic heterogeneity occurs when there is a disagreement about the meaning or interpretation of the same data
[Figure: Object O is interpreted as a passenger ship in the database at Node A and as a submarine in the database at Node B]
Challenges: Standard definitions; Repositories
Federated Database Management
[Figure: Database System A, Database System B, and Database System C participating in Federation F1 and Federation F2]
Cooperating database systems yet maintaining some degree of autonomy
Autonomy
[Figure: Component A, Component B, and Component C; Component A receives a local request and a request from a component]
Communication through federation
Component A does not communicate with component C
Component A honors the local request first
Challenges: Adapt techniques to handle autonomy - e.g., transaction processing, schema integration; transition research to products
Schema Integration and Transformation in a Federated Environment
Adapted from Sheth and Larson, ACM Computing Surveys, September 1990
[Figure: Schema levels shown: Local Schema 1 and Local Schema 2; Component Schemas for Components A, B, and C; Generic Schemas for Components A, B, and C; Export Schema for Component A, Export Schema I and Export Schema II for Component B, and Export Schema for Component C; Federated Schemas for FDS - 1 and FDS - 2; External Schemas 1.1, 1.2, 2.1, and 2.2]
Federated Data and Policy Management
[Figure: Agency A, Agency B, and Agency C each hold Component Data/Policy and provide Export Data/Policy, which feed into the Data/Policy for Federation]
Current Status and Directions
Developments
- Several prototypes and some commercial products
- Tools for schema integration and transformation
- Standards for interoperable database systems
Challenges being addressed
- Semantic heterogeneity
- Autonomy and federation
- Global transaction management
- Integrity and Security
New challenges
- Scale
- Web data management