© d. wong 2002 © d. wong 2003 1 cs610 / cs710 database systems i daisy wong
TRANSCRIPT
11 © D. Wong 2002© D. Wong 2002© D. Wong 2003© D. Wong 2003
CS610 / CS710 Database Systems ICS610 / CS710 Database Systems I
Daisy WongDaisy Wong
22 © D. Wong 2002© D. Wong 2002© D. Wong 2003© D. Wong 2003
ObjectivesObjectives
Modeling and design of databases.Modeling and design of databases.
Programming: queries and DB operationsProgramming: queries and DB operations
33 © D. Wong 2002© D. Wong 2002© D. Wong 2003© D. Wong 2003
DatabaseDatabase
A large collection of data:A large collection of data:
–stored in mass storage stored in mass storage
–exists over a long period of timeexists over a long period of time
–Can take on a variety of appearances depending Can take on a variety of appearances depending on the requirements at the timeon the requirements at the time
–Can serve as the data source for a variety of Can serve as the data source for a variety of applicationsapplications
44 © D. Wong 2002© D. Wong 2002© D. Wong 2003© D. Wong 2003
Database ExamplesDatabase Examples
Examples:Examples:
– Wal-Mart : records every item purchased in Wal-Mart : records every item purchased in every storeevery store
– L. L. Bean: records detail information of each L. L. Bean: records detail information of each customer and their purchasescustomer and their purchases
– Hospitals: record patient demographics, Hospitals: record patient demographics, conditions and progress, test results, etcconditions and progress, test results, etc
– . . . . . .
This course, refer to a collection of data managed This course, refer to a collection of data managed by a Database Management System (DBMS)by a Database Management System (DBMS)
55 © D. Wong 2002© D. Wong 2002© D. Wong 2003© D. Wong 2003
DBMSDBMS
A combination of software, data, and structure of A combination of software, data, and structure of the data that support: the data that support:
– Users to create databases and specify their Users to create databases and specify their schemaschema
– Users to query and modify the dataUsers to query and modify the data
– Storage of very large amount of data, secure the Storage of very large amount of data, secure the data from accident or unauthorized use, and data from accident or unauthorized use, and allow efficient accessallow efficient access
– Control concurrent access from many users, Control concurrent access from many users, presenting correct data to each user, and presenting correct data to each user, and prevent accidental corruption of the data from prevent accidental corruption of the data from simultaneous accessessimultaneous accesses
66 © D. Wong 2002© D. Wong 2002© D. Wong 2003© D. Wong 2003
DBMS lingoDBMS lingo
SchemaSchema
– logical structure of the data. Use Data Definition logical structure of the data. Use Data Definition Language (DDL)Language (DDL)
QueryQuery
– A question about the data. Use Data A question about the data. Use Data Manipulation Language (DML)Manipulation Language (DML)
Transaction (or atomic transaction)Transaction (or atomic transaction)
– A logical unit of work that must be completed as A logical unit of work that must be completed as a whole or not at all.a whole or not at all.
77 © D. Wong 2002© D. Wong 2002© D. Wong 2003© D. Wong 2003
Three levels of database schemasThree levels of database schemas
External View1 External View2 External View3
Logical Schema
DiskInternal schema
Logical level
Internal level
External level
Logical to external mappings
Internal to Logical mappings
88 © D. Wong 2002© D. Wong 2002© D. Wong 2003© D. Wong 2003
DBMS User RolesDBMS User Roles End users End users
– access the database for information to do their jobs. access the database for information to do their jobs. CasualCasual (use simple user interfaces) vs. (use simple user interfaces) vs. SophisticatedSophisticated ( use ( use DML)DML)
Database designers Database designers
– specify information content (use DDL) to create database specify information content (use DDL) to create database systemssystems
Application developers Application developers
– design and develop applications that extend the design and develop applications that extend the functionality of the dbms. E.g. user interface, data functionality of the dbms. E.g. user interface, data analysis and data mining, various business servicesanalysis and data mining, various business services
Database administrators (DBAs) Database administrators (DBAs)
– administer databases: control access, maintain data administer databases: control access, maintain data accuracy and integrity, monitor and improve database accuracy and integrity, monitor and improve database performanceperformance
99 © D. Wong 2002© D. Wong 2002© D. Wong 2003© D. Wong 2003
A little historyA little history
First attempt – file systemsFirst attempt – file systems Hierarchical model (tree based)Hierarchical model (tree based) Network model (graph base)Network model (graph base) Relational model Relational model
– Proposed by E. F. Codd (1970)Proposed by E. F. Codd (1970)
– Data should be presented to user as tables Data should be presented to user as tables (relations)(relations)
– Queries expressed in a very high-level languageQueries expressed in a very high-level language
– SQL (Structured Query Language) – most SQL (Structured Query Language) – most important language based on relational modelimportant language based on relational model
1010 © D. Wong 2002© D. Wong 2002© D. Wong 2003© D. Wong 2003
Relational ModelRelational Model
A conceptual model that represents data as relations.A conceptual model that represents data as relations. Relations – tables of dataRelations – tables of data
Query using SQL:Query using SQL:
SELECT balance FROM AccountsSELECT balance FROM Accounts
WHERE accountNo = 34567;WHERE accountNo = 34567; Relational DBMS finds an efficient way to answer the Relational DBMS finds an efficient way to answer the
queryquery
accountNoaccountNo namename balancebalance
1234512345 SallySally 1000.211000.21
3456734567 SueSue 285.48285.48
…… …… ……
Accounts
1111 © D. Wong 2002© D. Wong 2002© D. Wong 2003© D. Wong 2003
Major components of a DBMS (Simplified. Figure 1.1)Major components of a DBMS (Simplified. Figure 1.1)
DDL commands
External level
Logical level
Internal level
Transaction Commands
Query ManagerDDL
compilerTransaction
Manager
Buffer/File Storage Manager
DataMetadata
Queries/updates
DBA Users / applications
1212 © D. Wong 2002© D. Wong 2002© D. Wong 2003© D. Wong 2003
Figure 1.1: Database management system componentsFigure 1.1: Database management system components
Ref. FCDS 2ed. by Ullman
1313 © D. Wong 2002© D. Wong 2002© D. Wong 2003© D. Wong 2003
Storage ManagerStorage Manager• Obtains the requested information from data storage
• Modifies the information if requested and re-store
• Indexes are used (data structures that help us find data items quickly given a part of their value). Advanced data structures such as B-tree are used for efficient access
• Indexes are part of the data, their description is part of the metadata
• Consists of 2 components: file manger and buffer manager
1. File manager keeps track of the location of files on disk and obtains the blocks containing the requested data
2. Buffer manager handles main memory. It manages the memory blocks, obtaining disk blocks from the file manager, trying to optimize the access to data
1414 © D. Wong 2002© D. Wong 2002© D. Wong 2003© D. Wong 2003
Query ManagerQuery Manager
Parse and optimize the query using a query Parse and optimize the query using a query compilercompiler
Execute the resulting query plan (sequence of Execute the resulting query plan (sequence of actions for the DBMS to perform)actions for the DBMS to perform)
– Issues a sequence of requests to storage Issues a sequence of requests to storage manager for small pieces of datamanager for small pieces of data
Return the result to the requesterReturn the result to the requester
1515 © D. Wong 2002© D. Wong 2002© D. Wong 2003© D. Wong 2003
Transaction ManagerTransaction Manager
Assure all transactions are executed properlyAssure all transactions are executed properly ACID properties of “proper” execution:ACID properties of “proper” execution:
– Atomicity : All of the updates of a transaction are Atomicity : All of the updates of a transaction are successful, or no update take placesuccessful, or no update take place
– Consistency: Each transaction should leave the Consistency: Each transaction should leave the database in a consistent statedatabase in a consistent state
– Isolation: Each transaction, when executed Isolation: Each transaction, when executed concurrently with other transactions, should have the concurrently with other transactions, should have the same effect as if it had been executed by itselfsame effect as if it had been executed by itself
– Durability: Once a transaction has completed Durability: Once a transaction has completed successfully, its changes to the database should be successfully, its changes to the database should be permanent. Even serious failures should not affect the permanent. Even serious failures should not affect the permanence of a transaction.permanence of a transaction.
1616 © D. Wong 2002© D. Wong 2002© D. Wong 2003© D. Wong 2003
Techniques to enforce ACIDTechniques to enforce ACID
Locking – granularity of locks is important.Locking – granularity of locks is important.
Logging – write a log to nonvolatile storage. Logging – write a log to nonvolatile storage. Assure durability.Assure durability.
Transaction Commitment – for durability and Transaction Commitment – for durability and atomicity, transactions are computed atomicity, transactions are computed “tentatively”, recorded, but no changes are made “tentatively”, recorded, but no changes are made to the db until the transaction gets committed. to the db until the transaction gets committed. Changes copied to the log, then copied to db. Changes copied to the log, then copied to db.
1717 © D. Wong 2002© D. Wong 2002© D. Wong 2003© D. Wong 2003
TrendsTrends Object Oriented DB Object Oriented DB
– Richer data typesRicher data types
– Classes and class hierarchy enable share or reuse of sw Classes and class hierarchy enable share or reuse of sw and schemasand schemas
– Protect misuse through abstract data typesProtect misuse through abstract data types Object Relational DBObject Relational DB Constraints and triggers handlingConstraints and triggers handling Multimedia dataMultimedia data Data Integration to support advance data analysis such as Data Integration to support advance data analysis such as
data mining. E.g. data warehouses, data martsdata mining. E.g. data warehouses, data marts Multi-tier Client-Server architecture, move more Multi-tier Client-Server architecture, move more
processing to the clientprocessing to the client Parallel processingParallel processing Support Web sitesSupport Web sites
1818 © D. Wong 2002© D. Wong 2002© D. Wong 2003© D. Wong 2003
Knowledge Discovery in Databases (KDD)Knowledge Discovery in Databases (KDD)
Knowledge
DataTarget Data
Preprocessed Data
Transformed Data
Patterns
Selection
Preprocessing
Transformation
Data MiningInterpretation / Evaluation
Reference: Fayyad; Smyth: "From Data Mining to Knowledge Discovery: An Overview" 1996
1919 © D. Wong 2002© D. Wong 2002© D. Wong 2003© D. Wong 2003
Ch. 2 Entity-Relationship Data ModelCh. 2 Entity-Relationship Data Model
Data modelsData models
Entity-Relationship diagramsEntity-Relationship diagrams
Design PrinciplesDesign Principles
Modeling of constraintsModeling of constraints
Weak entity setsWeak entity sets
2020 © D. Wong 2002© D. Wong 2002© D. Wong 2003© D. Wong 2003
Data ModelingData Modeling
Used for conceptual database designUsed for conceptual database design
Ideas
ODL
E/R
Relational Schema
Relational
DBMS
Object-oriented
DBMSClasses / Objects
2121 © D. Wong 2002© D. Wong 2002© D. Wong 2003© D. Wong 2003
Data ModelsData Models
Value-OrientedValue-Oriented Object-OrientedObject-Oriented
ExamplesExamples –RelationalRelational
–LogicLogic
–Object-orientedObject-oriented
–NetworkNetwork
–HierarchicalHierarchical
Distinct objects byDistinct objects by Data valuesData values Object identityObject identity
Redundancy Redundancy handlinghandling
Reduced by Design Reduced by Design capabilitiescapabilities
Use pointers to Use pointers to objectsobjects
Many-to-many Many-to-many relationshipsrelationships
Handled the sameHandled the same Only binary Only binary relationshipsrelationships
2222 © D. Wong 2002© D. Wong 2002© D. Wong 2003© D. Wong 2003
Relational ModelRelational Model
Based on mathematically defined relations of Based on mathematically defined relations of entitiesentities
Consists of:Consists of:
– Attributes (fields)Attributes (fields)
– Domain legitimate values of attributes (data Domain legitimate values of attributes (data range)range)
– Views of data presented in table formatViews of data presented in table format
– Create new views by projections of the databaseCreate new views by projections of the database
– Records are n-tuples, where n = # of attributesRecords are n-tuples, where n = # of attributes
2323 © D. Wong 2002© D. Wong 2002© D. Wong 2003© D. Wong 2003
Entity-Relationship (E/R) DiagramsEntity-Relationship (E/R) Diagrams
Represents the schematic of a databaseRepresents the schematic of a database
Useful for designing the conceptual modelUseful for designing the conceptual model
Entity set : classes of objects Entity set : classes of objects
Entity : member of an entity set (data)Entity : member of an entity set (data)
Attributes : properties of the entities in an entity Attributes : properties of the entities in an entity setset
Relationship – describe how entities relate to each Relationship – describe how entities relate to each otherother
2424 © D. Wong 2002© D. Wong 2002© D. Wong 2003© D. Wong 2003
Entity-Relationship Diagrams SymbolsEntity-Relationship Diagrams Symbols
attributes
relationships
entity sets
Cardinality: one - one
many - onem
many - manym n
2525 © D. Wong 2002© D. Wong 2002© D. Wong 2003© D. Wong 2003
Entity-Relationship Diagrams (continue)Entity-Relationship Diagrams (continue)
EntityEntity
– Has a set of attributesHas a set of attributes
– The key is underlinedThe key is underlined
KeyKey
– Attributes which uniquely identify an entity in Attributes which uniquely identify an entity in an entity setan entity set
– An inherent property of the data (e.g. movie An inherent property of the data (e.g. movie title and year)title and year)
– Serve as a constraintServe as a constraint
SS#