cs405g: introduction to database systems final review
TRANSCRIPT
![Page 1: CS405G: Introduction to Database Systems Final Review](https://reader035.vdocuments.us/reader035/viewer/2022062715/56649d705503460f94a5214f/html5/thumbnails/1.jpg)
CS405G: Introduction to Database Systems
Final Review
![Page 2: CS405G: Introduction to Database Systems Final Review](https://reader035.vdocuments.us/reader035/viewer/2022062715/56649d705503460f94a5214f/html5/thumbnails/2.jpg)
04/19/23 Jinze Liu @ University of Kentucky 2
Database Design
![Page 3: CS405G: Introduction to Database Systems Final Review](https://reader035.vdocuments.us/reader035/viewer/2022062715/56649d705503460f94a5214f/html5/thumbnails/3.jpg)
E-R model
• E-R model• Entities
• Attributes
• Relationships
![Page 4: CS405G: Introduction to Database Systems Final Review](https://reader035.vdocuments.us/reader035/viewer/2022062715/56649d705503460f94a5214f/html5/thumbnails/4.jpg)
04/19/23 Jinze Liu @ University of Kentucky 4
![Page 5: CS405G: Introduction to Database Systems Final Review](https://reader035.vdocuments.us/reader035/viewer/2022062715/56649d705503460f94a5214f/html5/thumbnails/5.jpg)
04/19/23 5
Database Design
504/19/23
![Page 6: CS405G: Introduction to Database Systems Final Review](https://reader035.vdocuments.us/reader035/viewer/2022062715/56649d705503460f94a5214f/html5/thumbnails/6.jpg)
From E-R Diagram to Relations
• Relations• Schemas• Converting E-R diagram to relations• Keys
• Super keys• Candidate keys• Primary keys
• Relational integrity constraints
![Page 7: CS405G: Introduction to Database Systems Final Review](https://reader035.vdocuments.us/reader035/viewer/2022062715/56649d705503460f94a5214f/html5/thumbnails/7.jpg)
Key Constraints
• Superkey: – (Uniqueness constraints) A set of attributes where
no two distinct tuples can have the same values– Every relation has at least one superkey:
• The set of all attributes.
• Key: A minimal superkey– Uniqueness constraint (superkey)– Minimum Constraint
• No attribute can be removed and still satisfy the uniqueness constraints.
04/19/23 7Jinze Liu @ University of Kentucky
![Page 8: CS405G: Introduction to Database Systems Final Review](https://reader035.vdocuments.us/reader035/viewer/2022062715/56649d705503460f94a5214f/html5/thumbnails/8.jpg)
04/19/23 8
Relational Integrity Constraints
• Constraints are conditions that must hold on all valid relation instances. There are four main types of constraints:
1. Domain constraints1. The value of a attribute must come from its domain
2. Key constraints3. Entity integrity constraints4. Referential integrity constraints
804/19/23
![Page 9: CS405G: Introduction to Database Systems Final Review](https://reader035.vdocuments.us/reader035/viewer/2022062715/56649d705503460f94a5214f/html5/thumbnails/9.jpg)
Database Normalization
• Functional Dependency• Functional Closure• Keys
– Redefined– Based on functional dependency
• DB Norm Form– 1st, 2nd, 3rd, BCNF
![Page 10: CS405G: Introduction to Database Systems Final Review](https://reader035.vdocuments.us/reader035/viewer/2022062715/56649d705503460f94a5214f/html5/thumbnails/10.jpg)
04/19/23 Luke Huan Univ. of Kansas 10
Database Query
![Page 11: CS405G: Introduction to Database Systems Final Review](https://reader035.vdocuments.us/reader035/viewer/2022062715/56649d705503460f94a5214f/html5/thumbnails/11.jpg)
Relational Algebra and SQL
• Relational algebra• SQL query
SFW Group by …, Having Subqueries
Relationship between R.A. and SQL
![Page 12: CS405G: Introduction to Database Systems Final Review](https://reader035.vdocuments.us/reader035/viewer/2022062715/56649d705503460f94a5214f/html5/thumbnails/12.jpg)
04/19/23 Jinze Liu @ University of Kentucky 12
Relational algebra
• Core set of operators:– Selection, projection, cross product, union, difference, and
renaming• Additional, derived operators:
– Join, natural join, intersection, etc.• Compose operators to make complex queries
RelOp
RelOp
A language for querying relational databases based on operators:
![Page 13: CS405G: Introduction to Database Systems Final Review](https://reader035.vdocuments.us/reader035/viewer/2022062715/56649d705503460f94a5214f/html5/thumbnails/13.jpg)
04/19/23 Jinze Liu @ University of Kentucky 13
Summary of core operators• Selection:• Projection:• Cross product:• Union:• Difference: • Renaming:
– Does not really add “processing” power
σp R
πL R
R X S
R SR - Sρ S(A1, A2, …) R
![Page 14: CS405G: Introduction to Database Systems Final Review](https://reader035.vdocuments.us/reader035/viewer/2022062715/56649d705503460f94a5214f/html5/thumbnails/14.jpg)
04/19/23 Jinze Liu @ University of Kentucky 14
Summary of derived operators
• Join: • Natural join:• Intersection:
R p S
R S
R S
![Page 15: CS405G: Introduction to Database Systems Final Review](https://reader035.vdocuments.us/reader035/viewer/2022062715/56649d705503460f94a5214f/html5/thumbnails/15.jpg)
04/19/23 Jinze Liu @ University of Kentucky 15
Classification of relational operators
• Selection: σp R
• Projection: πL R
• Cross product: R X S• Join: R p S
• Natural join: R S• Union: R U S• Difference: R - S• Intersection: R ∩ S
MonotoneMonotoneMonotoneMonotoneMonotoneMonotoneMonotone w.r.t. R; non-monotone w.r.t S
Monotone
![Page 16: CS405G: Introduction to Database Systems Final Review](https://reader035.vdocuments.us/reader035/viewer/2022062715/56649d705503460f94a5214f/html5/thumbnails/16.jpg)
04/19/23 Jinze Liu @ University of Kentucky 16
Update Operations on Relations
• Update operations– INSERT a tuple.– DELETE a tuple.– MODIFY a tuple.
• Constraints should not be violated in updates
![Page 17: CS405G: Introduction to Database Systems Final Review](https://reader035.vdocuments.us/reader035/viewer/2022062715/56649d705503460f94a5214f/html5/thumbnails/17.jpg)
04/19/23 Luke Huan Univ. of Kansas 17
Basic queries: SFW statement
• SELECT A1, A2, …, An
FROM R1, R2, …, Rm
WHERE condition;• Also called an SPJ (select-project-join) query• (almost) Equivalent to relational algebra query
π A1, A2, …, An (σ condition (R1 X R2 X … X Rm))
![Page 18: CS405G: Introduction to Database Systems Final Review](https://reader035.vdocuments.us/reader035/viewer/2022062715/56649d705503460f94a5214f/html5/thumbnails/18.jpg)
04/19/23 Luke Huan Univ. of Kansas 18
Semantics of SFW
• SELECT E1, E2, …, En
FROM R1, R2, …, Rm
WHERE condition;• For each t1 in R1:
For each t2 in R2: … … For each tm in Rm: If condition is true over t1, t2, …, tm: Compute and output E1, E2, …, En as a row
• t1, t2, …, tm are often called tuple variables
• Not 100% correct, we will see
![Page 19: CS405G: Introduction to Database Systems Final Review](https://reader035.vdocuments.us/reader035/viewer/2022062715/56649d705503460f94a5214f/html5/thumbnails/19.jpg)
04/19/23 Jinze Liu @ University of Kentucky 19
Operational semantics of GROUP BY
SELECT … FROM … WHERE … GROUP BY …;
• Compute FROM • Compute WHERE • Compute GROUP BY: group rows according to
the values of GROUP BY columns• Compute SELECT for each group
– For aggregation functions with DISTINCT inputs, first eliminate duplicates within the group
Number of groups = number of rows in the final output
![Page 20: CS405G: Introduction to Database Systems Final Review](https://reader035.vdocuments.us/reader035/viewer/2022062715/56649d705503460f94a5214f/html5/thumbnails/20.jpg)
04/19/23 Jinze Liu @ University of Kentucky 20
Database Design
![Page 21: CS405G: Introduction to Database Systems Final Review](https://reader035.vdocuments.us/reader035/viewer/2022062715/56649d705503460f94a5214f/html5/thumbnails/21.jpg)
21
physical data organization• Storage hierarchy (DC vs. Pluto) ! count I/O’s• Disk geometry: three components of access cost;
random vs. sequential I/O• Data layout
– Record layout (handling variable-length fields, NULL’s)– Block layout (NSM, PAX) ! inter-/intra-record locality
• Access paths– Primary versus secondary indexes– Tree-based indexes: ISAM, B+-tree! Again, reintroduce redundancy to improve performance! Fundamental trade-off: query versus update cost
![Page 22: CS405G: Introduction to Database Systems Final Review](https://reader035.vdocuments.us/reader035/viewer/2022062715/56649d705503460f94a5214f/html5/thumbnails/22.jpg)
Performance Issues on Indexes
• Indexes– ISAM– B+ Tree
• Metrics – Storage– IO-costs
• Operations– Single value query & range query– Insertion and deletion
22
![Page 23: CS405G: Introduction to Database Systems Final Review](https://reader035.vdocuments.us/reader035/viewer/2022062715/56649d705503460f94a5214f/html5/thumbnails/23.jpg)
Query Processing Implementation
• Typical Query Processings– Selection– Join– Set operations.
• Typical approaches– Sequential scans in unsorted database– Sorted database– What are the tradeoffs.
23