(part 3) database programming
TRANSCRIPT
![Page 1: (Part 3) Database Programming](https://reader030.vdocuments.us/reader030/viewer/2022012714/61ac9174421c02020a02bd94/html5/thumbnails/1.jpg)
Database Programming(Part 3)
Copyright © 2021 byRobert M. Dondero, Ph.D.
Princeton University
1
![Page 2: (Part 3) Database Programming](https://reader030.vdocuments.us/reader030/viewer/2022012714/61ac9174421c02020a02bd94/html5/thumbnails/2.jpg)
Objectives
Continued from Part 2…
[see slide]
Objectives
• We will cover:– Databases (DBs) and database
management systems (DBMSs)…– With a focus on relational DBs and DBMSs…– With a focus on the SQLite DBMS…– With a focus on programming with SQLite
2
![Page 3: (Part 3) Database Programming](https://reader030.vdocuments.us/reader030/viewer/2022012714/61ac9174421c02020a02bd94/html5/thumbnails/3.jpg)
Agenda
• Relational DB transactions: atomicity• Relational DB transactions: locking• Relational DB design
3
![Page 4: (Part 3) Database Programming](https://reader030.vdocuments.us/reader030/viewer/2022012714/61ac9174421c02020a02bd94/html5/thumbnails/4.jpg)
DB Transactions: Atomicity
• Problem: DBMS must preserve DB consistency in the presence of HW/SW failures
4
![Page 5: (Part 3) Database Programming](https://reader030.vdocuments.us/reader030/viewer/2022012714/61ac9174421c02020a02bd94/html5/thumbnails/5.jpg)
DB Transactions: Atomicity
• Example:– Customer 222 has purchased 1 copy of book
123– Must change ORDERS:
• Add 1 to quantity of appropriate row (20 -> 21)– Must change BOOKS:
• Subtract 1 from quantity of appropriate row (500 -> 499)
5
![Page 6: (Part 3) Database Programming](https://reader030.vdocuments.us/reader030/viewer/2022012714/61ac9174421c02020a02bd94/html5/thumbnails/6.jpg)
One logical update consists of 2 UPDATE stmtsFailure occurs between execution of stmts
=> DB is corrupted
UPDATE orders SET quantity=quantity+1 WHERE isbn=123 AND custid=222;
Time HW/SW failure
DB Transactions: Atomicity
6
UPDATE books SET quantity=quantity-1 WHERE isbn=123;
![Page 7: (Part 3) Database Programming](https://reader030.vdocuments.us/reader030/viewer/2022012714/61ac9174421c02020a02bd94/html5/thumbnails/7.jpg)
DB Transactions: Atomicity
• Solution: – DBMS must execute each logical DB update
atomically
• Atomically: either completely, or not at all
7
![Page 8: (Part 3) Database Programming](https://reader030.vdocuments.us/reader030/viewer/2022012714/61ac9174421c02020a02bd94/html5/thumbnails/8.jpg)
DB Transactions: Atomicity
• Problem: How to implement atomicity?• Solution: Transactions
• Transaction– Defines an atomic unit of work– DBMS executes a transaction either
completely or not at all
8
![Page 9: (Part 3) Database Programming](https://reader030.vdocuments.us/reader030/viewer/2022012714/61ac9174421c02020a02bd94/html5/thumbnails/9.jpg)
DB Transactions: Atomicity
• See purchase.py– The job:
• Accept isbn and custid as command-line arguments
• In ORDERS table increment quantity of row with given isbn and custid
• In BOOKS table decrement quantity of row with given isbn
– Note:• Two updates must be executed atomically
9
![Page 10: (Part 3) Database Programming](https://reader030.vdocuments.us/reader030/viewer/2022012714/61ac9174421c02020a02bd94/html5/thumbnails/10.jpg)
DB Transactions: Atomicity
• See purchase.py (cont.)
10
Relational DB Transactions: Atomicity
[see slide]
Code notes:First update implicitly begins a transactionconnection.commit()
Commits and ends transactionconnection.rollback()
Rolls back and ends transaction
![Page 11: (Part 3) Database Programming](https://reader030.vdocuments.us/reader030/viewer/2022012714/61ac9174421c02020a02bd94/html5/thumbnails/11.jpg)
DB Transactions: Atomicity
• Do SQLite transactions in Python really handle HW/SW errors?
• Do they really enforce atomicity?
11
![Page 12: (Part 3) Database Programming](https://reader030.vdocuments.us/reader030/viewer/2022012714/61ac9174421c02020a02bd94/html5/thumbnails/12.jpg)
DB Transactions: Atomicity
• See recovery.py– The job (artificial):
• Repeatedly, 20 times…• In orders table increment quantity of row with isbn 123 and custid 222
• In books table decrement quantity of row with isbn 123
• “Failure” may occur between the updates• Is DB consistency preserved?
12
![Page 13: (Part 3) Database Programming](https://reader030.vdocuments.us/reader030/viewer/2022012714/61ac9174421c02020a02bd94/html5/thumbnails/13.jpg)
Relational DB Transactions: Atomicity
Code notes:
“HW/SW failure” occurs at pseudo-random times between updatesIs DB consistency preserved?
[see slide]
500 + 20 == 483 + 37Provides evidence that DB consistency is preserved!SQLite transactions are atomic
DB Transactions: Atomicity
• See recovery.py (cont.)
13
![Page 14: (Part 3) Database Programming](https://reader030.vdocuments.us/reader030/viewer/2022012714/61ac9174421c02020a02bd94/html5/thumbnails/14.jpg)
Agenda
• Relational DB transactions: atomicity• Relational DB transactions: locking• Relational DB design
14
![Page 15: (Part 3) Database Programming](https://reader030.vdocuments.us/reader030/viewer/2022012714/61ac9174421c02020a02bd94/html5/thumbnails/15.jpg)
DB Transactions: Locking
• Problem: DBMS must preserve DB consistency in the presence of concurrent updates
15
![Page 16: (Part 3) Database Programming](https://reader030.vdocuments.us/reader030/viewer/2022012714/61ac9174421c02020a02bd94/html5/thumbnails/16.jpg)
Increment ORDERS (21)Time
Process P1 Process P2
BEGIN
Decrement BOOKS (499)
DB Transactions: Locking
16
COMMIT
Increment ORDERS (21)
Decrement BOOKS (499)
One increment/decrement pair is lost!!!
Without locking:
Relational DB Transactions: Locking
Without locking, consider this sequence
[see slide]
P1 starts a transaction
P1 increments the quantity in ORDERS, thus changing it to 21But the increment occurs within a transaction, and so is cached
P2 increments the quantity in ORDERS, thus changing it to 21
P1 decrements the quantity in BOOKS, thus changing it to 499But the decrement occurs within a transaction, and so is cached
P2 decrements the quantity in BOOKS, thus changing it to 499
P1 ends/commits the transactionThe quantity in ORDERS is changed to 21The quantity in BOOKS is changed to 499
One increment/decrement pair is lost!
![Page 17: (Part 3) Database Programming](https://reader030.vdocuments.us/reader030/viewer/2022012714/61ac9174421c02020a02bd94/html5/thumbnails/17.jpg)
DB Transactions: Locking
• Solution: Locking within transactions– Process P1 asks DBMS to execute
transaction involving update of row X• DBMS locks row X
– Concurrently, process P2 asks DBMS to update row X
• DBMS forces P2 to wait, or rejects P2’s request, until transaction completes
17
![Page 18: (Part 3) Database Programming](https://reader030.vdocuments.us/reader030/viewer/2022012714/61ac9174421c02020a02bd94/html5/thumbnails/18.jpg)
Increment orders (21)
Time
Process P1 Process P2
BEGIN
Decrement books (499)
DB Transactions: Locking
18
COMMITIncrement orders (22)
Decrement books (498)
Process P2 updates are postponed or rejected
Updates disallowed
Relational DB Transactions: Locking
With locking, consider this similar sequence
[see slide]
P1 starts a transaction
During the transaction, P2’s attempts to update are disallowed (postponed or rejected)
P1 increments the quantity in ORDERS, thus changing it to 21But the increment occurs within a transaction, and so is cached
P1 decrements the quantity in BOOKS, thus changing it to 499But the decrement occurs within a transaction, and so is cached
P1 ends/commits the transactionThe quantity in ORDERS is changed to 21The quantity in BOOKS is changed to 499
P2 increments the quantity in ORDERS, thus changing it to 22
P2 decrements the quantity in BOOKS, thus changing it to 498
![Page 19: (Part 3) Database Programming](https://reader030.vdocuments.us/reader030/viewer/2022012714/61ac9174421c02020a02bd94/html5/thumbnails/19.jpg)
No increments/decrements are lost!
![Page 20: (Part 3) Database Programming](https://reader030.vdocuments.us/reader030/viewer/2022012714/61ac9174421c02020a02bd94/html5/thumbnails/20.jpg)
Relational DB Transactions: Locking
Important note:
Full-featured DBMSs (PostgreSQL, Oracle, SQLServer, MySQL, …), do row-level locking
SQLite does DB-level locking
A big reason why SQLite isn’t used in production applications
More precisely
SQLite often is used in production applications as a rich alternative to simple flat filesThunderbird email client uses a SQLite database
SQLite is not used in production applications as a weak alternative to a full-fledged DBMS (PostgreSQL, Oracle, etc.)
DB Transactions: Locking
• Important note:– PostgreSQL, Oracle, SQLServer, MySQL,...
• Row-level locking– SQLite
• DB-level locking
19
![Page 21: (Part 3) Database Programming](https://reader030.vdocuments.us/reader030/viewer/2022012714/61ac9174421c02020a02bd94/html5/thumbnails/21.jpg)
DB Transactions: Locking
$ sqlite3 bookstore.sqlitesqlite> BEGIN;sqlite> UPDATE books SET quantity = 11111 WHERE isbn = 123;sqlite>
$ python purchase.py 123 222database is locked$
20
Window 1:
Window 2:
Noticeable delay
![Page 22: (Part 3) Database Programming](https://reader030.vdocuments.us/reader030/viewer/2022012714/61ac9174421c02020a02bd94/html5/thumbnails/22.jpg)
DB Transactions: Locking
$ sqlite3 bookstore.sqlitesqlite> BEGIN;sqlite> UPDATE books SET quantity = 11111 WHERE isbn = 123;sqlite> COMMIT;sqlite>
$ python purchase.py 123 222database is locked$ python purchase.py 123 222Transaction committed.$
21
Window 1 (cont):
Window 2 (cont):
![Page 23: (Part 3) Database Programming](https://reader030.vdocuments.us/reader030/viewer/2022012714/61ac9174421c02020a02bd94/html5/thumbnails/23.jpg)
Transaction Summary
• DBMSs use transactions to:– Recover from HW/SW errors
• Transactions implement atomicity– Handle concurrent updates
• Transactions implement locking
22
![Page 24: (Part 3) Database Programming](https://reader030.vdocuments.us/reader030/viewer/2022012714/61ac9174421c02020a02bd94/html5/thumbnails/24.jpg)
Relational DB Design
So far:How to use a relational DB
Now:How to design a relational DB
Agenda
• Relational DB transactions: atomicity• Relational DB transactions: locking• Relational DB design
23
![Page 25: (Part 3) Database Programming](https://reader030.vdocuments.us/reader030/viewer/2022012714/61ac9174421c02020a02bd94/html5/thumbnails/25.jpg)
Relational DB Design
• Relational DB normal forms– Rules that DBAs can apply to:
• Determine if DB design is wrong• Make it right
24
Relational DB Design
To design a proper relational DB, you must know about the relational DB normal forms
Developed by Codd
Rules that DBAs can apply to:Determine if DB design is wrongMake it right
There are many of themWe’ll cover the first 3…
![Page 26: (Part 3) Database Programming](https://reader030.vdocuments.us/reader030/viewer/2022012714/61ac9174421c02020a02bd94/html5/thumbnails/26.jpg)
Relational DB Design: DB0
Let’s consider a series of “bookstore” relational databasesWe’ll start with a very flawed versionWe’ll incrementally apply normal formsThe final version in the series will be the database that we used in previous lectures
Consider DB0…
BOOKS:There is a book whose isbn is 123Whose title is “The Practice of Programming” Whose authors are “Kernighan” and “Pike”And there are 500 copies of it available in our warehouse
ORDERS:There is a customer whose custid is 222That customer is the Harvard bookstoreIt’s located at 1256 Mass Ave in Cambridge, MA 02138That customer ordered 20 copies of the book with isbn 123
BOOKS isbn title authors quantity123 The Practice of Programming {Kernighan, Pike} 500234 The C Programming Language {Kernighan, Ritchie} 800345 Algorithms in C {Sedgewick} 650
ORDERSisbn custid custname street city state zipcode quantity123 222 Harvard 1256 Mass Ave Cambridge MA 02138 20 345 222 Harvard 1256 Mass Ave Cambridge MA 02138 100123 111 Princeton 114 Nassau St Princeton NJ 08540 30
Relational DB Design: DB0
25
DB0:
![Page 27: (Part 3) Database Programming](https://reader030.vdocuments.us/reader030/viewer/2022012714/61ac9174421c02020a02bd94/html5/thumbnails/27.jpg)
Relational DB Design: DB0
• Def: A table is in first normal form iff all underlying domains contain atomic values only– All modern relational DBMSs enforce first
normal form
26
![Page 28: (Part 3) Database Programming](https://reader030.vdocuments.us/reader030/viewer/2022012714/61ac9174421c02020a02bd94/html5/thumbnails/28.jpg)
Relational DB Design: DB0
Recall: A table is in first normal form iff all underlying domains contain atomic values only
DB0 has cells that contain non-atomic values
ORDERSisbn custid custname street city state zipcode quantity123 222 Harvard 1256 Mass Ave Cambridge MA 02138 20 345 222 Harvard 1256 Mass Ave Cambridge MA 02138 100123 111 Princeton 114 Nassau St Princeton NJ 08540 30
Relational DB Design: DB0
27
BOOKS isbn title authors quantity123 The Practice of Programming {Kernighan, Pike} 500234 The C Programming Language {Kernighan, Ritchie} 800345 Algorithms in C {Sedgewick} 650
DB0 is not in first normal form
DB0:
![Page 29: (Part 3) Database Programming](https://reader030.vdocuments.us/reader030/viewer/2022012714/61ac9174421c02020a02bd94/html5/thumbnails/29.jpg)
BOOKSisbn title quantity123 The Practice of Programming 500234 The C Programming Language 800345 Algorithms in C 650
AUTHORSisbn author123 Kernighan123 Pike234 Kernighan234 Ritchie345 Sedgewick
ORDERSisbn custid custname street city state zipcode quantity123 222 Harvard 1256 Mass Ave Cambridge MA 02138 20 345 222 Harvard 1256 Mass Ave Cambridge MA 02138 100123 111 Princeton 114 Nassau St Princeton NJ 08540 30
DB1:
Relational DB Design: DB1
28
Relational DB Design: DB1
The solution: split BOOKS into BOOKS and AUTHORS
Yields DB1
[see slide]
DB1 is in first normal form
![Page 30: (Part 3) Database Programming](https://reader030.vdocuments.us/reader030/viewer/2022012714/61ac9174421c02020a02bd94/html5/thumbnails/30.jpg)
BOOKSisbn title quantity123 The Practice of Programming 500234 The C Programming Language 800345 Algorithms in C 650
AUTHORSisbn author123 Kernighan123 Pike234 Kernighan234 Ritchie345 Sedgewick
ORDERSisbn custid custname street city state zipcode quantity123 222 Harvard 1256 Mass Ave Cambridge MA 02138 20 345 222 Harvard 1256 Mass Ave Cambridge MA 02138 100123 111 Princeton 114 Nassau St Princeton NJ 08540 30
DB1:
Relational DB Design: DB1
29DB1 design seems wrong
Relational DB Design: DB1
[see slide]
Intuitively: Still not quite right
The location of the Harvard bookstore is stored redundantlySuppose the Harvard bookstore moves?
![Page 31: (Part 3) Database Programming](https://reader030.vdocuments.us/reader030/viewer/2022012714/61ac9174421c02020a02bd94/html5/thumbnails/31.jpg)
Relational DB Design: DB1
• Somewhat informally...• Def: The primary key for a table is the
minimal set of columns that uniquely identifies any particular row of that table
30
![Page 32: (Part 3) Database Programming](https://reader030.vdocuments.us/reader030/viewer/2022012714/61ac9174421c02020a02bd94/html5/thumbnails/32.jpg)
BOOKSisbn title quantity123 The Practice of Programming 500234 The C Programming Language 800345 Algorithms in C 650
AUTHORSisbn author123 Kernighan123 Pike234 Kernighan234 Ritchie345 Sedgewick
ORDERSisbn custid custname street city state zipcode quantity123 222 Harvard 1256 Mass Ave Cambridge MA 02138 20 345 222 Harvard 1256 Mass Ave Cambridge MA 02138 100123 111 Princeton 114 Nassau St Princeton NJ 08540 30
Primary keys in DB1:
Relational DB Design: DB1
31
Relational DB Design: DB1
[see slide]
In BOOKS the primary key is isbn
In ORDERS the primary key is isbn+custid
In AUTHORS the primary key is isbn+authorDegenerate case
![Page 33: (Part 3) Database Programming](https://reader030.vdocuments.us/reader030/viewer/2022012714/61ac9174421c02020a02bd94/html5/thumbnails/33.jpg)
Relational DB Design: DB1
• Def: A column C2 of a table is functionally dependent on a column C1 iff, for each row in the table, the value of C1 determines the value of C2
• In DB1...
32
![Page 34: (Part 3) Database Programming](https://reader030.vdocuments.us/reader030/viewer/2022012714/61ac9174421c02020a02bd94/html5/thumbnails/34.jpg)
AUTHORS
ORDERS
BOOKS
isbn
Relational DB Design: DB1
33
title
quantity authorisbn
quantity
custname
street
zipcode
city
statecustid
isbn
Functional dependencies in DB1:
Relational DB Design: DB1
[see slide]
BOOKS:title and quantity are functionally dependent upon isbn
isbn uniquely identifies a rowIf we know the isbn, then we can determine the title and quantity
AUTHORS: degenerate case
ORDERS:Quantity is functionally dependent upon isbn+custidCustname, street, and zipcode are functionally dependent upon custidCity and state are functionally dependent upon zipcode
![Page 35: (Part 3) Database Programming](https://reader030.vdocuments.us/reader030/viewer/2022012714/61ac9174421c02020a02bd94/html5/thumbnails/35.jpg)
Relational DB Design: DB1
• Somewhat informally…• A table is in second normal form iff:
– It is in first normal form, and– No non-primary-key column is dependent on
any proper subset of the primary key
34
![Page 36: (Part 3) Database Programming](https://reader030.vdocuments.us/reader030/viewer/2022012714/61ac9174421c02020a02bd94/html5/thumbnails/36.jpg)
Relational DB Design: DB1
Recall: A table is in second normal form iff:It is in first normal form, andNo non-primary-key column is dependent on any proper subset of the primary key
In ORDERS:Custname, street, and zipcode are functionally dependent upon a proper subset of the isbn+custid primary key
AUTHORS
ORDERS
BOOKS
isbn
Relational DB Design: DB1
35
title
quantity authorisbn
quantity
custname
street
zipcode
city
statecustid
isbn
DB1 is not in second normal form
Functional dependencies in DB1:
![Page 37: (Part 3) Database Programming](https://reader030.vdocuments.us/reader030/viewer/2022012714/61ac9174421c02020a02bd94/html5/thumbnails/37.jpg)
Relational DB Design: DB2
Recall: A table is in second normal form iff:It is in first normal form, andNo non-primary-key column is dependent on any proper subset of the primary key
Solution: Split ORDERS into ORDERS and CUSTOMERS
Yields DB2
[see slide]
BOOKSisbn title quantity123 The Practice of Programming 500234 The C Programming Language 800345 Algorithms in C 650
AUTHORSisbn author123 Kernighan123 Pike234 Kernighan234 Ritchie345 Sedgewick
CUSTOMERScustid custname street city state zipcode111 Princeton 114 Nassau St Princeton NJ 08540 222 Harvard 1256 Mass Ave Cambridge MA 02138333 MIT 292 Main St Cambridge MA 02142
ORDERSisbn custid quantity123 222 20345 222 100123 111 30
Relational DB Design: DB2
36
DB2:
![Page 38: (Part 3) Database Programming](https://reader030.vdocuments.us/reader030/viewer/2022012714/61ac9174421c02020a02bd94/html5/thumbnails/38.jpg)
Relational DB Design: DB2
Recall: A table is in second normal form iff:It is in first normal form, andNo non-primary-key column is dependent on any proper subset of any primary key
Note that DB2 is in second normal form
AUTHORS
ORDERS
BOOKS
isbn
Relational DB Design: DB2
37
title
quantity authorisbn
quantitycustname
street
zipcode
city
state
custid
isbn
custid
CUSTOMERS
Functional dependencies in DB2:
DB2 is in second normal form
![Page 39: (Part 3) Database Programming](https://reader030.vdocuments.us/reader030/viewer/2022012714/61ac9174421c02020a02bd94/html5/thumbnails/39.jpg)
Relational DB Design: DB2
[see slide]
Intuitively: Still not quite right.The fact that zip code 08540 maps to Princeton, NJ could be stored multiple times
CUSTOMERScustid custname street city state zipcode111 Princeton 114 Nassau St Princeton NJ 08540 222 Harvard 1256 Mass Ave Cambridge MA 02138333 MIT 292 Main St Cambridge MA 02142
Relational DB Design: DB2
38
BOOKSisbn title quantity123 The Practice of Programming 500234 The C Programming Language 800345 Algorithms in C 650
AUTHORSisbn author123 Kernighan123 Pike234 Kernighan234 Ritchie345 SedgewickORDERS
isbn custid quantity123 222 20345 222 100123 111 30
DB2:
Design of DB2 seems wrong
![Page 40: (Part 3) Database Programming](https://reader030.vdocuments.us/reader030/viewer/2022012714/61ac9174421c02020a02bd94/html5/thumbnails/40.jpg)
Relational DB Design: DB2
• Somewhat informally...• A table is in third normal form iff:
– It is in second normal form, and– Every non-primary-key column is
non-transitively dependent on the primary key
39
![Page 41: (Part 3) Database Programming](https://reader030.vdocuments.us/reader030/viewer/2022012714/61ac9174421c02020a02bd94/html5/thumbnails/41.jpg)
Relational DB Design :DB2
Recall: A table is in third normal form iff:It is in second normal form, andEvery non-candidate-key column is non-transitively dependent on the candidate keys
City and state are functionally dependent on zipcodeZipcode is functionally dependent on custidSo city and state are transitively functionally dependent upon custid
AUTHORS
ORDERS
BOOKS
isbn
Relational DB Design: DB2
40
title
quantity authorisbn
quantitycustname
street
zipcode
city
state
custid
isbn
custid
CUSTOMERS
Functional dependencies in DB2:
DB2 is not in third normal form
![Page 42: (Part 3) Database Programming](https://reader030.vdocuments.us/reader030/viewer/2022012714/61ac9174421c02020a02bd94/html5/thumbnails/42.jpg)
Relational DB Design: DB3
Recall: A table is in third normal form iff:It is in second normal form, andEvery non-candidate-key column is non-transitively dependent on the candidate keys
The solution: Split CUSTOMERS into CUSTOMERS and ZIPCODES
[see slide]
Yields DB3
BOOKSisbn title quantity123 The Practice of Programming 500234 The C Programming Language 800345 Algorithms in C 650
AUTHORSisbn author123 Kernighan123 Pike234 Kernighan234 Ritchie345 Sedgewick
CUSTOMERScustid custname street zipcode111 Princeton 114 Nassau St 08540 222 Harvard 1256 Mass Ave 02138333 MIT 292 Main St 02142
ORDERSisbn custid quantity123 222 20345 222 100123 111 30
ZIPCODESzipcode city state08540 Princeton NJ02138 Cambridge MA02142 Cambridge MA
Relational DB Design: DB3
41
DB3:
![Page 43: (Part 3) Database Programming](https://reader030.vdocuments.us/reader030/viewer/2022012714/61ac9174421c02020a02bd94/html5/thumbnails/43.jpg)
Relational DB Design: DB3
Recall: A table is in third normal form iff:It is in second normal form, andEvery non-candidate-key column is non-transitively dependent on the candidate keys
[see slide]
Note that DB3 is in third normal form
We’ve now incrementally designed the DB that we used in previous lectures
AUTHORSBOOKS
isbn
Relational DB Design: DB3
42
title
quantity
authorisbn
custname
street
zipcodecity
state
custid
CUSTOMERS
ORDERS
quantitycustid
isbn
zipcode
ZIPCODES
Functional dependenciesin DB3:
DB3 is in third normal form
![Page 44: (Part 3) Database Programming](https://reader030.vdocuments.us/reader030/viewer/2022012714/61ac9174421c02020a02bd94/html5/thumbnails/44.jpg)
Relational DB Design
• Some additional points...– There are more normal forms– Database designers routinely violate normal
forms– DBMS can enforce additional consistency
constraints– There is a substantial mathematical theory of
relational database design
44
Relational DB Design
Some additional points to wrap up…
There are more normal formsSee Date book
Database designers routinely violate normal formsBut a good one does so:
Only with a purposeKnowing the consequences
DBMS can enforce additional consistency constraints; e.g.:Primary key values cannot be NULLPrimary key values must be unique within a tableWithin a table, foreign key values must correspond to primary key values in another table
E.g. orders.isbn is a foreign key with respect to books.isbn
There is a substantial mathematical theory of relational database designMore precise definitions of normal formsSeveral additional normal formsRelational algebra, relational calculusForeign keys, integrity rules
![Page 45: (Part 3) Database Programming](https://reader030.vdocuments.us/reader030/viewer/2022012714/61ac9174421c02020a02bd94/html5/thumbnails/45.jpg)
...See An Introduction to Database Systems (C. J. Date)
![Page 46: (Part 3) Database Programming](https://reader030.vdocuments.us/reader030/viewer/2022012714/61ac9174421c02020a02bd94/html5/thumbnails/46.jpg)
Summary
In Database Programming (Part 3) we have covered:
[see slide]
Summary
• We have covered:– Relational DB transactions: atomicity– Relational DB transactions: locking– Relational DB design
44
![Page 47: (Part 3) Database Programming](https://reader030.vdocuments.us/reader030/viewer/2022012714/61ac9174421c02020a02bd94/html5/thumbnails/47.jpg)
Summary
In the 3 Database Programming lectures we have covered…
[see slide]
Summary
• We have covered:– Databases (DBs) and database
management systems (DBMSs)…– With a focus on relational DBs and
DBMSs…– With a focus on the SQLite DBMS…– With a focus on programming with SQLite
• See also...
45
![Page 48: (Part 3) Database Programming](https://reader030.vdocuments.us/reader030/viewer/2022012714/61ac9174421c02020a02bd94/html5/thumbnails/48.jpg)
See Also
• Appendices– Appendix 1: Before relational DBs– Appendix 2: After relational DBs
• Optional lecture slide decks– PostgreSQL– SQLAlchemy– MongoDB
See Also
AppendicesBefore relational DBs
A look backward through time at the DB fieldAfter relational DBs
An overview of some more recent developments in the DB field
Optional database lecture slide decks
Available via Topics web pageCover some popular database-related technologies that are:
Not used in COS 333 assignmentsOften used in COS 333 projects
PostgreSQLA full-fledged relational DBMSIn contrast with SQLite…
Runs as a serverProvides authenticationRow-level lockingPreferred by the Heroku web service
Good choice for many COS 333 projects
![Page 49: (Part 3) Database Programming](https://reader030.vdocuments.us/reader030/viewer/2022012714/61ac9174421c02020a02bd94/html5/thumbnails/49.jpg)
SQLAlchemyAn object-relational mapper (ORM)Allows you to use a relational DBMS without using SQLMaps relational DB tables to classes, and rows to objects of that classUsed by many COS 333 projects
MongoDBA popular non-relational (noSQL) DBMSUsed by some COS 333 projects
GoalGive you enough information about those technologies to:
Help you decide if you want to use them in your projectHelp you get started with them
SuggestionsView quickly now
To understand the general conceptsSo you know what material is given, in case you need it later
View thoroughly laterAs needed
![Page 50: (Part 3) Database Programming](https://reader030.vdocuments.us/reader030/viewer/2022012714/61ac9174421c02020a02bd94/html5/thumbnails/50.jpg)
Appendix 1:Before Relational DBs
![Page 51: (Part 3) Database Programming](https://reader030.vdocuments.us/reader030/viewer/2022012714/61ac9174421c02020a02bd94/html5/thumbnails/51.jpg)
Before Relational DBs
• Before relational DBs, there were…
• Navigational DBs– Data are linked into graph structure
48
Before Relational DBs
Before relational DBs, there were…
Navigational DBsData are linked into graph structureClient is given “root” nodeClient follows links to desired data
Example…
![Page 52: (Part 3) Database Programming](https://reader030.vdocuments.us/reader030/viewer/2022012714/61ac9174421c02020a02bd94/html5/thumbnails/52.jpg)
Before Relational DBs
[see slide]
The root of the graph points to 0 or more BOOK recordsEach BOOK record contains an isbn, a title, and a quantityEach BOOK record points to 0 or more ORDER recordsEach ORDER record contains a customerid and a quantityEach ORDER record points to one CUSTOMER record Each CUSTOMER record contains a custname, street, and zipcodeEach CUSTOMER record points to one ZIPCODE record Each ZIPCODE record contains a city and stateEach BOOK record points to one or more AUTHOR nodeEach AUTHOR record contains an author (name)
isbn title quantity
author
custname street zipcode
city state
custid quantity
BOOK
ORDER AUTHOR
CUSTOMER
ZIPCODE
Before Relational DBs
49
root
*
* +
1
1
Example Navigational DB
![Page 53: (Part 3) Database Programming](https://reader030.vdocuments.us/reader030/viewer/2022012714/61ac9174421c02020a02bd94/html5/thumbnails/53.jpg)
Princeton | 114 Nassau St | 08540
111 | 30222 | 20ORDER(custid, quantity)
Harvard | 1256 Mass Ave | 02138
CUSTOMER(custname, street,zipcode)
Which customers purchased the book whose ISBN is 123?
123 | The Practice of Programming | 500BOOK(isbn, title, quantity)
Before Relational DBs
50
root
Before Relational DBs
[see slide]
Given a book, it’s easy to find the customers who purchased that book
However, given a customer, it would be hard to find the books purchased by that customer
![Page 54: (Part 3) Database Programming](https://reader030.vdocuments.us/reader030/viewer/2022012714/61ac9174421c02020a02bd94/html5/thumbnails/54.jpg)
Before Relational DBs
• Navigational DBs– Queries are biased– DB designer must anticipate queries
• Relational DBs– Queries are unbiased– DB designer need not anticipate queries– However, can create indices
51
Before Relational DBs
Navigational DBsQueries are biasedDB designer must anticipate queries to create appropriate links
Relational DBsQueries are unbiased
That was Codd’s original motivationDB designer need not anticipate queries
However, can create indicesDBA can create indices, based upon anticipated usage patternsDBMS can create indices, based upon observed usage patterns
![Page 55: (Part 3) Database Programming](https://reader030.vdocuments.us/reader030/viewer/2022012714/61ac9174421c02020a02bd94/html5/thumbnails/55.jpg)
Appendix 2:After Relational DBs
![Page 56: (Part 3) Database Programming](https://reader030.vdocuments.us/reader030/viewer/2022012714/61ac9174421c02020a02bd94/html5/thumbnails/56.jpg)
After Relational DBs
• For some apps:– Relational DBMSs are more complex than
necessary– The relational DB model is a poor fit
53
After Relational DBs
Observations:For some apps, relational DBMSs are more complex than necessary
Many apps don’t need query neutrality, transactions, …For some apps, the relational DB model (tables, rows) is a poor fit
Example: apps that use data that is hierarchical to undefined levels
It’s not the case that one size fits all
One size does not fit some
![Page 57: (Part 3) Database Programming](https://reader030.vdocuments.us/reader030/viewer/2022012714/61ac9174421c02020a02bd94/html5/thumbnails/57.jpg)
After Relational DBs
Data Model
Relational Non-Relational(NoSQL)
Key-ValueStores
DocumentStores
Wide-Column Stores
…
54
After Relational DBs
Those observations motivate the non-relational (“NoSQL”) data models
Note that there are three kinds…
![Page 58: (Part 3) Database Programming](https://reader030.vdocuments.us/reader030/viewer/2022012714/61ac9174421c02020a02bd94/html5/thumbnails/58.jpg)
After Relational DBs
Data Model
Relational Non-Relational(NoSQL)
Key-ValueStores
DocumentStores
Wide-Column Stores
…
55
![Page 59: (Part 3) Database Programming](https://reader030.vdocuments.us/reader030/viewer/2022012714/61ac9174421c02020a02bd94/html5/thumbnails/59.jpg)
After Relational DBs
• Key-value store– Values: arbitrary bytes– Data structure: key-value pairs– Access: by key– Examples: Redis, Memcached, Microsoft
Azure Cosmos DB, Hazelcast, Ehcache
56
After Relational DBs
Key-value storeValues: Arbitrary bytesData structure: key-value pairsAccess: by keyExamples: Redis, Memcached, Microsoft Azure Cosmos DB, Hazelcast, EhcacheValue Stores
Essentially a key-value store database is a persistent associative array (hash table or search tree)Lookup is strictly by key
Very simple; very fast
![Page 60: (Part 3) Database Programming](https://reader030.vdocuments.us/reader030/viewer/2022012714/61ac9174421c02020a02bd94/html5/thumbnails/60.jpg)
After Relational DBs
Data Model
Relational Non-Relational(NoSQL)
Key-ValueStores
DocumentStores
Wide-Column Stores
…
57
![Page 61: (Part 3) Database Programming](https://reader030.vdocuments.us/reader030/viewer/2022012714/61ac9174421c02020a02bd94/html5/thumbnails/61.jpg)
After Relational DBs
• Document store– Values: documents with internal structure
(e.g., JSON)– Data structure: key-value pairs– Access: by key or content– Examples: MongoDB, Amazon DynamoDB,
Couchbase, CouchDB, MarkLogic
58
After Relational DBs
Document storeValues: Documents with internal structure (e.g. JSON) which DBMS can processData structure: key-value pairs
Can access document by key or sometimes by contentAccess: by key or contentExamples:
MongoDB, Amazon DynamoDB, Couchbase, CouchDB, MarkLogic
![Page 62: (Part 3) Database Programming](https://reader030.vdocuments.us/reader030/viewer/2022012714/61ac9174421c02020a02bd94/html5/thumbnails/62.jpg)
After Relational DBs
Data Model
Relational Non-Relational(NoSQL)
Key-ValueStores
DocumentStores
Wide-Column Stores
…
59
![Page 63: (Part 3) Database Programming](https://reader030.vdocuments.us/reader030/viewer/2022012714/61ac9174421c02020a02bd94/html5/thumbnails/63.jpg)
After Relational DBs
• Wide-column store– Values: Arbitrary bytes– Data structure: Multidimensional associative
array– Examples: Cassandra, HBase, Microsoft
Azure Cosmos DB
60
After Relational DBs
Wide-column storeValues: Arbitrary bytesData structure: Multidimensional associative array
Typically sparsely populatedExamples:
Cassandra, HBase, Microsoft Azure Cosmos DB
I haven’t used
Origin: Google BigTable
![Page 64: (Part 3) Database Programming](https://reader030.vdocuments.us/reader030/viewer/2022012714/61ac9174421c02020a02bd94/html5/thumbnails/64.jpg)
After Relational DBs
[see slide]
NotesRelational data model still dominates, by a wide marginPostgreSQL is the preferred DBMS when deploying on HerokuSQLite is in the top 10!
After Relational DBs
Rank DBMS DB Data Model Score1 Oracle Relational 12632 MySQL Relational 12283 Microsoft SQL Server Relational 9824 PostgreSQL Relational 5775 MongoDB Document Store 4966 Redis Key-Value Store 1687 DB2 Relational 1659 SQLite Relational 13010 Cassandra Wide-Column Store 114
Popular DBMSs, according to https://db-engines.com/en/ranking as of July 2021:
61