lecture #4 october 19, 2000 sql. administration exam date officially moved to december 7 th, 6:30pm,...
Post on 21-Dec-2015
214 views
TRANSCRIPT
Administration
• Exam date officially moved to December 7th, 6:30pm, here.
• Homework #3 – will be on the web site tomorrow. Due in two weeks. SQL.
• Accounts are being set up.
• Projects, projects, projects.
SQL IntroductionStandard language for querying and manipulating data
Structured Query Language
Many standards out there: SQL92, SQL2, SQL3, SQL99Vendors support various subsets of these, but all of what we’llbe talking about.
Basic form: (many many more bells and whistles in addition)
Select attributes From relations (possibly multiple, joined) Where conditions (selections)
Selections
SELECT * FROM Company WHERE country=“USA” AND stockPrice > 50
You can use: attribute names of the relation(s) used in the FROM. comparison operators: =, <>, <, >, <=, >= apply arithmetic operations: stockprice*2 operations on strings (e.g., “||” for concatenation). Lexicographic order on strings. Pattern matching: s LIKE p Special stuff for comparing dates and times.
Projections
SELECT name AS company, stockprice AS price FROM Company WHERE country=“USA” AND stockPrice > 50
SELECT name, stock price FROM Company WHERE country=“USA” AND stockPrice > 50
Select only a subset of the attributes
Rename the attributes in the resulting table
Ordering the Results
SELECT name, stock price FROM Company WHERE country=“USA” AND stockPrice > 50 ORDERBY country, name
Ordering is ascending, unless you specify the DESC keyword.
Ties are broken by the second attribute on the ORDERBY list, etc.
Joins
SELECT name, store FROM Person, Purchase WHERE per-name=buyer AND city=“Seattle” AND product=“gizmo”
Product ( pname, price, category, maker)Purchase (buyer, seller, store, product)Company (cname, stock price, country)Person( per-name, phone number, city)
Disambiguating Attributes
SELECT Person.name FROM Person, Purchase, Product WHERE Person.name=buyer AND product=Product.name AND Product.category=“telephony”
Product ( name, price, category, maker)Purchase (buyer, seller, store, product)Person( name, phone number, city)
Find names of people buying telephony products:
Tuple Variables
SELECT product1.maker, product2.maker FROM Product AS product1, Product AS product2 WHERE product1.category=product2.category AND product1.maker <> product2.maker
Product ( name, price, category, maker)
Find pairs of companies making products in the same category
First Unintuitive SQLismSELECT R.AFROM R,S,TWHERE R.A=S.A OR R.A=T.A
Looking for R (S T)
But what happens if T is empty?
Union, Intersection, Difference(SELECT name FROM Person WHERE City=“Seattle”)
UNION
(SELECT name FROM Person, Purchase WHERE buyer=name AND store=“The Bon”)
Similarly, you can use INTERSECT and EXCEPT.You must have the same attribute names (otherwise: rename).
Exercises
Product ( pname, price, category, maker)Purchase (buyer, seller, store, product)Company (cname, stock price, country)Person( per-name, phone number, city)
Ex #1: Find people who bought telephony products.Ex #2: Find names of people who bought American productsEx #3: Find names of people who bought American products and did not buy French productsEx #4: Find names of people who bought American products and they live in Seattle.Ex #5: Find people who bought stuff from Joe or bought products from a company whose stock prices is more than $50.
SubqueriesSELECT Purchase.productFROM PurchaseWHERE buyer = (SELECT name FROM Person WHERE social-security-number = “123 - 45 - 6789”);
In this case, the subquery returns one value.
If it returns more, it’s a run-time error.
Subqueries Returning Relations
SELECT Company.name FROM Company, Product WHERE Company.name=maker AND Product.name IN (SELECT product FROM Purchase WHERE buyer = “Joe Blow”);
Find companies who manufacture products bought by Joe Blow.
You can also use: s > ALL R s > ANY R EXISTS R
Conditions on Tuples
SELECT Company.name FROM Company, Product WHERE Company.name=maker AND (Product.name,price) IN (SELECT product, price) FROM Purchase WHERE buyer = “Joe Blow”);
Correlated Queries
SELECT title FROM Movie AS Old WHERE year < ANY (SELECT year FROM Movie WHERE title = Old.title);
Movie (title, year, director, length) Movie titles are not unique (titles may reappear in a later year).
Find movies whose title appears more than once.
Note scope of variables
Removing Duplicates
SELECT DISTINCT Company.name FROM Company, Product WHERE Company.name=maker AND (Product.name,price) IN (SELECT product, price) FROM Purchase WHERE buyer = “Joe Blow”);
Conserving Duplicates
(SELECT name FROM Person WHERE City=“Seattle”)
UNION ALL
(SELECT name FROM Person, Purchase WHERE buyer=name AND store=“The Bon”)
The UNION, INTERSECTION and EXCEPT operators operate as sets, not bags.
Grouping and Aggregation
Product Date Price Quantity
Bagel 10/21 0.85 15
Banana 10/22 0.52 7
Banana 10/19 0.52 17
Bagel 10/20 0.85 20
Example 1: find total sales for the entire database
Simple Aggregation
SELECT Sum(price * quantity)FROM Purchase
SELECT Sum(price * quantity)FROM PurchaseWHERE product = ‘bagel’
SQL supports several aggregation operations:
SUM, MIN, MAX, AVG, COUNT
Except COUNT, all aggregations apply to a single attribute
Grouping and Aggregation
Product Date Price Quantity
Bagel 10/21 0.85 15
Banana 10/22 0.52 7
Banana 10/19 0.52 17
Bagel 10/20 0.85 20
Example 2: find total sales per product.
Solution: Two Steps
Product Date Price Quantity
Banana 10/19 0.52 17
Banana 10/22 0.52 7
Bagel 10/20 0.85 20
Bagel 10/21 0.85 15
Example 2: find total sales per product.
First: group the entries by product.
Then, aggregate
Product TotalSales
Bagel $29.75
Banana $12.48
SELECT product, Sum(price * quantity) AS TotalSalesFROM PurchaseGROUP BY product
Another Example
SELECT product, Sum(price * quantity) AS SumSales Max(quantity) AS MaxQuantityFROM PurchaseGROUP BY product
For every product, what is the total sales and max quantity sold?
Product SumSales MaxQuantity
Banana $12.48 17
Bagel $29.75 20
Grouping and Aggregation: Summary
SELECT product, Sum(price)FROM Product, PurchaseWHERE Product.name = Purchase.productGROUP BY Product.name
1. Compute the relation (I.e., the FROM and WHERE).2. Group by the attributes in the GROUP BY3. Select one tuple for every group (and apply aggregation)
SELECT can have (1) grouped attributes or (2) aggregates.
HAVING Clause
SELECT product, Sum(price * quantity)FROM PurchaseGROUP BY productHAVING Sum(quantity) > 30
Same query, except that we consider only products that hadat least 100 buyers.
HAVING clause contains conditions on aggregates.
Exercises
Product ( pname, price, category, maker)Purchase (buyer, seller, store, product)Company (cname, stock price, country)Person( per-name, phone number, city)
Ex #6: What is the total sales of products by country of manufacture?
Ex #7: Find the total revenue from products of which 100,000 units or more have been sold.
Modifying the Database
We have 3 kinds of modifications: insertion, deletion, update.
Insertion: general form -- INSERT INTO R(A1,…., An) VALUES (v1,…., vn)
Insert a new purchase to the database:
INSERT INTO Purchase(buyer, seller, product, store) VALUES (‘Joe’, ‘Fred’, ‘wakeup-clock-espresso-machine’, ‘The Sharper Image’)
If we don’t provide all the attributes of R, they will be filled with NULL.
We can drop the attribute names if we’re providing all of them in order.
More Interesting Insertions
INSERT INTO PRODUCT(name)
SELECT DISTINCT product FROM Purchase WHERE product NOT IN (SELECT name FROM Product)
The query replaces the VALUES keyword.
Note the order of querying and inserting.
Deletions
DELETE FROM PURCHASE
WHERE seller = ‘Joe’ AND product = ‘Brooklyn Bridge’
Factoid about SQL: there is no way to delete only a single occurrence of a tuple that appears twice in a relation.
Updates
UPDATE PRODUCTSET price = price/2WHERE Product.name IN (SELECT product FROM Purchase WHERE Date =‘Oct, 25, 2000’);
Data Definition in SQL
So far, SQL operations on the data.
Data definition: defining the schema.
• Create tables• Delete tables• Modify table schema
But first: Define data types.
Finally: define indexes.
Data Types in SQL
• Character strings (fixed of varying length)• Bit strings (fixed or varying length)• Integer (SHORTINT)• Floating point• Dates and times
Domains will be used in table declarations.
To reuse domains:CREATE DOMAIN address AS VARCHAR(55)
You wish you had richer types, no? (hang on…)
Creating Tables
CREATE TABLE Person(
name VARCHAR(30), social-security-number INTEGER, age SHORTINT, city VARCHAR(30), gender BIT(1), Birthdate DATE
);
Deleting or Modifying a Table
Deleting: DROP Person;
Altering: (adding or removing an attribute).
ALTER TABLE Person ADD phone CHAR(16);
ALTER TABLE Person DROP age;
What happens when you make changes to the schema?
Default Values
The default of defaults: NULL
Specifying default values:
CREATE TABLE Person(
name VARCHAR(30), social-security-number INTEGER, age SHORTINT DEFAULT 100, city VARCHAR(30) DEFAULT ‘Seattle’, gender CHAR(1) DEFAULT ‘?’, Birthdate DATE
Indexes
REALLY important to speed up query processing time.
Suppose we have a relation
Person (name, SSN, age, city)
An index on SSN enables us to fetch a tuple for a given SSN very efficiently (not have to scan the whole relation).
The problem of deciding which indexes to put on the relations isvery hard! (it’s called: physical database design).
Creating Indexes
CREATE INDEX ssnIndex ON Person(SSN)
Indexes can be created on more than one attribute:
CREATE INDEX doubleindex ON Person (age, city)
Why not create indexes on all the attributes?
Defining Views (!!)Views are relations, except that they are not physically stored.They are used to:• simplify complex queries, and• define distinct conceptual interfaces for different classes of users.
Example view: purchases of telephony products.
CREATE VIEW telephony-purchases AS SELECT product, buyer, seller, store FROM Purchase, Product WHERE Purchase.product = Product.pname AND Product.category = ‘telephony’ The view is materialized when its results are stored in the DBMS.
A Different ViewCREATE VIEW Seattle-view AS
SELECT buyer, seller, product, store FROM Person, Purchase WHERE Person.city = ‘Seattle’ AND Person.per-name = Purchase.buyer
We can later use the views: SELECT name, store FROM Seattle-view, Product WHERE Seattle-view.product = Product.name AND Product.category = ‘shoes’
What’s really happening when we query a view?? It’s unfolded.
Updating ViewsHow can I insert a tuple into a table that doesn’t exist?
CREATE VIEW bon-purchase AS SELECT store, seller, product (note: buyer is not selected) FROM Purchase WHERE store = ‘The Bon Marche’
If we make the following insertion:
INSERT INTO bon-purchase VALUES (‘the Bon Marche’, ‘Joe’, ‘Denby Mug’)
We can simply add a tuple (‘the Bon Marche’, ‘Joe’, NULL, ‘Denby Mug’) to relation Purchase.
Non-Updatable ViewsCREATE VIEW Seattle-view AS
SELECT seller, product, store FROM Person, Purchase WHERE Person.city = ‘Seattle’ AND Person.name = Purchase.buyer
How can we add the following tuple to the view? (‘Joe’, ‘Shoe Model 12345’, ‘Nine West’)In principle, two tuples should be added to the database:
Person: (foo, NullPhoneNumber, ‘Seattle’)Purchase: (foo, ‘Joe’, ‘Nine West’, ‘Shoe Model 12345’)
But it’s very hard to manage the foo’s later, so this update is not legal.
Reusing a Materialized View• Suppose I have only the result of SeattleView:SELECT buyer, seller, product, store FROM Person, Purchase WHERE Person.city = ‘Seattle’ AND Person.per-name = Purchase.buyer
• and I want to answer the query SELECT buyer, seller FROM Person, Purchase WHERE Person.city = ‘Seattle’ AND Person.per-name = Purchase.buyer AND Purchase.product=‘gizmo’.
Then, I can rewrite the query using the view.
Query Rewriting Using Views
Rewritten query: SELECT buyer, seller FROM SeattleView WHERE product= ‘gizmo’
Original query:SELECT buyer, seller FROM Person, Purchase WHERE Person.city = ‘Seattle’ AND Person.per-name = Purchase.buyer AND Purchase.product=‘gizmo’.
Another Example• I still have only the result of SeattleView:SELECT buyer, seller, product, store FROM Person, Purchase WHERE Person.city = ‘Seattle’ AND Person.per-name = Purchase.buyer
• but I want to answer the query SELECT buyer, seller FROM Person, Purchase WHERE Person.city = ‘Seattle’ AND Person.per-name = Purchase.buyer AND Person.Phone LIKE ‘206 543 %’.
The General Problem
• Given a set of views V1,…,Vn, and a query Q, can we answer Q using only the answers to V1,…,Vn?
• Why do we care?– We can answer queries more efficiently. – We can query data sources on the WWW in a
principled manner.
• Many, many papers on this problem.• The best performing algorithm: The MiniCon
Algorithm, (Pottinger & Levy, 2000).• Great survey on the topic: (Halevy, 2000).
Querying the WWW• Assume a virtual schema of the WWW, e.g.,
– Course(number, university, title, prof, quarter)
• Every data source on the web contains the answer to a view over the virtual schema:
UW database: SELECT number, title, prof FROM Course WHERE univ=‘UW’ AND quarter=‘4/99’Stanford database: SELECT number, title, prof, quarter FROM Course WHERE univ=‘Stanford’User query: find all professors who teach “database systems”
SQL: Constraints and Triggers
• Chapter 6 Ullman and Widom• Certain properties we’d like our database to
hold• Modification of the database may break
these properties• Build handlers into the database definition• Key constraints• Referential integrity constraints.
Keys: Fundamental Constraint
• In the CREATE TABLE statement, use:– PRIMARY KEY, UNIQUE
CREATE TABLE MovieStar (
name CHAR(30) PRIMARY KEY,
address VARCHAR(255),
gender CHAR(1));
• Or, list at end of CREATE TABLEPRIMARY KEY (name)
Keys...
• Can use the UNIQUE keyword in same way– …but for any number of attributes
– foreign key only reference PRIMARY KEY
• Indexing KeysCREATE UNIQUE INDEX YearIndex ON Movie(year)
• Makes insertions easier to check for key constraints
• Subtle differences between PRIMARY KEY and UNIQUE
Foreign Key ConstraintsActedIn(ActorName, MovieName)Movies(MovieName, year)If MovieName in ActedIn is a foreign key for
Movies, then: 1) Foreign Key must be a reference to a valid
value in the referenced table.2) … must be a PRIMARY KEY in the
referenced table.Yes, this is a referential integrity constraint.
Declaring Foreign Key Constraints
• REFERENCES keyword...CREATE TABLE ActedIn (
Name CHAR(30) PRIMARY KEY,
MovieName CHAR(30)
REFERENCES Movies(MovieName));
• Or, summarize at end of CREATE TABLEFOREIGN KEY MovieName REFERENCES
Movies(MovieName)
• MovieName must be a PRIMARY KEY
How do we Maintain them?
• Given a change to DB, there are several possible violations:– Insert new tuple with bogus foreign key value– Update a tuple to a bogus foreign key value– Delete a tuple in the referenced table with the
referenced foreign key value– Update a tuple in the referenced table that
changes the referenced foreign key value
How to Maintain?• Recall, ActedIn has FK MovieName...
Movies(MovieName, year)
(Fatal Attraction, 1987)
ActedIn(ActorName, MovieName)
(Michael Douglas, Fatal Attraction)
insert: (Rick Moranis, Strange Brew)
How to Maintain?• Policies for handling the change…
– Reject the update (default)– Cascade (example: cascading deletes)– Set NULL
• Can set update and delete actions independently in CREATE TABLE
MovieName CHAR(30)
REFERENCES Movies(MovieName))
ON DELETE SET NULL
ON UPDATE CASCADE
Constraining Attribute Values
• Constrain invalid values– NOT NULL– gender CHAR(1)
CHECK (gender IN (‘F’, ‘M’))– MovieName CHAR(30)
CHECK (MovieName IN (SELECT MovieName FROM Movies))
• The constraint is checked whenever the value of the attribute is changed or added.
• Isn’t this the same as REFERENCE?
Constraining Values with User Defined ‘Types’
• Can define new domains to use as the attribute type...CREATE DOMAIN GenderDomain CHAR(1)
CHECK (VALUE IN (‘F’, ‘M’));
• Then update our attribute definition...gender GenderDomain
• Note use of VALUE to refer to the attribute value.• Deferred constraints: deal with circular
referencial integrity constraints.
More Complex Constraints...
• …Among several attributes in one table– Specify at the end of CREATE TABLE
CHECK (gender = ‘F’ OR name NOT LIKE ‘Ms.%’)
• Checked whenever a tuple of the relation is added or updated.
• Note, changes in other places in the database may cause the constraint to be violated. If this is important, use assertions.
Declaring Assertions
• CREATE ASSERTION <name> CHECK (<condition>)
CREATE ASSERTION RichPres CHECK
(NOT EXISTS
(SELECT *
FROM Studio, MovieExec
WHERE presC# = cert#
AND netWorth < 10000000))
Checked whenever any change occurs to the database.
Another Example
CREATE ASSERTION SumLength
CHECK (10000 < ALL (SELECT SUM(length) FROM Movie GROUP BY studioName)).
Can we also write this as a tuple constraint?
Different Constraint TypesType Where Declared When activated Guaranteed to hold?
Attribute with attribute on insertion not if CHECK or update subquery
Tuple relation schema insertion or not if CHECK update to subquery relation
Assertion database schema on change to Yes any relation mentioned
TriggersEnable the database programmer to specify:• when to check a constraint,• what exactly to do.
A trigger has 3 parts:
• An event (e.g., update to an attribute)• A condition (e.g., a query to check)• An action (deletion, update, insertion)
When the event happens, the system will check the constraint, and if satisfied, will perform the action.
NOTE: triggers may cause cascading effects. Database vendors did not wait for standards with triggers!
Elements of Triggers (in SQL3)• Timing of action execution: before, after or instead of triggering event
• The action can refer to both the old and new state of the database.
• Update events may specify a particular column or set of columns.
• A condition is specified with a WHEN clause.
• The action can be performed either for• once for every tuple, or• once for all the tuples that are changed by the database operation.
Example: Row Level Trigger
CREATE TRIGGER NoLowerPrices
AFTER UPDATE OF price ON ProductREFERENCING OLD AS OldTuple NEW AS NewTupleWHEN (OldTuple.price > NewTuple.price) UPDATE Product SET price = OldTuple.price WHERE name = NewTuple.name
FOR EACH ROW
Statement Level Trigger CREATE TRIGGER average-price-preserveINSTEAD OF UPDATE OF price ON Product
REFERENCING OLD_TABLE AS OldStuff NEW_TABLE AS NewStuffWHEN (1000 < (SELECT AVG (price) FROM ((Product EXCEPT OldStuff) UNION NewStuff))DELETE FROM Product WHERE (name, price, company) IN OldStuff;INSERT INTO Product (SELECT * FROM NewStuff)
Bad Things Can HappenCREATE TRIGGER Bad-trigger
AFTER UPDATE OF price IN ProductREFERENCING OLD AS OldTuple NEW AS NewTuple
WHEN (NewTuple.price > 50)
UPDATE Product SET price = NewTuple.price * 2 WHERE name = NewTuple.name
FOR EACH ROW
Embedded SQL
• Direct SQL is rarely used: usually, SQL is embedded in some application code.
• We need some method to reference SQL statements.
• But: there is an impedance mismatch problem.• So: we use cursors.• Many things can be explained with the impedance
mismatch.
Programs with SQL
Host language + Embedded SQL
Preprocessor
Host Language + function calls
Host language compiler
Host language program
Preprocessor
Host language compiler
The Impedance Mismatch Problem
The host language manipulates variables, values, pointers
SQL manipulates relations.
There is no construct in the host language for manipulating relations.
Why not use only one language?• Forgetting SQL: “we can quickly dispense with this idea” [Ullman & Widom, pg. 363].• SQL cannot do everything that the host language can do.
Interface: SQL / Host Language
Values get passed through shared variables.
Colons precede shared variables when they occur within the SQL statements.
EXEC SQL: precedes every SQL statement in the host language.
The variable SQLSTATE provides error messages and status reports (e.g., 00000 says that the operation completed with noproblem).
EXEC SQL BEGIN DECLARE SECTION; char productName[30];EXEC SQL END DECLARE SECTION;
Using Shared VariablesVoid simpleInsert() {
EXEC SQL BEGIN DECLARE SECTION; char productName[20], company[30]; char SQLSTATE[6]; EXEC SQL END DECLARE SECTION;
/* get values for productName and company somehow */
EXEC SQL INSERT INTO Product(name, company) VALUES (:productName, :company); }
Single-Row Select Statements
Void getPrice() {
EXEC SQL BEGIN DECLARE SECTION; char productName[20], company[30]; integer price; char SQLSTATE[6]; EXEC SQL END DECLARE SECTION;/* read value of product name */ EXEC SQL SELECT price INTO :price FROM Product WHERE Product.name = :productName;
/* print out value of price */ }
CursorsEXEC SQL DECLARE cursorName CURSOR FOR SELECT …. FROM …. WHERE …. ;
EXEC SQL OPEN cursorName;
while (true) {
EXEC SQL FETCH FROM cursorName INTO :variables;
if (NO_MORE_TUPLES) break;
/* do something with values */ } EXEC SQL CLOSE cursorName;
More on Cursors
• cursors can modify a relation as well as read it.
• We can determine the order in which the cursor will get tuples by the ORDER BY keyword in the SQL query.
• Cursors can be protected against changes to the underlying relations.
• The cursor can be a scrolling one: can go forward, backward +n, -n, Abs(n), Abs(-n).