csc 261/461 –database systems lecture 5 · 2017. 9. 20. · 6 nested queries: sub-queries...

Post on 21-Feb-2021

5 Views

Category:

Documents

0 Downloads

Preview:

Click to see full reader

TRANSCRIPT

CSC 261/461 – Database SystemsLecture 5

Fall 2017

MULTISET OPERATIONS IN SQL

2

UNION

3

SELECT R.AFROM R, SWHERE R.A=S.AUNIONSELECT R.AFROM R, TWHERE R.A=T.A Q1 Q2

𝑟. 𝐴 𝑟. 𝐴 = 𝑠. 𝐴 ∪ 𝑟. 𝐴 𝑟. 𝐴 = 𝑡. 𝐴}

Whyaren’tthereduplicates?

Whatifwewantduplicates?

UNION ALL

4

SELECT R.AFROM R, SWHERE R.A=S.AUNION ALLSELECT R.AFROM R, TWHERE R.A=T.A Q1 Q2

𝑟. 𝐴 𝑟. 𝐴 = 𝑠. 𝐴 ∪ 𝑟. 𝐴 𝑟. 𝐴 = 𝑡. 𝐴}

ALLindicatesMultisetoperations

EXCEPT

5

SELECT R.AFROM R, SWHERE R.A=S.AEXCEPTSELECT R.AFROM R, TWHERE R.A=T.A Q1 Q2

𝑟. 𝐴 𝑟. 𝐴 = 𝑠. 𝐴 \{𝑟. 𝐴|𝑟. 𝐴 = 𝑡. 𝐴}

6

Nested queries: Sub-queries Returning Relations

SELECT DISTINCT c.cityFROM Company cWHERE c.name IN (

SELECT pr.makerFROM Purchase p, Product prWHERE p.product = pr.name

AND p.buyer = ‘Joe Blow‘)

“CitieswhereonecanfindcompaniesthatmanufactureproductsboughtbyJoeBlow”

Company(name, city)Product(name, maker)Purchase(id, product, buyer)

Anotherexample:

7

Subqueries Returning Relations

SELECT nameFROM ProductWHERE price > ALL(

SELECT priceFROM ProductWHERE maker = ‘Gizmo-Works’)

Product(name, price, category, maker)

Youcanalsouseoperationsoftheform:• s>ALLR• s<ANYR• EXISTSR

Findproductsthataremoreexpensivethanallthoseproducedby“Gizmo-Works”

Ex:

ANYandALLnotsupportedbySQLite.

8

Subqueries Returning Relations

SELECT p1.nameFROM Product p1WHERE p1.maker = ‘Gizmo-Works’

AND EXISTS(SELECT p2.name

FROM Product p2WHERE p2.maker <> ‘Gizmo-Works’

AND p1.name = p2.name)

Product(name, price, category, maker)

Youcanalsouseoperationsoftheform:• s>ALLR• s<ANYR• EXISTSR

Find‘copycat’products,i.e.productsmadebycompetitorswiththesamenamesasproductsmadeby“Gizmo-Works”

Ex:

<>means!=

9

Nested queries as alternatives to INTERSECT and EXCEPT

(SELECT R.A, R.BFROM R)INTERSECT(SELECT S.A, S.BFROM S)

SELECT R.A, R.BFROM RWHERE EXISTS(

SELECT *FROM S

WHERE R.A=S.A AND R.B=S.B)

SELECT R.A, R.BFROM RWHERE NOT EXISTS(

SELECT *FROM SWHERE R.A=S.A AND R.B=S.B)

(SELECT R.A, R.BFROM R)EXCEPT(SELECT S.A, S.BFROM S)

10

Correlated Queries

SELECT DISTINCT titleFROM Movie AS mWHERE year <> ANY(

SELECT yearFROM MovieWHERE title = m.title)

Movie(title, year, director, length)Findmovieswhosetitleappearsmorethanonce.

Notethescopingofthevariables!

Basic SQL Summary

• SQL provides a high-level declarative language for manipulating data (DML)

• The workhorse is the SFW block

• Set operators are powerful but have some subtleties

• Powerful, nested queries also allowed.

11

2. AGGREGATION & GROUP BY

12

What you will learn about in this section

1. Aggregationoperators

2. GROUPBY

3. GROUPBY:withHAVING,semantics

13

14

Aggregation

SELECT COUNT(*)FROM ProductWHERE year > 1995

ExceptCOUNT,allaggregationsapplytoasingleattribute

SELECT AVG(price)FROM ProductWHERE maker = “Toyota”

• SQLsupportsseveralaggregation operations:• SUM,COUNT,MIN,MAX,AVG

15

• COUNTappliestoduplicates,unlessotherwisestated

SELECT COUNT(category) FROM ProductWHERE year > 1995

Note:SameasCOUNT(*).Why?

Weprobablywant:

SELECT COUNT(DISTINCT category)FROM ProductWHERE year > 1995

Aggregation: COUNT

16

Purchase(product, date, price, quantity)

More Examples

SELECT SUM(price * quantity)FROM Purchase

SELECT SUM(price * quantity)FROM PurchaseWHERE product = ‘bagel’

Whatdothesemean?

17

Simple Aggregations

PurchaseProduct Date Price Quantitybagel 10/21 1 20banana 10/3 0.5 10banana 10/10 1 10bagel 10/25 1.50 20

SELECT SUM(price * quantity)FROM PurchaseWHERE product = ‘bagel’

50(=1*20+1.50*20)

18

Grouping and Aggregation

SELECT product,SUM(price * quantity) AS TotalSales

FROM PurchaseWHERE date > ‘10/1/2005’GROUP BY product

Let’sseewhatthismeans…

Findtotalsalesafter10/1/2005perproduct.

Purchase(product, date, price, quantity)

19

Grouping and Aggregation

1.ComputetheFROM andWHERE clauses

2.GroupbytheattributesintheGROUPBY

3.ComputetheSELECT clause:groupedattributesandaggregates

Semanticsofthequery:

20

1. Compute the FROM and WHERE clauses

Product Date Price QuantityBagel 10/21 1 20Bagel 10/25 1.50 20Banana 10/3 0.5 10Banana 10/10 1 10

SELECT product, SUM(price*quantity) AS TotalSalesFROM PurchaseWHERE date > ‘10/1/2005’GROUP BY product

FROM

Product Date Price QuantityBagel 10/21 1 20Bagel 10/25 1.50 20Banana 10/3 0.5 10Banana 10/10 1 10

21

2. Group by the attributes in the GROUP BY

SELECT product, SUM(price*quantity) AS TotalSalesFROM PurchaseWHERE date > ‘10/1/2005’GROUP BY product

GROUP BY Product Date Price Quantity

Bagel10/21 1 2010/25 1.50 20

Banana10/3 0.5 1010/10 1 10

22

3. Compute the SELECT clause: grouped attributes and aggregates

SELECT product, SUM(price*quantity) AS TotalSalesFROM PurchaseWHERE date > ‘10/1/2005’GROUP BY product

Product TotalSales

Bagel 50

Banana 15

SELECTProduct Date Price Quantity

Bagel10/21 1 2010/25 1.50 20

Banana10/3 0.5 1010/10 1 10

23

HAVING Clause

Samequeryasbefore,exceptthatweconsideronlyproductsthathavemorethan100buyers

HAVINGclausescontainsconditionsonaggregates

SELECT product, SUM(price*quantity)FROM PurchaseWHERE date > ‘10/1/2005’GROUP BY productHAVING SUM(quantity) > 100

WhereasWHEREclausesconditiononindividualtuples…

24

General form of Grouping and Aggregation

• S = Can ONLY contain attributes a1,…,ak and/or aggregates over other attributes

• C1 = is any condition on the attributes in R1,…,Rn

• C2 = is any condition on the aggregate expressions

SELECT SFROM R1,…,RnWHERE C1GROUP BY a1,…,akHAVING C2

Why?

25

General form of Grouping and Aggregation

SELECT SFROM R1,…,RnWHERE C1GROUP BY a1,…,akHAVING C2

Evaluationsteps:

1. EvaluateFROM-WHERE:applyconditionC1 ontheattributesinR1,…,Rn

2. GROUPBYtheattributesa1,…,ak3. ApplyconditionC2 toeachgroup(mayhaveaggregates)

4. ComputeaggregatesinSandreturntheresult

26

Group-by vs. Nested Query

• Find authors who wrote ³ 10 documents:

• Attempt 1: with nested queriesSELECT DISTINCT Author.nameFROM AuthorWHERE COUNT(

SELECT Wrote.urlFROM WroteWHERE Author.login = Wrote.login) > 10

Author(login, name)Wrote(login, url)

ThisisSQLbyanovice

27

Group-by vs. Nested Query

• Find all authors who wrote at least 10 documents:• Attempt 2: SQL style (with GROUP BY)

SELECT Author.nameFROM Author, WroteWHERE Author.login = Wrote.loginGROUP BY Author.nameHAVING COUNT(Wrote.url) > 10

NoneedforDISTINCT:automaticallyfromGROUPBY

ThisisSQLbyanexpert

Group-by vs. Nested Query

Which way is more efficient?

• Attempt #1- With nested: How many times do we do a SFW query over all of the Wrote relations?

• Attempt #2- With group-by: How about when written this way?

WithGROUPBYcanbemuchmoreefficient!

Topics Covered

1. NULLs

2. OuterJoins

3. WithandCase

4. Constraint

5. Schema Change Statements

29

Comparisons Involving NULLand Three-Valued Logic (cont’d.)

Slide 7- 6

31

NULLS in SQL

• Whenever we don’t have a value, we can put a NULL

• Can mean many things:– Value does not exists– Value exists but is unknown– Value not applicable– Etc.

• The schema specifies for each attribute if can be null (nullable attribute) or not

• Each individual NULL value considered to be different from every other NULL value

• SQL uses a three-valued logic:– TRUE, FALSE, and UNKNOWN (like Maybe)

• NULL = NULL comparison is avoided

• How does SQL cope with tables that have NULLs?

32

Null Values

Unexpected behavior:

SELECT *FROM PersonWHERE age < 25 OR age >= 25

SomePersonsarenotincluded!

33

Null Values

Can test for NULL explicitly:– x IS NULL– x IS NOT NULL

SELECT *FROM PersonWHERE age < 25 OR age >= 25

OR age IS NULL

NowitincludesallPersons!

34

RECAP: Inner Joins

By default, joins in SQL are “inner joins”:

SELECT Product.name, Purchase.storeFROM Product JOIN Purchase ON Product.name = Purchase.prodName

SELECT Product.name, Purchase.storeFROM Product, PurchaseWHERE Product.name = Purchase.prodName

Product(name, category)Purchase(prodName, store)

Bothequivalent:BothINNERJOINS!

35

Inner Joins + NULLS = Lost data?

By default, joins in SQL are “inner joins”:

However:Productsthatneversold(withnoPurchasetuple)willbelost!

SELECT Product.name, Purchase.storeFROM Product JOIN Purchase ON Product.name = Purchase.prodName

SELECT Product.name, Purchase.storeFROM Product, PurchaseWHERE Product.name = Purchase.prodName

Product(name, category)Purchase(prodName, store)

36

Outer Joins

• An outer join returns tuples from the joined relations that don’t have a corresponding tuple in the other relations– I.e. If we join relations A and B on a.X = b.X, and there is an entry in A

with X=5, but none in B with X=5…• A LEFT OUTER JOIN will return a tuple (a, NULL)!

• Left outer joins in SQL: SELECT Product.name, Purchase.storeFROM Product LEFT OUTER JOIN Purchase ON

Product.name = Purchase.prodName

Nowwe’llgetproductseveniftheydidn’tsell

37

name category

Gizmo gadget

Camera Photo

OneClick Photo

prodName store

Gizmo Wiz

Camera Ritz

Camera Wiz

name store

Gizmo Wiz

Camera Ritz

Camera Wiz

Product PurchaseINNER JOIN:

SELECT Product.name, Purchase.storeFROM Product INNER JOIN Purchase

ON Product.name = Purchase.prodName

Note:anotherequivalentwaytowriteanINNERJOIN!

38

name category

Gizmo gadget

Camera Photo

OneClick Photo

prodName store

Gizmo Wiz

Camera Ritz

Camera Wiz

name store

Gizmo Wiz

Camera Ritz

Camera Wiz

OneClick NULL

Product PurchaseLEFT OUTER JOIN:

SELECT Product.name, Purchase.storeFROM Product LEFT OUTER JOIN Purchase

ON Product.name = Purchase.prodName

39

Other Outer Joins

• Left outer join:– Include the left tuple even if there’s no match

• Right outer join:– Include the right tuple even if there’s no match

• Full outer join:– Include the both left and right tuples even if there’s no match

Acknowledgement

• Some of the slides in this presentation are taken from the slides provided by the authors.

• Many of these slides are taken from cs145 course offered byStanford University.

• Thanks to YouTube, especially to Dr. Daniel Soper for his useful videos.

CSC261,Spring2017,UR

top related