cm2020: introduction to database systems queries in relational databases database systems 4 th...
TRANSCRIPT
CM2020: Introduction to Database Systems
Queries In Relational Databases
Database Systems 4th edition
Connolly and Begg Chapter 5
Dr Nirmalie [email protected]
Tutorials and Labs• Jointly organised with Dr Robin Boswell ([email protected]) and
Hatem Ahriz ([email protected])
• Tutorials: Tuesdays 2-3pm– A (Robin): C39/39a SAS (CS2, CIM2)– B (Nirmalie): C411 Schoolhill (BIS2, MMD3, CIMD3)– C (Hatem):B40 SAS (ISTD2, ISTD3, CGA2)
• Labs: Tuesday 3-5pm– A (Nirmalie): C18 SAS (CS2, CIM2)– B (Hatem): C24 SAS (BIS2, MMD3, CIMD3)– C (Robin): C26 St Andrew street (ISTD2, ISTD3, CGA2)
Last week …
student# name address age course100 Bobby Dundee 17 C100200 Helen Aberdeen 21 C100300 Helen Stirling 18 C200400 Freddy Perth 30 C300
course# course_name location C100 Computing St Andrews St C200 Pharmacy Schoolhill C300 Architecture Garthdee
• Relational Databases (DB)s Consists of relations or tables
• Tables consists of rows and columns
• Primary Key uniquely identifies each row
• Composite primary keys are allowed and formed by combining multiple columns
• Foreign key is an attribute / set of attributes, within one relation that matches the primary key of another relation
Table: Students
Table: Courses
SQL
• Structured Query Language (SQL)• Data Manipulation Language
– SELECT • Single table• Multi table
– MS Access (Query by Example)– GROUP BY with SELECT– INSERT, UPDATE, DELETE
• Data Definition Language– CREATE TABLE, DROP TABLE, ALTER TABLE
• Querying allows retrieval of data in response to a question about the data in the DB
• Relational DBMSs translates queries into SQL (a.k.a Structured Query Language or SEQUEL)
• SQL’s SELECT construct
– extracts data from the tables of the database
– allows specification of conditions
– does not facilitate modification to the tables or the data contained in tables
Querying with SELECT
SELECT
SELECT {* | [column1 [AS newName]] [,column2, …]} [ALL | DISTINCT]
FROM table1 [alias] [, table2,…][WHERE "conditions"][GROUP BY "column-list"][HAVING "conditions"][ORDER BY "column-list" [ASC | DESC] ];
Everything within [ ] is optional
Standard syntax is:
; denotes end of SELECT statement
Single-Table Select Queries
Driversdriver# name address pointsD010 Fred Perth 7D020 Jimmy Dundee 4D030 David Dundee 2D040 John Stirling 3D050 Bobby Aberdeen 12
The result of a query is another table - this table is not permanent - this table is called a DYNASET
(a.k.a Dynamic Data Set) in Microsoft Access
SELECT
Drivers
driver# name address pointsD010 Fred Perth 7D020 Jimmy Dundee 4D030 David Dundee 2D040 John Stirling 3D050 Bobby Aberdeen 12
EXAMPLE: List all the ids (driver#) from the Drivers table.
SELECT driver#FROM Drivers;
driver#D010 D020 D030 D040D050
SELECT driver#
SELECT cont
Drivers
driver# name address points
D010 Fred Perth 7D020 Jimmy Dundee 4D030 David Dundee 2D040 John Stirling 3D050 Bobby Aberdeen 12
EXAMPLE: List all the ids (driver#) and points from the Drivers table
SELECT driver#, pointsFROM Drivers;
driver# pointsD010 7D020 4D030 2D040 3D050 12
SELECT driver#, points
We use commas to
separate fields
SELECT with Wildcards
SELECT *FROM Drivers;
Drivers
driver# name address points
D010 Fred Perth 7D020 Jimmy Dundee 4D030 David Dundee 2D040 John Stirling 3D050 Bobby Aberdeen 12
driver# name address points
D010 Fred Perth 7D020 Jimmy Dundee 4D030 David Dundee 2D040 John Stirling 3D050 Bobby Aberdeen 12
EXAMPLE: List all the details (i.e. all the fields) from the Drivers table
SELECT *
We use * for all fields
SELECT with WHERE
Drivers
driver# name address points
D010 Fred Perth 7D020 Jimmy Dundee 4D030 David Dundee 2D040 John Stirling 3D050 Bobby Aberdeen 12
EXAMPLE: List all the ids (driver#) of the drivers from Dundee from the Drivers table
SELECT driver#FROM DriversWHERE address = “Dundee”;
driver#
D020 D030
SELECT driver#
WHERE address = “Dundee”;
Extending SELECT• We can extend SELECT in a number of ways:
1. Can pattern match within certain fields of a table
- can use the LIKE clause
2. Can check if a field has a value from a set of values
- can use the IN operator
3. Can perform calculations on values in a table
- can use COLUMN FUNCTIONS
4. Can make your queries have a choice or be more specific
- can use the OR and AND clauses
5. Can sort output of a query into ascending/descending order
- can use the ORDER BY clause
The LIKE clause• Sometimes we may want to pattern
match within certain fields of a table
? matches a single character Like ‘Sm?th’
* matches any number of characters
Like St*
# matches a single digit [0-9] Like ‘D0#0’
[list] matches any character in list Like ‘[A-F]*’
[!list] matches any character not in list
Like ‘[!A-F]*’
SELECT LIKE
Drivers
driver# name address points
D010 Fred Perth 7D020 Jimmy Dundee 4D030 David Dundee 2D040 John Stirling 3D050 Bobby Aberdeen 12
EXAMPLE: List all the names and addresses of the drivers whose address begins with the letters ‘Du’ from the Drivers table
SELECT name, addressFROM DriversWHERE address Like ‘Du*’;
name address
jimmy DundeeDavid Dundee
SELECT name, addressWHERE address Like ‘Du*’;
The IN Operator
Driversdriver# name address points
D010 Fred Perth 7D020 Jimmy Dundee 4D030 David Dundee 2D040 John Stirling 3D050 Bobby Aberdeen 12
The IN operator is used to check if an attribute(s) has a value from a set of values
[NOT] IN (<value-list>)
name
JimmyDavidBobby
Syntax is:
a) List the names of drivers from Dundee or AberdeenSELECT nameFROM DriversWHERE address IN (‘Dundee’, ‘Aberdeen’);
b) List the names of drivers not from Dundee or AberdeenSELECT nameFROM DriversWHERE address NOT IN (‘Dundee’, ‘Aberdeen’);
name
FredJohn
Column / Aggregate FunctionsColumn functions perform calculations on the values in the column of a table
COUNT - calculates the number of values in a column
SUM - calculates the sum (total) of all values in a column
AVG - calculates the average of all values in a column
MAX - calculates the maximum value in a column
MIN - calculates the minimum value in a column
SELECT with Aggregates
Driversdriver# name address points
D010 Fred Perth 7D020 Jimmy Dundee 4D030 David Dundee 2D040 John Stirling 3D050 Bobby Aberdeen 12
EXAMPLE: Calculate the number of drivers, the total number of points, the average number of points, the maximum points value and the range of points from the Driver
SELECT COUNT(driver#), SUM(points), AVG(points), MAX(points), MIN(points), MAX(points) - MIN(points)FROM Drivers;
COUNT(driver#) SUM(points) AVG(points) MAX(points) MIN(points) MAX(points) - MIN(points)
5 28 5.6 12 2 10
SELECT with ORDER BY
Driversdriver# name address points
D010 Fred Perth 7D020 Bobby Dundee 4D030 David Dundee 2D040 John Stirling 3D050 Jimmy Aberdeen 12
List in alphabetical order, names of drivers with a Dundee or Aberdeen address
SELECT nameFROM DriversWHERE address IN (‘Dundee’, ‘Aberdeen’)ORDER BY name;
name
BobbyDavidJimmy
NOTE: Default is ascending
SELECT nameIN (‘Dundee’, ‘Aberdeen’);
SELECT with ORDER BYList in reverse alphabetical order, names of drivers with a Dundee or Aberdeen address
SELECT nameFROM DriversWHERE address IN (‘Dundee’, ‘Aberdeen’)ORDER BY name DESC;
name
JimmyDavidBobby
Driversdriver# name address points
D010 Fred Perth 7D020 Bobby Dundee 4D030 David Dundee 2D040 John Stirling 3D050 Jimmy Aberdeen 12
IN (‘Dundee’, ‘Aberdeen’);SELECT name
DESC means descending
Complex Conditions with WHERE
• Mathematical operators– =, <, >, <>, <=, >=
• Logical operatorsAND, OR, NOT
Combining Operators AND
Driversdriver# name address points
D010 Fred Perth 7D020 Jimmy Dundee 4D030 David Dundee 2D040 John Stirling 3D050 Bobby Aberdeen 12
EXAMPLE: List the names of drivers whose licence points are between 4 and 15 from the Drivers table and not from Dundee
SELECT nameFROM DriversWHERE points >= 4 AND points <= 15AND (address <> ‘Dundee');
name
FredBobby
SELECT name
WHERE points >= 4 AND points <= 15AND (address <> ‘Dundee’);
Driversdriver# name address points
D010 Fred Perth 7D020 Jimmy Dundee 4D030 David Dundee 2D040 John Stirling 3D050 Bobby Aberdeen 12
EXAMPLE: List the names of drivers with addresses from either Aberdeen or Dundee
SELECT nameFROM DriversWHERE (address LIKE ‘D*') OR (address LIKE ‘A*');
name
JimmyDavidBobby
SELECT nameWHERE (address LIKE ‘D*’)OR (address LIKE ‘A*’)
Combining Operators OR
Mathematical Operators• supports the following
– +, - , * , / (division), % (modulo)
item# qty price
Shoes 2 20Raft 1 60Skateboard 10 20Life Vest 1 5
SELECT item, qty*price AS TotalCost FROM Items_ordered;
item# TotalCost
Shoes 40Raft 60Skateboard 200Life Vest 5
Items_ordered
Use “AS” to specify new
column name
Summary: Single Table Queries• Queries may be created in a DBMS using SQL SELECT statement
• The basic ANSI syntax is:
SELECT <list of attributes>FROM <table-name>[ WHERE <condition(s)> ] ;
• Can pattern match within certain fields of a table
- LIKE clause
• Can check if a field has a value from a set of values
- IN operator
• Can perform calculations on values in a table
- COLUMN FUNCTIONS
• Can make your queries have a choice or be more specific
- OR and AND clauses
• Can sort output of a query into ascending/descending order
- ORDER BY clause
Multi-Table Queries
• Commonly require data from more than one table
• Involves JOINING tables to form a SUPERTABLE
• Table JOINS– PRODUCT or CARTESIAN– INNER JOIN or EQUI JOIN– NATURAL JOIN– OUTER JOIN (LEFT and RIGHT)
Multi-Table Select Queries
• List data on all orders including customer names and addresses
Customerscust# name address
C100 Allan AberdeenC101 John DundeeC102 Betty StirlingC200 Dean Dundee
Ordersorder# cust# date
2000 C100 20/11/013000 C101 27/11/014000 C100 30/11/01
Product Join• forms every possible combination of
rows combining all columnsE.g Customer PRODUCT Order
C.cust# name address order# O.cust# date
C100 Allan Aberdeen 2000 C100 20/11/01C101 John Dundee 2000 C100 20/11/01C102 Betty Stirling 2000 C100 20/11/01C200 Dean Dundee 2000 C100 20/11/01C100 Allan Aberdeen 3000 C101 27/11/95
etc
SELECT *FROM Customer, Order;
Inner Join (or Equi-Join)
1. Apply a product join2. Retrieve rows with matching linked attribute
values
E.g Customer INNER JOIN Order
C.cust# name address order# O.cust# date
C100 Allan Aberdeen 2000 C100 20/11/01C101 John Dundee 3000 C101 27/11/01C100 Allan Aberdeen 4000 C100 30/11/01
SELECT *FROM Customer C, Order OWHERE C.cust# = O.cust#;
Inner join is a more useful kind of join for data retrieval
Use of alias. Can also use “AS”
here. E.g. Order AS O
Natural Join
C.cust# name address order# date
C100 Allan Aberdeen 2000 20/11/01C101 John Dundee 3000 27/11/01C100 Allan Aberdeen 4000 30/11/01
SELECT C.cust, name, address, order#, dateFROM Customer C, Order OWHERE C.cust# = O.cust#;
Similar to Inner join but with a single matching attributee.g. no O.cust#
Queries on joined tables
cust# name address order# date
C101 John Dundee 3000 27/11/01C100 Allan Aberdeen 4000 30/11/01
SELECT C.cust#, name, order#, dateFROM Customer AS C, Order AS OWHERE Customer.cust# = Orders.cust#AND date > 25/11/01;
OUTER JOIN
• Use when we want to see the details from unmatched rows as well– columns of the output corresponding
to unavailable data are set to null• Right Outer JOIN
– E.g. Not all customers have placed orders in the Orders table
• Left Outer Join
Left Outer Join• rows in the 1st table with unmatched rows in the 2nd
table are still included with NULL values
E.g Customer LEFT OUTER JOIN Order
cust# name address order# date
C100 Allan Aberdeen 2000 20/11/01C101 John Dundee 3000 27/11/01C100 Allan Aberdeen 4000 30/11/01C102 Betty Stirling NULL NULLC200 Dean Dundee NULL NULL
SELECT Customer.cust#, name, order#, dateFROM CustomerLEFT JOIN orderON Customer.cust# = Orders.cust#;
Summary Multi-Table Joins
• It is possible to join as many tables as you want– generally for n tables n-1 joins
• Inner Join (and Natural join) is most common– outer joins will display unmatched
rows
Query Processor
• Extracts required information from database
• How?– Structured Query Language (SQL)– Query by Example (QBE)
• Querying without having to know SQL
QBE with MS AccessE.g. Inner Join
MS Access SQL View
Fu
ncti
on
ally S
imilar
SQL - Next
• Structured Query Language (SQL)• Data Manipulation Language
– SELECT • Single table• Multi table
– MS Access (Query by Example)– GROUP BY with SELECT– INSERT, UPDATE, DELETE
• Data Definition Language– CREATE TABLE, DROP TABLE, ALTER TABLE
Grouping Results
• Useful to summarise results
• Closely integrated with SELECT– Specify column names– Aggregate functions e.g. count, avg, sum, max, min– WHERE conditions– Any combination of the above
• All columns names in the SELECT must appear in the GROUP BY– HAVING clause used to specify conditions on groups
of rows
GROUP BYANSI syntax is:
All select columnsMust appear here
Specify conditions applicable to groups
SELECT {* | [column1 [AS newName]] [,column2, …]} [ALL | DISTINCT]
FROM table1 [alias] [, table2,…][WHERE "conditions"][GROUP BY "column-list"][HAVING "conditions"][ORDER BY "column-list" [ASC | DESC] ];
Group By Clause ExampleGiven the following table:
Employees
NI_no name address salaryNS 111111 Fred Aberdeen 15,000NS 222222 Bobby Dundee 20,000NS 333333 Dave Aberdeen 12,000NS 444444 Steve Stirling 10,000NS 555555 Betty Dundee 25,000
Show, for each address, the number of employees that live there
SELECT address, count(NI_no)FROM EmployeesGROUP BY address;
Address count(NI_no)Aberdeen 2Dundee 2Stirling 1
Group By Clause Example
In order to do the previous query, an intermediate table is created that is group by (sorted on) address i.e.:
Employees
NI_no name address salaryNS 111111 Fred Aberdeen 15,000NS 333333 Dave Aberdeen 12,000NS 222222 Bobby Dundee 20,000NS 555555 Betty Dundee 25,000NS 444444 Steve Stirling 10,000
We perform count(NI_no) for each of the 3 groupings
grouping 1
grouping 2
grouping 3
Given the following table:
EmployeesNI_no name address salaryNS 111111 Fred Aberdeen 15,000NS 222222 Bobby Dundee 20,000NS 333333 Dave Aberdeen 12,000NS 444444 Steve Stirling 10,000NS 555555 Betty Dundee 25,000
Show the addresses whose employees’ average salary is greater than £21,000. Also show the average salary.
SELECT address, avg(salary)FROM EmployeesGROUP BY addressHAVING avg(salary) > 12000;
address avg(salary)Aberdeen 13500Dundee 22500
NOTE: compare with intermediate table from previous slide
GROUP BY with Having