lab 4 more sql

12
Islamic University of Gaza Faculty of Engineering Department of Computer Engineering ECOM 3422: Database Systems [spring 2020] Abeer J. Al-Aydi & Abdallah H. Azzami Lab 4: More SQL Objectives To be familiar with all types of joins in SQL. To practice writing and understanding queries that employ subqueries. To use your newly acquired knowledge to write rich and complex DML. To familiarize yourself with Dates in SQL. Table of Content OBJECTIVES ____________________________________________________________________________________________ 1 TABLE OF CONTENT _____________________________________________________________________________________ 1 1. FLAVORS OF JOIN___________________________________________________________________________________ 2 JOINED TABLES _________________________________________________________________________________________ 2 JOIN TYPES ____________________________________________________________________________________________ 2 QUALIFIED JOINS ________________________________________________________________________________________ 3 COALESCE FUNCTION _____________________________________________________________________________________ 5 2. SUBQUERIES _______________________________________________________________________________________ 6 SIMPLE SCALAR SUBQUERIES ________________________________________________________________________________ 6 CORRELATED SCALAR SUBQUERIES ____________________________________________________________________________ 6 VIEWING SUBQUERY RESULTS AS SETS __________________________________________________________________________ 7 COMMON TABLE EXPRESSIONS AND THE WITH CLAUSE _______________________________________________________________ 8 3. MORE ADVANCED DML ______________________________________________________________________________ 9 4. MORE LAB WORK _________________________________________________________________________________ 10 5. HOMEWORK _____________________________________________________________________________________ 11 APPENDIX ____________________________________________________________________________________________ 12

Upload: others

Post on 03-Jun-2022

10 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Lab 4 More SQL

Islamic University of Gaza Faculty of Engineering

Department of Computer Engineering

ECOM 3422: Database Systems [spring 2020] Abeer J. Al-Aydi & Abdallah H. Azzami

Lab 4: More SQL

Objectives

To be familiar with all types of joins in SQL.

To practice writing and understanding queries that employ subqueries.

To use your newly acquired knowledge to write rich and complex DML.

To familiarize yourself with Dates in SQL.

Table of Content

OBJECTIVES ____________________________________________________________________________________________ 1

TABLE OF CONTENT _____________________________________________________________________________________ 1

1. FLAVORS OF JOIN ___________________________________________________________________________________ 2

JOINED TABLES _________________________________________________________________________________________ 2 JOIN TYPES ____________________________________________________________________________________________ 2 QUALIFIED JOINS ________________________________________________________________________________________ 3 COALESCE FUNCTION _____________________________________________________________________________________ 5

2. SUBQUERIES _______________________________________________________________________________________ 6

SIMPLE SCALAR SUBQUERIES ________________________________________________________________________________ 6 CORRELATED SCALAR SUBQUERIES ____________________________________________________________________________ 6 VIEWING SUBQUERY RESULTS AS SETS __________________________________________________________________________ 7 COMMON TABLE EXPRESSIONS AND THE WITH CLAUSE _______________________________________________________________ 8

3. MORE ADVANCED DML ______________________________________________________________________________ 9

4. MORE LAB WORK _________________________________________________________________________________ 10

5. HOMEWORK _____________________________________________________________________________________ 11

APPENDIX ____________________________________________________________________________________________ 12

Page 2: Lab 4 More SQL

Lab 4| More SQL

Page 2 of 12

1. Flavors of Join

Last lab we explored the famous select-from-where query. We used different types functions

in that query and we wrote several conditions. In this section we will take a deeper dive into the from

part of the query.

The from clause derives a table from one or more tables given in a comma-separated list. After

the word from and before the clause where those tables can be

a simple table reference such as “instructor”, a derived table such in the form of a subquery as we will see in the following section, or a join construct,

or even a combination of these 3. If more than one table reference is listed in the from clause, the

cartesian product of the listed tables is formed. The result of the from is an intermediate virtual table

that can then be subject to transformations by the where, group by, and other clauses and is finally

the result of the overall table expression.

Joined Tables

A joined table is a table derived from two other (real or derived) tables according to the rules

of the particular join type. Inner, outer, and cross-joins are the different flavors we can choose form.

The general syntax of a joined table is

Table1 {join type} Table2 [join condition]

Joins of all types can be chained together, or nested: either or both Table1 and Table2 can be

themselves joined tables. Parentheses can be used around join clauses to control the join order. In

the absence of parentheses, join clauses nest left-to-right.

Join Types

First of all, please note that this lab assumes you know the theoretical knowledge about joins

covered in chapter 4 of our textbook. So, make sure you got this covered as this lab will focus on

syntax and provide some additional insights, and exercises.

Type 1: Cross join written as

Table1 cross join Table2

This simply computes the cross (Cartesian) product Table1 × Table2

Example 1: select * from instructor cross join teaches

will compute the cross product instructor × teaches

Note that: from instructor cross join teaches is equivalent to from instructor [inner]

join teaches on true. And is also equivalent to from instructor, teaches.

Page 3: Lab 4 More SQL

Lab 4| More SQL

Page 3 of 12

Qualified joins

A join that requires the tuple to satisfy a condition in order to qualify to be a part of the resulting

relation is called a ‘qualified join’. In this sense, all joins other that the cross join are qualified joins.

We can write a qualified join as follows:

Table1 {[inner] | {left | right | full} [outer]} join Table2 on (Boolean expression)

Or

Table1 {[inner] | {left | right | full} [outer]} join Table2 using (join column list)

Or

Table1 natural {[inner] | {left | right | full} [outer]} join Table2

Note that:

The words inner and outer are optional in all forms. inner is the default; left, right, and full

automatically imply an outer join.

The join condition is specified in the on or using clause, or implicitly by the word natural. The join

condition determines which rows from the two source tables are considered to “match”, as we

explained in details in the lecture.

inner join For each row r1 of Table1, the joined table has a row for each row in Table2 that satisfies the join condition with r1.

left outer join

First, an inner join is performed. Then, for each row in Table1 that does not satisfy the join condition with any row in Table2, a joined row is added with null values in columns of Table2. Thus, the joined table always has at least one row for each row in Table1.

right outer join

First, an inner join is performed. Then, for each row in Table2 that does not satisfy the join condition with any row in Table1, a joined row is added with null values in columns of Table1. Thus, the result table will always have a row for each row in Table2.

full outer join First, an inner join is performed. Then, for each row in Table1 that does not satisfy the join condition with any row in Table2, a joined row is added with null values in columns of Table2. Also, for each row of Table2 that does not satisfy the join condition with any row in Table1, a joined row with null values in the columns of Table1 is added.

The on clause is the most general kind of join condition: it takes a boolean value expression of the

same kind as is used in a where clause. A pair of rows from Table1 and Table2 match if the on

expression evaluates to true.

The using clause is a shorthand that allows you to take advantage of the specific situation where both

sides of the join use the same name for the joining column(s). It takes a comma-separated list of the

shared column names and forms a join condition that includes an equality comparison for each one.

For example, joining Table1 and Table2 with using (a, b) produces the join condition on Table1.a =

Table2.a and Table1.b = Table2.b.

Page 4: Lab 4 More SQL

Lab 4| More SQL

Page 4 of 12

Furthermore, the output of join using suppresses redundant columns: there is no need to

print both of the matched columns, since they must have equal values. while join on produces all

columns from Table1 followed by all columns from Table2, join using produces one output column

for each of the listed column pairs (in the listed order), followed by any remaining columns from

Table1, followed by any remaining columns from Table2.

Finally, natural is a shorthand form of using: it forms a using list consisting of all column

names that appear in both input tables. As with using, these columns appear only once in the output

table. If there are no common column names, natural join behaves like join ... on true, producing a cross-product join.

Exercise 1: Execute those statements and observe the deference:

[1] select * from instructor inner join teaches on instructor.id = teaches.id;

[2] select * from instructor inner join teaches using (id);

[3] select * from instructor natural inner join teaches;

Exercise 2: Execute those statements and observe the deference; look for an instructor named ‘Gold’.

[1] select * from instructor left join teaches on instructor.id = teaches.id;

[2] select * from instructor left join teaches using (id);

[3] select * from instructor natural left join teaches;

NOTE: The join condition specified with ON can also contain conditions that do not relate directly to the join. This can prove useful sometimes but you need to be careful.

For example:

select * from instructor left join teaches on instructor.id = teaches.id and teaches.course_id = 'PHY-101'; is very different form:

select * from instructor left join teaches on instructor.id = teaches.id where teaches.course_id = 'PHY-101';

Question 1

Execute both queries to see for yourself and discuss with your colleague why it is so. When you think you have an answer look at [Answer 1] in the appendix to see if you did figure it out correctly.

Page 5: Lab 4 More SQL

Lab 4| More SQL

Page 5 of 12

Hope I got you exited for the lab

Now let’s test your knowledge with this relatively simple task:

Lab Work 1.1

Display a list of all instructors (names and IDs), showing each instructor and the number of sections

he taught. Make sure to show the number of sections as 0 for instructors who have not taught any section. Your query should use an outer join, and should not use subqueries.

Lab Work 1.2

Write the same query as above, but using a scalar subquery, not using outer join.

Coalesce function

coalesce( value1 [,value2, value3, ...] )

The coalesce function returns the first of its arguments that is not null. Null is returned only

if all arguments are null. It is often used in conjunction with outer joins to substitute a default value for null values when data is displayed, for example you can write:

Select name, coalesce( tot_cred , 0 ) from student

This will chick if tot_credit is not null; it will display it. Otherwise it will display 0 instead. You can use this function to archive the requirements of the next exercise:

Lab Work 1.3

Display the list of all course sections offered in Spring 2010, along with the ID and name of each

instructor teaching the section. If a section has more than one instructor, that section should appear

as many times in the result as it has instructors. If a section does not have any instructor, it should

still appear in the result with the instructor name set to “—”.

Page 6: Lab 4 More SQL

Lab 4| More SQL

Page 6 of 12

2. Subqueries

At this point in our lab, we can say that you've learned all the basics of SQL. You understand

how queries work, and how to access information from single or multiple tables. You can even do

some aggregations and produce some meaningful results. There are a few things left to learn that will

make you able to solve any problem you're confronted with. Among those few; the use of subqueries

is the most important.

I will not write a comprehensive theoretical explanation in this section as I think the best way

to learn subqueries is to learn by doing. Also, the textbook and slides can serve as a very good

explanations so I encourage you to review the material before diving in.

Simple scalar subqueries

As we’ve seen in the lecture, SQL is powerful enough to take any query that returns a single

tuple containing a single value and convert it into a simple value that can be used in the select or in

the were clauses. Take a look at the following examples and run them yourself for more details.

Example 2: The following query will compute the percentage of computer science students in our university

select (select count (*) from student where dept_name = 'Comp. Sci.') * 1.0 / (select count (*) from student);

Example 3: The following query will display the instructors who earn less than the average salary

select name, salary from instructor where salary < (select avg(salary) from instructor)

Correlated scalar subqueries

Along with the powerful SQL scalar value extraction capability we can use the other strong

capability of writing a correlated query to get even more meaningful results

Example 4: Display the names of instructors who advise exactly 4 students from the same department

select name from instructor i where 4 = ( select count (distinct s.id) from advisor a join student s on (s.id = a.s_id) where s.dept_name = i.dept_name and i.id = a.i_id);

Now it’s your turn:

Lab Work 2.1

Find the names of all students who have taken at least 2 courses taught by Mr. ‘Einstein’

Page 7: Lab 4 More SQL

Lab 4| More SQL

Page 7 of 12

Viewing subquery results as sets

SQL allows us to treat the results of a subquery as a ‘set of values’ that we can apply set

membership testing, and likewise operations on. For that SQL provides us with the clauses we already know for the lecture:

in, and not in [op] all, and [op] some exists, and not exists

all of which take a value as a first operand, a set as a second operand and produce [true or false]

based on the test each of them does. This makes them ideal to use as a part of the logical predicate in

the where clause. Following are some examples you should run yourself and understand.

Example 5: Find all instructors who did not teach in the year 2009 nor in 2010.

select * from instructor where id not in (select id from teaches t where year in (2009, 2010));

Example 6: Find all instructors who earn more than all instructors in the ‘Finance’ department

select * from instructor where salary > all (select salary from instructor where dept_name = 'Finance');

Example 7: Find the department(s) that have the highest average salary

select dept_name from instructor group by dept_name having avg (salary) >= all (select avg (salary) from instructor group by dept_name);

The next example shows a clever way to mimic the “division” operator from relational algebra in SQL

Example 8: Find all students who have taken all courses offered in the ‘Finance’ department

select distinct S.ID, S.name from student as S where not exists ( (select course_id from course where dept_name = 'Finance') except (select T.course_id from takes as T where S.ID = T.ID) );

Page 8: Lab 4 More SQL

Lab 4| More SQL

Page 8 of 12

Now it’s your turn:

Lab Work 2.2

1. Find students who have more tot_cred than all ‘Finance’ students.

2. Find students who have advisors who have not taught them any course. 3. Find the names of all students who have taken all the courses thought by Mr. ‘Einstein’.

Common table expressions and the with clause

Subqueries can be used in several places within a query. Since the result of a query is always

a table, it can really be used anywhere you might use a table; especially in the from clause. Of course

you can write a subquery and has its output as a table, and even join this table with other tables you

name in the from clause. But! One problem with subqueries is that they can make your queries

lengthy and difficult to read.

Common Table Expressions can help break your query into separate components so that your

query logic is more easily readable. When you use a CTE, you can read the subqueries logic on its

own and then read the final queries logic easily as well, allowing you to handle complex tasks with relative ease.

As we have seen in the lecture, we can create a CTE using the with clause. In the following

example we're only defining one common table expression but we could theoretically write as many

as we want. We can even use a previously defined CTE to define another CTE in the same query. We

only need to define any common table expressions at the beginning of the query before using them

in other parts of the query at the bottom.

Example 9: Find all instructors who earn the maximum salary in the university

with max_salary(value) as (select max (salary) from instructor) select name, salary, value from instructor, max_salary where instructor.salary = max_salary.value;

Example 10: Find all departments where the total salary of instructors in the department is greater than the

average of the total salary at all departments

with dept_total (dept_name, value) as (select dept_name, sum(salary) from instructor group by dept_name),

dept_total_avg(value) as (select avg(value) from dept_total) select dept_name, dept_total.value, dept_total_avg.value from dept_total, dept_total_avg where dept_total.value > dept_total_avg.value;

Page 9: Lab 4 More SQL

Lab 4| More SQL

Page 9 of 12

Now it’s your turn:

Lab Work 2.3

1. Find the department that has students with the highest “Avg total credits”.

2. Find the sections that had the maximum enrollment in Fall 2009.

3. More advanced DML

In the previous lab we learned how to modify data in our database. We explored inserting

into, updating, and deleting from tables using simple SQL commands. In this lap we will not introduce

any new DML commands at all. We will use our new powerful skills -namely writing subqueries- in DML making it more powerful.

Follow the examples and apply them, each time you execute one take some time to view the

changes to the database by writing the necessary queries. And remember that you can always

restore your database to its original status by rerunning the scripts from lab 1.

Example 11: Delete all tuples in the instructor relation for those instructors who did not teach any course.

delete from instructor where id not in (select id from teaches);

Example 12: Delete all instructors whose salary is more than twice the average salary of instructors

delete from instructor where salary > 2 * (select avg (salary) from instructor);

Example 13: Make each student in the Music department who has earned more than 144 credit hours an instructor

in the Music department with a salary of $18,000.

insert into instructor select ID, name, dept_name, 18000 from student where dept_name = 'Music' and tot_cred > 144;

Example 14: Give a 5% salary raise to instructors whose salary is less than average

update instructor set salary = salary * 1.05 where salary < ( select avg (salary) from instructor);

Lab Work 3.1

Some of you may have noticed that the tot_creds value for students did not match the credits from

courses they have taken. Write and execute query to update tot_creds based on the credits passed, to bring the database back to consistency. (This query is provided in the slides and the book.)

Page 10: Lab 4 More SQL

Lab 4| More SQL

Page 10 of 12

The following example uses the case statement in a conditional update based on the old value in the tuple

Example 15: Increase salaries of instructors whose salary is over $100,000 by 3%, and all others by a 5%

update instructor set salary = case when salary <= 100000 then salary * 1.05 else salary * 1.03 end

You can also use the case clause in one of the columns in the select clause as shown below

Example 16: display all students with along their level

select name, tot_cred , case when tot_cred >= 94 then 'Fourth level' when tot_cred >= 63 then 'Third level' when tot_cred >= 31 then 'Second level' else 'First level' end as level from student;

Lab Work 3.2

Write the following inserts, deletes, or updates in SQL, using the university schema.

1. Increase the salary of each instructor in the Comp. Sci. department by 10%.

2. Delete all courses that have never been offered (i.e., do not occur in the section relation).

3. Insert every student whose tot cred attribute is greater than 100 as an instructor in the same

department, with a salary of $30,000.

4. More lab work

If you remember, in lab work 4 of our pervious lab (Lab3) we created a new relation

grade_points(grade, points) that provided a conversion from letter grades in the takes relation to

numeric scores. We used it then to calculate the GPA for our students.

With this relation in hand, and with your good knowledge of SQL queries, subqueries, and DML. Write the SQL necessary to do the following tasks:

1. Oh! We found out that the mapping of letters to points in our lab3 answers here was wrong.

We should add 5 to all points so that A+ maps to 100, A to 95, and so on. I guess you should do

that for us using SQL. By the way; F should map to 40, no need to change that one.

2. Create a new relation graduate(ID, name, GPA, general_grade) and insert all students who

finished 100 credits or more into this relation, calculating their correct GPA and assigning them the correct general_grade based on this GPA. A general_grade can be ( ‘Excellent’, ‘Very

Good’, ‘Good’, ‘Acceptable’ and ‘Not Acceptable’ ) assigned in the usual way.

3. Find the percentage of graduates with each grade.

4. Find the top graduate(s) of each department.

Page 11: Lab 4 More SQL

Lab 4| More SQL

Page 11 of 12

5. Homework

1. Display a list of all instructors (names and IDs), for each instructor show the number

of physics courses he taught. Make sure to show the number of course as 0 for

instructors who have not taught any physics courses.

a. Do it without a subquery.

b. Do it using a subquery.

2. Find the enrollment across all sections, but now also include sections with no

students taking them; the enrollment for such sections should be treated as 0. Do

this in two different ways (and create required data for testing).

3. Find all courses whose identifier starts with the string "CS-1" then:

a. find instructors who have taught all the above courses

i. Using the "not exists ... except ..." structure

ii. Try “matching of counts” i.e. make sure the count of two sets you are

comparing is the same. (don't forget the distinct clause!).

4. Insert each instructor as a student, with tot_creds = 0, in the same department.

Now delete all the newly added "students" above (important note: already existing

students who happened to have tot_creds = 0 should not get deleted).

5. Update the salary of each instructor to 10000 times the number of course sections

they have taught.

6. Find all rooms that have been assigned to more than one section at the same time.

Display the rooms along with the assigned sections; I suggest you use a with clause

to simplify this query.

7. Create your own query: define what you want to do in English, then write the query

in SQL. Make it as difficult as you wish, the harder the better.

Page 12: Lab 4 More SQL

Lab 4| More SQL

Page 12 of 12

Appendix

[Answer 1]

This happens because a condition placed in the on clause is processed before the join, while a condition

placed in the where clause is processed after the join. That does not matter with inner joins, but it matters

a lot with outer joins. In this particular example when a where condition is used; this means that all tuples

resulting from the left join will be filtered AFTER the relation is computed. The original relation

(instructor left join teaches on instructor.id = teaches.id) will have 23 tuples but

only 3 tuples will bass the where condition. On the other hand, when the condition is specified in the on

clause; the relation teaches can be viewed as the relation (select * from teaches where

course_id = 'PHY-101') which in our case has 3 tuples. This small relation will be left joined with

the instructor relation producing the result you see.