lab 4 more sql
TRANSCRIPT
Islamic University of Gaza Faculty of Engineering
Department of Computer Engineering
ECOM 3422: Database Systems [spring 2020] Abeer J. Al-Aydi & Abdallah H. Azzami
Lab 4: More SQL
Objectives
To be familiar with all types of joins in SQL.
To practice writing and understanding queries that employ subqueries.
To use your newly acquired knowledge to write rich and complex DML.
To familiarize yourself with Dates in SQL.
Table of Content
OBJECTIVES ____________________________________________________________________________________________ 1
TABLE OF CONTENT _____________________________________________________________________________________ 1
1. FLAVORS OF JOIN ___________________________________________________________________________________ 2
JOINED TABLES _________________________________________________________________________________________ 2 JOIN TYPES ____________________________________________________________________________________________ 2 QUALIFIED JOINS ________________________________________________________________________________________ 3 COALESCE FUNCTION _____________________________________________________________________________________ 5
2. SUBQUERIES _______________________________________________________________________________________ 6
SIMPLE SCALAR SUBQUERIES ________________________________________________________________________________ 6 CORRELATED SCALAR SUBQUERIES ____________________________________________________________________________ 6 VIEWING SUBQUERY RESULTS AS SETS __________________________________________________________________________ 7 COMMON TABLE EXPRESSIONS AND THE WITH CLAUSE _______________________________________________________________ 8
3. MORE ADVANCED DML ______________________________________________________________________________ 9
4. MORE LAB WORK _________________________________________________________________________________ 10
5. HOMEWORK _____________________________________________________________________________________ 11
APPENDIX ____________________________________________________________________________________________ 12
Lab 4| More SQL
Page 2 of 12
1. Flavors of Join
Last lab we explored the famous select-from-where query. We used different types functions
in that query and we wrote several conditions. In this section we will take a deeper dive into the from
part of the query.
The from clause derives a table from one or more tables given in a comma-separated list. After
the word from and before the clause where those tables can be
a simple table reference such as “instructor”, a derived table such in the form of a subquery as we will see in the following section, or a join construct,
or even a combination of these 3. If more than one table reference is listed in the from clause, the
cartesian product of the listed tables is formed. The result of the from is an intermediate virtual table
that can then be subject to transformations by the where, group by, and other clauses and is finally
the result of the overall table expression.
Joined Tables
A joined table is a table derived from two other (real or derived) tables according to the rules
of the particular join type. Inner, outer, and cross-joins are the different flavors we can choose form.
The general syntax of a joined table is
Table1 {join type} Table2 [join condition]
Joins of all types can be chained together, or nested: either or both Table1 and Table2 can be
themselves joined tables. Parentheses can be used around join clauses to control the join order. In
the absence of parentheses, join clauses nest left-to-right.
Join Types
First of all, please note that this lab assumes you know the theoretical knowledge about joins
covered in chapter 4 of our textbook. So, make sure you got this covered as this lab will focus on
syntax and provide some additional insights, and exercises.
Type 1: Cross join written as
Table1 cross join Table2
This simply computes the cross (Cartesian) product Table1 × Table2
Example 1: select * from instructor cross join teaches
will compute the cross product instructor × teaches
Note that: from instructor cross join teaches is equivalent to from instructor [inner]
join teaches on true. And is also equivalent to from instructor, teaches.
Lab 4| More SQL
Page 3 of 12
Qualified joins
A join that requires the tuple to satisfy a condition in order to qualify to be a part of the resulting
relation is called a ‘qualified join’. In this sense, all joins other that the cross join are qualified joins.
We can write a qualified join as follows:
Table1 {[inner] | {left | right | full} [outer]} join Table2 on (Boolean expression)
Or
Table1 {[inner] | {left | right | full} [outer]} join Table2 using (join column list)
Or
Table1 natural {[inner] | {left | right | full} [outer]} join Table2
Note that:
The words inner and outer are optional in all forms. inner is the default; left, right, and full
automatically imply an outer join.
The join condition is specified in the on or using clause, or implicitly by the word natural. The join
condition determines which rows from the two source tables are considered to “match”, as we
explained in details in the lecture.
inner join For each row r1 of Table1, the joined table has a row for each row in Table2 that satisfies the join condition with r1.
left outer join
First, an inner join is performed. Then, for each row in Table1 that does not satisfy the join condition with any row in Table2, a joined row is added with null values in columns of Table2. Thus, the joined table always has at least one row for each row in Table1.
right outer join
First, an inner join is performed. Then, for each row in Table2 that does not satisfy the join condition with any row in Table1, a joined row is added with null values in columns of Table1. Thus, the result table will always have a row for each row in Table2.
full outer join First, an inner join is performed. Then, for each row in Table1 that does not satisfy the join condition with any row in Table2, a joined row is added with null values in columns of Table2. Also, for each row of Table2 that does not satisfy the join condition with any row in Table1, a joined row with null values in the columns of Table1 is added.
The on clause is the most general kind of join condition: it takes a boolean value expression of the
same kind as is used in a where clause. A pair of rows from Table1 and Table2 match if the on
expression evaluates to true.
The using clause is a shorthand that allows you to take advantage of the specific situation where both
sides of the join use the same name for the joining column(s). It takes a comma-separated list of the
shared column names and forms a join condition that includes an equality comparison for each one.
For example, joining Table1 and Table2 with using (a, b) produces the join condition on Table1.a =
Table2.a and Table1.b = Table2.b.
Lab 4| More SQL
Page 4 of 12
Furthermore, the output of join using suppresses redundant columns: there is no need to
print both of the matched columns, since they must have equal values. while join on produces all
columns from Table1 followed by all columns from Table2, join using produces one output column
for each of the listed column pairs (in the listed order), followed by any remaining columns from
Table1, followed by any remaining columns from Table2.
Finally, natural is a shorthand form of using: it forms a using list consisting of all column
names that appear in both input tables. As with using, these columns appear only once in the output
table. If there are no common column names, natural join behaves like join ... on true, producing a cross-product join.
Exercise 1: Execute those statements and observe the deference:
[1] select * from instructor inner join teaches on instructor.id = teaches.id;
[2] select * from instructor inner join teaches using (id);
[3] select * from instructor natural inner join teaches;
Exercise 2: Execute those statements and observe the deference; look for an instructor named ‘Gold’.
[1] select * from instructor left join teaches on instructor.id = teaches.id;
[2] select * from instructor left join teaches using (id);
[3] select * from instructor natural left join teaches;
NOTE: The join condition specified with ON can also contain conditions that do not relate directly to the join. This can prove useful sometimes but you need to be careful.
For example:
select * from instructor left join teaches on instructor.id = teaches.id and teaches.course_id = 'PHY-101'; is very different form:
select * from instructor left join teaches on instructor.id = teaches.id where teaches.course_id = 'PHY-101';
Question 1
Execute both queries to see for yourself and discuss with your colleague why it is so. When you think you have an answer look at [Answer 1] in the appendix to see if you did figure it out correctly.
Lab 4| More SQL
Page 5 of 12
Hope I got you exited for the lab
Now let’s test your knowledge with this relatively simple task:
Lab Work 1.1
Display a list of all instructors (names and IDs), showing each instructor and the number of sections
he taught. Make sure to show the number of sections as 0 for instructors who have not taught any section. Your query should use an outer join, and should not use subqueries.
Lab Work 1.2
Write the same query as above, but using a scalar subquery, not using outer join.
Coalesce function
coalesce( value1 [,value2, value3, ...] )
The coalesce function returns the first of its arguments that is not null. Null is returned only
if all arguments are null. It is often used in conjunction with outer joins to substitute a default value for null values when data is displayed, for example you can write:
Select name, coalesce( tot_cred , 0 ) from student
This will chick if tot_credit is not null; it will display it. Otherwise it will display 0 instead. You can use this function to archive the requirements of the next exercise:
Lab Work 1.3
Display the list of all course sections offered in Spring 2010, along with the ID and name of each
instructor teaching the section. If a section has more than one instructor, that section should appear
as many times in the result as it has instructors. If a section does not have any instructor, it should
still appear in the result with the instructor name set to “—”.
Lab 4| More SQL
Page 6 of 12
2. Subqueries
At this point in our lab, we can say that you've learned all the basics of SQL. You understand
how queries work, and how to access information from single or multiple tables. You can even do
some aggregations and produce some meaningful results. There are a few things left to learn that will
make you able to solve any problem you're confronted with. Among those few; the use of subqueries
is the most important.
I will not write a comprehensive theoretical explanation in this section as I think the best way
to learn subqueries is to learn by doing. Also, the textbook and slides can serve as a very good
explanations so I encourage you to review the material before diving in.
Simple scalar subqueries
As we’ve seen in the lecture, SQL is powerful enough to take any query that returns a single
tuple containing a single value and convert it into a simple value that can be used in the select or in
the were clauses. Take a look at the following examples and run them yourself for more details.
Example 2: The following query will compute the percentage of computer science students in our university
select (select count (*) from student where dept_name = 'Comp. Sci.') * 1.0 / (select count (*) from student);
Example 3: The following query will display the instructors who earn less than the average salary
select name, salary from instructor where salary < (select avg(salary) from instructor)
Correlated scalar subqueries
Along with the powerful SQL scalar value extraction capability we can use the other strong
capability of writing a correlated query to get even more meaningful results
Example 4: Display the names of instructors who advise exactly 4 students from the same department
select name from instructor i where 4 = ( select count (distinct s.id) from advisor a join student s on (s.id = a.s_id) where s.dept_name = i.dept_name and i.id = a.i_id);
Now it’s your turn:
Lab Work 2.1
Find the names of all students who have taken at least 2 courses taught by Mr. ‘Einstein’
Lab 4| More SQL
Page 7 of 12
Viewing subquery results as sets
SQL allows us to treat the results of a subquery as a ‘set of values’ that we can apply set
membership testing, and likewise operations on. For that SQL provides us with the clauses we already know for the lecture:
in, and not in [op] all, and [op] some exists, and not exists
all of which take a value as a first operand, a set as a second operand and produce [true or false]
based on the test each of them does. This makes them ideal to use as a part of the logical predicate in
the where clause. Following are some examples you should run yourself and understand.
Example 5: Find all instructors who did not teach in the year 2009 nor in 2010.
select * from instructor where id not in (select id from teaches t where year in (2009, 2010));
Example 6: Find all instructors who earn more than all instructors in the ‘Finance’ department
select * from instructor where salary > all (select salary from instructor where dept_name = 'Finance');
Example 7: Find the department(s) that have the highest average salary
select dept_name from instructor group by dept_name having avg (salary) >= all (select avg (salary) from instructor group by dept_name);
The next example shows a clever way to mimic the “division” operator from relational algebra in SQL
Example 8: Find all students who have taken all courses offered in the ‘Finance’ department
select distinct S.ID, S.name from student as S where not exists ( (select course_id from course where dept_name = 'Finance') except (select T.course_id from takes as T where S.ID = T.ID) );
Lab 4| More SQL
Page 8 of 12
Now it’s your turn:
Lab Work 2.2
1. Find students who have more tot_cred than all ‘Finance’ students.
2. Find students who have advisors who have not taught them any course. 3. Find the names of all students who have taken all the courses thought by Mr. ‘Einstein’.
Common table expressions and the with clause
Subqueries can be used in several places within a query. Since the result of a query is always
a table, it can really be used anywhere you might use a table; especially in the from clause. Of course
you can write a subquery and has its output as a table, and even join this table with other tables you
name in the from clause. But! One problem with subqueries is that they can make your queries
lengthy and difficult to read.
Common Table Expressions can help break your query into separate components so that your
query logic is more easily readable. When you use a CTE, you can read the subqueries logic on its
own and then read the final queries logic easily as well, allowing you to handle complex tasks with relative ease.
As we have seen in the lecture, we can create a CTE using the with clause. In the following
example we're only defining one common table expression but we could theoretically write as many
as we want. We can even use a previously defined CTE to define another CTE in the same query. We
only need to define any common table expressions at the beginning of the query before using them
in other parts of the query at the bottom.
Example 9: Find all instructors who earn the maximum salary in the university
with max_salary(value) as (select max (salary) from instructor) select name, salary, value from instructor, max_salary where instructor.salary = max_salary.value;
Example 10: Find all departments where the total salary of instructors in the department is greater than the
average of the total salary at all departments
with dept_total (dept_name, value) as (select dept_name, sum(salary) from instructor group by dept_name),
dept_total_avg(value) as (select avg(value) from dept_total) select dept_name, dept_total.value, dept_total_avg.value from dept_total, dept_total_avg where dept_total.value > dept_total_avg.value;
Lab 4| More SQL
Page 9 of 12
Now it’s your turn:
Lab Work 2.3
1. Find the department that has students with the highest “Avg total credits”.
2. Find the sections that had the maximum enrollment in Fall 2009.
3. More advanced DML
In the previous lab we learned how to modify data in our database. We explored inserting
into, updating, and deleting from tables using simple SQL commands. In this lap we will not introduce
any new DML commands at all. We will use our new powerful skills -namely writing subqueries- in DML making it more powerful.
Follow the examples and apply them, each time you execute one take some time to view the
changes to the database by writing the necessary queries. And remember that you can always
restore your database to its original status by rerunning the scripts from lab 1.
Example 11: Delete all tuples in the instructor relation for those instructors who did not teach any course.
delete from instructor where id not in (select id from teaches);
Example 12: Delete all instructors whose salary is more than twice the average salary of instructors
delete from instructor where salary > 2 * (select avg (salary) from instructor);
Example 13: Make each student in the Music department who has earned more than 144 credit hours an instructor
in the Music department with a salary of $18,000.
insert into instructor select ID, name, dept_name, 18000 from student where dept_name = 'Music' and tot_cred > 144;
Example 14: Give a 5% salary raise to instructors whose salary is less than average
update instructor set salary = salary * 1.05 where salary < ( select avg (salary) from instructor);
Lab Work 3.1
Some of you may have noticed that the tot_creds value for students did not match the credits from
courses they have taken. Write and execute query to update tot_creds based on the credits passed, to bring the database back to consistency. (This query is provided in the slides and the book.)
Lab 4| More SQL
Page 10 of 12
The following example uses the case statement in a conditional update based on the old value in the tuple
Example 15: Increase salaries of instructors whose salary is over $100,000 by 3%, and all others by a 5%
update instructor set salary = case when salary <= 100000 then salary * 1.05 else salary * 1.03 end
You can also use the case clause in one of the columns in the select clause as shown below
Example 16: display all students with along their level
select name, tot_cred , case when tot_cred >= 94 then 'Fourth level' when tot_cred >= 63 then 'Third level' when tot_cred >= 31 then 'Second level' else 'First level' end as level from student;
Lab Work 3.2
Write the following inserts, deletes, or updates in SQL, using the university schema.
1. Increase the salary of each instructor in the Comp. Sci. department by 10%.
2. Delete all courses that have never been offered (i.e., do not occur in the section relation).
3. Insert every student whose tot cred attribute is greater than 100 as an instructor in the same
department, with a salary of $30,000.
4. More lab work
If you remember, in lab work 4 of our pervious lab (Lab3) we created a new relation
grade_points(grade, points) that provided a conversion from letter grades in the takes relation to
numeric scores. We used it then to calculate the GPA for our students.
With this relation in hand, and with your good knowledge of SQL queries, subqueries, and DML. Write the SQL necessary to do the following tasks:
1. Oh! We found out that the mapping of letters to points in our lab3 answers here was wrong.
We should add 5 to all points so that A+ maps to 100, A to 95, and so on. I guess you should do
that for us using SQL. By the way; F should map to 40, no need to change that one.
2. Create a new relation graduate(ID, name, GPA, general_grade) and insert all students who
finished 100 credits or more into this relation, calculating their correct GPA and assigning them the correct general_grade based on this GPA. A general_grade can be ( ‘Excellent’, ‘Very
Good’, ‘Good’, ‘Acceptable’ and ‘Not Acceptable’ ) assigned in the usual way.
3. Find the percentage of graduates with each grade.
4. Find the top graduate(s) of each department.
Lab 4| More SQL
Page 11 of 12
5. Homework
1. Display a list of all instructors (names and IDs), for each instructor show the number
of physics courses he taught. Make sure to show the number of course as 0 for
instructors who have not taught any physics courses.
a. Do it without a subquery.
b. Do it using a subquery.
2. Find the enrollment across all sections, but now also include sections with no
students taking them; the enrollment for such sections should be treated as 0. Do
this in two different ways (and create required data for testing).
3. Find all courses whose identifier starts with the string "CS-1" then:
a. find instructors who have taught all the above courses
i. Using the "not exists ... except ..." structure
ii. Try “matching of counts” i.e. make sure the count of two sets you are
comparing is the same. (don't forget the distinct clause!).
4. Insert each instructor as a student, with tot_creds = 0, in the same department.
Now delete all the newly added "students" above (important note: already existing
students who happened to have tot_creds = 0 should not get deleted).
5. Update the salary of each instructor to 10000 times the number of course sections
they have taught.
6. Find all rooms that have been assigned to more than one section at the same time.
Display the rooms along with the assigned sections; I suggest you use a with clause
to simplify this query.
7. Create your own query: define what you want to do in English, then write the query
in SQL. Make it as difficult as you wish, the harder the better.
Lab 4| More SQL
Page 12 of 12
Appendix
[Answer 1]
This happens because a condition placed in the on clause is processed before the join, while a condition
placed in the where clause is processed after the join. That does not matter with inner joins, but it matters
a lot with outer joins. In this particular example when a where condition is used; this means that all tuples
resulting from the left join will be filtered AFTER the relation is computed. The original relation
(instructor left join teaches on instructor.id = teaches.id) will have 23 tuples but
only 3 tuples will bass the where condition. On the other hand, when the condition is specified in the on
clause; the relation teaches can be viewed as the relation (select * from teaches where
course_id = 'PHY-101') which in our case has 3 tuples. This small relation will be left joined with
the instructor relation producing the result you see.