aggregation operations on relations
TRANSCRIPT
8/13/2019 Aggregation Operations on Relations
http://slidepdf.com/reader/full/aggregation-operations-on-relations 1/36
Aggregation Operations on Relations
P. M. Jat @ DAIICT
8/13/2019 Aggregation Operations on Relations
http://slidepdf.com/reader/full/aggregation-operations-on-relations 2/36
8/26/2011 Database Systems 2
First, Issues with Null Values
8/13/2019 Aggregation Operations on Relations
http://slidepdf.com/reader/full/aggregation-operations-on-relations 3/36
8/26/2011 Database Systems 3
Issues with Null Values
An attribute having NULL could mean either of
following-
Value is unknown or not available right now
Value is not application for the tuple: a employee
not having supervisor will have null in this attribute
Arithmetic expressions (+,-,*,/) involving null values
result null value for result
8/13/2019 Aggregation Operations on Relations
http://slidepdf.com/reader/full/aggregation-operations-on-relations 4/36
8/26/2011 Database Systems 4
Null Values and Comparisons
There are more issues with Null value when it appears
in a attribute used in WHERE clause of SQL-SELECT
What should ( r.a < 10 ) where r.a happens to be
null
When we compare a NULL value with another valueincluding NULL, result is UNKNOWN.
In relational operations UNKNOWN is another truth-value like TRUE and FALSE
8/13/2019 Aggregation Operations on Relations
http://slidepdf.com/reader/full/aggregation-operations-on-relations 5/36
8/26/2011 Database Systems 5
Truth values for UNKOWN NOT
NOT UNKOWN -> UNKWON
AND
TRUE AND UNKOWN -> UNKOWN
FALSE AND UNKOWN -> FALSE
UNKWON AND UNKOWN -> UNKOWN
OR
TRUE OR UNKOWN -> TRUE
FALSE OR UNKOWN -> UNKOWN
UNKWON OR UNKOWN -> UNKOWN
8/13/2019 Aggregation Operations on Relations
http://slidepdf.com/reader/full/aggregation-operations-on-relations 6/36
8/26/2011 Database Systems 6
Null Values and Comparisons
While evaluating WHERE clause tuples with UNKOWN or
FALSE truth values are not included in result
Find out how presence of null in any of attribute will affect
result of following predicates
SELECT * FROM EMPLOYEE WHERE
bdate < DATE '2001-01-01' AND salary > 30000
SELECT * FROM EMPLOYEE WHERE bdate < DATE ‘2001-01-01’ OR salary > 30000
8/13/2019 Aggregation Operations on Relations
http://slidepdf.com/reader/full/aggregation-operations-on-relations 7/36
8/26/2011 Database Systems 7
Null Values and Comparisons – IS NULL
Following will not give desired result. Why? -SELECT * FROM employee
WHERE superssn = NULL;
This is so because Null = Null is also UNKOWN,
therefore SQL provides IS NULL (and IS NOT NULL)
to check if a tuple has null value in an attribute
We write as following for such situations – SELECT * FROM employee
WHERE superssn IS NULL ;
8/13/2019 Aggregation Operations on Relations
http://slidepdf.com/reader/full/aggregation-operations-on-relations 8/36
8/26/2011 Database Systems 8
Now, Aggregation Operations
8/13/2019 Aggregation Operations on Relations
http://slidepdf.com/reader/full/aggregation-operations-on-relations 9/36
8/26/2011 Database Systems 9
Consider getting answer of following queries
from employee relation of company schema
Find out total salary we pay
Find out total number of employees, total number ofsupervisors
Find out employee who are drawing maximum salary
Find out employee who are drawing above averagesalary,
Find out average salary paid to managers,
And so forth
These are basically aggregation operations
8/13/2019 Aggregation Operations on Relations
http://slidepdf.com/reader/full/aggregation-operations-on-relations 10/36
8/26/2011 Database Systems 10
Aggregation operations
Are basically counting or summing of an attribute-
values over all tuples or grouped tupples of a relations
Following are common aggregation operations:
SUM, AVG, MAX, MIN, COUNT are common
aggregate operations over specified column of a table
8/13/2019 Aggregation Operations on Relations
http://slidepdf.com/reader/full/aggregation-operations-on-relations 11/36
8/13/2019 Aggregation Operations on Relations
http://slidepdf.com/reader/full/aggregation-operations-on-relations 12/36
8/26/2011 Database Systems 12
Aggregation operations - COUNT
SELECT count(*) FROM employee
returns count of tuples
SELECT count(superssn) FROM employee
returns count of tuples having superssn value (excludestuples having null value in superssn)
SELECT count(DISTINCT superssn)
FROM employee;
returns count of distinct values of superssn (excludes tuples
having null value)
8/13/2019 Aggregation Operations on Relations
http://slidepdf.com/reader/full/aggregation-operations-on-relations 13/36
8/26/2011 Database Systems 13
Aggregation operations: sum, min, max, avg
Example Query:
SELECT sum(salary), min(salary),
max(salary), avg(salary)
FROM employee;
The query returns sum of salaries (of all salary values),
min salary, average of salaries computed over all tuplesof employee relation
8/13/2019 Aggregation Operations on Relations
http://slidepdf.com/reader/full/aggregation-operations-on-relations 14/36
8/26/2011 Database Systems 14
NULL’s are ignored in Aggregation
NULL never contributes to a sum, average, or count,
and can never be the minimum or maximum of a
column.
But if there are no non-NULL values in a column, thenthe result of the aggregation is NULL.
Exception: COUNT of an empty set is 0.
8/13/2019 Aggregation Operations on Relations
http://slidepdf.com/reader/full/aggregation-operations-on-relations 15/36
8/26/2011 Database Systems 15
Example: Effect of NULL’s
Observe the difference in result of following two
queries -
SELECT COUNT( * )
FROM employee;
SELECT COUNT( superssn )
FROM employee;
8/13/2019 Aggregation Operations on Relations
http://slidepdf.com/reader/full/aggregation-operations-on-relations 16/36
8/26/2011 Database Systems 16
Eliminating Duplicates in an Aggregation
Observe the difference in result of following two
queries -
SELECT COUNT( superssn )
FROM employee;
SELECT COUNT( DISTINCT superssn )
FROM employee;
8/13/2019 Aggregation Operations on Relations
http://slidepdf.com/reader/full/aggregation-operations-on-relations 17/36
8/26/2011 Database Systems 17
Aggregation over grouped tuples
Normally aggregate operations are not applied to full
column, values in column are grouped based on someattribute(s)
Following query requires grouping of tuples-
Find out department-wise count of employees
Find out total salary we pay to each department
8/13/2019 Aggregation Operations on Relations
http://slidepdf.com/reader/full/aggregation-operations-on-relations 18/36
8/26/2011 Database Systems 18
Aggregation operations- Grouping For example (Q71): find out average of all
departments. Will be written as following in SQL (and
shows results as given against)-
SELECT dno, avg(salary) AS dept_avg
FROM employeeGROUP BY dno;
8/13/2019 Aggregation Operations on Relations
http://slidepdf.com/reader/full/aggregation-operations-on-relations 19/36
8/26/2011 Database Systems 19
GROUP BY groups the tuples for every distinct value
in grouping attribute(s).
For example GROUP BY dno, will group all tuples of
employee for each dno, and aggregation applied overthat group, and finally you have aggregated values for
each dno.
8/13/2019 Aggregation Operations on Relations
http://slidepdf.com/reader/full/aggregation-operations-on-relations 20/36
8/26/2011 Database Systems 20
Example Q72
Modify solution of Query 71 such that you get
Department Name also, i.e.
Q72:List DNo, DName, and Avg(Salary) of all departments
8/13/2019 Aggregation Operations on Relations
http://slidepdf.com/reader/full/aggregation-operations-on-relations 21/36
8/26/2011 Database Systems 21
Example Q72
List DNo, DName, and Avg(Salary) of alldepartments-
SELECT dno, dname, dept_avg FROM (SELECT
dno, avg(salary) AS dept_avg FROM
employee GROUP BY dno) AS avg_sal NATURAL JOIN department;
8/13/2019 Aggregation Operations on Relations
http://slidepdf.com/reader/full/aggregation-operations-on-relations 22/36
8/26/2011 Database Systems 22
Example Q72 List DNo, DName, and Avg(Salary) of all
departments–
Following solution will also give correct result but isless intuitive solution
SELECT dno, dname, avg(salary) AS
dept_avg FROM employee NATURAL JOIN
department GROUP BY dno, dname;
8/13/2019 Aggregation Operations on Relations
http://slidepdf.com/reader/full/aggregation-operations-on-relations 23/36
8/26/2011 Database Systems 23
Aggregate operations in RA
Symbolically can be represented using script F (F)
operator. General form could be-
<grouping-attributes> F<funtion-list>(R)
Typically functions are: count, sum, max, min, avg
8/13/2019 Aggregation Operations on Relations
http://slidepdf.com/reader/full/aggregation-operations-on-relations 24/36
8/26/2011 Database Systems 24
Aggregate operations in RA
For example, a aggregate query-
DNO FDNO, COUNT(SSN) -> No_of_Emps,
AVG(SALARY) -> AVG_SAL (EMPLOYEE)
In SQL, you write, asSELECT dno, count(ssn) AS No_of_Emps,
avg(salary) AS avg_sal
FROM employee
GROUP BY dno;
8/13/2019 Aggregation Operations on Relations
http://slidepdf.com/reader/full/aggregation-operations-on-relations 25/36
8/26/2011 Database Systems 25
Aggregation operations- GROUP BY
Normally aggregate functions appear in SELECTclause of query, however attributes that appear inGROUP BY can also (and only ?) appear here.
Obviously following (in red) are wrongly placed
attributes in statements below -
SELECT dno, ssn, avg(salary) ASavg_sal FROM employee GROUP BY dno;
SELECT dno, avg(salary) AS avg_salFROM employee;
8/13/2019 Aggregation Operations on Relations
http://slidepdf.com/reader/full/aggregation-operations-on-relations 26/36
8/26/2011 Database Systems 26
HAVING clause
HAVING is used to specify restrict over result of
aggregation
For example:SELECT dno, avg(salary) AS avg_salFROM employee GROUP BY dnoHAVING avg(salary) > 50000;
Note that you can use avg_sal instead of avg(salary) inHAVING clause, because renaming is done at the timeof projection
8/13/2019 Aggregation Operations on Relations
http://slidepdf.com/reader/full/aggregation-operations-on-relations 27/36
8/26/2011 Database Systems 27
Aggregate operations in SQL
Semantics of SELECT-GROUP BY-HAVING in sql
select statement
SELECT <attrib and/or function-list>
FROM <table-list>[WHERE <condition>]
[GROUP BY <grouping attribute(s)>]
[HAVING <group-filter-condition>][ORDER BY <attrib-list>];
8/13/2019 Aggregation Operations on Relations
http://slidepdf.com/reader/full/aggregation-operations-on-relations 28/36
8/26/2011 Database Systems 28
Semantics of SELECT-GROUP BY-HAVING
Result of FROM and WHERE is given to GROUP BYoperation
GROUP BY operation computes aggregated values for eachgroup value, and gives you one tuple for each group value.
group-filter-conditions in HAVING are used toapply restriction over result of GROUP BY operation
Finally project is applied as defined in SELECT clause ofSELECT statement.
8/13/2019 Aggregation Operations on Relations
http://slidepdf.com/reader/full/aggregation-operations-on-relations 29/36
8/26/2011 Database Systems 29
Example:
List employee-ssn along with count of employees they
are supervising
List employee-ssn along with count of employees they
are supervising that are supervising more than 2employees
List employee-ssn, employee-fname along with countof employees they are supervising that are supervising
more than 2 employees
8/13/2019 Aggregation Operations on Relations
http://slidepdf.com/reader/full/aggregation-operations-on-relations 30/36
8/26/2011 Database Systems 30
List employee-ssn along with count of employees they
are supervising
SELECT superssn, COUNT(superssn) FROMemployee GROUP BY superssn;
Following are also OK-
SELECT superssn, COUNT(ssn) FROM
employee GROUP BY superssn; SELECT superssn, COUNT(*) FROM employee
GROUP BY superssn;
8/13/2019 Aggregation Operations on Relations
http://slidepdf.com/reader/full/aggregation-operations-on-relations 31/36
8/26/2011 Database Systems 31
Example:
List employee-ssn along with count of employees they
are supervising that are supervising more than 2
employees
8/13/2019 Aggregation Operations on Relations
http://slidepdf.com/reader/full/aggregation-operations-on-relations 32/36
8/26/2011 Database Systems 32
List employee-ssn along with count of employees they
are supervising that are supervising more than 2employees
SELECT superssn, COUNT(superssn)FROM employee
GROUP BY superssn
HAVING count ( superssn ) > 2;
8/13/2019 Aggregation Operations on Relations
http://slidepdf.com/reader/full/aggregation-operations-on-relations 33/36
8/26/2011 Database Systems 33
Example:
List employee-ssn, employee-fname, no of emps
supervised for emps that are supervising more than 2employees
8/13/2019 Aggregation Operations on Relations
http://slidepdf.com/reader/full/aggregation-operations-on-relations 34/36
8/26/2011 Database Systems 34
List employee-ssn, employee-fname, no of emps
supervised for emps that are supervising more than 2
employees
SELECT emp.ssn, emp.fname, ssns.NoOfEmpsSupervised
FROM employee AS emp JOIN
(SELECT superssn,
COUNT(superssn) AS NoOfEmpsSupervised
FROM employee
GROUP BY superssn
HAVING count ( superssn ) > 2) AS ssns
ON emp.ssn = ssns.superssn;
8/13/2019 Aggregation Operations on Relations
http://slidepdf.com/reader/full/aggregation-operations-on-relations 35/36
8/26/2011 Database Systems 35
More queries
Find out average salary manager
List second level supervisors (may not precisely
aggregation query)
8/13/2019 Aggregation Operations on Relations
http://slidepdf.com/reader/full/aggregation-operations-on-relations 36/36
8/26/2011 Database Systems 36
Thanks