aggregation operations on relations

36
8/13/2019 Aggregation Operations on Relations http://slidepdf.com/reader/full/aggregation-operations-on-relations 1/36 Aggregation Operations on Relations P. M. Jat @ DAIICT

Upload: suyashi-purwar

Post on 04-Jun-2018

222 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Aggregation Operations on Relations

8/13/2019 Aggregation Operations on Relations

http://slidepdf.com/reader/full/aggregation-operations-on-relations 1/36

Aggregation Operations on Relations

P. M. Jat @ DAIICT

Page 2: Aggregation Operations on Relations

8/13/2019 Aggregation Operations on Relations

http://slidepdf.com/reader/full/aggregation-operations-on-relations 2/36

8/26/2011 Database Systems 2

First, Issues with Null Values

Page 3: Aggregation Operations on Relations

8/13/2019 Aggregation Operations on Relations

http://slidepdf.com/reader/full/aggregation-operations-on-relations 3/36

8/26/2011 Database Systems 3

Issues with Null Values

An attribute having NULL could mean either of

following-

Value is unknown or not available right now

Value is not application for the tuple: a employee

not having supervisor will have null in this attribute

Arithmetic expressions (+,-,*,/) involving null values

result null value for result

Page 4: Aggregation Operations on Relations

8/13/2019 Aggregation Operations on Relations

http://slidepdf.com/reader/full/aggregation-operations-on-relations 4/36

8/26/2011 Database Systems 4

 Null Values and Comparisons

There are more issues with Null value when it appears

in a attribute used in WHERE clause of SQL-SELECT

What should ( r.a < 10 ) where r.a happens to be

null

When we compare a NULL value with another valueincluding NULL, result is UNKNOWN.

In relational operations UNKNOWN is another truth-value like TRUE and FALSE

Page 5: Aggregation Operations on Relations

8/13/2019 Aggregation Operations on Relations

http://slidepdf.com/reader/full/aggregation-operations-on-relations 5/36

8/26/2011 Database Systems 5

Truth values for UNKOWN NOT

 NOT UNKOWN -> UNKWON

 AND

TRUE AND UNKOWN -> UNKOWN

FALSE AND UNKOWN -> FALSE

UNKWON AND UNKOWN -> UNKOWN

OR 

TRUE OR UNKOWN -> TRUE

FALSE OR UNKOWN -> UNKOWN

UNKWON OR UNKOWN -> UNKOWN

Page 6: Aggregation Operations on Relations

8/13/2019 Aggregation Operations on Relations

http://slidepdf.com/reader/full/aggregation-operations-on-relations 6/36

8/26/2011 Database Systems 6

 Null Values and Comparisons

While evaluating WHERE clause tuples with UNKOWN or

FALSE truth values are not included in result

Find out how presence of null in any of attribute will affect

result of following predicates

SELECT * FROM EMPLOYEE WHERE

 bdate < DATE '2001-01-01' AND salary > 30000

SELECT * FROM EMPLOYEE WHERE bdate < DATE ‘2001-01-01’ OR salary > 30000

Page 7: Aggregation Operations on Relations

8/13/2019 Aggregation Operations on Relations

http://slidepdf.com/reader/full/aggregation-operations-on-relations 7/36

8/26/2011 Database Systems 7

 Null Values and Comparisons – IS NULL

Following will not give desired result. Why? -SELECT * FROM employee

 WHERE superssn = NULL;

This is so because Null = Null is also UNKOWN,

therefore SQL provides IS NULL (and IS NOT NULL)

to check if a tuple has null value in an attribute

We write as following for such situations – SELECT * FROM employee

 WHERE superssn IS NULL ;

Page 8: Aggregation Operations on Relations

8/13/2019 Aggregation Operations on Relations

http://slidepdf.com/reader/full/aggregation-operations-on-relations 8/36

8/26/2011 Database Systems 8

 Now, Aggregation Operations

Page 9: Aggregation Operations on Relations

8/13/2019 Aggregation Operations on Relations

http://slidepdf.com/reader/full/aggregation-operations-on-relations 9/36

8/26/2011 Database Systems 9

Consider getting answer of following queries

from employee relation of company schema

Find out total salary we pay

Find out total number of employees, total number ofsupervisors

Find out employee who are drawing maximum salary

Find out employee who are drawing above averagesalary,

Find out average salary paid to managers,

And so forth

These are basically aggregation operations

Page 10: Aggregation Operations on Relations

8/13/2019 Aggregation Operations on Relations

http://slidepdf.com/reader/full/aggregation-operations-on-relations 10/36

8/26/2011 Database Systems 10

Aggregation operations

Are basically counting or summing of an attribute-

values over all tuples or grouped tupples of a relations

Following are common aggregation operations:

SUM, AVG, MAX, MIN, COUNT are common

aggregate operations over specified column of a table

Page 11: Aggregation Operations on Relations

8/13/2019 Aggregation Operations on Relations

http://slidepdf.com/reader/full/aggregation-operations-on-relations 11/36

Page 12: Aggregation Operations on Relations

8/13/2019 Aggregation Operations on Relations

http://slidepdf.com/reader/full/aggregation-operations-on-relations 12/36

8/26/2011 Database Systems 12

Aggregation operations - COUNT

SELECT count(*) FROM employee

returns count of tuples

SELECT count(superssn) FROM employee

returns count of tuples having superssn value (excludestuples having null value in superssn)

SELECT count(DISTINCT superssn)

FROM employee;

returns count of distinct values of superssn (excludes tuples

having null value)

Page 13: Aggregation Operations on Relations

8/13/2019 Aggregation Operations on Relations

http://slidepdf.com/reader/full/aggregation-operations-on-relations 13/36

8/26/2011 Database Systems 13

Aggregation operations: sum, min, max, avg

Example Query:

SELECT sum(salary), min(salary),

 max(salary), avg(salary)

FROM employee;

The query returns sum of salaries (of all salary values),

min salary, average of salaries computed over all tuplesof employee relation

Page 14: Aggregation Operations on Relations

8/13/2019 Aggregation Operations on Relations

http://slidepdf.com/reader/full/aggregation-operations-on-relations 14/36

8/26/2011 Database Systems 14

 NULL’s are ignored in Aggregation

 NULL never contributes to a sum, average, or count,

and can never be the minimum or maximum of a

column.

But if there are no non-NULL values in a column, thenthe result of the aggregation is NULL.

Exception: COUNT of an empty set is 0.

Page 15: Aggregation Operations on Relations

8/13/2019 Aggregation Operations on Relations

http://slidepdf.com/reader/full/aggregation-operations-on-relations 15/36

8/26/2011 Database Systems 15

Example: Effect of NULL’s

Observe the difference in result of following two

queries -

SELECT COUNT( * )

FROM employee;

SELECT COUNT( superssn )

FROM employee;

Page 16: Aggregation Operations on Relations

8/13/2019 Aggregation Operations on Relations

http://slidepdf.com/reader/full/aggregation-operations-on-relations 16/36

8/26/2011 Database Systems 16

Eliminating Duplicates in an Aggregation

Observe the difference in result of following two

queries -

SELECT COUNT( superssn )

FROM employee;

SELECT COUNT( DISTINCT superssn )

FROM employee;

Page 17: Aggregation Operations on Relations

8/13/2019 Aggregation Operations on Relations

http://slidepdf.com/reader/full/aggregation-operations-on-relations 17/36

8/26/2011 Database Systems 17

Aggregation over grouped tuples

 Normally aggregate operations are not applied to full

column, values in column are grouped based on someattribute(s)

Following query requires grouping of tuples-

Find out department-wise count of employees

Find out total salary we pay to each department

Page 18: Aggregation Operations on Relations

8/13/2019 Aggregation Operations on Relations

http://slidepdf.com/reader/full/aggregation-operations-on-relations 18/36

8/26/2011 Database Systems 18

Aggregation operations- Grouping For example (Q71): find out average of all

departments. Will be written as following in SQL (and

shows results as given against)-

SELECT dno, avg(salary) AS dept_avg

FROM employeeGROUP BY dno;

Page 19: Aggregation Operations on Relations

8/13/2019 Aggregation Operations on Relations

http://slidepdf.com/reader/full/aggregation-operations-on-relations 19/36

8/26/2011 Database Systems 19

GROUP BY groups the tuples for every distinct value

in grouping attribute(s).

For example GROUP BY dno, will group all tuples of

employee for each dno, and aggregation applied overthat group, and finally you have aggregated values for

each dno.

Page 20: Aggregation Operations on Relations

8/13/2019 Aggregation Operations on Relations

http://slidepdf.com/reader/full/aggregation-operations-on-relations 20/36

8/26/2011 Database Systems 20

Example Q72

Modify solution of Query 71 such that you get

Department Name also, i.e.

Q72:List DNo, DName, and Avg(Salary) of all departments

Page 21: Aggregation Operations on Relations

8/13/2019 Aggregation Operations on Relations

http://slidepdf.com/reader/full/aggregation-operations-on-relations 21/36

8/26/2011 Database Systems 21

Example Q72

List DNo, DName, and Avg(Salary) of alldepartments-

SELECT dno, dname, dept_avg FROM (SELECT

dno, avg(salary) AS dept_avg FROM

employee GROUP BY dno) AS avg_sal NATURAL JOIN department;

Page 22: Aggregation Operations on Relations

8/13/2019 Aggregation Operations on Relations

http://slidepdf.com/reader/full/aggregation-operations-on-relations 22/36

8/26/2011 Database Systems 22

Example Q72 List DNo, DName, and Avg(Salary) of all

departments– 

Following solution will also give correct result but isless intuitive solution

SELECT dno, dname, avg(salary) AS

dept_avg FROM employee NATURAL JOIN

department GROUP BY dno, dname;

Page 23: Aggregation Operations on Relations

8/13/2019 Aggregation Operations on Relations

http://slidepdf.com/reader/full/aggregation-operations-on-relations 23/36

8/26/2011 Database Systems 23

Aggregate operations in RA

Symbolically can be represented using script F (F)

operator. General form could be-

<grouping-attributes> F<funtion-list>(R)

Typically functions are: count, sum, max, min, avg

Page 24: Aggregation Operations on Relations

8/13/2019 Aggregation Operations on Relations

http://slidepdf.com/reader/full/aggregation-operations-on-relations 24/36

8/26/2011 Database Systems 24

Aggregate operations in RA

For example, a aggregate query-

DNO  FDNO, COUNT(SSN) -> No_of_Emps,

AVG(SALARY) -> AVG_SAL (EMPLOYEE)

In SQL, you write, asSELECT dno, count(ssn) AS No_of_Emps,

avg(salary) AS avg_sal

FROM employee

GROUP BY dno;

Page 25: Aggregation Operations on Relations

8/13/2019 Aggregation Operations on Relations

http://slidepdf.com/reader/full/aggregation-operations-on-relations 25/36

8/26/2011 Database Systems 25

Aggregation operations- GROUP BY

 Normally aggregate functions appear in SELECTclause of query, however attributes that appear inGROUP BY can also (and only ?) appear here.

Obviously following (in red) are wrongly placed

attributes in statements below -

SELECT dno, ssn, avg(salary) ASavg_sal FROM employee GROUP BY dno;

SELECT dno, avg(salary) AS avg_salFROM employee;

Page 26: Aggregation Operations on Relations

8/13/2019 Aggregation Operations on Relations

http://slidepdf.com/reader/full/aggregation-operations-on-relations 26/36

8/26/2011 Database Systems 26

HAVING clause

HAVING is used to specify restrict over result of

aggregation

For example:SELECT dno, avg(salary) AS avg_salFROM employee GROUP BY dnoHAVING avg(salary) > 50000;

 Note that you can use avg_sal instead of avg(salary) inHAVING clause, because renaming is done at the timeof projection

Page 27: Aggregation Operations on Relations

8/13/2019 Aggregation Operations on Relations

http://slidepdf.com/reader/full/aggregation-operations-on-relations 27/36

8/26/2011 Database Systems 27

Aggregate operations in SQL

Semantics of SELECT-GROUP BY-HAVING in sql

select statement

SELECT <attrib and/or function-list>

FROM <table-list>[WHERE <condition>]

[GROUP BY <grouping attribute(s)>]

[HAVING <group-filter-condition>][ORDER BY <attrib-list>];

Page 28: Aggregation Operations on Relations

8/13/2019 Aggregation Operations on Relations

http://slidepdf.com/reader/full/aggregation-operations-on-relations 28/36

8/26/2011 Database Systems 28

Semantics of SELECT-GROUP BY-HAVING

Result of FROM and WHERE is given to GROUP BYoperation

GROUP BY operation computes aggregated values for eachgroup value, and gives you one tuple for each group value.

group-filter-conditions in HAVING are used toapply restriction over result of GROUP BY operation

Finally project is applied as defined in SELECT clause ofSELECT statement.

Page 29: Aggregation Operations on Relations

8/13/2019 Aggregation Operations on Relations

http://slidepdf.com/reader/full/aggregation-operations-on-relations 29/36

8/26/2011 Database Systems 29

Example:

List employee-ssn along with count of employees they

are supervising

List employee-ssn along with count of employees they

are supervising that are supervising more than 2employees

List employee-ssn, employee-fname along with countof employees they are supervising that are supervising

more than 2 employees

Page 30: Aggregation Operations on Relations

8/13/2019 Aggregation Operations on Relations

http://slidepdf.com/reader/full/aggregation-operations-on-relations 30/36

8/26/2011 Database Systems 30

List employee-ssn along with count of employees they

are supervising

SELECT superssn, COUNT(superssn) FROMemployee GROUP BY superssn;

Following are also OK-

SELECT superssn, COUNT(ssn) FROM

employee GROUP BY superssn; SELECT superssn, COUNT(*) FROM employee

GROUP BY superssn;

Page 31: Aggregation Operations on Relations

8/13/2019 Aggregation Operations on Relations

http://slidepdf.com/reader/full/aggregation-operations-on-relations 31/36

8/26/2011 Database Systems 31

Example:

List employee-ssn along with count of employees they

are supervising that are supervising more than 2

employees

Page 32: Aggregation Operations on Relations

8/13/2019 Aggregation Operations on Relations

http://slidepdf.com/reader/full/aggregation-operations-on-relations 32/36

8/26/2011 Database Systems 32

List employee-ssn along with count of employees they

are supervising that are supervising more than 2employees

SELECT superssn, COUNT(superssn)FROM employee

GROUP BY superssn

HAVING count ( superssn ) > 2;

Page 33: Aggregation Operations on Relations

8/13/2019 Aggregation Operations on Relations

http://slidepdf.com/reader/full/aggregation-operations-on-relations 33/36

8/26/2011 Database Systems 33

Example:

List employee-ssn, employee-fname, no of emps

supervised for emps that are supervising more than 2employees

Page 34: Aggregation Operations on Relations

8/13/2019 Aggregation Operations on Relations

http://slidepdf.com/reader/full/aggregation-operations-on-relations 34/36

8/26/2011 Database Systems 34

List employee-ssn, employee-fname, no of emps

supervised for emps that are supervising more than 2

employees

SELECT emp.ssn, emp.fname, ssns.NoOfEmpsSupervised 

FROM employee AS emp JOIN

(SELECT superssn,

COUNT(superssn) AS NoOfEmpsSupervised 

FROM employee

GROUP BY superssn

HAVING count ( superssn ) > 2) AS ssns

ON emp.ssn = ssns.superssn;

Page 35: Aggregation Operations on Relations

8/13/2019 Aggregation Operations on Relations

http://slidepdf.com/reader/full/aggregation-operations-on-relations 35/36

8/26/2011 Database Systems 35

More queries

Find out average salary manager 

List second level supervisors (may not precisely

aggregation query)

Page 36: Aggregation Operations on Relations

8/13/2019 Aggregation Operations on Relations

http://slidepdf.com/reader/full/aggregation-operations-on-relations 36/36

8/26/2011 Database Systems 36

Thanks