managing data(bases) using sql (non-procedural · pdf filemanaging data(bases) using sql ......

28
MANAGING DATA(BASES) USING SQL (NON - PROCEDURAL SQL , X401.9) Professional Program: Data Administration and Management Instructor: Michael Kremer, Ph.D. Technology & Information Management Class 6

Upload: vankien

Post on 06-Mar-2018

241 views

Category:

Documents


4 download

TRANSCRIPT

MANAGING DATA(BASES) USING SQL

(NON-PROCEDURAL SQL, X401.9)

Professional Program: Data Administration and Management

Instructor: Michael Kremer, Ph.D.Technology & Information Management

Class 6

AGENDA

9. Aggregate Queries

9.1 Aggregate Functions

9.2 Grouping Data

9.3 Filtering Grouped/Aggregated Data

Aggregate Queries

9.

9.1 AGGREGATE FUNCTIONS

Aggregate function: Values of multiple rows are grouped together

as input on certain criteria to form a single value of more

significant meaning or measurement

Aggregate queries are still part of SELECT queries (DQL)

Following functions are

part of SQL standard:

Additional aggregate

functions include StDev, Var, First, Last (db platform dependent)

169

9.1 AGGREGATE FUNCTIONS

Create expressions

combining multiple

aggregate functions

together

Rules about aggregate function expressions:

170

9.1 AGGREGATE FUNCTIONS

Use of SQL

predicates (such as

DISTINCT) is allowed:

Use WHERE clause

to apply aggregation

over subset of data:

171

9.1 AGGREGATE FUNCTIONS

SELECT query using dynamic WHERE clause criteria based on

aggregate query

Hard-coded WHERE

clause, average

salary of ST_CLERK

(based on previous

example)

Use of subquery

in WHERE clause

to dynamically

calculate average

salary of ST_CLERK

172

9.1 AGGREGATE FUNCTIONS

Subqueries will be

covered in next chapter

in more detail

Basic rules of

subqueries:

173

9.2 GROUPING DATA

So far, aggregation took place over entire table or subset using WHERE clause

Grouping of data Partition table data into groups and aggregate data within each group

New Keyword: GROUP BY

Partition of a set divides the set into subsets such that the union of the subsets returns the originalset (pizza, pie)

GROUP BY clause takes the result ofthe FROM and WHERE clauses and arranges the rows into groups defined as having the same values for the columns listed in the Group By clause

Each group is reduced to a single row in the resulting set

174

9.2 GROUPING DATA

175

9.2 GROUPING DATA

If the grouped fields contain null values, all null values are

arranged into one group as if they were all equal

SELECT syntax including

GROUP BY clause

Rules for aggregate

functions:

176

9.2 GROUPING DATA

NULL behavior of

aggregate functions

(Aggregate functions

do not consider NULL

values)

Rules for GROUP BY:

177

9.2 GROUPING DATA

Error message when not

using GROUP BY in

SELECT query

Same query using

GROUP BY

178

9.2 GROUPING DATA

Use multiple columns

in GROUP BY clause

Order in GROUP BY

clause has no bearing

on results

To achieve desired

order, use ORDER BY

clause

179

9.2 GROUPING DATA

Use multiple aggregate

functions

Group on expressions,

this allows you to

customize your query in a

very specific way based

on certain business rules

You have to list expression twice, in SELECT clause as well as in

GROUP BY clause

Due to the fact that GROUP BY clause is processed before

SELECT clause

Better solution is to use subquery (demonstrated later in this

course)

180

9.2 GROUPING DATA

Group on expression

Aggregation based

on query using

multiple tables

181

9.2 GROUPING DATA

When using multiple tables, make sure to aggregate data only

on lowest child table

When using multiple tables, you denormalize data and display

parent records multiple times aggregation takes place on

redundant parent data!

182

9.2 GROUPING DATA

Remove COUNT function

to demonstrate number

of records

To get the correct count,

use the DISTINCT

predicate

183

9.2 GROUPING DATA

Display distinct

departments

Only shows count of

departments that have

employees associated

with them (INNER JOIN)

To count all departments,

use OUTER JOIN!

184

9.2 GROUPING DATA

Nest aggregate functions up to a level of two in Oracle (this is

not according to the SQL standard)

Must use GROUP BY for inner aggregate function returns

multiple values

Error message when

omitting GROUP BY:

Add GROUP BY clause

to be used for inner

aggregate function:

185

9.2 GROUPING DATA

Count employees hired in the same year

186

9.2 GROUPING DATA

Transpose data results in complex expression

187

9.3 FILTERING GROUPED/AGGREGATED DATA

3 different kind of columns in aggregate SELECT statement:

Group By column

Aggregate column

Non-Select column (only used for additional filtering)

Hence there 3 filtering situations:

Filtering on Non-Select

column WHERE clause:

188

9.3 FILTERING GROUPED/AGGREGATED DATA

Same query as before but without WHERE clause

Resulting data is different!

189

9.3 FILTERING GROUPED/AGGREGATED DATA

Filtering on Group By

column WHERE clause

Previous two examples: WHERE clause filters data before

aggregation takes place

Next example shows filtering on aggregate column HAVING

clause

HAVING clause is used to restrict the number of rows after the

aggregation of data used for aggregate columns

190

9.3 FILTERING GROUPED/AGGREGATED DATA

HAVING clause comes

after GROUP BY

HAVING clause is processed

after aggregation takes

place

WHERE clause, on the other

hand, is processed before!

You can place HAVING

clause before GROUP BY

This is just semantic, it does not represent processing order!

191

9.3 FILTERING GROUPED/AGGREGATED DATA

Filtering on aggregate

column HAVING clause

In order to filter average

salary greater than 10000,

we must know the average

salary first!

Aggregation must take

place first not very

efficient as the database must aggregate all rows first before

applying the HAVING clause to filter out rows

SQL processing order:

192

9.3 FILTERING GROUPED/AGGREGATED DATA

Based on processing

order, we understand now that we cannot reference an

expression from the SELECT clause in the GROUP BY clause

Same is true for WHERE clause

But we can reference

expressions in the ORDER

BY clause by using

column alias

193