db2 for z/os is serious about analytics: leveraging sql to...

36
DB2 for z/OS is Serious About Analytics: Leveraging SQL to Analyze Your Data Where it Resides Maryela Weihrauch, IBM, Distinguished Engineer, [email protected] Meg Bernal, IBM, Senior Software Engineer, [email protected]

Upload: others

Post on 30-Apr-2020

4 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: DB2 for z/OS is Serious About Analytics: Leveraging SQL to ...api.ning.com/files/GNqxi34AHImafvsg*C**yITPZ0CFaTIh1IdS4scbS-… · DB2 for z/OS is Serious About Analytics: Leveraging

DB2 for z/OS is Serious About Analytics:

Leveraging SQL to Analyze Your Data Where it Resides

Maryela Weihrauch, IBM, Distinguished Engineer, [email protected]

Meg Bernal, IBM, Senior Software Engineer, [email protected]

Page 2: DB2 for z/OS is Serious About Analytics: Leveraging SQL to ...api.ning.com/files/GNqxi34AHImafvsg*C**yITPZ0CFaTIh1IdS4scbS-… · DB2 for z/OS is Serious About Analytics: Leveraging

© 2016 IBM Corporation03/01/17World of Watson 2016 2

Agenda

• How DB2 and Analytics Make Sense

• OLAP Specification

• Rolling Aggregations

• Grouping

• Math Functions

Page 3: DB2 for z/OS is Serious About Analytics: Leveraging SQL to ...api.ning.com/files/GNqxi34AHImafvsg*C**yITPZ0CFaTIh1IdS4scbS-… · DB2 for z/OS is Serious About Analytics: Leveraging

© 2016 IBM Corporation03/01/17World of Watson 2016 3

System z:

re-inventing

business around

an integrated data

lifecycle

Reduced data movement, reduced complexity, reduced configuration resources

More accurate, more secure, more available

Better business response

VSAM

DB2

VSAM

IMS

DB2AA

Page 4: DB2 for z/OS is Serious About Analytics: Leveraging SQL to ...api.ning.com/files/GNqxi34AHImafvsg*C**yITPZ0CFaTIh1IdS4scbS-… · DB2 for z/OS is Serious About Analytics: Leveraging

HTAP -

DB2 with

Analytics

Accelerator

OLAP

DB

2 fo

r z/O

S

Pro

ce

ss

ing

IBM

DB

2 A

na

lytic

s A

cce

lera

tor

DB2 for z/OS CPU savings

target• Operational (in transaction)

analytics

• (complex) OLTP

Accelerator focus• Ad-hoc queries

• Complex queries scanning

large amount of data

• ETL acceleration/virtual

transformation

Complex queries (more history)

OLTP Transactions

High concurrency

Hybrid Transactional &

Analytical Processing

Standard reports

Page 5: DB2 for z/OS is Serious About Analytics: Leveraging SQL to ...api.ning.com/files/GNqxi34AHImafvsg*C**yITPZ0CFaTIh1IdS4scbS-… · DB2 for z/OS is Serious About Analytics: Leveraging

© 2016 IBM Corporation03/01/17World of Watson 2016 5

Optimizer

Acce

lera

tor D

RD

AR

equ

esto

r

Application

Applicatio

n

Interface

Queries executed with Accelerator

Queries executed without Accelerator

Heartbeat (availability and performance indicators)

Query execution run-time for queries

that cannot be or should not be off-

loaded to Accelerator

SPU

MemorySPU

MemorySPU

MemorySPU

Memory

SM

PH

ost

Heartbeat

DB2 for z/OS

CPU FPGA

CPU FPGA

CPU FPGA

CPU FPGA

CPU FPGA

CPU FPGA

CPU FPGA

CPU FPGA

Accelerator

Query

Execution

Flow

Page 6: DB2 for z/OS is Serious About Analytics: Leveraging SQL to ...api.ning.com/files/GNqxi34AHImafvsg*C**yITPZ0CFaTIh1IdS4scbS-… · DB2 for z/OS is Serious About Analytics: Leveraging

© 2016 IBM Corporation03/01/17World of Watson 2016 6

Analytics through

SQL

V9 V10 V11 V12

RANK √

DENSE_RANK √

ROW_NUMBER √

Moving AVERAGE √

Moving SUM √

Grouping √

ROLLUP √

CUBE √

MEDIAN √

PERCENTILE √

LISTAGG* √

Page 7: DB2 for z/OS is Serious About Analytics: Leveraging SQL to ...api.ning.com/files/GNqxi34AHImafvsg*C**yITPZ0CFaTIh1IdS4scbS-… · DB2 for z/OS is Serious About Analytics: Leveraging

OLAP SPECIFICATION

Page 8: DB2 for z/OS is Serious About Analytics: Leveraging SQL to ...api.ning.com/files/GNqxi34AHImafvsg*C**yITPZ0CFaTIh1IdS4scbS-… · DB2 for z/OS is Serious About Analytics: Leveraging

© 2016 IBM Corporation03/01/17World of Watson 2016 8

OLAP Specifications

• OLAP specifications can be used to determine ranking, row numbering, and

aggregation as a scalar value

• The OLAP specification is defined using a Window

– A window may specify three (3) components (each is optional)

1. Partitioning – PARTITION BY CLAUSE

2. Ordering – ORDER BY clause

3. Aggregation Group – ROWS/RANGE

Page 9: DB2 for z/OS is Serious About Analytics: Leveraging SQL to ...api.ning.com/files/GNqxi34AHImafvsg*C**yITPZ0CFaTIh1IdS4scbS-… · DB2 for z/OS is Serious About Analytics: Leveraging

© 2016 IBM Corporation03/01/17World of Watson 2016 9

Differences

between the

function types

Function Type

Action Examples

Scalar compute a single value for the current row

SUBSTR, CONCAT, LTRIM

Aggregate collapse a group of rows into a single row and compute a single value for this group

MAX, MIN, SUM, COUNT

Moving Aggregate

compute a single value for the current row based on some or all rows in a defined group

Aggregate function, like above, for a window

Page 10: DB2 for z/OS is Serious About Analytics: Leveraging SQL to ...api.ning.com/files/GNqxi34AHImafvsg*C**yITPZ0CFaTIh1IdS4scbS-… · DB2 for z/OS is Serious About Analytics: Leveraging

© 2016 IBM Corporation03/01/17World of Watson 2016 10

RANK, DENSE_RANK, ROW_NUMBER

• RANK/DENSE_RANK

– Returns the ordinal rank of a row in the window

• RANK: may have a gap in the ranking

• DENSE_RANK: no gaps in the ranking

• ROW_NUMBER

– Returns the sequential row number

Page 11: DB2 for z/OS is Serious About Analytics: Leveraging SQL to ...api.ning.com/files/GNqxi34AHImafvsg*C**yITPZ0CFaTIh1IdS4scbS-… · DB2 for z/OS is Serious About Analytics: Leveraging

© 2016 IBM Corporation03/01/17World of Watson 2016 11

RANK

Example

SELECT EMPNO, LASTNAME, SALARY+BONUS AS TOTAL_SALARY,

RANK() OVER(ORDER BY SALARY+BONUS DESC) AS RANK_SALARY

FROM DSN8A10.EMP WHERE SALARY+BONUS > 30000 ORDER BY RANK_SALARY;

+-----------------------------------------------------------+

| EMPNO | LASTNAME | TOTAL_SALARY | RANK_SALARY |

+-----------------------------------------------------------+

1_| 000010 | HAAS | 53750.00 | 1 |

2_| 200010 | HEMMINGER | 47500.00 | 2 |

3_| 000110 | LUCCHESI | 47400.00 | 3 |

4_| 000020 | THOMPSON | 42050.00 | 4 |

5_| 000050 | GEYER | 40975.00 | 5 |

6_| 000030 | KWAN | 39050.00 | 6 |

7_| 000070 | PULASKI | 36870.00 | 7 |

8_| 000060 | STERN | 32850.00 | 8 |

9_| 000220 | LUTZ | 30440.00 | 9 |

10_| 200220 | JOHN | 30440.00 | 9 |

11_| 000090 | HENDERSON | 30350.00 | 11 |

+-----------------------------------------------------------+

LUTZ & JOHN have same total salary therefore they get the same ranking

HENDERSON gets the rank based off of the number of rows that previously qualified the prior rank

Page 12: DB2 for z/OS is Serious About Analytics: Leveraging SQL to ...api.ning.com/files/GNqxi34AHImafvsg*C**yITPZ0CFaTIh1IdS4scbS-… · DB2 for z/OS is Serious About Analytics: Leveraging

© 2016 IBM Corporation03/01/17World of Watson 2016 12

DENSE_RANK

– Rank based

on uniqueness

SELECT EMPNO, LASTNAME, SALARY+BONUS AS TOTAL_SALARY,

DENSE_RANK() OVER(ORDER BY SALARY+BONUS DESC) AS RANK_SALARY

FROM DSN8A10.EMP WHERE SALARY+BONUS > 30000 ORDER BY RANK_SALARY;

+-----------------------------------------------------------+

| EMPNO | LASTNAME | TOTAL_SALARY | RANK_SALARY |

+-----------------------------------------------------------+

1_| 000010 | HAAS | 53750.00 | 1 |

2_| 200010 | HEMMINGER | 47500.00 | 2 |

3_| 000110 | LUCCHESI | 47400.00 | 3 |

4_| 000020 | THOMPSON | 42050.00 | 4 |

5_| 000050 | GEYER | 40975.00 | 5 |

6_| 000030 | KWAN | 39050.00 | 6 |

7_| 000070 | PULASKI | 36870.00 | 7 |

8_| 000060 | STERN | 32850.00 | 8 |

9_| 000220 | LUTZ | 30440.00 | 9 |

10_| 200220 | JOHN | 30440.00 | 9 |

11_| 000090 | HENDERSON | 30350.00 | 10 |

+-----------------------------------------------------------+

HENDERSON gets a ranking based on the # of unique total salaries that precede it

Page 13: DB2 for z/OS is Serious About Analytics: Leveraging SQL to ...api.ning.com/files/GNqxi34AHImafvsg*C**yITPZ0CFaTIh1IdS4scbS-… · DB2 for z/OS is Serious About Analytics: Leveraging

© 2016 IBM Corporation03/01/17World of Watson 2016 13

ROW_NUMBER

- Sequential

row number

SELECT EMPNO, LASTNAME, SALARY+BONUS AS TOTAL_SALARY,

ROW_NUMBER() OVER(ORDER BY SALARY+BONUS DESC) AS RANK_SALARY

FROM DSN8A10.EMP WHERE SALARY+BONUS > 30000 ORDER BY RANK_SALARY;

+-----------------------------------------------------------+

| EMPNO | LASTNAME | TOTAL_SALARY | RANK_SALARY |

+-----------------------------------------------------------+

1_| 000010 | HAAS | 53750.00 | 1 |

2_| 200010 | HEMMINGER | 47500.00 | 2 |

3_| 000110 | LUCCHESI | 47400.00 | 3 |

4_| 000020 | THOMPSON | 42050.00 | 4 |

5_| 000050 | GEYER | 40975.00 | 5 |

6_| 000030 | KWAN | 39050.00 | 6 |

7_| 000070 | PULASKI | 36870.00 | 7 |

8_| 000060 | STERN | 32850.00 | 8 |

9_| 000220 | LUTZ | 30440.00 | 9 |

10_| 200220 | JOHN | 30440.00 | 10 |

11_| 000090 | HENDERSON | 30350.00 | 11 |

+-----------------------------------------------------------+

Page 14: DB2 for z/OS is Serious About Analytics: Leveraging SQL to ...api.ning.com/files/GNqxi34AHImafvsg*C**yITPZ0CFaTIh1IdS4scbS-… · DB2 for z/OS is Serious About Analytics: Leveraging

© 2016 IBM Corporation03/01/17World of Watson 2016 14

Moving AVG

Example

SELECT Territory, Month, Sales,

AVG(Sales) OVER (PARTITION BY Territory

ORDER BY Month

ROWS 2 PRECEDING) AS Moving_Avg

FROM Sales_History;

Territory Month Sales

East 199810 10

East 199811 4

East 199812 10

East 199901 7

East 199902 10

West 199810 8

West 199811 12

West 199812 11

West 199901 7

West 199902 6

Aggregation group

Compute the average sales over the current month and the

preceding two months, for each territory, and month

Page 15: DB2 for z/OS is Serious About Analytics: Leveraging SQL to ...api.ning.com/files/GNqxi34AHImafvsg*C**yITPZ0CFaTIh1IdS4scbS-… · DB2 for z/OS is Serious About Analytics: Leveraging

© 2016 IBM Corporation03/01/17World of Watson 2016 15

Territory Month Sales Moving_Average

East 199810 10 10

East 199811 4 7

East 199812 10 8

East 199901 7 7

East 199902 10 9

West 199810 8 8

West 199811 12 10

West 199812 7 9

West 199901 11 10

West 199902 6 8

Moving average

in action

Page 16: DB2 for z/OS is Serious About Analytics: Leveraging SQL to ...api.ning.com/files/GNqxi34AHImafvsg*C**yITPZ0CFaTIh1IdS4scbS-… · DB2 for z/OS is Serious About Analytics: Leveraging

© 2016 IBM Corporation03/01/17World of Watson 2016 16

Moving SUM Example

SELECT Territory, Month, Sales,

SUM(Sales) OVER (PARTITION BY Territory

ORDER BY Month ASC

ROWS UNBOUNDED PRECEDING) AS Cumulative_Sum

FROM Sales_History;

Aggregation group

Page 17: DB2 for z/OS is Serious About Analytics: Leveraging SQL to ...api.ning.com/files/GNqxi34AHImafvsg*C**yITPZ0CFaTIh1IdS4scbS-… · DB2 for z/OS is Serious About Analytics: Leveraging

© 2016 IBM Corporation03/01/17World of Watson 2016 17

Territory Month Sales Cumulative_Sum

East 199810 10 10

East 199811 4 14

East 199812 10 24

East 199901 7 31

East 199902 10 41

West 199810 8 8

West 199811 12 20

West 199812 7 27

West 199901 11 38

West 199902 6 44

Cumulative Sum in action

Page 18: DB2 for z/OS is Serious About Analytics: Leveraging SQL to ...api.ning.com/files/GNqxi34AHImafvsg*C**yITPZ0CFaTIh1IdS4scbS-… · DB2 for z/OS is Serious About Analytics: Leveraging

GROUPING, ROLLUP, CUBE

Page 19: DB2 for z/OS is Serious About Analytics: Leveraging SQL to ...api.ning.com/files/GNqxi34AHImafvsg*C**yITPZ0CFaTIh1IdS4scbS-… · DB2 for z/OS is Serious About Analytics: Leveraging

© 2016 IBM Corporation03/01/17World of Watson 2016 19

Grouping Sets

• Grouping sets allow multiple groups to be specified

• Querying and reporting are made easier and faster by producing a single result from essentially performing

a UNION ALL of two or more groups of rows

• The group-by-clause has been enhanced:

– grouping-sets

• Allows multiple grouping clauses to be specified in a single statement

• Allows the groups to be computed with a single pass over the data

• Can be used to determine subtotals and grand totals

– super-groups

• Pre-defined grouping-sets : ROLLUP, CUBE

Page 20: DB2 for z/OS is Serious About Analytics: Leveraging SQL to ...api.ning.com/files/GNqxi34AHImafvsg*C**yITPZ0CFaTIh1IdS4scbS-… · DB2 for z/OS is Serious About Analytics: Leveraging

© 2016 IBM Corporation03/01/17World of Watson 2016 20

ROLLUP• Creates subtotals that ‘roll up’ from the most detailed level to a

grand total

• N elements translate to N+1 grouping sets:

The order specified is significant to the result:

GROUP BY ROLLUP (C1,C2,C3)

GROUP BY GROUPING SETS ((C1,C2,C3),

(C1,C2),

(C1),

());

equivalent

GROUP BY ROLLUP (C1,C2)GROUP BY GROUPING SETS ((C1,C2),

(C1),

());

equivalent

GROUP BY ROLLUP (C2,C1)

GROUP BY GROUPING SETS ((C2,C1),

(C2),

());

equivalent

versus

Page 21: DB2 for z/OS is Serious About Analytics: Leveraging SQL to ...api.ning.com/files/GNqxi34AHImafvsg*C**yITPZ0CFaTIh1IdS4scbS-… · DB2 for z/OS is Serious About Analytics: Leveraging

© 2016 IBM Corporation03/01/17World of Watson 2016 21

CUBE • Creates subtotals for all permutations

• N elements translate to 2n grouping sets:

• Unlike ROLLUP, the order specified is NOT significant to the

result therefore use ORDER BY to guarantee the order

• grand-total

– the overall aggregation of previous subtotals

– Both CUBE and ROLLUP return this row as the last row

GROUP BY CUBE (C1,C2,C3)GROUP BY GROUPING SETS ((C1,C2,C3),

(C1,C2), (C1,C3), (C2,C3),

(C1), (C2), (C3),

());

equivalent

8 groups

3 elements

grand-total

Page 22: DB2 for z/OS is Serious About Analytics: Leveraging SQL to ...api.ning.com/files/GNqxi34AHImafvsg*C**yITPZ0CFaTIh1IdS4scbS-… · DB2 for z/OS is Serious About Analytics: Leveraging

© 2016 IBM Corporation03/01/17World of Watson 2016 24

Grouping Sets

on

DSN8B10.PR

OJ

DEPTNO RESPEMP PRSTAFF

B01 000020 1.00

C01 000030 2.00

C01 000030 1.00

D01 000010 12.00

D01 000010 6.50

D11 000060 9.00

D11 000220 2.00

D11 000150 3.00

D11 000160 3.00

D21 000070 6.00

D21 000230 2.00

D21 000250 1.00

D21 000270 2.00

E01 000050 6.00

E01 000050 5.00

E11 000090 5.00

E21 000100 4.00

E21 000320 1.00

E21 000330 1.00

E21 000340 1.00

DEPTNO RESPEMP PRSTAFF

B01 000020 1.00

C01 000030 3.00

D01 000010 18.50

D11 000060 9.00

D11 000150 3.00

D11 000160 3.00

D11 000220 2.00

D21 000070 6.00

D21 000230 2.00

D21 000250 1.00

D21 000270 2.00

E01 000050 11.00

E11 000090 5.00

E21 000100 4.00

E21 000320 1.00

E21 000330 1.00

E21 000340 1.00

SELECT DEPTNO, RESPEMP,

SUM(PRSTAFF)

FROM DSN8B10.PROJ

GROUP BY DEPTNO,RESPEMP;

SELECT DEPTNO, RESPEMP,

SUM(PRSTAFF)

FROM DSN8B10.PROJ

GROUP BY

GROUPING SETS ((DEPTNO,RESPEMP));

17 rows after grouping20 rows originally

equivalent

Page 23: DB2 for z/OS is Serious About Analytics: Leveraging SQL to ...api.ning.com/files/GNqxi34AHImafvsg*C**yITPZ0CFaTIh1IdS4scbS-… · DB2 for z/OS is Serious About Analytics: Leveraging

© 2016 IBM Corporation03/01/17World of Watson 2016 25

ROLLUP on DSN8B10.PROJ

B01 ? 1.00

C01 ? 3.00

D01 ? 18.50

D11 ? 17.00

D21 ? 11.00

E01 ? 11.00

E11 ? 5.00

E21 ? 7.00

? ? 73.50

DEPTNO RESPEMP PRSTAFF

B01 000020 1.00

C01 000030 3.00

D01 000010 18.50

D11 000060 9.00

D11 000150 3.00

D11 000160 3.00

D11 000220 2.00

D21 000070 6.00

D21 000230 2.00

D21 000250 1.00

D21 000270 2.00

E01 000050 11.00

E11 000090 5.00

E21 000100 4.00

E21 000320 1.00

E21 000330 1.00

E21 000340 1.00

SELECT DEPTNO, RESPEMP,

SUM(PRSTAFF)

FROM DSN8B10.PROJ

GROUP BY

ROLLUP (DEPTNO,RESPEMP);

SELECT DEPTNO, RESPEMP,

SUM(PRSTAFF)

FROM DSN8B10.PROJ

GROUP BY

GROUPING SETS ((DEPTNO,RESPEMP),

(DEPTNO),

());

26 rows after ROLLUP

Grand total of ALL groups

Subtotals for each

DEPTNO group

equivalent

Subtotals for each

DEPTNO, RESPEMP group

Page 24: DB2 for z/OS is Serious About Analytics: Leveraging SQL to ...api.ning.com/files/GNqxi34AHImafvsg*C**yITPZ0CFaTIh1IdS4scbS-… · DB2 for z/OS is Serious About Analytics: Leveraging

SELECT DEPTNO, RESPEMP, SUM(PRSTAFF)

FROM DSN8B10.PROJ

GROUP BY

GROUPING SETS ((DEPTNO,RESPEMP),

(DEPTNO),

(RESPEMP),

());

CUBE on

DSN8B10.PROJ

? 000010 18.50

? 000020 1.00

? 000030 3.00

? 000050 11.00

? 000060 9.00

? 000070 6.00

? 000090 5.00

? 000100 4.00

? 000150 3.00

? 000160 3.00

? 000220 2.00

? 000230 2.00

? 000250 1.00

? 000270 2.00

? 000320 1.00

? 000330

? 000340 1.00

? ? 73.50

DEPT.. RESP.. PR...

B01 000020 1.00

C01 000030 3.00

D01 000010 18.50

D11 000060 9.00

D11 000150 3.00

D11 000160 3.00

D11 000220 2.00

D21 000070 6.00

D21 000230 2.00

D21 000250 1.00

D21 000270 2.00

E01 000050 11.00

E11 000090 5.00

E21 000100 4.00

E21 000320 1.00

E21 000330 1.00

E21 000340 1.00

SELECT DEPTNO, RESPEMP, SUM(PRSTAFF)

FROM DSN8B10.PROJ

GROUP BY

CUBE (DEPTNO,RESPEMP);

equivalent

Grand total of ALL groups

B01 ? 1.00

C01 ? 3.00

D01 ? 18.50

D11 ? 17.00

D21 ? 11.00

E01 ? 11.00

E11 ? 5.00

E21 ? 7.00

Subtotals for each

DEPTNO group

Subtotals for

each RESPEMP

group

Page 25: DB2 for z/OS is Serious About Analytics: Leveraging SQL to ...api.ning.com/files/GNqxi34AHImafvsg*C**yITPZ0CFaTIh1IdS4scbS-… · DB2 for z/OS is Serious About Analytics: Leveraging

MEDIAN, PERCENTILE, LISTAGG

Page 26: DB2 for z/OS is Serious About Analytics: Leveraging SQL to ...api.ning.com/files/GNqxi34AHImafvsg*C**yITPZ0CFaTIh1IdS4scbS-… · DB2 for z/OS is Serious About Analytics: Leveraging

© 2016 IBM Corporation03/01/17World of Watson 2016 28

Overview

• PERCENTILE_CONT and PERCENTILE_DISC are categorized as "inverse distribution

functions" in the SQL Standard that compute a percentile value within a given group.

• MEDIAN, which we know intuitively as producing the "middle" value can be equivalently

expressed as PERCENTILE_CONT(0.5), where the argument 0.5 indicates that we

want the function to return the "middle" percentage value.

Page 27: DB2 for z/OS is Serious About Analytics: Leveraging SQL to ...api.ning.com/files/GNqxi34AHImafvsg*C**yITPZ0CFaTIh1IdS4scbS-… · DB2 for z/OS is Serious About Analytics: Leveraging

© 2016 IBM Corporation03/01/17World of Watson 2016 29

PERCENTILE_CONT vs PERCENTILE_DISC vs MEDIAN

• PERCENTILE_CONT treats the group as a "continuous distribution",

meaning that if the percentage falls between two values, the result is

interpolated between those two values. For example, the median

(PERCENTILE_CONT(0.5)) of a group of 6 numbers would be

average of the 3rd and 4th number, because we find the interpolated

middle value between those two.

• PERCENTILE_DISC treats the group as a set of discrete values. The

result is not interpolated and is always one of the values in the group.

PERCENTILE_DISC(0.5) over a group of 6 numbers would return the

3rd number.

• MEDIAN is equivalent with PERCENTILE_CONT(0.5).

Salary

2450

2975

1375

Salary

2450

2975

1250

PERCENTILE_CONT

PERCENTILE_DISC

VS.

Page 28: DB2 for z/OS is Serious About Analytics: Leveraging SQL to ...api.ning.com/files/GNqxi34AHImafvsg*C**yITPZ0CFaTIh1IdS4scbS-… · DB2 for z/OS is Serious About Analytics: Leveraging

PERCENTILE_CONT

and MEDIAN

• Suppose we would like to determine the salary, within each

department, which is at the 50th percentile

Salary

2450

2975

1375

Result:

SELECT PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY SALARY)

FROM EMP

GROUP BY DEPT_NO;

SELECT MEDIAN(SALARY)

FROM EMP

GROUP BY DEPT_NO;

NAME SALARY DEPT_NO

Miller 1300 10

Clark 2450 10

King 5002 10

Smith 800 20

Adams 1100 20

Jones 2975 20

Scott 3000 20

Ford 3000 20

James 950 30

Martin 1250 30

Ward 1250 30

Turner 1500 30

Allen 1600 30

Blake 2850 30

MEDIAN == PERCENTILE(0.5)

Page 29: DB2 for z/OS is Serious About Analytics: Leveraging SQL to ...api.ning.com/files/GNqxi34AHImafvsg*C**yITPZ0CFaTIh1IdS4scbS-… · DB2 for z/OS is Serious About Analytics: Leveraging

PERCENTILE_DISC

Name Salary DEPT_NO

Miller 1300 10

Clark 2450 10

King 5002 10

Smith 800 20

Adams 1100 20

Jones 2975 20

Scott 3000 20

Ford 3000 20

James 950 30

Martin 1250 30

Ward 1250 30

Turner 1500 30

Allen 1600 30

Blake 2850 30

Salary

2450

2975

1250

Result:

Suppose we would like to determine the

actual salary value, within each department,

which is at the 50th percentile

SELECT PERCENTILE_DISC(0.5) WITHIN GROUP (ORDER BY SALARY)

FROM EMP

GROUP BY DEPT_NO;

Page 30: DB2 for z/OS is Serious About Analytics: Leveraging SQL to ...api.ning.com/files/GNqxi34AHImafvsg*C**yITPZ0CFaTIh1IdS4scbS-… · DB2 for z/OS is Serious About Analytics: Leveraging

© 2016 IBM Corporation03/01/17World of Watson 2016 32

PERCENTILE Syntax Diagram

• percentile-expr:

Specifies the percentile to be calculated by the function.

percentile-expr must return a built-in numeric value between 0

and 1

Page 31: DB2 for z/OS is Serious About Analytics: Leveraging SQL to ...api.ning.com/files/GNqxi34AHImafvsg*C**yITPZ0CFaTIh1IdS4scbS-… · DB2 for z/OS is Serious About Analytics: Leveraging

© 2016 IBM Corporation03/01/17World of Watson 2016 35

LISTAGG Overview

• The LISTAGG function is used to aggregate a set of string values into a

single string

• The values are appended based on the order specified by the WITHIN

GROUP clause

• The values may be separated by a separator

Page 32: DB2 for z/OS is Serious About Analytics: Leveraging SQL to ...api.ning.com/files/GNqxi34AHImafvsg*C**yITPZ0CFaTIh1IdS4scbS-… · DB2 for z/OS is Serious About Analytics: Leveraging

LISTAGG ALL

DID EName EID EBYear

10 Jack 0012 1983

10 Lily 0015 1990

20 Tom 0019 1983

20 Bob 0022 1976

20 Frank 0004 1983

20 Tom 0014 1985

30 Jerry 0028 1991

30 Chris 0021 1981

30 Jill 0002 1984

30 Jerry 0031 1984

30 Allan 0006 1995

DID Name_List

10 Jack; Lily

20 Bob; Frank; Tom; Tom

30 Chris; Jill; Jerry; Jerry; Allan

Result:

Suppose we would like to output employee names, in ascending order, under the

same department according to their employee ID and birth year from the EMP table

SELECT Dept_ID,

ListAgg(ALL E_Name, ‘; ’) WITHIN GROUP (ORDR BY Birth, Emp_ID)

AS Name_List

FROM EMP

GROUP BY Dept_ID;

*To be delivered in the service stream

Page 33: DB2 for z/OS is Serious About Analytics: Leveraging SQL to ...api.ning.com/files/GNqxi34AHImafvsg*C**yITPZ0CFaTIh1IdS4scbS-… · DB2 for z/OS is Serious About Analytics: Leveraging

LISTAGG DISTINCT

DID EName EID EBYear

10 Jack 0012 1983

10 Lily 0015 1990

20 Tom 0019 1983

20 Bob 0022 1976

20 Frank 0004 1983

20 Tom 0014 1985

30 Jerry 0028 1991

30 Chris 0021 1981

30 Jill 0002 1984

30 Jerry 0031 1984

30 Allan 0006 1995

DID Name_List

10 Jack; Lily

20 Bob; Frank; Tom

30 Chris; Jill; Jerry; Allan

Result:

Suppose we would like to output employee names, in ascending order, under the same department

according to their employee ID and birth year from the EMP table, eliminating any duplicate names

SELECT Dept_ID,

ListAgg(DISTINCT E_Name, ‘; ’) WITHIN GROUP (ORDR BY Birth, Emp_ID)

AS Name_List

FROM EMP

GROUP BY Dept_ID;

Page 34: DB2 for z/OS is Serious About Analytics: Leveraging SQL to ...api.ning.com/files/GNqxi34AHImafvsg*C**yITPZ0CFaTIh1IdS4scbS-… · DB2 for z/OS is Serious About Analytics: Leveraging

© 2016 IBM Corporation03/01/17World of Watson 2016 38

Notices and disclaimers

• Copyright © 2016 by International Business Machines Corporation (IBM). No part of this document may be reproduced or transmitted in any form without written permission from IBM.

• U.S. Government Users Restricted Rights - Use, duplication or disclosure restricted by GSA ADP Schedule Contract with IBM.

• Information in these presentations (including information relating to products that have not yet been announced by IBM) has been reviewed for accuracy as of the date of initial publication and could include unintentional technical or typographical errors. IBM shall have no responsibility to update this information. THIS DOCUMENT IS DISTRIBUTED "AS IS" WITHOUT ANY WARRANTY, EITHER EXPRESS OR IMPLIED. IN NO EVENT SHALL IBM BE LIABLE FOR ANY DAMAGE ARISING FROM THE USE OF THIS INFORMATION, INCLUDING BUT NOT LIMITED TO, LOSS OF DATA, BUSINESS INTERRUPTION, LOSS OF PROFIT OR LOSS OF OPPORTUNITY. IBM products and services are warranted according to the terms and conditions of the agreements under which they are provided.

• IBM products are manufactured from new parts or new and used parts. In some cases, a product may not be new and may have been previously installed. Regardless, our warranty terms apply.”

• Any statements regarding IBM's future direction, intent or product plans are subject to change or withdrawal without notice.

• Performance data contained herein was generally obtained in a controlled, isolated environments. Customer examples are presented as illustrations of how those customers have used IBM products and the results they may have achieved. Actual performance, cost, savings or other results in other operating environments may vary.

• References in this document to IBM products, programs, or services does not imply that IBM intends to make such products, programs or services available in all countries in which IBM operates or does business.

• Workshops, sessions and associated materials may have been prepared by independent session speakers, and do not necessarily reflect the views of IBM. All materials and discussions are provided for informational purposes only, and are neither intended to, nor shall constitute legal or other guidance or advice to any individual participant or their specific situation.

• It is the customer’s responsibility to insure its own compliance with legal requirements and to obtain advice of competent legal counsel as to the identification and interpretation of any relevant laws and regulatory requirements that may affect the customer’s business and any actions the customer may need to take to comply with such laws. IBM does not provide legal advice or represent or warrant that its services or products will ensure that the customer is in compliance with any law.

Page 35: DB2 for z/OS is Serious About Analytics: Leveraging SQL to ...api.ning.com/files/GNqxi34AHImafvsg*C**yITPZ0CFaTIh1IdS4scbS-… · DB2 for z/OS is Serious About Analytics: Leveraging

© 2016 IBM Corporation03/01/17World of Watson 2016 39

Notices and disclaimers continued

• Information concerning non-IBM products was obtained from the suppliers of those products, their published announcements or other publicly available sources. IBM has not tested those products in connection with this publication and cannot confirm the accuracy of performance, compatibility or any other claims related to non-IBM products. Questions on the capabilities of non-IBM products should be addressed to the suppliers of those products. IBM does not warrant the quality of any third-party products, or the ability of any such third-party products to interoperate with IBM’s products. IBM EXPRESSLY DISCLAIMS ALL WARRANTIES, EXPRESSED OR IMPLIED, INCLUDING BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.

• The provision of the information contained herein is not intended to, and does not, grant any right or license under any IBM patents, copyrights, trademarks or other intellectual property right.

• IBM, the IBM logo, ibm.com, Aspera®, Bluemix, Blueworks Live, CICS, Clearcase, Cognos®, DOORS®, Emptoris®, Enterprise Document Management System™, FASP®, FileNet®, Global Business Services ®, Global Technology Services ®, IBM ExperienceOne™, IBM SmartCloud®, IBM Social Business®, Information on Demand, ILOG, Maximo®, MQIntegrator®, MQSeries®, Netcool®, OMEGAMON, OpenPower, PureAnalytics™, PureApplication®, pureCluster™, PureCoverage®, PureData®, PureExperience®, PureFlex®, pureQuery®, pureScale®, PureSystems®, QRadar®, Rational®, Rhapsody®, Smarter Commerce®, SoDA, SPSS, Sterling Commerce®, StoredIQ, Tealeaf®, Tivoli®, Trusteer®, Unica®, urban{code}®, Watson, WebSphere®, Worklight®, X-Force® and System z® Z/OS, are trademarks of International Business Machines Corporation, registered in many jurisdictions worldwide. Other product and service names might be trademarks of IBM or other companies. A current list of IBM trademarks is available on the Web at "Copyright and trademark information" at: www.ibm.com/legal/copytrade.shtml.

Page 36: DB2 for z/OS is Serious About Analytics: Leveraging SQL to ...api.ning.com/files/GNqxi34AHImafvsg*C**yITPZ0CFaTIh1IdS4scbS-… · DB2 for z/OS is Serious About Analytics: Leveraging

Maryela Weihrauch IBM

[email protected]

Title

DB2 for z/OS is Serious About Analytics:

Leveraging SQL to Analyze Your Data

Where it ResidesPlease fill out your

session evaluation before

leaving!