anton dignös - towards a temporal postgressql

33
Towards a Temporal PostgreSQL Incorporating Primitives for Interval Processing into PostgreSQL Anton Dign¨ os 1 Michael B¨ ohlen 1 Johann Gamper 2 1 Department of Computer Science University of Z¨ urich 2 Faculty of Computer Science Free University of Bozen-Bolzano SFScon13 sfscon 2013 1/20 AntonDign¨os

Upload: south-tyrol-free-software-conference

Post on 10-May-2015

973 views

Category:

Technology


1 download

TRANSCRIPT

Page 1: Anton Dignös - Towards a Temporal PostgresSQL

Towards a Temporal PostgreSQLIncorporating Primitives for Interval Processing into

PostgreSQL

Anton Dignos1 Michael Bohlen1 Johann Gamper2

1Department of Computer ScienceUniversity of Zurich

2Faculty of Computer ScienceFree University of Bozen-Bolzano

SFScon13

sfscon 2013 1/20 Anton Dignos

Page 2: Anton Dignös - Towards a Temporal PostgresSQL

Temporal Example

We have: Projects managed by departments

projName Dept Budg Start End

P1 M 10k Jan Dec

P2 PH 7k Feb Aug

P3 CS 5k Jun Dec

sfscon 2013 2/20 Anton Dignos

Page 3: Anton Dignös - Towards a Temporal PostgresSQL

Temporal Example

We have: Projects managed by departments

projName Dept Budg Start End

P1 M 10k Jan Dec

P2 PH 7k Feb Aug

P3 CS 5k Jun Dec

Question: What are the top-2 time periods with most concurrentprojects?

sfscon 2013 2/20 Anton Dignos

Page 4: Anton Dignös - Towards a Temporal PostgresSQL

Temporal Example

We have: Projects managed by departments

projName Dept Budg Start End

P1 M 10k Jan Dec

P2 PH 7k Feb Aug

P3 CS 5k Jun Dec

Question: What are the top-2 time periods with most concurrentprojects?

Count Start End

3 Jun Aug

2 Feb Jun

sfscon 2013 2/20 Anton Dignos

Page 5: Anton Dignös - Towards a Temporal PostgresSQL

Temporal Example

We have: Projects managed by departments

projName Dept Budg Start End

P1 M 10k Jan Dec

P2 PH 7k Feb Aug

P3 CS 5k Jun Dec

Question: What are the top-2 time periods with most concurrentprojects?

Count Start End

3 Jun Aug

2 Feb Jun

Counting procedure: 1 @ Jan, 2 @ Feb, 2 @ Mar, . . .

sfscon 2013 2/20 Anton Dignos

Page 6: Anton Dignös - Towards a Temporal PostgresSQL

Some Facts about our WorkI 4 years of intensive research work

I 1 year of my master (Free University of Bozen-Bolzano)I 3 years of my Ph.D (University of Zurich)

I Published in top-3 DB conferences with acceptance rate below 20%I Published and presented at SIGMOD’12 in Scottsdale, Arizona, USAI Demonstrated at ICDE’13 in Brisbane, Queensland, Australia

I Widely adopted in the database communityI Initially we have developed an SQL language extensionI SQL extension was selected and proposed as amendment to the

ANSI/ISO standardization committeeI SQL amendment was adapted and partially implemented by Teradata

sfscon 2013 3/20 Anton Dignos

Page 7: Anton Dignös - Towards a Temporal PostgresSQL

Table of Content

Why Time?

The Temporal Database Field

Our Solution

Summary and Vision

sfscon 2013 4/20 Anton Dignos

Page 8: Anton Dignös - Towards a Temporal PostgresSQL

Why Time? /1

Ubiquitous: All information is qualified with a time interval

I medical records

I loans

I transport information

I . . .

Gain: Additional Information

I Prediction

I Analysis

I Strategy planning

I Accountability

sfscon 2013 5/20 Anton Dignos

Page 9: Anton Dignös - Towards a Temporal PostgresSQL

Why Time? /2

Projects with their department manager

Mgr Dept Name Budg Start End

Ann M P1 10k Jan Jun

Sam PH P2 7k Feb Aug

Ann CS P3 5k Jun Dec

Joe M P1 10k Jun Dec

Additional Information:

I Ann supervised P1 before Joe

I Ann supervised two projects in total

I Joe did not supervise the entire P1

I There was a project P2 in the past supervised by Sam

I . . .

sfscon 2013 6/20 Anton Dignos

Page 10: Anton Dignös - Towards a Temporal PostgresSQL

The Temporal Database Field /1

I Active research field since the 1980s

I Many language proposals (TQuel, IXSQL, . . . )

I Consensus language TSQL2 (1992)

I SQL/Temporal official amendment of SQL3

TQuel

IXSQL

TempSQL

HSQL...

TSQL2

ChronoLogChronoSQL

SQL/Temporal

ATSQL

SQL/TP

statement modifiers

Teradata

Lack of implementations and working solutions

sfscon 2013 7/20 Anton Dignos

Page 11: Anton Dignös - Towards a Temporal PostgresSQL

The Temporal Database Field /2

I Support for temporal data varies a lot depending on database vendor(Order from most (1.) to least (5.) support)

1. Teradata2. Oracle DB3. IBM DB24. PostgreSQL5. Microsoft SQL Server

Time Infrastructure

Time Processing

Datatype

and

Functions

Temporal DB

Time Travel

MS SQL Server

PostgreSQL

Oracle DB

IBM DB2

SAP Hana

Teradata

PostgreSQL

Our goal is to advance PostgreSQL into a leading position

sfscon 2013 8/20 Anton Dignos

Page 12: Anton Dignös - Towards a Temporal PostgresSQL

The Temporal Database Field /2

I Support for temporal data varies a lot depending on database vendor(Order from most (1.) to least (5.) support)

1. Teradata2. Oracle DB3. IBM DB24. PostgreSQL5. Microsoft SQL Server

Time Infrastructure

Time Processing

Datatype

and

Functions

Temporal DB

Time Travel

MS SQL Server

PostgreSQL

Oracle DB

IBM DB2

SAP Hana

Teradata

PostgreSQL

Our goal is to advance PostgreSQL into a leading positionsfscon 2013 8/20 Anton Dignos

Page 13: Anton Dignös - Towards a Temporal PostgresSQL

Microsoft SQL ServerI Very limited support for time

I Date datatypes and some functionsI No support for intervals

I David Lomet (MS Research)- Immortal DB1 (2002)I Transaction time support for SQL ServerI Prototype

I Workaround proposed by Itzik Ben-Gan et al.2 (2009)

(1) Transform intervals into points; (2) perform operations onpoints; (3) transform points into intervals

Workaround is inefficient and does not consider intervals

1http://research.microsoft.com/en-us/projects/immortaldb/2Itzik Ben-Gan et al., Inside Microsoft SQL Server 2008: T-SQL Programming,

Chap. 12 Temporal Support in the Relational Model, MSPress, 2009sfscon 2013 9/20 Anton Dignos

Page 14: Anton Dignös - Towards a Temporal PostgresSQL

PostgreSQLI Jeff Davis - Temporal Postgres3 (2007)

I Interval datatype and UDF functions on intervalsI Indexing via GiST index

I PostgreSQL release 9.24 (2012)I Range TypesI Indexing via GiST or SP-GiST indexI Constraints on Ranges, i.e., temporal key constraints

No support for time travel, no support for temporal queries

3http://temporal.projects.pgfoundry.org/4http://www.postgresql.org/docs/9.2/static/rangetypes.html

sfscon 2013 10/20 Anton Dignos

Page 15: Anton Dignös - Towards a Temporal PostgresSQL

IBM DB2I Temporal extension added as of IBM DB2 10 for Z/OS5 (2010)

I Support for time travel:I SYSTEM TIME AS OFI SYSTEM TIME FROM...TO...I SYSTEM TIME BETWEEN...AND...

I Technology: Current and history tables

Support for time travel, no support for temporal queries

5https://www.ibm.com/developerworks/data/library/techarticle/

dm-1204db2temporaldata/sfscon 2013 11/20 Anton Dignos

Page 16: Anton Dignös - Towards a Temporal PostgresSQL

Oracle DBI Temporal extension added via workspace manager as of Oracle DB

9i6 (2001)I Support for time travel:

I WHERE AS OFI SetValidTime

I Technology: Flashback

Support for time travel, no support for temporal queries

6http://download.oracle.com/docs/cd/B28359_01/appdev.111/b28396.pdfsfscon 2013 12/20 Anton Dignos

Page 17: Anton Dignös - Towards a Temporal PostgresSQL

TeradataI Temporal support added as of Teradata 13.107 (2010)

I Currently DB with most support for timeI Time travel similar to IBM DB2 and Oracle DBI Implements ANSI Temporal SQL (1992-1999)I Technology: Translation of queries at SQL level (Al-Kateb et al.

EDBT ’13)

Support of time travel, partial support for temporal queries

7http://www.info.teradata.com/do_redirect.cfm?itemid=102320064sfscon 2013 13/20 Anton Dignos

Page 18: Anton Dignös - Towards a Temporal PostgresSQL

Our Solution - Splitting of Intervals

Same project data drawn on a timeline

Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec t

P2, PH, 7k

P1, M, 10k

P3, CS, 5kproj

1

22

3

2

I Intervals in input and output are not the same

I Requires splitting of intervals

sfscon 2013 14/20 Anton Dignos

Page 19: Anton Dignös - Towards a Temporal PostgresSQL

Key Insight and Solution

Key Insight

I Databases are not good at dealing with interval queries

Idea

I Provide temporal primitives to split intervals

I After splitting use traditional database operators with equality oninterval fragments

SolutionI Two temporal primitives are required

I Normalization NI Alignment φ

I Reduction rules at algebraic level reduce temporal operations totemporal primitives and traditional database operations

sfscon 2013 15/20 Anton Dignos

Page 20: Anton Dignös - Towards a Temporal PostgresSQL

Reduction Rules

Blueprint for database programmers

Operator Reduction

Selection σTθ (r) = σθ(r)

Projection πTB(r) = πB,T (N r.B=s.B(r, r/s))

Aggregation BϑTF (r) = B,TϑF (N r.B=s.B(r, r/s))

Difference r −T s = N r.A=s.A(r, s) − N r.A=s.A(s, r)

Union r ∪T s = N r.A=s.A(r, s) ∪ N r.A=s.A(s, r)

Intersection r ∩T s = N r.A=s.A(r, s) ∩ N r.A=s.A(s, r)

Cart. Prod. r ×T s = α((φ>(r, s))onr.T=s.T (φ>(s, r)))

Inner Join r onTθ s = α((φθ(r, s))onθ∧r.T=s.T (φθ(s, r)))

Left O. Join r d|><|Tθ s = α((φθ(r, s)) d|><| θ∧r.T=s.T (φθ(s, r)))

Right O. Join r |><|dTθ s = α((φθ(r, s)) |><|d θ∧r.T=s.T (φθ(s, r)))

Full O. Join r d|><|dTθ s = α((φθ(r, s)) d|><|d θ∧r.T=s.T (φθ(s, r)))

Anti Join r BTθ s = (φθ(r, s))Bθ∧r.T=s.T (φθ(s, r))

Temporal Op. = Primitive + Traditional Op.

sfscon 2013 16/20 Anton Dignos

Page 21: Anton Dignös - Towards a Temporal PostgresSQL

PostgreSQL Implementation /1

PostgreSQL prototype with implemented primitives availablehttp://www.ifi.uzh.ch/dbtg/research/align.html

SQL

DBMS8

Parser60kloc150

Analyzer/Rewriter20kloc450

Optimizer50kloc150

Executor40kloc400

Files and Access Methods

Buffer Manager

Disk Manager

RecoveryManager

RecoveryManager

LockManager

Data and Index Files

8Image: Raghu Ramakrishnan and Johannes Gehrke. Database Management Systems. McGraw-Hill 2003

sfscon 2013 17/20 Anton Dignos

Page 22: Anton Dignös - Towards a Temporal PostgresSQL

PostgreSQL Implementation /2

Implementation Approach:I Temporal primitives are implemented into query flow

I Temporal primitives are nodes in query/plan/executor treesI Primitives themselves reuse traditional database operationsI Only one new Executor function

sfscon 2013 18/20 Anton Dignos

Page 23: Anton Dignös - Towards a Temporal PostgresSQL

PostgreSQL Implementation /2

Implementation Approach:I Temporal primitives are implemented into query flow

I Temporal primitives are nodes in query/plan/executor treesI Primitives themselves reuse traditional database operationsI Only one new Executor function

Advantages:I Temporal primitives are optimized within the plan tree

I Cost estimation XI (Join) order XI Selection push-down XI Propagate orderings X

I Traditional database operations are optimized out of the box X

I Local potential for performance improvements (work in progress . . . )

sfscon 2013 18/20 Anton Dignos

Page 24: Anton Dignös - Towards a Temporal PostgresSQL

PostgreSQL Implementation /2

Implementation Approach:I Temporal primitives are implemented into query flow

I Temporal primitives are nodes in query/plan/executor treesI Primitives themselves reuse traditional database operationsI Only one new Executor function

Advantages:I Temporal primitives are optimized within the plan tree

I Cost estimation XI (Join) order XI Selection push-down XI Propagate orderings X

I Traditional database operations are optimized out of the box X

I Local potential for performance improvements (work in progress . . . )

sfscon 2013 18/20 Anton Dignos

Page 25: Anton Dignös - Towards a Temporal PostgresSQL

PostgreSQL Implementation /2

Implementation Approach:I Temporal primitives are implemented into query flow

I Temporal primitives are nodes in query/plan/executor treesI Primitives themselves reuse traditional database operationsI Only one new Executor function

Advantages:I Temporal primitives are optimized within the plan tree

I Cost estimation XI (Join) order XI Selection push-down XI Propagate orderings X

I Traditional database operations are optimized out of the box X

I Local potential for performance improvements (work in progress . . . )

sfscon 2013 18/20 Anton Dignos

Page 26: Anton Dignös - Towards a Temporal PostgresSQL

SQL Example

Query: What is the number of concurrent projects per department?

SELECT Dept, COUNT(*)

FROM proj

GROUP BY Dept

SELECT Dept, COUNT(*), Start, End

FROM (proj NORMALIZE proj USING (Dept)) pnrom

GROUP BY Dept, Start, End

sfscon 2013 19/20 Anton Dignos

Page 27: Anton Dignös - Towards a Temporal PostgresSQL

SQL Example

Query: What is the number of concurrent projects per department?

SELECT Dept, COUNT(*)

FROM proj

GROUP BY Dept

Operator Reduction

Aggregation BϑTF (r) = B,TϑF (N r.B=s.B(r, r/s))

SELECT Dept, COUNT(*), Start, End

FROM (proj NORMALIZE proj USING (Dept)) pnrom

GROUP BY Dept, Start, End

sfscon 2013 19/20 Anton Dignos

Page 28: Anton Dignös - Towards a Temporal PostgresSQL

SQL Example

Query: What is the number of concurrent projects per department?

SELECT Dept, COUNT(*)

FROM proj

GROUP BY Dept

Operator Reduction

Aggregation BϑTF (r) = B,TϑF (N r.B=s.B(r, r/s))

SELECT Dept, COUNT(*), Start, End

FROM

(proj NORMALIZE proj USING (Dept)) pnrom

GROUP BY Dept, Start, End

sfscon 2013 19/20 Anton Dignos

Page 29: Anton Dignös - Towards a Temporal PostgresSQL

SQL Example

Query: What is the number of concurrent projects per department?

SELECT Dept, COUNT(*)

FROM proj

GROUP BY Dept

Operator Reduction

Aggregation BϑTF (r) = B,TϑF (N r.B=s.B(r, r/s))

SELECT Dept, COUNT(*), Start, End

FROM (proj NORMALIZE proj USING (Dept)) pnrom

GROUP BY Dept, Start, End

sfscon 2013 19/20 Anton Dignos

Page 30: Anton Dignös - Towards a Temporal PostgresSQL

SQL Example

Query: What is the number of concurrent projects per department?

SELECT Dept, COUNT(*)

FROM proj

GROUP BY Dept

Operator Reduction

Aggregation BϑTF (r) = B,TϑF (N r.B=s.B(r, r/s))

SELECT Dept, COUNT(*), Start, End

FROM (proj NORMALIZE proj USING (Dept)) pnrom

GROUP BY Dept, Start, End

Reduction rules are systematic and mechanic!

sfscon 2013 19/20 Anton Dignos

Page 31: Anton Dignös - Towards a Temporal PostgresSQL

Summary and Vision

Currently

I PostgreSQL prototype with implemented primitives availablehttp://www.ifi.uzh.ch/dbtg/research/align.html

I Supports all temporal queries

I Evaluation shows good performance

I Working on additional index structures

sfscon 2013 20/20 Anton Dignos

Page 32: Anton Dignös - Towards a Temporal PostgresSQL

Summary and Vision

Currently

I PostgreSQL prototype with implemented primitives availablehttp://www.ifi.uzh.ch/dbtg/research/align.html

I Supports all temporal queries

I Evaluation shows good performance

I Working on additional index structures

Vision

Integrate temporal primitives for temporal queries into the PostgreSQLrelease

sfscon 2013 20/20 Anton Dignos

Page 33: Anton Dignös - Towards a Temporal PostgresSQL

Summary and Vision

Currently

I PostgreSQL prototype with implemented primitives availablehttp://www.ifi.uzh.ch/dbtg/research/align.html

I Supports all temporal queries

I Evaluation shows good performance

I Working on additional index structures

Vision

Integrate temporal primitives for temporal queries into the PostgreSQLrelease

Thank you for your attention!

sfscon 2013 20/20 Anton Dignos