anton dignös - towards a temporal postgressql
TRANSCRIPT
Towards a Temporal PostgreSQLIncorporating Primitives for Interval Processing into
PostgreSQL
Anton Dignos1 Michael Bohlen1 Johann Gamper2
1Department of Computer ScienceUniversity of Zurich
2Faculty of Computer ScienceFree University of Bozen-Bolzano
SFScon13
sfscon 2013 1/20 Anton Dignos
Temporal Example
We have: Projects managed by departments
projName Dept Budg Start End
P1 M 10k Jan Dec
P2 PH 7k Feb Aug
P3 CS 5k Jun Dec
sfscon 2013 2/20 Anton Dignos
Temporal Example
We have: Projects managed by departments
projName Dept Budg Start End
P1 M 10k Jan Dec
P2 PH 7k Feb Aug
P3 CS 5k Jun Dec
Question: What are the top-2 time periods with most concurrentprojects?
sfscon 2013 2/20 Anton Dignos
Temporal Example
We have: Projects managed by departments
projName Dept Budg Start End
P1 M 10k Jan Dec
P2 PH 7k Feb Aug
P3 CS 5k Jun Dec
Question: What are the top-2 time periods with most concurrentprojects?
Count Start End
3 Jun Aug
2 Feb Jun
sfscon 2013 2/20 Anton Dignos
Temporal Example
We have: Projects managed by departments
projName Dept Budg Start End
P1 M 10k Jan Dec
P2 PH 7k Feb Aug
P3 CS 5k Jun Dec
Question: What are the top-2 time periods with most concurrentprojects?
Count Start End
3 Jun Aug
2 Feb Jun
Counting procedure: 1 @ Jan, 2 @ Feb, 2 @ Mar, . . .
sfscon 2013 2/20 Anton Dignos
Some Facts about our WorkI 4 years of intensive research work
I 1 year of my master (Free University of Bozen-Bolzano)I 3 years of my Ph.D (University of Zurich)
I Published in top-3 DB conferences with acceptance rate below 20%I Published and presented at SIGMOD’12 in Scottsdale, Arizona, USAI Demonstrated at ICDE’13 in Brisbane, Queensland, Australia
I Widely adopted in the database communityI Initially we have developed an SQL language extensionI SQL extension was selected and proposed as amendment to the
ANSI/ISO standardization committeeI SQL amendment was adapted and partially implemented by Teradata
sfscon 2013 3/20 Anton Dignos
Table of Content
Why Time?
The Temporal Database Field
Our Solution
Summary and Vision
sfscon 2013 4/20 Anton Dignos
Why Time? /1
Ubiquitous: All information is qualified with a time interval
I medical records
I loans
I transport information
I . . .
Gain: Additional Information
I Prediction
I Analysis
I Strategy planning
I Accountability
sfscon 2013 5/20 Anton Dignos
Why Time? /2
Projects with their department manager
Mgr Dept Name Budg Start End
Ann M P1 10k Jan Jun
Sam PH P2 7k Feb Aug
Ann CS P3 5k Jun Dec
Joe M P1 10k Jun Dec
Additional Information:
I Ann supervised P1 before Joe
I Ann supervised two projects in total
I Joe did not supervise the entire P1
I There was a project P2 in the past supervised by Sam
I . . .
sfscon 2013 6/20 Anton Dignos
The Temporal Database Field /1
I Active research field since the 1980s
I Many language proposals (TQuel, IXSQL, . . . )
I Consensus language TSQL2 (1992)
I SQL/Temporal official amendment of SQL3
TQuel
IXSQL
TempSQL
HSQL...
TSQL2
ChronoLogChronoSQL
SQL/Temporal
ATSQL
SQL/TP
statement modifiers
Teradata
Lack of implementations and working solutions
sfscon 2013 7/20 Anton Dignos
The Temporal Database Field /2
I Support for temporal data varies a lot depending on database vendor(Order from most (1.) to least (5.) support)
1. Teradata2. Oracle DB3. IBM DB24. PostgreSQL5. Microsoft SQL Server
Time Infrastructure
Time Processing
Datatype
and
Functions
Temporal DB
Time Travel
MS SQL Server
PostgreSQL
Oracle DB
IBM DB2
SAP Hana
Teradata
PostgreSQL
Our goal is to advance PostgreSQL into a leading position
sfscon 2013 8/20 Anton Dignos
The Temporal Database Field /2
I Support for temporal data varies a lot depending on database vendor(Order from most (1.) to least (5.) support)
1. Teradata2. Oracle DB3. IBM DB24. PostgreSQL5. Microsoft SQL Server
Time Infrastructure
Time Processing
Datatype
and
Functions
Temporal DB
Time Travel
MS SQL Server
PostgreSQL
Oracle DB
IBM DB2
SAP Hana
Teradata
PostgreSQL
Our goal is to advance PostgreSQL into a leading positionsfscon 2013 8/20 Anton Dignos
Microsoft SQL ServerI Very limited support for time
I Date datatypes and some functionsI No support for intervals
I David Lomet (MS Research)- Immortal DB1 (2002)I Transaction time support for SQL ServerI Prototype
I Workaround proposed by Itzik Ben-Gan et al.2 (2009)
(1) Transform intervals into points; (2) perform operations onpoints; (3) transform points into intervals
Workaround is inefficient and does not consider intervals
1http://research.microsoft.com/en-us/projects/immortaldb/2Itzik Ben-Gan et al., Inside Microsoft SQL Server 2008: T-SQL Programming,
Chap. 12 Temporal Support in the Relational Model, MSPress, 2009sfscon 2013 9/20 Anton Dignos
PostgreSQLI Jeff Davis - Temporal Postgres3 (2007)
I Interval datatype and UDF functions on intervalsI Indexing via GiST index
I PostgreSQL release 9.24 (2012)I Range TypesI Indexing via GiST or SP-GiST indexI Constraints on Ranges, i.e., temporal key constraints
No support for time travel, no support for temporal queries
3http://temporal.projects.pgfoundry.org/4http://www.postgresql.org/docs/9.2/static/rangetypes.html
sfscon 2013 10/20 Anton Dignos
IBM DB2I Temporal extension added as of IBM DB2 10 for Z/OS5 (2010)
I Support for time travel:I SYSTEM TIME AS OFI SYSTEM TIME FROM...TO...I SYSTEM TIME BETWEEN...AND...
I Technology: Current and history tables
Support for time travel, no support for temporal queries
5https://www.ibm.com/developerworks/data/library/techarticle/
dm-1204db2temporaldata/sfscon 2013 11/20 Anton Dignos
Oracle DBI Temporal extension added via workspace manager as of Oracle DB
9i6 (2001)I Support for time travel:
I WHERE AS OFI SetValidTime
I Technology: Flashback
Support for time travel, no support for temporal queries
6http://download.oracle.com/docs/cd/B28359_01/appdev.111/b28396.pdfsfscon 2013 12/20 Anton Dignos
TeradataI Temporal support added as of Teradata 13.107 (2010)
I Currently DB with most support for timeI Time travel similar to IBM DB2 and Oracle DBI Implements ANSI Temporal SQL (1992-1999)I Technology: Translation of queries at SQL level (Al-Kateb et al.
EDBT ’13)
Support of time travel, partial support for temporal queries
7http://www.info.teradata.com/do_redirect.cfm?itemid=102320064sfscon 2013 13/20 Anton Dignos
Our Solution - Splitting of Intervals
Same project data drawn on a timeline
Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec t
P2, PH, 7k
P1, M, 10k
P3, CS, 5kproj
1
22
3
2
I Intervals in input and output are not the same
I Requires splitting of intervals
sfscon 2013 14/20 Anton Dignos
Key Insight and Solution
Key Insight
I Databases are not good at dealing with interval queries
Idea
I Provide temporal primitives to split intervals
I After splitting use traditional database operators with equality oninterval fragments
SolutionI Two temporal primitives are required
I Normalization NI Alignment φ
I Reduction rules at algebraic level reduce temporal operations totemporal primitives and traditional database operations
sfscon 2013 15/20 Anton Dignos
Reduction Rules
Blueprint for database programmers
Operator Reduction
Selection σTθ (r) = σθ(r)
Projection πTB(r) = πB,T (N r.B=s.B(r, r/s))
Aggregation BϑTF (r) = B,TϑF (N r.B=s.B(r, r/s))
Difference r −T s = N r.A=s.A(r, s) − N r.A=s.A(s, r)
Union r ∪T s = N r.A=s.A(r, s) ∪ N r.A=s.A(s, r)
Intersection r ∩T s = N r.A=s.A(r, s) ∩ N r.A=s.A(s, r)
Cart. Prod. r ×T s = α((φ>(r, s))onr.T=s.T (φ>(s, r)))
Inner Join r onTθ s = α((φθ(r, s))onθ∧r.T=s.T (φθ(s, r)))
Left O. Join r d|><|Tθ s = α((φθ(r, s)) d|><| θ∧r.T=s.T (φθ(s, r)))
Right O. Join r |><|dTθ s = α((φθ(r, s)) |><|d θ∧r.T=s.T (φθ(s, r)))
Full O. Join r d|><|dTθ s = α((φθ(r, s)) d|><|d θ∧r.T=s.T (φθ(s, r)))
Anti Join r BTθ s = (φθ(r, s))Bθ∧r.T=s.T (φθ(s, r))
Temporal Op. = Primitive + Traditional Op.
sfscon 2013 16/20 Anton Dignos
PostgreSQL Implementation /1
PostgreSQL prototype with implemented primitives availablehttp://www.ifi.uzh.ch/dbtg/research/align.html
SQL
DBMS8
Parser60kloc150
Analyzer/Rewriter20kloc450
Optimizer50kloc150
Executor40kloc400
Files and Access Methods
Buffer Manager
Disk Manager
RecoveryManager
RecoveryManager
LockManager
Data and Index Files
8Image: Raghu Ramakrishnan and Johannes Gehrke. Database Management Systems. McGraw-Hill 2003
sfscon 2013 17/20 Anton Dignos
PostgreSQL Implementation /2
Implementation Approach:I Temporal primitives are implemented into query flow
I Temporal primitives are nodes in query/plan/executor treesI Primitives themselves reuse traditional database operationsI Only one new Executor function
sfscon 2013 18/20 Anton Dignos
PostgreSQL Implementation /2
Implementation Approach:I Temporal primitives are implemented into query flow
I Temporal primitives are nodes in query/plan/executor treesI Primitives themselves reuse traditional database operationsI Only one new Executor function
Advantages:I Temporal primitives are optimized within the plan tree
I Cost estimation XI (Join) order XI Selection push-down XI Propagate orderings X
I Traditional database operations are optimized out of the box X
I Local potential for performance improvements (work in progress . . . )
sfscon 2013 18/20 Anton Dignos
PostgreSQL Implementation /2
Implementation Approach:I Temporal primitives are implemented into query flow
I Temporal primitives are nodes in query/plan/executor treesI Primitives themselves reuse traditional database operationsI Only one new Executor function
Advantages:I Temporal primitives are optimized within the plan tree
I Cost estimation XI (Join) order XI Selection push-down XI Propagate orderings X
I Traditional database operations are optimized out of the box X
I Local potential for performance improvements (work in progress . . . )
sfscon 2013 18/20 Anton Dignos
PostgreSQL Implementation /2
Implementation Approach:I Temporal primitives are implemented into query flow
I Temporal primitives are nodes in query/plan/executor treesI Primitives themselves reuse traditional database operationsI Only one new Executor function
Advantages:I Temporal primitives are optimized within the plan tree
I Cost estimation XI (Join) order XI Selection push-down XI Propagate orderings X
I Traditional database operations are optimized out of the box X
I Local potential for performance improvements (work in progress . . . )
sfscon 2013 18/20 Anton Dignos
SQL Example
Query: What is the number of concurrent projects per department?
SELECT Dept, COUNT(*)
FROM proj
GROUP BY Dept
SELECT Dept, COUNT(*), Start, End
FROM (proj NORMALIZE proj USING (Dept)) pnrom
GROUP BY Dept, Start, End
sfscon 2013 19/20 Anton Dignos
SQL Example
Query: What is the number of concurrent projects per department?
SELECT Dept, COUNT(*)
FROM proj
GROUP BY Dept
Operator Reduction
Aggregation BϑTF (r) = B,TϑF (N r.B=s.B(r, r/s))
SELECT Dept, COUNT(*), Start, End
FROM (proj NORMALIZE proj USING (Dept)) pnrom
GROUP BY Dept, Start, End
sfscon 2013 19/20 Anton Dignos
SQL Example
Query: What is the number of concurrent projects per department?
SELECT Dept, COUNT(*)
FROM proj
GROUP BY Dept
Operator Reduction
Aggregation BϑTF (r) = B,TϑF (N r.B=s.B(r, r/s))
SELECT Dept, COUNT(*), Start, End
FROM
(proj NORMALIZE proj USING (Dept)) pnrom
GROUP BY Dept, Start, End
sfscon 2013 19/20 Anton Dignos
SQL Example
Query: What is the number of concurrent projects per department?
SELECT Dept, COUNT(*)
FROM proj
GROUP BY Dept
Operator Reduction
Aggregation BϑTF (r) = B,TϑF (N r.B=s.B(r, r/s))
SELECT Dept, COUNT(*), Start, End
FROM (proj NORMALIZE proj USING (Dept)) pnrom
GROUP BY Dept, Start, End
sfscon 2013 19/20 Anton Dignos
SQL Example
Query: What is the number of concurrent projects per department?
SELECT Dept, COUNT(*)
FROM proj
GROUP BY Dept
Operator Reduction
Aggregation BϑTF (r) = B,TϑF (N r.B=s.B(r, r/s))
SELECT Dept, COUNT(*), Start, End
FROM (proj NORMALIZE proj USING (Dept)) pnrom
GROUP BY Dept, Start, End
Reduction rules are systematic and mechanic!
sfscon 2013 19/20 Anton Dignos
Summary and Vision
Currently
I PostgreSQL prototype with implemented primitives availablehttp://www.ifi.uzh.ch/dbtg/research/align.html
I Supports all temporal queries
I Evaluation shows good performance
I Working on additional index structures
sfscon 2013 20/20 Anton Dignos
Summary and Vision
Currently
I PostgreSQL prototype with implemented primitives availablehttp://www.ifi.uzh.ch/dbtg/research/align.html
I Supports all temporal queries
I Evaluation shows good performance
I Working on additional index structures
Vision
Integrate temporal primitives for temporal queries into the PostgreSQLrelease
sfscon 2013 20/20 Anton Dignos
Summary and Vision
Currently
I PostgreSQL prototype with implemented primitives availablehttp://www.ifi.uzh.ch/dbtg/research/align.html
I Supports all temporal queries
I Evaluation shows good performance
I Working on additional index structures
Vision
Integrate temporal primitives for temporal queries into the PostgreSQLrelease
Thank you for your attention!
sfscon 2013 20/20 Anton Dignos