20101217 mtg
DESCRIPTION
TRANSCRIPT
Minimization of SuperSQL Execution Time by Query
Decomposition Toyama Laboratory
Midterm Presentation
Ria Mae Borromeo25 December 2010
2
an extension of SQL that provides the capability of generating various kinds of application data directly as a result of a query
Introduction (1)
4
Introduction (2)GENERATE HTML { { imagefile(path="C:\webpage", "prc_logoback.gif"), imagefile(path="C:\webpage", "main_top02.gif") @{valign=middle, align=center} } @{bgcolor="#296793", width="600"}! ""@{border-width=0, height=20}! "August 2010 Physician Licensure Exam"@{font-size=20, align="center"}! { {"List of Exams"@{align="center", font-size=14}! [link(t.t_name@{font-size=12, width=150}, file="", att="")@{border-width=0}]! }@{valign=top}, {"List of Passers"@{align="center", font-size=14}! [e.e_name@{font-size=10, border-width=0}]! }@{width=330, align=left, valign=top}, {"List of Schools"@{align="center", font-size=14}! [s.s_name@{font-size=10, border-width=0}]! }@{width=330, align=left, valign=top} }@{align="center"}} FROM examinee e, school s, examtype t
Remove decorations
5
Introduction (3)
GENERATE HTML { {"List of Exams"![t.t_name]!}, {"List of Passers"![e.e_name]!}, {"List of Schools"![s.s_name]!}} FROM examtype t, examinee e, school s
SELECT DISTINCT t.t_name, e.e_name, s.s_name, FROM examtype t, examinee e, school s
SQL Query
Problem Illustration
SELECT DISTINCT t.t_name, e.e_name, s.s_name, FROM examtype t, examinee e, school s
t
(20)
e
(250)
s
(50)
x x
t e s
…
Expected Output
Intermediate Table (250,000)20 + 250 + 50 = 320 tuples
6
Proposal (1)
SELECT t_nameFROM examtype
SELECT e_nameFROM examinee
SELECT s_nameFROM school
GENERATE HTML { {"List of Exams"![t.t_name]!}, {"List of Passers"![e.e_name]!}, {"List of Schools"![s.s_name]!}} FROM examtype t, examinee e, school s
t_name
(20)
e_name
(250)
s_name
(50)
Expected Output
… … …
SELECT DISTINCT t.t_name, e.e_name, s.s_name, FROM examtype t, examinee e, school s
8
Proposal (2)
3 Parts1. Detection of divisible queries2. Division of Queries
▪ Creation of SQL queries▪ Execution of SQL queries
3. Combination of query results
start
SuperSQL
query
Make SQL Statemen
t
Parse Query
Make Schema
Execute SQL Statement
Make Tree Structure
Generate Code
Application Data
end9O
rig
inal P
rocess Parse Query
Page decoration and query attributes are separated
Parse Query
Make Schema
Data
C
on
str
ucti
on
10
Schema
A tree-structured representation of the layout of the attributes
[ [t.t_name], [e.e_name], [s.s_name] ]
start
SuperSQL
query
Make SQL Statemen
t
Parse Query
Make Schema
Execute SQL Statement
Make Tree Structure
Generate Code
Application Data
end11O
rig
inal P
rocess Parse Query
Page decoration and query attributes are separated
Make Tree Structure The schema and the
tuples from the database are combined into a tree-structure
Generate Code Using the tree
structure, code for application data is generated
Parse Query
Make Schema
Make SQL Statemen
t
Execute SQL Statement
Make Tree Structure
Generate Code
Data
C
on
str
ucti
on
start
SuperSQL
query
Check Divisibility
Execute SQL
Statements
Combine results into
a tree structure
Generate Code
Application Data
end
Make SQL Statements
YN
12Pro
posed
Pro
cess
Make SQL Statemen
t
Make Schema
Execute SQL Statement
Make Tree Structure
Parse Query
Check Divisibility
13
Check Divisibility Algorithm
[ [t.t_name], [e.e_name], [s.s_name] ]
1. Get the attributes from the schemaGENERATE HTML { {"List of Exams"![t.t_name]!}, {"List of Passers"![e.e_name]!}, {"List of Schools"![s.s_name]!}} FROM examtype t, examinee e, school s
Schema
14
Check Divisibility Algorithm
2. Make a graph to identify relationshipst.t_nam
es.s_nam
ee.e_nam
e
Vertices : attributesEdges:
If two attributes are from the same table If two attributes are equated in a where
condition If two attributes are grouped in the
schema
15
Check Divisibility Algorithm
3. Find connected components using Depth-First Search
t.t_name
s.s_name
e.e_name
Connected Component - a subgraph that contains a path between all pairs of vertices in the subgraph
Depth-First Search - A search algorithm that extends the current path as far as possible before backtracking to the last choice point and trying the next alternative path.
16
Example 2
GENERATE HTML [ f.title !{ { { "Director" ! f.dir } ! { "Running" ! f.length } }, { "Starring" , [ r.name % {r.name , { "Birthday" , r.bday }! "Biography" ! r.bio } } ]! }}! f.story ]!FROM film f, actor r, act aWHERE a.film = f.idAND a.act = r.id
1. Get the attributes from the schema
[[f.title, f.dir, f.length, [actor.name, actor.name, actor.birthday, actor.biography], f.story]]
17
Example 2
2. Make a graph to identify relationships
f.title
f.dir
f.len
f.story
f.id
r.name
r.bday
r.bio
r.ida.film
a.act
18
Example 2
3. Find connected components by Depth-First Search
f.title
f.dir
f.len
f.story
f.id
r.name
r.bday
r.bio
r.ida.film
a.act
f.lenf.title
f.dir
f.id
a.film
a.act
r.id
r.bio
r.bday
r.name
start
SuperSQL
query
Check Divisibility
Execute SQL
Statements
Combine results into
a tree structure
Generate Code
Application Data
end
Make SQL Statements
YN
19Pro
posed
Pro
cess
Make SQL Statemen
t
Make Schema
Execute SQL Statement
Make Tree Structure
Parse Query
Check Divisibility Make SQL
Statements
Execute SQL
Statements
Combine results into
a tree structure
20
Current Status
3 Parts1. Detection of divisible
queries2. Division of Queries
▪ Creation of SQL queries▪ Execution of SQL queries
3. Combination of query results
▪ Trivial case: columns are independent of each other
▪ Others
21
Experiments (1)
Intel (R) Core (TM)2 Duo CPU 2.09 GHz 1.97 GB RAM Microsoft Windows XP Professional
2002 SP3 Java 6 Update 22 Standard Edition
22
Experiments (2)
No. of Queries: 25
No. of Attributes per query
2 – 10
Best Case Running Time (ms)
4.043531
Worst Case Running Time (ms)
7.87614
Average Running Time (ms)
5.41567164
Standard Deviation (ms)
1.153155682
Check Divisibility Function
23
Experiments (3)
10 15 20 25
Orig-inal
195.8656734
194.7371448
199.2996856
207.803956
New 197.955362
194.8125736
201.3419246
201.3696432
187.5192.5197.5202.5207.5
Data Construction Time of Small Intermediate Table Size
Tim
e (
ms)
No. of Attributes
24
Experiments (4)
25000 50000 75000 100000 125000
Origi-nal
4503.630762
9322.29402
14360.094452
29402.622868
39187.224832
New
308.163948
296.253179
305.315075
308.774652
316.301103
2500
17500
32500
Data Construction Time of Large Intermediate Table Size
Execu
tion
Tim
e (
ms)
No. of Attributes
25
Summary
Implemented an algorithm for checking if SuperSQL queries can be decomposed into several SQL Queries
Queries are decomposed into several SQL queries when possible
Experiments show that the procedure reduces the execution time of SuperSQL in some query cases
27
Future Work
Continue implementation of the MakeTree function to address other query classes
Testing
More experiments
Statistical Analysis of data
28
Comments from Panel
Research Suggestions: Use database information For the example, use embedded system (with
limited memory) that generates webpages
Presentation Suggestion: Take note of the number of decimal points in the
data
Question: How can you be sure that the output generated by
the new one is the same as the original? It can be verified by checking the tree structure
Thank you for listening.