20101217 mtg

29
Minimization of SuperSQL Execution Time by Query Decomposition Toyama Laboratory Midterm Presentation Ria Mae Borromeo 25 December 2010

Upload: riamaehb

Post on 02-Dec-2014

458 views

Category:

Technology


4 download

DESCRIPTION

 

TRANSCRIPT

Page 1: 20101217 mtg

Minimization of SuperSQL Execution Time by Query

Decomposition Toyama Laboratory

Midterm Presentation

Ria Mae Borromeo25 December 2010

Page 2: 20101217 mtg

2

an extension of SQL that provides the capability of generating various kinds of application data directly as a result of a query

Introduction (1)

Page 3: 20101217 mtg
Page 4: 20101217 mtg

4

Introduction (2)GENERATE HTML { { imagefile(path="C:\webpage", "prc_logoback.gif"), imagefile(path="C:\webpage", "main_top02.gif") @{valign=middle, align=center} } @{bgcolor="#296793", width="600"}! ""@{border-width=0, height=20}! "August 2010 Physician Licensure Exam"@{font-size=20, align="center"}! { {"List of Exams"@{align="center", font-size=14}! [link(t.t_name@{font-size=12, width=150}, file="", att="")@{border-width=0}]! }@{valign=top}, {"List of Passers"@{align="center", font-size=14}! [e.e_name@{font-size=10, border-width=0}]! }@{width=330, align=left, valign=top}, {"List of Schools"@{align="center", font-size=14}! [s.s_name@{font-size=10, border-width=0}]! }@{width=330, align=left, valign=top} }@{align="center"}} FROM examinee e, school s, examtype t

Remove decorations

Page 5: 20101217 mtg

5

Introduction (3)

GENERATE HTML { {"List of Exams"![t.t_name]!}, {"List of Passers"![e.e_name]!}, {"List of Schools"![s.s_name]!}} FROM examtype t, examinee e, school s

SELECT DISTINCT t.t_name, e.e_name, s.s_name, FROM examtype t, examinee e, school s

SQL Query

Page 6: 20101217 mtg

Problem Illustration

SELECT DISTINCT t.t_name, e.e_name, s.s_name, FROM examtype t, examinee e, school s

t

(20)

e

(250)

s

(50)

x x

t e s

Expected Output

Intermediate Table (250,000)20 + 250 + 50 = 320 tuples

6

Page 7: 20101217 mtg

Proposal (1)

SELECT t_nameFROM examtype

SELECT e_nameFROM examinee

SELECT s_nameFROM school

GENERATE HTML { {"List of Exams"![t.t_name]!}, {"List of Passers"![e.e_name]!}, {"List of Schools"![s.s_name]!}} FROM examtype t, examinee e, school s

t_name

(20)

e_name

(250)

s_name

(50)

Expected Output

… … …

SELECT DISTINCT t.t_name, e.e_name, s.s_name, FROM examtype t, examinee e, school s

Page 8: 20101217 mtg

8

Proposal (2)

3 Parts1. Detection of divisible queries2. Division of Queries

▪ Creation of SQL queries▪ Execution of SQL queries

3. Combination of query results

Page 9: 20101217 mtg

start

SuperSQL

query

Make SQL Statemen

t

Parse Query

Make Schema

Execute SQL Statement

Make Tree Structure

Generate Code

Application Data

end9O

rig

inal P

rocess Parse Query

Page decoration and query attributes are separated

Parse Query

Make Schema

Data

C

on

str

ucti

on

Page 10: 20101217 mtg

10

Schema

A tree-structured representation of the layout of the attributes

[ [t.t_name], [e.e_name], [s.s_name] ]

Page 11: 20101217 mtg

start

SuperSQL

query

Make SQL Statemen

t

Parse Query

Make Schema

Execute SQL Statement

Make Tree Structure

Generate Code

Application Data

end11O

rig

inal P

rocess Parse Query

Page decoration and query attributes are separated

Make Tree Structure The schema and the

tuples from the database are combined into a tree-structure

Generate Code Using the tree

structure, code for application data is generated

Parse Query

Make Schema

Make SQL Statemen

t

Execute SQL Statement

Make Tree Structure

Generate Code

Data

C

on

str

ucti

on

Page 12: 20101217 mtg

start

SuperSQL

query

Check Divisibility

Execute SQL

Statements

Combine results into

a tree structure

Generate Code

Application Data

end

Make SQL Statements

YN

12Pro

posed

Pro

cess

Make SQL Statemen

t

Make Schema

Execute SQL Statement

Make Tree Structure

Parse Query

Check Divisibility

Page 13: 20101217 mtg

13

Check Divisibility Algorithm

[ [t.t_name], [e.e_name], [s.s_name] ]

1. Get the attributes from the schemaGENERATE HTML { {"List of Exams"![t.t_name]!}, {"List of Passers"![e.e_name]!}, {"List of Schools"![s.s_name]!}} FROM examtype t, examinee e, school s

Schema

Page 14: 20101217 mtg

14

Check Divisibility Algorithm

2. Make a graph to identify relationshipst.t_nam

es.s_nam

ee.e_nam

e

Vertices : attributesEdges:

If two attributes are from the same table If two attributes are equated in a where

condition If two attributes are grouped in the

schema

Page 15: 20101217 mtg

15

Check Divisibility Algorithm

3. Find connected components using Depth-First Search

t.t_name

s.s_name

e.e_name

Connected Component - a subgraph that contains a path between all pairs of vertices in the subgraph

Depth-First Search - A search algorithm that extends the current path as far as possible before backtracking to the last choice point and trying the next alternative path. 

Page 16: 20101217 mtg

16

Example 2

GENERATE HTML [ f.title !{ { { "Director" ! f.dir } ! { "Running" ! f.length } }, { "Starring" , [ r.name % {r.name , { "Birthday" , r.bday }! "Biography" ! r.bio } } ]! }}! f.story ]!FROM film f, actor r, act aWHERE a.film = f.idAND a.act = r.id

1. Get the attributes from the schema

[[f.title, f.dir, f.length, [actor.name, actor.name, actor.birthday, actor.biography], f.story]]

Page 17: 20101217 mtg

17

Example 2

2. Make a graph to identify relationships

f.title

f.dir

f.len

f.story

f.id

r.name

r.bday

r.bio

r.ida.film

a.act

Page 18: 20101217 mtg

18

Example 2

3. Find connected components by Depth-First Search

f.title

f.dir

f.len

f.story

f.id

r.name

r.bday

r.bio

r.ida.film

a.act

f.lenf.title

f.dir

f.id

a.film

a.act

r.id

r.bio

r.bday

r.name

Page 19: 20101217 mtg

start

SuperSQL

query

Check Divisibility

Execute SQL

Statements

Combine results into

a tree structure

Generate Code

Application Data

end

Make SQL Statements

YN

19Pro

posed

Pro

cess

Make SQL Statemen

t

Make Schema

Execute SQL Statement

Make Tree Structure

Parse Query

Check Divisibility Make SQL

Statements

Execute SQL

Statements

Combine results into

a tree structure

Page 20: 20101217 mtg

20

Current Status

3 Parts1. Detection of divisible

queries2. Division of Queries

▪ Creation of SQL queries▪ Execution of SQL queries

3. Combination of query results

▪ Trivial case: columns are independent of each other

▪ Others

Page 21: 20101217 mtg

21

Experiments (1)

Intel (R) Core (TM)2 Duo CPU 2.09 GHz 1.97 GB RAM Microsoft Windows XP Professional

2002 SP3 Java 6 Update 22 Standard Edition

Page 22: 20101217 mtg

22

Experiments (2)

No. of Queries: 25

No. of Attributes per query

2 – 10

Best Case Running Time (ms)

4.043531

Worst Case Running Time (ms)

7.87614

Average Running Time (ms)

5.41567164

Standard Deviation (ms)

1.153155682

Check Divisibility Function

Page 23: 20101217 mtg

23

Experiments (3)

10 15 20 25

Orig-inal

195.8656734

194.7371448

199.2996856

207.803956

New 197.955362

194.8125736

201.3419246

201.3696432

187.5192.5197.5202.5207.5

Data Construction Time of Small Intermediate Table Size

Tim

e (

ms)

No. of Attributes

Page 24: 20101217 mtg

24

Experiments (4)

25000 50000 75000 100000 125000

Origi-nal

4503.630762

9322.29402

14360.094452

29402.622868

39187.224832

New

308.163948

296.253179

305.315075

308.774652

316.301103

2500

17500

32500

Data Construction Time of Large Intermediate Table Size

Execu

tion

Tim

e (

ms)

No. of Attributes

Page 25: 20101217 mtg

25

Summary

Implemented an algorithm for checking if SuperSQL queries can be decomposed into several SQL Queries

Queries are decomposed into several SQL queries when possible

Experiments show that the procedure reduces the execution time of SuperSQL in some query cases

Page 26: 20101217 mtg
Page 27: 20101217 mtg

27

Future Work

Continue implementation of the MakeTree function to address other query classes

Testing

More experiments

Statistical Analysis of data

Page 28: 20101217 mtg

28

Comments from Panel

Research Suggestions: Use database information For the example, use embedded system (with

limited memory) that generates webpages

Presentation Suggestion: Take note of the number of decimal points in the

data

Question: How can you be sure that the output generated by

the new one is the same as the original? It can be verified by checking the tree structure

Page 29: 20101217 mtg

Thank you for listening.