PSoup. Kevin Menard, CS 561, 4/11/2005.
Streaming Queries over Streaming Data
Sirish Chandrasekaran
UC Berkeley
August 20, 2002
with Michael J. Franklin
VLDB 2002
Slides are modified versions of the following original presentation:
PSoup Insight #1
Queries and data are duals
- Store new queries, apply to data that arrived earlier
- Store new data, apply to queries that arrived earlier
Multiquery Processing = “join” of query and data
- Supports all three types of queries: queries over the past, (landmark and sliding window) continuous, and hybrid
[Figure: Data and Queries, each with an Index, are joined to produce the Result]
Motivation?
Why another model for continuous queries?
What is wrong with how Aurora and STREAM supply responses?
Motivation: Disconnected Operation
Previous solutions stream out answers immediately
- Not feasible/suitable for all applications
Intermittent connectivity: e.g., applications on hand-held devices (as in this morning’s keynote address)
Even if connected: not always interested in streaming answers
PSoup Insight #2
Separate computation from delivery
- Query answers continuously generated in background
- Apply windows on-demand to transmit “current” results
Efficient support for disconnected operation
- Low response time, shared computation and storage across invocations
[Figure: Data (ID, R.a, R.b) and Queries (QueryID, Predicate) are registered into the Results Structure, a grid of T/F entries indexed by query and data; Invoke reads results back out of it]
PSoup Query Model
SELECT select_list
FROM from_list
WHERE where_clause
BEGIN begin_time
END end_time
Where clause: conjunction of boolean factors
BEGIN-END clause: system clock or sequence numbers
(begin_time, end_time):
- (constant, constant): snapshot query
- (constant, variable): landmark window query
- (variable, variable): sliding window query
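The three (begin_time, end_time) combinations can be illustrated with a small sketch. The names `Bound`, `WindowSpec`, `classify`, and `bounds`, and the convention that a variable bound is stored as an offset from the current time, are assumptions made for illustration; they are not PSoup's actual syntax or implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Bound:
    """One end of a BEGIN-END clause (hypothetical representation)."""
    value: int       # absolute time if constant, offset from "now" if variable
    variable: bool   # True if the bound moves with the current time


@dataclass(frozen=True)
class WindowSpec:
    begin: Bound
    end: Bound


def classify(w):
    """Map a (begin, end) pair to the window type named on the slide."""
    if not w.begin.variable and not w.end.variable:
        return "snapshot"
    if not w.begin.variable and w.end.variable:
        return "landmark"
    if w.begin.variable and w.end.variable:
        return "sliding"
    return "unsupported"  # (variable, constant) is not listed on the slide


def bounds(w, now):
    """Resolve the window to absolute (begin, end) times at invocation."""
    def resolve(b):
        return now + b.value if b.variable else b.value
    return resolve(w.begin), resolve(w.end)


snapshot = WindowSpec(Bound(10, False), Bound(20, False))
landmark = WindowSpec(Bound(10, False), Bound(0, True))
sliding = WindowSpec(Bound(-5, True), Bound(0, True))

for now in (30, 40):
    print(now, [(classify(w), bounds(w, now)) for w in (snapshot, landmark, sliding)])
```

Between the two invocations the snapshot window stays fixed, the landmark window grows at its end, and the sliding window moves at both ends.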
Query Registration
SELECT select_list
FROM from_list
WHERE where_clause
} Standing Query Clause (SQC): to the Symmetric Join
BEGIN begin_time
END end_time
} to the Windows_Table
QueryID: handle for future query invocations
Selections over Single Stream: Arrival of New Query Specification
(a) Initial State
Data Store
ID  R.a  R.b
48  4    3
49  7    3
50  3    8
51  0    0
52  8    4
Query Store
ID  Predicate
20  0<R.a<=5
21  R.a>4 and R.b=3
22  0>R.b>4
23  R.a=4 and R.b=3
Selections over Single Stream: Arrival of New Query Specification
(b) Arrival of new Query
New query: Select * From R Where R.a<=4 and R.b>=3
Data Store and Query Store: unchanged from (a)
Selections over Single Stream: Arrival of New Query Specification
(c) Building Query Store
BUILD: the new query is stored in the Query Store as
ID  Predicate
24  R.a<=4 and R.b>=3
Data Store: unchanged from (a)
Selections over Single Stream: Arrival of New Query Specification
(d) Probing Data Store
PROBE: query 24 (R.a<=4 and R.b>=3) is applied to the Data Store; two tuples match
Selections over Single Stream: Arrival of New Query Specification
(e) Inserting Results
Matching tuples from the probe
ID  R.a  R.b
48  4    3
50  3    8
Results Structure (rows: data IDs; columns: query IDs 20 to 24; entries for new query 24 pending)
Data  24
48    ?
49    ?
50    ?
51    ?
52    ?
Selections over Single Stream: Arrival of New Query Specification
(e) Inserting Results
Matching tuples from the probe
ID  R.a  R.b
48  4    3
50  3    8
Results Structure (rows: data IDs; columns: query IDs 20 to 24; only the column for new query 24 is filled in)
Data  24
48    T
49    F
50    T
51    F
52    F
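The build, probe, and insert steps for a newly arrived query can be sketched as follows, using the tuples and predicates from the slides. Modelling predicates as Python callables, the name `register_query`, and the linear scan over the Data Store are simplifying assumptions; the actual system probes through indexes.

```python
# Data Store from the slides: ID -> (R.a, R.b)
data_store = {48: (4, 3), 49: (7, 3), 50: (3, 8), 51: (0, 0), 52: (8, 4)}

# Query Store: ID -> predicate over (a, b). Callables are an illustrative
# stand-in for stored, indexed predicates.
query_store = {
    20: lambda a, b: 0 < a <= 5,
    21: lambda a, b: a > 4 and b == 3,
    22: lambda a, b: 0 > b > 4,   # as written on the slide; never true
    23: lambda a, b: a == 4 and b == 3,
}

# Results Structure: (data ID, query ID) -> True/False
results = {}


def register_query(qid, predicate):
    """BUILD the query into the Query Store, PROBE the Data Store with it,
    and insert one T/F entry per stored tuple into the Results Structure."""
    query_store[qid] = predicate
    for did, (a, b) in data_store.items():
        results[(did, qid)] = predicate(a, b)


register_query(24, lambda a, b: a <= 4 and b >= 3)
column_24 = [results[(did, 24)] for did in sorted(data_store)]
print(column_24)  # [True, False, True, False, False]
```

The printed column corresponds to the T F T F F entries for tuples 48 to 52 shown on the slide.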
Selections over Single Stream: Arrival of New Data
(a) Initial State
Data Store
ID  R.a  R.b
48  4    3
49  7    3
50  3    8
51  0    0
52  8    4
Query Store
ID  Predicate
20  0<R.a<=5
21  R.a>4 and R.b=3
22  0>R.b>4
23  R.a=4 and R.b=3
24  R.a<=4 and R.b>=3
Selections over Single Stream: Arrival of New Data
(b) Arrival of new Data
New data: ID 53, R.a 3, R.b 6
Data Store and Query Store: unchanged from (a)
Selections over Single Stream: Arrival of New Data
(c) Building Data Store
BUILD: the new tuple is stored in the Data Store as
ID  R.a  R.b
53  3    6
Query Store: unchanged from (a)
Selections over Single Stream: Arrival of New Data
(d) Probing Query Store
PROBE: tuple 53 (R.a 3, R.b 6) is applied to the Query Store; two queries match
Selections over Single Stream: Arrival of New Data
(e) Inserting Results
Matching queries from the probe
ID  Predicate
20  0<R.a<=5
24  R.a<=4 and R.b>=3
Results Structure (rows: data IDs 48 to 53; columns: query IDs; entries for new tuple 53 pending)
Data  20  21  22  23  24
53    ?   ?   ?   ?   ?
Selections over Single Stream: Arrival of New Data
(e) Inserting Results
Matching queries from the probe
ID  Predicate
20  0<R.a<=5
24  R.a<=4 and R.b>=3
Results Structure (rows: data IDs 48 to 53; columns: query IDs; only the row for new tuple 53 is filled in)
Data  20  21  22  23  24
53    T   F   F   F   T
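The symmetric case, a new tuple probing the stored queries, can be sketched the same way. As before, the callables, the name `insert_tuple`, and the linear scan over the Query Store are assumptions for illustration only.

```python
# Stores as in the initial state on the slides.
data_store = {48: (4, 3), 49: (7, 3), 50: (3, 8), 51: (0, 0), 52: (8, 4)}
query_store = {
    20: lambda a, b: 0 < a <= 5,
    21: lambda a, b: a > 4 and b == 3,
    22: lambda a, b: 0 > b > 4,   # as written on the slide; never true
    23: lambda a, b: a == 4 and b == 3,
    24: lambda a, b: a <= 4 and b >= 3,
}

# Results Structure: (data ID, query ID) -> True/False
results = {}


def insert_tuple(did, a, b):
    """BUILD the tuple into the Data Store, PROBE the Query Store with it,
    and insert one T/F entry per stored query into the Results Structure."""
    data_store[did] = (a, b)
    for qid, predicate in query_store.items():
        results[(did, qid)] = predicate(a, b)


insert_tuple(53, 3, 6)
row_53 = [results[(53, qid)] for qid in sorted(query_store)]
print(row_53)  # [True, False, False, False, True]
```

The printed row corresponds to the T F F F T entries for queries 20 to 24 shown on the slide.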
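Query Invocation
Results Structure (rows: data IDs; columns: query IDs; "-" marks entries not shown on the slide)
Data  20  21  22  23  24
48    -   -   -   -   T
49    -   -   -   -   F
50    -   -   -   -   T
51    -   -   -   -   F
52    -   -   -   -   F
53    T   F   F   F   T
[Figure: a brace marks the Current Window over the data rows, set by BEGIN begin_time and END end_time]
System returns the results corresponding to the current value of the BEGIN-END clause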
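Invocation then reduces to reading already-computed entries and applying the current window. The sketch below uses the column for query 24 from the walkthroughs; treating the data ID as the sequence number, the inclusive bounds, and the name `invoke` are assumptions for illustration.

```python
# Column of the Results Structure for query 24: data ID -> True/False.
# Using the data ID as the sequence number is an illustrative assumption.
results_q24 = {48: True, 49: False, 50: True, 51: False, 52: False, 53: True}


def invoke(results_column, begin, end):
    """Return the IDs of matching tuples whose sequence number lies in the
    current window [begin, end] (inclusive bounds assumed)."""
    return [did for did, hit in sorted(results_column.items())
            if hit and begin <= did <= end]


print(invoke(results_q24, 48, 53))  # [48, 50, 53]
print(invoke(results_q24, 51, 53))  # [53]
```

No predicate is evaluated at invocation time; the work was done in the background when each query and tuple arrived.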
Joins over R and S: Arrival of New Query Specification
(a) Initial State
Query Store
ID  Predicate
20  R.a=5 and R.b<S.b
21  R.a>4 and R.b<S.b and S.a<10
22  R.b=4 and R.a+5>S.a and S.b>2
R-Data Store
ID  R.a  R.b
10  2    5
14  3    3
31  4    1
48  9    7
S-Data Store
ID  S.a  S.b
21  2    2
25  3    3
36  4    4
49  5    5
Joins over R and S: Arrival of New Query Specification
(b) Arrival of new Query
New query:
ID  Predicate
23  R.a<5 and R.a>S.a and S.b>1
Query Store, R-Data Store and S-Data Store: unchanged from (a)
Joins over R and S: Arrival of New Query Specification
(c) Building Query Store
BUILD: the new query is stored in the Query Store as
ID  Predicate
23  R.a<5 and R.a>S.a and S.b>1
R-Data Store and S-Data Store: unchanged from (a)
Joins over R and S: Arrival of New Query Specification
(d) Probing R-Data Store
PROBE: query 23 (R.a<5 and R.a>S.a and S.b>1) is applied to the R-Data Store
Matches: R tuples 10, 14, 31
Joins over R and S: Arrival of New Query Specification
(e) Constructing Hybrid Structs
Matches: R tuples 10, 14, 31
Hybrid Structs
R.ID  Q.ID  Q.Predicate
10    23    2>S.a and S.b>1
14    23    3>S.a and S.b>1
31    23    4>S.a and S.b>1
Query Store now holds queries 20 to 23; R-Data Store and S-Data Store: unchanged from (a)
Joins over R and S: Arrival of New Query Specification
(f) Probing S-Data Store
PROBE: the hybrid structs are applied to the S-Data Store
Hybrid Structs
R.ID  Q.ID  Q.Predicate
10    23    2>S.a and S.b>1
14    23    3>S.a and S.b>1
31    23    4>S.a and S.b>1
Results (R, S, Q IDs): ???
Joins over R and S: Arrival of New Query Specification
(f) Probing S-Data Store
PROBE: the hybrid structs are applied to the S-Data Store
Hybrid Structs
R.ID  Q.ID  Q.Predicate
10    23    2>S.a and S.b>1
14    23    3>S.a and S.b>1
31    23    4>S.a and S.b>1
Results (R, S, Q IDs)
R   S   Q
14  21  23
31  21  23
31  25  23
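The join walkthrough for a new query can be sketched with the data from the slides. Splitting query 23 by hand into an R-only factor and a residual over S, and the names `r_only` and `residual_for`, are assumptions for illustration; the slides do not show how the system derives the residual predicate.

```python
r_store = {10: (2, 5), 14: (3, 3), 31: (4, 1), 48: (9, 7)}   # ID -> (R.a, R.b)
s_store = {21: (2, 2), 25: (3, 3), 36: (4, 4), 49: (5, 5)}   # ID -> (S.a, S.b)


# Query 23: R.a<5 and R.a>S.a and S.b>1, split by hand into the factor that
# needs only R and a residual that is bound to one R tuple.
def r_only(ra, rb):
    return ra < 5


def residual_for(ra, rb):
    """Bind the R values, leaving a predicate over S (e.g. 3>S.a and S.b>1)."""
    return lambda sa, sb: ra > sa and sb > 1


# Probe the R-Data Store and build hybrid structs (R.ID, Q.ID, residual).
hybrids = [(rid, 23, residual_for(ra, rb))
           for rid, (ra, rb) in sorted(r_store.items()) if r_only(ra, rb)]

# Probe the S-Data Store with each hybrid struct.
join_results = [(rid, sid, qid)
                for rid, qid, pred in hybrids
                for sid, (sa, sb) in sorted(s_store.items()) if pred(sa, sb)]

print([h[0] for h in hybrids])  # [10, 14, 31]
print(join_results)             # [(14, 21, 23), (31, 21, 23), (31, 25, 23)]
```

R tuple 10 yields a hybrid struct but no result, because no S tuple satisfies 2>S.a.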
Joins over R and S: Arrival of New Data
(a) Initial State
Query Store
ID  Predicate
20  R.a=5 and R.b<S.b
21  R.a>4 and R.b<S.b and S.a<10
22  R.b=4 and R.a+5>S.a and S.b>2
23  R.a<4 and R.b<S.b
R-Data Store
ID  R.a  R.b
47  4    3
50  5    3
51  3    8
S-Data Store
ID  S.a  S.b
48  4    4
49  5    3
52  3    2
Joins over R and S: Arrival of New Data
(b) Arrival of new Data
New data: ID 53, R.a 5, R.b 4
Query Store, R-Data Store and S-Data Store: unchanged from (a)
Joins over R and S: Arrival of New Data
(c) Building R-Data Store
BUILD: the new tuple is stored in the R-Data Store
R-Data Store
ID  R.a  R.b
47  4    3
50  5    3
51  3    8
53  5    4
Query Store and S-Data Store: unchanged from (a)
Joins over R and S: Arrival of New Data
(c) Probing Query Store
PROBE: tuple 53 (R.a 5, R.b 4) is applied to the Query Store
Matches: queries 20, 21, 22
Joins over R and S: Arrival of New Data
(d) Constructing Hybrid Structs
Matches: queries 20, 21, 22
Hybrid Structs (entries marked ? still to be filled in)
R.ID  Q.ID  Q.Predicate
?     ?     4<S.b
53    21    ?
53    22    ?
Joins over R and S: Arrival of New Data
(d) Constructing Hybrid Structs
Matches: queries 20, 21, 22
Hybrid Structs
R.ID  Q.ID  Q.Predicate
53    20    4<S.b
53    21    4<S.b and S.a<10
53    22    10>S.a and S.b>2
Joins over R and S: Arrival of New Data
(e) Probing S-Data Store
PROBE: the hybrid structs are applied to the S-Data Store
Hybrid Structs
R.ID  Q.ID  Q.Predicate
53    20    4<S.b
53    21    4<S.b and S.a<10
53    22    10>S.a and S.b>2
Results (R, S, Q IDs)
R   S   Q
53  48  22
53  49  22
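The same walkthrough for a newly arrived R tuple can be sketched as below. Each query is split by hand into an R-only test and a function that binds the R values into a residual predicate over S; this split, the name `insert_r`, and the linear scans are assumptions for illustration.

```python
r_store = {47: (4, 3), 50: (5, 3), 51: (3, 8)}   # ID -> (R.a, R.b)
s_store = {48: (4, 4), 49: (5, 3), 52: (3, 2)}   # ID -> (S.a, S.b)

# Query ID -> (R-only test, binder returning the residual predicate over S).
queries = {
    20: (lambda ra, rb: ra == 5,
         lambda ra, rb: lambda sa, sb: rb < sb),
    21: (lambda ra, rb: ra > 4,
         lambda ra, rb: lambda sa, sb: rb < sb and sa < 10),
    22: (lambda ra, rb: rb == 4,
         lambda ra, rb: lambda sa, sb: ra + 5 > sa and sb > 2),
    23: (lambda ra, rb: ra < 4,
         lambda ra, rb: lambda sa, sb: rb < sb),
}


def insert_r(rid, ra, rb):
    """BUILD the tuple into the R-Data Store, PROBE the Query Store to form
    hybrid structs, then PROBE the S-Data Store with each hybrid struct."""
    r_store[rid] = (ra, rb)
    hybrids = [(rid, qid, bind(ra, rb))
               for qid, (r_test, bind) in sorted(queries.items())
               if r_test(ra, rb)]
    matches = [(r, sid, qid)
               for r, qid, pred in hybrids
               for sid, (sa, sb) in sorted(s_store.items()) if pred(sa, sb)]
    return [qid for _, qid, _ in hybrids], matches


hybrid_qids, join_results = insert_r(53, 5, 4)
print(hybrid_qids)   # [20, 21, 22]
print(join_results)  # [(53, 48, 22), (53, 49, 22)]
```

Queries 20 and 21 produce hybrid structs but no results, since no S tuple has S.b greater than 4.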
Other Queries
N-way Joins
- Similar to 2-way joins
- Probe, generate hybrid structs, repeat
- Can be executed without intermediate tables
Aggregations
- Performed at query invocation
- Uses n-ary ranked tree, clustered on time
Telegraph Background: CACQ
CACQ [MSHR02]
- Shared execution of multiple queries with one Eddy
- Tuple lineage
- Query Indices
Queries and Data treated very differently
Only Landmark Continuous Queries
No support for disconnected operation
PSoup in Telegraph
Leverage SteMs to store and index queries
Changes to Eddies
- Encode queries as tuples
- Break Where clause into individual boolean factors (BF)
- Encode each BF as R.a relop [R.b|S.b] [+|-] constant
Stream Prefix Consistency
- A new query or data tuple is completely processed before any other tuple: no holes in Results Structure
Results Structure: to buffer the results
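One way to picture the boolean-factor encoding is as a small record with a left attribute, a relational operator, an optional right attribute, and a signed constant. The class `BooleanFactor`, its field names, and the `holds` method are hypothetical; the slides give only the general form, not the tuple layout used in Telegraph.

```python
import operator
from dataclasses import dataclass
from typing import Optional

RELOPS = {"<": operator.lt, "<=": operator.le, "=": operator.eq,
          ">=": operator.ge, ">": operator.gt}


@dataclass(frozen=True)
class BooleanFactor:
    """One factor of the form  left relop [right] [+|-] constant."""
    left: str                     # e.g. "R.a"
    relop: str                    # a key of RELOPS
    right: Optional[str] = None   # e.g. "S.b", or None for a pure constant
    constant: int = 0             # signed, so "- 3" is stored as -3

    def holds(self, row):
        """Evaluate the factor against a dict of attribute values."""
        rhs = self.constant + (row[self.right] if self.right else 0)
        return RELOPS[self.relop](row[self.left], rhs)


# Query 21 from the join walkthrough: R.a>4 and R.b<S.b and S.a<10
query_21 = [
    BooleanFactor("R.a", ">", None, 4),
    BooleanFactor("R.b", "<", "S.b"),
    BooleanFactor("S.a", "<", None, 10),
]

# R tuple 53 (R.a 5, R.b 4) paired with S tuple 48 (S.a 4, S.b 4)
row = {"R.a": 5, "R.b": 4, "S.a": 4, "S.b": 4}
print([bf.holds(row) for bf in query_21])  # [True, False, True]
```

Because the Where clause is a conjunction, a query matches only when every one of its factors holds; here the middle factor fails, which agrees with query 21 producing no result for this pair in the walkthrough.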
Experiments and Results
Alternatives
- NoMat: no background processing
- PSoup-Partial: background processing, apply current window on invocation
- PSoup-Complete: current windows are also continuously applied in the background
Experimental Parameters
- Unloaded server with two Intel Pentium III 666 MHz processors and 768 MB RAM
- Data arrives as fast as possible, in domain [0,255]
- Queries of form R.a relop C, where C in [0,255]
- Join queries of form R.a relop S.b +/- C
Experiments: Response Time vs. Window Size
Interval Predicates, Selection Queries
Experiments: Response Time vs. Window Size
Equality Predicates, Selection Queries
PSoup in traditional query processor
PSoup = SQL QUERY over data and client query streams?
Joins = expression evaluators
Notes
- Conventional QPs do not have tuple lineage
- Conventional QPs always use intermediate tables
Conclusions
Treating Queries and Data the same
- Combines approaches for previously studied queries
  - Queries over the past and continuous queries
- Allows new functionality: hybrid queries
Separating Result Generation and Delivery
- Makes disconnected operation feasible
- Efficient support for repeated query invocations