Design Considerations for High Fan-in Systems: The HiFi Approach
Presented by Shawn Jeffery, CIDR ’05, 1/7/05
Michael J. Franklin, Shawn R. Jeffery, Sailesh Krishnamurthy, Frederick Reiss, Shariq Rizvi,
Eugene Wu, Owen Cooper, Anil Edakkunni, and Wei Hong
UC Berkeley, Intel Research Berkeley
1/7/05 Shawn Jeffery, HiFi Project, UCB EECS
Itinerary
• Introduction: High Fan-in Systems
• HiFi Overview
• Initial Prototype
• Ongoing Work and Future Directions
• Conclusions
Introduction
• Receptors everywhere!
  • Wireless sensor networks, RFID technologies, digital home, network monitors, ...
• Somehow need to make sense of this data to provide near real-time decision support
High Fan-in Systems
Large numbers of receptors = large data volumes
Hierarchical, successive aggregation
The “Bowtie”
Challenges in 3 dimensions:
• Geography
• Time
• Resources
Supply-Chain Management (SCM)
[Figure: supply-chain hierarchy. Labels: RFID Receptors; Dock doors, Shelves; Warehouses, Stores; Regional Centers; Headquarters]
State of the Art
• Not seen as a data management issue
  • Focus on protocol design
  • Different “data models” at each level
  • Reinventing “query languages” at each level
• Piecemeal/stovepipe approach
  • Each type of receptor (RFID, sensors, etc.) handled separately
  • Current solutions tend to be hand-coded, script-based approaches
No end-to-end, integrated solution for managing distributed receptor data
HiFi: Cascading Stream Processing in a High Fan-in System
• A data management infrastructure for high fan-in environments
• Uniform Declarative Framework
  • Every node is a data stream processor that speaks SQL-ese: stream-oriented queries at all levels
• Hierarchical, stream-based views as an organizing principle
Hierarchical Query Processing
“I provide raw readings for Soda Hall”
“I provide avg daily values for Berkeley”
“I provide avg weekly values for California”
“I provide national monthly values for the US”
• Continuous and Streaming
  • Windows
  • Sharing
• Hierarchical
  • Temporal granularity vs. geographic scope
SELECT S.area, AVG(S.temp)
FROM SENSOR_STREAM S [range by '5 sec' slide by '5 sec']
GROUP BY S.area
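As a rough illustration of what the windowed query above computes, here is a minimal Python sketch. It assumes readings arrive as (timestamp_sec, area, temp) tuples and that a range and slide of 5 seconds behave as non-overlapping (tumbling) windows; the function name `tumbling_avg` and the tuple layout are hypothetical and not part of HiFi or TelegraphCQ.

```python
from collections import defaultdict

def tumbling_avg(readings, window_sec=5):
    """Average temp per (window_start, area) over non-overlapping windows.

    readings: iterable of (timestamp_sec, area, temp) tuples (assumed layout).
    """
    sums = defaultdict(lambda: [0.0, 0])  # (window_start, area) -> [sum, count]
    for ts, area, temp in readings:
        window_start = int(ts // window_sec) * window_sec
        acc = sums[(window_start, area)]
        acc[0] += temp
        acc[1] += 1
    return {key: total / n for key, (total, n) in sums.items()}

# Hypothetical readings: two areas, two windows.
readings = [(0, "soda", 20.0), (3, "soda", 22.0), (4, "cory", 18.0), (6, "soda", 24.0)]
averages = tumbling_avg(readings)
```

Each key in `averages` corresponds to one GROUP BY group within one window, so a node higher in the hierarchy could consume these results as its own input stream.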
Basic HiFi Architecture
• Hierarchical federation of nodes
• Each node:
  • Data Stream Query Processor (DSQP)
  • HiFi Glue
• Views drive system functionality
• Metadata Repository (MDR)
HiFi Glue:
• DSQP Management
• Query Planning
• Archiving
• Internode coordination and communication
[Figure: hierarchy of nodes, each labeled HiFi Glue and DSQP, together with the MDR]
In the paper…
HiFi Design Considerations
• Dealing with Real-World Data
• Hierarchical Windowed Views with Sharing
• System Management
• Topological Fluidity
• Query Planning and Data Placement
• Complex Event Processing
• Archiving and Prioritization
• Privacy and Access Control
Envisioning HiFi
Building HiFi
A Tale of Two Systems
• TelegraphCQ
  • Data stream processor
  • Continuous, adaptive query processing with aggressive sharing
• TinyDB
  • Declarative query processing for wireless sensor networks
  • In-network aggregation
Initial Prototype
[Figure: prototype architecture. Labels: TelegraphCQ, TinyDB, Stargates, PC, RFID Wrappers, Sensor Networks & RFID Readers]
Initial Prototype
Demoed @ VLDB ’04
HiFi Design Considerations
• Dealing with Real-World Data
• Hierarchical Windowed Views with Sharing
• System Management
• Topological Fluidity
• Query Planning and Data Placement
• Complex Event Processing
• Archiving and Prioritization
• Privacy and Access Control
CSAVA: Processing RFID Data in HiFi
• RFID data is gross!
  • Lost readings
  • Errant readings
  • Duplicate readings
• Use queries to make the data usable
  • CSAVA: Clean, Smooth, Arbitrate, Validate, Analyze
Clean
CREATE VIEW cleaned_rfid_stream AS
(SELECT receptor_id, tag_id
 FROM rfid_stream rs
 WHERE read_strength >= strength_T)
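The Clean view above is a per-tuple filter followed by a projection. A minimal Python sketch of the same idea, assuming readings are dicts with the three column names used in the view (the function name `clean` and the sample values are hypothetical):

```python
def clean(readings, strength_t):
    """Drop weak readings; project each survivor to (receptor_id, tag_id).

    readings: iterable of dicts with keys receptor_id, tag_id, read_strength.
    strength_t: threshold, playing the role of strength_T in the view.
    """
    return [
        (r["receptor_id"], r["tag_id"])
        for r in readings
        if r["read_strength"] >= strength_t
    ]

# Hypothetical sample: the second reading is too weak and is filtered out.
sample = [
    {"receptor_id": "r1", "tag_id": "A", "read_strength": 0.9},
    {"receptor_id": "r1", "tag_id": "B", "read_strength": 0.2},
    {"receptor_id": "r2", "tag_id": "A", "read_strength": 0.5},
]
cleaned = clean(sample, strength_t=0.5)
```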
Smooth
CREATE VIEW smoothed_rfid_stream AS
(SELECT receptor_id, tag_id
 FROM cleaned_rfid_stream [range by '5 sec', slide by '5 sec']
 GROUP BY receptor_id, tag_id
 HAVING count(*) >= count_T)
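The Smooth view above keeps a (receptor, tag) pair only if it was read often enough within a window. A minimal Python sketch, assuming tumbling 5-second windows and readings given as (timestamp_sec, receptor_id, tag_id) tuples (the function name `smooth` and the tuple layout are hypothetical):

```python
from collections import Counter

def smooth(readings, count_t, window_sec=5):
    """Return (window_start, receptor_id, tag_id) triples seen at least
    count_t times within one window; count_t plays the role of count_T."""
    counts = Counter()
    for ts, receptor_id, tag_id in readings:
        window_start = int(ts // window_sec) * window_sec
        counts[(window_start, receptor_id, tag_id)] += 1
    return sorted(key for key, n in counts.items() if n >= count_t)

# Hypothetical sample: tag A is read twice in the first window, once in the second.
sample = [(0, "r1", "A"), (1, "r1", "A"), (2, "r1", "B"), (6, "r1", "A")]
smoothed = smooth(sample, count_t=2)
```

Raising `count_t` suppresses more errant readings at the cost of dropping tags that were only read occasionally.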
Arbitrate
CREATE VIEW arbitrated_rfid_stream AS
(SELECT receptor_id, tag_id
 FROM smoothed_rfid_stream rs [range by '5 sec', slide by '5 sec']
 GROUP BY receptor_id, tag_id
 HAVING count(*) >= ALL
   (SELECT count(*)
    FROM smoothed_rfid_stream [range by '5 sec', slide by '5 sec']
    WHERE tag_id = rs.tag_id
    GROUP BY receptor_id))
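The Arbitrate view above assigns each tag to the receptor that saw it most often in a window; because the comparison is `>= ALL`, tied receptors are all kept. A minimal Python sketch under the same assumptions as before (tumbling 5-second windows, hypothetical `arbitrate` name and (timestamp_sec, receptor_id, tag_id) layout):

```python
from collections import Counter, defaultdict

def arbitrate(readings, window_sec=5):
    """Return (window_start, receptor_id, tag_id) for each receptor whose
    read count for a tag is at least that of every other receptor."""
    counts = Counter()
    for ts, receptor_id, tag_id in readings:
        window_start = int(ts // window_sec) * window_sec
        counts[(window_start, tag_id, receptor_id)] += 1
    best = defaultdict(int)  # (window_start, tag_id) -> highest count seen
    for (window_start, tag_id, _), n in counts.items():
        best[(window_start, tag_id)] = max(best[(window_start, tag_id)], n)
    return sorted(
        (window_start, receptor_id, tag_id)
        for (window_start, tag_id, receptor_id), n in counts.items()
        if n == best[(window_start, tag_id)]
    )

# Hypothetical sample: r1 reads tag A twice, r2 reads it once.
sample = [(0, "r1", "A"), (1, "r1", "A"), (2, "r2", "A"), (3, "r2", "B")]
assigned = arbitrate(sample)
```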
Validate
CREATE VIEW validated_tags AS
(SELECT tag_name
 FROM arbitrated_rfid_stream rs [range by '5 sec', slide by '5 sec'],
      known_tag_list tl
 WHERE tl.tag_id = rs.tag_id)
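The Validate view above is a join between the stream and a static table of known tags. A minimal Python sketch, assuming the table fits in a dict from tag_id to tag_name (the function name `validate` and the sample names are hypothetical):

```python
def validate(tag_ids, known_tags):
    """Inner join of observed tag IDs against a known-tag table.

    known_tags: dict mapping tag_id -> tag_name (stands in for known_tag_list).
    Unknown tag IDs produce no output, as in an inner join.
    """
    return [known_tags[tag_id] for tag_id in tag_ids if tag_id in known_tags]

# Hypothetical sample: tag X is not in the known list and is dropped.
known = {"A": "milk", "B": "eggs"}
names = validate(["A", "X", "B", "A"], known)
```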
Analyze
CREATE VIEW tag_count AS
(SELECT tag_name, count(*)
 FROM validated_tags vt [range by '5 min', slide by '1 min']
 GROUP BY tag_name)
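Unlike the earlier stages, the Analyze view uses an overlapping window: a 5-minute range that slides every minute. A minimal Python sketch of that behaviour, assuming each window covers [end - range, end) and windows close at multiples of the slide (both are assumptions, as are the name `sliding_tag_counts` and the (timestamp_sec, tag_name) layout):

```python
from collections import Counter

def sliding_tag_counts(events, range_sec=300, slide_sec=60):
    """Return {window_end: Counter(tag_name -> count)} for each slide step."""
    if not events:
        return {}
    last_ts = max(ts for ts, _ in events)
    results = {}
    end = slide_sec
    while end - slide_sec <= last_ts:
        start = end - range_sec
        results[end] = Counter(name for ts, name in events if start <= ts < end)
        end += slide_sec
    return results

# Hypothetical sample, with a shorter 2-minute range so that expiry is visible.
events = [(10, "milk"), (70, "milk"), (130, "eggs")]
counts = sliding_tag_counts(events, range_sec=120, slide_sec=60)
```

In this sample the reading at 10 seconds contributes to the first two windows and has expired by the third.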
[Figure: CSAVA pipeline. Stage labels: Clean, Smooth, Arbitrate, Validate, Analyze; additional labels: Augment, Convert, Aggregate]
CSAVA: Bridging the Physical-Virtual Divide
• One example of HiFi processing, but an instrumental one for dealing with real-world data
[Figure: CSAVA Generalization. Stage labels: Clean, Smooth, Arbitrate; scope labels: Single Tuple, Window, Multiple Receptors]
Complexity of Hierarchical Windowed Query Processing
• Naïve dissemination (pushing the query unchanged to every level) introduces a lag in query results
Additive Lag in Hierarchical Windowed Query Processing
SELECT S.area, AVG(temp)
FROM SENSOR_STREAM S [range by '5 sec' slide by '5 sec']
GROUP BY S.area
[Figure: timeline (Time axis) of an Event moving from Level 0 through Level 1 and Level 2 to the User. Each level is labeled with its own Window and Result Tuple(s). Annotation: Additive Lag!]
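A small worked example may make the additive lag concrete. The sketch below is a simplified model, not the HiFi implementation: it assumes aligned tumbling windows, zero network delay, and that a tuple arriving exactly on a window boundary falls into the next window. The function names are hypothetical.

```python
def window_close(arrival, window_sec):
    """Close time of the tumbling window [k*w, (k+1)*w) that contains `arrival`."""
    return (arrival // window_sec + 1) * window_sec

def naive_delivery_time(event_time, levels, window_sec):
    """Time the user sees a result when every level runs the same windowed query."""
    t = event_time
    for _ in range(levels):
        t = window_close(t, window_sec)  # each level waits for its own window
    return t

# An event at t = 1 s, three levels, 5-second windows at every level.
delivered = naive_delivery_time(event_time=1, levels=3, window_sec=5)
lag = delivered - 1
```

Under these assumptions the result leaves Level 0 at 5 s, Level 1 at 10 s, and Level 2 at 15 s, so the lag grows with the number of levels rather than staying within one window.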
Sketch of a Solution
• Solution is to use both time-based windows and NOW windows
SELECT S.area, AVG(temp)
FROM SENSOR_STREAM S [range by '5 seconds' slide by '5 seconds']
GROUP BY S.area
[Figure: timeline (Time axis) of an Event moving from Level 0 through Level 1 and Level 2 to the User. Labels: one Time-based window, two NOW windows, Result Tuple(s)]
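The effect of mixing window types can be sketched with a simplified timing model: aligned tumbling windows, zero network delay, and a NOW window treated as forwarding each tuple at the moment it arrives. Which level holds the time-based window is not recoverable from the slide, so the sketch takes the placement as a parameter; the function name `delivery_time` and the use of `None` for a NOW window are hypothetical.

```python
def delivery_time(event_time, windows):
    """Time the user sees a result for one event.

    windows[i] is level i's window length in seconds, or None for a NOW window.
    """
    t = event_time
    for w in windows:
        if w is not None:
            t = (t // w + 1) * w  # wait for the time-based window to close
        # a NOW window forwards the tuple at the time it arrives
    return t

naive = delivery_time(1, [5, 5, 5])       # same 5-second window at every level
low = delivery_time(1, [5, None, None])   # time-based window at the lowest level
high = delivery_time(1, [None, None, 5])  # time-based window at the highest level
```

In this model a single time-based window bounds the delay to one window length regardless of where it sits; the sketch only tracks timing and does not address how partial aggregates would be combined.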
System Management
• Our small deployment:
  • 20+ individual devices (4 types of devices)
  • 5 different platforms (OS + Hardware)
  Management nightmare
• System-wide management is crucial
  • Both coarse and fine-grained
• Where we’re headed:
  • System monitoring needed: turn the lens inwards to introspect on system state
  • Use uniform declarative framework to provide failover and load balancing
Ongoing Work and Future Directions
• Bridging the physical-virtual divide
  • Generalize CSAVA-type processing to other receptors
• Hierarchical query processing
  • Query planning, dissemination
• Complex event processing
  • Unify event and data processing
• System deployment and management
• Archiving and prioritization
Conclusions
• Receptors everywhere → High Fan-in Systems
• Uniform declarative framework is the key to building these systems
• The HiFi project is exploring this approach
• Our initial prototype
  • Leveraged TelegraphCQ and TinyDB
  • Validated the HiFi approach
  • Identified research directions
• Broad in scope = much work to be done!
Questions?
hifi.cs.berkeley.edu