Performance & Scalability Experiments
Dr. Ray Huetter, CTO SensorConnect
Reid Phillips, University of Arkansas
© 2007 SensorConnect Inc.
www.sensorconnect.com 2
Overview
Data collection and modeling of large sensor networks (EPCIS level)
A key to Return on Investment (ROI) is performance & scalability
SensorConnect builds systems to maximize success thru ROI
University of Arkansas is helping verify these systems
Today we present
  Background on SensorConnect technology
  Test results from University of Arkansas
Authors
Joe Hoag, PhD Candidate, University of Arkansas
Reid Phillips, PhD Candidate, University of Arkansas
Dr. Craig Thompson, Professor and Database Chair, University of Arkansas
Dr. Ray Huetter, CTO, SensorConnect
John Veizades, VP Product Management, SensorConnect
Our View of RFID
RFID & sensors augment the physical world
Goal: assist people and machines to make better use of physical objects
  plan & observe use, identify misuse, predict service
  analyze systemic cause and effect
Succeed when ROI is demonstrated
  coincides with maximal assistance
  reduction in time, space, matter & energy of processes
This is common across many domains
  Supply chain, ePedigree, health care, MRO, logistics, …
Potential of RFID
RFID will make many contributions
  Economic, environmental, social (health)
Effectiveness & ROI will be substantial
  Physical optimization (more for less)
  Correct distribution, location and usage
  Safety and correctness
  Prevents harm (food safety)
  Reduction in resources, waste and errors
  Physical process improvements
  Lead to new opportunities…
Best ROI Results
Successful pilot projects are showing 5 to 10 times ROI when end-to-end visibility occurs
  Single, accurate, timely view
  Across physical & logical boundaries
  By multiple parties
Why?
  Able to see what happened and when
  Able to reason about it, as and when it happens
  Discover cause and effect
  Use it to one's advantage or correct it
  Optimize: time, space, energy & matter
Control-Feedback Loop
Diagram (Holistic View): the Computer System observes the Real-World Systems and optimizes them, closing the loop.
Maximizing ROI
Maximal ROI occurs when optimization takes into account
  As much fine-grained detail as possible
  Of as many physical objects as possible
  Across as many boundaries as possible
  In as short a time-frame as possible
  For the least price possible
Conversely, ROI will be limited by coarse-grained, filtered / summarized, isolated, untimely or expensive systems
Not Possible Today
Most contemporary systems substantially constrain effectiveness & ROI
  Are expensive (relative to the cost of tags)
  Are isolated “stove-pipes”
  Are not real-time
  Do not support continuous operation
  Do not scale with hardware
  Do not cope with volume
Will be suboptimal
There is a missing link here…
SensorConnect Technology
Build systems & expertise to maximize ROI
  Collect sensor-based data (notably RFID) of arbitrarily large physical systems in real-time
  Use that data to create fine-grained models of those systems in real-time
  Enable new & existing applications / systems to securely observe, reason about & optimize physical systems by
    querying the current state and history of the model
    adjusting the physical system continuously in real-time
Do this by supporting
  Real-time write back to tags
  Applying rules to produce actionable alerts in real-time
  Pushing changes to applications as they happen
  Applications querying history (prior state) as required
  Replaying history of events as they occurred
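The capabilities listed above (rules that raise alerts, pushing changes to applications, and replaying history) can be illustrated with a minimal Python sketch. It is not SensorConnect code: the `Event` and `Model` classes, the `quarantine_rule` function and the idea of a quarantined reader 13 are all hypothetical names chosen for illustration, and write-back to tags is left out.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass(frozen=True)
class Event:
    tag_id: str
    reader: int
    biz_evt: str      # for example "arrive" or "depart"
    timestamp: float  # seconds since an arbitrary start


class Model:
    """Hypothetical in-memory model: current state plus an ordered event log."""

    def __init__(self) -> None:
        self.current: Dict[str, Event] = {}  # most recent event per tag
        self.log: List[Event] = []           # every event, in received order
        self.rules: List[Callable[[Event], Optional[str]]] = []
        self.subscribers: List[Callable[[Event], None]] = []
        self.alerts: List[str] = []

    def ingest(self, event: Event) -> None:
        self.log.append(event)
        self.current[event.tag_id] = event
        for rule in self.rules:              # rules may produce alerts
            message = rule(event)
            if message is not None:
                self.alerts.append(message)
        for notify in self.subscribers:      # push the change to applications
            notify(event)

    def replay(self, start: float, end: float) -> List[Event]:
        """Events with start <= timestamp < end, in the order received."""
        return [e for e in self.log if start <= e.timestamp < end]


def quarantine_rule(event: Event) -> Optional[str]:
    # Hypothetical rule: reader 13 is a quarantine area nothing should enter.
    if event.reader == 13 and event.biz_evt == "arrive":
        return f"ALERT: {event.tag_id} arrived at quarantined reader 13"
    return None


model = Model()
model.rules.append(quarantine_rule)
pushed: List[Event] = []
model.subscribers.append(pushed.append)

model.ingest(Event("TAG-1", 1, "arrive", 0.0))
model.ingest(Event("TAG-1", 1, "depart", 5.0))
model.ingest(Event("TAG-1", 13, "arrive", 9.0))

print(model.current["TAG-1"].reader)   # latest known reader for TAG-1
print(model.alerts)
print(len(model.replay(0.0, 6.0)))     # events received in the first 6 seconds
```

Keeping the log in received order is what makes replay a simple filter here; a production system would presumably index the log by time instead of scanning it.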
Holistic View of Physical Systems
Diagram: real-world systems (Supply-Chain 1, Supply-Chain 2, Supply-Chain 3) send events to SensorConnect, which keeps an in-memory model of each supply chain together with its history. Applications (tracking, planning, optimization, exception management, reporting...) send queries to SensorConnect.
SensorConnect System Qualities
High performance
  > 50,000 events per second per 64-bit CPU
  < 100 millisecond response time per event, including write-back
  Balance queries with ingestion
  Maintain detailed history; replay event history
Indefinitely scalable
  Support models with billions .. trillions of physical objects
Widely compatible
  Devices & systems
Standards compliant
  EPCIS (repository)
Highly reliable
  Continuous operation via hot failover
Secure
  Access & authorization controls
University of Arkansas
University of Arkansas invited to test SensorConnect core
Run tests indicative of loads of an entire supply chain
Motivations:
  Interested in scalable grid technology with application to sensor networks and identity
  Have skills and technology to do synthetic data generation
  Longer term collaboration with RFID technology
Proof of Concept Experiments
Purpose
Test configuration
Synthetic Data Generation (SDG)
Descriptions, results, and analysis
Purpose
Measure performance of the SensorConnect system while accepting data from an independent, outside source
  Ingestion (insertion)
  Balanced (concurrent ingestion and queries)
Test Configuration
ACE four node grid (provided by NSF grant #0410966)
  64-bit dual processor AMD Opterons
  1.6 GHz
  2 GB RAM
  60 GB Hard Drive
  1 Gbps Ethernet
  Rocks 4.2, Linux Kernel 2.6.9
Part of the Open Science Grid
Synthetic Data Generation (SDG)
Written in Java
Accepts Synthetic Data Description Language (SDDL) file as input
Capable of generating data sequentially or in parallel
Partitioning algorithms assure that the resulting data set will be consistent regardless of the degree of parallelism used during generation
Capable of direct-to-database generation, but generating to intermediate text file is more common, and faster
Synthetic Data Description Language (SDDL)
SDDL Constraint Types
  Min/Max/Step
  Probabilistic Distribution
  Pool Reference: basically a parameterized dictionary lookup. Users can define their own dictionaries
  Formula: field value based on mathematical formulas involving constants and other fields
  Iteration: iterate through a set of values. The value set could be a sequence of integers, a record set from a query, or a set of dictionary values
Data types supported: integer, real, string, date, time, timestamp, boolean
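To make the five constraint types concrete, here is a small Python sketch of how each one could produce field values. It is not SDDL syntax and not the SDG implementation: every function name, the per-row seeding scheme and the wrap-around behaviour of `min_max_step` are assumptions made for illustration.

```python
import random


def min_max_step(lo, hi, step):
    """Min/Max/Step: walk lo, lo+step, ... up to hi, then wrap (wrap is an assumption)."""
    values = range(lo, hi + 1, step)
    return lambda i, row: values[i % len(values)]


def distribution(outcomes, weights, seed):
    """Probabilistic distribution: weighted draw, seeded per row so reruns match."""
    return lambda i, row: random.Random(seed * 1_000_003 + i).choices(
        outcomes, weights=weights
    )[0]


def pool_reference(pool, key_field):
    """Pool reference: dictionary lookup keyed by another field of the row."""
    return lambda i, row: pool[row[key_field]]


def formula(fn):
    """Formula: value computed from constants and fields generated earlier."""
    return lambda i, row: fn(row)


def iteration(values):
    """Iteration: step through a fixed set of values, repeating as needed."""
    values = list(values)
    return lambda i, row: values[i % len(values)]


def generate(fields, n):
    """Yield n rows; fields are evaluated in declaration order."""
    for i in range(n):
        row = {}
        for name, rule in fields.items():
            row[name] = rule(i, row)
        yield row


SITES = {1: "dock", 2: "warehouse", 3: "store"}  # user-defined dictionary
fields = {
    "reader": min_max_step(1, 3, 1),
    "site": pool_reference(SITES, "reader"),
    "biz_evt": iteration(["arrive", "depart"]),
    "dwell_s": distribution([30, 60, 120], [0.5, 0.3, 0.2], seed=7),
    "timestamp": formula(lambda row: 1000 + row["reader"] * row["dwell_s"]),
}

rows = list(generate(fields, 6))
for row in rows:
    print(row)
```

Each rule receives the row index and the partially built row, which is what lets a pool reference or formula depend on fields declared before it.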
Synthetic Data Generation
SDG Operation
  Parallel processes all reference the same SDDL file
  Each parallel process generates a single text output file, containing a portion of the generated table
  Database then imports the text files as data
Lack of inter-process dependencies makes linear speedup a real possibility
Speed of SDG is only limited by number and speed of processors
Output is identical regardless of the number of generation processes utilized
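One way to get output that does not depend on the number of generation processes is to derive every row purely from its global row index and then hand each process a contiguous range of indices. The slides do not describe SDG's actual partitioning algorithm, so the Python sketch below is only an illustration under that assumption; `make_row`, `partition` and `generate` are hypothetical names, and the workers run one after another here instead of as separate processes.

```python
import random


def make_row(i, seed=2007):
    """Build row i from its index alone, so no worker depends on another."""
    rng = random.Random(seed * 1_000_003 + i)
    reader = rng.randint(1, 100)
    return f"TAG{i:08d},{reader}"


def partition(total_rows, workers):
    """Split 0..total_rows-1 into contiguous ranges, one per worker."""
    base, extra = divmod(total_rows, workers)
    start = 0
    for w in range(workers):
        size = base + (1 if w < extra else 0)
        yield range(start, start + size)
        start += size


def generate(total_rows, workers):
    """Each inner list stands in for one worker's text output file."""
    files = [[make_row(i) for i in part] for part in partition(total_rows, workers)]
    # Concatenating the files in worker order reproduces the full table.
    return [line for f in files for line in f]


sequential = generate(1000, workers=1)
parallel = generate(1000, workers=4)
print(sequential == parallel)  # same table regardless of worker count
```

Because no worker needs state from another, adding workers only shortens the ranges, which is the property behind the near-linear speedup mentioned above.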
Application: Simple RFID Supply Chain Data
Problem: Generate synthetic RFID events (“arrive” and “depart”) for 10 million unique objects traversing 100 read points (total = 2 billion events)
Row: TagID, ReaderNum, BizEvt, Timestamp
Total data generated: 86 GB (2B rows)
Diagram: Reader 1, Reader 2, Reader 3, . . ., Reader 100
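The data set size follows directly from the problem statement, and the arithmetic can be checked with a short Python sketch. The row layout comes from the slide; the `events_for_object` helper, the fixed dwell and transit times, and reading "86 GB" as decimal gigabytes are assumptions for illustration only.

```python
OBJECTS = 10_000_000
READ_POINTS = 100
EVENTS_PER_READ_POINT = 2          # one "arrive" and one "depart"
TOTAL_BYTES = 86 * 10**9           # assumes "86 GB" means decimal gigabytes

total_events = OBJECTS * READ_POINTS * EVENTS_PER_READ_POINT
bytes_per_row = TOTAL_BYTES / total_events


def events_for_object(tag_id, start_ts, dwell_s=60, transit_s=600):
    """Rows (TagID, ReaderNum, BizEvt, Timestamp) for one object.

    Visiting readers 1..100 in order with fixed dwell and transit times
    is a simplifying assumption, not the distribution used in the test.
    """
    rows, ts = [], start_ts
    for reader in range(1, READ_POINTS + 1):
        rows.append((tag_id, reader, "arrive", ts))
        ts += dwell_s
        rows.append((tag_id, reader, "depart", ts))
        ts += transit_s
    return rows


sample = events_for_object(tag_id=1, start_ts=0)
print(total_events)     # 2000000000
print(bytes_per_row)    # 43.0
print(len(sample))      # 200
print(sample[:2])
```

Under these assumptions each text row averages 43 bytes, which is plausible for four comma-separated fields.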
Experiments Run
Peak ingestion
Event replay
Query item
Query history
Query location description
Peak Ingestion Test
Not a balanced test (no queries)
Used to determine the sustained insertion rate of the SensorConnect system
All available data was ingested into the system
First test terminated prematurely due to a configuration problem
Second test ran to completion in approximately 01:24:00
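As a rough cross-check, the run time can be turned into a sustained rate. The Python sketch below assumes that "all available data" means the full 2 billion event, 86 GB data set, that 01:24:00 is hours:minutes:seconds, and that all eight CPUs of the four-node grid took part in ingestion; none of these is stated explicitly in the slides.

```python
TOTAL_EVENTS = 2_000_000_000        # from the data set description
TOTAL_BYTES = 86 * 10**9            # assumes decimal gigabytes
DURATION_S = 1 * 3600 + 24 * 60     # 01:24:00 read as hh:mm:ss
NODES, CPUS_PER_NODE = 4, 2         # four-node grid, dual-processor nodes

events_per_s = TOTAL_EVENTS / DURATION_S
per_cpu = events_per_s / (NODES * CPUS_PER_NODE)
mb_per_s = TOTAL_BYTES / DURATION_S / 10**6

print(f"{events_per_s:,.0f} events/s overall")   # 396,825 events/s overall
print(f"{per_cpu:,.0f} events/s per CPU")        # 49,603 events/s per CPU
print(f"{mb_per_s:.1f} MB/s of input text")      # 17.1 MB/s of input text
```

Under those assumptions the run works out to roughly 397,000 events per second, or about 49,600 per CPU, which is in line with the approximate run time and the rounded figures quoted elsewhere in the deck.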
Peak Ingestion Test
Event Replay Test
This balanced test replays the events logged by the system during a specified time interval in the order the events were received
Replay rate must be greater than or equal to the ingestion rate
Models a store-and-forward supply chain
Three runs replaying 10, 20, and 20 minutes respectively
Event Replay Test
Query Item Test
A balanced test that returns a tag’s current, or most recent, location
Query History Test
This balanced query returned the event history of a tag, or all records recording an “enter” or “leave” event for a given tag
Query Location Description Test
A balanced test that returns all tags at a given location, or position, within a supply chain
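The three query tests (item, history, location) correspond to three lookups that an in-memory model could answer from simple indexes. The Python sketch below is an illustration only; the `LocationIndex` class and its method names are hypothetical and do not reflect the SensorConnect query interface.

```python
from collections import defaultdict


class LocationIndex:
    """Hypothetical indexes that answer the three query types directly."""

    def __init__(self):
        self.history = defaultdict(list)    # tag -> [(timestamp, reader, biz_evt)]
        self.current = {}                   # tag -> reader of the most recent event
        self.at_reader = defaultdict(set)   # reader -> tags currently present

    def ingest(self, tag, reader, biz_evt, timestamp):
        self.history[tag].append((timestamp, reader, biz_evt))
        self.current[tag] = reader
        if biz_evt == "arrive":
            self.at_reader[reader].add(tag)
        elif biz_evt == "depart":
            self.at_reader[reader].discard(tag)

    def query_item(self, tag):
        """Current, or most recent, location of a tag (None if never seen)."""
        return self.current.get(tag)

    def query_history(self, tag):
        """All recorded events for a tag, in received order."""
        return list(self.history.get(tag, []))

    def query_location(self, reader):
        """All tags that have arrived at a reader and not yet departed."""
        return sorted(self.at_reader.get(reader, set()))


index = LocationIndex()
index.ingest("A", 1, "arrive", 0)
index.ingest("A", 1, "depart", 60)
index.ingest("A", 2, "arrive", 660)
index.ingest("B", 1, "arrive", 700)

print(index.query_item("A"))       # 2
print(index.query_history("A"))
print(index.query_location(1))     # ['B']
```

Updating all three indexes at ingestion time keeps each query a dictionary lookup, at the price of a little more work per event.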
Experiment Conclusions
SensorConnect is designed for multi-core, multi-CPU systems
System allows for an unbalanced 400,000 events/second peak ingestion rate
Balanced tests were able to query data at a rate greater than ingestion
Deployment of the SensorConnect system in a foreign environment was accomplished with relative ease
Ultimately the test results far exceeded expectations, indicating great promise for the system
Summary
Goal of RFID is to assist people and machines to make better use of physical objects
Successful projects demonstrate ROI
ROI coincides with maximal assistance
SensorConnect is a high-volume real-time EPCIS system which models the real-world
Tests by University of Arkansas show peak performance >400,000 events per sec