AquaQ Analytics kx Event - Datawatch Presentation
TRANSCRIPT
Visual Data Discovery with kx and Datawatch
Jeremy Bentham
• 28 Aug 2013 – Datawatch Completes Acquisition of Panopticon
Datawatch History
• Founded in 1986, Public Since 1992 (NASDAQ CM: DWCH)
• Global Operations and Support
US
EMEA: UK, Germany, France, Sweden
Asia Pac: Australia, Singapore, Hong Kong, India, Philippines
• Pioneer in Transforming All Types of Information
Structured (RDBMS, Data Warehouses)
Semi-Structured (PDF, Reports, Text …)
Unstructured (Log Files, EDI …)
• Over 40,000 customers worldwide
99 of the Fortune 100 & 487 of the Fortune 500
Large Number of SMBs
Across All Verticals
What do we do?
• Visual Data Discovery
Historically focussed on:
• Front & Mid Office
• Risk, Surveillance, Research, Sales & Trading
• For Buy & Sell Side, Regulators, Exchanges & ECNs
Now: Still Capital Markets, plus:
• Energy & Utilities, Telco, Retail, Manufacturing, etc.
Which Means?
• Reducing the time taken to understand your data.
Effectively:
• Find the Weird Stuff
Using: Designer, Server & Web Client
So From:
To:
Visual Data Display
Time Series
Producing
Competing With
How we’re Differentiated
• Assume data is never at rest
• Capital Markets Focus
• Real Time Streaming
• Time Series
• High Density Visuals
• Embed (Java & .NET SDKs)
• Java & .NET Servers
• Connectivity
kx Connectivity
Synchronous: Request / Response
• Issue a q query & retrieve either a Table, Dictionary, Vector or Value
Asynchronous Subscribe
• Subscribe to Service, Table & Symbols
• Keeping latest, or scrolling time window
kx Connectivity: Request / Response, Subscribe
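The two subscription modes above ("keeping latest" and "scrolling time window") can be illustrated with a small sketch. This is not Datawatch's or kx's implementation; the class name `SubscriptionBuffer`, the `on_update` callback and the row layout are hypothetical, and the sketch assumes updates arrive in timestamp order.

```python
from collections import deque


class SubscriptionBuffer:
    """Holds streamed rows as latest-per-symbol and, optionally, a scrolling time window.

    Hypothetical helper for illustration only; assumes timestamps are
    seconds (float) and arrive in non-decreasing order.
    """

    def __init__(self, window_seconds=None):
        self.window_seconds = window_seconds  # None means "keep latest only"
        self.latest = {}                      # sym -> most recent row
        self.window = deque()                 # (ts, sym, row), oldest first

    def on_update(self, ts, sym, row):
        # "Keeping latest": overwrite the previous row for this symbol.
        self.latest[sym] = row
        if self.window_seconds is None:
            return
        # "Scrolling time window": append, then drop rows older than the cutoff.
        self.window.append((ts, sym, row))
        cutoff = ts - self.window_seconds
        while self.window and self.window[0][0] < cutoff:
            self.window.popleft()


buf = SubscriptionBuffer(window_seconds=60)
buf.on_update(0.0, "AAPL", {"price": 100.0})
buf.on_update(30.0, "MSFT", {"price": 50.0})
buf.on_update(90.0, "AAPL", {"price": 101.0})
```

After the third update the window has scrolled past the first row, while `latest` still holds one row per symbol.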
Kx – How to Query?
Either:
• Retrieve all into Memory
• Parameterise queries, and pull back subsets
• Dynamically query (auto-generating q selects)
Retrieve:
• Summaries & Detail
• Sampled Time series
• Down to individual Ticks
Passing through:
• Parameter Values & Vectors of Values
• Time Windows
• Zoom Bounds
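As a rough illustration of parameterised and auto-generated q selects, the sketch below builds a query string from a vector of symbols, a time window and an optional sampling bucket. It is an assumption-laden sketch, not Datawatch's query generator: the function names, the `trade` table and the `sym`, `time` and `price` columns are all hypothetical.

```python
import re
from datetime import time

# Conservative whitelist so user-supplied values cannot inject q code.
_NAME = re.compile(r"^[A-Za-z][A-Za-z0-9.]*$")


def q_time(t):
    """Format a datetime.time as a q time literal, e.g. 09:30:00.000."""
    return f"{t.hour:02d}:{t.minute:02d}:{t.second:02d}.{t.microsecond // 1000:03d}"


def q_sym_filter(symbols):
    """Build the symbol constraint from one or more symbols."""
    if not symbols:
        raise ValueError("at least one symbol is required")
    for s in symbols:
        if not _NAME.match(s):
            raise ValueError(f"unsupported symbol: {s!r}")
    if len(symbols) == 1:
        return "sym=`" + symbols[0]
    return "sym in " + "".join("`" + s for s in symbols)


def build_select(table, symbols, start, end, bucket_minutes=None):
    """Return a q select: detail rows, or a sampled series when bucketed."""
    if not _NAME.match(table):
        raise ValueError(f"unsupported table name: {table!r}")
    where = f"{q_sym_filter(symbols)}, time within ({q_time(start)};{q_time(end)})"
    if bucket_minutes is None:
        # Detail: every tick inside the window (e.g. the current zoom bounds).
        return f"select from {table} where {where}"
    # Sampled time series: last price per symbol per N-minute bucket.
    return (f"select last price by sym, {int(bucket_minutes)} xbar time.minute "
            f"from {table} where {where}")


sampled = build_select("trade", ["AAPL", "MSFT"], time(9, 30), time(16, 0), 5)
detail = build_select("trade", ["AAPL"], time(10, 0), time(10, 5))
```

Zooming in on a chart would then amount to calling `build_select` again with tighter time bounds and a smaller bucket, or no bucket at all to get down to individual ticks.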
Problem vs. Competition
Assumed: Data in Motion
So Direct Data Access
• Implying Fast Data Access / Data Querying
So if the underlying data source is:
Slow
We appear:
Slow
Solution = Caching
• If data is not time sensitive
• (e.g. Typical data warehouse)
• Populate Cache on a one-off, or scheduled basis.
• Dynamic Querying of Cache
• Approach taken by:
• Tableau, TIBCO Spotfire & QlikView
• Their In-Memory Db = Proprietary Cache
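The caching idea on this slide (populate once or on a schedule, then query the cache rather than the slow source) might look roughly like the sketch below. It is a generic illustration, not how any of the named products work internally; `RefreshingCache`, `slow_warehouse_load` and the sample rows are hypothetical. For simplicity the refresh is lazy: it happens on the first query after the interval has elapsed rather than on a background timer.

```python
import time


class RefreshingCache:
    """In-memory copy of a slow source, loaded once or re-loaded on an interval."""

    def __init__(self, loader, refresh_seconds=None, clock=time.monotonic):
        self._loader = loader                    # callable returning an iterable of rows
        self._refresh_seconds = refresh_seconds  # None means one-off population
        self._clock = clock
        self._rows = None
        self._loaded_at = None

    def _is_stale(self):
        if self._rows is None:
            return True                          # never populated
        if self._refresh_seconds is None:
            return False                         # one-off: never refresh
        return self._clock() - self._loaded_at >= self._refresh_seconds

    def query(self, predicate):
        # Hit the slow source only when the cache is empty or stale.
        if self._is_stale():
            self._rows = list(self._loader())
            self._loaded_at = self._clock()
        # Dynamic querying runs against the in-memory copy.
        return [row for row in self._rows if predicate(row)]


load_calls = []


def slow_warehouse_load():
    # Stand-in for an expensive data warehouse query.
    load_calls.append(1)
    return [{"region": "EMEA", "revenue": 10}, {"region": "US", "revenue": 20}]


cache = RefreshingCache(slow_warehouse_load)
emea = cache.query(lambda r: r["region"] == "EMEA")
us = cache.query(lambda r: r["region"] == "US")
```

Both queries are answered from memory, and the slow loader runs only once.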
Search for a Cache
We needed an in-memory cache that could:
• Load quickly
• Perform fast aggregation
• Perform fast filtering
• Work with big datasets
• Understand Time
• Small footprint
• Easy to OEM
• Windows & Linux
Dataset Characteristics
• Typically Sparse Timeseries
• Sensor Data
• Sales/Revenue Transactions
• Latency Data
• Machine Data
• Market Data & Trade Data (Orders & Executions)
• Everywhere we look across verticals, data seems similar to trades & quotes
Way Forward
• Approached kx for OEM
• But our pricing ruled out usage within the Designer
Then:
• 2nd April – 32bit kx – Free for Commercial Use
Next Datawatch Release – Cache Options
• Designer – 32bit kx
• View Single Workbook at a time
• Server – 32bit or 64bit kx Cores
• Host Multiple Workbooks
• Cache up to the memory in the machine (if using 64bit cores)
Our Data Strategy
• If Fast underlying database
• Go Direct
• If Slowwwwww
• Cache into kx,
• Get the query performance that kx provides
More Information
Peter Simpson
Visual Data Discovery
TEL: +44 (0) 798 464 6544