"interactive deep analytics" dashboard
Named leader in report.
Founded in 2009, acquired by AOL in 2014
Using Big Data stack since 2009
75 people - 30 R&D
1.5T of daily data, >100 data sources
Helping marketers to optimize their spend
Across: Channels | Devices | Online + Offline
Convertro
✓ Clear
✓ Actionable
✓ Great UX/I
Successful Dashboard
Rendering time
Storage
Cost
UX/I
Considerations
Processing time
Insights
Comparison
Over time
Explore
“Sticky” Configuration
RT metrics
Integrate
USE CASE #1
Speed Batch
Materialization to one table is too costly (belated massive updates)
Leverage Vertica’s sorted data structure
Join data at run time ( O(n) )
Query
Spend Batch
Revenue Speed
Query* Merged Structure
Spend Batch
Revenue Speed
* λ architecture
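
The run-time join above can be sketched as a single-pass merge of two inputs that are already sorted on the join key, which is the property the slide attributes to Vertica's sorted storage. This is a minimal Python illustration, not Vertica's implementation. The name `merge_join`, the (key, value) row shape, unique keys per side, and the reading of the diagram as spend coming from the batch layer and revenue from the speed layer are all assumptions.

```python
def merge_join(batch_rows, speed_rows):
    """Full outer join of two key-sorted lists of (key, value) in one pass.

    Assumes each list is sorted by key and has at most one row per key.
    Cost is O(n + m) because each cursor only moves forward.
    """
    merged = []
    i = j = 0
    while i < len(batch_rows) and j < len(speed_rows):
        b_key, spend = batch_rows[i]
        s_key, revenue = speed_rows[j]
        if b_key == s_key:
            merged.append((b_key, spend, revenue))
            i += 1
            j += 1
        elif b_key < s_key:
            # Key exists only in the batch layer so far.
            merged.append((b_key, spend, None))
            i += 1
        else:
            # Key exists only in the speed layer so far.
            merged.append((s_key, None, revenue))
            j += 1
    # Drain whichever side still has rows.
    merged.extend((key, value, None) for key, value in batch_rows[i:])
    merged.extend((key, None, value) for key, value in speed_rows[j:])
    return merged
```

Because neither side is rewritten, late updates to one layer do not force a rebuild of a materialized table; the merge is simply repeated at query time.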
USE CASE #2
Different metrics with 1:N relationships
Avoid joins at query time (if possible)
Pre-join and aggregate by dimensions
Pre-joining does not necessarily explode your data store
Visits, Conversions, Impressions
[Diagram: Visits ⨝ Conversions ⨝ Impressions, aggregated (Σ)]
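
One way to read the pre-join advice is to aggregate each metric to the shared dimensions first and only then combine them, so the stored table has at most one row per dimension combination no matter how the raw events relate to each other. The sketch below assumes that reading; `prejoin`, the (dimensions, value) row shape, and the metric names used as keyword arguments are illustrative and not taken from the talk.

```python
from collections import defaultdict


def prejoin(**metric_rows):
    """Combine several metrics into one table keyed by dimension tuple.

    Each keyword argument is a metric name mapped to a list of
    (dimensions, value) rows. Values are summed per dimension tuple,
    so 1:N relationships between raw events never multiply rows.
    """
    table = defaultdict(dict)
    for metric, rows in metric_rows.items():
        for dims, value in rows:
            table[dims][metric] = table[dims].get(metric, 0) + value
    return dict(table)


# Hypothetical raw events keyed by (date, channel).
table = prejoin(
    visits=[(("2015-01-01", "email"), 1), (("2015-01-01", "email"), 1)],
    conversions=[(("2015-01-01", "email"), 1)],
    impressions=[(("2015-01-01", "email"), 5), (("2015-01-01", "display"), 7)],
)
```

The row count is bounded by the number of distinct dimension combinations, which is why the pre-joined store need not explode.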
USE CASE #3
Many Dimensions
Limit the number of records returned to the screen – visualize the most significant data
Allow dumping data with different QoS
Allow choosing up to X dimensions – not all
For each page, allow choosing different relevant dimensions
Build different data structures for different pages
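
The first and third points can be sketched together: cap how many dimensions a request may group by, then return only the most significant rows. This is an illustrative sketch; `MAX_DIMENSIONS` (standing in for the slide's "X"), `build_widget_rows`, and the dict-shaped rows are assumptions.

```python
import heapq

MAX_DIMENSIONS = 3  # hypothetical value for the "X" on the slide


def build_widget_rows(rows, dimensions, metric, max_rows=10):
    """Group rows by the chosen dimensions and keep the top max_rows by metric."""
    if len(dimensions) > MAX_DIMENSIONS:
        raise ValueError(f"choose at most {MAX_DIMENSIONS} dimensions")
    totals = {}
    for row in rows:
        key = tuple(row[d] for d in dimensions)
        totals[key] = totals.get(key, 0) + row[metric]
    # nlargest avoids sorting the full result when only a few rows are shown.
    return heapq.nlargest(max_rows, totals.items(), key=lambda item: item[1])
```

A full export ("dump") would bypass `max_rows` and could run at a lower priority than the on-screen query.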
USE CASE #4
Same data different rendering
Query locality caching
Backend does data rendering
Shared configuration across widgets
MPP has limited query schedulers
Table Query Cache
Σ Widget 1
Widget 2
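
The diagram suggests two widgets served from one cached query result, with the backend doing the rendering. A minimal sketch of that idea follows; `QueryCache`, the render functions, and `fake_run_query` (a stand-in for the real database call) are hypothetical names, and cache expiry is left out.

```python
class QueryCache:
    """Run each distinct (query, params) once and reuse the rows."""

    def __init__(self, run_query):
        self._run_query = run_query
        self._results = {}
        self.db_hits = 0  # how many times the database was actually queried

    def fetch(self, query, params=()):
        key = (query, tuple(params))
        if key not in self._results:
            self.db_hits += 1
            self._results[key] = self._run_query(query, params)
        return self._results[key]


def render_table(rows):
    """Widget 1: one text line per row."""
    return [f"{name}: {value}" for name, value in rows]


def render_total(rows):
    """Widget 2: a single aggregated number (the Σ in the diagram)."""
    return sum(value for _, value in rows)


def fake_run_query(query, params):
    # Stand-in for a query against the MPP database.
    return [("email", 40), ("search", 25)]


cache = QueryCache(fake_run_query)
widget_1 = render_table(cache.fetch("revenue_by_channel", ("2015-01",)))
widget_2 = render_total(cache.fetch("revenue_by_channel", ("2015-01",)))
```

Both widgets share one database round trip, which matters when the MPP scheduler can only run a limited number of queries at once.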
USE CASE #5
Real Time data points ticker
Sometimes you don’t have to be 100% accurate or consistent
try using:
Extrapolation
Sampling
Different data stores
Heuristics
logs Speed layer Ticker Every X minutes
Real time extrapolation
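
Extrapolation for a ticker can be as simple as taking the last two snapshots computed every X minutes, deriving a rate, and projecting forward until the next snapshot arrives. The sketch below assumes a linear rate; the function names and the (timestamp_seconds, value) snapshot shape are illustrative.

```python
def rate_from_snapshots(previous, current):
    """Average change per second between two (timestamp_seconds, value) snapshots."""
    (t0, v0), (t1, v1) = previous, current
    return (v1 - v0) / (t1 - t0)


def extrapolate(current, rate_per_second, now):
    """Project the latest snapshot forward to `now`, assuming a constant rate."""
    timestamp, value = current
    return value + rate_per_second * (now - timestamp)
```

The displayed number is an estimate that is corrected each time a real snapshot arrives, which is the accuracy trade-off the slide accepts.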
Hydro – Data Rendering Service
Hydro
EXTRACT
TRANSFORM
RENDER
ETL Web/App Server
API
DB1
DB2
Connect to any data source
Multi level caching and invalidation
Applying data transformation and rendering
Logic sharing
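
The extract, transform, render flow with caching and invalidation can be sketched generically as below. This is not Hydro's API; `RenderingService`, its methods, and the `db1` source function are hypothetical names used only to illustrate the shape of such a service, and only a single cache level is shown.

```python
class RenderingService:
    """Extract rows from a named source, transform them, then render them."""

    def __init__(self, sources):
        self.sources = sources  # name -> callable(**params) returning rows
        self._cache = {}

    def extract(self, source, params):
        key = (source, tuple(sorted(params.items())))
        if key not in self._cache:
            self._cache[key] = self.sources[source](**params)
        return self._cache[key]

    def invalidate(self, source):
        """Drop every cached result that came from the given source."""
        for key in [k for k in self._cache if k[0] == source]:
            del self._cache[key]

    def serve(self, source, params, transform, render):
        return render(transform(self.extract(source, params)))


calls = []  # records each real fetch so cache behaviour is visible


def db1(month):
    calls.append(month)
    return [("search", 25), ("email", 40)]


def largest_first(rows):
    return sorted(rows, key=lambda row: row[1], reverse=True)


def as_chart_series(rows):
    return {"categories": [r[0] for r in rows], "data": [r[1] for r in rows]}


service = RenderingService({"db1": db1})
chart = service.serve("db1", {"month": "2015-01"}, largest_first, as_chart_series)
```

Keeping transform and render as plain functions is one way to get the logic sharing the slide mentions: several endpoints can reuse the same steps against different sources.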
Understand the requirements
One technology doesn't fit all
One data structure doesn't fit all
Good UX takes into account Data and Technology considerations
Data Processing and Mining
Analytics DB - Vertica
Built for analytics: Storage / Query engine / Optimizer
Column-oriented store, sorted
True MPP
Deals well with high cardinality and sparse data
*not open source
Real Time metrics
Web Stack
Server: Pandas, Hydro
Client: Backbone, Marionette, RequireJS, Handlebars, Highcharts, D3, Underscore, Twitter Bootstrap, SlickGrid ...
Architecture
Visualization