AquaQ Analytics Kx Event - DataDirect Networks Presentation
DESCRIPTION
TRANSCRIPT
Getting the most out of multi-year and multi-source trading history
Glenn Wright, EMEA Systems Architect, DDN
June 2014
© 2013 DataDirect Networks, Inc.
ddn.com
Agenda
Uh? Who is DDN?
The Evolution of Data Handling in Market Systems
The Big Analytics Crunch
What’s hot, what’s not… It’s parallel performance, stupid!
DDN | The “Big” In Big Data
800%
Paypal accelerates stream processing and fraud analytics by 8x with DDN, saves $100Ms.
1TB/s
The world’s fastest file system, to power the US’s fastest supercomputer, is powered by DDN.
Tier 1
Tier 1 CDN accelerates the world’s video traffic using DDN technology to exceed customer SLAs.
DDN | The Technology Behind The World’s Leading Data-Driven Organizations
HPC & Big Data Analysis
Cloud & Web Infrastructure
Professional Media
Security
Big Data & Cloud Infrastructure DDN’s Award-Winning Product Portfolio
Analytics Reference Architectures

EXAScaler™ — Petascale Lustre® Storage
10Ks of clients, 1TB/s+, HSM
Linux HPC clients; NFS & CIFS [2014]

GRIDScaler™ — Enterprise Scale-Out File Storage
~10K clients, 1TB/s+, HSM
Linux/Windows HPC clients; NFS & CIFS

Storage Fusion Architecture™ Core Storage Platforms
SFA™12KX: 48GB/s, 1.7M IOPS, 1,680 drives in 2 racks, optional embedded computing
SFA7700: 12.5GB/s, 450K IOPS, 60 drives in 4U, 228 drives in 12U
Flexible drive configuration: SATA, SAS, SSD
SFX™ Automated Flash Caching: adaptive transparent flash cache; SFX API gives users control [pre-staging, alignment, by-pass]

Cloud Foundation
WOS® 3.0: 32 trillion unique objects, geo-replicated cloud storage, 256 million objects/second, self-healing cloud, parallel Boolean search
WOS7000: 60 drives in 4U, self-contained servers

Big Data Platform Management
DirectMon™, cloud tiering
Infinite Memory Engine™ [Tech Preview]: distributed file system buffer cache
[Chart: growth in worldwide exchange trading activity, 1990–2011 — total and by region (Americas, Asia-Pacific, Europe-Africa-Middle East)]
Evolution of Market Systems
SOURCE: World Federation of Exchanges 2011 Annual Report and Statistics
DASD → F DASD → Scale-out NAS → Parallel File System
UNDERLYING ISSUE: Gaping Performance Bottlenecks
• Moore’s Law has outstripped improvements in disk drive technology by two orders of magnitude over the last decade
• Analytics moved to HPC clusters
• Today’s servers are hopelessly unbalanced between the CPU’s need for data and the HDD’s ability to keep up
[Chart: HDD vs. CPU relative performance improvement, 2005–2014 — the gap grows to roughly 20,000×]
Welcome to the Big Analytics Crunch
• 500TB to >2PB of historical data for one TZ
• Distributed cache: online model reads data at 100s of GB/s IO (tick DB application such as kdb+)
• 3D “cube” of in-memory distributed data, online, real-time
• 100s of services/servers working together in memory: low-latency analytics with the simplicity of persistent file system semantics
• Burst-buffer low-latency operation mainstream in FSI
► Real-time back-testing
► Real-time intra-day risk positioning
Why DDN & Why Parallel ?
In Production
Many systems deployed worldwide
@ Global Investment Banks and Hedge Funds
Performance and Consolidation
A back-test in a few seconds is much closer to the trade event
Mix online history and real-time trade analytics
Consolidate in-memory databases against one copy of data
Flash at scale is NOT scale at capacity
Single namespace, history and real-time
Limitless scale-up and scale-out with kdb+…
[Diagram: KDB+ (1) … KDB+ (16) instances on a compute fabric, backed by a Lustre service — MDS primary and replica with the MDT on a DDN SFA7700, and OSS1–OSS4 on a second DDN SFA7700]
What we changed:
export SLAVECOUNT=160    # number of kdb+ client tasks
export CLIENTCOUNT=10    # number of processes per kdb+ server
q script query:
\l beforeeach.q
R1S:rrdextras flip`k`v!(" S*";",")0:`:rrd.csv
/ year-hibid output
t:"YRHIBID";
fn:{[f;s;d] flip`date`sym`a!flip raze(f each s)peach d};
NRS:.tasks.rxsg[H;`$t;1;(fn[hb];apickAs[R1S;`Symbol];reverse ALLDATES2011)];
\l aftereach.q
symbols:
glenn$ head rrd.csv
1,Symbol,LKQQ
1,Symbol,LHDE
1,Symbol,LNJO
1,Symbol,LLTR
1,Symbol,LRFC
1,Symbol,LQGA
1,Symbol,LTNQ
1,Symbol,LSAG
1,Symbol,LQIA
1,Symbol,LKSJ
… x850 symbols vs 84
What we changed (2):

glenn$ more hostport.txt
127.0.0.1:5000
127.0.0.1:5001
127.0.0.1:5002
127.0.0.1:5003
127.0.0.1:5004
127.0.0.1:5005
127.0.0.1:5006
127.0.0.1:5007
127.0.0.1:5008
127.0.0.1:5009
# replace: $QEXEC initdb.k -g 1 -p $((baseport+i)) </dev/null &>log$((baseport+i)).log &
for i in `seq 20000 20009`
do
  for j in `seq 0 15`
  do
    echo ssh server-$j "cd $HOME; QHOME=/home/glenn/q $HOME/l64/q initdb.k -p $i -g 1 </dev/null &> $i-$j.log &"
    ssh gp-2-$j "cd $HOME; QHOME=/home/mpiuser/q $HOME/l64/q initdb.k -p $i -g 1 </dev/null &> $i-$j.log &"
    while ! nc -z "gp-2-$j" $i; do sleep 0.1; done
  done
done
# get ready ??
echo `date -u` $SLAVECOUNT slave tasks started
# then start the servers aimed at the slaves
baseport=5000
for ((i=0; i<$CLIENTCOUNT; i++));
do
  $QEXEC initdb.k -g 1 -s -$SLAVECOUNT -p $((baseport+i)) </dev/null &>log$((baseport+i)).log &
  while ! nc -z localhost $((baseport+i)); do sleep 0.1; done
done
# check that everything can startup : $QEXEC startdb.q -s -$SLAVECOUNT -q
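The readiness polling used above — looping on `nc -z` until a q process answers on its port — can be factored into a small helper. A minimal sketch, assuming bash: the `wait_for_port` name is ours, and it uses bash's built-in /dev/tcp redirection instead of nc, so it also runs on hosts without nc installed.

```shell
#!/usr/bin/env bash
# Hypothetical helper equivalent to: while ! nc -z host port; do sleep 0.1; done
# Uses bash's /dev/tcp pseudo-device, so no external nc binary is required.
wait_for_port() {
  local host=$1 port=$2
  until (exec 3<>"/dev/tcp/$host/$port") 2>/dev/null; do
    sleep 0.1
  done
}

# Usage sketch: start a throwaway listener, block until it accepts, then clean up.
python3 -m http.server 18231 >/dev/null 2>&1 &
srv=$!
wait_for_port 127.0.0.1 18231
echo "port 18231 ready"
kill "$srv" 2>/dev/null
```

In the launch script this would replace each inline `while ! nc -z …; do sleep 0.1; done` loop with a single `wait_for_port` call.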
What we changed (3):
startdb.q …
/ check all servers are there
/ {hopen(x;500)}each("I"$getenv`BASEPORT)+til"I"$getenv`SLAVECOUNT;
{hopen(x;2500)}each hsym`$read0`:slavehostport.txt;
\l initdb.k
{hopen(x;500)}each 5000+til"I"$getenv`CLIENTCOUNT;
\\
cat slavehostport.txt:
192.168.3.51:20000
192.168.3.51:20001
192.168.3.51:20002
192.168.3.51:20003
192.168.3.51:20004
192.168.3.51:20005
192.168.3.51:20006
192.168.3.51:20007
192.168.3.51:20008
192.168.3.51:20009
192.168.3.52:20000
192.168.3.52:20001
192.168.3.52:20002
…. 160 times
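Rather than maintaining 160 host:port lines by hand, the file can be generated. A minimal sketch in shell — the 192.168.3.51–66 host range is an assumption extrapolated from the truncated listing above:

```shell
#!/usr/bin/env bash
# Generate slavehostport.txt: 16 assumed slave hosts x 10 ports = 160 lines.
for h in $(seq 51 66); do            # hosts 192.168.3.51 .. 192.168.3.66 (assumed)
  for p in $(seq 20000 20009); do    # the ten slave ports from the launch script
    echo "192.168.3.$h:$p"
  done
done > slavehostport.txt
wc -l < slavehostport.txt            # should report 160
```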
[Diagram: q clients fanning out to slave groups — Slave (1), Slave (2), Slave (3) … Slave n, in blocks of Slave ×10 — all reading the Lustre/DDN service mounted at /mnt/onefilesystem]
Up to 1TB/sec… “n”-way server striping, or striping by date/sym
Results of Scaling the service ….
[Chart: latency (number of seconds per query), single thread vs. Lustre, y-axis 0–250; lower is better]
The parallel FS solution shows a near-linear scalability model for one instance running over many nodes, as measured from kdb+. Latency is the time to complete the kdb+ query over 245GB of data. To put this in context, these nodes were equipped with only 64GB of memory.
Some of the many benefits of kdb+ on a parallel FS
1. Significant decrease in operational latency per kdb+ query, especially when running queries that search through significant amounts of historical market information. Achieved by balancing content around multiple file system servers.
2. Parallelization of kdb+ query “threads” in a single shared namespace, allowing a user to treat any data workload independently from other data workloads. The “query from hell” on a production system is now OK.
3. Simultaneous read/write operations on a single namespace for the entire database and for any number of kdb+ clients (e.g. end-of-day data consolidations into an hdb instance).
4. Sharing of data amongst different independent hdb/rdb instances. Many instances of kdb+ can view the same data, meaning that strategies for data sharing and private data segments may be consolidated onto the same space. This avoids the need for kdb+ admins to physically copy data around the network or disks.
5. kdb+ content can be “striped” around all FS servers, or can be allocated in a round-robin fashion against each server. Striping allows some files to attain maximal I/O rates for a single kdb+ “object”.
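Point 5 maps onto Lustre's per-directory striping controls. A configuration sketch using the standard `lfs` tool — the /mnt/onefilesystem/hdb path is hypothetical, and these commands assume a mounted Lustre client:

```shell
# Configuration fragment (needs a live Lustre mount; paths are hypothetical).
# Default layout on the hdb directory: stripe new files across ALL object
# storage servers -- the "n"-way striping option from the slide:
lfs setstripe -c -1 /mnt/onefilesystem/hdb
# Alternative: one stripe per file, letting Lustre place whole files
# round-robin across servers (the by-date/sym allocation option):
lfs setstripe -c 1 /mnt/onefilesystem/hdb
# Inspect the default layout currently set on the directory:
lfs getstripe -d /mnt/onefilesystem/hdb
```

Setting the default on the directory affects only files created afterwards, so the layout choice is best made before the hdb is loaded.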
Next Steps?