xldb2011 wed 1415_andrew_lamb-buildingblocks
![Page 1: Xldb2011 wed 1415_andrew_lamb-buildingblocks](https://reader036.vdocuments.us/reader036/viewer/2022081403/55581ea3d8b42a5e468b4eb4/html5/thumbnails/1.jpg)
VERTICA CONFIDENTIAL DO NOT DISTRIBUTE
Building Blocks for Large Analytic Systems
Andrew Lamb ([email protected])
Vertica Systems, an HP Company
Oct 19, 2011
![Page 2: Xldb2011 wed 1415_andrew_lamb-buildingblocks](https://reader036.vdocuments.us/reader036/viewer/2022081403/55581ea3d8b42a5e468b4eb4/html5/thumbnails/2.jpg)
Outline
• Emergence of new hardware, software and data volume is driving the wave of new analytic systems in spite of mature existing systems.
• Common architectural choices when building these new systems
• Choices we made when building the Vertica Analytic Database
• Broadly applicable (and widely used)
![Page 3: Xldb2011 wed 1415_andrew_lamb-buildingblocks](https://reader036.vdocuments.us/reader036/viewer/2022081403/55581ea3d8b42a5e468b4eb4/html5/thumbnails/3.jpg)
Architectural Changes
• Driven by changing
– Hardware: (really) cheap x86_64 servers, high capacity disk, high speed networking
– Software: Linux, Open Source
– Requirements: Data Deluge
• Hard for Legacy Software
– Compatibility is brutal
– Linux x86_64 today
– Solaris, HPUX, IRIX, etc. 10 years ago
[Figure: chart with axes labeled "Time" and "Better-ness"]
• Storage Organization and Processing Principles
![Page 4: Xldb2011 wed 1415_andrew_lamb-buildingblocks](https://reader036.vdocuments.us/reader036/viewer/2022081403/55581ea3d8b42a5e468b4eb4/html5/thumbnails/4.jpg)
[Storage + Processing] MPP (no shared disk)
• Any modern system should run on a cluster of nodes, scale up/down
• Mid range servers are really cheap (to rent or own)
• Aggregate available resources are enormous and scalable:
– I/O Bandwidth (Disk and Network)
– Cores, Memory, etc.
[Figure: SMP Server (CPUs, System Bus, Main Memory/Cache, Shared Disk) compared with MPP Servers (Local Disk, Network)]
![Page 5: Xldb2011 wed 1415_andrew_lamb-buildingblocks](https://reader036.vdocuments.us/reader036/viewer/2022081403/55581ea3d8b42a5e468b4eb4/html5/thumbnails/5.jpg)
[Storage] Keep Your Data Sorted
• For large data sets, extra indexes are expensive to maintain
• Much better to index the data itself by sorting it
• Easy to find what you are looking for, reduces seeks
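As a rough illustration of why sorted data acts as its own index, the sketch below finds all rows in a key range with two binary searches and then reads one contiguous slice. The data and the `range_lookup` helper are hypothetical, made up for this example; they are not taken from Vertica.

```python
from bisect import bisect_left, bisect_right

def range_lookup(sorted_keys, low, high):
    """Return (start, stop) so that sorted_keys[start:stop] holds every key in [low, high]."""
    start = bisect_left(sorted_keys, low)    # first position with key >= low
    stop = bisect_right(sorted_keys, high)   # first position with key > high
    return start, stop

# Hypothetical column of keys, stored in sorted order.
keys = [3, 7, 7, 12, 18, 25, 31, 40]

start, stop = range_lookup(keys, 7, 25)
matching = keys[start:stop]  # one contiguous run, no separate index structure needed
```

Because the matches are contiguous, a disk-based version of this lookup would need one seek followed by a sequential read, rather than one seek per matching row.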
![Page 6: Xldb2011 wed 1415_andrew_lamb-buildingblocks](https://reader036.vdocuments.us/reader036/viewer/2022081403/55581ea3d8b42a5e468b4eb4/html5/thumbnails/6.jpg)
[Storage] Distribute Data by Value (not chunks)
• “Sharding” -- Distribute data so you can easily find it again (not round robin)
• Segmenting in analytics layer simplifies app layer
• Round Robin computations won't scale: need to swizzle data around the cluster to do most useful things
• Replication for high availability: not logs
[Figure: data segments A1 to A3, B1 to B3, and C1 to C3 placed across cluster nodes]
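Distribution by value can be sketched as hashing a segmentation key to pick a node, so that every row with the same key value lands in the same place. This is a minimal sketch under assumptions: the `node_for` and `segment` helpers, the use of CRC32 as the hash, and the sample rows are all illustrative choices, not the scheme any particular system uses.

```python
import zlib

def node_for(key: str, num_nodes: int) -> int:
    """Map a key value to a node number with a deterministic hash (CRC32 here)."""
    return zlib.crc32(key.encode("utf-8")) % num_nodes

def segment(rows, key_field, num_nodes):
    """Place each row on the node chosen by the value of its key field."""
    nodes = [[] for _ in range(num_nodes)]
    for row in rows:
        nodes[node_for(str(row[key_field]), num_nodes)].append(row)
    return nodes

# Hypothetical rows, segmented by customer across a 3-node cluster.
rows = [
    {"customer": "alice", "amount": 10},
    {"customer": "bob", "amount": 25},
    {"customer": "alice", "amount": 7},
    {"customer": "carol", "amount": 40},
]
nodes = segment(rows, "customer", 3)
```

Since all rows for one customer sit on one node, a per-customer aggregate can be computed locally on each node without moving rows around the cluster, which is what round-robin placement cannot offer.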
![Page 7: Xldb2011 wed 1415_andrew_lamb-buildingblocks](https://reader036.vdocuments.us/reader036/viewer/2022081403/55581ea3d8b42a5e468b4eb4/html5/thumbnails/7.jpg)
[Storage] Store the Data in Columns
• Rarely do all fields of a data set appear in analytic queries
• Really don't want to waste I/O for data you don't need
• Columns let you be clever about applying predicates individually, further reducing I/O
• Not appropriate for row lookup or transaction processing systems
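The I/O argument for columns can be sketched with a toy in-memory table: the query below touches only the two columns it needs, evaluates the predicate on one column first, and reads the second column only at matching positions. The table contents and the `select_sum` helper are hypothetical.

```python
# Hypothetical table stored column by column rather than row by row.
columns = {
    "region": ["east", "west", "east", "north"],
    "price": [10.0, 25.0, 7.5, 40.0],
    "comment": ["long text", "long text", "long text", "long text"],  # never read below
}

def select_sum(columns, predicate_col, predicate, value_col):
    """Sum value_col over the rows whose predicate_col value satisfies predicate."""
    # Step 1: scan only the predicate column to find matching row positions.
    positions = [i for i, v in enumerate(columns[predicate_col]) if predicate(v)]
    # Step 2: read the value column only at those positions.
    return sum(columns[value_col][i] for i in positions)

total = select_sum(columns, "region", lambda r: r == "east", "price")
```

The wide `comment` column is never touched, which is the saving a row-oriented layout cannot make, since each row read drags every field along with it.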
![Page 8: Xldb2011 wed 1415_andrew_lamb-buildingblocks](https://reader036.vdocuments.us/reader036/viewer/2022081403/55581ea3d8b42a5e468b4eb4/html5/thumbnails/8.jpg)
[Storage] Write it Once, Don't Modify
• Physical use of disk objects should be write once (append only)
• System should present the illusion of mutability
• Immutable storage drastically simplifies coherence in a distributed system
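One way to present mutability on top of write-once storage is to append new data as new immutable files and record deletions as markers that readers skip. The sketch below assumes that design; the `AppendOnlyTable` class and its delete markers are hypothetical illustrations, not a description of how Vertica implements this.

```python
class AppendOnlyTable:
    """Toy table: stored files are never modified, yet callers can update and delete."""

    def __init__(self):
        self._files = []       # each entry is an immutable tuple of rows (a "file")
        self._deleted = set()  # (file_index, row_index) markers; entries are only ever added

    def insert(self, rows):
        # New data always goes into a brand-new file.
        self._files.append(tuple(rows))

    def delete(self, predicate):
        # Deleting marks rows as dead instead of rewriting the file that holds them.
        for f, rows in enumerate(self._files):
            for r, row in enumerate(rows):
                if predicate(row):
                    self._deleted.add((f, r))

    def update(self, predicate, change):
        # An update is a delete of the old version plus an insert of the new one.
        updated = [change(row) for row in self.scan() if predicate(row)]
        self.delete(predicate)
        if updated:
            self.insert(updated)

    def scan(self):
        # Readers see only rows that have not been marked deleted.
        for f, rows in enumerate(self._files):
            for r, row in enumerate(rows):
                if (f, r) not in self._deleted:
                    yield row

table = AppendOnlyTable()
table.insert([{"id": 1, "qty": 1}, {"id": 2, "qty": 2}])
table.update(lambda row: row["id"] == 1, lambda row: {**row, "qty": 5})
visible = list(table.scan())
```

Because existing files never change, a replica only has to agree on which files and markers exist, which is a simpler coherence problem than agreeing on the current bytes of a file being edited in place.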
![Page 9: Xldb2011 wed 1415_andrew_lamb-buildingblocks](https://reader036.vdocuments.us/reader036/viewer/2022081403/55581ea3d8b42a5e468b4eb4/html5/thumbnails/9.jpg)
[Figure: "Random vs Sequential Reads", chart of throughput in MB/s (axis from 0 to 40) for Random and Sequential reads]
[Processing] Use Large Sequential IOs
• Spinning disks are very good at large sequential IOs
• You really don't want to whipsaw the read head (another reason why secondary indexes are bad)
• Especially useful with sorted data
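A minimal sketch of the access pattern: read a file front to back in large chunks instead of issuing many small reads at scattered offsets. The `read_sequential` helper and the 8 MiB default chunk size are assumptions chosen for illustration; a suitable size depends on the hardware.

```python
def read_sequential(path, chunk_size=8 * 1024 * 1024):
    """Yield the contents of a file in large chunks, in on-disk order."""
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)  # one large request instead of many small ones
            if not chunk:               # an empty result means end of file
                break
            yield chunk
```

A caller would typically process each chunk as it arrives, for example `for chunk in read_sequential("column.dat"): ...`, where `column.dat` is a hypothetical file name.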
![Page 10: Xldb2011 wed 1415_andrew_lamb-buildingblocks](https://reader036.vdocuments.us/reader036/viewer/2022081403/55581ea3d8b42a5e468b4eb4/html5/thumbnails/10.jpg)
[Processing] Trade CPU for I/O Bandwidth
• Use very aggressive compression, even if it seems/is wasteful (keep getting more cores)
• Example: data type specific encoding, then LZO before actually writing to disk
• Hide additional latency with execution pipeline parallelism
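The two-stage idea (a data type specific encoding followed by a general-purpose compressor) can be sketched as below. Two substitutions are assumed here: run-length encoding stands in for the type-specific step, and `zlib` from the Python standard library stands in for LZO, which the standard library does not provide. The helper names are hypothetical.

```python
import json
import zlib

def rle_encode(values):
    """Run-length encode: collapse consecutive repeats into [value, count] pairs."""
    runs = []
    for v in values:
        if runs and runs[-1][0] == v:
            runs[-1][1] += 1
        else:
            runs.append([v, 1])
    return runs

def rle_decode(runs):
    """Expand [value, count] pairs back into the original sequence."""
    return [v for v, n in runs for _ in range(n)]

def write_column(values):
    """Stage 1: run-length encode. Stage 2: general-purpose compression (zlib, not LZO)."""
    return zlib.compress(json.dumps(rle_encode(values)).encode("utf-8"))

def read_column(blob):
    """Undo both stages in reverse order."""
    return rle_decode(json.loads(zlib.decompress(blob).decode("utf-8")))

# Hypothetical sorted column with long runs of repeated values.
column = ["east"] * 1000 + ["west"] * 500
blob = write_column(column)
```

Run-length encoding pays off most on sorted, low-cardinality columns, so this step combines naturally with keeping the data sorted and stored by column.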
![Page 11: Xldb2011 wed 1415_andrew_lamb-buildingblocks](https://reader036.vdocuments.us/reader036/viewer/2022081403/55581ea3d8b42a5e468b4eb4/html5/thumbnails/11.jpg)
[Processing] Mess with the Data where it Lives
• Bring your processing to the data, not data to the processing
• System should push computation down close to data (even if calculation turns out to be redundant)
• Example: multi-phase aggregation, each phase tries to aggregate intermediates before passing up the memory / node hierarchy
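Multi-phase aggregation can be sketched as a grouped sum computed in two phases: each node first collapses its own rows into partial sums, and only those small partials travel up to be merged. The node contents and the helper names below are hypothetical.

```python
from collections import Counter

def local_aggregate(rows):
    """Phase 1, run where the data lives: sum values per key on a single node."""
    partial = Counter()
    for key, value in rows:
        partial[key] += value
    return partial

def merge_partials(partials):
    """Phase 2, run higher up: combine the per-node partial sums."""
    total = Counter()
    for p in partials:
        total.update(p)  # Counter.update adds counts key by key
    return total

# Hypothetical (key, value) rows held by two nodes.
node_rows = [
    [("a", 1), ("b", 2), ("a", 3)],
    [("a", 10), ("c", 5)],
]
result = merge_partials(local_aggregate(rows) for rows in node_rows)
```

Each node sends at most one entry per distinct key instead of one entry per row, so the data crossing the network shrinks even when the final merge repeats some of the work.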
![Page 12: Xldb2011 wed 1415_andrew_lamb-buildingblocks](https://reader036.vdocuments.us/reader036/viewer/2022081403/55581ea3d8b42a5e468b4eb4/html5/thumbnails/12.jpg)
[Processing] Declarative & Extendable
• Give users a declarative query language for most tasks
• Writing procedural code for simple queries is wasteful
• Provide procedural extensions for complex analysis
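The contrast can be sketched with Python's built-in `sqlite3` module, used here only as a convenient SQL engine and not as an example of an analytic database. The same grouped sum is written declaratively and procedurally, and a user-defined function shows one form a procedural extension can take. The table, data, and the `size_bucket` function are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("east", 10.0), ("west", 25.0), ("east", 7.5)],
)

# Declarative: state the result wanted and let the engine decide how to compute it.
declarative = dict(
    conn.execute("SELECT region, SUM(amount) FROM sales GROUP BY region")
)

# Procedural: the same answer, with the grouping logic written out by hand.
procedural = {}
for region, amount in conn.execute("SELECT region, amount FROM sales"):
    procedural[region] = procedural.get(region, 0.0) + amount

# Procedural extension: register custom logic, then call it from the declarative query.
conn.create_function(
    "size_bucket", 1, lambda amount: "big" if amount >= 20 else "small"
)
buckets = dict(
    conn.execute(
        "SELECT size_bucket(amount), COUNT(*) FROM sales GROUP BY size_bucket(amount)"
    )
)
```

The declarative form leaves the engine free to choose the execution plan, while the extension hook covers analysis that is awkward to express in the query language alone.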
![Page 13: Xldb2011 wed 1415_andrew_lamb-buildingblocks](https://reader036.vdocuments.us/reader036/viewer/2022081403/55581ea3d8b42a5e468b4eb4/html5/thumbnails/13.jpg)
Conclusion
• Emergence of new hardware, software and data volumes implies certain architectural choices in modern big data analytic systems.
• Commonly observed in new systems
• Vertica (unsurprisingly) features all of them
![Page 14: Xldb2011 wed 1415_andrew_lamb-buildingblocks](https://reader036.vdocuments.us/reader036/viewer/2022081403/55581ea3d8b42a5e468b4eb4/html5/thumbnails/14.jpg)
Introducing Community Edition
• Free version of Vertica
– help steward better analytics and the democratization of data
– closed source, but open access!
• All of the features and advanced analytics of Vertica Enterprise Edition
• Seamless upgrade to Enterprise Edition
• Limited to 1 TB raw data on 3-node hardware cluster
• Revamping Community area of Vertica’s website for knowledge sharing, third party tools, and code sharing
• Launching academic and non-profit research use program as well
• Sign up at: www.vertica.com/community