Upload: andrew-pantyukhin

Post on 16-Jun-2015


Page 1: PetaPG

Dispatching Petabytes with PostgreSQL

Andrew Pantyukhin

Page 2: PetaPG

15M media objects

3PB raw data

storage, streaming, processing

Page 3: PetaPG

HDFS? Isilon?

custom solution

Page 4: PetaPG

1000s hard drives

file system per drive

filename = sha256(file)
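
A minimal sketch of the content-addressed naming on this slide, assuming plain files on a mounted drive. The helper names `content_name` and `store`, the chunk size, and the copy-into-place step are hypothetical illustrations, not the talk's actual tooling.

```python
import hashlib
import os
import shutil


def content_name(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read fixed-size chunks so large media files never sit fully in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def store(src_path, drive_root):
    """Copy a file onto a drive under its SHA-256 name (hypothetical layout)."""
    dest = os.path.join(drive_root, content_name(src_path))
    # Identical content always maps to the same name, so an existing
    # destination means this drive already holds a copy.
    if not os.path.exists(dest):
        shutil.copyfile(src_path, dest)
    return dest
```

Naming files by digest means a copy can be verified by rehashing it and comparing the result with its own filename.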

Page 5: PetaPG

dispatching

ingestion, rebalancing

encoding, analysis

Page 6: PetaPG

PostgreSQL!

(of course)

Page 7: PetaPG

entities

sha (asset), hdd, chassis

metadata, actions, status

Page 8: PetaPG

15M master objects

25M derivatives

70M copies

Page 9: PetaPG

200GB core

500GB XML processing

2TB+ overall

Page 10: PetaPG

custom types

enum

native/wrappers

Page 11: PetaPG

hashtypes

shatypes

+ crc32, bugfixes

Page 12: PetaPG

actions

fully async, fail-over

dumb polling

Page 13: PetaPG

smart locking

update set t=now()

where t is old

update returning
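
One possible reading of this slide is a lease-style claim: a worker takes an action by bumping its timestamp, and only rows whose timestamp has gone stale can be taken. The sketch below uses an in-memory SQLite database as a stand-in for PostgreSQL so that it runs anywhere. The table `actions`, the column `t`, the 60-second lease, and the function names are all assumptions made for illustration.

```python
import sqlite3

LEASE_SECONDS = 60  # hypothetical: how long a claim stays valid


def make_db():
    """Create an in-memory stand-in for the dispatcher's action table."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE actions (id INTEGER PRIMARY KEY, t REAL NOT NULL)")
    return db


def claim(db, now):
    """Try to claim one stale action; return its id, or None if none is free.

    Assumed PostgreSQL form of the same idea, in a single statement:
      UPDATE actions SET t = now()
       WHERE id = (SELECT id FROM actions
                    WHERE t < now() - interval '60 seconds' LIMIT 1)
         AND t < now() - interval '60 seconds'
      RETURNING id;
    """
    cutoff = now - LEASE_SECONDS
    row = db.execute(
        "SELECT id FROM actions WHERE t < ? ORDER BY id LIMIT 1", (cutoff,)
    ).fetchone()
    if row is None:
        return None
    # The repeated "t < cutoff" condition makes the claim conditional:
    # if another worker refreshed t first, zero rows are updated.
    cur = db.execute(
        "UPDATE actions SET t = ? WHERE id = ? AND t < ?", (now, row[0], cutoff)
    )
    db.commit()
    return row[0] if cur.rowcount == 1 else None
```

A worker that dies simply stops refreshing `t`, so its action becomes claimable again once the lease runs out. That fits the fail-over and polling notes on the previous slide without any separate lock table.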

Page 14: PetaPG

XML

third-party metadata

stored, processed in PG

Page 15: PetaPG

research

large-scale action logging

Page 16: PetaPG

production

aggregated views of dispatcher

Page 17: PetaPG

distributed logic

dispatcher, XML processing, production, research

full-mesh data exchange

Page 18: PetaPG

table data transfer

slow or inflexible

simple custom scripts, diff
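
The slide does not say how the diff scripts worked. One plausible shape is a row-level diff keyed by primary key, so that only changed rows travel between databases. In the sketch below, plain dictionaries stand in for table snapshots, and the names `diff_rows` and `apply_diff` are hypothetical.

```python
def diff_rows(source, target):
    """Compare two snapshots mapping primary key -> row tuple.

    Returns (upserts, deletes): the rows to write to the target and the
    keys to remove from it so that it matches the source.
    """
    upserts = {
        key: row
        for key, row in source.items()
        if key not in target or target[key] != row
    }
    deletes = [key for key in target if key not in source]
    return upserts, deletes


def apply_diff(target, upserts, deletes):
    """Bring the target snapshot in line with the source, in place."""
    for key in deletes:
        del target[key]
    target.update(upserts)
```

Only the rows in `upserts` and the keys in `deletes` need to cross the network, which matters when most of a large table is unchanged between transfers.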

Page 19: PetaPG

dream industries

disruptive innovation lab

funding, collaborating

inviting, hiring