© 2011 by The 451 Group. All rights reserved
Matthew Aslett, Senior analyst, enterprise software
Variety, Velocity and Volume: Meeting the performance challenges of Big Data in the enterprise
Big Data, Total Data
What is it?
Current data management trends
What technologies are involved?
When to use them
The drivers behind emerging technology choices
The 451 Group
451 Research is focused on the business of enterprise IT innovation. The company’s analysts provide critical and timely insight into the competitive dynamics of innovation in emerging technology segments.
Tier1 Research is a single-source research and advisory firm covering the multi-tenant datacenter, hosting, IT and cloud-computing sectors, blending the best of industry and financial research.
The Uptime Institute is ‘The Global Data Center Authority’ and a pioneer in the creation and facilitation of end-user knowledge communities to improve reliability and uninterruptible availability in datacenter facilities.
TheInfoPro is a leading IT advisory and research firm that provides real-world perspectives on the customer and market dynamics of the enterprise information technology landscape, harnessing the collective knowledge and insight of leading IT organizations worldwide.
ChangeWave Research is a research firm that identifies and quantifies ‘change’ in consumer spending behavior, corporate purchasing, and industry, company and technology trends.
Coverage areas
Matthew Aslett Senior analyst, enterprise software
With The 451 Group since 2007
www.twitter.com/maslett
Commercial Adoption of Open Source (CAOS)
• Adoption by enterprises
• Adoption by vendors
Information Management
• Database
• Data warehousing
• Data caching
Data Management Trends
Current data management trends
The volume, variety and velocity of data are growing rapidly
Data processing capabilities have never been better
The value of data has never been better understood
The data deluge problem is also a big data opportunity
RISK OPPORTUNITY
What is Big Data?
More than just rising data volumes
Big Data ≠ Volume
What is Big Data?
Also variety of data types/sources and velocity of data updates
Big Data = Volume ± Variety ± Velocity
Current data management trends
The volume, variety and velocity of data are growing rapidly
Data processing capabilities have never been better
The value of data has never been better understood
‘Big Data’ covers a diverse set of products that can be applied to different problems
‘Big Data’ highlights the problem – volume/variety/velocity,
and promises a solution – value,
but doesn’t provide a path in between
RISK OPPORTUNITY
Total Data
What is Total Data?
Not just another name for Big Data
A concept defined by The 451 Group to describe new approaches to data management – beyond restrictive silos
Reflects the changing data management landscape as pragmatic choices are being made about data storage and analysis techniques
Inspired by ‘Total Football’
What is Total Data?
Also the desire of the user to store and process all their data
Value = (Volume ± Variety ± Velocity) x Totality
What is Total Data?
Within tolerable time frames
Value = ((Volume ± Variety ± Velocity) x Totality) / Time
[Chart labels: Data volume/variety/velocity; Rate of query; Value of data]
What is Total Data?
Within tolerable time frames
Value = ((Volume ± Variety ± Velocity) x Totality) / Time
Total Data is making the most efficient use of existing and new data management resources to deliver value from data
The technologies deployed depend on which factor is most significant to the problem and the nature of the query
Technology choices
Application stack
[Diagram: Users, Application, Database, Hardware]
Traditional scalability
[Diagram: Users ×3, Application ×3, Database ×1, Hardware ×1]
Commodity hardware
[Diagram: Users ×3, Application ×3, Database ×1, Hardware ×8]
User explosion
[Diagram: Users ×8, Application ×3, Database ×1, Hardware ×8]
Application scalability
[Diagram: Users ×8, Application ×6, Database ×1, Hardware ×8]
Data management use cases
Database
Operational
Analytic
Data management use cases
Data management
real-time transaction and data ingestion
large scale data storage and analysis
Data management requirements
real-time transaction and data ingestion
large scale data storage and analysis
Data ingestion/analysis
• random reads and writes
• real-time
• low, predictable latency
• high performance
Data storage/analysis
• lower-cost storage
• large-scale analytics
• read heavy
• batch processing
Data management requirements
real-time transaction and data ingestion
large scale data storage and analysis
Data ingestion/analysis
• MySQL
• Data caching
• NoSQL, NewSQL databases
• Stream processing
Data storage/analysis
• Data warehouse/marts
• In-database analytics
• Online repository
• Hadoop
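As a rough illustration of how the two requirement profiles map to the technology groups listed on these slides, the following sketch scores a workload against each profile by counting shared traits. The `PROFILES` table restates the slide content; the function name `suggest_category` and the simple overlap count are assumptions made for illustration, not a method proposed by the presenters.

```python
# Requirement profiles and technology groups as listed on the slides.
PROFILES = {
    "Data ingestion/analysis": {
        "traits": {"random reads and writes", "real-time",
                   "low, predictable latency", "high performance"},
        "technologies": ["MySQL", "Data caching",
                         "NoSQL, NewSQL databases", "Stream processing"],
    },
    "Data storage/analysis": {
        "traits": {"lower-cost storage", "large-scale analytics",
                   "read heavy", "batch processing"},
        "technologies": ["Data warehouse/marts", "In-database analytics",
                         "Online repository", "Hadoop"],
    },
}


def suggest_category(workload_traits):
    """Return the profile sharing the most traits with the workload.

    Hypothetical heuristic: a plain overlap count; on a tie the first
    profile in PROFILES wins.
    """
    wanted = set(workload_traits)
    return max(PROFILES, key=lambda name: len(PROFILES[name]["traits"] & wanted))


# Example: a mostly batch, read-heavy workload with one real-time trait.
category = suggest_category(["read heavy", "batch processing", "real-time"])
print(category, PROFILES[category]["technologies"])
```

A real selection would weigh the traits rather than count them, since a single hard latency requirement can outweigh several storage preferences.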
Emerging scalability
[Diagram: Users ×8, Application ×6, Hardware ×8, Data ingestion/analysis ×6, Data storage/analysis ×8]
Database SPRAIN
• Scalability: Hardware economics
─ NoSQL, NewSQL databases, Hadoop, cloud computing
• Performance: Database limitations
─ Data caching, in-memory, virtual machine, stream/event processing
• Relaxed consistency: CAP Theorem
─ NoSQL databases, cloud computing
• Agility: Polyglot persistence
─ Agile development, schema-free, non-relational, in-memory
• Intricacy: Big data, total data
─ Non-relational database, NoSQL, Hadoop, in-database analytics, memory storage
• Necessity: Open source
─ The failure of incumbent vendors to address emerging requirements
Emerging scalability
[Diagram: Users ×8, Application ×6, Hardware ×8, Data ingestion/analysis ×6, Data storage/analysis ×8, Virtual machine ×2]
Relevant reports
Total Data
Explaining the total data management approach to dealing with the impact of big data on the data management landscape
Coming late 2011
Including the growing Hadoop ecosystem and real-time
COMING LATE 2011
Gil Tene
CTO, Azul Systems
Enterprise Big Data
Java, Building Blocks and Performance Impacts
©2011 Azul Systems, Inc.
What Azul does for the Big Data space
• We provide Java runtimes that scale consistently
• We eliminate the Garbage Collection problem
• Our runtimes are elastic in nature
• Our runtimes are an ideal building block for Big Data
Many technologies support Big Data
• Operational databases / data warehouses/marts
• Data integration / data virtualization
• Business intelligence tools
• Hadoop / equivalent / alternative technology
• Relational / non-relational databases
• In-memory databases
• Data caching
• Stream / event processing
• In-database analytics
• Disk / memory storage
©2011 Azul Systems, Inc. | Azul Company Confidential
Value and infrastructure
Value inflection points, Data
[Chart labels: Data Value; Volume, Velocity, Variety; Value of data]
Value = ((Volume ± Variety ± Velocity) x Totality) / Time
Volume / Velocity / Variety: translation to underlying capacity metrics
• Higher Volume = larger data set sizes, amounts of state
• Higher Velocity = higher processing rate, throughput
• Higher Variety = more metadata, indexes, cross-linking
Timeliness
Inflection points – Timeliness
[Chart labels: Data Value; 1/t; Value of data]
Value = ((Volume ± Variety ± Velocity) x Totality) / Time
Real world example of timeliness:
Affecting shopping decisions
I went [online] shopping for a camping trip last week
• Spent multi-$100: tent, sleeping bags, other items.
• Researched and shopped around for hours, across multiple sites
• Relevant ads that showed up within my decision window affected my shopping
• Ads that showed up a day later did not…
Big Data’s value
real-time transaction and data ingestion
large scale data storage and analysis
Interactive application
• random reads and writes
• real-time
• low, predictable latency
• high performance
Data storage/analysis
• lower-cost storage
• large-scale analytics
• read heavy
• batch processing
Big Data’s value: enhanced by timeliness
real-time transaction and data ingestion
large scale data storage and analysis
Interactive application
• random reads and writes
• real-time
• low, predictable latency
• high performance
Data storage/analysis
• lower-cost storage
• large-scale analytics
• read heavy
• batch processing?
[Scale labels: Increased Value; 5 seconds, 5 minutes, 5 hours, 1 day, 1 week]
Building blocks
Choice of building block size
Building block granularity
Inflection points – building block size
[Chart labels: Data Value; Building Block Size; Value of data]
Value = ((Volume ± Variety ± Velocity) x Totality) / Time
Building blocks
• There are value inflection points to building block size
─ At some size levels, orders-of-magnitude improvements occur
─ E.g. when entire index fits in each replica/memory space
─ E.g. when partition sizes are big enough
• Typical physical building block
─ 12-24 core, dual-socket servers
─ 24-96 GB DRAM
• Typical process level building block
─ A Java Virtual Machine (JVM)
─ 1-4 GB of memory, 1-2 cores
• Why the mismatch?
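To put numbers on the mismatch, the sketch below counts how many JVM-sized processes it takes to cover one physical server, using only the ranges quoted on this slide. The function name `instances_to_fill` is hypothetical, and pairing the extremes of the two ranges is an assumption made purely to bracket the result.

```python
import math


def instances_to_fill(server_gb, server_cores, jvm_gb, jvm_cores):
    """Return how many JVM-sized processes are needed to cover a server.

    Hypothetical helper: takes the larger of the counts implied by
    memory and by cores, rounding up in both cases.
    """
    by_memory = math.ceil(server_gb / jvm_gb)
    by_cores = math.ceil(server_cores / jvm_cores)
    return max(by_memory, by_cores)


# Smallest quoted server (24 GB, 12 cores) with the largest quoted JVM (4 GB, 2 cores).
fewest = instances_to_fill(server_gb=24, server_cores=12, jvm_gb=4, jvm_cores=2)

# Largest quoted server (96 GB, 24 cores) with the smallest quoted JVM (1 GB, 1 core).
most = instances_to_fill(server_gb=96, server_cores=24, jvm_gb=1, jvm_cores=1)

print(fewest, most)  # prints "6 96"
```

Under these assumptions a single server hosts somewhere between 6 and 96 separate JVM processes.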
Productivity with a scale challenge
• Variety of languages and technologies in Big Data
• Java/JVM is the enterprise default, highly productive
• But…
• Building blocks are practically limited in size
• Main reason for the limit: Garbage Collection
• Main Garbage Collection issue: GC pause times
• Pauses impose practical size limits, for stability/responsiveness reasons
• Physical building blocks are broken into tens of tiny pieces
Critical infrastructure units
Critical unit example - HDFS
Source: Apache Hadoop documentation
Critical Infrastructure units
Size of critical infrastructure units drives cluster scale
• Central metadata nodes
• Central in-memory index nodes
• Central authoritative data nodes
• Graph-DB and dense relationship nodes
Summary: Azul’s Impact on Big Data
Break the building block size barrier
• Each instance can elastically fill an entire physical unit
• Expose/leverage value inflection points
• Remove/Expand cluster scale limitations
• Reduce/Consolidate instance counts
• Address tuning challenges
• Ensure predictable performance
For More Information:
Azul Systems Web site: www.azulsystems.com
Technical resources: ./resources
Zing trial: ./trial
451 Group: www.the451group.com
blogs.the451group.com/information_management/
Webinar replay:
www.azulsystems.com/resources/webinars