Big Data Analytics on Hadoop: RainStor Infographic


Uploaded by: rainstor

Posted on: 12-Jun-2015

Category: Software



DESCRIPTION

A look at how RainStor's compression helps solve the cost, complexity, and compliance-risk challenges of managing big data on Hadoop. RainStor runs natively on Hadoop, integrates with YARN and Hue, and can be accessed through Hive, Pig, or MapReduce.

TRANSCRIPT

Page 1: Big Data Analytics on Hadoop RainStor Infographic

Compression Tames the True Cost of Running Big Data Analytics on Hadoop

Enterprise Multi-structured Data Growth Sky-Rockets

In 2011, 1.4m Zettabytes of Data Generated by The Enterprise (1)

Financial Services: Analytics will drive 70% of investments in expansion & modernization of infrastructure to 2015. (2)

Communications: Mobile data growing at 92%, reaching 6.3 exabytes/month by 2015. (3)

Utilities: Technologies enabling SmartGrid will grow to $34b by 2020. (4)

Retail: 60% increase in retailers’ operating margins possible with big data. (5)

40x Compression = 97.5% Node Reduction

Savings of over $1m (node purchase & operating cost for 3 years)

Hadoop nodes reduction at 40x compression: 97.5%

75 Nodes: 3yr cost $1.05m

2 Nodes: 3yr cost $28k
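The arithmetic behind these figures can be checked with a short sketch. It assumes that cost scales linearly with node count (an assumption, although it is consistent with the $1.05m and $28k figures quoted); the variable names are illustrative only.

```python
# Figures quoted in the infographic.
BASELINE_NODES = 75              # nodes needed without compression
BASELINE_COST_USD = 1_050_000    # 3-year purchase + operating cost
COMPRESSED_NODES = 2             # nodes needed at 40x compression
COMPRESSION_RATIO = 40

# Assumption: every node costs the same to buy and operate.
cost_per_node = BASELINE_COST_USD / BASELINE_NODES
compressed_cost = cost_per_node * COMPRESSED_NODES
savings = BASELINE_COST_USD - compressed_cost

# Two ways to read "reduction": by stored volume, or by whole nodes.
storage_reduction = 1 - 1 / COMPRESSION_RATIO
node_reduction = 1 - COMPRESSED_NODES / BASELINE_NODES

print(f"cost per node (3 yr): ${cost_per_node:,.0f}")
print(f"cost at 40x:          ${compressed_cost:,.0f}")
print(f"savings:              ${savings:,.0f}")
print(f"storage reduction:    {storage_reduction:.1%}")
print(f"whole-node reduction: {node_reduction:.1%}")
```

Under this reading, the 97.5% headline matches the storage reduction of 1 - 1/40, while going from 75 whole nodes to 2 works out to roughly 97.3%.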

Big Data on Hadoop Requires Lots of Storage and Nodes

In Next Decade, Data Center Information Growth Is 50x (6)

50% Running Hadoop in next 5 years (7)

1 Zettabyte = 268 million nodes! @12TB/Node, 3x Replication = 4TB user data/node
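The 268 million figure can be reproduced with a few lines of arithmetic. This sketch assumes binary units (1 TB = 2^40 bytes, 1 ZB = 2^70 bytes), which is the interpretation that yields the quoted number; with decimal units the result would be 250 million. The names are illustrative.

```python
TB = 2**40   # bytes per terabyte (binary units, an assumption)
ZB = 2**70   # bytes per zettabyte (binary units, an assumption)

RAW_TB_PER_NODE = 12   # raw disk per node, from the infographic
REPLICATION = 3        # replication factor, from the infographic

# With 3x replication, only a third of the raw capacity holds user data.
usable_tb_per_node = RAW_TB_PER_NODE // REPLICATION

# Number of nodes needed to hold one zettabyte of user data.
nodes_per_zettabyte = ZB // (usable_tb_per_node * TB)

print(f"usable capacity per node: {usable_tb_per_node} TB")
print(f"nodes for 1 ZB: {nodes_per_zettabyte:,}")
```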

Extreme Data Compression Drives Down Nodes

Value and Pattern De-duplication Gives Optimal Compression

Unique data format stored on Hadoop, which eliminates duplicates while retaining full original values and structure upon access - no re-inflation required. (9)

[Bar chart: compression ratio by method, axis 0 to 50]

Hadoop LZO: 3X

Compressed relational: 6X

Flat file GZIP: 7X

Columnar: 8X

Value & pattern de-duplication: 40X
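To see how the charted compression ratios translate into node counts, here is a rough sketch applied to the infographic's 300TB example at 4TB of user data per node. It assumes node count is driven purely by storage capacity and that compression shrinks stored volume proportionally; `nodes_needed` is a hypothetical helper, not part of any product.

```python
import math

USER_DATA_TB = 300        # example data set from the infographic
USABLE_TB_PER_NODE = 4    # 12TB raw per node with 3x replication


def nodes_needed(user_tb, ratio, usable_tb_per_node=USABLE_TB_PER_NODE):
    """Whole nodes required to store user_tb of data compressed by ratio."""
    return math.ceil(user_tb / (ratio * usable_tb_per_node))


# Ratios shown in the chart, plus 1x as the uncompressed baseline.
for ratio in (1, 3, 6, 7, 8, 40):
    print(f"{ratio:>2}x compression -> {nodes_needed(USER_DATA_TB, ratio)} nodes")
```

Because nodes come in whole units, the count is rounded up at each ratio.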

Total Cost of Hadoop Includes Buying & Operating Nodes

Real-World: 300TB user data = 75 Nodes

Total Cost = $1.05m to buy and operate the nodes for 300TB over 3 years (8)

FOOTNOTES

1) IDC & CosmoBC.com: http://techblog.cosmobc.com/2011/08/26/data-storage-infographic/

2) Gartner Predicts 2012: Information Infrastructure and Big Data (Nov 2011)

3) Cisco Visual Networking Index: Forecast and Methodology, 2010-2015

4) Lux Research, Jan 2011: The Surprise Winners in the $34 Billion Smart Grid Market

5) McKinsey Global Institute May 2011: Big data: The next frontier for innovation, competition, and productivity

6) IDC: Extracting Value from Chaos, June 2011

7) Research by Internet Research Group & Infineta (Dec 2011) - http://www.datacenterknowledge.com/archives/2011/12/13/will-big-data-clog-networks-with-big-traffic/

8) Based on industry pricing and market feedback.

9) Internal benchmarks conducted by RainStor using customer & partner data (2011)

OTHER FACTOIDS

1) 37% of those surveyed named system performance and scalability as the second biggest challenge for them in the coming year (source: http://www.computerworld.com/s/article/9194283/Data_growth_remains_IT_s_biggest_challenge_Gartner_says)

Find Out How Your Data Compresses: [email protected]

Hadoop Big Data Compressed (and Sitting Pretty)