Hadoop as a service presented by Ajay Jha at Houston Hadoop Meetup

Altiscale Big Data-as-a-Service, Paul Tibaldi RSD & Ajay Jha SA

Upload: mark-kerzner

Post on 06-Jan-2017


TRANSCRIPT

Page 1: Hadoop as a service presented by Ajay Jha at Houston Hadoop Meetup

Altiscale

Big Data-as-a-Service

Paul Tibaldi RSD & Ajay Jha SA

Page 2

Big Data As A Service

• Market Background
• Who is Altiscale?
• Why are we different/better?
• Hadoop Admin
• Apache Hadoop Stack
• Platform/Access/Demo
• Q/A

Page 3

Market Background

Page 4

Interest in Big Data is growing fast

Page 5

Big Data in The Cloud is Accelerating

• On-Premises: 32%
• Cloud Only: 23%
• Cloud Plus On-Premises: 29%

Source: “Hadoop Expansion Boosts Cloud and Unsupported On-Premises Deployments,” Merv Adrian, Nick Heudecker, 3 September 2015

Page 6

But the journey has dangers

Gartner: Through 2018, 70% of independent Big Data implementations will fail to meet revenue and cost objectives.

Page 7

Who is Altiscale?

Page 8

About Altiscale

• Altiscale Data Cloud GA in 2014
• Financed by top-tier technology investors
• Recognized innovator in Hadoop-as-a-Service

Page 9

About Altiscale

Led by an experienced, renowned Hadoop team from Yahoo!
• Raymie Stata, CEO. Former Yahoo! CTO, well-known advocate of the Apache Software Foundation
• David Chaiken, CTO. Former Yahoo! Chief Architect

Built and managed by veterans of Big Data, SaaS, and enterprise software
• From Google, Netflix, LinkedIn, VMware, Oracle, and Yahoo!

40,000 nodes
500 PB
1,000 users
$ billions at stake

Raymie Stata, CEO
David Chaiken, CTO
Ricardo Jenez, VP of Engineering
Charles Wimmer, Head of Operations

Page 10

Big data built for speed

Fast time to value—days not months

Easier, faster scalability—with elastic scaling

Operations support—so your jobs get done

Lower TCO—for fast investment payback

Page 11

Unmatched Security

Altiscale is the only provider that delivers integrated security encompassing its Big Data platform offering

Page 12

Complete best of breed

Page 13

Big Data is complex. It gets more complicated as you scale.

Page 14

Big Data-as-a-Service

Page 15

The Altiscale Data Cloud Core

Page 16

Concurrency with Apache Versioning

Altiscale Data Cloud is 100% based on Apache open source.

Our current Altiscale Data Cloud 4.0 release is composed of the following Apache components and versions:

• Apache Hadoop 2.7.1
• Apache Spark 1.5*
• Apache Hive (& HCatalog) 1.2
• Apache Tez 0.7.0
• Apache Pig 0.15.1
• Apache Oozie 4.2.0
• Apache Flume 1.5.2
• Avro 1.7.4
• JDK/JRE 7 (Sun/Oracle version)
• HttpFS

In addition to the above, we support the three latest versions of Spark for our customers. This gives customers the option of a conservative approach as well as the option of working with the “bleeding edge” of the fast-moving Spark community.

Page 17

Hadoop Administration

Hire an expert to take care of the cluster

• Hardware setup and cluster installation
• Address hardware failures
• Upgrade the Hadoop stack
• Tune config parameters, for example:
  • yarn-site.xml, e.g. yarn.nodemanager.resource.memory-mb
  • mapred-site.xml, e.g. mapreduce.task.io.sort.mb
  • hdfs-site.xml, e.g. dfs.blocksize
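
To make the tuning item above concrete, here is a minimal sketch of how the three named properties could be written out in Hadoop's `<configuration>/<property>` file layout. The property names come from the slide; the values are illustrative placeholders chosen for this sketch, not recommendations from the talk, and the helper `render_site_xml` is a hypothetical name.

```python
import xml.etree.ElementTree as ET

# Property names are from the slide. The values are illustrative
# assumptions only and should be sized for the actual cluster.
EXAMPLE_SETTINGS = {
    "yarn-site.xml": {"yarn.nodemanager.resource.memory-mb": "8192"},
    "mapred-site.xml": {"mapreduce.task.io.sort.mb": "100"},
    "hdfs-site.xml": {"dfs.blocksize": "134217728"},  # 128 MiB in bytes
}


def render_site_xml(properties):
    """Render a dict of property names/values as a Hadoop *-site.xml string."""
    root = ET.Element("configuration")
    for name, value in properties.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = str(value)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Print one fragment per config file named on the slide.
    for filename, properties in EXAMPLE_SETTINGS.items():
        print(f"<!-- {filename} -->")
        print(render_site_xml(properties))
```

On a self-managed cluster an administrator would edit these files on each node and restart the affected daemons; the point of the slide is that a managed service takes that work off the user.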

Page 18

Accessing the cloud

Page 19

Quick Spark Demo

Spark example

• Build the Spark code on a laptop using Maven.
• Build the jar and copy it over to Altiscale’s workbench (gateway) node.
• Launch the Spark job on YARN.
• Monitor using the Resource Manager.
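
The demo steps above can be sketched as a sequence of commands. The snippet below only assembles and prints them so they can be reviewed before anything is run. The jar path, main class, workbench host, and Resource Manager address are hypothetical placeholders, not values from the talk. The `spark-submit` flags follow the current Spark documentation; the Spark 1.x documentation wrote the same thing as `--master yarn-cluster`. The monitoring URL is the standard YARN ResourceManager REST endpoint for listing applications, and whether it is directly reachable on Altiscale is an assumption.

```python
import shlex

# All four values are hypothetical placeholders for illustration.
JAR = "target/spark-demo-1.0.jar"
MAIN_CLASS = "com.example.SparkDemo"
WORKBENCH = "user@workbench.example.com"
RESOURCE_MANAGER = "http://resourcemanager.example.com:8088"


def demo_commands(jar=JAR, main_class=MAIN_CLASS, workbench=WORKBENCH):
    """Return (where_to_run, argv) pairs for the build, copy, and submit steps."""
    jar_name = jar.rsplit("/", 1)[-1]
    return [
        # Build the jar on the laptop with Maven.
        ("laptop", ["mvn", "clean", "package"]),
        # Copy the jar to the home directory on the workbench (gateway) node.
        ("laptop", ["scp", jar, f"{workbench}:{jar_name}"]),
        # Launch the job on YARN from the workbench node.
        ("workbench", ["spark-submit", "--master", "yarn",
                       "--deploy-mode", "cluster",
                       "--class", main_class, jar_name]),
    ]


def running_apps_url(resource_manager=RESOURCE_MANAGER):
    """URL of the ResourceManager REST endpoint that lists running applications."""
    return f"{resource_manager}/ws/v1/cluster/apps?states=RUNNING"


if __name__ == "__main__":
    for where, argv in demo_commands():
        print(f"[{where}] {shlex.join(argv)}")
    print(f"[monitor] GET {running_apps_url()}")
```

The same information is available interactively in the Resource Manager web UI, which is what the demo step refers to.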

Page 20

Thank You!