Whither the Hadoop Developer Experience, June Hadoop Meetup, Nitin Motgi


Post on 02-Aug-2015


Page 1: Whither the Hadoop Developer Experience, June Hadoop Meetup, Nitin Motgi

@nmotgi

Nitin Motgi

Whither the Hadoop Developer Experience?

Page 2

PROPRIETARY & CONFIDENTIAL

Agenda

• Introduction to data applications
• Challenges with building operational data applications on Hadoop
• Motivation and goals for CDAP
• Use-cases
• Introduction to CDAP and architecture overview
• Demo

Page 3

What are Data Applications?

Applications that use data insights to enhance the customer/user experience, achieve a business objective, or improve a business process.

Page 4

Examples

• 360-Degree Customer View
• Recommendation Engine
• Predictive Modeling
• Fraud Analysis
• Network Threat Detection
• Telemetry Analysis
• Time Series Analysis
• Data Processing - ETL
• And many more

Page 5

Challenges

Page 6

Technology Explosion

2006: Core Hadoop (HDFS, MR)
2008: HBase, ZooKeeper, Core Hadoop
2009: Hive, Pig, Mahout, HBase, ZooKeeper, Core Hadoop
2010: Sqoop, Whirr, Avro, Hive, Pig, Mahout, HBase, ZooKeeper, Core Hadoop
2011: Flume, Bigtop, Oozie, MRUnit, HCatalog, Sqoop, Whirr, Avro, Hive, Pig, Mahout, HBase, ZooKeeper, Core Hadoop
2012: Spark, Impala, Solr, Kafka, Flume, Bigtop, Oozie, MRUnit, HCatalog, Sqoop, Whirr, Avro, Hive, Pig, Mahout, HBase, ZooKeeper, Core Hadoop
Present: Sentry, Tez, Parquet, YARN, Spark, Impala, Solr, Kafka, Flume, Bigtop, Oozie, MRUnit, HCatalog, Sqoop, Whirr, Avro, Hive, Pig, Mahout, HBase, ZooKeeper, Core Hadoop, Knox

Page 7

Challenges

• Application complexity
• Many domains to bridge
• Lots of boilerplate
• Inconsistent APIs
• No reusability
• Lack of developer productivity

Page 8

Application Complexity

Page 9

Motivation

Page 10

Motivation

• Simple yet powerful platform for developers to build applications on Hadoop
• Expose capabilities rather than features
• Make Hadoop accessible to developers with no Hadoop knowledge

Page 11

Goals

• Unified platform for building solutions on Hadoop
• Simpler application development lifecycle
• Reusable data and processing patterns with abstractions
• Framework-level correctness and consistency
• Easy-to-use developer APIs

Page 12

Typical Customer Use-cases

• Reliable and scalable real-time business-critical analytics
• Closed-loop recommendation and analytics
• Data Ingestion as a Service
• Extendable and reusable use-case blueprints
• ETL automation - real-time and batch
• Data as a Service
• Reduce development and operational complexity of Hadoop

Which of these apply to you?

Page 13

Introduction to Cask Data Application Platform

Page 14

Cask Data Application Platform

An open source, integrated, distributed and extensible platform for building data applications on Hadoop.

Page 15

Provides

Page 16

Supports

Supports developers, operations, and organizations through the entire enterprise data application lifecycle.

CASK DATA APP PLATFORM

• Data Lifecycle: Ingest, Explore, Transform, Serve
• Application Lifecycle: Develop, Test, Deploy, Scale
• Enterprise Lifecycle: Secure, Manage, Monitor, Operate

Page 17

Application Structure

[Diagram: Unification across the data lifecycle stages Ingest, Explore, Transform and Serve. Labels in the diagram: Streams; Realtime - Tigon; MR; Spark; Dataset; ACID; Ad-hoc query; Query; JDBC; RPC; Dataset API, SPI & Management Services.]

Page 18

Building Blocks

• Dataset: Encapsulated data access patterns and data model in a reusable, domain-specific API
• Program: Standardized containers for processing paradigms
• Application: Programmatic abstraction for composing multiple Datasets and Programs that integrates ingestion, exploration, transformation and serving

[Diagram: an Application containing Datasets and Programs.]
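
The relationship between the three building blocks can be sketched in a few lines of Python. This is only an illustration of the composition idea, not the CDAP API: the class names `CounterDataset`, `Program` and `Application`, their methods, and the word-count example are all hypothetical and were chosen for this sketch.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, List


class CounterDataset:
    """Hypothetical Dataset: hides storage behind a small domain-specific API."""

    def __init__(self) -> None:
        # An in-memory dict stands in for whatever storage a real platform uses.
        self._counts: Dict[str, int] = {}

    def increment(self, key: str, amount: int = 1) -> None:
        self._counts[key] = self._counts.get(key, 0) + amount

    def read(self, key: str) -> int:
        return self._counts.get(key, 0)


@dataclass
class Program:
    """Hypothetical Program: a named container for one processing step."""

    name: str
    run: Callable[[Dict[str, CounterDataset], List[str]], None]


@dataclass
class Application:
    """Hypothetical Application: composes Datasets and Programs."""

    name: str
    datasets: Dict[str, CounterDataset] = field(default_factory=dict)
    programs: List[Program] = field(default_factory=list)

    def ingest(self, events: Iterable[str]) -> None:
        # Materialize the events once so every program sees the same batch.
        batch = list(events)
        for program in self.programs:
            program.run(self.datasets, batch)

    def serve(self, dataset_name: str, key: str) -> int:
        # Serving reads go through the Dataset API, never the raw storage.
        return self.datasets[dataset_name].read(key)


def count_words(datasets: Dict[str, CounterDataset], events: List[str]) -> None:
    """Transformation step: count lower-cased words into the shared Dataset."""
    counts = datasets["word_counts"]
    for line in events:
        for word in line.split():
            counts.increment(word.lower())


def build_word_count_app() -> Application:
    """Wire one Dataset and one Program into an Application."""
    app = Application(name="WordCount")
    app.datasets["word_counts"] = CounterDataset()
    app.programs.append(Program(name="WordCounter", run=count_words))
    return app
```

A caller would build the application with `build_word_count_app()`, feed it lines of text through `ingest`, and read results back through `serve("word_counts", word)`. The point of the sketch is that the program only ever touches the Dataset's API, so the same Dataset could be reused by other programs in the same application.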

Page 19

Deployment Architecture

CDAP Server
• Services: Master, Router, Auth Server
• Highly Available (HA)
• Installed on edge node(s)
• Supports Kerberos - Impersonation & Perimeter Security
• Manages system services in YARN

System Services (Twill Containers)
• Transactions (Tephra)
• Metrics Aggregation
• Log Aggregation
• Dataset Services
• Metadata Management Service
• Explore Service
• Stream Management Service & more

Page 20

Want to Learn More?

Open-source (Apache License v2)

Website: http://cdap.io

Mailing List: [email protected] [email protected]

IRC: #cdap on freenode.net

Page 21

QUESTIONS?