Go Beyond Debug: Wire Tap Your App for Knowledge with Hadoop


Post on 17-Mar-2016


Page 1: Go beyond debug: Wire Tap your App for knowledge

© Hortonworks Inc. 2012

Go beyond debug: Wire Tap your App for knowledge with Hadoop

Tom McCuch, Solution Engineering @ Hortonworks, Twitter: tmccuch

Oleg Zhurakousky, Principal Architect @ Hortonworks, Twitter: z_oleg

Page 2

The Application Development Dilemma

• Today, application developers devote roughly 80% of their code to persisting roughly 20% of the total data flowing through their applications

– 80% of the data flowing through our applications is at best lost in rolling log files and at worst never collected, without ever being analyzed or accounted for

– For the remaining 20% that we do collect, application-level database programming, licensing, storage, administration, and ETL processing have maxed out IT operations budgets and kept app development teams from keeping pace with the rate of change in the business


Page 3

Example: Data Available During Ingest

• Record count
• Highest/Lowest record length
• Average record length
• Compression ratio

But with a little more work...

• Field parsing
– Unique values
– Unique values per field
– Access to values of each field independently from the record
– Relatively fast field-based searches, without indexing
– Value encoding
– Etc.

These are cross-cutting concerns!
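The statistics listed above can be gathered in one pass over the records as they arrive. The following is a minimal sketch only, not code from the talk: the function name `ingest_stats`, the comma delimiter, the use of `zlib` for the compression ratio, and the sample records are all assumptions made for illustration.

```python
import zlib
from collections import defaultdict


def ingest_stats(records, delimiter=","):
    """Collect basic ingest statistics for a non-empty list of text records.

    Hypothetical helper: the name, delimiter, and choice of zlib are
    illustrative assumptions, not part of the original presentation.
    """
    lengths = [len(record) for record in records]
    raw = "\n".join(records).encode("utf-8")
    compressed = zlib.compress(raw)

    # Field parsing: remember the distinct values seen at each field position.
    unique_per_field = defaultdict(set)
    for record in records:
        for index, value in enumerate(record.split(delimiter)):
            unique_per_field[index].add(value)

    return {
        "record_count": len(records),
        "min_length": min(lengths),
        "max_length": max(lengths),
        "avg_length": sum(lengths) / len(lengths),
        # Raw size divided by compressed size; tiny inputs can come out below 1.
        "compression_ratio": len(raw) / len(compressed),
        "unique_values_per_field": {i: len(v) for i, v in unique_per_field.items()},
    }


# Made-up sample records, used only to exercise the function.
sample = ["alice,NY,approved", "bob,NY,denied", "carol,CA,approved"]
stats = ingest_stats(sample)
print(stats)
```

None of this logic belongs to the business process itself, which is why it is a candidate for being collected on the side rather than inside the application code.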

Page 4

How do we address cross-cutting concerns without disturbing the existing process flow?


Page 5

Wire Tap Defined


Page 6

Wire Tap is an Enterprise Integration Pattern
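In the Enterprise Integration Patterns catalog, a Wire Tap publishes a copy of each message to a secondary destination while the original continues along the main channel unchanged. The sketch below illustrates that idea in plain Python. It is an assumption-laden illustration, not the framework used in the talk: `TappedChannel`, `add_wire_tap`, and the message shape are hypothetical names.

```python
import copy


class TappedChannel:
    """A point-to-point channel with optional wire taps (hypothetical class)."""

    def __init__(self, handler):
        self.handler = handler  # the existing consumer on the main flow
        self.taps = []          # secondary consumers that only observe

    def add_wire_tap(self, tap):
        self.taps.append(tap)

    def send(self, message):
        for tap in self.taps:
            try:
                # Taps receive a copy, so they cannot alter the main message.
                tap(copy.deepcopy(message))
            except Exception:
                # A failing tap must never disturb the main process flow.
                pass
        return self.handler(message)


observed = []
channel = TappedChannel(handler=lambda message: message["payload"] * 2)
channel.add_wire_tap(observed.append)
result = channel.send({"payload": 21})
print(result, observed)
```

The design choice that matters is that the handler is registered and invoked exactly as before; the tap is added to the channel, not to the application logic.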


Page 7

Other Enterprise Integration Patterns

• Transformer: Convert payload or modify headers
• Filter: Discard messages based on boolean evaluation
• Router: Determine next channel based on content
• Splitter: Generate multiple messages from one
• Aggregator: Assemble a single message from multiple
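Each of these patterns can be pictured as a small function over messages. The sketch below is one possible reading, assuming a message is a dict with `headers` and `payload` keys; the function names, the `priority` header, and the comma-separated payload are hypothetical choices for illustration.

```python
def transformer(message):
    """Convert the payload (here: upper-case it) and keep the headers."""
    return {"headers": dict(message["headers"]), "payload": message["payload"].upper()}


def message_filter(message):
    """Return True to keep the message, False to discard it."""
    return bool(message["payload"])


def router(message):
    """Pick the name of the next channel based on message content."""
    return "priority" if message["headers"].get("priority") else "standard"


def splitter(message):
    """Generate one message per comma-separated item in the payload."""
    return [
        {"headers": dict(message["headers"]), "payload": part}
        for part in message["payload"].split(",")
    ]


def aggregator(messages):
    """Assemble a single message from several (assumes a non-empty list)."""
    return {
        "headers": dict(messages[0]["headers"]),
        "payload": ",".join(m["payload"] for m in messages),
    }


message = {"headers": {"priority": True}, "payload": "a,b,c"}
parts = splitter(message)
rebuilt = aggregator(parts)
print(router(message), [p["payload"] for p in parts], rebuilt["payload"])
```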


Page 8

The Business Case

Page 9

© Hortonworks Inc. 2013

6 Key Hadoop DATA TYPES

1. Sentiment: Understand how your customers feel about your brand and products, right now

2. Clickstream: Capture and analyze website visitors’ data trails and optimize your website

3. Sensor/Machine: Discover patterns in data streaming automatically from remote sensors and machines

4. Geographic: Analyze location-based data to manage operations where they occur

5. Server Logs: Research logs to diagnose process failures and prevent security breaches

6. Text: Understand patterns in text across millions of web pages, emails, and documents


Page 10

20 Apache Hadoop Enterprise Use Cases


| Vertical | Use Case | Data Type |
|---|---|---|
| Financial Services | New Account Risk Screens | Text, Server Logs |
| Financial Services | Fraud Prevention | Server Logs |
| Financial Services | Trading Risk | Server Logs |
| Financial Services | Maximize Deposit Spread | Text, Server Logs |
| Financial Services | Insurance Underwriting | Geographic, Sensor, Text |
| Financial Services | Accelerate Loan Processing | Text |
| Telecom | Call Detail Records (CDRs) | Machine, Geographic |
| Telecom | Infrastructure Investment | Machine, Server Logs |
| Telecom | Next Product to Buy (NPTB) | Clickstream |
| Telecom | Real-time Bandwidth Allocation | Server Logs, Text, Sentiment |
| Telecom | New Product Development | Machine, Geographic |
| Retail | 360° View of the Customer | Clickstream, Text |
| Retail | Analyze Brand Sentiment | Sentiment |
| Retail | Localized, Personalized Promotions | Geographic |
| Retail | Website Optimization | Clickstream |
| Retail | Optimal Store Layout | Sensor |
| Manufacturing | Supply Chain and Logistics | Sensor |
| Manufacturing | Assembly Line Quality Assurance | Sensor |
| Manufacturing | Proactive Maintenance | Machine |
| Manufacturing | Crowdsourced Quality Assurance | Sentiment |

Page 11

Fraud Prevention

Business Problem

• Financial institutions are always at risk of fraud
• Fraudsters test bank systems for vulnerabilities
• This testing leaves subtle patterns that often go undetected by bank employees or law enforcement
• Fraud losses cost banks millions

Solution

• HDP reduces the cost to detect fraudulent activity
• HDP stores more types of data for longer
• Analysis of data in the “data lake” exposes fraudulent patterns that would have gone undetected

Financial Services Data: Server Logs

Page 12

Credit Request Process Flow - Before

Credit Request Processing

• Credit Request arrives on a Gateway
• Credit Request is sent over a Channel
• Credit Request Processor
– Receives Request
– Processes the Request
– Issues a Response
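The flow above, gateway to channel to processor, can be sketched as three small classes. This is only an illustration under stated assumptions: the class names, the request fields, and the approval rule are hypothetical and do not come from the demo.

```python
class DirectChannel:
    """Passes each message straight to its single subscriber."""

    def __init__(self):
        self.subscriber = None

    def subscribe(self, handler):
        self.subscriber = handler

    def send(self, message):
        return self.subscriber(message)


class CreditRequestProcessor:
    """Receives a request, processes it, and issues a response."""

    def handle(self, request):
        # Hypothetical rule: approve when the amount is at most half the income.
        approved = request["amount"] <= request["income"] * 0.5
        return {"request_id": request["request_id"], "approved": approved}


class CreditRequestGateway:
    """Entry point that hides the channel from the caller."""

    def __init__(self, channel):
        self.channel = channel

    def submit(self, request):
        return self.channel.send(request)


channel = DirectChannel()
channel.subscribe(CreditRequestProcessor().handle)
gateway = CreditRequestGateway(channel)
response = gateway.submit({"request_id": "r-1", "amount": 20000, "income": 60000})
print(response)
```

Note that nothing in this version records the request itself; once the response is issued, the data that flowed through the channel is gone.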

Page 13

Cross-Cutting Concerns

• Credit Scoring
• Fraud Detection
• Gathering Data Available during Credit Request Process Flow

Page 14

Demo

Page 15

Credit Request Processing Flow - After

HDP
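One way to picture the "after" flow is the same processor with a tap that writes every request to a landing area before the request is handled. The sketch below is an assumption, not the demo code: an in-memory `io.StringIO` stands in for a file on HDP, and `with_wire_tap`, `process_credit_request`, and the approval threshold are hypothetical.

```python
import io
import json


def process_credit_request(request):
    """The existing processor, left untouched (hypothetical approval rule)."""
    return {"request_id": request["request_id"], "approved": request["amount"] <= 10_000}


def with_wire_tap(handler, sink):
    """Wrap a handler so every request is also written to the sink as a JSON line."""

    def tapped(request):
        sink.write(json.dumps(request, sort_keys=True) + "\n")
        return handler(request)

    return tapped


# Stand-in for a landing file on the cluster; a real flow would write elsewhere.
landing_file = io.StringIO()
handle = with_wire_tap(process_credit_request, landing_file)
response = handle({"request_id": "r-2", "amount": 2500})
print(response)
print(landing_file.getvalue(), end="")
```

The processor function is identical before and after; only the wiring around it changes, which is the point of addressing the concern at the channel level.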

Page 16

Example: HTTP Header Collection
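The slide does not show how the headers are collected. As one hedged possibility, a WSGI middleware can act as the tap: in WSGI, request headers appear in `environ` under keys prefixed with `HTTP_`. The names `header_tap`, `app`, and `collected` below are hypothetical, and a list stands in for the real destination.

```python
def header_tap(app, sink):
    """Wrap a WSGI app so each request's HTTP headers are copied to the sink."""

    def wrapped(environ, start_response):
        # Turn keys such as HTTP_USER_AGENT back into names such as User-Agent.
        headers = {
            key[5:].replace("_", "-").title(): value
            for key, value in environ.items()
            if key.startswith("HTTP_")
        }
        sink.append(headers)
        # The wrapped application runs exactly as it did before.
        return app(environ, start_response)

    return wrapped


def app(environ, start_response):
    """A trivial WSGI application used only to exercise the tap."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"ok"]


collected = []
tapped_app = header_tap(app, collected)
body = tapped_app(
    {"REQUEST_METHOD": "GET", "HTTP_USER_AGENT": "curl/8.0"},
    lambda status, headers: None,
)
print(body, collected)
```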

Page 17

Example: Data Available During Ingest

• Record count
• Highest/Lowest record length
• Average record length
• Compression ratio

But with a little more work...

• Field parsing: unstructured data is not all that unstructured…
– Unique values
– Unique values per field
– Access to values of each field independently from the record
– Relatively fast field-based searches, without indexing
– Value encoding
– Etc.

These are cross-cutting concerns!

Page 18

Demo

Page 19

Thank You!

Questions & Answers

Follow: @tmccuch, @z_oleg, @hortonworks
