Adding Hadoop to Your Analytics Mix?

MAKING BIG DATA COME ALIVE. Adding Hadoop to Your Analytics Mix: Challenges and Strategies. Madina Kassengaliyeva, July 23, 2015

Upload: think-big-a-teradata-company

Post on 18-Aug-2015


TRANSCRIPT

Page 1: Adding Hadoop to Your Analytics Mix?

MAKING BIG DATA COME ALIVE

Adding Hadoop to Your Analytics Mix: Challenges and Strategies

Madina Kassengaliyeva, July 23, 2015

Page 2: Adding Hadoop to Your Analytics Mix?

© 2015 Think Big, a Teradata Company 04/18/2023

Presenters

Madina Kassengaliyeva, Director, Client Services, Think Big

Madina Kassengaliyeva is responsible for ensuring successful delivery of Think Big’s service engagements. Madina has led strategy, engineering and data science engagements in a variety of areas, including recommendation engines, customer interaction optimization, marketing analytics and compliance. Madina holds an MBA from the University of Chicago and a BA in International Studies from American University.

Paul Barsch, Director, Services Marketing, Think Big

Paul Barsch directs marketing programs for Think Big, a Teradata Company. Paul has been in IT for 15+ years in a variety of roles at Teradata, HP Enterprise Services and KPMG Consulting.

Page 3: Adding Hadoop to Your Analytics Mix?


Housekeeping

Use the widget bar below to…

Get valuable resources & complete exit survey

Ask Questions to the Presenters

Request online technical help

Go social and follow the conversation

Page 4: Adding Hadoop to Your Analytics Mix?

Agenda

• Hadoop Adoption Path

• Key Challenges – Data, Organization, Capabilities

• Ideas for Solutions

Page 5: Adding Hadoop to Your Analytics Mix?

Common Hadoop Adoption Path

1. Address Immediate Needs

• Hadoop used to relieve a technology pain point

• Reduce data warehouse costs

• Speed up ETL

• The only users are in technology teams

2. Establish a Data Repository

• More and more data gets added to Hadoop as a result of Phase 1

• Greater data variety, more raw data, deeper history

• Initial data transfer, security, and governance practices are established

• Still perceived as largely a technology platform

3. Initial Analytics Exploration

• Limited number of people or teams conduct POCs using Hadoop

• Analytics techniques not available on traditional platforms are applied

• Early wins indicate promising business impact and excitement builds

4. Integrate Hadoop into the Analytics Capabilities

• Multiple teams use Hadoop as part of the analytics infrastructure

• Techniques, methods, best practices and access patterns get codified

• Business begins to capture consistent value

Transition from Phase 3 to Phase 4 is when key challenges emerge

Page 6: Adding Hadoop to Your Analytics Mix?


Hadoop Adoption – Critical Point

Page 7: Adding Hadoop to Your Analytics Mix?


Key Challenges

Data

• Impact of schema on read

• Consistent taxonomies and reference data

• Architecture - access patterns and flows

Organization

• Skills, roles and responsibilities

• Lack of common vocabulary

• Knowledge capture and sharing

Capabilities

• Foundational capabilities at the whim of changing business priorities

• Future that’s hard to envision is hard to build

Page 8: Adding Hadoop to Your Analytics Mix?


Organization – Key Challenges

• Skills, roles and responsibilities
   o Significant skills gaps between what’s currently available and what is needed
   o Both business and technology do analytics and often engineering, blurring lines of responsibility or ownership
   o “Throw over the wall” doesn’t work

• Lack of common vocabulary
   o Every BU (and every leader) has their own understanding of the same words
   o This is rarely discussed

• Knowledge capture and sharing
   o Multiple teams work with the same data and similar techniques
   o Organizational silos do not naturally support broad knowledge transfer

Page 9: Adding Hadoop to Your Analytics Mix?

Organization – Ideas for Solutions

• Cross-BU committee to guide organizational change, define common vocabulary, defend the effort to executive leadership and share success

• Thorough, honest skills assessments to identify gaps, training needs and augmentation needs, mapped to roles and responsibilities

• Documented tools requirements based on current and projected skills

• Collaboration architecture

• Plug into existing knowledge transfer practices and tools and allow for informal information exchange based on data access privileges

Page 10: Adding Hadoop to Your Analytics Mix?


Organization – Key Functions

Strategy

Data Management & Governance

Architecture Tools Market Research

Roadmap Planning

Value Realization

Future Data Sources

Services

Support

Visualization & Reporting

Data SMEs

Core Platform Development Testing

Operations

Core Platform Management

Metrics Tracking & Reporting Platform Integration

Program Management

Roadmap Execution

Cross Group Coordination

Financial Management

Small Project Prioritization

Communication & Change Management

Application Development

Analytic Sandbox

Data Science

Integration, Interfaces & Ingestion

Training

Incident Management

Config, Change, Release Management

Problem Management

Help Desk

Knowledge Management

Technology Governance

Data Quality & Metrics

Access Controls

Data Governance

Metadata Management

Page 11: Adding Hadoop to Your Analytics Mix?

Capabilities – Key Challenges

• Foundational capabilities at the whim of changing business priorities
   o Lack of consensus on which capabilities are foundational
   o Let’s be honest, the “Top Project” changes often and the resources go with it
   o Foundational capabilities do not immediately impact the bottom line

• Future that’s hard to envision is hard to build
   o Lack of shared vision
   o Clarity needed at multiple levels – strategy, operational details, day to day

Page 12: Adding Hadoop to Your Analytics Mix?

Capabilities – Ideas for Solutions

• Consolidate ownership in a team that has organizational influence and includes representatives from the business, infrastructure, architecture, data, and analytics

• Back to vocabulary – agree on what capabilities mean for your business unit and your technology partners

• Roadmaps are useful – visual representations of high-level goals against a timeline that should define your projects

• Dedicate resources to capabilities and protect them

• Check in with your roadmap – does it still reflect your vision?

Photo courtesy of Flickr. Creative Commons. By E.Bass.

Page 13: Adding Hadoop to Your Analytics Mix?


Capabilities Pyramid

Page 14: Adding Hadoop to Your Analytics Mix?


Capabilities: Roadmap Example

Analytics standardized methods, code, tools, team roles

Operations standardized processes, tools, team roles

Skills and roles matrix

Data Ingestion, Transfer, Structuring, and Governance approach

Unified Model Management

Integrated Data Science

Variables based on single source structured data

Variable selection in Hadoop

Integration with existing scoring engine

Batch data processing in Hadoop

Integration

Cross-channel and intraday variables generation

Batch scoring in Hadoop

Natural language processing to analyze text and voice

Initial real-time scoring

Execution Methodology and project management

Data and Models

Organization and Management

Analytics Knowledge Management

Scoring Architectural and Analytical design

Data Lifecycle Management

Real-time scoring design

Statistical and machine-learning-based modeling

Data Exploration of unstructured data components (e.g. URL, chat text)

Data Exploration of structured data components (e.g. page views,

Cross-channel variables, variables from unstructured data + intraday variables

Page 15: Adding Hadoop to Your Analytics Mix?

Data – Key Challenges

• Impact of schema on read
   o Hadoop supports a variety of data structures, which simplifies data ingestion and allows data users to define preferred schemas
   o This shifts the burden of defining the schema to the data users

• Consistent taxonomies and reference data
   o Meaningful data analysis requires known and consistent taxonomy
   o New taxonomies can get created by individual teams
   o Reference data changes

• Architecture - access patterns and flows
   o Data flows across platforms, regular updates, physical and virtual constraints
   o Decisions on what should be done where
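To make the schema-on-read point concrete, here is a minimal sketch in plain Python (no Hadoop APIs involved). It assumes raw events are stored as JSON lines exactly as they arrived, and that each consuming team supplies its own schema when reading. The sample events, field names, and the `read_with_schema` helper are hypothetical and exist only for illustration.

```python
import json

# Hypothetical raw events, stored as they arrived (no schema enforced on write).
RAW_EVENTS = [
    '{"user": "u1", "page": "/home", "ts": "2015-07-23T10:00:00", "channel": "web"}',
    '{"user": "u2", "page": "/cart", "ts": "2015-07-23T10:05:00"}',
    '{"user": "u1", "chat_text": "where is my order?", "ts": "2015-07-23T10:07:00", "channel": "chat"}',
]


def read_with_schema(raw_lines, schema):
    """Apply a schema at read time.

    `schema` maps each wanted field to a default value. Fields missing
    from a record get the default; fields not in the schema are dropped.
    """
    rows = []
    for line in raw_lines:
        record = json.loads(line)
        rows.append({field: record.get(field, default)
                     for field, default in schema.items()})
    return rows


# Two teams read the same raw data through different (illustrative) schemas.
WEB_SCHEMA = {"user": None, "page": None, "channel": "unknown"}
TEXT_SCHEMA = {"user": None, "chat_text": ""}

web_rows = read_with_schema(RAW_EVENTS, WEB_SCHEMA)
text_rows = read_with_schema(RAW_EVENTS, TEXT_SCHEMA)
```

Ingestion stays trivial because nothing is validated on write, but every reader now has to decide on field names and defaults. If two teams choose differently (for example, different defaults for a missing `channel`), their results diverge, which is the burden the slide describes.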

Page 16: Adding Hadoop to Your Analytics Mix?

Data – Ideas for Solutions

• Big issue with lots of opinions – see Data Lake et al.

• Test and define common data manipulation patterns for different use cases – aggregations, reductions, basic statistical derivations

• Centralize the responsibility for data governance, data architecture, taxonomy, and maintenance

• Establish knowledge sharing for data post-analytics

Photo courtesy of Flickr. Creative Commons. By Renzo Ferrante
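As one possible reading of the "common data manipulation patterns" idea, the sketch below defines three small, reusable helpers in plain Python: an aggregation, a reduction (interpreted here as trimming records to the fields a use case needs), and a basic statistical derivation. The records, field names, and helper names are hypothetical. In practice such patterns would be written for whichever processing engine the team has standardized on.

```python
from collections import defaultdict
from statistics import mean, pstdev

# Hypothetical page-view records; field names are illustrative only.
PAGE_VIEWS = [
    {"user": "u1", "channel": "web", "views": 4},
    {"user": "u2", "channel": "web", "views": 6},
    {"user": "u1", "channel": "mobile", "views": 2},
    {"user": "u3", "channel": "mobile", "views": 8},
]


def aggregate_sum(records, key, value):
    """Aggregation: total of `value` per distinct `key`."""
    totals = defaultdict(int)
    for record in records:
        totals[record[key]] += record[value]
    return dict(totals)


def reduce_to_fields(records, fields):
    """Reduction: keep only the listed fields of each record."""
    return [{field: record[field] for field in fields} for record in records]


def basic_stats(records, value):
    """Statistical derivation: count, mean, and population standard deviation."""
    values = [record[value] for record in records]
    return {"count": len(values), "mean": mean(values), "stdev": pstdev(values)}


views_by_channel = aggregate_sum(PAGE_VIEWS, "channel", "views")
slim_records = reduce_to_fields(PAGE_VIEWS, ["user", "views"])
view_stats = basic_stats(PAGE_VIEWS, "views")
```

Agreeing on a small, tested set of helpers like these gives teams a shared starting point, so the same aggregate is not computed in slightly different ways by different groups.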

Page 17: Adding Hadoop to Your Analytics Mix?

Client Example – Centralized Data Group

• Data management, knowledge, architecture, and processing assurance

• Investment justification, research, knowledge sharing

• Data aggregation and enhancement

Data Source 1

Data Source 2

Data Source 3

Data Source 3

Business Group

Product Group

Central Tech Group

Page 18: Adding Hadoop to Your Analytics Mix?


Conclusions

Data

• Centralize data management

• Knowledge of data = knowledge of business

Organization

• Technology is not enough – need the right people and processes

• Executive commitment is key

Capabilities

• Tough conversations can yield much better alignment

• Dedicate and protect resources to build capabilities

Page 19: Adding Hadoop to Your Analytics Mix?

Who is Think Big?

• 100% Big Data Focus

• Founded in 2010, with 100+ engagements across 70 clients

• Unlock value of big data with data science and data engineering services

• Proven vendor-neutral open source integration expertise

• Agile team-based development methodology

• Think Big Academy for skills and organizational development

• Global delivery model

Page 20: Adding Hadoop to Your Analytics Mix?

Questions and Answers

Thank You!