Taming the Beast: Extracting Value from Hadoop
TRANSCRIPT
John L Myers
Enterprise Management Associates
Managing Research Director
@johnlmyers44
Ingo Mierswa
RapidMiner
Founder & CTO
Panel Moderator
Lyndsay Wise, Research Director, EMA
Lyndsay has over 10 years of experience in software
research, BI consulting, and strategy development,
specializing in software evaluation and best-fit solution
selection. Her focus at EMA is on data integration, data
governance, cloud technologies, data visualization,
analytics, and collaboration.
Slide 2 © 2015 Enterprise Management Associates, Inc.
Featured Speakers
John Myers, Managing Research Director, EMA
John has over 10 years of experience working in areas related to business
analytics in professional services consulting and product development
roles. Additionally, John helps organizations solve their business analytics
problems, whether they relate to operational platforms – such as customer
care or billing – or applied analytical applications – such as revenue
assurance or fraud management.
Ingo Mierswa, Founder & CTO, RapidMiner
Ingo, an industry-veteran data scientist, is the founder and CTO of
RapidMiner, the industry’s #1 open source platform for predictive
analytics. Ingo is passionate about the technological innovation enabled
by the open source community and envisions a world where easy-to-use
predictive analytics software empowers all business analysts and data
scientists. Ingo is the author of numerous award-winning publications
about predictive analytics and big data, and has spoken at countless
industry events.
Logistics for Today’s Webinar
Event Presentation
A PDF of the PowerPoint presentation will be available
Event Recording
An archived version of the event recording will be available at www.enterprisemanagement.com
Questions
• Log questions in the Q&A panel located on the lower-right corner of your screen
• Questions will be addressed during the Q&A session of the event
Join the Conversation…
Submit your questions or comments to the panel
using: @wiseanalytics @johnlmyers44 @rapidminer
#predictiveanalytics
Topic #1: Issues With Data Lakes
Adoption of Hadoop-based Data Lake Architectures
Topic #2: Obstacles Implementing Analytics On Hadoop
Obstacles Implementing Analytics
Topic #3: Processing Requirements for Predictive Analytics
Required Processing and Compute Latency for Big Data Projects
©2015 RapidMiner, Inc. All rights reserved. - 12 -
Architecture of Hadoop
Orchestration node
Worker nodes
Leverage Hadoop’s Compute Capacity
• Design advanced analytics workflows in your predictive analytics platform
• Ensure your solution automatically translates predictive analytics needs into native Hadoop code, e.g., MapReduce, Hive, Pig, Spark, etc.
• Push predictive analytics instructions into your Hadoop cluster
• Hadoop performs calculations across the entire cluster for a holistic view of your data
• Data remains in Hadoop
• Results are delivered to the business
• Recommendations
– GUI workflow language (code-free)
– Don’t forget about security
[Diagram labels: Predictive Analytics Platform; Analytic instructions translated to native Hadoop; Calculations; Results; Results operationalized in business processes]
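As a rough illustration of what "translating analytics needs into native Hadoop code" can mean, the sketch below turns one small workflow step into a HiveQL string that a cluster could run where the data lives. This is not RapidMiner's actual translation layer: the `AggregateStep` class, the `to_hiveql` function, and the table and column names (`viewing_events`, `channel`, `minutes_watched`, `event_date`) are all hypothetical and exist only for this example.

```python
from dataclasses import dataclass
from typing import Optional

# Aggregate functions this sketch is willing to emit (an arbitrary, assumed subset).
ALLOWED_FUNCS = {"AVG", "SUM", "MIN", "MAX", "COUNT"}


@dataclass
class AggregateStep:
    """Hypothetical workflow step: optionally filter rows, group them, aggregate one column."""
    table: str
    group_by: str
    metric: str                   # column to aggregate
    func: str = "AVG"             # SQL aggregate function name
    where: Optional[str] = None   # optional row filter, already written in SQL syntax


def to_hiveql(step: AggregateStep) -> str:
    """Render the step as a HiveQL query string; nothing is executed here."""
    func = step.func.upper()
    if func not in ALLOWED_FUNCS:
        raise ValueError(f"unsupported aggregate: {step.func}")
    lines = [
        f"SELECT {step.group_by}, {func}({step.metric}) AS {func.lower()}_{step.metric}",
        f"FROM {step.table}",
    ]
    if step.where:
        lines.append(f"WHERE {step.where}")
    lines.append(f"GROUP BY {step.group_by}")
    return "\n".join(lines)


# Example step with made-up table and column names.
step = AggregateStep(
    table="viewing_events",
    group_by="channel",
    metric="minutes_watched",
    where="event_date >= '2015-01-01'",
)
query = to_hiveql(step)
print(query)
```

The point of the design is that only the short query string travels to the cluster and only the aggregated result travels back, which matches the slide's claim that the data itself remains in Hadoop.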
Topic #4: Successful Big Data Analytics Projects
Project Success
OPERATIONALIZE Predictive Decisions
Close the Loop Between Insight and Action
Embed predictive models into critical business processes
Recommend best options for human or automated actions
Topic #5: Best Practices For Implementing Advanced/Modern Analytics
EFFORTLESS Predictive Analytics
Immediately Empower Analysts to Anticipate Opportunity & Risk
Easily Combine Any Data at Unlimited Scale with Any Model
Code-Free, Lightning-Fast and Intuitive
Topic #6: Use Of Mixed Environments For Implementation Of Big Data Analytics
Growing Importance of Cloud Resources
Design Once, Deploy ANYWHERE
Leverage Investments in Existing and Future Systems
Design predictive analytics independent of platforms
Seamlessly execute predictive analytics in-memory or in any source, including data-at-rest or data-in-motion
Topic #7: Evolving Role of the Data Consumer
What We Used to Think of Analytical Users
Empowering the Line of Business
Topic #8: Use Cases – Monetizing Insights Buried In Your Multi-Structured Data
Drive Broadcast Revenue and Customer Retention
Challenge: Better understand TV viewing habits to prevent churn and optimize advertising
Solution: Process Big Data from three million TV viewers, in real-time, to make program recommendations and personalized advertising
<5 s: time to generate high value activities based on predictive analytics
“RapidMiner allows us to leverage Big Data, in real-time.”
-- Avi Bernstein, Professor at the University of Zurich, Department of Informatics
Track Data from Millions of Companies to Identify Critical Economic Drivers
Challenge: Monitor corporate performance data in real time to identify correlations, outliers, and economic drivers
Solution: Use RapidMiner to mash up data of UK businesses, rapidly prototype predictive models, and identify outlying, unusual data
4.5 M subject matter experts’ content analyzed in the United Kingdom every single day
“We benefit from the availability of community extensions via the RapidMiner Marketplace. We can easily search for what others have designed in RapidMiner, and use the extensions that are a fit for us.”
-- Tom Gatten, CEO
Where To Go From Here?
• Data lakes are an emerging data management architecture
• There are obstacles to fully realizing value from data lakes
• Following best practices and patterns helps
Q&A – Please Log Questions in the Q&A Panel
• Visit RapidMiner.com to learn more about
Effortless Predictive Analytics
• Learn more about leading IT analyst firm Enterprise
Management Associates (EMA) at
enterprisemanagement.com