pentaho business analytics & data integration amjad.akkawi@zaponet.com
Post on 25-Feb-2016
Pentaho business analytics & data integration
Amjad.akkawi@zaponet.com
About Us – Zaponet Data Science Solutions
Zaponet is a service integrator and development shop providing solutions & professional services for building state-of-the-art data products that leverage big-data & data-science technologies.
Zaponet architects, designs and builds big-data solutions: data warehouses, user-profile systems, recommendation engines, complex event processing and more.
Some of our technology partners are: Pentaho, Cloudera, Infobright, Vertica, Kognitio, GigaSpaces
• More details: www.zaponet.com
• Future meetup: Pentaho Weka for data science
About Me – Amjad Akkawi
Zaponet CTO
Experience in Pentaho
Agenda
• Pentaho in business analytics & data integration
• Pentaho BI Demo
• Pentaho PDI Demo
About Pentaho
• Recognized leader in business analytics & data integration
• Subscription-based business model
• Achieved critical mass:
– Over 1,200 commercial customers
– Over 10,000 production deployments
– Over 185 countries
• Stewardship of most important open source analytics projects
INDUSTRY RECOGNITION
OVER 160 PARTNERS GLOBALLY
Why Customers Love Pentaho
Innovation & Scalability
Superior Customer Service
Total Value
8 weeks time to market
2 weeks time to market
€350K+ cost saving
75% lower acquisition costs
Music files from 20,000 sources
Operational reports at all 1000 retail stores
Less than 1 month ROI
Analyzing buying patterns of 5 million members
Analytics on 500,000 patient records
…“better functionality and more support”
…“top-notch professional support”
“Pentaho support is as good as its software”
…“a great partner through every phase of our project”
…“ROI was almost immediate”.
Fully rolled out in budget in 4 months
Marketing dashboard in less than 1 day
Speed of Deployment
Pentaho in the Big Data Fabric
Big Data Mgmt
• Hadoop: Java MapReduce, Pig, Pentaho MapReduce
• NoSQL Databases
• Analytic Databases
Data Integration
• Data Integration
• Job Orchestration
• Workflow
• Scheduling
• High Performance
• Visual IDE
Big Analytics
• Pentaho Business Analytics
• R
• 3rd Party BI Tools
• Applications
• 3rd Party Tools
High Level Feature/Functions
• Data Mining (Advanced Power Users & Viewers): Advanced Predictive Analysis
• Dashboards (Information Consumers): Self-service Interactive KPI & Metrics and Visualization
• Analysis (Knowledge Workers / Business Users): Self-service Interactive and Ad Hoc Analysis
• Reporting (Business Users): Ad hoc and Operational Reports
• Data (Power Users, Developers & DBAs): High Performance Data Integration, BIG DATA, Cleansing and Presentation
Components are independent
Dashboards
Dashboards & Interactive Dashboards
Dashboards – Geo Location-Based
Reports – Interactive, Static, Distributed
Reports – Reporting Pack & House Styles
Enhanced In-Memory Analytics
• Enhanced in-memory caching for speed-of-thought visualization & analysis
– More re-usability of in-memory data
– Fewer trips to the database/disk
• Builds on existing unique extreme-scale in-memory analytics
– Support for external data grids
• Infinispan / JBoss Enterprise Data Grid and Memcached
• Scale to caching hundreds of GBs (potentially TBs) of data in-memory
• Competition
– Java heap or C++ memory space (a few GB at most; most BI products), or
– Proprietary (hard to manage) in-memory technology (e.g. QlikView, MicroStrategy)
Analyzer – Table format
Analyzer – Chart format
Analyzer: Geo Location-Based Analysis
Scenario 1
Operational Database
Dashboard
Report
Scenario 2
Data Mart(s) / Warehouse
Metadata
Dashboard
Report
Analyzer
Metadata – Schema Workbench
Complex calculations and multi-cube requirements may need more modeling
Scenario 3
Unstructured Data
Data Mart(s) / Warehouse
Structured Data
BIG DATA Technology and/or Staging Area & Data Vault
Pentaho Data Integration
Source data acquisition
Initial consolidation as required
Pentaho Data Integration
Cleansing
Transformation
Change Data Capture
Data Warehouse Management
PDI PDI Metadata
Dashboard
Report
Analyzer
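Change Data Capture, listed among the PDI tasks in Scenario 3, can be approached in several ways. The sketch below shows only the simplest one, comparing two keyed snapshots of a source table. It is an illustration under that assumption, not a description of how PDI implements the task; the function name and the sample data are hypothetical.

```python
from typing import Dict, Hashable


def capture_changes(previous: Dict[Hashable, dict],
                    current: Dict[Hashable, dict]) -> Dict[str, dict]:
    """Compare two snapshots keyed by primary key and classify the differences."""
    # Keys present only in the new snapshot are inserts.
    inserted = {k: current[k] for k in current.keys() - previous.keys()}
    # Keys present only in the old snapshot are deletes.
    deleted = {k: previous[k] for k in previous.keys() - current.keys()}
    # Keys in both snapshots whose row content differs are updates.
    updated = {k: current[k]
               for k in current.keys() & previous.keys()
               if current[k] != previous[k]}
    return {"inserted": inserted, "updated": updated, "deleted": deleted}


if __name__ == "__main__":
    yesterday = {1: {"name": "Ann", "city": "Haifa"},
                 2: {"name": "Bob", "city": "Acre"}}
    today = {1: {"name": "Ann", "city": "Nazareth"},
             3: {"name": "Cid", "city": "Haifa"}}
    print(capture_changes(yesterday, today))
```

Only the changed rows then need to flow into the data mart or warehouse, instead of a full reload.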
Variations on a Theme
Unstructured Data
Ad-hoc Data
Data Mart(s) / Warehouse
Structured Data
Alerting: SMS, email & attachments
Pentaho Data Integration
Source data acquisition
Initial consolidation as required
Pentaho Data Integration
Cleansing
Transformation
Change Data Capture
Data Warehouse Management
PDI PDI Metadata
Dashboard
Report
Analyzer
BIG DATA Technology and/or Staging Area & Data Vault
PDI Components
• Enterprise Edition Data Integration Server
– Execution and remote monitoring
– Integrated scheduling
– Enterprise Security options
– Enhanced content management including revision history and locking
– Remote distributed cluster based processing
Kettle Conceptual Model
Pentaho Data Integration
Step-based processing engine with instant visualization of results
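The idea of a step-based engine, where rows stream from one step to the next, can be sketched in a few lines. This is a conceptual illustration only, not Kettle's engine or API; the step helpers (`filter_rows`, `add_field`, `run_pipeline`) and the sample rows are hypothetical.

```python
from typing import Callable, Dict, Iterable, Iterator, List

Row = Dict[str, object]
Step = Callable[[Iterable[Row]], Iterator[Row]]


def filter_rows(predicate: Callable[[Row], bool]) -> Step:
    """Build a step that passes on only the rows matching the predicate."""
    def step(rows: Iterable[Row]) -> Iterator[Row]:
        for row in rows:
            if predicate(row):
                yield row
    return step


def add_field(name: str, compute: Callable[[Row], object]) -> Step:
    """Build a step that adds a calculated field to every row."""
    def step(rows: Iterable[Row]) -> Iterator[Row]:
        for row in rows:
            yield {**row, name: compute(row)}
    return step


def run_pipeline(rows: Iterable[Row], steps: List[Step]) -> List[Row]:
    """Chain the steps so each row streams through them in order."""
    stream: Iterable[Row] = rows
    for step in steps:
        stream = step(stream)
    return list(stream)


if __name__ == "__main__":
    orders = [{"qty": 2, "price": 5.0},
              {"qty": 0, "price": 9.0},
              {"qty": 1, "price": 3.5}]
    result = run_pipeline(orders, [
        filter_rows(lambda r: r["qty"] > 0),
        add_field("total", lambda r: r["qty"] * r["price"]),
    ])
    for row in result:
        print(row)
```

Because each step is a generator, the output of any prefix of the pipeline can be inspected on its own, which is roughly what previewing results per step amounts to.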
Pentaho Data Integration
Step-based performance
Pentaho Data Integration
Integrated Metadata Creation
Pentaho and Big Data
Forrester Wave, Enterprise Hadoop Solutions, Q1 2012
Only vendor in strong performer category: “an impressive Hadoop integration tool”
Only business analytics vendor
Richest functionality
Most extensive integration with open source Apache Hadoop and major Hadoop distributions
Expanded Insight into Big and Diverse Data
• Improved support for Hadoop
– Simpler deployment across Hadoop clusters
• Support for the Hadoop cache
• Debian / RPM installer
– Performance and ease of use enhancements for Pentaho MapReduce visual development
– Support for Hadoop Security data access
• New NoSQL database support
– Cassandra
– MongoDB
• Growing the Pentaho big data community
– Open sourced all big data components (Hadoop & NoSQL)
• Apache License, the same as used by leading Hadoop and NoSQL distros
– New big data developer resources: how-to documents, videos, walk-throughs
Hadoop Data Management & Integration
Accessible by any ETL developer or data scientist
Pentaho MapReduce
NoSQL Data Management & Integration
Accessible by any ETL developer or data scientist
Visual Job Orchestration
Any Data Source
Scheduling
Accessible to any ETL developer or data scientist
Pentaho Integration Options
Pentaho BI Server
Other Application
Pentaho
Custom Stuff
My Application
Pentaho Components
Integration: Bundled, Mashup, Extended, Embedded
Value
• Bundled: Fastest way to get analytics that have your look & feel
• Mashup: An integrated experience for your end user
• Extended: Customizing Pentaho for your experience
• Embedded: Ultimate integration and customization
What it Takes?
• Pentaho is a separate app, branded with Partner’s logo, look & feel
• Optional: Partner app may include links to Pentaho reports, analysis and dashboards (popping new window)
• Optional: Single sign-on creates a seamless experience
• Pentaho & Partner app have the same UI
• Pentaho User Console, or individual reports, analysis or dashboards are included in partner app
• Single sign-on creates a seamless experience
• Pentaho’s core functionality is extended through plug-ins. Examples:
– Connecting to custom data sources
– Adding new visualizations
– Customizing security
– Replacing Pentaho rules engine
• Integrate with Partner’s App Server
• Directly embedding Pentaho into your app
• Calling Pentaho Java APIs from your app
Skill Level
• Bundled: Limited HTML skills
• Mashup: HTML skills
• Extended: HTML skills, Java skills
• Embedded: HTML skills, Java skills, knowledge of Pentaho architecture
Q & A
NEXT …
Pentaho PDI Demo
Pentaho BI Demo
“Traditional” Database Support
DATA INTEGRATION
DATA ANALYSIS
Broadest Support for Big Data Platforms
Hadoop NoSQL Analytic Databases