MapR – Pentaho Business Solutions
The Benefits of a Converged Platform to Big Data Integration
Tom Scurlock, Director, WW Alliances and Partners, MapR
Key Takeaways
1. We “focus on business values” and “business outcomes”; Pentaho and MapR technologies follow the solutions.
2. Business transformation is complex. Companies need to evaluate how data can be used to generate revenue streams using a converged data platform.
3. Companies that invest in one converged data platform for large data volumes, variety and velocity will increase revenues, business agility and productivity in today’s digital economy.
Agenda
• Why is the data used in business transformation complex?
• Many point solutions on multiple clusters
• MapR – Converged data platform
• Infrastructure:
  – Optimized resource consumption
  – MapR builds a breakthrough technology
• Data reference architecture
• Next generation of enterprise applications:
  – Enable transformation through converged applications
  – ‘To-Be’ enterprise applications architecture
  – Pentaho and MapR solutions
Why Is the Data Used in Business Transformation Complex?
Data resides in multiple places
Changing regulatory requirements
Data is messy and incomplete
Data is structured, semi-structured and unstructured
Data is inconsistent
ANALYTICAL APPLICATIONS
Business insight
OPERATIONAL APPLICATIONS
Business performance
Next Generation of Enterprise Converged Applications
Next-Gen Applications
Complete Access to Real-time and Historical Data in One Platform
Business Transformation Challenges
Application Integration
B2B Integration
File Integration
Process Integration
Data Consumption
Data Synchronization
Data Integration
Market Forces
Cloud
Mobile
E-Commerce
Big Data
IoT
Industry Initiatives
Fraud Detection in Real Time
360 Customers Visibility
Predictive Maintenance
Smart Logistics
Omni-Channel Optimization
Hadoop & Spark Cluster
Document DB
Classic Data Warehouse
NoSQL
Application Server
Message Middleware
IBM Mainframe
Expensive to Stitch | Fragile | Limitations for Speed, Scale, Reliability
JSON API
ODBC
JMS
HBASE API
REST API
ODBC/JDBC
C-Levels
HDFS API
BUSINESS CONCERN:
• DATA BECOMES EXPENSIVE
• I/O SPEED
• SCALE/PERF.
• SECURITY
• H.A.
• COMPLEXITY
• SINGLE TENANT
• CLUSTER SPRAWL
IT Budget
ESB / Data Integration Platform
Converged Data Platform on a Single Cluster
Engineered as single platform
Files, Tables, Documents, and Streams
Enterprise-Grade Capabilities
Runs at a lower cost
Converged application development platform
Scale-out to any workload
“Connected”, not converged
Separate solutions per data type
Operational concerns
Expensive to operate
Data replication, data movement
Limited in scale
Hadoop cluster
Stream processing
Classic data warehouse
Messaging platform
NoSQL database
Document Database
Search server
Many Point Solutions on Multiple Clusters
© 2017 MapR Technologies
MAPR CONVERGED DATA PLATFORM
Data Center
CONVERGED DATA PLATFORM
High Availability | Real Time | Unified Security | Multi-tenancy | Disaster Recovery | Global Namespace
EVENT DATA STREAMS
ANALYTICS & ML ENGINES
OPERATIONAL DATABASE
CLOUD-SCALE DATA STORE
ON-PREMISE, IN THE CLOUD, HYBRID
Existing Enterprise Applications
Intelligent Applications
Batch & Interactive Analytics
IoT/Edge
Complex Event Processing in Real-Time: Business Rules Engine, Process Automation, Identify Alerts, Patterns.
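To make the "business rules engine" idea on this slide concrete, here is a minimal sketch in plain Python of rules evaluated against a stream of events to raise alerts. It is an illustration only and does not use any MapR or Pentaho API; the `Event` fields, the rule names, and the thresholds are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator, List, Tuple


@dataclass(frozen=True)
class Event:
    """Hypothetical payment event; the fields are illustrative only."""
    account: str
    amount: float


@dataclass(frozen=True)
class Rule:
    """A named predicate that decides whether an event raises an alert."""
    name: str
    matches: Callable[[Event], bool]


def evaluate(events: Iterable[Event], rules: List[Rule]) -> Iterator[Tuple[str, Event]]:
    """Yield (rule name, event) for every rule an incoming event satisfies."""
    for event in events:
        for rule in rules:
            if rule.matches(event):
                yield rule.name, event


# Hypothetical rules; real thresholds would come from the business.
RULES = [
    Rule("large_payment", lambda e: e.amount > 10_000),
    Rule("negative_amount", lambda e: e.amount < 0),
]

events = [Event("A-1", 250.0), Event("A-2", 12_500.0), Event("A-3", -40.0)]
alerts = list(evaluate(events, RULES))
```

Because `evaluate` is a generator, it can consume an unbounded iterator of events as they arrive, which is the property a real-time engine would need.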
MapR – Enterprise Converged Data Platform
CONVERGED DATA PLATFORM
High Availability | Real Time | Unified Security | Multi-tenancy | Disaster Recovery | Global Namespace
EVENT DATA STREAMS
ANALYTICS & ML ENGINES
OPERATIONAL DATABASE
CLOUD-SCALE DATA STORE
Container Apps
Operational Intelligence: ML, Streaming Analytics
Event-Enabled Intelligent Applications
Data Producers
Legacy Apps (ERP, CRM, BI)
HDFS API | POSIX, NFS | HBase API | JSON API | Kafka API | JDBC/ODBC
On-Premise, In the Cloud, Hybrid
Smart Logistics
Fraud Detection in Real Time
Predictive Maintenance
Supply Chain Visibility
Omni-Channel Customer Engagement
(PUB/SUB)
REST API
Optimized Resource Consumption
Linux File System (general purpose, slower than MapR-FS, leaves HA up to other engines)
Storage Hardware
HDFS (append-only)
Java Virtual Machine
HBase (excessive writes)
Java Virtual Machine
Storage Hardware
Every layer contends for more CPU and memory
Efficient architecture frees up resources: shared HA, DR, and I/O systems
Java Virtual Machine
Kafka (separate cluster)
X
Replace with speed, connectivity, HA/DR
Replace with less I/O and RAM consumption
Eliminate layer
Replace with full read-write
MapR-FS + MapR-DB + MapR Streams
Fast, Efficient, Direct I/O
Eliminate layer
X
Hadoop: MapR:
MapR – Built with Breakthrough Technology
Innovative architecture delivers uncompromising scale, speed and availability
Optimized for Speed
Supports parallel processing of large-scale analytics and machine learning across data.
Optimized for Availability
Provides advanced capabilities including self-healing and disaster recovery to support continuous data access.
Optimized for Scale
Enables high-scale processing by organizing underlying data into large distributed containers to scale to trillions of files.
MapR Data Reference Architecture
ANALYTICAL APPLICATIONS
Business insight
OPERATIONAL APPLICATIONS
Business performance
Next Generation of Enterprise Converged Applications
Next-Gen Applications
Complete Access to Real-time and Historical Data in One Platform
Central Data Processing & Aggregation
Ad-hoc analysis
Other Data Sources
Real-time analysis
Reporting
Streaming
Stream
Topic Replicating
Data Centers Worldwide
Stream
Topic
Stream
Topic
Data Ingesting
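The stream, topic, and replication labels on this slide can be illustrated with a small in-memory sketch. It is not the MapR Streams or Kafka API: the `Stream` class, its methods, and the site and topic names are hypothetical, and replication here is one-way and synchronous purely to keep the example short.

```python
from collections import defaultdict


class Stream:
    """In-memory stand-in for a stream that holds append-only topics."""

    def __init__(self, site):
        self.site = site                  # e.g. the data center hosting this stream
        self.topics = defaultdict(list)   # topic name -> ordered list of messages
        self.replicas = []                # streams that receive a copy of each message

    def replicate_to(self, other):
        """Register another stream (another site) as a replication target."""
        self.replicas.append(other)

    def publish(self, topic, message):
        """Append locally, then copy the message to every registered replica."""
        self.topics[topic].append(message)
        for replica in self.replicas:
            replica.topics[topic].append(message)

    def read(self, topic, offset=0):
        """Return all messages in a topic starting at the given offset."""
        return self.topics[topic][offset:]


edge = Stream("edge-site")
central = Stream("central-dc")
edge.replicate_to(central)

edge.publish("sensor-readings", {"device": "pump-7", "temp_c": 81.5})
edge.publish("sensor-readings", {"device": "pump-7", "temp_c": 83.0})

# A consumer in the central data center reads from its local copy.
latest = central.read("sensor-readings", offset=1)
```

Consumers track their own offset, so ingestion at one site and analysis at another can proceed independently over the same ordered topic.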
‘To-Be’ Enterprise Application Architecture: Globally Distributed
Stage Business Agility: Accelerates large-scale deployment.
Stage Infrastructure Agility: Offers ETL offloading and data optimization to reduce TCO.
Stage Applications Agility: Delivers event-enabled apps, monitoring business decisions in real time.
MapR - Converged Data Platform
Journey with MapR Converged Data Platform
Business Intelligence: Re-use of existing BI apps (Tableau, MicroStrategy, Cognos, Crystal, custom apps, etc.).
Next-Gen Apps: Operational Intelligence, Streaming Analytics, Machine Learning, IoT Predictive Maintenance.
On-Prem/Cloud: Data replication, HA between data centers in real time.
Micro-Services: Docker, containers, images, Kubernetes, Docker Swarm.
Data Fabric: Fast TCO reduction in data appliances and servers, data off-loading & optimization.
Adoption + Business Value
Benefits:
• Unified files, tables and event streams under one admin
• On-premise, any cloud
• Next-gen apps, micro-services
• Data normalization for structured, semi-structured & unstructured data from multiple API sources
• Multi-tenancy
• Supports Hadoop Distributed File System & ecosystem
• Shared resources
• Data locality
• Supports NoSQL, HBase DB
• Massively parallel processing
• Full read/write in real time
• Extreme data scalability, HA, security, high performance & reliability
• Geo-replication
Analytical Database
Modality Data
DNA Variant
Clinical Data
Patient Satisfaction
EHR/EMR
PACS/Imaging
Administrative
Financial Data
Cloud Annotation
Pentaho and MapR: Data integration, orchestration and analytics
Converged Data Platform
Validate
Cleanse
Standardize
Blend & Ingest
MapReduce
Spark
Machine Learning
Process & Refine
Analyze
Optimize
Orchestrate
Deliver
Redshift
Virtualized Data
Reports
Discovery
Visualization
Predictive
Data as a Service
Embedded Applications
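The validate, cleanse, standardize, and blend steps named on this slide can be sketched as a chain of small functions. This is plain Python for illustration, not Pentaho Data Integration; the record fields (`patient_id`, `visit_date`, `score`) and the sample data are hypothetical.

```python
def validate(records):
    """Keep only records that carry the required fields."""
    return [r for r in records if r.get("patient_id") and r.get("visit_date")]


def cleanse(records):
    """Strip stray whitespace from every string value."""
    return [
        {k: v.strip() if isinstance(v, str) else v for k, v in r.items()}
        for r in records
    ]


def standardize(records):
    """Normalize identifiers to one convention (upper case here)."""
    return [{**r, "patient_id": r["patient_id"].upper()} for r in records]


def blend(clinical, satisfaction):
    """Join clinical records with satisfaction scores on patient_id."""
    scores = {s["patient_id"]: s["score"] for s in satisfaction}
    return [{**r, "satisfaction": scores.get(r["patient_id"])} for r in clinical]


clinical = [
    {"patient_id": " p-001 ", "visit_date": "2017-03-01"},
    {"patient_id": "", "visit_date": "2017-03-02"},        # fails validation
    {"patient_id": "p-002", "visit_date": "2017-03-05 "},
]
satisfaction = [{"patient_id": "P-001", "score": 4}]

result = blend(standardize(cleanse(validate(clinical))), satisfaction)
```

Keeping each step a pure function over a list of records makes the stages easy to reorder or to hand to an orchestration tool one at a time.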