Webinar: Extending WSO2 Analytics Platform
Mohanadarshan Vivekanandalingam, Associate Technical Lead
Agenda
● Introduction to WSO2 analytics platform
● Examine extensions including
○ Real-time analytics (Siddhi extension)
○ Batch analytics extensions
○ Event Receiver and Event Publisher extensions
○ Predictive analytics extensions
● Outline the benefits of WSO2’s analytics platform through real-world customer use cases.
WSO2 Data Analytics Server
• Fully open source solution for building systems and applications that collect and analyze both real-time and persisted data and communicate the results.
• High performance data capture framework
• Highly available and scalable by design
• Pre-built Data Agents for WSO2 products
Real-time Analytics
Realtime Analytics Extensions
● This includes Siddhi extensions:
■ Custom Function
■ Custom Window
■ Custom Aggregate
■ Custom Stream Function
■ Custom Stream Processor
Function Extension
● Consumes zero or more parameters for each event and outputs a single attribute.
● Can be used to manipulate event attributes to generate a new attribute, like the built-in function operator.
● Extend org.wso2.siddhi.core.executor.function.FunctionExecutor
from InValueStream
select math:sin(inValue) as sinValue
insert into OutMediationStream;
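To make the contract concrete, here is a minimal, library-free Java sketch of what a function extension computes: it receives the parameter values for one event and returns a single attribute. The `SimpleFunction` interface and the `SinFunction` and `FunctionDemo` classes are hypothetical stand-ins, not the Siddhi API; a real extension would extend `FunctionExecutor` instead.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a function extension: many inputs, one output per event. */
interface SimpleFunction {
    Object execute(Object[] params);
}

/** Mirrors math:sin(inValue): takes one numeric parameter and returns its sine. */
class SinFunction implements SimpleFunction {
    @Override
    public Object execute(Object[] params) {
        if (params == null || params.length != 1 || !(params[0] instanceof Number)) {
            throw new IllegalArgumentException("sin expects exactly one numeric parameter");
        }
        return Math.sin(((Number) params[0]).doubleValue());
    }
}

class FunctionDemo {
    /** Applies the function to each event's inValue, as the select clause would. */
    static List<Double> project(SimpleFunction fn, double[] inValues) {
        List<Double> out = new ArrayList<>();
        for (double v : inValues) {
            out.add((Double) fn.execute(new Object[] {v}));
        }
        return out;
    }
}
```

Calling `FunctionDemo.project(new SinFunction(), values)` yields one `sinValue` per input event, which is the shape of result the query above inserts into its output stream.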
Window Extension
● Allows events to be collected and expired based on the given input parameters, without altering the event format, like the built-in window operator.
● Default window types: length, time, unique, etc.
● Extend org.wso2.siddhi.core.query.processor.stream.window.WindowProcessor
from TempStream#window.custom:customWindow(10)
select *
insert into AvgRoomTempStream ;
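The collect-and-expire behaviour can be sketched without the Siddhi library. The class below is a hypothetical sliding length window (the name `CustomLengthWindow` and its methods are assumptions for illustration); a real extension would extend `WindowProcessor` and emit current and expired events through Siddhi's own event types.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Hypothetical sliding window that keeps the most recent `length` events. */
class CustomLengthWindow<E> {
    private final int length;
    private final Deque<E> buffer = new ArrayDeque<>();

    CustomLengthWindow(int length) {
        if (length <= 0) {
            throw new IllegalArgumentException("length must be positive");
        }
        this.length = length;
    }

    /** Adds an event and returns the events that expired as a result (zero or one). */
    List<E> add(E event) {
        List<E> expired = new ArrayList<>();
        buffer.addLast(event);
        if (buffer.size() > length) {
            expired.add(buffer.removeFirst());
        }
        return expired;
    }

    /** Events currently held by the window, oldest first. */
    List<E> current() {
        return new ArrayList<>(buffer);
    }
}
```

Note that the events themselves pass through unchanged; the window only decides which events are current and which have expired.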
Aggregate Extension
● Consumes zero or more parameters for each event and outputs a single attribute holding a result aggregated from the input parameters.
● Used in conjunction with a window to compute the aggregated result over that window.
● Default aggregators: sum, max, avg, etc.
● Extend org.wso2.siddhi.core.query.selector.attribute.aggregator.AttributeAggregator
from pizzaOrder#window.length(20)
select custom:count(orderNo) as totalOrders
insert into orderCount;
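The interplay between a window and an aggregator can be sketched as follows. This is a library-free illustration under stated assumptions: `CountAggregator`, `onAdd`, `onRemove` and `AggregateDemo` are hypothetical names, not the Siddhi API. The idea is that the aggregator is told when an event enters the window and when one leaves it, so the result always reflects the window's current contents.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical count aggregator: updated as events enter and leave a window. */
class CountAggregator {
    private long count;

    long onAdd(Object value) {
        return ++count;
    }

    long onRemove(Object value) {
        return --count;
    }

    void reset() {
        count = 0;
    }
}

class AggregateDemo {
    /** Feeds order numbers through a length window and returns the count after the last event. */
    static long countOverLengthWindow(int windowLength, int[] orderNos) {
        CountAggregator agg = new CountAggregator();
        Deque<Integer> window = new ArrayDeque<>();
        long latest = 0;
        for (int orderNo : orderNos) {
            window.addLast(orderNo);
            latest = agg.onAdd(orderNo);
            if (window.size() > windowLength) {
                // The oldest event expires, so the aggregate is adjusted downward.
                latest = agg.onRemove(window.removeFirst());
            }
        }
        return latest;
    }
}
```

With a window length of 20, as in the query above, the count grows with each order until it reaches 20 and then stays there as old orders expire.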
Stream Function Extension
● Allows events to be altered by adding one or more attributes to them (put simply, it can produce multiple output attributes).
● Events can be output upon each event arrival.
● Extend org.wso2.siddhi.core.query.processor.stream.function.StreamFunctionProcessor
from geocodeStream#geo:geocode(location)
select latitude, longitude, formattedAddress
insert into dataOut;
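A stream function differs from a plain function in that it appends several attributes to the event at once. The sketch below shows that shape in plain Java; `LookupGeocoder` and its placeholder lookup table are hypothetical and stand in for whatever a real geocoder would call, and none of this is the Siddhi API.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stream function: returns the input event with extra attributes appended. */
class LookupGeocoder {
    // Placeholder lookup table; a real geocoder would call an external service.
    private final Map<String, Object[]> table = new HashMap<>();

    void register(String location, double latitude, double longitude, String formattedAddress) {
        table.put(location, new Object[] {latitude, longitude, formattedAddress});
    }

    /** Appends latitude, longitude and formattedAddress (nulls if unknown) to the event. */
    Object[] process(Object[] event, int locationIndex) {
        Object[] extra = table.get((String) event[locationIndex]);
        if (extra == null) {
            extra = new Object[] {null, null, null};
        }
        Object[] out = Arrays.copyOf(event, event.length + extra.length);
        System.arraycopy(extra, 0, out, event.length, extra.length);
        return out;
    }
}
```

The original attributes are preserved and the new ones are added at the end, which is why the query above can select `latitude`, `longitude` and `formattedAddress` after the function is applied.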
Stream Processor Extension
● Allows the event format to be altered.
● Can be thought of as "Window++": a window that can also change the event format.
● Extend org.wso2.siddhi.core.query.processor.stream.StreamProcessor
from baseballData#timeseries:regress(2, 10000, 0.95, salary,
rbi, walks, strikeouts, errors)
select *
insert into regResults;
Batch Analytics Extension
• User Defined Functions (UDF)
• Aggregators for Lucene Indexing
• DataSource Connectors (e.g. HBase, Cassandra, etc.)
User Defined Functions (UDF)
● Apache Spark allows UDFs (User Defined Functions) to be created if you want to use a feature that is not available in Spark by default.
package org.wso2.customUDFs;

import org.wso2.carbon.analytics.spark.core.udf.CarbonUDF;

public class StringConcatonator implements CarbonUDF {

    /**
     * This UDF returns the concatenation of two strings.
     */
    public String concat(String firstString, String secondString) {
        return firstString + secondString;
    }
}
• Add the following to DAS_HOME/repository/conf/analytics/spark/spark-udf-config.xml
<udf-configuration>
    <custom-udf-classes>
        <class-name>org.wso2.customUDFs.StringConcatonator</class-name>
        ...
    </custom-udf-classes>
</udf-configuration>
Aggregators for Lucene Indexing
WSO2 DAS contains 5 default Lucene-based aggregate functions:
● MIN
● MAX
● SUM
● AVG
● COUNT
Users can add a custom aggregate function for Lucene by implementing the interface below (DAS 3.1.0 onwards).
org.wso2.carbon.analytics.dataservice.core.indexing.aggregates.AggregateFunction
Refer to the mail thread: [Architecture] [Analytics] Improvements to Lucene based Aggregate functions (Installing Aggregates as OSGI components)
Datasource Connectors
DAS supports the following datasource connectors by default:
● RDBMS
● Cassandra
● HBase
● HDFS
An extension can be written by implementing the interface below:
org.wso2.carbon.analytics.datasource.core.rs.AnalyticsRecordStore
https://docs.wso2.com/display/DAS310/Configuring+Data+Persistence
Predictive Analytics
Predictive Analytics Extensions
• Dataset Processors
• Input Adapters
• Model Builders
• Output Adapters
Input Adapters
● Used to read data from different storages such as files, HDFS and the registry.
● You can create an ML input adapter by implementing the MLInputAdapter interface.
Dataset Processors
● Each data source should have an implementation of DatasetProcessor.
● ML supports File, HDFS and DAS as data sources, so an implementation class exists for each of them.
Model Builders
● ML model generation can be extended by implementing MLModelBuilders.
● Currently there is a supervised Spark model builder and an unsupervised Spark model builder.
● If you need to extend model generation to some other library or a new algorithm type, you can use this extension point of WSO2 ML.
Output Adapters
● Used to write data to different storages such as files, HDFS and the registry.
● You can create an ML output adapter by implementing the MLOutputAdapter interface.
Event Receiver Extensions
● Allow events to be received from different data sources.
● Implemented with the OSGi whiteboard pattern.
Event Publisher Extensions
• Allow events to be pushed to various data sinks.
• Implemented with the OSGi whiteboard pattern.
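In the whiteboard pattern, an extension does not look up the core and attach itself; it registers an implementation in a shared registry, and the core picks up whatever is registered there. The following is a rough, library-free sketch of that idea. `PublisherAdapter`, `AdapterRegistry` and `UpperCasePublisher` are hypothetical names; in a real deployment the OSGi service registry plays the role of `AdapterRegistry`, and the adapter contracts are the ones defined by the product.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical adapter contract that an extension bundle would implement. */
interface PublisherAdapter {
    String type();

    /** Returns what was sent, purely so the sketch is easy to inspect. */
    String publish(String event);
}

/** Stand-in for the service registry: extensions register, the core only reads. */
class AdapterRegistry {
    private final Map<String, PublisherAdapter> adapters = new LinkedHashMap<>();

    void register(PublisherAdapter adapter) {
        adapters.put(adapter.type(), adapter);
    }

    void unregister(String type) {
        adapters.remove(type);
    }

    /** The core looks an adapter up by type when an event must be published. */
    String publish(String type, String event) {
        PublisherAdapter adapter = adapters.get(type);
        if (adapter == null) {
            throw new IllegalStateException("no adapter registered for type " + type);
        }
        return adapter.publish(event);
    }
}

/** Example extension: a publisher that upper-cases the event before sending it. */
class UpperCasePublisher implements PublisherAdapter {
    @Override
    public String type() {
        return "upper";
    }

    @Override
    public String publish(String event) {
        return event.toUpperCase(Locale.ROOT);
    }
}
```

The benefit of this arrangement is that adding or removing an adapter requires no change to the core: registering `new UpperCasePublisher()` makes the `upper` type available, and unregistering it makes it disappear again.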
Case Studies from Real Customers
Pacific Controls
Pacific Controls is an innovative company delivering an IoT platform of platforms: Galaxy 2021. The platform makes it possible to manage all kinds of devices within a building and to take automated decisions, such as moving an elevator or starting the air conditioning, based on certain conditions. Within Galaxy 2021, CEP is used for monitoring alarms and specific conditions. Pacific Controls also uses other products from the WSO2 platform, such as WSO2 ESB and Identity Server.
https://www.youtube.com/watch?v=OG0N7cfaJ_8
UBER
http://www.infoq.com/presentations/uber-stream-processing
UBER uses WSO2 CEP to detect fraud. P.S.: They do not pay us for it (open source at work!).
A leading airline uses CEP to enhance customer experience by calculating the average time passengers need to reach their boarding gate (going through security, walking, etc.). The airline also wants to track the time it takes to clean a plane, in order to better streamline the boarding process and notify both the airline and customers about potential delays. They evaluated WSO2 CEP first, as they were already using our platform, and decided to use it because it addressed all their requirements.
The Cleveland Clinic, ranked among the top 3 hospitals in the US, uses a Clinical Intelligence Platform that combines big data storage, stream and batch processing to provide decision support to clinicians. Real-time analytics for the platform is provided by WSO2 CEP along with custom extensions to handle healthcare data.
Few more use cases
US Election Monitor https://wso2.com/election2016/
SUPER BOWL 50 - BigData Game
http://wso2.com/landing/big-data-game/
Fraud Detection
• Use or change the generic rules we provide, and add as many rules as you like
• Change the weights of the Fraud Scoring Model to suit your business needs
• Use the Markov Modelling and Clustering capabilities to learn unknown fraud patterns in your domain
• Use the dashboard provided, or plug the Fraud Detection Toolkit into your own fraud detection UI
http://wso2.com/library/webinars/2015/02/catch-them-in-the-act-fraud-detection-with-wso2-cep-and-wso2-bam/
https://www.youtube.com/watch?v=aLwG4thHOXg
ESB Analytics
ESB Analytics can be used to collect statistics, debug, and profile your mediation sequences.
https://docs.wso2.com/display/ESB500/ESB+Analytics
Conclusion
● The next WSO2 Analytics Platform release contains many bug fixes, improvements and features:
○ Incremental Processing - Batch Analytics
○ Siddhi Performance Improvements - Realtime Analytics
○ Siddhi Debugger
○ Analytics features for ESB, APIM, IS, IoT, etc.
○ Cross Tenant Data Retrieval in Super Tenant Spark Queries
○ Custom Lucene Aggregators
● Stay tuned for the next release and related updates.