mule batch processing
MULE - Batch Processing
2
Batch Processing in Mule
Batch is a Mule construct that provides the ability to process messages in
batches. Within an application, you can initiate a batch job which is a block
of code that splits messages into individual records, performs actions upon
each record, then reports on the results and potentially pushes the
processed output to other systems or queues.
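The split, process, and report flow described above can be illustrated with a minimal plain-Java sketch. This is not Mule's API: the class `BatchSketch`, the method `run`, and the `BatchResult` type are hypothetical names chosen for illustration, and a parallel stream stands in for whatever threading the Batch module actually uses.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Consumer;

// Hypothetical illustration of a batch job's phases; not Mule's Batch API.
public class BatchSketch {

    /** Summary reported once every record has been attempted. */
    public static final class BatchResult {
        public final int total;
        public final int successful;
        public final int failed;

        BatchResult(int total, int successful, int failed) {
            this.total = total;
            this.successful = successful;
            this.failed = failed;
        }
    }

    /**
     * Treats each list element as one record, applies the step to every
     * record in parallel, and counts successes and failures so that one
     * bad record does not abort the whole job.
     */
    public static <T> BatchResult run(List<T> records, Consumer<? super T> step) {
        AtomicInteger ok = new AtomicInteger();
        AtomicInteger failed = new AtomicInteger();
        records.parallelStream().forEach(record -> {
            try {
                step.accept(record);
                ok.incrementAndGet();
            } catch (RuntimeException e) {
                failed.incrementAndGet();
            }
        });
        return new BatchResult(records.size(), ok.get(), failed.get());
    }

    public static void main(String[] args) {
        // Assumed sample data: an empty string plays the role of a bad record.
        List<String> contacts = Arrays.asList("ann@example.com", "", "bob@example.com");
        BatchResult result = run(contacts, contact -> {
            if (contact.isEmpty()) {
                throw new IllegalArgumentException("empty record");
            }
        });
        System.out.println("total=" + result.total
                + " successful=" + result.successful
                + " failed=" + result.failed);
    }
}
```

The point of the sketch is that failures are tracked per record rather than per job, which is what makes record-level reporting possible at the end.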
3
Batch processing is particularly useful when working with the following scenarios:
- Integrating data sets, small or large, streaming or not, to parallel process records
- Synchronising data sets between business applications, such as syncing contacts between Netsuite and Salesforce, effecting “near real-time” data integration
- Extracting, transforming and loading (ETL) information into a target system, such as uploading data from a flat file (CSV) to Hadoop
- Handling large quantities of incoming data from an API into a legacy system
5
Learn Batch Fundamentals
Mule’s December 2013 release shipped with a major new feature that will change and simplify Mule’s user experience for both SaaS and On-Premise users. Yes, we are talking about the new Batch jobs. If you need to handle massive amounts of data, or you’re longing for record-based reporting and error handling, or even if you are all about resilience and reliability with parallel processing, then this post is for you!
6
What's new in Batch
We received great feedback about Batch, and we even have some CloudHub users happily using it in production! However, we know that the journey of Batch has just begun, so for the Early Access release of Mule 3.5 we added a bunch of improvements.
Let’s have a look!
https://www.mulesoft.com/exchange#!/batch-process-mule?filters=Business%20Process%20Administration
7
Error handling in Batch
Fact: Batch jobs are tricky to handle when exceptions are raised. The problem is the huge amount of data that these jobs are designed to take. If you’re processing 1 million records, you simply can’t log everything. Logs would become huge and unreadable, not to mention the performance toll it would take. On the other hand, if you log too little, then it’s impossible to know what went wrong, and if 30 thousand records failed, not knowing what’s wrong with them can be a royal pain. This trade-off is not simple to overcome.
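One possible way to balance the two extremes is to aggregate failures instead of logging each one: keep a count per exception type plus the first message seen for that type. The sketch below is an assumption made for illustration, not a description of how Mule's Batch module logs; `FailureSummary` and its methods are hypothetical names.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical aggregator: one summary line per exception type instead of one log line per record.
public class FailureSummary {
    private final Map<String, AtomicLong> counts = new ConcurrentHashMap<>();
    private final Map<String, String> firstMessage = new ConcurrentHashMap<>();

    /** Registers a failed record; safe to call from parallel workers. */
    public void record(Exception e) {
        String type = e.getClass().getSimpleName();
        counts.computeIfAbsent(type, k -> new AtomicLong()).incrementAndGet();
        // Keep only the first message per type as a representative sample.
        firstMessage.putIfAbsent(type, String.valueOf(e.getMessage()));
    }

    /** Number of failures seen for the given exception type. */
    public long count(Class<? extends Exception> type) {
        AtomicLong n = counts.get(type.getSimpleName());
        return n == null ? 0 : n.get();
    }

    /** Builds a short report with one line per exception type. */
    public String report() {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, AtomicLong> entry : counts.entrySet()) {
            sb.append(entry.getKey())
              .append(": ")
              .append(entry.getValue().get())
              .append(" failure(s), first message: ")
              .append(firstMessage.get(entry.getKey()))
              .append(System.lineSeparator());
        }
        return sb.toString();
    }
}
```

With this approach, 30 thousand failed records that share a cause collapse into a single report line with a count, while a rare second cause still shows up on its own line.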
8
Near real time sync with Batch
Learn how to do near real-time sync with Mule ESB. We’ll use several of the newest features that Mule has to offer, like the improved Poll component with watermarking and the Batch Module. Finally, we’ll use one of our Anypoint Templates as an example application to illustrate the concepts.
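The idea behind polling with a watermark can be sketched in plain Java: remember the newest modification time already processed, and on each poll pick up only the records changed after it. This is a conceptual sketch, not the Poll component's configuration; `WatermarkPoller`, `Contact`, and the `lastModified` field are hypothetical names, and the in-memory list stands in for a query against the source system.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of watermark-based polling; not Mule's Poll component.
public class WatermarkPoller {

    /** Minimal record type: an id plus a last-modified timestamp (epoch millis). */
    public static final class Contact {
        public final String id;
        public final long lastModified;

        public Contact(String id, long lastModified) {
            this.id = id;
            this.lastModified = lastModified;
        }
    }

    private long watermark;

    public WatermarkPoller(long initialWatermark) {
        this.watermark = initialWatermark;
    }

    /**
     * Returns only the records modified after the current watermark, then
     * advances the watermark to the newest timestamp seen in this poll.
     */
    public List<Contact> poll(List<Contact> source) {
        List<Contact> changed = new ArrayList<>();
        long newest = watermark;
        for (Contact c : source) {
            if (c.lastModified > watermark) {
                changed.add(c);
                newest = Math.max(newest, c.lastModified);
            }
        }
        watermark = newest;
        return changed;
    }

    public long watermark() {
        return watermark;
    }
}
```

Each poll then hands only the changed records to the batch job, so frequent polling stays cheap and the sync approaches real time.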