Francesco Ganora
DataWeave: A functional data transformation language from MuleSoft
The data mapping challenge
Source formats: JSON, XML, CSV, Fixed Width, POJO
Target formats: JSON, XML, CSV, Fixed Width, POJO
Mapping concerns: structural transformation, value transformation, conditional mapping, filtering, grouping
Best practice: always define the mapping in terms of the desired target data structure
The old programmatic approach
❖ Map the target message from the source message programmatically (e.g., via a script or Java method)
❖ Sequence of procedural steps that incrementally build the target message from the source message
❖ Typical example: loop on elements of a source sequence and for each element instantiate a target sub-structure, then attach it to the overall target structure
❖ This approach is neither concise nor expressive; if implemented incorrectly, it is also inefficient
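The typical example above can be sketched in Python as follows. This is only an illustration of the procedural style; the field names (`orderID`, `items`, `order_nr`, `lines`) are hypothetical and not taken from any slide.

```python
# Hypothetical source message, already parsed into Python structures.
source = {
    "orderID": "DO1234",
    "items": [
        {"sku": "1233244", "qty": "20"},
        {"sku": "1233255", "qty": "50"},
    ],
}

# The target is built step by step: create it empty, then attach parts.
target = {}
target["order_nr"] = source["orderID"]
target["lines"] = []

# Loop on the source sequence; for each element instantiate a target
# sub-structure and attach it to the overall target structure.
for item in source["items"]:
    line = {}
    line["sku"] = item["sku"]
    line["quantity"] = int(item["qty"])
    target["lines"].append(line)
```

The shape of the target message is never visible in one place: it only emerges from the sequence of assignments, which is what makes this style hard to read.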
The templating approach
❖ Template engines can be used as data mapping engines:
❖ We define the target structure (template)
❖ We define how each part of the template is generated dynamically from source data
❖ The template consists of a semi-literal expression with placeholders, e.g. $() in this example
❖ More constructs are necessary to instantiate repetitive structures (looping), for conditional mapping, etc.
JSON template:
{"user":
  {"id": "$(sourceData.userID)",
   "firstName": "$(sourceData.givenName)",
   "lastName": "$(sourceData.lastName)",
   "contacts": {
     "phone": "$(sourceData.phoneNumber)",
     "email": "$(sourceData.emailAddress)"
   }}}

XML template:
<?xml version="1.0"?>
<user>
  <id> $(sourceData.userID) </id>
  <firstName> $(sourceData.givenName) </firstName>
  <lastName> $(sourceData.lastName) </lastName>
  <contacts>
    <phone> $(sourceData.phoneNumber) </phone>
    <email> $(sourceData.emailAddress) </email>
  </contacts>
</user>
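To make the placeholder mechanism concrete, here is a minimal Python sketch of a template renderer for the $() syntax used in the templates above. The `render` function and the sample values are hypothetical; real template engines offer far more (looping, conditionals, escaping).

```python
import re

def render(template: str, source_data: dict) -> str:
    """Replace each $(sourceData.<field>) placeholder with the source value."""
    def substitute(match: re.Match) -> str:
        field_name = match.group(1)
        return str(source_data[field_name])
    return re.sub(r"\$\(sourceData\.(\w+)\)", substitute, template)

# A shortened version of the JSON template, with made-up source data.
template = '{"user": {"id": "$(sourceData.userID)", "firstName": "$(sourceData.givenName)"}}'
source_data = {"userID": "42", "givenName": "Ada"}

rendered = render(template, source_data)
print(rendered)
```

Note that the renderer knows nothing about JSON: it treats the template as plain text, which is why a separate template is needed for every target syntax.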
Issues with standard templating
❖ Template depends on the concrete syntax of the target message (separate templates for XML, JSON, etc.)
❖ Placeholder syntax depends on the type of source message (e.g., XPath for XML, JSONPath for JSON, non-standard syntax for other media types)
❖ Placeholder syntax may clash with target message syntax (for example, <> cannot be used as placeholder markers with XML)
❖ Looping constructs of traditional template engines mix engine syntax with generated content (“PHP-like”)
❖ XSLT is a very powerful templating and transformation language, but it does have drawbacks (verbose XML syntax, cannot operate on a non-tree-structured source message that cannot be rendered into XML, etc.)
DataWeave (DW)
❖ Data mapping and transformation tool from MuleSoft
❖ Tightly integrated with the Anypoint Studio IDE
❖ Non-procedural expression language
❖ Applies functional programming constructs (lambdas)
❖ Uses an internal, canonical data format (application/dw)
Canonical data representation
1. DW parses the source message into application/dw canonical format using supplied metadata / DataSense capability
2. A DW expression is used to transform the source message (result still in canonical application/dw format)
3. DW renders the canonical target message into the target MIME type specified as a “header” to the DW expression (e.g. %output application/json)
This decouples the transformation from the concrete syntax of source and target messages!
Source message (<source MIME type>) -> parser -> Source message (canonical, application/dw) -> DW expression -> Target message (canonical, application/dw) -> renderer -> Target message (<target MIME type>)
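The parse / transform / render pipeline can be mimicked in Python to show why the decoupling matters. This is an analogy only, not how DW is implemented: plain lists and dicts stand in for the canonical format, and all function names and sample data are hypothetical.

```python
import csv
import io
import json

def parse_csv(text: str) -> list:
    """Parser: CSV text -> canonical form (list of dicts)."""
    return [dict(row) for row in csv.DictReader(io.StringIO(text))]

def parse_json(text: str) -> list:
    """Parser: JSON text -> canonical form (list of dicts)."""
    return json.loads(text)

def transform(records: list) -> list:
    """The mapping itself: canonical in, canonical out, no syntax concerns."""
    return [{"sku": r["sku"], "quantity": int(r["qty"])} for r in records]

def render_json(data: list) -> str:
    """Renderer: canonical form -> JSON text."""
    return json.dumps(data)

csv_source = "sku,qty\n1233244,20\n"
json_source = '[{"sku": "1233244", "qty": "20"}]'

# The same transform serves both source syntaxes unchanged.
from_csv = render_json(transform(parse_csv(csv_source)))
from_json = render_json(transform(parse_json(json_source)))
```

Supporting another source or target syntax means adding a parser or renderer; the `transform` function stays untouched.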
The DW canonical format
❖ Only 3 kinds of data in DW:
• Simple (String, Number, Boolean, Date types)
• Array
• Objects (key:value pairs)
❖ The canonical application/dw format is shown in a JSON-like concrete syntax in Anypoint Studio
❖ Parsing and rendering between application/json and application/dw is straightforward
[
  {
    "order_nr": "DO1234",
    "order_date": "2016-03-12T13:30:23+8.00",
    sku: "1233244",
    "sku_description": "Product A",
    qty: "20"
  },
  {
    "order_nr": "DO1234",
    "order_date": "2016-03-12T13:30:23+8.00",
    sku: "1233255",
    "sku_description": "Product B",
    qty: "50"
  }
]
XML parsing
❖ Repeated XML elements -> repeated object keys
❖ XML attributes -> special @() object
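A rough Python approximation of these two rules is sketched below. It is not DW's actual internal representation: because a Python dict cannot hold repeated keys, an object is modeled here as a list of (key, value) pairs, and attributes are collected under a hypothetical "@" key. The `to_pairs` helper and the sample document are made up for illustration.

```python
import xml.etree.ElementTree as ET

def to_pairs(element):
    """Model an XML element as a list of (key, value) pairs.

    Simplification: text content of elements that also carry attributes
    or child elements is ignored.
    """
    pairs = []
    if element.attrib:
        # Attributes are kept apart from child elements under "@".
        pairs.append(("@", dict(element.attrib)))
    for child in element:
        if len(child) or child.attrib:
            value = to_pairs(child)
        else:
            value = child.text or ""
        # Repeated elements simply produce repeated keys.
        pairs.append((child.tag, value))
    return pairs

doc = ET.fromstring('<order id="DO1234"><item>A</item><item>B</item></order>')
parsed = to_pairs(doc)
print(parsed)
```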
CSV parsing
❖ Array of records (lines)
❖ Record (line) -> array element of type Object
❖ Field in record: object field (key is taken from CSV header line or configured metadata)
❖ Reader configuration to set field separator, etc.
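Python's standard `csv.DictReader` follows the same model (array of records, keys taken from the header line, configurable separator), so it makes a convenient analogy. The sample data below is invented.

```python
import csv
import io

csv_text = "sku;qty\n1233244;20\n1233255;50\n"

# delimiter plays the role of the reader configuration (field separator).
reader = csv.DictReader(io.StringIO(csv_text), delimiter=";")

# Each line becomes one object whose keys come from the header line.
records = [dict(row) for row in reader]
print(records)
```

As in the canonical form of a parsed CSV, every field value is a string until it is explicitly converted.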
DW transform structure
%dw 1.0
%input payload application/csv
%output application/json
%type sapDate = :string { format: "YYYYMMDD" }
%var unitOfMeasure = 'EA'
%var doubleNumber = (nr) -> [nr * 2.0]
%namespace xsi http://www.w3.org/2001/XMLSchema-instance
%function fname(name) {firstName: upper name}
---
order: {
  ID: payload.orderID ++ " dated " ++ payload.orderDate,
  nrLines: (sizeOf payload.orderItems) + 1,
  totalOrderAmount: payload.*orderItems reduce
    $$ + (($.orderQuantity as :number) * ($.unitPrice as :number))
}
Optional header contains:
• transformation directives
• reusable declarations
Body contains the DW transformation expression
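For readers more familiar with general-purpose languages, the body expression above reads roughly like the following Python. This is an analogy, not a translation guaranteed to match DW semantics: the sample payload is invented, and the explicit initial value `0.0` passed to `reduce` is an assumption of this sketch.

```python
from functools import reduce

# Invented payload with the field names used in the DW body.
payload = {
    "orderID": "DO1234",
    "orderDate": "2016-03-12",
    "orderItems": [
        {"orderQuantity": "20", "unitPrice": "1.5"},
        {"orderQuantity": "50", "unitPrice": "2"},
    ],
}

# The whole target is a single expression describing its own shape.
order = {
    "ID": payload["orderID"] + " dated " + payload["orderDate"],
    "nrLines": len(payload["orderItems"]) + 1,
    # acc plays the role of $$ (accumulator), item the role of $.
    "totalOrderAmount": reduce(
        lambda acc, item: acc + float(item["orderQuantity"]) * float(item["unitPrice"]),
        payload["orderItems"],
        0.0,
    ),
}
```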
Case study: introduction
Transforming a list of order items into a corresponding list of delivery routes.
The source payload is an unsorted list of items in CSV format:
OrderId;OrderDate;CustomerId;DeliveryDate;City;ProductId;Quantity
000001;2016-09-14;Customer1;2016-09-20;London;ProductA;120
000001;2016-09-14;Customer1;2016-09-20;London;ProductB;88
000002;2016-09-15;Customer2;2016-09-20;Paris;ProductC;60
000002;2016-09-15;Customer2;2016-09-20;Paris;ProductA;100
000002;2016-09-15;Customer2;2016-09-20;Paris;ProductD;15
000003;2016-09-15;Customer3;2016-09-23;Berlin;ProductB;14
000003;2016-09-15;Customer3;2016-09-23;Berlin;ProductD;30
000004;2016-09-15;Customer4;2016-09-20;London;ProductC;14
000004;2016-09-15;Customer4;2016-09-20;London;ProductE;30
000005;2016-09-16;Customer4;2016-09-20;London;ProductB;20
000006;2016-09-16;Customer2;2016-09-22;Paris;ProductD;7
000006;2016-09-16;Customer2;2016-09-22;Paris;ProductE;30
000007;2016-09-16;Customer5;2016-09-22;Berlin;ProductB;12
The target structure (described in the following slide) is a multi-level JSON structure.
This case study focuses on the structural transformation capabilities of DW, but DW offers a wide range of value and formatting capabilities, conditional mapping, and much more!
Case study: target format
JSON document with a sequence of delivery routes by delivery date and city:
[
  {
    city: "<City>",
    deliveryDate: "<DeliveryDate>",
    stops: [
      {
        customer: "<CustomerId>",
        orderitems: [
          {
            ordernr: "<OrderId>",
            orderdate: "<OrderDate>",
            product: "<ProductId>",
            qty: "<Quantity>"
          }
        ]
      }
    ]
  }
]
Nesting levels: by city / delivery date, then by customer, then by order item.
❖ Sort CSV order lines by city and delivery date
❖ Within each delivery date and city, group order lines by customer
❖ Render the structure as JSON
Case study: step 1
Source message parsed as application/dw:
The DW expression payload evaluates to the entire message payload (see the earlier slide “CSV parsing”).
NOTE: the DW transformer Preview functionality in MuleSoft Anypoint Studio maps the sample source in real time as you type the transformation!
Case study: step 2
Sorting and grouping by combination of city and delivery date:
A composite key is used for sorting and grouping via the string concatenation operator (++).
The groupBy operator creates an object with the group values as keys.
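The same idea (composite key built by concatenation, then sort and group) looks like this in Python. It is an analogue of the step, not the DW expression from the slide; `route_key` is a hypothetical helper and only three sample order lines are used.

```python
from itertools import groupby

order_lines = [
    {"City": "Paris", "DeliveryDate": "2016-09-20", "CustomerId": "Customer2"},
    {"City": "London", "DeliveryDate": "2016-09-20", "CustomerId": "Customer1"},
    {"City": "London", "DeliveryDate": "2016-09-20", "CustomerId": "Customer4"},
]

def route_key(line: dict) -> str:
    """Composite key: city concatenated with delivery date."""
    return line["City"] + line["DeliveryDate"]

# itertools.groupby only groups adjacent items, so sort by the same key first.
sorted_lines = sorted(order_lines, key=route_key)

# Result: an object (dict) with the group values as keys.
groups = {key: list(lines) for key, lines in groupby(sorted_lines, key=route_key)}
print(list(groups))
```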
Case study: step 3
Iterating over the group values (city/delivery date combination) to generate the 1st level of the target structure:
The pluck operator maps an object into an array. $$ is the key in the current iteration, $ is the value.
City and delivery date are mapped from the composite key by string manipulation.
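In Python terms, this step amounts to iterating over the key/value pairs of the grouped object and slicing the composite key. The sketch below assumes the key is the city followed directly by a 10-character date, as produced by plain concatenation; the sample groups are invented.

```python
# Output of the grouping step: composite key -> order lines of that route.
groups = {
    "London2016-09-20": [{"CustomerId": "Customer1"}, {"CustomerId": "Customer4"}],
    "Paris2016-09-20": [{"CustomerId": "Customer2"}],
}

DATE_LENGTH = len("2016-09-20")  # dates are fixed-width, cities are not

# key corresponds to $$ and lines to $ in the pluck operator.
routes = [
    {
        "city": key[:-DATE_LENGTH],
        "deliveryDate": key[-DATE_LENGTH:],
        "stops": lines,  # still ungrouped at this stage
    }
    for key, lines in groups.items()
]
```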
Case study: step 4
Within each route group, group by customer and generate the 2nd (inner) level of the target structure:
In the inner pluck the context for $ and $$ changes (e.g., $$ is now the CustomerID key).
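A Python analogue of the inner grouping for a single route is sketched below. `group_by` is a hypothetical helper written for this example, and the order lines are a trimmed-down sample.

```python
from collections import defaultdict

# Order lines belonging to one route (one city / delivery date group).
route_lines = [
    {"CustomerId": "Customer1", "OrderId": "000001", "ProductId": "ProductA"},
    {"CustomerId": "Customer4", "OrderId": "000004", "ProductId": "ProductC"},
    {"CustomerId": "Customer1", "OrderId": "000001", "ProductId": "ProductB"},
]

def group_by(items, key_fn):
    """Group items into a dict keyed by key_fn(item), keeping first-seen order."""
    groups = defaultdict(list)
    for item in items:
        groups[key_fn(item)].append(item)
    return dict(groups)

# In this inner iteration the key is the customer id, not the route key.
stops = [
    {"customer": customer_id, "orderitems": lines}
    for customer_id, lines in group_by(route_lines, lambda l: l["CustomerId"]).items()
]
```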
Case study: (final) step 5
Within each customer group, generate the 3rd (innermost) level of the target structure via the map operator:
Also get the JSON rendering by changing the %output directive.
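Putting the five steps together, an end-to-end Python analogue of the case study might look like the sketch below. It is not the DW expression from the slides: it uses a (city, delivery date) tuple as the composite key instead of string concatenation, the helper names are hypothetical, and only four of the source lines are included to keep it short.

```python
import csv
import io
import json
from collections import defaultdict

SOURCE_CSV = """OrderId;OrderDate;CustomerId;DeliveryDate;City;ProductId;Quantity
000002;2016-09-15;Customer2;2016-09-20;Paris;ProductC;60
000001;2016-09-14;Customer1;2016-09-20;London;ProductA;120
000001;2016-09-14;Customer1;2016-09-20;London;ProductB;88
000004;2016-09-15;Customer4;2016-09-20;London;ProductC;14
"""

def group_by(items, key_fn):
    """Group items into a dict keyed by key_fn(item), keeping first-seen order."""
    groups = defaultdict(list)
    for item in items:
        groups[key_fn(item)].append(item)
    return dict(groups)

def route_key(line):
    return (line["City"], line["DeliveryDate"])

def to_routes(csv_text):
    # Step 1: parse the source into an array of objects.
    lines = list(csv.DictReader(io.StringIO(csv_text), delimiter=";"))
    # Step 2: sort and group by city / delivery date.
    lines.sort(key=route_key)
    routes = []
    # Step 3: one route per group (1st level).
    for (city, delivery_date), route_lines in group_by(lines, route_key).items():
        # Step 4: one stop per customer within the route (2nd level).
        by_customer = group_by(route_lines, lambda l: l["CustomerId"])
        stops = [
            {
                "customer": customer,
                # Step 5: map each order line to an order item (3rd level).
                "orderitems": [
                    {"ordernr": l["OrderId"], "orderdate": l["OrderDate"],
                     "product": l["ProductId"], "qty": l["Quantity"]}
                    for l in customer_lines
                ],
            }
            for customer, customer_lines in by_customer.items()
        ]
        routes.append({"city": city, "deliveryDate": delivery_date, "stops": stops})
    return routes

routes = to_routes(SOURCE_CSV)
print(json.dumps(routes, indent=2))  # rendering as JSON is a separate, final step
```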
Thanks!
This is just a “taste” of the innovative DataWeave transformation language.
Find out more at:
https://docs.mulesoft.com/mule-user-guide/v/3.8/dataweave