mulesoft dataweave data transformation language

Francesco Ganora DataWeave A functional data transformation language from MuleSoft

Upload: fganora

Post on 20-Jan-2017

TRANSCRIPT

Page 1: MuleSoft DataWeave data transformation language

Francesco Ganora

DataWeave A functional data transformation language from MuleSoft

Page 2: MuleSoft DataWeave data transformation language

The data mapping challenge

Source formats: JSON, XML, CSV, Fixed Width, POJO

Target formats: JSON, XML, CSV, Fixed Width, POJO

Mapping tasks: structural transformation, value transformation, conditional mapping, filtering, grouping

Best practice: always define the mapping in terms of the desired target data structure

Page 3: MuleSoft DataWeave data transformation language

The old programmatic approach

❖ Map the target message from the source message programmatically (e.g., via a script or Java method)

❖ Sequence of procedural steps that incrementally build the target message from the source message

❖ Typical example: loop on elements of a source sequence and for each element instantiate a target sub-structure, then attach it to the overall target structure

❖ This approach is neither concise nor expressive; if implemented incorrectly, it is also inefficient
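The procedural style described above can be sketched as follows. This is a minimal Python illustration, not code from the slides: the source message, its field names (orderID, orderItems, sku, qty) and the map_order helper are all hypothetical.

```python
# Hypothetical source message: an order with a list of items.
source = {
    "orderID": "DO1234",
    "orderItems": [
        {"sku": "1233244", "qty": "20"},
        {"sku": "1233255", "qty": "50"},
    ],
}


def map_order(src):
    """Build the target message step by step from the source message."""
    target = {"order": {"id": src["orderID"], "lines": []}}
    for item in src["orderItems"]:  # loop over the source sequence
        line = {  # instantiate a target sub-structure
            "product": item["sku"],
            "quantity": int(item["qty"]),
        }
        target["order"]["lines"].append(line)  # attach it to the overall target
    return target


result = map_order(source)
```

The shape of the target message is only visible by reading through the loop, which is the lack of expressiveness the slide points at.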

Page 4: MuleSoft DataWeave data transformation language

The templating approach

❖ Template engines can be used as data mapping engines:

❖ We define the target structure (template)

❖ We define how each part of the template is generated dynamically from source data

❖ The template consists of a semi-literal expression with placeholders, e.g. $() in this example

❖ More constructs are necessary to instantiate repetitive structures (looping), for conditional mapping, etc.

JSON template:

{
  "user": {
    "id": "$(sourceData.userID)",
    "firstName": "$(sourceData.givenName)",
    "lastName": "$(sourceData.lastName)",
    "contacts": {
      "phone": "$(sourceData.phoneNumber)",
      "email": "$(sourceData.emailAddress)"
    }
  }
}

XML template:

<?xml version="1.0"?>
<user>
  <id>$(sourceData.userID)</id>
  <firstName>$(sourceData.givenName)</firstName>
  <lastName>$(sourceData.lastName)</lastName>
  <contacts>
    <phone>$(sourceData.phoneNumber)</phone>
    <email>$(sourceData.emailAddress)</email>
  </contacts>
</user>
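A rough Python equivalent of the two templates, using the standard library's string.Template. Its placeholder syntax is $name rather than the $() shown on the slide, and the sample values in source_data are made up for illustration.

```python
import json
from string import Template

# Hypothetical source record; field names follow the slide's sourceData.
source_data = {
    "userID": "42",
    "givenName": "Ada",
    "lastName": "Lovelace",
    "phoneNumber": "555-0100",
    "emailAddress": "ada@example.com",
}

# One template per target syntax: the mapping is written twice.
JSON_TEMPLATE = Template(
    '{"user": {"id": "$userID", "firstName": "$givenName", '
    '"lastName": "$lastName", '
    '"contacts": {"phone": "$phoneNumber", "email": "$emailAddress"}}}'
)
XML_TEMPLATE = Template(
    "<user><id>$userID</id><firstName>$givenName</firstName>"
    "<lastName>$lastName</lastName>"
    "<contacts><phone>$phoneNumber</phone>"
    "<email>$emailAddress</email></contacts></user>"
)

json_text = JSON_TEMPLATE.substitute(source_data)
xml_text = XML_TEMPLATE.substitute(source_data)
user = json.loads(json_text)["user"]
```

string.Template performs plain text substitution with no escaping, so a value containing a quote or an angle bracket would yield an invalid JSON or XML document.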

Page 5: MuleSoft DataWeave data transformation language

Issues with standard templating

❖ Template depends on the concrete syntax of the target message (separate templates for XML, JSON etc.)

❖ Placeholder syntax depends on the type of source message (e.g., XPath for XML, JSONPath for JSON, non-standard syntax for other media types)

❖ Placeholder syntax may clash with target message syntax (cannot use for example <> as placeholder markers with XML)

❖ Looping constructs of traditional template engines mix engine syntax with generated content (“PHP-like”)

❖ XSLT is a very powerful templating and transformation language, but it does have drawbacks (verbose XML syntax, cannot operate on non-tree-structured source messages that cannot be rendered into XML, etc.)

Page 6: MuleSoft DataWeave data transformation language

DataWeave (DW)

❖ Data mapping and transformation tool from MuleSoft

❖ Tightly integrated with the Anypoint Studio IDE

❖ Non-procedural expression language

❖ Applies functional programming constructs (lambdas)

❖ Uses an internal, canonical data format (application/dw)

Page 7: MuleSoft DataWeave data transformation language

Canonical data representation

1. DW parses the source message into application/dw canonical format using supplied metadata / DataSense capability

2. A DW expression is used to transform the source message (result still in canonical application/dw format)

3. DW renders the canonical target message into the target MIME type specified as a “header” to the DW expression (e.g. %output application/json)

This decouples the transformation from the concrete syntax of source and target messages!

[Diagram: Source message (<source MIME type>) -> parser -> Source message (canonical, application/dw) -> DW expression -> Target message (canonical, application/dw) -> renderer -> Target message (<target MIME type>)]
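The three stages can be imitated in plain Python to show the decoupling. This is only an analogy: lists and dicts stand in for the canonical application/dw format, and the function names and the sku/qty sample are assumptions rather than anything DataWeave exposes.

```python
import csv
import io
import json


def parse_csv(text):
    """Parser: CSV text -> canonical form (a list of dicts)."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(records):
    """Transformation: sees only the canonical form, never raw CSV or JSON."""
    return [{"sku": r["sku"], "quantity": int(r["qty"])} for r in records]


def render_json(data):
    """Renderer: canonical form -> JSON text."""
    return json.dumps(data)


source_text = "sku,qty\n1233244,20\n1233255,50\n"
target_text = render_json(transform(parse_csv(source_text)))
```

Swapping parse_csv or render_json for another format would leave transform untouched, which is the point made on the slide.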

Page 8: MuleSoft DataWeave data transformation language

The DW canonical format

❖ Only 3 kinds of data in DW:

• Simple (String, Number, Boolean, Date types)

• Array

• Object (key:value pairs)

❖ The canonical application/dw format is shown in a JSON-like concrete syntax in Anypoint Studio

❖ Parsing and rendering between application/json and application/dw is straightforward

[
  {
    "order_nr": "DO1234",
    "order_date": "2016-03-12T13:30:23+8.00",
    sku: "1233244",
    "sku_description": "Product A",
    qty: "20"
  },
  {
    "order_nr": "DO1234",
    "order_date": "2016-03-12T13:30:23+8.00",
    sku: "1233255",
    "sku_description": "Product B",
    qty: "50"
  }
]

Page 9: MuleSoft DataWeave data transformation language

XML parsing

❖ Repeated XML elements -> repeated object keys

❖ XML attributes -> special @() object
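A Python analogy for the two rules above. A Python dict cannot hold repeated keys, so this sketch uses a list of (key, value) pairs instead, and stores attributes under an "@" key chosen for illustration. The to_pairs helper and the sample document are hypothetical, and text inside elements that also carry attributes or children is ignored for brevity.

```python
import xml.etree.ElementTree as ET


def to_pairs(element):
    """Convert an element into (key, value) pairs, preserving repeated keys."""
    pairs = []
    if element.attrib:
        # Attributes are kept apart from child elements, under a special key.
        pairs.append(("@", dict(element.attrib)))
    for child in element:
        if len(child) or child.attrib:
            value = to_pairs(child)  # nested element: recurse
        else:
            value = child.text or ""  # leaf element: take its text
        pairs.append((child.tag, value))
    return pairs


doc = ET.fromstring(
    '<order id="DO1234"><item>Product A</item><item>Product B</item></order>'
)
order = to_pairs(doc)
```

Both item elements survive as separate entries with the same key, mirroring the repeated object keys mentioned on the slide.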

Page 10: MuleSoft DataWeave data transformation language

CSV parsing

❖ Array of records (lines)

❖ Record (line) -> array element of type Object

❖ Field in record: object field (key is taken from CSV header line or configured metadata)

❖ Reader configuration to set field separator, etc.
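The same mapping can be seen with Python's csv.DictReader, shown here as an analogy rather than as DataWeave's reader. The two data rows are taken from the case-study payload in this deck, and the delimiter argument plays the role of the reader configuration.

```python
import csv
import io

csv_text = (
    "OrderId;OrderDate;CustomerId;DeliveryDate;City;ProductId;Quantity\n"
    "000001;2016-09-14;Customer1;2016-09-20;London;ProductA;120\n"
    "000001;2016-09-14;Customer1;2016-09-20;London;ProductB;88\n"
)

# The header line supplies the keys; each following line becomes one object.
reader = csv.DictReader(io.StringIO(csv_text), delimiter=";")
records = list(reader)
```

records is a list with one dict per line, and every value is still a string at this point.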

Page 11: MuleSoft DataWeave data transformation language

DW transform structure

%dw 1.0
%input payload application/csv
%output application/json
%type sapDate = :string { format: "YYYYMMDD" }
%var unitOfMeasure = 'EA'
%var doubleNumber = (nr) -> [nr * 2.0]
%namespace xsi http://www.w3.org/2001/XMLSchema-instance
%function fname(name) {firstName: upper name}
---
{
  order: {
    ID: payload.orderID ++ " dated " ++ payload.orderDate,
    nrLines: (sizeOf payload.orderItems) + 1,
    totalOrderAmount: payload.*orderItems reduce
      $$ + (($.orderQuantity as :number) * ($.unitPrice as :number))
  }
}

Optional header contains:

• transformation directives

• reusable declarations

Body contains the DW transformation expression
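For readers who do not know DW syntax, the body expression can be read roughly like the Python below. It is a sketch under assumptions: the payload values are invented, and the reduction starts from an explicit 0.0, which is a choice made here and not something the slide states.

```python
from functools import reduce

# Hypothetical payload; field names follow the slide, values are made up.
payload = {
    "orderID": "DO1234",
    "orderDate": "2016-03-12",
    "orderItems": [
        {"orderQuantity": "20", "unitPrice": "1.5"},
        {"orderQuantity": "50", "unitPrice": "2"},
    ],
}


def transform(p):
    """Build the whole target in one expression, as the DW body does."""
    items = p["orderItems"]
    return {
        "order": {
            # ++ on the slide is string concatenation
            "ID": p["orderID"] + " dated " + p["orderDate"],
            # mirrors (sizeOf payload.orderItems) + 1 on the slide
            "nrLines": len(items) + 1,
            # acc plays the role of $$, item the role of $
            "totalOrderAmount": reduce(
                lambda acc, item: acc
                + float(item["orderQuantity"]) * float(item["unitPrice"]),
                items,
                0.0,
            ),
        }
    }


order = transform(payload)["order"]
```

Unlike the procedural style, the returned literal has the same shape as the target message.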

Page 12: MuleSoft DataWeave data transformation language

Case study: introduction

Transforming a list of order items into a corresponding list of delivery routes.

The source payload is an unsorted list of items in CSV format:

OrderId;OrderDate;CustomerId;DeliveryDate;City;ProductId;Quantity
000001;2016-09-14;Customer1;2016-09-20;London;ProductA;120
000001;2016-09-14;Customer1;2016-09-20;London;ProductB;88
000002;2016-09-15;Customer2;2016-09-20;Paris;ProductC;60
000002;2016-09-15;Customer2;2016-09-20;Paris;ProductA;100
000002;2016-09-15;Customer2;2016-09-20;Paris;ProductD;15
000003;2016-09-15;Customer3;2016-09-23;Berlin;ProductB;14
000003;2016-09-15;Customer3;2016-09-23;Berlin;ProductD;30
000004;2016-09-15;Customer4;2016-09-20;London;ProductC;14
000004;2016-09-15;Customer4;2016-09-20;London;ProductE;30
000005;2016-09-16;Customer4;2016-09-20;London;ProductB;20
000006;2016-09-16;Customer2;2016-09-22;Paris;ProductD;7
000006;2016-09-16;Customer2;2016-09-22;Paris;ProductE;30
000007;2016-09-16;Customer5;2016-09-22;Berlin;ProductB;12

The target structure (described in the following slide) is a multi-level JSON structure.

This case study focuses on the structural transformation capabilities of DW, but DW offers a wide range of value and formatting capabilities, conditional mapping, and much more!

Page 13: MuleSoft DataWeave data transformation language

Case study: target format

[
  {
    city: "<City>",
    deliveryDate: "<DeliveryDate>",
    stops: [
      {
        customer: "<CustomerId>",
        orderitems: [
          {
            ordernr: "<OrderId>",
            orderdate: "<OrderDate>",
            product: "<ProductId>",
            qty: "<Quantity>"
          }
        ]
      }
    ]
  }
]

JSON document with sequence of delivery routes by delivery date and city:

❖ Sort CSV order lines by city and delivery date

❖ Within each delivery date and city, group order lines by customer

❖ Render the structure as JSON

Nesting levels of the target structure: by city / delivery date (outer array), by customer (stops), by order item (orderitems)

Page 14: MuleSoft DataWeave data transformation language

Case study: step 1

Source message parsed as application/dw:

The DW expression payload evaluates to the entire message payload (see the earlier slide “CSV parsing”)

NOTE: the DW transformer Preview functionality in MuleSoft Anypoint Studio maps the sample source in real time as you type the transformation!

Page 15: MuleSoft DataWeave data transformation language

Case study: step 2

Sorting and grouping by combination of city and delivery date:

A composite key is used for sorting and grouping via the string concatenation operator (++).

The groupBy operator creates an object with the group values as keys.
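The idea can be sketched in Python, as an analogy to the DW operators and not as the expression shown on the slide. Four rows of the case-study payload are used; the "|" separator inside the composite key and the route_key helper are assumptions made for this sketch.

```python
from itertools import groupby

FIELDS = ["OrderId", "OrderDate", "CustomerId", "DeliveryDate",
          "City", "ProductId", "Quantity"]
ROWS = [
    "000002;2016-09-15;Customer2;2016-09-20;Paris;ProductC;60",
    "000001;2016-09-14;Customer1;2016-09-20;London;ProductA;120",
    "000003;2016-09-15;Customer3;2016-09-23;Berlin;ProductB;14",
    "000004;2016-09-15;Customer4;2016-09-20;London;ProductC;14",
]
lines = [dict(zip(FIELDS, row.split(";"))) for row in ROWS]


def route_key(line):
    """Composite key: city and delivery date concatenated into one string."""
    return line["City"] + "|" + line["DeliveryDate"]


# itertools.groupby only groups adjacent items, so sort by the same key first.
groups = {
    key: list(members)
    for key, members in groupby(sorted(lines, key=route_key), key=route_key)
}
```

The result is a dict whose keys are the group values, comparable to the object that groupBy produces.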

Page 16: MuleSoft DataWeave data transformation language

Case study: step 3

Iterating over the group values (city/delivery date combination) to generate the 1st level of the target structure:

The pluck operator maps an object into an array. $$ is the key in the current iteration, $ is the value.

City and delivery date are mapped from the composite key by String manipulation.
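A small Python stand-in for pluck may help. The pluck function below is a hypothetical helper written for this sketch, the grouped input is abbreviated, the "|" separator in the keys is assumed, and lineCount is a placeholder field that does not appear in the target format.

```python
# Abbreviated result of the previous grouping step (assumed "|" separator).
groups = {
    "London|2016-09-20": [
        {"CustomerId": "Customer1", "ProductId": "ProductA"},
        {"CustomerId": "Customer4", "ProductId": "ProductC"},
    ],
    "Paris|2016-09-20": [
        {"CustomerId": "Customer2", "ProductId": "ProductC"},
    ],
}


def pluck(obj, fn):
    """Map an object (dict) into an array (list); fn receives (value, key)."""
    return [fn(value, key) for key, value in obj.items()]


routes = pluck(
    groups,
    lambda value, key: {  # value corresponds to $, key to $$
        "city": key.split("|")[0],  # string manipulation on the composite key
        "deliveryDate": key.split("|")[1],
        "lineCount": len(value),  # placeholder for the nested "stops" level
    },
)
```

Each key/value pair of the grouped object becomes one element of the resulting array.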

Page 17: MuleSoft DataWeave data transformation language

Case study: step 4

Within each route group, group by customer and generate 2nd (inner) level of target structure:

In the inner pluck the context for $ and $$ changes (e.g., $$ is now the CustomerID key).
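In Python terms the inner level could look like the sketch below, again an analogy and not the slide's DW code. group_by is a hypothetical helper, and route_lines holds abbreviated order lines of the London / 2016-09-20 route from the case-study payload.

```python
def group_by(items, key_fn):
    """Group items into a dict keyed by key_fn(item), in first-seen key order."""
    groups = {}
    for item in items:
        groups.setdefault(key_fn(item), []).append(item)
    return groups


# Order lines of one route group (London, 2016-09-20), abbreviated.
route_lines = [
    {"CustomerId": "Customer1", "OrderId": "000001", "ProductId": "ProductA"},
    {"CustomerId": "Customer1", "OrderId": "000001", "ProductId": "ProductB"},
    {"CustomerId": "Customer4", "OrderId": "000004", "ProductId": "ProductC"},
]

# Inside this comprehension, customer plays the role of $$ and items of $.
stops = [
    {"customer": customer, "orderitems": items}
    for customer, items in group_by(
        route_lines, lambda line: line["CustomerId"]
    ).items()
]
```

The loop variables are scoped to the inner comprehension, much as $ and $$ take on a new meaning inside the inner pluck.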

Page 18: MuleSoft DataWeave data transformation language

Case study: (final) step 5

Within each customer group, generate the 3rd (innermost) level of the target structure via the map operator:

The JSON rendering is obtained by changing the %output directive.
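Putting the steps together, a Python approximation of the whole case study might look as follows. It is a sketch and not the DW transform from the slides: it uses a (city, date) tuple as the composite key instead of string concatenation, four rows of the payload, and hypothetical helper names.

```python
import json

FIELDS = ["OrderId", "OrderDate", "CustomerId", "DeliveryDate",
          "City", "ProductId", "Quantity"]
ROWS = [
    "000002;2016-09-15;Customer2;2016-09-20;Paris;ProductC;60",
    "000001;2016-09-14;Customer1;2016-09-20;London;ProductA;120",
    "000001;2016-09-14;Customer1;2016-09-20;London;ProductB;88",
    "000004;2016-09-15;Customer4;2016-09-20;London;ProductC;14",
]


def group_by(items, key_fn):
    """Group items into a dict keyed by key_fn(item), in first-seen key order."""
    groups = {}
    for item in items:
        groups.setdefault(key_fn(item), []).append(item)
    return groups


def route_key(line):
    """Composite key as a tuple (the slides concatenate strings instead)."""
    return (line["City"], line["DeliveryDate"])


def order_item(line):
    """Innermost level: one target order item per source order line."""
    return {
        "ordernr": line["OrderId"],
        "orderdate": line["OrderDate"],
        "product": line["ProductId"],
        "qty": line["Quantity"],
    }


def build_routes(lines):
    """Sort and group by route, then by customer, then map each order line."""
    by_route = group_by(sorted(lines, key=route_key), route_key)
    return [
        {
            "city": city,
            "deliveryDate": date,
            "stops": [
                {
                    "customer": customer,
                    "orderitems": [order_item(l) for l in customer_lines],
                }
                for customer, customer_lines in group_by(
                    route_lines, lambda l: l["CustomerId"]
                ).items()
            ],
        }
        for (city, date), route_lines in by_route.items()
    ]


lines = [dict(zip(FIELDS, row.split(";"))) for row in ROWS]
routes_json = json.dumps(build_routes(lines), indent=2)  # rendering step
```

Only the last line deals with JSON; the structure is built on plain lists and dicts, just as the DW expression works on the canonical format and leaves rendering to the %output directive.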

Page 19: MuleSoft DataWeave data transformation language

Thanks!

This is just a “taste” of the innovative DataWeave transformation language.

Find out more at:

https://docs.mulesoft.com/mule-user-guide/v/3.8/dataweave