Fun with code generation patterns – transformations

by Roelant Vos · Published November 10, 2020 · Updated November 10, 2020

In this last post covering some of the new features of the generic schema for Data Warehouse Automation I would like to share some examples of how to define transformations and logic that is more complex than the typical column-to-column mapping.

In the schema definition, a ‘source’ is defined as either (OneOf) a code snippet (DataQuery) or an object (DataObject or DataItem). This means that, at both Data Object as well as Data Item level, the ‘source’ can be either an object or code in a given language.

For added flexibility, and as we’ve seen in the post about adding multiple sources, this also means that a ‘source’ can be a combination of objects and logic.

As with all use-cases, there is an important separation between the recording of the metadata (in Json, conforming to the schema definition) and the application of this metadata in patterns. This is where the flexibility truly comes into play, as you are able to combine all metadata in any combination by defining a pattern.

Defining transformations in metadata

Consider the Json metadata snippet below. In this example there are two mappings defined as part of the list (dataObjectMappings).

Mapping1 contains a standard source- and target Data Object, but at Data Item level a calculation has been defined to be mapped to the ‘TargetColumn’ attribute. The other two attributes are normal column-to-column mappings.
Mapping2 contains a calculation at Data Object level, where the result is directly mapped against a target Data Object.

{
  "dataObjectMappings": [
    {
      "mappingName": "Mapping1",
      "sourceDataObjects": [
        {
          "name": "SourceTable"
        }
      ],
      "targetDataObject": {
        "name": "TargetTable"
      },
      "dataItemMappings": [
        {
          "sourceDataItems": [
            {
              "dataQueryCode": "SELECT 10/2"
            }
          ],
          "targetDataItem": {
            "name": "TargetColumn"
          }
        },
        {
          "sourceDataItems": [
            {
              "name": "Column2Source"
            }
          ],
          "targetDataItem": {
            "name": "Columns2Target"
          }
        },
        {
          "sourceDataItems": [
            {
              "name": "Column3Source"
            }
          ],
          "targetDataItem": {
            "name": "Columns3Target"
          }
        }
      ]
    },
    {
      "mappingName": "Mapping2",
      "sourceDataObjects": [
        {
          "dataQueryCode": "MAX(amount) AS amount"
        }
      ],
      "targetDataObject": {
        "name": "TargetTable"
      }
    }
  ]
}

Defining a pattern for transformations

The pattern created for this example shows two ways you can access the transformation logic (DataQueryCode).

The first way to do this, is by directly retrieving the logic itself from the list / array. In the example, the snippet {{sourceDataObjects.0.dataQueryCode}} means that from the list of source Data Objects, the first entry is selected and the dataQueryCode property is shown (assuming it exists, otherwise the output will be blank).

This is akin to retrieving a specific segment / property that you know exists in a specific place, based on your way of using the schema definition.

The second approach is displayed where the attributes are mapped. The pattern iterates over the list of Data Item Mappings and then displays {{sourceDataItems.0.name}}{{sourceDataItems.0.dataQueryCode}} as source for each column. What happens here is that, because only one of these items can be present, the available item is shown – because the missing one will be left blank.

A potential third way is to add a custom helper function to evaluate existence of segments so the code can be tidied up a bit, but this is not part of this example.

--Example using a query as a source rather than an attribute.
{{#each dataObjectMappings }}

--Working on {{mappingName}}.

SELECT {{sourceDataObjects.0.dataQueryCode}}
{{#each dataItemMappings}}
  {{sourceDataItems.0.name}}{{sourceDataItems.0.dataQueryCode}} as {{targetDataItem.name}}{{#unless @last}},{{/unless}}
{{/each}}
FROM {{targetDataObject.name}}

{{/each}}

When run, the results look like the below.

These are examples of adding the building blocks of complex solutions. Bear in mind that in principle any output can be generated, especially while working against platforms that have mature APIs. This approach can be used for most, if not all, data integration platforms.

Fun with code generation patterns – transformations

Defining transformations in metadata

Defining a pattern for transformations

You may also like...

Leave a Reply Cancel reply

Search this site

Upcoming Events

Recent Posts

Fun with code generation patterns – transformations

Defining transformations in metadata

Defining a pattern for transformations

You may also like...

Using (and moving to) raw data types for hash keys

A brief update on data solution automation progress

Loading too fast for unique date / time stamps – what to do?

Leave a Reply Cancel reply

Search this site

Upcoming Events

Recent Posts