New releases for open source Data Warehouse automation

Today I spent some time finalising and testing new releases for the Data Warehouse Automation ecosystem, specifically the Taxonomy for ETL Automation Metadata (TEAM) and the Virtual Data Warehouse (VDW) tools.

As a very brief recap, TEAM focuses on source-to-target mapping metadata: it links a source object to a target object and generates a JSON file that conforms to the schema for Data Warehouse Automation.

VDW can ingest these files and apply them to patterns to generate ETL or code.

So, TEAM is the design metadata and VDW the code generator.

What has changed?

The latest versions (v1.6.3 for TEAM and v1.6.5 for VDW) are primarily bug-fix and ease-of-use releases, based on project feedback from various implementations. The main changes for this release are:

  • Added basic Data Vault validation, so that some issues related to table structure and convention mismatches are detected during validation.
  • Hid features that are not used in physical mode, so that the interface is cleaner.
  • Added JSON export features, including the ‘next-up’ object in the lineage as relatedDataObject(s) and other related objects such as metadata connections.
  • Removed the repository feature. It is meant to be deprecated eventually and was only causing problems between versions for users. The repository is now always deployed as part of metadata activation.

Details on issues addressed are found in the (now closed) project for v1.6.3: https://github.com/RoelantVos/TEAM/projects/2.

A notable change that may have an impact on existing implementations is the (improved) handling of prefixes and suffixes. Underscores ‘_’ are no longer added automatically, so existing keys and prefixes may need updating as a result.

For example, the key prefix SK or HSH may now need to be _SK or _HSH if the underscore needs to be retained. This is related to issue #73.
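
To illustrate the effect, here is a minimal sketch of what 'no automatic underscore' means in practice. This is not TEAM's actual code; the function name apply_affix, its parameters and the CUSTOMER object name are hypothetical and only serve to show that the configured value is now used exactly as entered.

```python
def apply_affix(name: str, affix: str, position: str = "suffix") -> str:
    """Attach the affix exactly as configured (hypothetical helper).

    No underscore is inserted, so any separator must be part of the affix.
    """
    if position == "prefix":
        return f"{affix}{name}"
    if position == "suffix":
        return f"{name}{affix}"
    raise ValueError(f"Unknown position: {position}")


# Without an underscore in the configured value, none appears in the result.
print(apply_affix("CUSTOMER", "SK"))    # CUSTOMERSK
# Adding the underscore to the configured value retains it.
print(apply_affix("CUSTOMER", "_SK"))   # CUSTOMER_SK
print(apply_affix("CUSTOMER", "_HSH"))  # CUSTOMER_HSH
```

Under these assumptions, a configuration that previously relied on the underscore being added would need its value changed from SK to _SK to produce the same column names as before.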

As always, there are many smaller tweaks and tidy-ups such as additional documentation, hover-overs and UI improvements.

The VDW changes are mainly there to support the pattern changes. These are not functional changes, but some patterns benefit from the ‘next-up’ Related Data Object for lookup purposes. For example, the Staging Area patterns use this to support the Full Outer Join interface.
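
As a rough illustration of the ‘next-up’ idea, the sketch below derives, for each source-to-target mapping, the object one step further along the lineage and adds it to the exported mapping. It is a simplified assumption and not the actual TEAM export: the object names, the helper functions and the exact Json layout are hypothetical, and the real schema for Data Warehouse Automation may structure these properties differently.

```python
import json

# Hypothetical source-to-target mappings as (source, target) pairs.
mappings = [
    ("SRC_CUSTOMER", "STG_CUSTOMER"),
    ("STG_CUSTOMER", "PSA_CUSTOMER"),
]


def next_up_objects(target, all_mappings):
    """Return the objects that the given target is itself a source for."""
    return [tgt for src, tgt in all_mappings if src == target]


def export_mapping(source, target, all_mappings):
    """Build a simplified, illustrative export for a single mapping."""
    return {
        "sourceDataObject": source,
        "targetDataObject": target,
        # The 'next-up' objects in the lineage, available for lookups.
        "relatedDataObjects": next_up_objects(target, all_mappings),
    }


exported = [export_mapping(s, t, mappings) for s, t in mappings]
print(json.dumps(exported, indent=2))
```

In this sketch the mapping into STG_CUSTOMER lists PSA_CUSTOMER as its related object, which is the kind of information a pattern could use to look up the next layer, while the last mapping in the chain has an empty list.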

Roelant Vos

Ravos Business Intelligence admin
