Roelant Vos Data solution design patterns, implementation, and automation
Today, a new set of changes, all part of the ecosystem for Data Warehouse Automation, has been formally published as new releases on GitHub. As always, a large amount of work has been done in the background, and thanks go out to the various projects and teams that provided input and suggestions. This round of improvements concerns the TEAM (management of source-target mapping metadata) application, the Virtual Data Warehouse (VDW) pattern management tool and the Data...
Over the holidays I received various emails related to the (equally) various components of the data warehouse automation ecosystem. I have been placing the answers in FAQ sections on the weblog, and just wanted to let you know :-). The links are as follows: TEAM FAQ, Virtual Data Warehouse FAQ, and Schema for Data Warehouse Automation FAQ. I’ll also add a section for the DIRECT control framework in the near future.
Running all your ETL without any dependencies, while ensuring (eventual) consistency in data delivery.
Generating DIY ETL code in a Jenkins pipeline, using Virtual Data Warehouse example metadata and code generation patterns.
Using a command line utility to generate ETL.
Why the rush? With the latest Virtual Data Warehouse release, there was some urgency in making sure the matching metadata management functionality became available (to avoid using Notepad), so here it is: https://github.com/RoelantVos/TEAM/releases/tag/v1.6.1. TEAM is meant to simplify the management and creation of metadata files, and to do most of the hard work based on the principle of doing as little as possible and using conventions to generate the details. The output is made available...
Virtual Data Warehouse v1.6.3 is a bug-fix release that adds better exception handling in various places, especially when metadata (JSON) files do not match the expected format. This is now handled gracefully, and warnings or errors are visible in the event log. Also, defaults are put in place to allow code generation for metadata that does conform to the Data Warehouse Automation schema but is missing some details that VDW needs to build the user...
This latest version of the Virtual Data Warehouse requires no database, and comes with Data Vault examples to generate code straight away.
The patterns using the schema for Data Warehouse Automation are so versatile that the complex screens for PIT, unit testing and RI can be removed.
For a few days I have been adding content to the Frequently Asked Questions section regarding the generic schema for Data Warehouse Automation. A common question, and one that was raised again at today’s Ensemble Modelling conference, is what benefit an organisation gains from taking this approach rather than purchasing vendor software licenses (‘off the shelf’ solutions). Is adopting a schema and approach such as this at odds with purchasing ETL or Data Warehouse...