Category: ETL

Mappings, code, guidelines and ideas about Extraction, Transformation and Loading.

Deterministic dimension keys in a virtual Data Vault

When preparing Data Vault content for consumption in a dimensional model, dimension keys can be created to join the resulting fact and dimension tables in a performant way. But what about a truly virtual data mart? This post covers approaches to issuing dimension keys that are fully deterministic.

Generating data logistics using Biml with the schema for Data Warehouse Automation

This post details how you can connect your own Biml automation framework to the schema for Data Warehouse Automation. With a focus on being technology-agnostic and repository-less, the schema for Data Warehouse Automation provides a way to separate the storage of your design and code generation (meta)data from how you interact with it and how you apply it to generate data solutions. This (meta)data is something I refer to as design metadata, and it...

Major revision of the DIRECT framework

A new set of improvements has been committed to the data logistics control (‘ETL process control’) framework on GitHub. This framework, referred to as DIRECT (thanks to acronimify.com), assumes the spot of the Data Logistics Process Control in the engine metaphor for flexible data platform management. In case you were wondering, it’s the Data Integration Run-time Execution Control Tool! This is one of my favourite components, as it’s both so simple but also hard to truly get...

Generating Data Vault Hubs with complex Business Keys using standard SQL

When mapping the (source) Business Key definition to the target Business Key attribute(s), the most common scenarios besides the straight-up one-to-one mapping are ‘composition’, ‘concatenation’ and ‘pivoting’. In this post I will focus on the first two, as pivoting can be implemented in different ways that (depending on the solution) do not necessarily require pattern changes.