Roelant Vos - Data solution design patterns, implementation, and automation

March 1, 2024

Data Engine Thinking – the book

Data Engine Thinking – the book by Roelant Vos and Dirk Lerner – provides a full and in-depth overview of everything you need to develop a data solution that can stand both the test of time and the real world.

General / Organisation

February 26, 2024

Introducing Agnostic Data Labs

Announcing Agnostic Data Labs – the new data solution automation platform to truly implement data warehouse projects your way!

Data Vault

December 11, 2023

Remedies for Driving Keys

Driving keys continue to cause problems, even for experienced Data Vault practitioners. Are there better ways to manage relationship validity?

Data Vault

December 4, 2023

End dating, or not, in Data Vault

End dating Satellites in Data Vault involves much more than meets the eye. This post lays out various options and considerations that may help decide whether to implement end-dating, or not.

General

November 5, 2023

A brief update on data solution automation progress

I would like to share this short blog post with an overview of everything that is happening in the data solution automation space. First of all, it’s only a week to go before heading to Europe for various events. Working towards these events always pushes the overall content, and there is a lot happening. The design/mapping metadata tool TEAM and code-generation tool VDW are getting a facelift and are being updated to use the v2.0...

General

September 15, 2023

Data solution metadata schema – v2.0

The updated (v2.0) version of the data automation metadata schema has been finalised! This version is the culmination of many discussions over a long period of time, and hopefully is a step in the direction of making the exchange of data solution / data warehouse metadata easier. The link to the latest & greatest version is here: https://github.com/data-solution-automation-engine/data-warehouse-automation-metadata-schema/releases/tag/v2.0. An important change is that the ‘Data Query’ concept has been split out into a query at...

Architecture

July 27, 2023

Joining tables in the Persistent Staging Area

Joining tables in the Persistent Staging Area (PSA) could be a practical solution that avoids downstream complexities. This post explains the pattern to do so.

Data Vault / General

March 26, 2023

Visualising the merger of time-variant and bitemporal data

By plotting, and then combining, bitemporal and historised data sets on a Cartesian plane, it’s really easy to understand bitemporal behaviour.

General

March 23, 2023

A not-so-gentle follow-up on bitemporal data challenges

When delivering data from the integration layer (e.g. a Data Vault model) to the presentation layer (anything, but usually a dimensional model or wide table), a key requirement is re-organising data to the selected ‘business’ timeline for delivery.

During this process, we leave the safety of the assertion (technical) timeline behind and start using the real-world state timeline for delivery. This may create some unexpected results!

General

February 20, 2023

A gentle introduction to bitemporal data challenges

The data is wrong! No, it’s not wrong, we’re just looking at it from different points in time. This post shows how a data warehouse helps to manage this common topic on data interpretation.

Data Vault / ETL / General

February 5, 2023

Deterministic dimension keys in a virtual Data Vault

When preparing Data Vault content for consumption in a dimensional model, dimension keys can be created to join the resulting fact and dimension tables in a performant way. But what about a truly virtual data mart? This post covers approaches to issue dimension keys that are fully deterministic.
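
To give a rough idea of what "fully deterministic" can mean here, the sketch below derives a dimension key by hashing a business key together with the timestamp at which that state became effective. This is only an illustration under assumptions: the function name dimension_key, the delimiter, the normalisation rules and the choice of MD5 are all hypothetical and are not necessarily the approaches the post itself describes.

```python
import hashlib
from datetime import datetime


def dimension_key(business_key: str, effective_from: datetime, delimiter: str = "|") -> str:
    """Return a key that depends only on its inputs (hypothetical helper).

    The same business key and effective timestamp always produce the same
    key, so a virtual (view-based) fact and dimension can compute it
    independently and still join, without a sequence or lookup table.
    """
    # Normalise the business key so formatting differences do not change the key.
    normalised = business_key.strip().upper()
    payload = f"{normalised}{delimiter}{effective_from.isoformat()}"
    # MD5 is an assumption for the sketch; any stable hash function would work.
    return hashlib.md5(payload.encode("utf-8")).hexdigest().upper()


if __name__ == "__main__":
    key = dimension_key("cust-001", datetime(2023, 2, 5))
    print(key)
```

Because the key is a pure function of its inputs, re-running the same query or rebuilding the view yields identical keys, which a sequence-generated key cannot guarantee.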