Roelant Vos An expert view on Agile Data Warehousing

Early-bird for upcoming training (22-24 October) about to expire

 

 If you are interested in understanding and discussing the intricacies of developing a Data Vault based data platform / Data Warehouse, then please consider the upcoming Data Vault implementation training course. It is a practical training focused on understanding the impacts of various design decisions on the (Data Vault) patterns and overall solution architecture. Regardless of whether you develop your solution using custom scripting or by using an ‘off-the-shelf’ vendor application, you need a detailed understanding of how...

How to be sure you load 100% of the data 100% of the time!

 

 How can you be sure you load 100% of the data 100% of the time? This is an article that expands on a topic that often comes up in the Data Vault implementation training: applying Referential Integrity and consistency checks in Data Vault. It sounds straightforward, but in reality these are interesting topics for any Data Warehouse (DWH) and Data Vault practitioner. I started to write a blog post about it, but before I knew...

Schools of thought on implementing Multi-Active Satellites

 

 Right or wrong? When it comes to data management there are almost always various alternatives for implementation, and none of them is necessarily right or wrong. They represent the various options and consequences to consider, and the right solution is usually the one chosen with a full understanding of these consequences, with ‘eyes wide open’. Supporting multi-active, sometimes referred to as ‘multi-variant’ or ‘multi-valued’, behaviour of Satellites is one of these areas where opinions...

Update on development & collaboration efforts

 

 Development update. This is a status update for the small community of open source developers and collaborators that use, or contribute to, the growing ecosystem of metadata management and Data Vault generation tooling. I’ve worked hard on creating a more stable set of tooling around ETL generation and as a result the TEAM and VEDW applications (as well as some others) have now been completely separated with a view to increasing interoperability with other...

Upcoming training and events

 

 Hi everyone, sorry it has been a while! I’m done travelling for now (both for work and for leisure) and have been a bit quiet while working on new training materials and code updates for the various collaboration areas. As a result, and in the short term, I will resume posting plenty of articles on the weblog to capture various lessons learned and ideas as well as open-source code releases – so watch this...

Using (and moving to) raw data types for hash keys

 

 Making hash keys smaller. A few months ago I posted an article explaining the merits of the ‘natural business key’, which can make sense in certain situations. It also explained, from a more generic perspective, why this is something the Data Warehouse management system (‘the engine’) would be able to figure out automatically and change on the fly when required. This article used the common approach of storing the hash values in character fields (i.e. CHAR(32) for...
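 To illustrate the size difference the title refers to, here is a minimal Python sketch. The excerpt is cut off before it names a hash algorithm, so MD5 is an assumption here, chosen only because its hexadecimal digest is 32 characters long and therefore fits a CHAR(32) column. The function names and the sample business key are hypothetical.

```python
import hashlib


def hash_key_hex(business_key: str) -> str:
    """Hash key as a hexadecimal string, the form stored in a character field."""
    # Assumption: MD5 over the UTF-8 bytes of the business key.
    return hashlib.md5(business_key.encode("utf-8")).hexdigest().upper()


def hash_key_raw(business_key: str) -> bytes:
    """The same hash key as raw bytes, the form stored in a binary field."""
    return hashlib.md5(business_key.encode("utf-8")).digest()


key = "CUST-001"  # hypothetical business key
hex_key = hash_key_hex(key)
raw_key = hash_key_raw(key)

print(len(hex_key))  # 32 characters
print(len(raw_key))  # 16 bytes

# Both forms carry the same value; only the storage representation differs.
assert bytes.fromhex(hex_key) == raw_key
```

 Under these assumptions the raw form takes half the space of the character form, because every byte of the digest needs two hexadecimal characters.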

Registration now working!

 

 I’ve finally properly (I think) configured the website to allow registration and the adding of comments in a user-friendly way, without the burden of endless spambots. Registration, the creation of an account, will allow commenting and discussing content on the site itself, which is a big improvement over the current email-based correspondence. After having set up the account you will receive a welcome email and be able to log in to the site using the...

Adopting GitHub for documentation, and resulting blog changes

 

 After having used Git(Hub) to work and collaborate on code for a long time, I have recently spent some time merging and moving various documentation artefacts to GitHub as well. This covers the Data Integration framework and Enterprise Data Warehouse (EDW) architecture documentation, most importantly the various Design Patterns and Solution Patterns. These patterns form the central body of content that actually tries to explain how things work in practice. I think it makes a...

Is Data Vault becoming obsolete?

 

 What value do we get from having an intermediate hyper-normalised layer? Let me start by stating that a Data Warehouse is a necessary evil at the best of times. In the ideal world, there would be no need for it, as optimal governance and near real-time multidirectional data harmonisation would have created an environment where it is easy to retrieve information without any ambiguity across systems (including its history of changes). Ideally, we would not...

Some Q&A on Data Warehouse Virtualisation

 

 I receive a fair number of questions on the Data Warehouse Virtualisation ideas and wanted to respond and discuss this via this post. I don’t have all the answers but can share my views and expectations. When it comes to DWH Virtualisation and the Persistent Staging Area (PSA), the questions generally fall into two categories: Isn’t it too slow? How about performance? Surely users don’t want to wait for hours to see results? Why bother...