Category: General

Remarks, expressions and thoughts. And everything that does not fit in the other categories :)

Update on development & collaboration efforts

Development update: This is a status update for the small community of open source developers and collaborators that use, or contribute to, the growing ecosystem of metadata management and Data Vault generation tooling. I’ve worked hard on creating a more stable set of tooling around ETL generation and as a result the TEAM and VEDW applications (as well as some others) have now been completely separated with a view to increasing interoperability with other...

 
Upcoming training and events

Hi everyone, sorry it has been a while! I’m done travelling for now (for both work and leisure) and have been a bit quiet while working on new training materials and code updates for the various collaboration areas. As a result, and in the short term, I will resume posting plenty of articles on the weblog to capture various lessons learned and ideas as well as open-source code releases – so watch this...

 
Using (and moving to) raw data types for hash keys

Making hash keys smaller: A few months ago I posted an article explaining the merits of the ‘natural business key’, which can make sense in certain situations. And, from a more generic perspective, why this is something the Data Warehouse management system (‘the engine’) would be able to figure out automatically and change on the fly when required. This article used the common approach of storing the hash values in character fields (i.e. CHAR(32) for...
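
To illustrate the size difference this post is about, here is a minimal sketch. It assumes MD5 as the hash function (the excerpt above does not name one) and uses a made-up business key; the function names are hypothetical and not taken from any of the tooling mentioned on this site.

```python
import hashlib


def hash_key_hex(business_key: str) -> str:
    """Hash key as hexadecimal text, the form that fits a CHAR(32) column."""
    # Assumption: MD5 is used purely as an example of a 128-bit hash.
    return hashlib.md5(business_key.encode("utf-8")).hexdigest().upper()


def hash_key_raw(business_key: str) -> bytes:
    """The same hash key as raw bytes, the form that fits a 16-byte binary column."""
    return hashlib.md5(business_key.encode("utf-8")).digest()


# Hypothetical business key, for illustration only.
key = "CUST-001"
hex_key = hash_key_hex(key)
raw_key = hash_key_raw(key)

print(len(hex_key))  # number of characters in the text form
print(len(raw_key))  # number of bytes in the raw form
```

Both forms carry the same 128 bits, so converting between them is lossless: `raw_key.hex().upper()` gives back the text form. The raw form simply needs half the storage per key.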

 
Registration now working!

I’ve finally (and properly, I think) configured the website to allow registration and the adding of comments in a user-friendly way, without the burden of endless spambots. Registration, the creation of an account, will allow commenting and discussing content on the site itself, which is a big improvement over the current email-based correspondence. Once the account is set up you will receive a welcome email and be able to log in to the site using the...

 
Adopting GitHub for documentation, and resulting blog changes

After having used Git(Hub) to work and collaborate on code for a long time, I have recently spent some time merging and moving various documentation artefacts to GitHub as well. This covers the Data Integration framework and Enterprise Data Warehouse (EDW) architecture documentation, most importantly the various Design Patterns and Solution Patterns. These patterns form the central body of content that actually tries to explain how things work in practice. I think it makes a...

 
Is Data Vault becoming obsolete?

What value do we get from having an intermediate hyper-normalised layer? Let me start by stating that a Data Warehouse is a necessary evil at the best of times. In the ideal world, there would be no need for it, as optimal governance and near real-time multidirectional data harmonisation would have created an environment where it is easy to retrieve information without any ambiguity across systems (including its history of changes). Ideally, we would not...

 
Some Q&A on Data Warehouse Virtualisation

I receive a fair number of questions on the Data Warehouse Virtualisation ideas and wanted to respond to and discuss them in this post. I don’t have all the answers but can share my views and expectations. When it comes to DWH Virtualisation and the Persistent Staging Area (PSA), the questions generally fall into two categories: Isn’t it too slow? How about performance? Surely users don’t want to wait for hours to see results? Why bother...

 
Biml Express 2017 tests, comments and work-arounds

The new version of Biml Express, the free script-based ETL generation plug-in for Visual Studio provided by Varigence, has been out for a few months (since mid-July 2017, to be precise). However, only recently have I been able to find some time to properly regression-test this new release against my library of patterns / scripts. The driver is the upcoming Data Modelling Zone event and Data Vault Implementation & Automation training sessions – better keep up...

 
Updated sample and metadata models for Data Vault generation and virtualisation

After a bit of a pause in working on the weblog and technology (caused by an extended period of high pressure in the day job) I am once again working on some changes in the various concepts I’m writing about on this site. Recently I was made aware of this great little tool that supports easy creation and sharing of simple data models: Quick Database Diagrams (‘QuickDBD’). The tool is 100% online and can be...

 
When a full history of changes is too much: implementing abstraction for Point-In-Time (PIT) and Dimension tables

When changes are just too many: When you construct a Point-In-Time (PIT) table or Dimension from your Data Vault model, do you sometimes find yourself in the situation where there are too many change records present? This is because, in the standard Data Vault design, tiny variations when loading data may result in the creation of very small time slices when the various historised data sets (e.g. Satellites) are combined. There is such a thing as too...