Comments on ‘Modeling the Agile Data Warehouse with Data Vault’

I just finished reading ‘Modeling the Agile Data Warehouse with Data Vault’ by Hans Hultgren. I think it provides a good, clear overview of Data Warehouse design using Data Vault as the technique for the core DWH layer. Most importantly, it covers some practical aspects that have been implemented but not documented in (practical) detail, such as:

  • Key Satellites
  • Identical Business Keys (primarily solved through defining a concatenated key)
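To illustrate the concatenated-key idea for identical Business Keys, here is a minimal Python sketch. It assumes the key is made unique by prefixing a source-system code; the function name `hub_business_key`, the `|` delimiter and the `crm`/`erp` codes are hypothetical and not taken from the book.

```python
def hub_business_key(source_system: str, business_key: str, delimiter: str = "|") -> str:
    """Concatenate a source-system code and a source key into one Hub business key.

    Assumption: the same key value (e.g. '1001') can mean different things
    in different sources, so the source code is used to disambiguate.
    """
    return f"{source_system.strip().upper()}{delimiter}{business_key.strip()}"


# The same source value yields two distinct Hub keys.
crm_key = hub_business_key("crm", "1001")
erp_key = hub_business_key("erp", "1001")
print(crm_key, erp_key)
```

The delimiter should be a character that cannot occur in either part, otherwise two different combinations could collapse into the same key.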

Points that are up for discussion, mainly because they seem to differ from the things Dan Linstedt specifies, are:

  • Multi Active Satellites (called Multi-Valued Satellites in this book). In the Dan Linstedt version these seem to be a way to reduce the number of tables, whereas in the Hans Hultgren version every additional key is modelled out as a Hub.
  • Creating events as Hubs. In the original concepts the Link is pitched as the point where various Business Entities come together as a transaction (one of the archetypes was even called ‘Transactional Link’ – for true transactional data only).
  • End-dating of Links (relationships) is not mentioned, nor is the handling of, for instance, the driving key mechanism.

My personal view is that Multi Active Satellites are a great way of linking information directly to a Hub while avoiding the need to define a Hub for something that is not necessarily self-standing. As for events, I still typically see these as Links (specifically Link-Satellites). Creating a Hub for an event (e.g. Sale, Appointment) is always an option, but I see it more as a last resort after other solutions have failed, including creating Links that carry the transaction ID as a degenerate key. Of course, it also depends to an extent on how the data is handled in the sources. In all cases a Link-Satellite can handle all types of transactions and events.
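The two patterns I prefer can be sketched as follows, using Python's built-in `sqlite3` module and an in-memory database. All table and column names (`sat_customer_phone`, `link_sale`, `sale_id` and so on), the sample rows and the simplified keys are assumptions for illustration only; a real model would also carry record source and other metadata columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE hub_customer (
    customer_hkey TEXT PRIMARY KEY,
    customer_bk   TEXT NOT NULL UNIQUE
);
CREATE TABLE hub_product (
    product_hkey TEXT PRIMARY KEY,
    product_bk   TEXT NOT NULL UNIQUE
);
-- Multi Active Satellite: the additional key (phone_type) is part of the
-- primary key, so no separate Hub is needed for it.
CREATE TABLE sat_customer_phone (
    customer_hkey TEXT NOT NULL REFERENCES hub_customer (customer_hkey),
    phone_type    TEXT NOT NULL,
    load_dts      TEXT NOT NULL,
    phone_number  TEXT,
    PRIMARY KEY (customer_hkey, phone_type, load_dts)
);
-- Event modelled as a Link: the transaction ID is a degenerate key that
-- sits in the Link next to the Hub keys, instead of getting its own Hub.
CREATE TABLE link_sale (
    sale_hkey     TEXT PRIMARY KEY,
    customer_hkey TEXT NOT NULL REFERENCES hub_customer (customer_hkey),
    product_hkey  TEXT NOT NULL REFERENCES hub_product (product_hkey),
    sale_id       TEXT NOT NULL,
    UNIQUE (customer_hkey, product_hkey, sale_id)
);
-- Link-Satellite: holds the descriptive context of the event.
CREATE TABLE lsat_sale (
    sale_hkey TEXT NOT NULL REFERENCES link_sale (sale_hkey),
    load_dts  TEXT NOT NULL,
    amount    REAL,
    PRIMARY KEY (sale_hkey, load_dts)
);
""")

# Illustrative sample data only.
conn.execute("INSERT INTO hub_customer VALUES ('c1', 'CUST-1')")
conn.execute("INSERT INTO hub_product VALUES ('p1', 'PROD-1')")
conn.executemany(
    "INSERT INTO sat_customer_phone VALUES (?, ?, ?, ?)",
    [("c1", "home", "2024-01-01", "555-0100"),
     ("c1", "mobile", "2024-01-01", "555-0101")],
)
conn.executemany(
    "INSERT INTO link_sale VALUES (?, ?, ?, ?)",
    [("s1", "c1", "p1", "TX-1"),
     ("s2", "c1", "p1", "TX-2")],
)
conn.executemany(
    "INSERT INTO lsat_sale VALUES (?, ?, ?)",
    [("s1", "2024-01-01", 10.0),
     ("s2", "2024-01-01", 25.0)],
)

# Two phone numbers are active for the same customer at the same load time.
active_phones = conn.execute(
    "SELECT COUNT(*) FROM sat_customer_phone "
    "WHERE customer_hkey = 'c1' AND load_dts = '2024-01-01'"
).fetchone()[0]

# The same customer/product pair occurs in two sales, told apart by sale_id.
sale_count = conn.execute("SELECT COUNT(*) FROM link_sale").fetchone()[0]
total_amount = conn.execute("SELECT SUM(amount) FROM lsat_sale").fetchone()[0]
print(active_phones, sale_count, total_amount)
```

Without the degenerate `sale_id` in the Link, the second sale for the same customer and product would not be a new relationship, which is why the transaction ID has to be part of the Link's uniqueness.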

 
Roelant Vos
