Category: General

Loading too fast for unique date/time stamps – what to do?

Let’s start by clarifying that this concerns the RDBMS world, not the Hadoop world 😉 It’s a good problem to have – loading data too quickly. So quickly that, even at high precision, multiple changes for the same key end up being inserted with the same Load Date/Time Stamp (LDTS). What happens here? A quick recap: in Data Vault the Load Date/Time Stamp (LDTS, LOAD_DTS, or INSERT_DATETIME) is defined as the moment data is recorded...

 
NoETL – Data Vault Link Satellite tables (part 2)

This is the second part of the Link Satellite virtualisation overview (the first post on this topic is here), and it dives deeper into the logic behind Driving Key based Link Satellites. Driving Key logic is arguably one of the more complex things to implement in Data Vault – and you (still) need to ensure you can cover reloads (deterministic outputs!), zero records / time variance and things such as re-opening closed relationships. In the example...

 
NoETL – Data Vault Link Satellite tables (part 1)

The final post in the series of planned posts (for now at least) about Data Warehouse Virtualisation is all about Link Satellites. As with some of the earlier posts, there are various similarities to the earlier approaches – most notably the Satellite virtualisation and processing. Concepts such as zero records and ‘virtual’ or computed end-dating are all there again, as is the use of subqueries to do attribute mapping and outer queries to calculate hash...

 
World Wide Data Vault Consortium key takeaways

Last week I attended the second iteration of the World Wide Data Vault Consortium (WWDVC), hosted by Dan Linstedt in his home state of Vermont. It was great to experience the uptake in Data Vault, going from a small group of practitioners last year to a bigger group with lots of new faces this year. Especially engaging was the day of in-depth discussions prior to the conference about various use-cases and technical solutions and improvements...

 
Data Virtualisation and DWH Virtualisation

I presented a new style of content today at the Sydney DAMA March event, and thought it may be worthwhile to post the transcript as a paper. It is all about virtualising your entire Data Warehouse: what is needed to achieve this, what it means for ETL, and what role emerging Data Virtualisation techniques play here. In a way it’s the overarching story that supports recent blog posts about NoETL. The premise is that we...

 
NoETL – Persistent (History) Staging Area (PSA)

After setting up the initial data staging in the previous post we can load the detected data delta into the historical archive: the Persistent Staging Area (PSA). The PSA is the foundation of the Virtual Enterprise Data Warehouse because all upstream modelling and representation essentially reads from this ‘archive of (data) changes’. This is because the PSA has all the information that was ever presented to the Data Warehouse, either in structured or unstructured format....

 
Zero records, time-variance and point-in-time selection

While finalising the posts for the overall Data Vault ETL implementation I have also started thinking about how to document the next steps: the loading patterns for the Presentation Layer. From here on I will refer to the Presentation Layer, Data Marts and Information Marts simply as ‘Information Marts’. This reminded me that I haven’t yet properly covered the ‘zero record’ concept. This is a timely consideration: the whole reason that zero records exist is to make the...

 
A brief history of time in Data Vault

To quote Ronald Damhof in yesterday’s Twitter conversation: ‘There are no best practices. Just a lot of good practices and even more bad practices’. Sometimes I feel Data Vault lacks a centrally managed, visible, open forum to monitor standards. And, more importantly, the evolution of these standards over time. And, even more importantly, why these standards change over time. It varies (in space and time) where sensible discussions regarding these standards take place, but lately...