NoETL – Data Vault Link tables

Virtualising Data Vault Link structures follows a similar process to that of the virtual Hubs, with some small additions such as the support for (optional) degenerate attributes. To make things a bit more interesting I created some metadata that requires different Business Key ‘types’ so this can be shown and tested in the virtualisation program.

For the example in this post I created three Link definitions (the metadata), one of which (LNK_CUSTOMER_COSTING) has a three-way relationship with the following properties:

  • 1 normal / straightforward key: HUB_CUSTOMER (using the source value ‘Member’)
  • 1 composite key: HUB_MEMBERSHIP_PLAN which is composed of ‘Plan_Code’ and ‘Plan_Suffix’ (the latter being a random hardcoded value for testing purposes)
  • 1 concatenated key: HUB_SEGMENT for which a single-attribute Business Key is created from concatenating the ‘Segment’ attribute and some arbitrary hardcoded values (this could have been another attribute as well)

The used metadata is shown here:

Link Metadata

As with the Hub views, the Link views may also use UNION statements to make sure data from different sources is presented as an integrated result. And similar to Hubs as well, the correct (target) attribute naming is handled in the subquery whereas hashing and condensing is implemented in the outer query.

It’s all very similar – demonstrating the pattern-based approach towards design and implementation!

The generated output is shown in the next two screenshots.

Link Generated Output

 Link_Output

Because looking directly at Link table contents isn’t the most exciting thing to do I added some clarifications to explain how the query logic works:

Link_View_Explanation

The INSERT INTO statement, to allow the view to ‘act’as ETL again is in many ways similar to the Hub approach. Obviously, in this case the Link Hash key is used to check if a record already exists in the target physical table!

Link Insert Into Statement

Over the last few posts we have generated a source-to-staging process, a Persistent Staging Area (PSA) process and Hubs, Views and Links that simulate the Data Warehouse on top off the PSA. Next up is the Link Satellites which will be split across two posts to cater for the variations in this particular area.

 
Roelant Vos

Roelant Vos

You may also like...

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.