Mapping generation for Data Vault demo: part 6 (links)

Getting the Data Vault link entities right is probably the hardest part of the mapping generation algorithm. The following script will require user input at critical moments in the process. It helps to have the target datamodel at hand so you know why (and how) these choices are made. By reading from the history area tables and using the Hub entities created in the previous step this algorithm will create relationship tables (links) including link satellites. Before looking at the code I’ll demonstrate the use for this particular source. It will be very useful to have the data models ready for this exercise. As always the script will create the new table definitions and the ETL mapping between the source(s) and this new target definition. The script can be downloaded here: 4_generate_datavault_links.

The following steps have to be taken:

  1. Run the script using the source command (source 4_generate_datavault_links.tcl).
  2. The script will ask it it is allowed to replace existing metadata, select yes (y) for these occasions.
  3. The first table in the array will be the H_CITY table. You can use either this or the next table (H_COUNTRY) to create a Link table between CITY and COUNTRY. So select yes (y) for this one.
  4. Enter the name of the new Link table (without prefix). Enter CITY_IN_COUNTRY.
  5. The script will ask which Hub entities will be present in this new Link entity. In our case (check the model!) it will be CITY and COUNTRY. Enter CITY COUNTRY at the prompt. This will create an array for the Hubs.
  6. The next question is whether the source contains specific attributes which should be moved to a Link Satellite. These attributes will be descriptive about a relationship. The current source CITY does not contain those. Enter no (n) at this prompt.
  7. After this, a list of the source attributes will be shown and the prompt will require you to check if the source contains transactional or factual attributes. None in our source, so just press enter here.
  8. When all this information is available to the script the mapping will be generated.
  9. If inconsistencies in the naming of the keys is detected, the script cannot (at this stage) relate the relationship so a list of attributes will be shown with a prompt to enter the correct key. For the link CITY_IN_COUNTRY the correct code (check the datamodel!) will be COUNTRYCODE.
  10. That’s it! The Link and Link Satellite tables and mappings have been generated.

This means the the first source table has been processed but the real Link table LANGUAGE_IN_COUNTRY (based on the COUNTRYLANGUAGE source table) has yet to be created. The following steps apply:

  1. When requested whether H_COUNTRY has to lead to a Link, select No (n).
  2. When requested if H_COUNTRYLANGUAGE leads to a Link, select Yes (y).
  3. Due to text limitations (with the prefixes) enter the following name : LNG_IN_COUNTRY.
  4. Select the Hubs which are part of this new Link: LANGUAGE COUNTRY.
  5. This particular source does have attributes which are descriptive for a relationship: Is Official and Percentage. So select Yes (y) here.
  6. There are no transactional attributes so press enter at the next prompt.
  7. The inconsistency in the source model also occurs in this table, so select COUNTRYCODE from the list.

After this the full Data Vault model is available, and fully generated.

Roelant Vos

Roelant Vos

You may also like...

6 Responses

  1. Dan Linstedt says:

    Very cool, this is awesome stuff you are making available. Thank-you for all the hard work, I will try to sift through it soon. I think more people should be writing about the patterns found in the Data Vault – maybe, just maybe it would make it popular.

    Keep up the great work!
    Dan Linstedt

    • Roelant Vos Roelant Vos says:

      Hi Dan,

      Thanks! It’s a great (and useful) exercise. I just found out that the older version of the link / link-satellite still shows in the blog. I’ll update the latest version this evening. This new script includes the correct satellite mappings.


  2. Hi Roelant,
    Great work and very nice that you share your experiences on this topic on your blog. Actually I’m thinking about the generation of Datavaults as well. I will absolutely check your work and see if we share the same ideas about this topic :-).

    Oh ja, en ik kom ook uit Nederland…

    Regards, Denny de Jonge

  3. Sjoerd Evers says:

    Hi Roelant,

    5 years ago we already did quite some stuff with OMB scripting. Since then you have taken a lot of effort to get the most out of OMB scripting. 2 years ago i’ve seen your demo on how to generate all objects from staging till datamarts making use of OMB scripts. Now you taken it up another level by integrating Datavault principles. As soon as i got a free weekend im going to try and get things going on my laptop with your scripts.

    Top notch work mate!



Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.