Mapping generation for Data Vault demo: part 6 (links)
Getting the Data Vault link entities right is probably the hardest part of the mapping generation algorithm. The following script will require user input at critical moments in the process. It helps to have the target datamodel at hand so you know why (and how) these choices are made. By reading from the history area tables and using the Hub entities created in the previous step this algorithm will create relationship tables (links) including link satellites. Before looking at the code I’ll demonstrate the use for this particular source. It will be very useful to have the data models ready for this exercise. As always the script will create the new table definitions and the ETL mapping between the source(s) and this new target definition. The script can be downloaded here: 4_generate_datavault_links.
The following steps have to be taken:
- Run the script using the source command (source 4_generate_datavault_links.tcl).
- The script will ask it it is allowed to replace existing metadata, select yes (y) for these occasions.
- The first table in the array will be the H_CITY table. You can use either this or the next table (H_COUNTRY) to create a Link table between CITY and COUNTRY. So select yes (y) for this one.
- Enter the name of the new Link table (without prefix). Enter CITY_IN_COUNTRY.
- The script will ask which Hub entities will be present in this new Link entity. In our case (check the model!) it will be CITY and COUNTRY. Enter CITY COUNTRY at the prompt. This will create an array for the Hubs.
- The next question is whether the source contains specific attributes which should be moved to a Link Satellite. These attributes will be descriptive about a relationship. The current source CITY does not contain those. Enter no (n) at this prompt.
- After this, a list of the source attributes will be shown and the prompt will require you to check if the source contains transactional or factual attributes. None in our source, so just press enter here.
- When all this information is available to the script the mapping will be generated.
- If inconsistencies in the naming of the keys is detected, the script cannot (at this stage) relate the relationship so a list of attributes will be shown with a prompt to enter the correct key. For the link CITY_IN_COUNTRY the correct code (check the datamodel!) will be COUNTRYCODE.
- That’s it! The Link and Link Satellite tables and mappings have been generated.
This means the the first source table has been processed but the real Link table LANGUAGE_IN_COUNTRY (based on the COUNTRYLANGUAGE source table) has yet to be created. The following steps apply:
- When requested whether H_COUNTRY has to lead to a Link, select No (n).
- When requested if H_COUNTRYLANGUAGE leads to a Link, select Yes (y).
- Due to text limitations (with the prefixes) enter the following name : LNG_IN_COUNTRY.
- Select the Hubs which are part of this new Link: LANGUAGE COUNTRY.
- This particular source does have attributes which are descriptive for a relationship: Is Official and Percentage. So select Yes (y) here.
- There are no transactional attributes so press enter at the next prompt.
- The inconsistency in the source model also occurs in this table, so select COUNTRYCODE from the list.
After this the full Data Vault model is available, and fully generated.