Mapping generation for Data Vault demo: part 1 (the source system)

With the demo environments now up and running I’ve done a search for suiteable source data for the mapping generation demo. After several unsuccesful attempts to obtain representative source data I’ve settled with the free MySql demo data ‘world.sql. Since my demo environment runs on Oracle I have converted the MySql syntax to the Oracle syntax. In this post I’ll sketch the outlines of the demo and the source / target models.

The complete set of source data can be found here: World.

When this script is executed the result will be a set of three populated tables:

Sample source model (World location dataset)

There is a foreign key from COUNTRYCODE in the CITY table to the CODE (PK) in the COUNTRY table. Furthermore the CAPITAL attribute of the COUNTRY table leads back to the CITY ID attribute. These relationships can be defined as ‘A city is situated in a country’. Being a capital or not is actually a property of a city in datawarehouse terms. For the LANGUAGE table the COUNTRYCODE is again used as a foreign key to CODE in the COUNTRY table. It is a one to many relationship where one country can have multiple languages. For each of these languages the corresponding percentage can be viewed for that particular country, as well whether the language is an official language.

 This (very) easy source datamodel will be converted to a Data Vault model like this:

Sample Data Vault result based on the sample source data

The defined hubs will become CITY, COUNTRY and LANGUAGE. The relationship between CITY and COUNTRY will be the link table CITY_IN_COUNTRY. The relation between language and country will be more interesting: the link table LANGUAGE_IN_COUNTRY will have a separate link satellite table where the specifics of that relationship will be stored (percentage, official).

Roelant Vos

Roelant Vos

You may also like...

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.