Data Vault modelling standards

Every successful Data Warehouse solution is backed by solid and consistent data modelling and data architecture practices. This includes predictable, exhaustive and well structured naming conventions. For the purpose of documenting the end-to-end solution design for Data Vault I used the ones listed below.

General conventions:

  • Every table contains at least one (1) attribute which relates records (inserted/updated) to an ETL process identifier; which may relate to a larger ETL process control framework
  • All table names are singular

Table conventions:

TableConventionAttribute(s) Additional information
HubHUB_%Entity Name%- %Entity Name%_SK
- (Primary Key)
- %Business Key% (Unique/Alternate Key)
- Insert Date/Time
- Record Source
- ETL Process ID
The Primary Key Index is Non-Clustered
The Unique Key is a Clustered Index
LinkLNK_%Entity Name%- %Entity Name%_SK (Primary Key)
- %Hub Key(s)% (Unique/Alternate Key)
- Insert Date/Time
- Record Source
- ETL Process ID
Hub keys in Link tables are Foreign Keys from the (parent) Hub
A Link can have 2 or more Hub keys
The Primary Key Index is Non-Clustered
The Unique Key is a Clustered Index
SatelliteSAT_%Entity Name%- %Hub Key% (Primary Key)
- Insert Date/Time (Primary Key)
- Record Source
- ETL Process ID
- %Attribute(s)%
A Satellite relates to only a single Hub and therefore only contains one (1) Hub key
A Satellite never contains Foreign Keys
The Primary Key Index is Clustered
Link-SatelliteLSAT_%Entity Name%- %Link Key% (Primary Key)
- Insert Date/Time (Primary Key)
- Record Source
- ETL Process ID
- %Attribute(s)%
A Link-Satellite relates to only a single Link and therefore only contains one (1) Link key
The Primary Key Index is Clustered

Technical conventions:

ObjectConventionAdditional Information
Primary Key constraintPK_%Entity Name%
IndexIX_%Entity Name%