Simple ETL generation series – an overview
I am starting a new series of articles to showcase how you can set up your own simple ETL generation / automation approach. The idea is to use the TEAM metadata for this, but it is perfectly possible to use any other source of metadata as well.
These (working) examples are not intended to replace commercial tools and frameworks. The posts aim to explain how ETL generation works, how it can be implemented, and what to keep in mind, either to continue down the DIY path or to assist in evaluating available software.
My view is that getting familiar with ETL generation mechanisms is worthwhile either way, if only to understand how the templates and patterns work.
Publishing the available examples and documentation on this blog could also help the various open source initiatives by encouraging better standards for parameter and variable names, so that these can be used consistently in the templates.
The examples are predominantly created in SQL, which keeps the scripts simple enough for most people to run using SQL Server Management Studio. I will also post some other examples using common templating engines, which allow for a far greater degree of customisation (but require different coding skills).
For documentation I have used the Business Activity Diagram notation, drawn in draw.io. The result is a set of process flows that display the iteration through the metadata sets and the evaluations performed to get the required variables presented back on the screen (via the templating engines).
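To make the idea of iterating through metadata and filling in variables more concrete, here is a minimal sketch in Python using the standard library's string.Template. It is not taken from the series itself: the metadata rows, the column names (hub_name, business_key, source_table) and the SQL pattern are all hypothetical and deliberately simplified, and the actual TEAM metadata will look different. A real Hub pattern would also carry things like surrogate or hash keys and load timestamps.

```python
from string import Template

# Hypothetical metadata rows (an assumption for illustration only).
# In practice these would be read from a metadata source such as TEAM.
HUB_METADATA = [
    {"hub_name": "HUB_CUSTOMER", "business_key": "CUSTOMER_ID", "source_table": "STG_CUSTOMER"},
    {"hub_name": "HUB_PRODUCT", "business_key": "PRODUCT_CODE", "source_table": "STG_PRODUCT"},
]

# A deliberately simplified load pattern: insert business keys
# that are not yet present in the target table.
HUB_TEMPLATE = Template(
    "INSERT INTO $hub_name ($business_key)\n"
    "SELECT DISTINCT stg.$business_key\n"
    "FROM $source_table stg\n"
    "WHERE NOT EXISTS (\n"
    "    SELECT 1 FROM $hub_name hub\n"
    "    WHERE hub.$business_key = stg.$business_key\n"
    ");"
)


def generate_statements(metadata, template):
    """Iterate through the metadata set and fill in the template once per row."""
    return [template.substitute(row) for row in metadata]


if __name__ == "__main__":
    # Present the generated statements back on the screen.
    for statement in generate_statements(HUB_METADATA, HUB_TEMPLATE):
        print(statement)
        print()
```

The pattern lives in one place (the template) and the specifics live in another (the metadata), so adding a new table only means adding a metadata row. The same separation applies whether the generation is done in SQL or with a templating engine.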
The following topics are planned to be published in the next few weeks:
- Data Vault Hubs (SQL)
- Data Vault Hubs (alternative / C#)
- Data Vault Links
- Data Vault Satellites and Link-Satellites
- Point-In-Time, Dimensions and other related (joined) historised data sets
I have also set up a new section in the site menu to capture these progressively as the relevant posts are released: the simple automation series.