Tagged: ETL

0

When is a change a ‘change’?

This is a post that touches on what I think is one the essential best-practices for ETL design: the ability to process multiple changes for the same key in a single pass. This is specifically relevant for typical ETL processes that load data to a time-variant target (PSA, Satellite, Dimension etc.). For non-time variant targets (Hubs, Links etc.) the process is a bit easier as this is essentially built-in the patterns already :-). In a given...

 
0

Creating Data Vault Point-In-Time and Dimension tables: merging historical data sources

Beyond creating Hubs, Links and Satellites and current-state (Type 1) views off the Data Vault, one of the most common requirements is the ability to represent a complete history of changes for a specific business entity (Hub, Link or groups of those). If a given Hub has on average 3 or 4 Satellites, is it useful at the very least to see the full history of changes for that specific Hub across all Satellites. How to...

 
1

Denormalise (flatten) XML using Xquery

ETL software have ways to interpret XML files or structures that vary in maturity, but recently I had to find another way to import XML structures that are received as blobs using (transact) SQL only. Because the upstream patterns (Staging to Integration) can easily handle denormalised source data sets the easiest way was to cast each blob as the XML file it is anyway and completely denormalise it (‘flatten’, more rows…). This is opposed to normalising...

 
1

Do we still want to automate against ETL tools?

In the various Data Warehouse initiatives I have been involved with recently I tend to use ETL software less and less. Over the last few years I spent a lot of time figuring out how to automate/generate ETL in various tools – most notably in SSIS, Pentaho, Powercenter and recently SAS (but various others as well). To the point that most of the intended solution can be generated from metadata. But as recently outlined in this...

 
1

Using an ETL platform for your Data Warehouse, is it still relevant?

When I started my career in Data Warehousing and Business Intelligence 15 years ago there was a massive push towards adopting ETL software. Traditionally, specialised ETL software such as Informatica Powercenter, Oracle Warehouse Builder (later superseded by Oracle Data Integration), Microsoft DTS (later superseded by SSIS) and many similar platforms were very successful because of two main reason: ETL software provided a way to ‘explain’ or document what was happening in a way that made...

 
2

Data Vault ETL Implementation using SSIS: Step 2 – Historical Staging ETL

The archiving process such as the Historical Staging ETL are also not conceptually part of Data Vault but can play an important factor in the complete Enterprise Data Warehouse (EDW). While the Staging Area directly supports the Data Vault message the Historical Staging is really optional and does not impact Data Vault processes. For the sake of completion the ETL overview for the Historical Staging is provided regardless. As with any ETL process within the...

 
Data Vault implementation preparations – fundamental ETL requirements 0

Data Vault implementation preparations – fundamental ETL requirements

Prior to working my way through the end-to-end ETL solution for Data Vault certain fundamentals must be in place. The reference architecture is one of them and this is largely documented as part of this site and corresponding Wiki. The other main component from an implementation perspective are the requirements for ETL. As all concepts have their place in the reference architecture for good reasons they also have tight relationships and changes to one concept...

 
EAI, ESB and ETL and the ETL generation tool 0

EAI, ESB and ETL and the ETL generation tool

It has been very quiet on the weblog / idea repository since November. This was partly due to the summer holiday period (in Australia) and the fact that I’ve finally picked up the work to develop the ETL generation tool based on the described EDW architecture. Work wise I’ve recently done a lot of strategy work with very little practical exposure to the technologies, and this is likely to continue for some time to come....