Tagged: SSIS

0

Data Vault ETL Implementation using SSIS: Step 7 – Link Satellite ETL – part 3 – End Dating

I’m catching up on old drafts within WordPress, and in the spirit of being complete on the older SSIS series felt I should pick this one up and complete it. While most of my focus is on developing the virtualisation concepts I still work a lot with more traditional ETL tools, one of which is Microsoft SSIS. Recently I merged the metadata models that underpin the virtualisation and SSIS automation and I am retesting everything...

 
4

Data Vault ETL Implementation using SSIS: Step 5 – Satellite ETL – part 2 – SQL selection (wide table exception)

There is more to say on exactly how Satellite selections are made as originally documented in the Satellite ETL post. In many cases not all records from a Staging Area table are selected, but rather a subset. This creates an interesting issue to which I usually refer to as the ‘wide table exception’. It leads to a minor change in the SQL selection for Satellite entities. The outline and solution are described in this post. Image the scenario where...

 
0

Data Vault ETL Implementation using SSIS: Step 3 – Hub ETL – part 2 – SQL selection

The outline ETL for Data Vault Hub style ETL using SSIS has been documented in ‘Hub implementation’ and ‘Hub exceptions’. As a minor addition I would like to dive a bit deeper into the selection logic associated with Hub ETL. Following the experiences ‘from the trenches’ there are basically two ways to create SQL for a Hub selection from the Staging Area; Straight-up selection of the business key Selection including a composite business key (e.g. assembled from...

 
0

Data Vault ETL Implementation using SSIS: Step 4 – Link ETL

In the Data Vault workflow the Link is the next object to typically be loaded after the Hub has been processed. This can be done in parallel with the standard Satellites but this will be covered in a future posting about the workflow/scheduling of Data Vault ETL. The Link ETL comes down to select the distinct pairs of business keys from the respective Staging Area table, check if they’re there and insert them if they’re...

 
1

Data Vault ETL Implementation using SSIS: Step 3 – Hub ETL – part 1 – overview

The Staging Area and Historical Staging, both part of the conceptual Staging Layer, are not directly related to Data Vault although at least the Staging Area ETL paves to way with the definition of the Event Date/Time. The Hub however is the first true Data Vault template to be implemented in SSIS. Please note that this is true for DV1.0; for DV2.0 the Staging Area is incorporated into the design a bit more. The Hub...

 
2

Data Vault ETL Implementation using SSIS: Step 2 – Historical Staging ETL

The archiving process such as the Historical Staging ETL are also not conceptually part of Data Vault but can play an important factor in the complete Enterprise Data Warehouse (EDW). While the Staging Area directly supports the Data Vault message the Historical Staging is really optional and does not impact Data Vault processes. For the sake of completion the ETL overview for the Historical Staging is provided regardless. As with any ETL process within the...

 
0

Data Vault ETL Implementation using SSIS: Step 1 – Staging Area ETL

While technically (and conceptually) not really part of Data Vault the first step of the Enterprise Data Warehouse is to properly source, or stage, the data. This is a critical step, and often one of the most difficult to get right. It also has direct impact on the core DWH / Data Vault development, or at least supports its core message and principles. This impact can be summarised by the definition that the purpose of...

 
Data Vault implementation A-Z: Staging data (the conceptual side) 0

Data Vault implementation A-Z: Staging data (the conceptual side)

This is the first of a planned series of implementation designs for implementing Data Vault in an end-to-end Data Warehouse environment. The positioning of the Data Vault concepts and techniques in the greater design of the system (reference architecture) is documented elsewhere in this site, mainly in the ‘papers’ section. Data Vault in itself does not provide a complete solution for most Data Warehouse purposes but provides a great set of modeling techniques to design the...

 
ETL generation in SSIS (adult steps) 0

ETL generation in SSIS (adult steps)

Full generation of ETL is the missing component towards the model driven design of the Data Warehouse and I am still pursuing various methods for various ETL suites to add to this concept. Some time ago I looked into ETL generation for Microsoft SSIS using the available DTS libraries (see the post) which really were baby steps. At some point one of my colleagues suggested looking into ‘BIML’ (Business Intelligence Markup Language) and the accompanying...