Roelant Vos An expert view on Agile Data Warehousing

Why you need an Enterprise BI framework

Over the last few months I have given a number of pre-sales presentations about a framework for designing and developing data integration programs. While the framework in question is smaller in scale (mainly ETL), I did encounter numerous discussions about why such a framework is necessary in the first place. What would be achieved by using it? Because this blog serves as my own personal framework and collection of best practices, I thought it would be...

Current mapping generation improvement points

I’ve demoed the posted processes for mapping generation and Data Vault a couple of times for varying audiences and (lucky me) no one has noticed the existing flaws 🙂 So it’s time to make a list of them so I don’t forget to fix them. It’s also a heads-up for the few people I know who are using the scripts. There is currently no way to keep the ‘transactional’ attributes out of...

Mapping generation for Data Vault demo: part 6 (links)

Getting the Data Vault link entities right is probably the hardest part of the mapping generation algorithm. The following script requires user input at critical moments in the process. It helps to have the target data model at hand so you know why (and how) these choices are made. By reading from the history area tables and using the Hub entities created in the previous step, this algorithm creates relationship tables (links), including link satellites....

Mapping generation for Data Vault demo: part 5 (hubs and satellites)

After the successful creation of a history layer, it is time to focus on the core part of the data warehouse: the Data Vault. In this example a ‘raw’ Data Vault is created: no changes or transformations are applied here. The first script in this layer creates the Hubs and their corresponding Satellites. Currently this is done based on the history layer, which is probably not the best approach. It would be better to have the...
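To make the Hub and Satellite split a bit more concrete, here is a minimal Python sketch of the idea. It is not the OWB/Tcl script from the post: the function name, the `load_date` column, the integer surrogate keys and the sample rows are all assumptions made up for illustration. The Hub keeps one entry per business key, and the Satellite keeps the descriptive attributes untouched, as in a ‘raw’ Data Vault.

```python
def build_hub_and_satellite(history_rows, business_key):
    """Split history rows into a Hub and a Satellite (illustrative only).

    history_rows: list of dicts, each with the business key column and a
    'load_date' column (both names are assumptions for this sketch).
    """
    hub = {}        # business key value -> surrogate key
    satellite = []  # one row of context attributes per history row

    for row in history_rows:
        key_value = row[business_key]
        # The Hub holds each business key exactly once.
        if key_value not in hub:
            hub[key_value] = len(hub) + 1

        # Everything that is not the key or the load date is context,
        # copied as-is because a raw Data Vault applies no transformations.
        context = {
            name: value
            for name, value in row.items()
            if name not in (business_key, "load_date")
        }
        satellite.append(
            {"hub_key": hub[key_value], "load_date": row["load_date"], **context}
        )

    return hub, satellite


# Hypothetical history rows, not the demo data set from the post.
rows = [
    {"code": "NLD", "name": "Netherlands", "load_date": "2011-01-01"},
    {"code": "AUS", "name": "Australia", "load_date": "2011-01-01"},
    {"code": "NLD", "name": "The Netherlands", "load_date": "2011-02-01"},
]
hub, satellite = build_hub_and_satellite(rows, "code")
```

In this sketch the Hub ends up with two keys while the Satellite keeps all three versions, which is the separation of keys from context that the Hub and Satellite entities are meant to provide.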

Mapping generation for Data Vault demo: part 4 (history area)

Now that the staging area tables and the mappings from source to staging are created, it’s time for the next step: the history area. In this step the source data is archived as Type 2 slowly changing dimension (SCD2) history. The source Tcl file for the staging to history mappings can be downloaded here: 2_staging_to_history_generation. When run (source command!) the script will ask the familiar questions about replacing ETL metadata. The script executes the following steps; specific elements are...
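For readers less familiar with SCD2, the archiving rule can be sketched in a few lines of Python. This is only an illustration of the general technique, not what the generated OWB mappings do internally: the `HistoryRow` class, the `apply_scd2` function and the use of `None` as the open end date are assumptions for this sketch.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class HistoryRow:
    key: str
    attributes: dict
    valid_from: datetime
    valid_to: Optional[datetime] = None  # None marks the current version


def apply_scd2(history, key, attributes, load_time):
    """Archive one staging record into the history list, SCD2 style."""
    current = next(
        (row for row in history if row.key == key and row.valid_to is None),
        None,
    )
    if current is not None:
        if current.attributes == attributes:
            return history  # unchanged record: keep the current version open
        current.valid_to = load_time  # close the version that was current
    # New key or changed attributes: open a new current version.
    history.append(HistoryRow(key, dict(attributes), load_time))
    return history


# Hypothetical loads for a single key.
history = []
apply_scd2(history, "NLD", {"name": "Netherlands"}, datetime(2011, 1, 1))
apply_scd2(history, "NLD", {"name": "Netherlands"}, datetime(2011, 1, 2))
apply_scd2(history, "NLD", {"name": "The Netherlands"}, datetime(2011, 1, 3))
```

The second load changes nothing, so no new row is written. The third load closes the first version and opens a new one, so every state the source record has been in stays available.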

Mapping generation for Data Vault demo: part 3 (staging area)

Right now we’ve got the basics for the mapping generation in order: source data and a configured Oracle Warehouse Builder project. The next step is to generate the staging area based on the source definitions. First we need to import the source definition metadata so OWB has something to work with. The source definitions can be imported into the 00_SOURCE_SYSTEM_WORLD module. The scripts will use this source folder to select the initial...

Mapping generation for Data Vault demo: part 2 (setting up OWB)

Now that the source data set has been created and is available, it is time to set up the Oracle Warehouse Builder environment. As mentioned earlier, I’ve chosen OWB for this demo because of its powerful scripting capabilities (Tcl / OMB). For the demo environment to work with the supplied scripts you have to place them in the C:\TCL directory, because some scripts call other scripts as a sort of include statement. To create...

Mapping generation for Data Vault demo: part 1 (the source system)

This post provides the basic demo data for the Data Vault mapping generation.