Roelant Vos An expert view on Agile Data Warehousing

Mapping generation for Data Vault demo: part 3 (staging area) 0

Mapping generation for Data Vault demo: part 3 (staging area)

 

 Right now we’ve got the basics for the mapping generation in order: source data and a configured Oracle Warehouse Builder project. The next step would be to generate the staging area based on the source definitions. First we need to import the source definitions metadata so OWB has something to work on. The source definitions can be imported as source definition metadata to the 00_SOURCE_SYSTEM_WORLD module. The scripts will use this source folder to select the initial...

0

Mapping generation for Data Vault demo: part 2 (setting up OWB)

 

 Now that the source data set has been created and is available it is time to initially set up the Oracle Warehouse Builder environment. As mentioned earlier, I’ve chosen OWB for this demo because of the powerful scripting capabilities (TCL / OMB). For the demo environment to work with the supplied scripts you have to place them in the C\:TCL directory because some scripts call other scripts as some sort of include statement. To create...

Mapping generation for Data Vault demo: part 1 (the source system) 0

Mapping generation for Data Vault demo: part 1 (the source system)

 

This post provides the basic demo data for the Data Vault mapping generation.

New design pattern added 0

New design pattern added

 

 I’ve added the design pattern (004) which includes the basic ETL ground rules. The architecture section is also almost completed (as a draft I mean). This means that all the basics are present to start the mapping generation testing.

Demo environment ready 0

Demo environment ready

 

 Yesterday I completed the setup of a virtual machine containing an Oracle database, OIBEE as reporting tool and both Powercenter and Oracle Warehouse Builder as ETL software. On the client side I installed OWB and Powercenter (obviously) and Informatica Mapping Architect (formerly: Data Stencils / the Visio mapping template). With this demo environment I can resume the tryouts of mapping generation based on the reference architecture on this blog. Theoretically the entire system could be...

The importance of consistency 2

The importance of consistency

 

 Some say it’s just being picky and that it slows down development. But strictly enforcing consistency in your BI / DWH environment will actually help to greatly speed up the whole development process. In discussions about standards, conventions, concession and consistency the argument usually is that being consistent only makes the system easier to manage for the system control (and they would think ‘why bother now, that’s not our problem’).  And often people will say...

Easy way to filter carriage returns with Powercenter 2

Easy way to filter carriage returns with Powercenter

 

 Being lazy, today I used a filter expression in Powercenter to filter carriage return / line feeds from source text files when loading source data to ETL staging tables. And it works quite well! It’s just a manner of adding the following filter:    The expression itself is: <first attribute of line> != ‘’ It really worked quite well and it saved us from using all kinds of scripts to clean up the file before...

Loading Messages into ETL staging tables 0

Loading Messages into ETL staging tables

 

 Messages are composed of one header and one or more segments which hold the actual data. These are demands a DWH should set for loading messages into ETL staging tables: The header should contain information about the version of the message definition and version of the header, the source system where the message originated from and a description. The message segments can occur not, one or many times. This depends on the type of information...

Types of errors 0

Types of errors

 

 There are two main categories of errors: technical errors and business rule errors (functional). Both types of errors should be classified separately because both provide different information about the nature of the error. A few examples are listed here:  –          Technical errors: invalid data type conversion, duplicate insert, unable to delete or move files or missing primary or foreign keys –          Functional errors: ‘the sum of attribute A and B does not equal C’ or...