Roelant Vos An expert view on Agile Data Warehousing

Demo environment ready 0

Demo environment ready

 

 Yesterday I completed the setup of a virtual machine containing an Oracle database, OIBEE as reporting tool and both Powercenter and Oracle Warehouse Builder as ETL software. On the client side I installed OWB and Powercenter (obviously) and Informatica Mapping Architect (formerly: Data Stencils / the Visio mapping template). With this demo environment I can resume the tryouts of mapping generation based on the reference architecture on this blog. Theoretically the entire system could be...

The importance of consistency 2

The importance of consistency

 

 Some say it’s just being picky and that it slows down development. But strictly enforcing consistency in your BI / DWH environment will actually help to greatly speed up the whole development process. In discussions about standards, conventions, concession and consistency the argument usually is that being consistent only makes the system easier to manage for the system control (and they would think ‘why bother now, that’s not our problem’).  And often people will say...

Easy way to filter carriage returns with Powercenter 2

Easy way to filter carriage returns with Powercenter

 

 Being lazy, today I used a filter expression in Powercenter to filter carriage return / line feeds from source text files when loading source data to ETL staging tables. And it works quite well! It’s just a manner of adding the following filter:    The expression itself is: <first attribute of line> != ‘’ It really worked quite well and it saved us from using all kinds of scripts to clean up the file before...

Loading Messages into ETL staging tables 0

Loading Messages into ETL staging tables

 

 Messages are composed of one header and one or more segments which hold the actual data. These are demands a DWH should set for loading messages into ETL staging tables: The header should contain information about the version of the message definition and version of the header, the source system where the message originated from and a description. The message segments can occur not, one or many times. This depends on the type of information...

Types of errors 0

Types of errors

 

 There are two main categories of errors: technical errors and business rule errors (functional). Both types of errors should be classified separately because both provide different information about the nature of the error. A few examples are listed here:  –          Technical errors: invalid data type conversion, duplicate insert, unable to delete or move files or missing primary or foreign keys –          Functional errors: ‘the sum of attribute A and B does not equal C’ or...

Ideas for general error handling, why need error tables? 3

Ideas for general error handling, why need error tables?

 

 The possible scenarios regarding error and exception handling are limited. You can either: –          Detect an error, stop the process and present the error code. –          Detect an error and write the record in an error table with the corresponding  code. –          Detect an error and write the record in both the target and the error table with the error code. –          Detect an error, flag the record but write it to the DWH table...

A plea against IT abbreviations and upper-casing 0

A plea against IT abbreviations and upper-casing

 

 Every project I have worked on there is an abundance in abbreviations used for almost anything, from project names (codes) to mappings, (source) systems, workflows, team member names and database attributes. Even worse, the meaning of these abbreviations is lost or vague and in some cases even wrong (because the ‘real’ abbreviation is already taken)! Every time I feel it makes it so much harder for new team members to grasp the meaning of the system and/or...