PowerCenter transaction control with XML/text file targets
I encountered a situation where I had to populate XML files with general ledger / account balancing information. A specific set (batch) would be a journal entry (header), and all the related transactions would be its line items. But if any of these transactions is incorrect in some way, I want the entire batch to be rejected. For a typical relational database target I would have used the Transaction Control transformation to log an error and then roll back the batch on encountering an error (or commit the batch if all the transactions within it are OK). Logging an error in that setup would mean using a stored procedure or spamming the session log files. In my particular situation this was not possible, and the structure of XML files further complicates matters. Therefore I created a separate transaction control mechanism, based on the port processing order concept in Informatica, to reject the entire set if any of its members triggers an error.
The mechanism is really easy:
- Implement error handling and flag records that contain an error.
- Sort on the set / batch ID and the error flag so that the erroneous records are processed first. I always use TRUE or FALSE for the flag, so in my case I set the Sorter to ascending on batch ID and descending on error flag (in alphabetical order TRUE sorts after FALSE, so descending puts it first). Since the Sorter is an active transformation, all records pass through it before further processing commences.
- Create a record counter per batch of transactions. Every time a new batch appears (the records are sorted by now) the count is reset to 1. By the definition of the sort order you know that if one of the records within a batch contains an error, it will be the first record of that batch. Its error flag is stored in a separate variable, and every subsequent record in the batch inherits this value; alternatively, depending on what you want to do with it, the value is added to (re)calculate an error bitmap.
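The three steps above can be sketched in plain Python. This is a hypothetical illustration, not the actual mapping: the dict layout and the TRUE/FALSE string flags are assumptions chosen to mirror the sorter's alphabetical ordering.

```python
def tag_batches(records):
    """Sort so erroneous records lead each batch, then flag whole batches."""
    # Descending on error flag first ("TRUE" sorts after "FALSE", so
    # reverse=True puts TRUE first), then a stable sort ascending on batch ID.
    ordered = sorted(records, key=lambda r: r["error_flag"], reverse=True)
    ordered = sorted(ordered, key=lambda r: r["batch_id"])

    tagged, prev_batch, batch_error = [], None, "FALSE"
    for rec in ordered:
        if rec["batch_id"] != prev_batch:
            # New batch: its first (sorted-first) record decides the flag.
            prev_batch = rec["batch_id"]
            batch_error = rec["error_flag"]
        tagged.append({**rec, "batch_error": batch_error})
    return tagged
```

A downstream filter or router can then drop (or divert) every record whose `batch_error` is TRUE, rejecting the whole batch.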
In my case, error messages are stored at batch level, but this can easily be changed to transaction / line-item level. So, how is it done? The essence is in the expression EXP_tag_other_batch_records_on_error, which relies on the correct ordering done by the preceding Sorter. Its ports, from top to bottom:
- The first line will receive the batch number for each record
- The second line compares the new information in line 1 with the information in line 3 (the previous batch number). Because of the order in which Informatica evaluates the ports, line 2 is processed before line 3 receives the value of the row being processed; effectively, line 3 still contains the previous row's state when line 2 is evaluated. If lines 1 and 3 are equal, the current record is in the same batch as the previous one.
- The previous batch number is used in the formula of line 2, and is only now overwritten with the current value (after line 2 has been evaluated).
- This is the counter of rows per batch; every time the line 2 variable is 1, the counter is reset (value = 1). Otherwise this line 4 variable is incremented by 1. The following formula is used: IIF(New_Batch_Number_flag = 1, 1, IIF(Previous_Batch_Number = Batch_Number, Batch_line_count_var + 1, 1))
- This is the current attribute information for the record being processed.
- The sixth line will receive the error flag information from the record being processed.
- Because of the sort order, if one of the records within a batch contains errors it will arrive first; so when a new batch is encountered (new-batch flag = 1), the value in line 6 is stored in this variable. Otherwise the variable is left unchanged.
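To make the port evaluation order concrete, here is a hypothetical Python rendering of the expression's ports, evaluated top to bottom for each row exactly as described above. The variable names mirror the text; the initial values and the row layout are assumptions for the sketch.

```python
def process(rows):
    previous_batch = None    # line 3: still holds the prior row's value
    batch_line_count = 0     # line 4: per-batch row counter
    batch_error = "FALSE"    # line 7: error flag carried across the batch
    out = []
    for row in rows:  # rows are assumed pre-sorted: batch asc, error desc
        batch_number = row["batch_id"]                           # line 1
        new_batch = 1 if batch_number != previous_batch else 0   # line 2
        previous_batch = batch_number       # line 3: overwritten only now
        batch_line_count = 1 if new_batch == 1 else batch_line_count + 1  # line 4
        error_flag = row["error_flag"]                           # line 6
        if new_batch == 1:
            batch_error = error_flag        # line 7: first row decides
        out.append((batch_number, batch_line_count, batch_error))
    return out
```

Because `previous_batch` is assigned only after `new_batch` has been computed, the sketch reproduces the key PowerCenter behaviour: a variable port read above its assignment still holds the previous row's value.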
The other lines work in similar fashion. Depending on your situation, errors can be concatenated (accumulated across rows) or transformed into an error bitmap, and then routed however the situation requires.
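As an illustration of the bitmap variant, a running integer can accumulate one bit per error type across the rows of a batch, just as the counter variable accumulates across rows. The error names and bit values here are invented for the example:

```python
# Hypothetical error-type-to-bit mapping; one bit per error condition.
ERROR_BITS = {"missing_account": 1, "bad_amount": 2, "bad_date": 4}

def accumulate_bitmap(row_errors_per_batch):
    """OR each row's error bits into a single batch-level bitmap."""
    bitmap = 0
    for row_errors in row_errors_per_batch:
        for err in row_errors:
            bitmap |= ERROR_BITS[err]
    return bitmap
```

The resulting integer can be written to the batch header, so a single value tells you every error condition that occurred anywhere in the batch.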