The Data Integration Run-time Execution Control Tool (DIRECT) is a generic execution and control framework that orchestrates the execution of ETL processes.
It provides various hooks into an ETL process to manage topics such as restartability, recovery from failure, logging, ETL classification and event handling.
There are many ETL control frameworks, as they are needed in every project. Let’s make this the best one! Ideally this becomes a commodity.
- The data model and sample code can be found in the SQL DBM model (online) or on GitHub. Note that the model was previously hosted on QuickDBD but was moved to SQL DBM after the pricing model changed.
- The corresponding DML is available on GitHub.
- The DIRECT code and content are managed on GitHub. This is a private repository for the time being, but I am more than happy to expand the circle of collaborators. Send me an email if you are interested. The repository also contains the documentation for the ETL Control Framework as a generic process control framework.
Contents / functionality of the tool:
- Runtime execution monitoring & logging
- Models (DDL and DML)
- Disabling / enabling ETL in the control framework
- Recovery, retries
- Managing dependencies and parallelism
- Supporting automation code (re-initialisation, zero key generation, generating process registration records)
- Exception reports (SQL – currently integrated in Confluence)
- SSIS, PowerCenter and Oracle wrappers (the SSIS wrapper is fully up to date; the others are available). There is probably some Pebble code as well.
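To make a few of these items concrete (run-time logging, disabling / enabling a process, and retries), here is a minimal Python sketch of how a control wrapper around an ETL step could behave. It is not the DIRECT implementation. The names `ProcessRegistration`, `RunLogEntry` and `run_with_control`, the in-memory list used as a log, and the status values are all assumptions made for illustration; the actual framework keeps this information in its own database model.

```python
import logging
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Callable, List

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("etl_control")


@dataclass
class ProcessRegistration:
    """Hypothetical registration record for one ETL process."""
    name: str
    enabled: bool = True   # lets the framework skip a process without code changes
    max_attempts: int = 3  # total tries before the run is reported as failed


@dataclass
class RunLogEntry:
    """Hypothetical run-time log record, one per execution attempt."""
    process: str
    attempt: int
    status: str  # "SKIPPED", "SUCCEEDED" or "FAILED" (assumed status values)
    logged_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )


def run_with_control(
    reg: ProcessRegistration,
    etl_step: Callable[[], None],
    run_log: List[RunLogEntry],
) -> str:
    """Run one ETL step under control: honour the enabled flag, log every
    attempt, and retry on failure up to reg.max_attempts times."""
    if not reg.enabled:
        run_log.append(RunLogEntry(reg.name, 0, "SKIPPED"))
        return "SKIPPED"

    for attempt in range(1, reg.max_attempts + 1):
        try:
            etl_step()
        except Exception:
            # Record the failed attempt, then fall through to the next try.
            log.exception("%s failed on attempt %d", reg.name, attempt)
            run_log.append(RunLogEntry(reg.name, attempt, "FAILED"))
        else:
            run_log.append(RunLogEntry(reg.name, attempt, "SUCCEEDED"))
            return "SUCCEEDED"

    return "FAILED"
```

A caller would pass the ETL step as a plain function, for example `run_with_control(ProcessRegistration("load_customer"), load_customer, run_log)`, where `load_customer` is a hypothetical step. Keeping the control logic outside the step itself is the point of the sketch: the same wrapper could sit in front of any ETL tool.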