What is happening with the software development ecosystem? A status update.
I would like to provide a brief update on what is in development within the Data Warehouse Automation / ETL generation ecosystem of open source tooling. Instead of working on new articles, I have spent most of my time pushing through some of the bigger development changes. This is what is currently in progress and can be expected in the short term:
- The Data Integration Framework and DIRECT repositories are being finalised for release as ‘full’ open source (public repositories). Only a few tidy-up activities remain.
- The Generic Interface for Data Warehouse Automation (already a public repository) has been updated and is under review by the working group. This is an exciting development that I believe will shape the way the other tools interact.
- The Virtual Data Warehouse tool (VDW – formerly known as VEDW) is being refactored to support templating engines. This will allow users to modify existing templates and add their own, so they control how code is generated. This is a major revision that also removes any logic handled by the tool itself, which means all metadata will now be provided through TEAM using the Generic Interface for Data Warehouse Automation. Work is expected to be completed in the next few weeks.
- The metadata management tool TEAM is being upgraded to incorporate the functionality required to support the Generic Interface for Data Warehouse Automation. For example, TEAM will now also allow mapping of source-to-staging or staging-to-persistent-staging processes. All required metadata will be provided by TEAM, so VDW does not need to do any lookups and can simply use whatever metadata is made available. This requires various changes to the underlying metadata repository. Work on this is also expected to be completed in the next few weeks.
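
To make the split between metadata and templates a bit more concrete, here is a minimal sketch of the idea in Python. It is an illustration only, not how VDW or TEAM are implemented: the metadata layout, the template text and the `generate_code` function are all hypothetical, the dictionary does not follow the actual Generic Interface schema, and Python's built-in `string.Template` stands in for whichever templating engine is used.

```python
from string import Template

# Hypothetical metadata for one source-to-staging mapping, standing in for
# what a metadata tool could hand over. The keys are illustrative only and
# are not the Generic Interface schema.
mapping = {
    "source_table": "SRC_CUSTOMER",
    "target_table": "STG_CUSTOMER",
    "columns": ["CUSTOMER_ID", "CUSTOMER_NAME"],
}

# A user-editable template. Changing this text changes the generated code
# without touching the generator itself.
USER_TEMPLATE = Template(
    "INSERT INTO $target_table ($column_list)\n"
    "SELECT $column_list\n"
    "FROM $source_table;"
)


def generate_code(template, mapping):
    """Fill the template using only the supplied metadata (no lookups)."""
    column_list = ", ".join(mapping["columns"])
    return template.substitute(
        source_table=mapping["source_table"],
        target_table=mapping["target_table"],
        column_list=column_list,
    )


sql = generate_code(USER_TEMPLATE, mapping)
print(sql)
```

The point of the sketch is that the generator holds no logic of its own: everything it needs arrives in the metadata, and the shape of the output is decided entirely by the template.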