Update on development & collaboration efforts

Development update

This is a status update for the small community of open source developers and collaborators who use, or contribute to, the growing ecosystem of metadata management and Data Vault generation tooling.

I’ve worked hard on creating a more stable set of tooling around ETL generation. As a result, the TEAM and VEDW applications (as well as some others) have now been completely separated, with a view to increasing interoperability with other ideas and approaches.

This update primarily concerns the TEAM and VEDW applications, although some very interesting developments on various templating engines and domain-specific languages are progressing as well.

TEAM (management of ETL generation metadata)

For those who have subscribed to the GitHub repository, the latest release (1.5.1) of TEAM can be found here: https://github.com/RoelantVos/TEAM/releases/tag/1.5.1.

You can also download it from Google Drive at: https://drive.google.com/open?id=1Sx-0J2rzppKwV3wRPL3PoN6MdO2k2J_4.

Changes include:

  • Introduced an installer / setup component
  • Created a separate configuration screen
  • Separated and stored the output and configuration paths, so that different configurations can be saved in separate locations
  • Added JSON support for saving, opening and editing Table Mapping and Attribute Mapping metadata. Note that a SQL Server back-end is still required for activation (validation and preparation of metadata).
  • Added JSON-based data stores for the physical model storage (MD_VERSION_ATTRIBUTE table)
  • Removed the legacy Driving Key Indicator from the physical model; it has been superseded by metadata that is required in the Table Mapping. This is also a repository change, because the information no longer needs to be stored.
  • Updated the repository version to v.1.4.1 to reflect the dropping of DRIVING_KEY_INDICATOR. The repository can be re-created from the application (repository menu).
  • Various bug fixes (see GitHub), including preserving attribute order (i.e. in support of Same-As Links)

The move towards storing the essential ETL generation metadata in JSON is a step towards a back-end that does not rely on a relational database management system. It is also a step towards an agreed interface format in which all information is made available to consuming applications, such as VEDW and others.
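To make the idea of JSON as an interchange format a little more concrete, here is a minimal sketch in Python. The field names (`sourceTable`, `targetTable`, `attributeMappings` and so on) and the table names are assumptions made up for illustration; they are not TEAM's actual JSON schema. The sketch only shows that mapping metadata can be written to and read from JSON without a database, and that the order of the attribute mappings survives the round trip because JSON arrays are ordered.

```python
import json

# Hypothetical mapping metadata. The field names are illustrative only
# and are NOT TEAM's actual JSON schema.
table_mappings = [
    {
        "sourceTable": "STG_CUSTOMER",
        "targetTable": "HUB_CUSTOMER",
        "businessKey": "CUSTOMER_ID",
        "attributeMappings": [
            {"sourceColumn": "CUSTOMER_ID", "targetColumn": "CUSTOMER_ID"},
            {"sourceColumn": "CUSTOMER_NAME", "targetColumn": "NAME"},
        ],
    }
]


def to_json(mappings):
    """Serialise the mapping metadata to a JSON string."""
    return json.dumps(mappings, indent=2)


def from_json(text):
    """Read mapping metadata back from a JSON string."""
    return json.loads(text)


# Round trip: no relational database is involved at any point.
restored = from_json(to_json(table_mappings))
```

Any consuming application that understands the agreed format could read the same text, which is the point of using a plain file as the interface.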

The idea is that TEAM will continue to focus on minimising data entry for ETL generation, and on simplifying the design process in general. At the same time, the required physical model metadata will need to be available for generation up front, as opposed to being queried at runtime by generation engines. I have many examples of how this works and will publish these in the near future as separate blog posts.
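As a rough illustration of generating from stored physical model metadata rather than querying the database catalog at runtime, consider the sketch below. The JSON layout (`table`, `column`, `dataType`, `ordinal`), the table and column names, and the helper functions are all hypothetical; the real MD_VERSION_ATTRIBUTE structure may look quite different.

```python
import json

# Hypothetical snapshot of physical model metadata, as it might be saved
# to a file. This layout is an assumption, not TEAM's actual format.
PHYSICAL_MODEL_JSON = """
[
  {"table": "HUB_CUSTOMER", "column": "CUSTOMER_ID", "dataType": "NVARCHAR(100)", "ordinal": 2},
  {"table": "HUB_CUSTOMER", "column": "CUSTOMER_SK", "dataType": "BINARY(16)", "ordinal": 1},
  {"table": "STG_CUSTOMER", "column": "CUSTOMER_ID", "dataType": "NVARCHAR(100)", "ordinal": 1}
]
"""


def columns_for(table, model):
    """Return the columns of one table, sorted by their stored ordinal."""
    rows = [row for row in model if row["table"] == table]
    return sorted(rows, key=lambda row: row["ordinal"])


def create_table_ddl(table, model):
    """Build a CREATE TABLE statement purely from the stored metadata."""
    columns = columns_for(table, model)
    if not columns:
        raise KeyError(f"No physical model metadata found for {table}")
    body = ",\n".join(f"  {c['column']} {c['dataType']}" for c in columns)
    return f"CREATE TABLE {table} (\n{body}\n);"


model = json.loads(PHYSICAL_MODEL_JSON)
ddl = create_table_ddl("HUB_CUSTOMER", model)
```

Because the column names, data types and ordering come from the file, the generation step never needs a live connection to the target database.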

Making the metadata available up front fits in with the idea that a single metadata file will contain all the information required for the entire generation (per individual process or across all). The (ETL) generation itself then only has to apply the template and compile the result for whatever target platform is required.

The next releases of TEAM will focus on delivering this single-file approach, while still aiming to minimise data entry in the front-end.
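The "metadata file plus template" idea can be sketched in a few lines of Python using the standard library's `string.Template`. Everything here is an assumption for illustration: the metadata fields, the table names and the heavily simplified Hub-style loading pattern are not taken from TEAM or VEDW, which may use entirely different formats and templating engines.

```python
import json
from string import Template

# Hypothetical metadata for a single process. The field names are
# illustrative and do not reflect TEAM's actual interface format.
METADATA_JSON = """
{
  "sourceTable": "STG_CUSTOMER",
  "targetTable": "HUB_CUSTOMER",
  "businessKey": "CUSTOMER_ID"
}
"""

# Simplified, illustrative loading pattern: insert keys not yet present.
HUB_TEMPLATE = Template(
    "INSERT INTO $targetTable ($businessKey, LOAD_DATETIME)\n"
    "SELECT DISTINCT stg.$businessKey, SYSDATETIME()\n"
    "FROM $sourceTable stg\n"
    "WHERE NOT EXISTS (\n"
    "  SELECT 1 FROM $targetTable hub\n"
    "  WHERE hub.$businessKey = stg.$businessKey\n"
    ");"
)


def generate(metadata_json, template):
    """Apply a template to one metadata document and return the output."""
    metadata = json.loads(metadata_json)
    # substitute() raises KeyError if the metadata lacks a placeholder,
    # so incomplete metadata fails loudly instead of generating bad SQL.
    return template.substitute(metadata)


sql = generate(METADATA_JSON, HUB_TEMPLATE)
```

Swapping `HUB_TEMPLATE` for a template that targets another platform would leave the metadata untouched, which is the separation this approach is after.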

Updates on work-in-progress are available on the Github at https://github.com/RoelantVos/TEAM/wiki/Work-In-Progress—Development-Branch.

Virtual EDW (VEDW)

To keep things consistent I have simplified the VEDW application, which now requires TEAM to be available for operation.

The latest version can be found on Google Drive: https://drive.google.com/open?id=1T8fVvhHm_jjXbizOFx7msoajsVmn86ra.

 
Roelant Vos
