Future plans for ETL mapping generation

After the OWB exercises (although they’re not fully complete yet) I will have a go at generating Informatica mappings. Informatica does not supply its own scripting component as OWB does, but it does have (limited) capabilities for mapping generation. The way I see it there are three options:

1. Use the tools supplied (Data Stencils, the Excel plug in)
2. Hack into the repository
3. Parse XML 

The third option would probably yield the best results, because the supplied tools are too inflexible in their current state. For instance, you can’t update existing mappings, you have to add your sources and targets manually after generating a mapping, and there are no options for user input (prompts / decisions). As for the second option, well, let’s just say there are warranty risks involved :-).

I’m thinking of creating some sort of web application to upload XML source definition files and follow basically the same steps as I did with the OWB demo. This time, however, the application should export XML files which can be (automatically) imported into Informatica using its command line tool (pmrep). Perhaps the parsing and creation of these XML files can be done using PHP or Java or something like that. If anyone has any samples of this I would be very grateful! It will probably be a time-consuming exercise, so don’t expect anything soon. I’ll keep posting progress.

One thing I did learn today is that you need the Informatica-supplied DTD file to link to an XML file. The DTD is specific to the exact version, and this has an impact on the way the XML files are structured. Before I do any of this, however, I need to complete the architecture sections in this guide. They have already become outdated to some extent!
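
The parse-and-export idea could start along these lines. This is only a minimal Python sketch (PHP or Java would work just as well). The element and attribute names (SOURCE, SOURCEFIELD, TARGET, TARGETFIELD), the sample data and the STG_ prefix are assumptions for illustration and would have to be checked against the Informatica-supplied DTD for your version. The sample leaves out the DOCTYPE line and the pmrep import step.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified source definition export. The real structure
# is dictated by the Informatica DTD and may differ per version.
SAMPLE_SOURCE_XML = """<POWERMART>
  <REPOSITORY NAME="DEMO_REPO">
    <FOLDER NAME="DEMO_FOLDER">
      <SOURCE NAME="CUSTOMER" DBDNAME="SRC_DB">
        <SOURCEFIELD NAME="CUSTOMER_ID" DATATYPE="number" PRECISION="10" SCALE="0"/>
        <SOURCEFIELD NAME="CUSTOMER_NAME" DATATYPE="varchar2" PRECISION="100" SCALE="0"/>
      </SOURCE>
    </FOLDER>
  </REPOSITORY>
</POWERMART>"""


def read_source_fields(xml_text):
    """Return {source name: [field attribute dicts]} for every SOURCE element."""
    root = ET.fromstring(xml_text)
    sources = {}
    for source in root.iter("SOURCE"):
        fields = [dict(field.attrib) for field in source.findall("SOURCEFIELD")]
        sources[source.get("NAME")] = fields
    return sources


def build_target_xml(source_name, fields, prefix="STG_"):
    """Build a target definition that mirrors the source columns one to one."""
    target = ET.Element("TARGET", NAME=prefix + source_name)
    for field in fields:
        ET.SubElement(
            target,
            "TARGETFIELD",
            NAME=field["NAME"],
            DATATYPE=field["DATATYPE"],
            PRECISION=field["PRECISION"],
            SCALE=field["SCALE"],
        )
    return ET.tostring(target, encoding="unicode")


if __name__ == "__main__":
    for name, fields in read_source_fields(SAMPLE_SOURCE_XML).items():
        print(build_target_xml(name, fields))
```

The generated fragment would still need to be wrapped in the full export structure (repository, folder, DOCTYPE) before an import could succeed, and the user input steps (prompts / decisions) would sit between reading and writing.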

 
Roelant Vos


2 Responses

  1. Apex says:

    Hi Ravos,

    The first way is the easiest, but it only lets you build very simple mappings… at least I failed to make anything complex.
    The second way can be done without true hacking 🙂 if you have strong Java knowledge, the Infa SDK and a lot of free time :)
    The third way is also quite painful, mostly because of the lack of documentation for INFA’s XML format.

    So I came to the conclusion that it’s easier to generate a set of SQLDDL and use INFA just to control the workflows. It’s actually sad that one of the most popular ETL tools doesn’t have the means to automate routine work…

     
    • Roelant Vos says:

      Hi Apex,

      Yeah, I toyed with the Informatica-supplied tools but also failed to create anything complex. I had some Mapping Architect training at Informatica World but wasn’t really impressed. It’s a pity! I want to make use of the Designer to store metadata, comments, process info and so forth, and hopefully generating XML files will enable me to do the same things as OWB does. Now I only have to think of a way to do that 🙂 I’ll probably need to store intermediate results in some database, because I don’t have the complete metadata at hand like I have with OWB.

      I’ve been mailing around for some examples and so far I know the following:
      – There are some limited Data Vault templates circulating based on the XML parsing idea (I don’t have them though)
      – A colleague of mine created a mapping generation Excel sheet with VB to parse XML (with some user input options) and that works well for specific situations
      – Dan Lindstedt is working on a SaaS application to generate ETL

      If I have anything useful I’ll email it to you, but most likely it will take some time to get started.

      Regards,
      Roelant

       
