ETL generation using SSIS (baby steps)

Following last week’s discussions on the metadata models I’ve put the last prerequisites in the ETL generation tool and had a more detailed look at actually generating ETL for SSIS based on the available metadata. The ultimate goal is to generate ETL for a variety of ETL tools which have some sort of generation capability (Informatica, Data Stage, SSIS and ODI/OWB) but I’ve picked SSIS to start with creating similar results as demonstrated in the earlier OWB proof of concepts.

The ETL generation uses the data model and constraints to generate a Data Vault integration layer and optional staging and history areas. Last week I’ve made some small gains in getting SSIS packages created via .Net from what is essentially a completely separate web application. The ETL generation tool itself is developed using Visual Studio 2010 and VB .Net and the trick was to get access to the SSIS APIs through this generic Visual Studio development suite, as opposed to SQL Server’s own Visual Studio development tool for SSIS (Business Intelligence Development Studio).

Packages are aimed to be developed for SQL Server 2008.

Below is the process of creating SSIS elements separate from the BIVS. First the references to the following assemblies will need to be added to the Visual Studio project:

– Microsoft.SQLServer.DTSPipelineWrap
– Microsoft.SQLServer.ManagedDTS

These DLL files can be found in the client installation of SQL Server 2008. After that the following code is all that is required to generate a dummy package and a sample dataflow task. I’ve chosen to export the result as a dtsx package file, but you can also save it to SQL Server directly. This is a small step, but it can be used to gradually add more components which are derived from the available metadata. And of course it gives you the complete flexibility in deciding when elements are added to the package.

The code is as follows:

'Import assemblies
Imports Microsoft.SqlServer.Dts.Runtime Imports Microsoft.SqlServer.Dts.Pipeline.Wrapper Imports Microsoft.SqlServer.Dts.Pipeline

Protected Sub Generate_SSIS() Dim package As Microsoft.SqlServer.Dts.Runtime.Package package = New Package package.Name = "MyCreatedPackage"

Dim testapp As Microsoft.SqlServer.Dts.Runtime.Application
testapp = New Application

'Sources and targets (destinations in dataflow)
Dim oledbSource As IDTSComponentMetaData100
Dim oledbDestination As IDTSComponentMetaData100

'Add connection Dim conMgr As ConnectionManager = package.Connections.Add("OLEDB")
conMgr.Name = "OLEDBConnection"
conMgr.ConnectionString = "Provider=SQLOLEDB.1;Integrated Security=SSPI;Persist Security Info=False;Initial Catalog=<database>k;Data Source=<server name>;Auto Translate=False;"

'Add dataflow objects Dim th As TaskHost = package.Executables.Add("DTS.Pipeline")
th.Name = "DataFlow"
Dim dataFlow As MainPipe = th.InnerObject

'Source component info oledbSource = dataFlow.ComponentMetaDataCollection.New()
oledbSource.ComponentClassID = "DTSAdapter.OLEDBSource"
oledbSource.Name = "OLEDBSource"

'Generate and output oledbSource.Instantiate()
oledbSource.RuntimeConnectionCollection.New()

Dim currentDirectory As String = "C:\ETLoutput"
testapp.SaveToXml(currentDirectory + "\test.dtsx", package, Nothing)

End Sub
 
Roelant Vos

Roelant Vos

You may also like...

1 Response

  1. hennie7863 says:

    Hi Roelant,

    Good to see that your looking into SSIS generation. A simple SSIS package can be easiliy generated but more complicated SSIS packages are difficult to generate. I’ve looked into this myself and i think that generating SSIS packages should be more easier than now.

    Gr,
    Hennie

     

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.