Taking into account the projects we carry out, Advanced Business Intelligence (BI) consulting is one of the most in-demand jobs, as stated in some digital newspapers, including Finanzas.com. Therefore, having powerful tools that are simple and adaptable to our changing requirements will mean an increase in the quality of our services. Solid Quality Mentors is considered the leader of global solutions for the entire Microsoft Data Platform, which provides us and others the tool SQL Server Integration Services (SSIS), which is the subject of this article. BI projects inevitably require the consolidation of efficient information. Since such consolidation is achieved through SSIS, and its design is personalized as potentially “similar” in all scenarios in which we have worked to date, automating its creation is a good idea.

Through SolidQ, we’ve noticed that customers often duplicate databases with the same structure to provide what they call “partitions,” which is, in reality, a manual “sharding”; they replicate tables, renaming them to get the desired partition. This practice only adds complexity to the process of Data Warehouse creation, where the unified and “clean” information must be located in order to be utilized correctly.

Apart from this process complexity in the execution of a BI project where the creation of SSIS packages is involved, and because they are “live” projects and typically of medium duration, sometimes the source objects (tables) are modified to add columns, change data types, or delete columns and we are forced to redo the work in SSIS, which is unnecessarily time-consuming.

In this article, we will explain how to create SSIS packages in which we can unify all the information with just two clicks. Moreover, given that using the libraries will not require direct human intervention in the creation of the SSIS packages, a change in the source or destination object of our packages will only require pushing the bottom “generate” key of our generator. Through extensive experience, we have learned that if our systems have more than 50 or 60 tables, we save considerable time doing it this way. If we have to repeat the process only once, because at the end we notice a missed detail (little or not), with half the tables, we would have saved time writing this code.

In short, this article will show how to create a Console C# application that will create SSIS packages without human intervention, except to indicate the source and destination.

Requirements to reproduce the article

The following requirements are recommended to reproduce the article:

  • Visual Studio 2010: Express edition.
  • SQL Server 2008 R2 (the Standard Version is sufficient for SSIS and testing of our created packages.
  • Knowledge of SSIS: the ability to create SSIS packages, deploy and use them.
  • Knowledge of C# programming and management of Visual Studio 2010:
  • A basic understanding of object-oriented programming methodologies and a workable knowledge of Visual Studio 2010. For downloading of the solution code at the SolidQ site, Click here.

Dtsx Package

Before starting, please understand that the creation of SSIS packages by code is not well documented. Most of what you read through the above requirements listing comes from test-error, as well as from reverse engineering, analyzing the objects’ performance and the resulting xml. No copyrights have been infringed upon. To learn how a method works correctly, we sometimes have to see what is generated and compare the results with those generated in Visual Studio.

Moreover, remember that an integration services package is just an XML that contains the needed Meta information to enable the process dtexec.exe to accomplish what we have indicated it will do.

For example, we use BBDD AdventureWorks and move information to a new BBDD that we call Staging.

The goal

Before beginning, be sure you have identified your goal. The best way to do this is to create the package using the tool best for this endeavor, Visual Studio. Once you know what you want to obtain, we can better provide a solution per code. It is common for first-time users to create “n” identical packages, in which something changes.
Start from a very simple code to lay the foundation for what will codify SSIS programmatically. The simplest example involves moving data from a source table to a destination table. To achieve this goal using the SSIS generator, add a DataFlow component (see figure 1), and inside it, two OleDbSource y OleDbDestination components. But in advance, it’s normal to add connection objects. See figure 0.

Figure 0

Figure 0

Figure 1

Figure 1

Figure 3

Figure 3

In the configuration of OleDbSource, indicate which object you want as source (Figure 3) and in OleDbDestination, indicate which object you want as destination (Figure 4).

Figure 3

Figure 3

Figure 4

Figure 4

In short, you will move data from the table HumanResources.Employee of BBDD AdventureWorks, to the table dbo.employee of BBDD Staging.
NOTE: The only requirement is that the destination table, which we call “Employee,” should exist.

Generating packages programmatically

For the building of the SSIS creator, use Visual Studio 2010. Although the Visual Studio 2008 version is the only one used to create SSIS 2008 R2 packages, in this instance, use the libraries that Visual Studio 2008 utilizes; the development environment employed makes no difference, since you are building your own SSIS generator.

First, create a solution consisting of:

  • Library Project (EnriqueCatalaSSISGenerator): it will contain the code of the SSIS generator.
  • Console Project (DemoSSISGenerator): it will use the library project to create the package.
Figure 5

Figure 5

The next step: add the libraries needed for the creation of the SSIS generator library.

These libraries are typically inside folder SDK of Microsoft SQL Server (see figure 6).

C:Program Files (x86)Microsoft SQL Server100SDKAssemblies:

Figure 6

Figure 6

Start working on the codification for the methods needed for the SSIS code generator.

NOTE:  you can download this solution here.

SSIS generation library

Taking into account the kind of SSIS package you want to model, start working on the codification of the methods needed, which must cover the following scenarios:

Add OLE DB connection objects

gssisppp1-7

Add Data Flow components

gssisppp1-8

Add OleDbSource components

gssisppp1-9

Add OleDbDestination components

gssisppp1-10

Main method of package generation that the previous ones use.

Adding OLE DB Connection objects

To add an OleDb connection to a SSIS package, create an object of class “ConnectionManager,” initializing its properties ConnectionString and Name. While ConnectionString will identify a valid connection string, the property Name will identify the name of the connection as seen when editing the package.

Use the following code to add OleDb connections to our packages by code:

protected ConnectionManager AddOleDbConnection(Package pck, string nameConexion, string CadenaConexion)
{
ConnectionManager cm;
cm = pck.Connections.Add(“OLEDB”);
cm.ConnectionString = CadenaConexion;
cm.Name = nameConexion;
return (cm);
}

 

Adding DataFlow components

Use the following set of Executables of the Package object called “STOCK:PipelineTask.”

protected virtual TaskHost AddDataFlow(Package pck, string dataFlowName)
{
Executable e = pck.Executables.Add(“STOCK:PipelineTask”);
TaskHost thMainPipe = (TaskHost)e;
thMainPipe.Name = dataFlowName;
return (thMainPipe);
}

 

Adding OleDbSource components

In the scenario being modeled, you are taking data directly from a table, identifying it only by its name (we haven’t specified any select).

First: create a standard object IDTSComponentMetaData100, which will identify your connection object to the source.  Next:  identify that component in its property ComponentClassID, indicating the value “DTSAdapter.OLEDBSource,” in order to make it an OleDbSource.  Now: launch the method Instantiate() and ProvideComponentProperties(), to ensure the object acquires its OleDBSource properties. Indicate in its properties that it will be “AccessMode = 0” and that the type “OpenRowset” will point at the name of the table with scheme that we have executed (remember, in the example, it will be HumanResources.Employee.)

Finally:  initialize its metadata (see code at the try block). This last part will open the connection against the SQL Server, and it will try to initialize them. If the “connection” object has been incorrectly named, an error occurs.

protected virtual IDTSComponentMetaData100 AddSourceOleDbFromTable(PWrap.MainPipe flujo, string nombreComponente, string no
mbreTablaConEsquema, ConnectionManager connection)
{
PWrap.IDTSComponentMetaData100 conexionAOrigen = flujo.ComponentMetaDataCollection.New();
conexionAOrigen.Name = nombreComponente;
conexionAOrigen.ComponentClassID = “DTSAdapter.OLEDBSource”;
PWrap.CManagedComponentWrapper instance = conexionAOrigen.Instantiate();
instance.ProvideComponentProperties();
conexionAOrigen.RuntimeConnectionCollection[0].ConnectionManager = DtsConvert.GetExtendedInterface(connection);
conexionAOrigen.RuntimeConnectionCollection[0].ConnectionManagerID = connection.ID;
conexionAOrigen.Name = nombreComponente;
instance.SetComponentProperty(“AccessMode”, 0);
instance.SetComponentProperty(“OpenRowset”, nombreTablaConEsquema);
// Reinit metadata
try
{
instance.AcquireConnections(null);
instance.ReinitializeMetaData();
instance.ReleaseConnections();
}
catch (Exception e)
{
throw ;
}
return (conexionAOrigen);
}

 

Adding  OleDbDestination components

At this point, you must provide the code that identifies the OleDbDestination. Unlike in the previous methodology, where you only needed to provide a connection object to a destination and a name to the table, you will need to provide a source component, because in this case, you are going to connect the OleDbSource component directly to OleDbDestination.

gssisppp1-11

The code of initialization is very similar, and these are the only changes:

  • ComponentClassID will be “DTSAdapter.OleDbDestination” since you want an object of destination type.
  • Now, assign the properties:
    1. AccessMode = 3 (to load identification of an object name).
    2. FastLoadOptions = “TABLOCK, CHECK_CONSTRAINTS” (you want fast loading)
    3. OpenRowset = destination object name (in this case caso dbo.employee).
  • Once you have indicated the properties in the loop for each that can be seen in the code, map the columns of the OleDbSource component recently created.
Listing 4

Listing 4

Package generation

Once you have the previous code, only use the methods that have been previously created to generate your package. Note the following code. See listing 5.

public void GeneratePackage(string sourceConnectionString,string destinationConnectionString)
{
/// Open connection to source
///
OleDbConnection cn = new OleDbConnection(sourceConnectionString);
cn.Open();
/// Package creation
Package Mipk = new Package();
///We can specify a package name (optional)
Mipk.Name = “PackageName” ;
Application app = new Application();
///Source OleDb connection
ConnectionManager connOrigen = AddOleDbConnection(Mipk, “ORIGEN”, sourceConnectionString);
///Destination OleDb connection
ConnectionManager connDestino = AddOleDbConnection(Mipk, “DESTINO”, destinationConnectionString);
// Add DataFlow
PWrap.MainPipe df = AddDataFlow(Mipk, “SolidQJ Dataflow”).InnerObject as MainPipe;
/// Add Source oledb connector
IDTSComponentMetaData100 source = AddSourceOleDbFromTable(df,“Source component”, “humanresources.employee”, connOrigen);
/// Add destination oledb connector
AddOleDbDestinationTable(df,connDestino,source,“dbo.Employee”);
///Save the file to folder
app.SaveToXml(@”c:bbddpackage.dtsx”,Mipk,null);
}

 

Console project

In the console application, initialize an object of your class generator code and give it the value preferred.

static void Main(string[] args)
{
EnriqueCatalaSSISGenerator.EnriqueCatalaSSIS packageGenerator = new EnriqueCatalaSSISGenerator.EnriqueCatalaSSIS();
String destinationCS = @”Data Source=(local)sql2008r2;Initial Catalog=Staging;Provider=SQLNCLI10.1;Integrated S
ecurity=SSPI;Application Name=SSIS-Package;Auto Translate=False;”;
String sourceCS = @”Data Source=(local)sql2008r2;Initial Catalog=AdventureWorks;Provider=SQLNCLI10.1;Integrated
Security=SSPI;Application Name=SSIS-Package;Auto Translate=False;”;
packageGenerator.GeneratePackage(sourceCS,destinationCS);
}

 

After opening and launching the .dtsx package, the result of its execution can be seen here:

gssisppp1-13

Conclusion

As we have seen, the SSIS generation process by code is a relatively simple task, if you know its basic features. However, its codification involves the need for the generation of inside components and objects, the details of which were not included in this first article. Our aim is to end up creating more complex packages (as shown in the following figure), which have variable flows and more complex components with the operators “Merge Join” and the “Conditional Split” that require many more hours to set up.

Thanks to the SolidQ library and Solid Quality Mentors, we can generate thousands of packages that carry out very complex tasks in mere seconds. What is even  more important, a change in the objects inherent in the package (change of data types, names of columns, of objects, etc.). It doesn’t cost anything, since only two clicks will regenerate the package.

In next article, readers will be shown how to make these operators infer columns and types of complex data without human intervention.

Enrique Catalá

Enrique Catalá es Mentor en SolidQ, Microsoft Data Platform MVP desde 2013 e Ingeniero en informática. Es Microsoft Certified Trainer (MCT) y está centrado en el motor relacional SQL Server, tanto en la resolución de problemas de rendimiento y escalabilidad en sistemas OLTP como la definición e implementación de entornos de alta disponibilidad confiables, en donde ha llevado con éxito más de 100 proyectos no solo en España, sino en diferentes países como EEUU, Holanda, México, Arabia Saudí, Austria... Es el arquitecto principal de las soluciones para SolidQ llamadas HealthCheck, SQL2Cloud, SCODA y del generador de SSIS de SolidQ. Ponente habitual del SolidQ SUMMIT, miembro y ponente en SQL PASS tanto en España como Iberoamérica ponente en varios SQLSaturday