SQL Server Integration Services (SSIS) offer a wide range of possibilities for integration solutions and data integration, extraction and transformation solutions. There are over 30 components for flow control, and over 40 for dataflow. Tasks such as dumping data from one table to another, adding data, sending emails on errors, downloading a file from an FTP site and processing it – all these are very easy with the components that SSIS provides out-of-the-box. But, on occasions, you’ll find that a given functionality has not been implemented. On many occasions you’ll find yourself obliged to use a script task or script components in order to use a functionality outside the standard provided by the product. What if you could use that functionality again and again? What if it could be a new component and you could use it like the components provided by SSIS? The answer is that that is possible. But there are a series of considerations:
- Extending the functionality isn’t easy:
- Difficulty: It requires a knowledge of the classes and methods that will be used to code that functionality. Additionally, it is language-dependent, so you’ll have to use C# or VB.net and handle them well.
- Time costs: Since this coding is not easy, the development time gets considerably longer.
- You should use a script task whenever possible:
- It’s easy: The component abstracts you form most of the classes you have to use to prepare a component, which makes it easier.
- Quick: As stated above, it’s easier, so the development time is less.
- Specific functionality: If the functionality to be prepared is specific to an individual packet, you should use a script (it won’t be reused, so it won’t be maintained)
But there are also advantages to creating components. There are non-functional project requirements – such as maintainability and reusability – which are very important. Imagine having to use the same function many times when it’s a script task. You’ll have to repeatedly copy the same or similar code throughout the project. And if you detect a fault, you’ll have to change it in all the packets. Personalized components should be used in the following situations:
- When the functionality is generic. The component can be used in other packets and by other developers.
- When the functionality is repeated throughout the packet. For reasons of maintainability, it should be encapsulated in a new component.
- When the implementation changes over time. Imagine that you’ve got 100 packets that use a script task and you have to change the code. You’ll have to edit those 100 packets. If you had used a component, you would only have to change the assembly.
Steps for preparing a personalized component
To better illustrate the concepts in this article, consider the construction of a data task component. This component will perform the following operation: given a dataset which contains changes in stock, it calculates the stock at a given time.
The preparation of a personalized component can be described as a series of phases:
The development phase involves the creation of a class library in C# or VB.net. The class from which to inherit will depend on the type of component you are creating. For the various types of components, there are the following classes:
Once the component has been chosen, the starting point of the process is to inherit from the base class in order to begin developing. To understand this better, take a look at your project
There is a UI folder in which you will develop the component’s UI. There is an auxiliary Buffer class and the class Correcaminos.cs which contains all the component’s logic. It is this class that must, in this case, inherit from the PipelineComponent class. You also have to reference the following libraries:
Now to comment on some of the most important methods. These can be divided into two classes – methods that are called at design time (when you’re designing the packet), and methods that are called at run time, when DTEXEC is invoked.
public override void ProvideComponentProperties()
This is called when the component is added to the dataflow. It creates and configures the component’s inputs and outputs. It is the first method that is called at design time; additionally, you must also configure your component’s behavior to be synchronous or asynchronous.
public override DTSValiedationStatus Validate()
This is called when the component is edited during design, and once before the PreExecute phase. This method validates what has been performed by ProvideComponentProperties anad also validates the other aspects to be configured which the component must comply with. It returns four types of status:
DTSValidationStatus.VS_ISBROKEN: The component is not configured correctly; some property does not have the right value. DTSValidationStatus.VS_ISCORRUPT: The component is damaged and must be reset. Call ProvideComponentProperties(). DTSValidationStatus.VS_ISVALID: The component has been satisfactorily validated. DTSValidationStatus.VS_NEEDSNEWMETADATA: The component's metadata is not correct or is out of date.
public override IDTSInputColumn100 SetUsageType(int, IDTSVirtualInput100, int, DTSUsageType)
The SetUsageType method is called when a column in the virtual input collection is selected to be used by the component. The virtual collection contains the collection of columns provided by the previous component. Based on the DTSUsageType parameter, you have:
UsageType.UT_IGNORED: The columns were previously added to the input collection. The base clase removes the input columns, and the output columns that are paired with the input are deleted. UsageType.UT_READONLY: The column is added to the component's input in read-only mode, so that its value cannot be modified UsageType.UT_READWRITE: The column is added to the component's input in read-write mode
public override void ReinitializeMetaData()
This is the method which is called when the Validate() method returns the status DTSValidationStatus.VS_NEEDSNEWMETADA. In this method, you should solve any possible problems caused by metadata
public override void PreExecute()
This is the first of the methods that is called. This performs those operations which must be performed before the execution itself. It performs tasks such as mapping columns onto own objects, use of variables, etc.
public override void PrimeOutput(int, int, PipelineBuffer)
PrimeOutput is an especially important method for components which behave asynchronously. This method is used to transfer rows to the output buffers (for the data origin and transformation components)
public override void ProcessInput(int, PipelineBuffer)
The last of the run time methods is ProcessInput, which executes the component’s pure logic. It handles the information row by row and performs the component’s functionality. It will be coded according to how the component will, be used.
For the graphical interface, it is necessary to inherit from the following classes:
Although this article will not go into this matter in depth, it is important to note that the creation of a graphical interface is not easy. And, even though you haven’t developed this interface, the necessary mechanisms for configuring a default UI for your component are provided.
To generate the assembly, you must compile the project. This project must be signed, so that this assembly specifies a full name which includes the assembly name, the cultural reference, the public key and the version number. The runtime engine uses this information to distinguish it from other assemblies with the same name.
To sign your assembly you will have to generate a key for signing it, and specify, in the Visual Studio properties, that you want it to be like that.
The implementation phase consists of installing the assembly in the Global Assembly Cache. This step is essential, since, if the assembly is not installed in the GAC, the DTEXEC engine will fail to execute a packet with a component that is not installed in the GAC. To install the assembly in the GAC, either drag the assembly to the C:[WindowsIntall]assembly folder, or use the Visual Studio command console and the “gacutil” tool.
Installation in the SSIS
Once your component has been installed in the GAC, the next step it to install it into the appropriate folders in the Integration Services installation directory. You should install it in C:Program Files (x86)Microsoft SQL Server100DTS and, depending on the type of component, it should be installed in the following folders:
The connection administrator assemblies should be in the Connections folder; the enumerator assemblies should be in the ForEachEnumerators; the log provider assemblies should be in LogProviders.
The component you have created must be put in Pipeline components, and, lastly, the tasks should go in the Tasks folder
Additionally, if the installed component can go in the toolbar, you must add it manually
To perform testing on your component, create a new packet, run it, and check the results. If you wish to debug the component while developing it, you will have to attach the process, where “denenv” is the process during the design phase, and “DtsDebugHost” is the process during the execution phase.
Creating a personalized component for SQL Server Integration Services is not a simple task. It requires an in-depth knowledge of the classes used, and the development time is longer than for a code fragment in a Script Task or Script Component. The advantage is that, for SSIS solutions, maintenance is much simpler, as is change management.
We can conclude that it is only necessary to create a component if that component will solve a given problem that can be reproduced and which will occur more than once over time. In this case, creating a component will be of considerable assistance for your SSIS solutions. When it is necessary to migrate these packets, you must take into account that, if the migration is to a new server which runs the same version of SQL Server as the one it was created for, you will have to install these libraries in the new server in order to be able to run these packets. If the migration is to a new version of SQL Server, you will have to check that none of the methods have been changed between one version and the next.
It’s a good idea to evaluate the component’s development time in terms of the reduced maintenance time required for SSIS packets. If maintenance will be complex, developing a component is a good option; otherwise, you should either use Script solutions or the components provided by SQL Server Integration Services out-of-the-box.
- Developing SSIS Components - May 2, 2012
- How to Rename a Primary Key (The clean and the dirty way) - January 30, 2011