
In this post we will talk about a property new in SQL Server 2016: AutoAdjustBufferSize. This attribute belongs to the Data Flow Task and can take the values 'True' or 'False' (the default). We will also propose an approximate solution for earlier versions.

What does this property do?

To answer that, we must first talk about the "DefaultBufferMaxRows" and "DefaultBufferSize" properties. These two define the size of each memory block (buffer) in the SSIS data flow and the number of rows it holds, so the larger the buffer, the more data is manipulated in memory at the same time.

By default, each buffer holds 10,000 rows and 10 MB. These values can be small compared to the loads that the servers running our packages can handle. Therefore, when developing, it can be worthwhile to play with these properties to improve the performance of our ETL processes.
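To illustrate how these two defaults interact, the small sketch below assumes a hypothetical 200-byte row width (that figure is ours, chosen only for illustration, not taken from any real package):

```
-- T-SQL sketch: estimate how many rows fit in the default 10 MB buffer.
-- The 200-byte row width is an assumed figure for illustration.
DECLARE @DefaultBufferSize INT = 10485760;  -- default buffer size, 10 MB
DECLARE @RowWidthBytes     INT = 200;       -- assumed row width

SELECT @DefaultBufferSize / @RowWidthBytes AS ROWS_PER_BUFFER;
-- Returns 52428: far more rows would fit in the buffer than the
-- 10,000-row default, so DefaultBufferMaxRows is the limiting factor here.
```

In a case like this, raising DefaultBufferMaxRows lets each buffer actually use the memory already allocated to it.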

When AutoAdjustBufferSize is set to 'True', the "DefaultBufferSize" property is ignored and the runtime sizes the buffer automatically from "DefaultBufferMaxRows" and the width of the rows processed in the flow. This helps us to tune the package.

Testing the property

The comparison below shows the average load times of a table, moving 30 million rows through the ETL under different parameter settings.

AutoAdjustBufferSize  DefaultBufferMaxRows  DefaultBufferSize  Time (mm:ss.000)
False                 10000                 10485760           05:07.329
False                 100000                10485760           04:48.187
False                 1048576               10485760           04:40.125
True                  100000                Automatic          04:52.250
True                  100000                Automatic          04:29.875
True                  1048576               Automatic          04:25.018

Whether or not the property is active, the more rows we process per buffer, the better the result in terms of time, making more efficient use of the resources available. Once we activate the property, the improvement is much more noticeable thanks to the automatic adjustments it makes.

How can we do this in other versions?

This property is new in 2016, so if we want the same behavior in earlier versions we can implement an approximation in our packages.

Although only SSIS knows the width of the rows at run time, we can obtain the row width of the target table and from there calculate the buffer size easily. It is not a perfect solution, since the target row may not be the same width as the row in the Data Flow, but for simple table loads it can be perfectly valid.

To obtain the row size, or directly the value we want to assign to the "DefaultBufferSize" property, we can use the query below:

-- Sums the maximum byte width of every column in the target table and
-- multiplies it by the desired row count to get a buffer size.
-- Note: MAX_LENGTH is -1 for (MAX) types, which would skew the result.
DECLARE @TABLE SYSNAME = ?
DECLARE @ROWS  INT     = ?

SELECT SUM(C.MAX_LENGTH) * @ROWS AS BUFFER_SIZE
FROM  SYS.TABLES T
JOIN  SYS.COLUMNS C ON T.OBJECT_ID = C.OBJECT_ID
WHERE T.NAME = @TABLE

This query goes into an Execute SQL Task, passing as parameters the name of the destination table and the row count that will go into the "DefaultBufferMaxRows" property. These can come from variables, for example "BufferRows" and "TableName".

Defined variables

The result of the Execute SQL Task is stored in the variable "BufferSize". Finally, the variables "BufferRows" and "BufferSize" are assigned through expressions to the "DefaultBufferMaxRows" and "DefaultBufferSize" properties of the Data Flow, respectively.
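For reference, the property expressions on the Data Flow Task would look something like this, assuming the variable names used above:

```
DefaultBufferMaxRows : @[User::BufferRows]
DefaultBufferSize    : @[User::BufferSize]
```

With this mapping in place, both properties are recalculated at run time from the Execute SQL Task result, with no manual edits to the package needed per table.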

Expression editor

Repeating the same measurements while changing the value of the variable "BufferRows", we obtain the following times:

AutoAdjustBufferSize  DefaultBufferMaxRows  DefaultBufferSize  Time (mm:ss.000)
False                 10000                 Calculated         05:05.656
False                 100000                Calculated         04:31.359
False                 1048576               Calculated         04:29.782

Comparing all the runs, we can see how our calculated solution approximates the one provided natively by SSIS.

Timing chart

However, the calculation made by the "AutoAdjustBufferSize" option gives a better result from the outset.

Marco Gómez-Pastrana Carbonell

Marco Pastrana is a member of SolidQ's Business Intelligence area, working on BI and Big Data projects.
