Not all virtual machines are the same

It is not uncommon to find a wide range of situations among our customers in terms of virtual machine performance with SQL Server. In many cases, performance levels are far from ideal but, generally speaking, the virtual machines themselves are not to blame. What usually happens is that when we move SQL Server to a virtual machine, we become constrained by a maximum or limited amount of resources (CPU/memory/IO) that is significantly different from that of the physical machine. (more…)

Azure Files Premium + SQL Server Failover Cluster instance = simplified OnPremise to Cloud

One of the issues that many of our customers face when attempting to migrate OnPremise instances to the Cloud is the lack of a simple “shared storage”. Although there are some alternatives supported by third-party software or SDS solutions that allow us to configure a Failover Cluster instance in Azure, these are highly complex and therefore add significant further costs to the solution’s TCO.
(more…)

Azure Database integrated authentication with SSIS

In many scenarios, we need to use integrated authentication in order to gain access to the data sources that feed our analytical system. Given Azure’s increasingly widespread use, hosting at least part of our infrastructure, some of these sources end up in Azure databases. In this case, we will discuss an actual error that we came across when configuring and using integrated authentication against Azure databases with SSIS.
(more…)

Power BI Bookmarks! What are they for? How can I use them?

In this entry, we will show you how to create bookmarks and a few different scenarios where they might be useful. Power BI Bookmarks store the state of a specific report page, including the filter selection and the visibility of the different objects, allowing the user to return to that same state simply by selecting the saved bookmark.

(more…)

Converting Excel files into CSV and uploading files using SSIS

Although SQL Server Integration Services, hereinafter SSIS, is capable of loading Excel files, in most cases this can be time consuming because any small modification to the Excel files can make the SSIS package fail. For that reason, the best option is usually to transform those Excel files into .csv format, since loading text files will cause you significantly fewer issues than Excel files.

You can quickly save any Excel file as csv manually by choosing “Save As” and the .csv format from within Excel. However, this becomes an issue when you have to do the same for a lot of Excel files, or in cases where you need the change to be done automatically.

In this post, we will explain how to do this format change automatically using PowerShell and how to loop through files in the same directory in order to upload several Excel files together using SSIS as the main tool for the whole process.

1. Preparing the environment

In our example, we will be working with 2 Excel files, Excel1.xlsx and Excel2.xlsx, with 3 tabs each. These files contain stock information for products, with one category per tab. The following images show the Excel files’ structure:

[Image: contents of file “Excel1.xlsx”, tab “cat1”]

[Image: contents of file “Excel1.xlsx”, tab “cat2”]

This information will be saved to the Stock table with the following structure:

CREATE TABLE [dbo].[Stock](
  [Item] [varchar](20) NULL,
  [Categoria] [varchar](20) NULL,
  [Stock] [int] NULL,
  [FileName] [varchar](50) NULL,
  [InsertDate] [datetime] NULL
)

The FileName and InsertDate columns will help us to find out which file each row belongs to and the date and time at which it was inserted into the table.

2. Converting Excel files to .csv

Firstly, we will need to create 4 parameters in our SSIS package:

  • ExcelExtension: the extension of our Excel files (e.g., xls or xlsx).
  • RutaCSV: path where the csv files will be left.
  • RutaExcel: path where the Excel files are located.
  • RutaPowerShell: path of the PowerShell script.

[Image: SSIS package parameters]

We will also need 3 variables:

  • commandPowerShell: holds the command used to run the PowerShell script.
  • FileName: holds the file name once the file has been converted to csv, so that it can be stored in our database table.
  • FullFilePath: holds the full csv file path. This variable will be used later by the loop that iterates through the different files in order to load them.

Our next step will be to prepare the commandPowerShell variable once all the path parameters and the Excel file extension have been set up. In order to do this, we will add the following “Expression” to the variable:

"-command \""+@[$Package::RutaPowerShell] + " -rutaExcel " + @[$Package::RutaExcel] + " -rutaCSV " + @[$Package::RutaCSV] + " -excelExt xlsx\""

With the above values in my own parameters, the resulting variable would be:

[Image: resulting commandPowerShell value]
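
For illustration, assuming hypothetical parameter values (RutaPowerShell = C:\SSIS\ExcelToCsv.ps1, RutaExcel = C:\SSIS\Excel\, RutaCSV = C:\SSIS\CSV\ and ExcelExtension = xlsx), the evaluated expression would produce:

-command "C:\SSIS\ExcelToCsv.ps1 -rutaExcel C:\SSIS\Excel\ -rutaCSV C:\SSIS\CSV\ -excelExt xlsx"

Note the trailing backslashes in the path parameters: the script concatenates paths and file names directly, so they are required.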

This is the PowerShell file that will be run:

<#
#### Script to convert Excel files into CSV ####
Parameters:
    $rutaExcel: path where the Excel files are
    $rutaCSV: path where the csv files will be left
    $excelExt: Excel file extension (xls, xlsx)
#>
param ([string] $rutaExcel, [string] $rutaCSV, [string] $excelExt )

<# 
#### Function to convert Excel files into CSV ####
Parameters:
    $excelFileName: Excel file name
    $csvLoc: destination path for the csv files
    $excelLoc: source path for the Excel files
    $excelExtension: Excel file extension (xls, xlsx)
#> 
Function ExportExcelToCSV ($excelFileName, $csvLoc, $excelLoc ,$excelExtension)
{
    #build the full Excel file path
    $excelFile = $excelLoc + $excelFileName + $excelExtension
    $E = New-Object -ComObject Excel.Application
    $E.Visible = $false
    $E.DisplayAlerts = $false
    $wb = $E.Workbooks.Open($excelFile)

    #we will iterate every sheet in the file in order to convert it to csv
    foreach ($ws in $wb.Worksheets)
    {
        $n = $excelFileName + "_" + $ws.Name
        #6 = xlCSV file format
        $ws.SaveAs($csvLoc + $n + ".csv", 6)
    }
    #close the workbook without saving further changes and quit Excel
    $wb.Close($false)
    $E.Quit()
}

#mask to loop through all Excel files with the provided extension
$mask = "*."+$excelExt
#we will add the dot before the extension type
$ext = "."+$excelExt

#get all Excel files matching the mask in the specified path
$ens = Get-ChildItem $rutaExcel -Filter $mask

#we shall loop through all Excel files found and call the function to convert them to csv
foreach($e in $ens)
{
    ExportExcelToCSV -excelFileName $e.BaseName -csvLoc $rutaCSV -excelLoc $rutaExcel -excelExtension $ext
} 
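
If you want to test the script on its own before wiring it into SSIS, you can call it directly from a PowerShell prompt. The script name and paths below are hypothetical examples, and Excel must be installed on the machine, since the script uses the Excel COM object:

#standalone test run with example paths (note the trailing backslashes)
.\ExcelToCsv.ps1 -rutaExcel "C:\SSIS\Excel\" -rutaCSV "C:\SSIS\CSV\" -excelExt "xlsx"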

Our next step is to call our PowerShell script from the package. For that purpose, we will use the “Execute Process Task” component:

[Image: Execute Process Task in the SSIS toolbox]

We will open the component, go to “Process”, and set Executable to the location of PowerShell.exe on our machine:

[Image: Execute Process Task configuration, Process tab]

Next, we will go to “Expressions” and set the “Arguments” property to the previously prepared commandPowerShell variable:

[Image: Execute Process Task, Arguments expression]

Once the above has been completed, we could do a small test to check whether the transformation of Excel files to csv is actually working.

We will place the Excel files in the path:

[Image: Excel files in the source path]

Run the component:

[Image: PowerShell task execution]

And once it has finished running, we will check the path where the csv files should have been created:

[Image: csv files in the destination path]

As you can see in the image above, 6 csv files (2 files × 3 tabs) were created with the naming convention [ExcelFileName]_[ExcelTabName].csv.
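
If you prefer to check from PowerShell rather than the file explorer, a quick listing (using the same hypothetical csv path as before) would be:

#list the generated csv files
Get-ChildItem "C:\SSIS\CSV" -Filter *.csv | Select-Object Name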

3. Uploading csv files to the database

Now it’s time to load the above csv files into our database. In order to do this, we will use a “Foreach Loop Container” to loop through the csv directory and load each file.

We will set this up in the “Collection” section by selecting “Foreach File Enumerator” as the Enumerator. Then, in “Expressions”, we will set the Directory property from our “RutaCSV” package parameter, so that the loop points at the path where our csv files are located. Below, under “Files:”, we will enter the “*.csv” mask in order to pick up all csv files, and under “Retrieve file name” we will select “Fully qualified” in order to obtain the full file path, which we will need later on to tell the source component where each csv file is located:

[Image: Foreach Loop Editor, Collection page]
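
To recap, the “Collection” page ends up configured as follows (any literal Directory value is overridden at runtime by the expression):

  • Enumerator: Foreach File Enumerator
  • Directory: @[$Package::RutaCSV] (set via the “Expressions” section)
  • Files: *.csv
  • Retrieve file name: Fully qualified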

Next, in the “Variable Mappings” section, we will map our FullFilePath variable (index 0), which is where the full csv file path will be stored on each iteration:

Once this has been completed, our loop should be fully set up.

So now it’s time to set up the tasks within the above loop. One of these will be a “Script Task” to obtain the csv file name from the full file path, and the other will be a “Data Flow” task to load the csv files into our database.

Starting with the “Script Task”, we will drag the component inside the “Foreach Loop Container”. Then, double click on it and, in the “Script” section, enter the FullFilePath variable under ReadOnlyVariables and the FileName variable under ReadWriteVariables:

Click on the “Edit Script” button and enter the following code into the script in the Main function:

public void Main()
{
    //save full path into an auxiliary variable
    string aux = Dts.Variables["User::FullFilePath"].Value.ToString();
    //divide path using \
    string[] split = aux.Split('\\');
    //keep last piece of the path that contains file name
    string fileName = split[split.Length - 1];
    //save the file name in our variable
    Dts.Variables["User::FileName"].Value = fileName;

    Dts.TaskResult = (int)ScriptResults.Success;
}
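
As a side note, the same result could be achieved with the .NET helper System.IO.Path.GetFileName, but the Split approach above works just as well for this purpose.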

At this point, we move on to the “Data Flow” by also dragging it into the loop component after the Script Task. Once inside, we will begin with the source, which will be a “Flat File Source”:

We will open it and click on “New” in order to create a new Flat File Connection Manager.

With this, the first Flat File Connection Manager window would be set up as follows:

[Image: Flat File Connection Manager, General page]

Under “File name”, we will enter the path to one of the files for the initial set-up, although this path will later be driven by an expression so that it is picked up dynamically from the loop.

We will then move on to the “Advanced” section in order to set up the file columns. In this case, we will have 3 columns: Item, Categoria and Stock.

Once this is completed, our set-up will be ready and the next step will be to parameterize the file path. In order to do this, go to “Connection Managers” in the bottom area, select the newly created InventarioCSV connection and right click to open “Properties”. All properties will then appear listed on the right hand side of the screen, including the “Expressions” section, where we will set the ConnectionString property to our “FullFilePath” variable (ConnectionString = @[User::FullFilePath]):

Our next step will be to add a “Conditional Split” in order to discard the rows corresponding to the file headers.

As shown in the following image, each Excel file tab had a top header row that we do not want to save to our database:

[Image: Excel file header row]

So, in order to discard these rows, we will add the following condition to our Conditional Split:

Item == "Item" || Item == ""

This will ensure that rows with an empty Item column, or whose Item column contains the literal word “Item” (i.e., the header rows), are not loaded.

[Image: Conditional Split configuration]

Our default output will be called “Detalle”, and this is the one we will connect to our next component, a “Derived Column”:

[Image: Data Flow, Conditional Split connected to Derived Column]

This Derived Column will be used to add the columns that will contain the file name and insert date using the following set-up:

[Image: Derived Column configuration]
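
Since the screenshot is not available, here is a plausible reconstruction of that set-up, with one derived column per new field (names match our Stock table):

  • FileName: @[User::FileName]
  • InsertDate: GETDATE()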

And finally, all that is left to do is to add our destination, which will be an “OLE DB Destination” component.

We will open it and click on “New” in order to create the connection manager with the connection to our database.

Once this connection has been set up, we will select the table where we want to save the data, which in our example is the Stock table we created previously:

[Image: OLE DB connection manager and destination table selection]

And as the last step, in the “Mappings” section, we will map the columns coming from our flow to the table columns:

[Image: column mappings]

This is what our final Data Flow looks like:

[Image: final Data Flow]

And this is what our complete Control Flow looks like:

[Image: final Control Flow]

At the very beginning, we added an “Execute SQL Task” that truncates the Stock table (TRUNCATE TABLE [dbo].[Stock]) so that each run starts from an empty table.

And now everything should be ready to run our package and check the results:

[Image: executed Control Flow]

After the execution, the csv files have been created in our directory and their contents have been loaded into our table.

Now you know how useful it is to convert Excel files to CSV format to avoid issues with SSIS.

If you want more information about SSIS, one of our experts addresses the subject in greater depth here.

We hope this helps you. If you want to keep receiving tips and tricks on this subject, SolidQ invites you to subscribe to our newsletter.

Creating PowerPoint presentations from R

Regardless of the tools used for data analysis, the results are normally presented as a Word document or a PowerPoint presentation.

In this post, we will create a PowerPoint presentation and insert a series of graphics and text programmatically, using the OfficeR and rvg packages together. We will also take the opportunity to introduce (for those who do not know it) the “pipe” operator, which is very useful when nesting functions.

(more…)

Azure ASR and SQL Server

In an on-premises environment, when we propose solutions for geographical disasters, the most common option is log shipping. The use of asynchronous database mirroring or availability groups with asynchronous replicas is also common, but it carries an additional risk that is not usually considered: changes are transferred as quickly as the network and the target system allow. This means that when the disaster has a human origin, such as a serious mistake, by the time we become aware of it the error will already have been replicated and applied. Obviously, a better solution would be to combine both options, which are not mutually exclusive; this would cover more disaster scenarios while increasing the cost of the solution. (more…)