Python for SQL Server Specialists Part 1: Introducing Python

Python for SQL Server Specialists Part 1: Introducing Python

Python is one of the most popular programming languages. It is a general purpose high level language. It was created by Guido van Rossum, publicly released in 1991. SQL Server 2016 started to support R, and SQL Server 2017 adds support for Python. Now you can select your preferred language for the data science and even other tasks. R has even more statistical, data mining and machine learning libraries, because it is more widely used in the data science community; however, Python has broader purpose than just data science, and is more readable and might thus be simpler to learn. This is the first of the four articles that introduce Python to SQL Server developers and business intelligence (BI) specialists. This means that the articles are more focused on Python basics and data science, and less on general programming with Python.


Starting with Python

Python is an interpreted language. The philosophy of the language is about the code readability. For example, you use white spaces to delimit code blocks instead of special characters like semicolon or curly brackets. Python supports automatic memory management. It has a dynamic type system. You can use multiple program paradigms in Python, including procedural, object-oriented, and functional programming. You can find Python interpreters for all major operating systems. The reference implementation of Python, namely CPython, is open source software, managed by the non-profit Python Software Foundation. Of course, being open source, also means that there is a reach set of libraries available. Even the standard library is impressive and comprehensive.

In order to start working with Python and R, you need to do some installation. I am not covering general SQL Server and Visual Studio installation, I am just explaining what you need to do to start using Python with SQL Server.


Installing ML Services and VS 2017 for Data Science

You just start SQL Server setup, and then from the Feature Selection page select Database Engine Services, and underneath Machine Learning (ML) Services (In-Database), with Python only, or both languages, R and Python, selected. After that, all you need are client tools, and you can start writing the code. The following figure shows the SQL Server setup Feature Selection page with appropriate features selected.

The next step is installing client tools. Of course, you need SQL Server Management Studio (SSMS). In addition, you might want to install Visual Studio (VS) 2017. You can use either Professional or even free Community edition to develop python (and also R) code.

When installing Visual Studio 2017, be sure to select Python development workload, and then Data science and analytical applications, like the following figure shows. This will install Python language templates, including data science templates, and also R Tools for Visual Studio.

Selecting the Appropriate Python Engine

There you go, you are nearly ready. There is a small trick here. VS 2017 installs also its own Python interpreter. In order to use the scalable, the one installed with SQL Server, the one that enables executing code in the Database Engine context and includes Microsoft scalable libraries, you need to setup an additional Python environment, pointing to the scalable version of the interpreter. The path for this scalable interpreter is, if you installed the default instance of SQL Server, C:\Program Files\Microsoft SQL Server\MSSQL14.MSSQLSERVER\PYTHON_SERVICES\python.exe. You can see how to setup this environment in the following figure.

That’s it. You are ready to start Python programming. Just start a new project and select the Python Application template from the Python folder. You can also explore the Python Machine Learning templates, which include Classifier, Clustering, and Regression projects. If you selected the Python Application template, you should have open the first empty Python script with default name the same as the project name and default extension py, waiting for you to write and interactively execute Python code.

Python Language Basics

Python uses the hash mark for a comment. You can execute Python code in VS 2017 by highlighting the code and simultaneously pressing the Ctrl and Enter keys. You can use either single or double apostrophes for delimiting strings. The first command you will learn is the print() command. Write and execute the following code:

# Hash starts a comment
print("Hello World!")
# This line ignored
print('Printing again.')
print('O"Hara')   # In-line comment

You can observe the code you wrote and the results in the Interactive window, which is by default below the script window, at the bottom left side of the screen.

Python supports all basic mathematical and comparison operators, like you would expect. The following code introduces them. Note that you can combine strings and expressions in a single print() statement.

1 + 2
print("The result of 3 + 20 / 4 is:", 3 + 20 / 4)
10 * 2 - 7
10 % 4
print("Is 7 less or equal to 5?", 7 <= 5)
print("Is 7 greater than 5?", 7 > 5)

The next step is to introduce the variables. Note that Python is case-sensitive. The following code shows how you can assign values to variables and use them for direct computations and as the arguments of a function.

# Integer
a = 2
b = 3
a ** b
# Float
c = 7.0
d = float(5)
print(c, d)

You can define strings inside double or single quotes. This enables you to use single quotes inside a double-quoted string, and vice-versa. You can use the “%?” operator for formatting strings to include variables, where the question mark stands for a single letter denoting the data type of the variable, for example “s” for strings and “d” for numbers. The str.format() method of the string data type allows you to do variable substitutions in a string. Here are some examples.

e = "String 1"
f = 10
print("Let's concatenate string %s and number %d." % (e, f))
four_cb = "String {} {} {} {}"
print(four_cb.format(1, 2, 3, 4))

The result of the previous code is:

Let’s concatenate string String 1 and number 10.
String 1 2 3 4

You can also create multi-lines strings. Just enclose the strings in a pair of three double quotes. You can also use special characters, like tab and line feed. Escape them with a single backslash character plus a letter, for example letter t for a tab and letter n for a line feed.

You can always get interactive help with the help() command. A Python module is a file with default extension .py containing Python definitions and statements. You can import a module into your current script with the import command, and then use the functions and variables defined in that module. Besides modules provided with the installation, you can, of course, develop your own modules, distribute them, and reuse the code.

Using Functions, Branches, and Loops

Like in any serious programming language, you can encapsulate your code inside a function. You define a function with the def name(): command. Functions can use arguments. Functions can also return values. The following code defines two functions, one that has no arguments, and one that has two arguments and returns a value. Note that there is no special ending mark of a function body – the correct indentation tells the Python interpreter where the body of the first function ends, and the definition of the second function starts.

def p_n():
    print("No args...")
def add(a, b):
    return a + b

When you call a function, you can pass parameters as literals, or through variables. You can also do some manipulation with the variables when you pass them as the arguments to a function. The following code shows these possibilities.

# Call with variables and math
a = 10
b = 20
add(a / 5, b / 4)

You can make branches in the flow of your code with the if..elif..else: statement. The following code shows you and example.

a = 10
b = 20
c = 30
if a > b:
    print("a > b")
elif a > c:
    print("a > c")
elif (b < c):
    print("b < c")
    if a < c:
        print("a < c")
    if b in range(10, 30):
        print("b is between a and c")
    print("a is less than b and less than c")

The results of the code are:

b < c
a < c
b is between a and c

The simplest data structure is the list. Python list is a set of comma-separated values (or items) between square brackets. You can use a for or for each loop to iterate over a list. There are many methods supported by a list. For example, you can use the list.append() method to append an element to a list. The following code shows how to create lists and loop over them with the for and foreach loops. Finally, it shows a while loop.

animals = ["cat", "dog", "pig"]
nums = []
for animal in animals:
    print("Animal: ", animal)
for i in range(2, 5):
i = 1
while i <= 10:
    i = i + 1

The last data structure presented in this introduction article is the dictionary. A dictionary is a set of the key – value pairs. You can see an example of a dictionary in the following code.

states = {
    "Oregon": "OR",
    "Florida": "FL",
    "Michigan": "MI"}
for state, abbrev in list(states.items()):
    print("{} is abbreviated {}.".format(state, abbrev))

I mentioned that in Python you can also use object-oriented paradigm. However, going deeper with object-oriented programming with Python is beyond the scope of this and the following articles.


I guess the programming in Python introduced so far was not over exciting. However, you always need to start with basics, and only after you embrace the basics, the exciting part starts. Therefore, don’t miss my next article, when I will introduce the most important data structure for advanced analytics, the data frame structure.

SQL Server performance with Spectre and Meltdown patches

SQL Server performance with Spectre and Meltdown patches

As you know, as long as you are not totally oblivious to the technological world you will have heard about one of the biggest bugs in the history of computer science (Spectre and Meltdown) and that its effects are real. So real, that we ourselves at SolidQ ourselves have experienced it in our own Query Analytics software. In this post I will try to shed some light on how to proceed if you detect performance regression in your solution with SQL Server, explaining how I have solved it in my own system.


I use columnstore indexes on SQL 2014. What would be the advantages of upgrading to SQL 2016?

I use columnstore indexes on SQL 2014. What would be the advantages of upgrading to SQL 2016?

SQL Server 2016 really represents a drastic change for the better in regards to general performance. And the proof is that, thanks to the engine changes, this is the first SQL Server version where Microsoft actively provides a widespread increase in performance levels by simply upgrading. But what impact can it have in your daily loads if you use columnstore indexes in SQL server 2014 datawarehouse? There was already an amazing increase in performance levels with SQL Server 2014 and the columnstore indexes. Is it worth upgrading to SQL 2016? (more…)

Virtual SANs and the trend towards SDS

Virtual SANs and the trend towards SDS

Lately, we are noticing a trend whereby traditional SANs are increasingly losing ground. This can be due to various reasons, such as forced migrations to the cloud, limited budget environments or even customers who are getting annoyed with their storage providers’ draconian terms and conditions. A common example is to find very costly maintenance agreements, higher-than-market disk size upgrade costs, as well as difficulties to expand/ improve hardware without having to completely replace it. There is also a strong trend to continue going virtual and trading servers like they were commodities. All the above is therefore contributing to an increase in this trend towards SDS, while the VSANs continue to attract new supporters.


The SQL Server Browser service and multiple instances

The Browser service is a service that can be confusing if you don’t know how it works and what it does. You may have noticed that it sometimes is activated and sometimes not, even if you run the same OS and the same SQL Server version on two identical servers. The answer to this slightly confusing behavior is that the Browser service only gets installed when you create a named instance. If you created a default instance, the service will be disabled.

The functionality that the Browser provides is that it listens on UDP port 1434, and replies with the correct port for the instance requested. For example, if I connect to MyServer\MyInstance, a udp call to port 1434 is done and the Browser service replies with the correct TCP port number for my client to use when it does the “real” connection to the instance. The TCP port in use can be either fixed or dynamic depending on firewall rules on the network and other customer specific rules.

From Microsoft TechNet: Upon startup, SQL Server Browser starts and claims UDP port 1434. SQL Server Browser reads the registry, identifies all instances of SQL Server on the computer, and notes the ports and named pipes that they use. When a server has two or more network cards, SQL Server Browser returns the first enabled port it encounters for SQL Server. SQL Server Browser support ipv6 and ipv4.

Ok, so the browser will find all my instances. So what if you mix the default instance and named instances on the same server? Since you have named instances, the Browser service will be activated, and according to the text above, the Browser identifies all instances and their ports on the computer, right?


The fact is that the Browser ignores your default instance totally, it only works with named instances. As long as you don’t change the default TCP port 1433, you will however not notice this, since your connection just bypasses the Browser anyway. It’s only when you change port for your default instance that this effect becomes visible. The only way to connect then, is to manually supply the correct port number in your connection string.

This has the practical effect that you really should think twice before using a default instance on a server where you have fixed ports and named instances. Let´s say you want to consolidate three SQL Server instances into one server and they need to have fixed ports because of firewall rules. You can either create one default instance and two named instances, or create all three as named instances. If you create all three as named instances, the Browser will deliver their TCP ports without problems, your connection string will simply be MyServer\MyInstance(1,2,3). If you choose the setup with one default instance which is not on port 1433, you will have to use a connection string like MyServer,1444 for your default instance, and then MyServer\MyInstance for the other two. This is not only ugly and non-consistent, but also poses a problem in the future if you want to change the fixed port numbers.

Fixed some SQL Server Partition Management Utility Bugs

The SQL Server Partition Management Utility ( is one of the best tool used to manage the partition-switch operations. It is a command line tool and can be integrated in a SSIS package or used to generate the T-SQL scripts needed in a regular “sliding window” partition management scenario. A blog post that shows how to use this tool is this one.

In my case, I wanted to speed the loading of a big partitioned fact table through a SSIS package (that calls two child packages). So this package calls more instances of the tool in order to load more than one staging table in parallel. Each staging table is related to a fact table partition. After each staging table is loaded, the SSIS package loads the target fact table using the partition-switch operations against the staging table.

All seemed to work fine, but during the test phase, when I tried to increase the degree of parallelism (that is the number of executed instances of the tool), I got a deadlock error.


String Replacement Puzzle

Courtesy of SQL Server Pro. You can find the original article here.

My dad, Gabriel Ben-Gan, passed away recently. He loved numbers, logic and puzzles, and used to solve problems in his own unique way. This article is about a puzzle that incorporates the above ingredients. Dad, this one’s for you, and is in your memory.