In my previous article, you learned Python fundamentals. I also introduced the basic data structures. As you can imagine, you need more advanced data structures for analyzing SQL Server data, which comes in tabular format. Like R, Python has a data frame object. It is defined in the pandas library. You communicate with SQL Server through pandas data frames. But before getting there, you first need to learn about arrays and other objects from the numpy library.
In this article, you will learn about the objects from two of the most important Python libraries: numpy and pandas.
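To give a first impression of the two object types mentioned above, here is a minimal sketch. The column names and values are made up for illustration and do not come from the article series.

```python
import numpy as np
import pandas as pd

# A two-dimensional numpy array: 3 rows, 2 columns, one data type for all cells.
sales = np.array([[1, 100], [2, 150], [3, 125]])
column_totals = sales.sum(axis=0)  # sums down each column

# A pandas data frame adds column names and allows a different type per column,
# which is closer to how a SQL Server result set looks.
df = pd.DataFrame({
    "CustomerId": [1, 2, 3],          # hypothetical column
    "Amount": [100.0, 150.0, 125.0],  # hypothetical column
    "Country": ["SI", "SE", "ES"],    # hypothetical column
})
mean_amount = df["Amount"].mean()

print(sales.shape, column_totals, mean_amount)
```

The array holds a single data type, while each data frame column keeps its own type, which is why the data frame is the natural container for tabular SQL Server data.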
Python is one of the most popular programming languages. It is a general-purpose, high-level language, created by Guido van Rossum and publicly released in 1991. SQL Server 2016 started to support R, and SQL Server 2017 adds support for Python. Now you can select your preferred language for data science and even other tasks. R has even more statistical, data mining, and machine learning libraries, because it is more widely used in the data science community; however, Python has a broader purpose than just data science, is more readable, and might thus be simpler to learn. This is the first of four articles that introduce Python to SQL Server developers and business intelligence (BI) specialists. This means that the articles focus more on Python basics and data science, and less on general programming with Python.
In this post, we will talk about a new property that appears in SQL Server 2016: Auto Adjust Buffer Size. This property is specific to the “DataFlow” component and can take the value ‘True’ or ‘False’ (the default). Furthermore, we will propose an approximate solution for previous versions.
Unless you are totally oblivious to the technological world, you will have heard about one of the biggest bugs in the history of computer science (Spectre and Meltdown), and that its effects are real. So real that we at SolidQ have experienced them ourselves in our own Query Analytics software. In this post I will try to shed some light on how to proceed if you detect a performance regression in your SQL Server solution, explaining how I solved it in my own system.
I have uploaded a repository to GitHub containing a helper for analysing the results of the Data Migration Assistant tool. I have compiled and improved the version Microsoft released last March so that it can process aggregated results from multiple servers using Microsoft’s static code analysis tool.
SQL Server 2016 really represents a drastic change for the better with regard to general performance. The proof is that, thanks to the engine changes, this is the first SQL Server version where Microsoft actively promises a widespread increase in performance simply by upgrading. But what impact can it have on your daily loads if you use columnstore indexes in a SQL Server 2014 data warehouse? SQL Server 2014 and its columnstore indexes already brought an amazing increase in performance. Is it worth upgrading to SQL Server 2016?
Lately, we have been noticing a trend whereby traditional SANs are increasingly losing ground. This can be due to various reasons, such as forced migrations to the cloud, limited budgets, or even customers who are getting annoyed with their storage providers’ draconian terms and conditions. Common examples are very costly maintenance agreements, higher-than-market prices for disk size upgrades, and difficulty expanding or improving hardware without having to replace it completely. There is also a strong trend to continue going virtual and trading servers as if they were commodities. All of the above is contributing to the shift towards SDS, while VSANs continue to attract new supporters.
While doing an Internet of Things pilot project for a technology engineering group in Scandinavia, we got a chance to try out In-Memory OLTP in SQL Server 2014. What we found was a simple approach, which can be expanded upon, to provision, use, and deprovision the memory-optimized objects for the duration of our experiments.
The Browser service can be confusing if you don’t know how it works and what it does. You may have noticed that it is sometimes activated and sometimes not, even if you run the same OS and the same SQL Server version on two identical servers. The explanation for this slightly confusing behavior is that the Browser service only gets activated when you create a named instance. If you created only a default instance, the service will be disabled.
The Browser listens on UDP port 1434 and replies with the correct port for the requested instance. For example, if I connect to MyServer\MyInstance, a UDP request is sent to port 1434, and the Browser service replies with the correct TCP port number for my client to use when it makes the “real” connection to the instance. The TCP port in use can be either fixed or dynamic, depending on firewall rules on the network and other customer-specific rules.
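The lookup described above can be sketched in a few lines of Python. This is only an illustration based on my reading of the published SQL Server Resolution Protocol: the message layout (a 0x04 request byte followed by the instance name, and a 0x05 response with a two-byte length and a semicolon-separated text body) is an assumption you should verify against the specification, and the function names are hypothetical.

```python
import socket

BROWSER_UDP_PORT = 1434  # port the Browser service listens on


def parse_browser_response(payload: bytes) -> dict:
    """Turn a Browser reply for one instance into a {key: value} dict.

    Assumed layout: 0x05, 2-byte little-endian length, then text such as
    'ServerName;X;InstanceName;Y;...;tcp;1444;;'.
    """
    if len(payload) < 3 or payload[0] != 0x05:
        raise ValueError("not a SQL Server Browser response")
    size = int.from_bytes(payload[1:3], "little")
    text = payload[3:3 + size].decode("ascii", errors="replace")
    fields = text.split(";;")[0].split(";")  # keep the first instance only
    return dict(zip(fields[0::2], fields[1::2]))


def lookup_tcp_port(server: str, instance: str, timeout: float = 2.0) -> int:
    """Ask the Browser on `server` which TCP port `instance` listens on."""
    request = b"\x04" + instance.encode("ascii") + b"\x00"
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(request, (server, BROWSER_UDP_PORT))
        payload, _ = sock.recvfrom(4096)
    return int(parse_browser_response(payload)["tcp"])
```

Calling `lookup_tcp_port("MyServer", "MyInstance")` would perform the same round trip a client library makes before it opens the real TCP connection. If the Browser is stopped or UDP 1434 is blocked, the call times out, which is exactly the situation in which you must supply the port yourself.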
From Microsoft TechNet: Upon startup, SQL Server Browser starts and claims UDP port 1434. SQL Server Browser reads the registry, identifies all instances of SQL Server on the computer, and notes the ports and named pipes that they use. When a server has two or more network cards, SQL Server Browser returns the first enabled port it encounters for SQL Server. SQL Server Browser support ipv6 and ipv4.
Ok, so the browser will find all my instances. So what if you mix the default instance and named instances on the same server? Since you have named instances, the Browser service will be activated, and according to the text above, the Browser identifies all instances and their ports on the computer, right?
The fact is that the Browser ignores your default instance totally; it only works with named instances. As long as you don’t change the default TCP port 1433, however, you will not notice this, since your connection bypasses the Browser anyway. It’s only when you change the port for your default instance that this effect becomes visible. The only way to connect then is to supply the correct port number manually in your connection string.
The practical effect is that you really should think twice before using a default instance on a server where you have fixed ports and named instances. Let's say you want to consolidate three SQL Server instances onto one server, and they need to have fixed ports because of firewall rules. You can either create one default instance and two named instances, or create all three as named instances. If you create all three as named instances, the Browser will deliver their TCP ports without problems, and your connection strings will simply be MyServer\MyInstance1, MyServer\MyInstance2, and MyServer\MyInstance3. If you choose the setup with one default instance that is not on port 1433, you will have to use a connection string like MyServer,1444 for your default instance, and then MyServer\MyInstance for the other two. This is not only ugly and inconsistent, but also poses a problem in the future if you want to change the fixed port numbers.
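The difference between the two addressing styles can be shown with a small helper. This is a sketch only; `server_address` is a hypothetical function, not part of any SQL Server client library, and it builds just the server part of a connection string.

```python
from typing import Optional


def server_address(host: str,
                   instance: Optional[str] = None,
                   port: Optional[int] = None) -> str:
    """Build the server part of a connection string.

    An explicit port bypasses the Browser; an instance name relies on it.
    """
    if port is not None:
        return f"{host},{port}"        # e.g. MyServer,1444
    if instance:
        return f"{host}\\{instance}"   # e.g. MyServer\MyInstance1
    return host                        # default instance on port 1433


# All three as named instances: one consistent pattern, ports resolved by the Browser.
named = [server_address("MyServer", instance=f"MyInstance{i}") for i in (1, 2, 3)]

# Default instance moved off port 1433: the port is hard-coded in the client.
default_moved = server_address("MyServer", port=1444)

print(named, default_moved)
```

With the named-instance pattern, changing a fixed port later only touches the server configuration. With the hard-coded port, every client connection string has to change as well.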
The SQL Server Partition Management Utility (http://sqlpartitionmgmt.codeplex.com/) is one of the best tools for managing partition-switch operations. It is a command-line tool that can be integrated into an SSIS package or used to generate the T-SQL scripts needed in a regular “sliding window” partition management scenario. A blog post that shows how to use this tool is this one.
In my case, I wanted to speed up the loading of a big partitioned fact table through an SSIS package (which calls two child packages). This package runs several instances of the tool in order to load more than one staging table in parallel. Each staging table corresponds to one fact table partition. After each staging table is loaded, the SSIS package loads the target fact table by switching the staging table into the matching partition.
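To make the last step concrete, the sketch below generates the kind of T-SQL switch statement that moves a loaded staging table into its fact table partition. It is not the utility's own output: the table names are hypothetical, and it assumes each staging table is unpartitioned and already matches the target partition's structure and constraints.

```python
def switch_statement(staging_table: str, fact_table: str, partition_number: int) -> str:
    """Return the T-SQL that switches an unpartitioned staging table
    into one partition of the fact table."""
    return (
        f"ALTER TABLE {staging_table} "
        f"SWITCH TO {fact_table} PARTITION {partition_number};"
    )


# One staging table per fact table partition (hypothetical names).
statements = [
    switch_statement(f"dbo.FactSales_Staging_{n}", "dbo.FactSales", n)
    for n in (1, 2, 3)
]

for statement in statements:
    print(statement)
```

The switch itself is a metadata-only operation, so the expensive part of the load is filling the staging tables, which is why running that part in parallel is attractive.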
Everything seemed to work fine, but during the test phase, when I tried to increase the degree of parallelism (that is, the number of concurrently running instances of the tool), I got a deadlock error.