Whatever tools are used for data analysis, the results are usually presented in a Word document or a PowerPoint presentation.
In this post, we will create a PowerPoint presentation and insert a series of graphics and text programmatically, using the officer and rvg packages together. We will also take the opportunity to introduce (for those who do not know it) the pipe operator, which is very useful when function calls would otherwise be nested.
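For readers who have not met the idea, a pipe passes the result of one call on to the next, so a chain reads left to right instead of inside out. The post itself works in R; the sketch below only imitates the idea in Python, and the `pipe` helper is a hypothetical name, not part of any package mentioned here.

```python
from functools import reduce


def pipe(value, *functions):
    """Pass `value` through each function in turn, left to right."""
    return reduce(lambda acc, fn: fn(acc), functions, value)


# Nested form: read from the innermost call outwards.
nested = len(str.lower(str.strip("  Power BI  ")))

# Piped form: the same steps, read in the order they run.
piped = pipe("  Power BI  ", str.strip, str.lower, len)

print(nested, piped)
```

Both forms do the same work; the piped version simply lists the steps in the order they are applied.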
In an on-premises environment, when we propose solutions for geographic disaster recovery, the most common option is log shipping. Asynchronous database mirroring or availability groups with asynchronous replicas are also common, but they carry an additional risk that is not usually considered: the speed with which changes are transferred, which is as fast as the network and the target system allow. This means that when the disaster has a human origin, such as a serious error, by the time we become aware of it the error will already have been replicated and applied. Obviously, a better solution would be to combine both options, which are not mutually exclusive; this covers more disaster scenarios, at the price of increasing the cost of the solution. (more…)
In this post, we will write a Python script (with Visual Studio 2017) that can run as a Windows service to extract, in near real time, the tweets related to certain words or hashtags, store them in a SQL Server database, and then consume them with Power BI. (more…)
After learning about Python fundamentals and the basics of working with data, it is time to start with the more exciting parts of this Python for SQL Server Specialists series.
In this article you will learn about the most important libraries for advanced graphing, namely matplotlib and seaborn, and about the most popular data science library, the scikit-learn library. (more…)
Today we will show you how to refresh a dataset published in Power BI from a PowerShell script that we would invoke at the end of our ETL process.
We will use the Power BI libraries for PowerShell to connect to our Power BI portal and send an instruction to refresh a dataset. This can improve our ETL processes by refreshing the online datasets used in the Power BI portal after loading data into our data warehouse and/or our OLAP/Tabular database. (more…)
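The post does this from PowerShell. As an alternative illustration, the same refresh instruction can be sent straight to the Power BI REST API; the sketch below does so from Python with only the standard library. It assumes you already hold a valid Azure AD access token and know the workspace (group) and dataset IDs; the function names are hypothetical, and nothing is sent until `refresh_dataset` is called.

```python
import json
import urllib.request

API_ROOT = "https://api.powerbi.com/v1.0/myorg"


def build_refresh_request(group_id, dataset_id, access_token):
    """Build (but do not send) the POST request that asks for a dataset refresh."""
    url = f"{API_ROOT}/groups/{group_id}/datasets/{dataset_id}/refreshes"
    body = json.dumps({"notifyOption": "MailOnFailure"}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
    )


def refresh_dataset(group_id, dataset_id, access_token):
    """Send the refresh request and return the HTTP status code."""
    request = build_refresh_request(group_id, dataset_id, access_token)
    with urllib.request.urlopen(request) as response:
        return response.status
```

A call such as `refresh_dataset(group_id, dataset_id, token)` would sit as the last step of the ETL job, once the warehouse load has finished.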
In my previous article, you learned Python fundamentals. I also introduced the basic data structures. As you can imagine, you need more advanced data structures for analyzing SQL Server data, which comes in tabular format. Python also has a data frame object, as R does. It is defined in the pandas library. You communicate with SQL Server through pandas data frames. But before getting there, you first need to learn about arrays and other objects from the numpy library.
In this article, you will learn about the objects from two of the most important Python libraries, namely, as mentioned, numpy and pandas. (more…)
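As a small taste of these two libraries, the sketch below builds a numpy array and a pandas data frame and aggregates the frame by a column. It assumes numpy and pandas are installed; the column names and values are made up for illustration.

```python
import numpy as np
import pandas as pd

# A numpy array holds values of a single type and supports vectorized math.
sales = np.array([100.0, 250.0, 175.0])
sales_with_tax = sales * 1.21  # applied element-wise, no explicit loop

# A pandas data frame is tabular: named columns, one row per record.
orders = pd.DataFrame({
    "customer": ["A", "B", "A"],
    "amount": sales,
})

# Aggregate per customer, much like GROUP BY in T-SQL.
totals = orders.groupby("customer")["amount"].sum()
print(totals)
```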
Python is one of the most popular programming languages. It is a general-purpose, high-level language. It was created by Guido van Rossum and publicly released in 1991. SQL Server 2016 started to support R, and SQL Server 2017 adds support for Python. Now you can select your preferred language for data science and even other tasks. R has even more statistical, data mining and machine learning libraries, because it is more widely used in the data science community; however, Python has a broader purpose than just data science, is more readable, and might thus be simpler to learn. This is the first of four articles that introduce Python to SQL Server developers and business intelligence (BI) specialists. This means that the articles are more focused on Python basics and data science, and less on general programming with Python.
In this post, we will talk about a new property that appears in SQL Server 2016: Auto Adjust Buffer Size (AutoAdjustBufferSize). This property is specific to the Data Flow task and can take the values True or False (the default). We will also propose an approximate solution for previous versions.
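To give a feel for the arithmetic behind buffer sizing, here is a minimal sketch of how a buffer size could be estimated by hand when automatic adjustment is not available. It is not necessarily the approach the full post proposes. The defaults (10 MB and 10,000 rows) and the 100 MB cap for versions before 2016 are assumptions taken from the SSIS documentation and should be checked against your version; the function names are hypothetical.

```python
# Assumed SSIS defaults; verify against the documentation for your version.
DEFAULT_BUFFER_SIZE = 10 * 1024 * 1024         # 10 MB
DEFAULT_BUFFER_MAX_ROWS = 10_000
MAX_BUFFER_SIZE_PRE_2016 = 100 * 1024 * 1024   # 100 MB cap (assumption)


def rows_per_buffer(row_bytes,
                    buffer_size=DEFAULT_BUFFER_SIZE,
                    max_rows=DEFAULT_BUFFER_MAX_ROWS):
    """Rows that fit in one buffer: limited by both the size and the row cap."""
    return min(max_rows, buffer_size // row_bytes)


def suggested_buffer_size(row_bytes, target_rows):
    """Buffer size needed to hold `target_rows` rows, capped at the assumed maximum."""
    return min(row_bytes * target_rows, MAX_BUFFER_SIZE_PRE_2016)


# With wide rows, the default buffer fills up before the row limit is reached.
print(rows_per_buffer(2_000))
print(suggested_buffer_size(2_000, 10_000))
```

The point of the sketch is that with wide rows the size limit, not the row limit, decides how many rows travel in each buffer, which is the situation a manual adjustment tries to correct.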
Unless you are totally oblivious to the technological world, you will have heard about one of the biggest bugs in the history of computer science (Spectre and Meltdown) and that its effects are real. So real that we at SolidQ have experienced them ourselves in our own Query Analytics software. In this post I will try to shed some light on how to proceed if you detect a performance regression in your SQL Server solution, explaining how I solved it in my own system.
I have uploaded to GitHub a repository containing a helper to analyse the results of the Data Migration Assistant tool. I have compiled and improved the version Microsoft released last March so that it can process aggregated results from multiple servers using Microsoft’s static code analysis tool. (more…)