Regardless of the tools used for data analysis, normally the way to display the results is a Word document or a PowerPoint presentation.
In this post, we will create a PowerPoint presentation and insert a series of graphics and text programmatically, using the OfficeR and rvg packages together. We will also take advantage of the occasion to present (for those who do not know) the ‘Pipe’ operator, very useful when nesting functions.
In an on-premises environment when we propose solutions to geographical disasters, the most common option is log shipping. The use of asynchronous database mirroring or availability groups with asynchronous replicas is also common but includes an additional risk that is not usually contemplated. We refer to the “speed” with which the changes are transferred, as quickly as the network and the target system allow us. This means that when the disaster has a human origin, an important error when we become aware of it, we will have this error replicated and applied. Obviously, a better solution would be to combine both options, which are not exclusive, with which we would cover more disaster scenarios increasing the cost of the solution. (more…)
The increasing diversification of the type and volume of data, the lowering of computational processing costs and storage costs have opened a window of opportunity for the resurgence of a discipline that already existed on paper and among equations: Machine Learning.
What is Machine Learning?
In the last article in this series about Python for SQL Server specialists, you are going to learn how to use SQL Server Python libraries in SQL Server. You can use two scalable libraries, the revoscalepy and microsoftml libraries, which correspond to equivalent R libraries.
SQL Server and ML Integration
With SQL Server 2016 and 2017, you get a highly scalable Machine Learning (ML) engine. Not every function and algorithm is rewritten as a scalable one. Nevertheless, you will probably find the one you need for your analysis of a big dataset. You can store a Python or R data mining or machine learning model in a SQL Server table and use it for predictions on new data. You can even store graphs in a binary column and use it in SQL Server Reporting Services (SSRS) reports. Finally, ML support is not limited to SQL Server only. You can use R code also in Power BI Desktop and Power BI Service, where we at this time (April 2018) still waiting for Python support. However, you can already use both languages, Python and R, in Azure Machine Learning (Azure ML) experiments. (more…)
In this post, we will code a script in python (with Visual Studio 2017) to create a program which we can execute as a windows service in order to extract (in almost real time) the tweets related to certain words or hashtags, store them in a SQL server database, and then consume them with Power BI. (more…)
Python is one of the most popular programming languages. It is a general purpose high level language. It was created by Guido van Rossum, publicly released in 1991. SQL Server 2016 started to support R, and SQL Server 2017 adds support for Python. Now you can select your preferred language for the data science and even other tasks. R has even more statistical, data mining and machine learning libraries, because it is more widely used in the data science community; however, Python has broader purpose than just data science, and is more readable and might thus be simpler to learn. This is the first of the four articles that introduce Python to SQL Server developers and business intelligence (BI) specialists. This means that the articles are more focused on Python basics and data science, and less on general programming with Python.
When talking about technologies, speed and capacity in the transfer of data plays a fundamental role. That is why two of the most important technology companies in the world, such as Microsoft and Facebook, along with Telxius, a telephone infrastructure subsidiary, have inaugurated the most advanced transatlantic cable so far. The result of this ambitious project has been named Marea (Spanish for “Tide”) and its main aim is to satisfy the growing demand for Internet services and cloud computing in recent years. Marea is the most advanced subsea cable that crosses the Atlantic Ocean. The cable connects the United States (Virginia Beach) with Spain (Bilbao).
I have uploaded a repository containing a helper to analyse the Data Migration Assistant tool results to GitHub. I have compiled and improved Microsoft’s version released last March in order to process aggregated results from multiple servers using Microsoft’s static code analysis tool. (more…)
SQL Server 2016 really represents a drastic change for the better in regards to general performance. And the proof is that, thanks to the engine changes, this is the first SQL Server version where Microsoft actively provides a widespread increase in performance levels by simply upgrading. But what impact can it have in your daily loads if you use columnstore indexes in SQL server 2014 datawarehouse? There was already an amazing increase in performance levels with SQL Server 2014 and the columnstore indexes. Is it worth upgrading to SQL 2016? (more…)
Below is a comparison table between ETL tools: SQL Server Integration Services aka SSIS vs PowerCenter. What tool should I choose? What are the differences? What are the advantages and disadvantages? Do they meet my requirements and particular needs? These are some of the questions that you may have asked yourself at the start of any new project, and we will aim to answer them in this post by providing you with some key facts regarding two very potent tools: SQL Server Integration Services a.k.a. SSIS versus Informatica PowerCenter.