Measure the Strength of Association Between Two Categorical Variables: Mosaic Plot and Chi-Square Test

In a Data Science project it’s really important to get the more insights out of your data. There is a specific phase, the first one in the project, that has the data analysis as goal: the Data Exploration phase.

Among other kinds of analysis, one of the most interesting is the bi-variate one, that finds out the relationship between two variables. If the two variables are categorical, the most common plot used to analyze their relationship is the mosaic plot. At first sight it may appear a little bit confusing. People not aware of some statistical concepts can miss important information this plot can give us. So, we’ll go a little bit deeper in these concepts.

Read the rest of the article here.

Power BI (Audit) Usage Analytics; Get the most out of your Dashboards!

Power BI (Audit) Usage Analytics; Get the most out of your Dashboards!

Am I getting the most out of my Dashboards?

We already know that when a company has to decide whether to invest in a Business Intelligence project, it has to find the answers to all the questions that arise about its effectiveness: Are we really going to get anything out of it? Will it give us the information we need? Will it be beneficial for us? In many cases, it is difficult for companies to have the answer to all these questions, especially when we are in the early stages of the project. (more…)

What is Machine Learning?

What is Machine Learning?

The increasing diversification of the type and volume of data, the lowering of computational processing costs and storage costs have opened a window of opportunity for the resurgence of a discipline that already existed on paper and among equations: Machine Learning.

What is Machine Learning?
(more…)

Get data in Power BI from Oracle Database

Get data in Power BI from Oracle Database

Microsoft’s Business Analytics Service Power BI, enables us to connect to hundreds of data sources and produce beautiful reports that can be consumed on the web and across mobile devices, in order to deliver insights throughout our entire organization. In this post, I will walk you through the process to get data in Power BI from Oracle Database.

When you open Power BI Desktop, you will see the following window: (more…)

How to extract Twitter data with a windows service created with Python in Visual Studio 2017

How to extract Twitter data with a windows service created with Python in Visual Studio 2017

Hi everyone,

In this post, we will code a script in python (with Visual Studio 2017) to create a program which we can execute as a windows service in order to extract (in almost real time) the tweets related to certain words or hashtags, store them in a SQL server database, and then consume them with Power BI. (more…)

How to refresh Power BI dataset from an on-premise Power Shell script

How to refresh Power BI dataset from an on-premise Power Shell script

Hi friends,

Today we will show you how we can refresh a dataset published in Power BI from a Power Shell Script that we would invoke at the end of our ETL process.

We will use the Power BI libraries for power shell to connect to our power Bi portal and send an instruction to refresh a data set. This could be useful to improve our ETL processes, refreshing our on-line datasets used in Power Bi portal before loading data into our data-warehouse and/or our OLAP/Tabular database send an instruction to. (more…)

SQL Server performance with Spectre and Meltdown patches

SQL Server performance with Spectre and Meltdown patches

As you know, as long as you are not totally oblivious to the technological world you will have heard about one of the biggest bugs in the history of computer science (Spectre and Meltdown) and that its effects are real. So real, that we ourselves at SolidQ ourselves have experienced it in our own Query Analytics software. In this post I will try to shed some light on how to proceed if you detect performance regression in your solution with SQL Server, explaining how I have solved it in my own system.

(more…)

How to Better Evaluate the Goodness-of-Fit of Regressions

Most of the content of this post is platform-agnostic. Since in these days I’m using Azure Machine Learning, I take it as a starting point of my studies.

It’s quite simple for an Azure Machine Learning average user to create a regression experiment, make the data flow in it and get the predicted values. It’s also easy to have some metrics to evaluate the implemented model. Once you get them, the following questions arise:  (more…)