Measure the Strength of Association Between Two Categorical Variables: Mosaic Plot and Chi-Square Test

In a Data Science project it’s really important to get the more insights out of your data. There is a specific phase, the first one in the project, that has the data analysis as goal: the Data Exploration phase.

Among other kinds of analysis, one of the most interesting is the bi-variate one, that finds out the relationship between two variables. If the two variables are categorical, the most common plot used to analyze their relationship is the mosaic plot. At first sight it may appear a little bit confusing. People not aware of some statistical concepts can miss important information this plot can give us. So, we’ll go a little bit deeper in these concepts.

Read the rest of the article here.