8) Create a Heat Map Correlation Matrix

This Blog entry is from the Linear Regression section in Learn Palisade.

Multicollinearity refers to an Independent variable that while having a strong correlation to the Dependent Variable, also has an often unhelpful correlation to another variable, with that variable also being quite well correlated to the Dependent Variable. 

Multicollinearity can cause several issue, the most significant is the understatement of Independent Variable coefficients that would otherwise have a remarkable contribution to a model.

We identity Multicollinearity with the help of a Correlation Matrix, which has hitherto been used to identity the relationship between just the Independent Variable and the Dependent Variable.

Start by creating a correlation matrix using Pearson or Spearman:

1.png

Highlight the body of the correlation table, which is where the conditional formatting rules will be applied:

2.png

The heat map is created by using a Not Between conditional formatting rule.  Click on Conditional Formatting to present the menu:

3.png

Then select New Rule.

Select the option ‘Format only cells that contain’:

4.png

Select the function ‘Not Between’:

5.png

Enter the acceptable range, in this case -0.7 and 0.7:

6.png

Click on the Format button to specify the highlighting for the heat map:

7.png

In this example the highlighting is simple red text, so change the font colour to Red from its default of Automatic:

8.png

Click OK to commit the formatting:

9.png

Then click OK to commit the rule:

10.png

Then click OK to commit all rules together:

11.png

The correlation matrix will now highlight in red those correlations that unacceptably covary:

12.png