This Blog entry is from the Linear Regression section in Learn Palisade.
Multicollinearity refers to an Independent variable that while having a strong correlation to the Dependent Variable, also has an often unhelpful correlation to another variable, with that variable also being quite well correlated to the Dependent Variable.
Multicollinearity can cause several issue, the most significant is the understatement of Independent Variable coefficients that would otherwise have a remarkable contribution to a model.
We identity Multicollinearity with the help of a Correlation Matrix, which has hitherto been used to identity the relationship between just the Independent Variable and the Dependent Variable.
Start by creating a correlation matrix using Pearson or Spearman:
Highlight the body of the correlation table, which is where the conditional formatting rules will be applied:
The heat map is created by using a Not Between conditional formatting rule. Click on Conditional Formatting to present the menu:
Then select New Rule.
Select the option ‘Format only cells that contain’:
Select the function ‘Not Between’:
Enter the acceptable range, in this case -0.7 and 0.7:
Click on the Format button to specify the highlighting for the heat map:
In this example the highlighting is simple red text, so change the font colour to Red from its default of Automatic:
Click OK to commit the formatting:
Then click OK to commit the rule:
Then click OK to commit all rules together:
The correlation matrix will now highlight in red those correlations that unacceptably covary: