1) Forward Stepwise Logistic Regression

This Blog entry is from the Linear Regression section in Learn Palisade.

The procedure for creating a Logistic Regression model is almost identical to that for creating a Linear Regression model: the default options suffice, and the model is specified in terms of Dependent and Independent variables rather than the X and Y specifications used in other analyses.

Logistic Regression is available by clicking the Regression and Classification menu on the StatTools ribbon, then clicking Logistic Regression on the submenu:

1.png

The logistic regression window will open:

2.png

Stepwise Logistic Regression works in the same manner as stepwise Linear Regression. Although it is not shown here, this Blog entry assumes that correlation analysis has already been performed on all variables and that the variable with the strongest correlation is carried forward as the starting independent variable, in this case High_Risk_Country (a pivoted categorical variable):

3.png
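For readers who want to reproduce the assumed correlation analysis outside StatTools, the ranking step can be sketched in Python. This is only an illustration under stated assumptions: the eight rows of data are invented, and the Amount and Is_Weekend columns are hypothetical names that do not come from the dataset used in this Blog entry. Only High_Risk_Country and Dependent are taken from the text.

```python
import math


def pearson(x, y):
    # Pearson correlation coefficient between two equal-length numeric lists.
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Invented illustration data: 1 = fraudulent, 0 = not fraudulent.
dependent = [1, 1, 1, 0, 0, 0, 0, 0]

# Candidate independent variables (Amount and Is_Weekend are hypothetical).
candidates = {
    "High_Risk_Country": [1, 1, 1, 0, 0, 0, 0, 1],
    "Amount": [900, 750, 400, 300, 650, 200, 100, 500],
    "Is_Weekend": [1, 0, 0, 1, 0, 1, 0, 0],
}

# Correlate every candidate with the dependent variable.
correlations = {name: pearson(values, dependent) for name, values in candidates.items()}

# Rank by absolute correlation, strongest first; the first entry is the
# starting independent variable for the stepwise procedure.
ranking = sorted(correlations, key=lambda name: abs(correlations[name]), reverse=True)

for name in ranking:
    print(f"{name}: {correlations[name]:+.3f}")
```

The ranking uses the absolute value of the correlation, so a strongly negative variable would be carried forward just as readily as a strongly positive one.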

The dependent variable in this dataset is titled Dependent and represents whether or not the transaction is fraudulent:

4.png
5.png

Although it is selected by default, make sure the ‘Include Classification Summary’ option is ticked, as the classification summary is an important performance measure for stepwise Logistic Regression.

Clicking OK will produce the Logistic Regression output:

6.png

Stepwise Linear Regression should by now be familiar, and the same concepts apply to Logistic Regression. The performance measures differ, however: P-Values are still optimised in the same way and should ideally never exceed 5%, but the remaining measures relate to the classification accuracy of the Logistic Regression model, which should always be improved where possible:

7.png
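To make the idea of classification accuracy concrete, the following Python sketch builds a small classification summary from predicted probabilities. It is not StatTools output: the actual and predicted values are invented, and the 0.5 cutoff and the function name classification_summary are assumptions made for illustration.

```python
def classification_summary(actual, predicted_prob, cutoff=0.5):
    # Count each combination of actual class and predicted class.
    counts = {"true_positive": 0, "false_positive": 0,
              "true_negative": 0, "false_negative": 0}
    for y, p in zip(actual, predicted_prob):
        predicted = 1 if p >= cutoff else 0
        if y == 1 and predicted == 1:
            counts["true_positive"] += 1
        elif y == 0 and predicted == 1:
            counts["false_positive"] += 1
        elif y == 0 and predicted == 0:
            counts["true_negative"] += 1
        else:
            counts["false_negative"] += 1
    correct = counts["true_positive"] + counts["true_negative"]
    # Overall share of rows classified correctly.
    counts["accuracy"] = correct / len(actual)
    return counts


# Invented example: 1 = fraudulent, 0 = not fraudulent.
actual = [1, 0, 1, 1, 0, 0, 0, 1]
predicted_prob = [0.9, 0.2, 0.4, 0.7, 0.6, 0.1, 0.3, 0.8]

summary = classification_summary(actual, predicted_prob)
print(summary)
```

In this invented example six of the eight transactions are classified correctly, giving an accuracy of 75%. This is the kind of figure to watch as each new variable is added.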

It follows that the Logistic Regression model should be improved by adding the next strongest correlating variable, seeking an improvement in classification accuracy while maintaining good P-Values.
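The whole forward stepwise loop can be sketched in plain Python as a rough illustration of the procedure described in this Blog entry. Several things here are assumptions rather than a description of what StatTools does internally: the data is synthetic, Amount and Noise are hypothetical variable names, the model is fitted by Newton-Raphson, the P-Values are Wald P-Values, and the acceptance rule (every P-Value at or below 5% and a strictly higher classification accuracy) is one reasonable reading of the text.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def solve(a, b):
    # Solve the linear system a x = b by Gauss-Jordan elimination.
    n = len(b)
    m = [a[i][:] + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_logit(rows, y, iterations=25):
    # Newton-Raphson maximum likelihood fit; each row starts with 1.0 (intercept).
    k = len(rows[0])
    beta = [0.0] * k

    def prob_and_hessian(coefs):
        p = [sigmoid(sum(b * x for b, x in zip(coefs, r))) for r in rows]
        hess = [[sum(pi * (1 - pi) * r[i] * r[j] for r, pi in zip(rows, p))
                 for j in range(k)] for i in range(k)]
        return p, hess

    for _ in range(iterations):
        p, hess = prob_and_hessian(beta)
        grad = [sum((yi - pi) * r[j] for r, yi, pi in zip(rows, y, p))
                for j in range(k)]
        beta = [b + s for b, s in zip(beta, solve(hess, grad))]

    p, hess = prob_and_hessian(beta)
    p_values = []
    for j in range(k):
        unit = [1.0 if i == j else 0.0 for i in range(k)]
        std_err = math.sqrt(solve(hess, unit)[j])  # j-th diagonal of the inverse
        # Two-sided Wald P-Value from the standard normal distribution.
        p_values.append(math.erfc(abs(beta[j] / std_err) / math.sqrt(2)))
    accuracy = sum((pi >= 0.5) == (yi == 1) for pi, yi in zip(p, y)) / len(y)
    return beta, p_values, accuracy


def forward_stepwise(data, y, ordered_candidates, alpha=0.05):
    # Candidates must already be ordered by strength of correlation.
    selected, best_accuracy = [], 0.0
    for name in ordered_candidates:
        trial = selected + [name]
        rows = [[1.0] + [data[v][i] for v in trial] for i in range(len(y))]
        _, p_values, accuracy = fit_logit(rows, y)
        # Keep the variable only if all slope P-Values are good and accuracy improves.
        if max(p_values[1:]) <= alpha and accuracy > best_accuracy:
            selected, best_accuracy = trial, accuracy
    return selected, best_accuracy


# Synthetic data: Amount and Noise are hypothetical columns.
rng = random.Random(0)
n = 400
data = {
    "High_Risk_Country": [1.0 if rng.random() < 0.3 else 0.0 for _ in range(n)],
    "Amount": [rng.gauss(0, 1) for _ in range(n)],
    "Noise": [rng.gauss(0, 1) for _ in range(n)],
}
dependent = [
    1 if rng.random() < sigmoid(-1.5 + 2.5 * h + 1.0 * a) else 0
    for h, a in zip(data["High_Risk_Country"], data["Amount"])
]

selected, accuracy = forward_stepwise(
    data, dependent, ["High_Risk_Country", "Amount", "Noise"])
print(selected, round(accuracy, 3))
```

The loop mirrors the manual process: start with the strongest correlating variable, try the next one, and keep it only when the P-Values stay below 5% and the classification accuracy goes up.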