6) Creating a Confusion Matrix for a C5 Decision Tree.

This Blog entry is from the Probability and Trees section in Learn R.

Beyond the summary statistic already created, the confusion matrix is the most convenient means of appraising the utility of a classification model. The confusion matrix for the C5 decision tree model will be created using the CrossTable() function of the gmodels package:

library("gmodels")
CrossTable(CreditRisk$Dependent, CreditRiskPrediction)
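As a rough cross-check, the same actual-versus-predicted counts can be tabulated with base R's table() function. The sketch below uses two small made-up vectors (actual and predicted are hypothetical stand-ins for CreditRisk$Dependent and CreditRiskPrediction, not the real data), so the counts it prints are purely illustrative:

```r
# Hypothetical toy data: stand-ins for CreditRisk$Dependent (actual)
# and CreditRiskPrediction (predicted). Not the real credit risk data.
actual    <- factor(c("Bad", "Bad", "Good", "Good", "Good", "Bad"),
                    levels = c("Bad", "Good"))
predicted <- factor(c("Bad", "Good", "Good", "Bad", "Good", "Bad"),
                    levels = c("Bad", "Good"))

# Rows are the actual classes, columns are the predicted classes,
# which matches the argument order used in the CrossTable() call.
confusion <- table(Actual = actual, Predicted = predicted)
print(confusion)
```

With gmodels loaded, the CrossTable() call can be made easier to read in a similar way, for example CrossTable(CreditRisk$Dependent, CreditRiskPrediction, prop.chisq = FALSE, dnn = c("Actual", "Predicted")), which labels the two axes and drops the chi-square contribution from each cell.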

Run the line of script to console to print the confusion matrix.

The overall utility of the C5 decision tree model can be inferred in the same manner as procedure 100.

The confusion matrix shows that the model correctly classified 206 records as Bad. Reading the CreditRiskPrediction columns, a further 28 records were classified as Bad when they were in fact Good. The model therefore has an error rate of about 12% (28 of 234) on the records it classifies as Bad. With this metric noted, procedure 112 will attempt boosting, which should improve the model.
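The error rate quoted above can be reproduced from the two counts in the Bad prediction column. This is a minimal sketch using only the figures stated in the text; the variable names are illustrative and do not come from the original script:

```r
# Counts taken from the Bad column of the confusion matrix described above.
true_bad  <- 206  # predicted Bad, actually Bad
false_bad <- 28   # predicted Bad, actually Good

# Total number of records the model classified as Bad.
predicted_bad_total <- true_bad + false_bad

# Share of Bad predictions that were wrong, and its complement (precision).
error_rate    <- false_bad / predicted_bad_total
precision_bad <- true_bad / predicted_bad_total

cat(sprintf("Error rate on predicted Bad: %.1f%%\n", 100 * error_rate))
cat(sprintf("Precision on predicted Bad:  %.1f%%\n", 100 * precision_bad))
```

Recomputing the same figure after boosting gives a like-for-like comparison of the two models.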