9) Grading the ROC Performance with AUC.

This Blog entry is from the Logistic Regression section in Learn R.

Visually, the ROC Curve plot created in the previous Blog entry suggests that the model has some predictive power.  A more succinct way to measure model performance is the Area Under the Curve (AUC) statistic, which can be calculated with ease by requesting "auc" as the measure in the performance() function:

AUC <- performance(ROCRPredictions, measure = "auc")
1.png

Run the line of script to console:

2.png

To write out the contents of the AUC object:

AUC

3.png

Run the line of script to console:

4.png

The value to gravitate towards is y.values, which will typically range between 0.5 (no better than random guessing) and 1 (perfect discrimination):

5.png
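To extract the AUC as a plain number, note that ROCR returns S4 objects, so the y.values slot is accessed with @ and is stored as a list. This is a minimal sketch; the object name AUCValue is simply illustrative:

# Extract the AUC from the S4 slot, which holds a list with a single element
AUCValue <- AUC@y.values[[1]]
AUCValue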

In this example, the AUC value is 0.827767, which suggests that the model has excellent utility. By way of grading, AUC scores correspond to the following bands (a small helper for applying this grading is sketched after the list):

· A: Outstanding > 0.9

· B: Excellent > 0.8 and <= 0.9

· C: Acceptable > 0.7 and <= 0.8

· D: Poor > 0.6 and <= 0.7

· E: Junk > 0.5 and <= 0.6
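As a convenience, the grading above can be expressed as a small helper function. This is an illustrative sketch only; GradeAUC is not part of ROCR, and it assumes the AUC has already been extracted as a plain number as shown earlier:

# Hypothetical helper that maps an AUC value to the grades listed above
GradeAUC <- function(auc) {
  if (auc > 0.9) "A: Outstanding"
  else if (auc > 0.8) "B: Excellent"
  else if (auc > 0.7) "C: Acceptable"
  else if (auc > 0.6) "D: Poor"
  else "E: Junk"
}
GradeAUC(0.827767)   # "B: Excellent" for the model in this example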

8) Creating a ROC Curve.

This Blog entry is from the Logistic Regression section in Learn R.

The ROCR package provides a set of functions that simplify the process of appraising the performance of classification models by comparing the actual outcome with a probability prediction.  It is worth noting that although a logistic regression model outputs a score on the log-odds scale (in this example roughly between -5 and +5), the logistic function converts this value to an intuitive probability between 0 and 1.
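As a brief illustration of that conversion (not part of ROCR), the logistic function 1 / (1 + exp(-x)) maps any log-odds score onto the 0 to 1 probability scale; base R also exposes it as plogis():

# Map example log-odds scores to probabilities with the logistic (sigmoid) function
LogOdds <- c(-5, 0, 5)
1 / (1 + exp(-LogOdds))   # approximately 0.007, 0.500, 0.993
plogis(LogOdds)           # identical result using the built-in function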

Firstly, install the ROCR package from the RStudio package installation utility.

1.png

Click install to proceed with the installation:

2.png
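Alternatively, the package can be installed directly from the console with install.packages(), which achieves the same result as the RStudio utility:

install.packages("ROCR")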

Reference the ROCR library:

library(ROCR)
3.png

Run the block of script to console:

4.png

Two vectors are needed as inputs to create the visualisation: the first is the predictions expressed as a probability, the second is the actual outcome.  In this example, these are the vectors FraudRisk$PAutomaticLogisticRegression and FraudRisk$Dependent.  To create the prediction object in ROCR:

ROCRPredictions <- prediction(FraudRisk$PAutomaticLogisticRegression, FraudRisk$Dependent)
5.png

Once the prediction object has been created, it needs to be converted into a performance object using the performance() function.  The performance() function takes the prediction object as well as an indication of the performance measures to be used, in this case the true positive rate (tpr) vs the false positive rate (fpr).  The performance() function outputs an object that can be used in conjunction with the base graphics plot() function:

ROCRPerformance <- performance(ROCRPredictions, measure = "tpr", x.measure = "fpr")
6.png

Run the line of script to console:

7.png
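Although not required for the plot, it can be informative to peek inside the performance object.  Like the prediction object it is an S4 object, so its slots are accessed with @, and each slot holds a list.  A brief exploratory sketch:

head(ROCRPerformance@x.values[[1]])       # false positive rate at each cutoff
head(ROCRPerformance@y.values[[1]])       # true positive rate at each cutoff
head(ROCRPerformance@alpha.values[[1]])   # the probability cutoffs themselves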

Simply plot the ROCRPerformance object by passing it as an argument to the base graphics plot() function.
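ROCR provides a plot method for performance objects, so (assuming the default plotting options) the call is just:

plot(ROCRPerformance)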

8.png

Run the line of script to console:

9.png

It can be seen that a curve plot has been created in the plots window in RStudio:

10.png

It can be seen that the curve sits above the diagonal, leading to the inference that the model has some predictive power.
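To make the comparison with random guessing explicit, a dashed diagonal reference line (representing a model with no predictive power) can be added to the existing plot using the base graphics abline() function:

abline(a = 0, b = 1, lty = 2)   # diagonal reference line for a model with no predictive power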