7) Recalling a Neural Network with R

This Blog entry is from the Deep Learning section in Learn R.

Once a model has been trained in H2O it can be recalled very gracefully with the h2o.predict() function of the h2o package.  It is a simple matter of passing the trained model and the hex data frame to be used for recall:

Scores <- h2o.predict(Model,CVHex.hex)
1.png

Run the line of script to console:

2.png

A progress bar is broadcast from the H2O server and will be written out to the console.  To review the output, enter the object:

Scores
3.png

Run the line of script to console:

4.png

The Scores output appears similar to a matrix, but its first column contains the actual prediction for each record, hence it can be subset down to a final vector detailing just the predictions:

Predict <- Scores[1]
5.png

Run the line of script to console:

6.png

The Predict vector can be compared to the Dependent vector of the CV data frame in the same manner as previous models within R to obtain confusion matrices as well as ROC curves.
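
As a minimal sketch of that comparison (assuming the CV data frame holds the actuals in a Dependent column, that the gmodels package is available for CrossTable(), and that the h2o package's as.vector() conversion is used to pull the prediction column back into R):

library(gmodels)                          # provides CrossTable()
PredictVector <- as.vector(Predict)       # convert the H2O prediction column to a plain R vector
CrossTable(CV$Dependent, PredictVector)   # confusion matrix of actual vs predicted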

3) Recalling a Neural Network with compute() and understanding performance.

This Blog entry is from the Neural Networks section in Learn R.

The topology plot gives a useful window into the neural network, and its similarity to a regression model is unmistakable; however, there are none of the performance statistics associated with a regression model.

As this is a numeric prediction model, and not a classification model, we will use correlation to determine the relationship between the dependent variable and the predicted variable.

The compute() function is used instead of the predict() function; it returns an object with several other properties rather than just the prediction (which would be easier).  Something else to bear in mind is that the recall function, and indeed the training function, is very unforgiving if the dependent variable is passed in, throwing the error "Error in neurons[[i]] %*% weights[[i]] : non-conformable arguments".  Frustratingly, it is necessary to subset the data frame to return all of the input columns explicitly, excluding the dependent variable, before passing it to the compute() function. To recall the computed model:

ComputedModel <- compute(NeuralNetworkFourByOne,FDX[,c("Skew_3","Max_4","PointStep_16","Close_3","Close_4","PointStep_17_ZScore","PointStep_15","TypicalValue_4","Range_4","Range_2")])
1.png

Run the line of script to console, it may take some time:

2.png
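
As an aside, assuming FDX contains only these ten input columns plus the dependent variable (named Dependent, as used later in this procedure), a hypothetical alternative is to drop the dependent column by name rather than listing every input:

InputColumns <- setdiff(names(FDX), "Dependent")                      # every column except the dependent variable
ComputedModel <- compute(NeuralNetworkFourByOne, FDX[, InputColumns]) # recall the model on the input columns only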

Unlike the predict() method, compute() has returned an object.   It is necessary to extract the results from this object via its net.result element; this is not yet a plain vector, unfortunately, but it can be converted later using the unlist() function:

FDXPredictions <- ComputedModel$net.result
3.png

Run the line of script to console:

4.png

To gain an assessment of the level of performance of the predictions vs the actual, the correlation function can be used:

cor(FDXPredictions, FDX$Dependent, use = "complete", method = "pearson")
5.png

Run the line of script to console:

6.png

It can be seen, in this example, that a correlation of 0.65 has been achieved.  Referencing the initial correlation matrix calculated on the same dataset in Linear Regression, this is a substantial uplift in performance over the input correlations taken in isolation.

For completeness, the FDXPredictions vector, after converting it from a list, should be merged into the FDX data frame (a sketch of this is shown below).  However, improvement will be sought in the subsequent procedure by using a more complex neural network, in this case taking more hidden layers.
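
A minimal sketch of that merge might look as follows (the column name NeuralNetworkPrediction is an arbitrary choice for illustration):

FDXPredictions <- unlist(ComputedModel$net.result)   # flatten the net.result output into a plain numeric vector
FDX$NeuralNetworkPrediction <- FDXPredictions        # merge the predictions into the FDX data frame as a new column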

4) Recalling a Naive Bayesian Classifier for Classification.

This Blog entry is from the Naive Bayesian section in Learn R.

To recall the prevailing classification directly, rather than recalling the probability for each class and deriving the classification from the larger of the values, the type argument can be set to "class":

ClassPredictions <- predict(BayesianModel,CreditRisk,type = "class")
1.png

Run the line of script to console:

2.png

Merge the classification predictions into the CreditRisk data frame, loading the dplyr library first:

library(dplyr)
CreditRisk <- mutate(CreditRisk, ClassPredictions)
3.png

Run the line of script to console:

4.png

Viewing the CreditRisk data frame:

View(CreditRisk)
5.png

Run the line of script to console:

6.png

Scroll to the last column in the RStudio viewer to reveal the classification for each record:

7.png
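
Although this procedure stops at viewing the classifications, a brief sketch of how they might be appraised, using the same CrossTable() approach as the other procedures (assuming the gmodels package and the Dependent column of the CreditRisk data frame):

library(gmodels)                                               # provides CrossTable()
CrossTable(CreditRisk$Dependent, CreditRisk$ClassPredictions)  # actual vs Naive Bayes classification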

11) Recalling a Gradient Boosting Machine.

This Blog entry is from the Probability and Trees section in Learn R.

Recalling the GBM is quite intuitive and obeys the standardised predict() signature.  To recall the GBM:

GBMPredictions <- predict(GBM,CreditRisk,type = "response")
1.png

Run the line of script to console:

2.png

A distinct peculiarity, given that the CreditRisk data frame has a dependent variable which is a factor, is that the binary classification has been modelled between 1 and 2, these being the levels of the factor, with 1 being Bad and 2 being Good:

3.png

It follows that predictions closer to 2 than to 1 would be considered Good, and those closer to 1 would be considered Bad.  To appraise the model performance, a confusion matrix should be created.  Create a vector using the ifelse() function to classify between Good and Bad:

CreditRiskGBMClassifications <- ifelse(GBMPredictions >= 1.5,"Good","Bad")
4.png

Run the line of script to console:

5.png

Create a confusion matrix between the actual value and the value predicted by the GBM:

CrossTable(CreditRisk$Dependent, CreditRiskGBMClassifications)
6.png

Run the line of script to console:

7.png

It can be seen in this example that the GBM has mustered a strong performance.  Of the 220 accounts that were bad, the GBM classified 182 of them correctly, which gives an accuracy of roughly 82% on the bad accounts. This is a more realistic figure when compared to C5 boosting, as over-fitting will have been contended with.

9) Boosting and Recalling in C5.

This Blog entry is from the Probability and Trees section in Learn R.

Boosting is a mechanism inside the C5 package that creates many different models and then gives each model the opportunity to vote on a classification, with the majority classification prevailing. It could be argued that this is a form of Abstraction.

Simply add the trials argument, set to 10, to indicate that there should be ten trials to vote:

C50Tree <- C5.0(CreditRisk[-1],CreditRisk$Dependent,trials = 10) 
1.png

Run the line of script to console:

2.png

The summary function will produce a report:

summary(C50Tree)
3.png

In this instance, however, upon scrolling up, it can be seen that several different models / trials have been created:

4.png

In the above example the decision tree for the 9th trial is shown.  Prediction takes place in exactly the same manner, using the predict() function, except that it will run all of the models and establish a voted majority classification.  This is boosting:

CreditRiskPrediction <- predict(C50Tree,CreditRisk)
5.png

Run the line of script to console:

6.png

A confusion matrix can be created to compare this object with that created in procedure 100:

CrossTable(CreditRisk$Dependent, CreditRiskPrediction)
9.png

Run the line of script to console:

10.png

In this example, it can be observed that 281 accounts were predicted to be bad; taking the CreditRiskPrediction column-wise, only 1 account was classified as bad in error.  Out of 281 classifications as bad, the error rate is therefore roughly 0.4%.  Referring to the original model as created, it can be seen that an 11% increase in performance has been achieved from boosting.

There is such a thing as a model being too good, which would indicate that the model is perhaps over-fit. Over-fitting is dealt with in more detail while exploring Gradient Boosting Machines and Neural Networks; at this stage it is sufficient to say that one should never test a model on the same data that was used to train it.
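
As a minimal illustrative sketch of that advice (not part of this procedure), a simple random hold-out split of the CreditRisk data frame might look as follows:

set.seed(123)                                                   # for reproducibility
TrainRows <- sample(nrow(CreditRisk), 0.7 * nrow(CreditRisk))   # select 70% of rows at random
CreditRiskTrain <- CreditRisk[TrainRows, ]                      # data used to train the model
CreditRiskTest  <- CreditRisk[-TrainRows, ]                     # held-out data used to test it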