6) Activating a Classification Model and Appraising Performance.

This Blog entry is from the Neural Networks section in Learn R.

Recalling the neural network returns a value between 0 and 1 representing the likelihood that the record is fraudulent. Extract the predictions from the trained model:

FraudRiskPredictions <- FraudRiskNeuralNetwork$net.result
1.png

Run the line of script to console:

2.png

Peek at the results with the head() function:

head(FraudRiskPredictions)
3.png

Run the line of script to console:

4.png

It can be seen that numeric values between 0 and 1 have been returned; the closer to 1, the more likely it is that the record is fraudulent.  To assert a proper classification, so that a confusion matrix may be plotted to appraise the performance of the model, create a vector that contains 1 where the value of FraudRiskPredictions is greater than 0.5 and 0 otherwise, wrapping FraudRiskPredictions with the unlist() function to transform the list output into a vector:

IsFraud <- ifelse(unlist(FraudRiskPredictions) > 0.5,1,0)
5.png

Run the line of script to console:

6.png
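To illustrate the thresholding just performed, here is the same ifelse()/unlist() pattern applied to a handful of made-up scores standing in for FraudRiskPredictions:

```r
# Made-up prediction scores, in the list shape the blog works with.
scores <- list(0.91, 0.12, 0.57, 0.49)

# unlist() flattens the list to a numeric vector; ifelse() applies the
# 0.5 threshold element-wise, yielding a 1/0 classification per record.
as_flags <- ifelse(unlist(scores) > 0.5, 1, 0)
as_flags  # 1 0 1 0
```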

As has become customary, use a confusion matrix to appraise the value of the classifier:

library("gmodels")
CrossTable(FraudRisk$Dependent, IsFraud)
7.png

Run the block of script to console:

8.png

In this example, it can be seen that 720 records were correctly classified as fraudulent out of 901 fraudulent records in total, so the accuracy in predicting fraud is 720 / 901, or 79.9%: a substantial uplift on the logistic regression models created in procedure 93.  It is well worth mentioning that, for classification problems, less is very often more: rather than increasing network complexity by adding more and more hidden layers and processing elements, it is often more efficient to create many more abstracted variables backed by intuitive judgement and domain expertise.
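The headline figure above can be reproduced with base R alone; the short vectors here are illustrative stand-ins for FraudRisk$Dependent and IsFraud:

```r
# Illustrative stand-ins for FraudRisk$Dependent (actual) and IsFraud (predicted).
actual    <- c(1, 1, 1, 0, 0, 1, 0, 1)
predicted <- c(1, 0, 1, 0, 1, 1, 0, 1)

# A base R confusion matrix, as an alternative to gmodels::CrossTable().
confusion <- table(Actual = actual, Predicted = predicted)
print(confusion)

# Proportion of actual frauds that were caught.
fraud_recall <- confusion["1", "1"] / sum(confusion["1", ])

# The worked example above: 720 frauds caught out of 901.
round(720 / 901, 3)  # 0.799
```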

5) Training a Classification Model.

This Blog entry is from the Neural Networks section in Learn R.

Neural Networks are universal approximators, which is to say that they can be used for numeric prediction as well as classification.  It won’t have escaped notice, however, that the internal weights comprising the neural network are all numeric coefficients.  It follows that all input and output variables must be numeric too (categorical data has to be pivoted into 1 / 0 indicator variables, since unfortunately neuralnet() cannot be relied upon to interpret factors).  In this example, a dataset of transactions where half of the transactions are fraudulent and half genuine, will be used as in Logistic Regression.  Start by importing the FraudRisk dataset:

library("readr")
FraudRisk <- read_csv("D:/Users/Trainer/Desktop/Bundle/Data/FraudRisk/FraudRisk.csv")
1.png

Run the line of script to console:

2.png
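The factor-pivoting caveat mentioned above can be sketched with base R's model.matrix(). The Channel column here is hypothetical; it merely illustrates turning a factor into the 1 / 0 indicator columns that neuralnet() requires:

```r
# A toy frame with one numeric column and one factor column.
transactions <- data.frame(Amount  = c(10, 99, 5),
                           Channel = factor(c("Online", "POS", "Online")))

# Dropping the intercept (~ 0 +) makes each factor level its own 0/1 column.
indicators <- model.matrix(~ 0 + Channel, data = transactions)

# Recombine into an all-numeric frame suitable for neuralnet().
numeric_frame <- cbind(transactions["Amount"], indicators)
```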

Once the FraudRisk data frame has been created, create a neural network of ten independent variables known to have strong correlation to the dependent variable with one hidden layer of four processing elements:

library("neuralnet")
FraudRiskNeuralNetwork <- neuralnet(Dependent ~ Count_Unsafe_Terminals_1_Day + High_Risk_Country + Foreign + Authenticated + Has_Been_Abroad + Transaction_Amt + Different_Country_Transactions_1_Week + Different_Decline_Reasons_1_Day + Count_Transactions_Declined_1_Day + Count_In_Person_1_Day,data = FraudRisk, hidden = 4)
3.png

Run the line of script to console, it may take some time:

4.png

Once the console returns, the Neural Network has been trained upon the FraudRisk dataset.  For the purposes of this procedure it can be taken for granted that plot() would return a topology similar to those seen previously.

4) Training a Deeper Neural Network.

This Blog entry is from the Neural Networks section in Learn R.

A neural network has been trained having only a single hidden layer, albeit with several processing elements.  Deep learning is the notion of having many more hidden layers, and generally many more processing elements.  Each layer is able to achieve abstraction autonomously, finding patterns that may not be apparent in manual abstraction.  However, it is lazy and adds considerable computational expense in recall (which begins to matter in super high throughput environments); as such, deep learning can be circumvented to an extent, given more creativity in the abstraction phase.

In this example, a much deeper neural network will be created where the same ten inputs will be used.  The first hidden layer will have 8 processing elements, the second hidden layer will have 6 processing elements, the third hidden layer will have 4 processing elements yielding an output.

Previously, a single value specifying just the number of processing elements was provided, from which it was inferred that only a single hidden layer is applicable.  In this procedure, it is necessary to construct a vector, with each entry corresponding to a hidden layer and the value of that entry being the number of processing elements for that layer:

NueralNetworkDeep <- neuralnet(Dependent ~ Skew_3 + Max_4 + PointStep_16 + Close_3 + Close_4 + PointStep_17_ZScore + PointStep_15 + TypicalValue_4 + Range_4 + Range_2, data = FDX, hidden = c(8,6,4))
1.png

Run the line of script to console, expect it to take some time:

2.png

Plot the neural network to inspect its topology:

plot(NueralNetworkDeep)
3.png

Run the line of script to console:

4.png

The plot has dramatically increased in complexity. It can be observed that the Neural Network now has three hidden layers, the first having 8 processing elements, the second having 6 processing elements and the third having 4 processing elements:

5.png

Naturally, this complexity is only worthwhile in the event that predictive accuracy has improved.  As such, invoke compute() and extract the results:

ComputedModelDeep <- compute(NueralNetworkDeep,FDX[,c("Skew_3","Max_4","PointStep_16","Close_3","Close_4","PointStep_17_ZScore","PointStep_15","TypicalValue_4","Range_4","Range_2")])
6.png

Extract the predictions to a list, for conversion to a vector later:

FDXPredictionsDeep <- ComputedModelDeep$net.result
7.png

Run the line of script to console:

8.png

Appraise the correlation between the predictions and the dependent variable:

cor(FDXPredictionsDeep,FDX$Dependent, use="complete",method="pearson")
9.png

Run the line of script to console:

10.png

It can be seen that the correlation between predicted and actual has leaped to a staggering 0.91 in response to increasing the complexity of the model.

3) Recalling a Neural Network with compute() and understanding performance.

This Blog entry is from the Neural Networks section in Learn R.

The topology plot gives a useful window into the neural network, and its similarity to a regression model is unmistakable; however, there are none of the performance statistics associated with a regression model.

As this is a numeric prediction model, and not a classification model, we will use correlation to determine the relationship between the dependent variable and the predicted variable.

The compute() function is used instead of the predict() function; it returns an object with a few other properties rather than just the prediction (which would be easier).  Something else to bear in mind is that the recall function, and indeed the training function, is very unforgiving in the event that the dependent variable has been passed (throwing the error "Error in neurons[[i]] %*% weights[[i]] : non-conformable arguments"). Frustratingly, it is necessary to subset the data frame to return all of the input columns explicitly, excluding the dependent variable, before passing it to the compute() function. To recall the computed model:

ComputedModel <- compute(NeuralNetworkFourByOne,FDX[,c("Skew_3","Max_4","PointStep_16","Close_3","Close_4","PointStep_17_ZScore","PointStep_15","TypicalValue_4","Range_4","Range_2")])
1.png

Run the line of script to console, it may take some time:

2.png
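As an aside, the explicit column list above can be avoided by excluding the dependent variable by name instead; a minimal sketch, with a toy frame standing in for FDX:

```r
# A toy frame standing in for FDX: one dependent column plus inputs.
frame <- data.frame(Dependent = c(1, 0),
                    Skew_3    = c(0.1, 0.4),
                    Max_4     = c(2, 3))

# Keep every column except the dependent variable; drop = FALSE preserves
# the data frame shape even if only one input column remains.
inputs <- frame[, setdiff(names(frame), "Dependent"), drop = FALSE]
names(inputs)  # "Skew_3" "Max_4"
```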

Unlike the predict() method, compute() has returned an object.   It is necessary to extract the results from this object, which arrive as a list rather than a vector (unfortunately, although that can be converted later using the unlist() function), via the net.result property:

FDXPredictions <- ComputedModel$net.result
3.png

Run the line of script to console:

4.png

To gain an assessment of the level of performance of the predictions against the actuals, the correlation function can be used:

cor(FDXPredictions,FDX$Dependent, use="complete",method="pearson")
5.png

Run the line of script to console:

6.png

It can be seen, in this example, that a correlation of 0.65 has been achieved.  Referencing the initial correlation matrix calculated on the same dataset in Linear Regression, this is a substantial uplift in performance over the input correlations in isolation.

For completeness, the FDXPredictions list, once converted to a vector, could be merged into the FDX data frame; however, improvement will first be sought in the subsequent procedure by using a more complex neural network with more hidden layers.
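The merge suggested above can be sketched as follows, with toy objects standing in for FDX and the extracted predictions:

```r
# A toy frame standing in for FDX.
frame <- data.frame(Dependent = c(0.2, 0.7, 0.5))

# Predictions in list form, as extracted from the compute() result.
predictions <- list(c(0.25, 0.66, 0.49))

# unlist() flattens the list to a vector, which can then be attached
# to the frame as a new column alongside the actuals.
frame$Predicted <- unlist(predictions)
```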

2) Plotting a Neural Network.

This Blog entry is from the Neural Networks section in Learn R.

It is often stated that a neural network is an unexplainable modelling technique, which holds some truth in practice, but to those with a background in regression modelling, explaining the model is not insurmountable.

The neuralnet object that was created in the previous Blog entry allows for the plotting of the neural network using the base plot() function.  Simply call plot(), passing the neural network object as an argument:

plot(NeuralNetworkFourByOne)
1.png

Run the line of script to console:

2.png

A plot is created of the neural network bearing a stark resemblance to the conceptual models put forward in this training manual; a model of lesser complexity is in fact explainable and quite reproducible on a manual basis:

3.png

As the model becomes more and more complex, with the addition of further features, layers and processing elements, the neural network will naturally become less and less explainable.