5) Create a Naive Bayesian Network with a Laplace Estimator.

This Blog entry is from the Naive Bayesian section in Learn R.

To create a Bayesian model with a nominal Laplace estimator of 1, which means that where a value has never been observed for a class it is treated as having occurred at least once rather than being given a probability of zero, simply change the laplace parameter value in the training:

SafeBayesianModel <- naiveBayes(CreditRisk, CreditRisk$Dependent, laplace = 1)

Run the line of script to console.

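As an optional check, the components of the fitted model can be inspected to see where the estimator has taken effect; the two lines below are a minimal sketch that print the class priors and the conditional table for the first predictor, whichever column of CreditRisk that happens to be:

SafeBayesianModel$apriori        # class distribution of the dependent variable
SafeBayesianModel$tables[[1]]    # conditional table for the first predictor

For a categorical predictor, no level should now carry a probability of exactly zero; for a numeric predictor the table holds a mean and standard deviation per class and the Laplace estimator has no effect.
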
A Bayesian model has been created as SafeBayesianModel. Recall the model:

ClassPredictions <- predict(SafeBayesianModel, CreditRisk, type = "class")

Run the line of script to console.

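As a quick sense check on the recall, the spread of the predicted classes can be tabulated with base R; the line below is purely illustrative:

table(ClassPredictions)    # frequency of each predicted class
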
The de facto method of appraising the performance of the model is to create a confusion matrix:

library(gmodels)
CrossTable(CreditRisk$Dependent, ClassPredictions)

Run the block of script to console.

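The headline figure that the confusion matrix summarises can also be computed directly as overall accuracy; a one-line sketch:

mean(ClassPredictions == CreditRisk$Dependent)    # proportion of correct classifications
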
It can be seen that this naive Bayesian model appears to be startlingly accurate, which stands to reason as the same data is being used for testing as was used for training. It follows that this would benefit from an element of cross validation, which was introduced in Gradient Boosting Machines.
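
As a minimal sketch of holding data back, assuming the CreditRisk data frame can simply be split by row (the 75/25 split, the seed and the object names are illustrative choices, not the cross validation procedure described in Gradient Boosting Machines):

set.seed(123)
TestRows <- sample(nrow(CreditRisk), round(nrow(CreditRisk) * 0.25))
HoldOutModel <- naiveBayes(CreditRisk[-TestRows, ], CreditRisk$Dependent[-TestRows], laplace = 1)
HoldOutPredictions <- predict(HoldOutModel, CreditRisk[TestRows, ], type = "class")
mean(HoldOutPredictions == CreditRisk$Dependent[TestRows])    # accuracy on the held back rows

The accuracy on the held back rows would be expected to come out somewhat lower than the figure obtained when testing on the training data.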

3) Recalling a Naive Bayesian Classifier for P.

This Blog entry is from the Naive Bayesian section in Learn R.

One of the benefits of using a Bayesian classifier is that it can return probabilities (P) which, ideally, should be fairly well calibrated to the actual environment.  For example, suppose that a weather station produces a 30% P of rain on each of 100 days; if it were to rain on 30 of those days, the model would be considered to be well calibrated.  It follows that quite often it is not just the classification that is of interest, but the probability of that classification being accurate.

The familiar predict() function is available for use, passing the BayesianModel object, the data frame to use in the recall and a type argument equal to "raw", which instructs the function to return P and not the most likely classification:

PPredictions <- predict(BayesianModel, CreditRisk, type = "raw")

Run the line of script to console.

A peek at the data in the PPredictions output can be obtained via the head() function:

head(PPredictions)

Run the line of script to console.

Horizontally the P will sum to one for each observation, clearly evidencing the most dominant class. Anecdotally, the calibration of P in naive Bayesian models can be somewhat disappointing, while the overarching classification can be surprisingly accurate.
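
Both observations can be checked directly; the sketch below assumes the second column of PPredictions is the class of interest, and the 10% probability bands are an illustrative choice:

head(rowSums(PPredictions))    # each row of P sums to one
# Rough calibration check: the observed rate of the second class within each
# 10% band of predicted P should sit close to the band itself
PBand <- cut(PPredictions[, 2], breaks = seq(0, 1, by = 0.1), include.lowest = TRUE)
tapply(CreditRisk$Dependent == colnames(PPredictions)[2], PBand, mean)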