14) Creating a Factor from a Vector.

This Blog entry is from the Data Structures section in Learn R.

The factor() function turns a Vector of character values into a Factor, a special structure for categorical variables.  Categorical variables are treated differently in data analysis because, conceptually, each category is pivoted into a column in its own right.
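As a rough illustration of what "pivoted to columns" means, the sketch below expands a factor into one indicator column per category using the base R function model.matrix().  The Colour vector and the Pivoted name are hypothetical and used only for this example; this shows one common way the pivoting can be done, not necessarily how any particular tool does it internally.

```r
# Hypothetical example data (not part of the original entry)
Colour <- factor(c("Red", "Blue", "Red", "Green"))

# model.matrix() expands the factor into indicator (0/1) columns.
# The "- 1" in the formula drops the intercept so that every
# level of the factor receives a column of its own.
Pivoted <- model.matrix(~ Colour - 1)

# One row per observation, one column per level
Pivoted
```

Each row contains a single 1, in the column belonging to that observation's category.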

Assume that a Vector of customer genders exists:

Gender <- c("Male","Female","Female","Male")

a-script-for-creating-genders-in-r.png

Run the line of script to console:

a-vector-of-genders-written-to-r-console.png

A standard vector has been created.  To transform this Vector into a Factor, simply pass the Gender Vector as an argument to the factor() function by typing:

GenderFactor <- factor(Gender)
a-script-in-r-to-turn-a-vector-into-a-factor.png

Run the line of script to console:

a-vector-being-turned-into-a-factor-written-to-r-console.png

It can be observed that the Factor is now available in the environment pane:

a-factor-being-displayed-in-the-rstudio-environment-window.png
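The same check can be made from the console rather than the environment pane.  The sketch below repeats the two assignments so that it runs on its own, then uses the base R functions class() and is.factor() to confirm that the original Vector is still a character vector while the new object is a Factor.

```r
# Recreate the objects from this entry so the block is self-contained
Gender <- c("Male", "Female", "Female", "Male")
GenderFactor <- factor(Gender)

# The original vector is unchanged: it is still a character vector
class(Gender)

# The new object has class "factor"
class(GenderFactor)

# is.factor() returns TRUE for a factor and FALSE otherwise
is.factor(GenderFactor)
is.factor(Gender)
```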

To view the factor in the console type:

GenderFactor
a-script-to-write-out-the-gender-factor-in-r.png

Run the line of script to console:

the-gender-factor-being-written-to-r-console.png
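The information printed for the factor can also be retrieved piece by piece.  The sketch below uses the base R functions levels(), nlevels(), as.integer() and table().  The last step, where the level order is supplied explicitly, is an optional variation added for illustration, and the name GenderExplicit is hypothetical.

```r
# Recreate the objects from this entry so the block is self-contained
Gender <- c("Male", "Female", "Female", "Male")
GenderFactor <- factor(Gender)

# The distinct categories, sorted alphabetically by default
levels(GenderFactor)      # "Female" "Male"

# The number of distinct categories
nlevels(GenderFactor)     # 2

# Internally each value is stored as an index into the levels
as.integer(GenderFactor)  # 2 1 1 2

# Count of observations per level
table(GenderFactor)

# Assumption for illustration: the level order can be set explicitly
# instead of being inferred, using the levels argument
GenderExplicit <- factor(Gender, levels = c("Male", "Female"))
levels(GenderExplicit)    # "Male" "Female"
```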

Closer inspection shows that although the Vector contains the strings Male and Female repeated, the Factor has correctly identified just two levels, Female and Male (levels are sorted alphabetically by default).  This is an example of the levels being inferred from the data.  Categorical data will not be treated natively by the predictive analytics tools, as follows.