14) Creating a Factor from a Vector.

This Blog entry is from the Data Structures section in Learn R.

The factor() function turns a Vector of character values into a special structure for categorical variables.  Categorical variables are treated differently in data analysis because, conceptually, each category is pivoted into a column in its own right.
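The idea of a category being pivoted into its own column can be sketched with the base R function model.matrix().  The data frame Customers and the object Dummy are hypothetical names used only for this illustration, and this is one possible way of showing the idea rather than a description of what any particular modelling tool does internally:

```r
# Hypothetical customer data, used only for illustration.
Customers <- data.frame(Gender = factor(c("Male", "Female", "Female", "Male")))

# model.matrix() expands the factor into one indicator column per level.
# The "- 1" drops the intercept so that every level gets its own column.
Dummy <- model.matrix(~ Gender - 1, data = Customers)

# Print the resulting matrix: one column for Female, one for Male.
Dummy
```

Each row holds a 1 in the column for that customer's gender and a 0 in the other column.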

Assume that a Vector of customer genders exists:

Gender <- c("Male","Female","Female","Male")


Run the line of script to console:


A standard vector has been created.  To transform this Vector into a Factor, simply pass the Gender Vector as an argument to the factor() function by typing:

GenderFactor <- factor(Gender)

Run the line of script to console:


It can be observed that the Factor is now available in the environment pane:


To view the factor in the console, type:

GenderFactor

Run the line of script to console:


Closer inspection shows that although the vector contains the strings Male and Female several times over, the Factor has correctly identified just two levels, Male and Female.  This is an example of the levels being inferred.  Categorical data held in a Factor will now be treated natively by the predictive analytics tools that follow.
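The inferred levels can also be inspected directly from the console.  The sketch below rebuilds the Gender vector so that it runs on its own; ExplicitFactor is a hypothetical name, added only to contrast inferred levels with levels that are stated explicitly:

```r
Gender <- c("Male", "Female", "Female", "Male")
GenderFactor <- factor(Gender)

# Levels are inferred from the unique values and sorted alphabetically.
levels(GenderFactor)       # "Female" "Male"
nlevels(GenderFactor)      # 2

# Internally each entry is stored as an integer code pointing at a level.
as.integer(GenderFactor)   # 2 1 1 2

# Count of entries per level.
table(GenderFactor)

# Levels can also be stated explicitly rather than inferred (illustrative only).
ExplicitFactor <- factor(Gender, levels = c("Male", "Female"))
levels(ExplicitFactor)     # "Male" "Female"
```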

13) Selecting from a Matrix.

As a matrix is made up of vectors, it is logical to expect selection from a matrix to resemble selection from a vector.  Each dimension is subscripted separately, with the subscripts given as separate arguments inside the [] square brackets.  The first argument inside the square brackets relates to the row, the next to the column.

To obtain the value in a given position of a matrix, in this case two down, three across, type:


Run the line of script to console:


It can be seen that the value 2 has been returned, which corresponds to the position specified.
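As a minimal sketch of row and column subscripting, the example below builds its own small matrix.  ExampleMatrix is a hypothetical name and is not the matrix used in this Blog entry, so the value it returns differs from the one described above:

```r
# Hypothetical 3 x 4 matrix; matrix() fills column by column by default.
ExampleMatrix <- matrix(1:12, nrow = 3, ncol = 4)

# Row 2, column 3 (two down, three across).
ExampleMatrix[2, 3]   # 8

# Leaving a subscript empty selects everything in that dimension.
ExampleMatrix[2, ]    # whole of row 2: 2 5 8 11
ExampleMatrix[, 3]    # whole of column 3: 7 8 9
```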


2) Perform Vector Arithmetic.

A variety of arithmetic operators can be used against vectors such as:

· + Addition
· - Subtraction
· * Multiplication
· / Division
· ^ Power
· %% Modulus (remainder after division)

In this example, a numeric Vector will be multiplied by 2.  Start by creating a Vector, type:

Multiply <- c(1,2,3,4,5)

Run the line of script to console:


To multiply the vector by 2, type:

Multiply * 2

Run the line of script to console to write out the new vector:


It can be observed that each position in the vector has been multiplied by the value of 2.  It is also possible to multiply by another vector.  Create another vector by typing:

MultiplyBy <- c(5,4,3,2,1)

Then multiply the existing vector Multiply by the new vector MultiplyBy by typing:

Multiply * MultiplyBy

Run the line of script to console:


It can be observed that for each position in the vector, the value in that position has been multiplied by the same position in the other vector.  Think of this as the equivalent of filling down in an Excel spreadsheet.
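The remaining operators behave in the same position-by-position way.  The sketch below reuses the two vectors from this entry and applies each operator in turn, with the expected results shown as comments:

```r
Multiply <- c(1, 2, 3, 4, 5)
MultiplyBy <- c(5, 4, 3, 2, 1)

# A single value is applied to every position.
Multiply * 2            # 2 4 6 8 10
Multiply ^ 2            # 1 4 9 16 25
Multiply %% 2           # 1 0 1 0 1

# Two vectors of equal length are combined position by position.
Multiply * MultiplyBy   # 5 8 9 8 5
Multiply + MultiplyBy   # 6 6 6 6 6
Multiply - MultiplyBy   # -4 -2 0 2 4
Multiply / MultiplyBy   # 0.2 0.5 1.0 2.0 5.0
```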

1) Create a Vector with c Function.

The c function is used to combine values into a vector.  To create a numeric Vector, start by typing:

NumericVector <- c(1,2,3,4,5)

Run the line of script to console:


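The vector appears in the environment pane, described with an index range such as num [1:5].  This indicates a numeric vector with positions 1 to 5, not 1 row and five columns; a plain vector has no row or column dimensions.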
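The same information can be checked from the console rather than the environment pane.  A minimal sketch using base R functions only:

```r
NumericVector <- c(1, 2, 3, 4, 5)

length(NumericVector)   # 5
class(NumericVector)    # "numeric"

# A plain vector has no dimensions attribute, so dim() returns NULL.
dim(NumericVector)

# str() prints a compact summary: type, index range and values.
str(NumericVector)
```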


The vector can be referenced in the console, as with all other variables, by typing:

NumericVector

Run the line of script to the console:


To observe how R handles a vector made up of different types (in so far as it cannot hold mixed types in a single vector), start by typing:

Mixed <- c(1,2,3,4,"string")

Run the script to console:


It can be seen that the vector has been created and is displayed in the environment pane.  However, it has been created as a character vector: the character argument cannot be coerced to a numeric value, so the numbers are coerced to characters instead and the entire vector becomes a character vector.  To validate this in the console, type:

Mixed

Run the line of script to console:


It can be validated that the vector has been created as a character vector, as every entry is displayed in double quotation marks.
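The coercion can also be confirmed without relying on the quotation marks.  In the sketch below, Converted is a hypothetical name used only to show what happens when the character vector is converted back to numbers:

```r
Mixed <- c(1, 2, 3, 4, "string")

# The whole vector has been coerced to the character type.
class(Mixed)   # "character"
Mixed          # "1" "2" "3" "4" "string"

# Converting back to numeric turns the non-numeric entry into NA.
# R raises a warning here, which is suppressed for the illustration.
Converted <- suppressWarnings(as.numeric(Mixed))
Converted      # 1 2 3 4 NA
```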

Introduction to Data Structures

Although R seems intimidating at first, apparently demanding programming skills, this impression is misleading: most use cases in complex predictive analytics can in fact be distilled into simple procedures, indeed into Blog entries.  R certainly need not be viewed as a programming language.

There are, however, certain basic principles that need to be understood.  These were covered in the first section, and this section sets out to reinforce them.

In this section, the Data Structures available in R will be explored.  The exercises require a new script to have been opened in RStudio.