This Blog entry is from the Linear Regression section in Learn Palisade.
Linear Regression models are not limited to a single independent variable; they can have many. As a methodology, the goal is to keep adding independent variables and observe the model improving in performance, stopping at the point where the model begins to degrade as further independent variables are added.
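A multiple Linear Regression model of this kind is simply a single constant plus one coefficient per independent variable. As a minimal sketch outside the tool, the following Python snippet fits a two-variable model by least squares on synthetic (made-up) data; the variable names and values are illustrative only, not from the Blog entry:

```python
import numpy as np

# Hypothetical data: y depends on two independent variables plus noise
rng = np.random.default_rng(0)
x1 = rng.normal(size=50)
x2 = rng.normal(size=50)
y = 0.5 + 2.0 * x1 - 1.0 * x2 + rng.normal(scale=0.1, size=50)

# Design matrix: a column of ones gives the single constant (starting point),
# then one column per independent variable
X = np.column_stack([np.ones_like(x1), x1, x2])

# Least-squares fit: coef[0] is the constant, coef[1] and coef[2]
# are the coefficients of x1 and x2 respectively
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
```

Because the data were generated from known values, the fitted constant and coefficients land close to 0.5, 2.0 and -1.0, mirroring the constant-plus-coefficients structure the tool reports.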
Create a one-way Linear Regression model as before, but in this Blog entry also specify the next strongest correlating value discovered, which in this example is Point 100, as a further independent variable for the analysis (i.e. both Kurtosis and Point 100 are selected in the I column):
Clicking OK will produce the same analysis, retaining just the single constant (i.e. starting point) yet calculating a coefficient for each independent variable specified:
Note two values from the first model created: the Multiple R and the P Values:
By adding more Independent Variables as per this Blog entry, the goal is to achieve a higher Multiple R value (similar in this regard to an absolute correlation) while maintaining P Values of less than 5% across the entire model (i.e. values less than 0.05, although this threshold is open to subjective interpretation):
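The Multiple R the tool reports is the square root of R-squared, i.e. the correlation between the fitted and actual values. As a minimal sketch of how it is computed, assuming synthetic (made-up) data and a plain least-squares fit:

```python
import numpy as np

# Hypothetical data with two independent variables
rng = np.random.default_rng(1)
n = 40
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 1.0 + 0.8 * x1 + 0.3 * x2 + rng.normal(scale=0.5, size=n)

# Fit the model: constant plus one coefficient per variable
X = np.column_stack([np.ones(n), x1, x2])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
fitted = X @ coef

# Multiple R = sqrt(1 - SS_residual / SS_total), i.e. sqrt(R-squared)
ss_res = ((y - fitted) ** 2).sum()
ss_tot = ((y - y.mean()) ** 2).sum()
multiple_r = (1 - ss_res / ss_tot) ** 0.5
```

The P Values come from t-tests on each coefficient in the regression output; the rule of thumb in this Blog entry is simply that each should stay below 0.05 as variables are added.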
In this example, an improvement in the Multiple R has been observed, while the P Values are infinitesimally small; thus, procedurally, it could be said that the new model is the champion, having challenged the previous model. Deploying the model is identical to the procedure described in Linear Regression, except the formula is extended as below to include the new independent variable coefficient:
= 0.000284873 + (Kurtosis Value * 0.00017277) + (Point100 Value * 0.039631235)
Where the Kurtosis Value is in cell N2 and the Point100 Value is in cell W2 in this example, set out the formula in the same manner as in Linear Regression:
Commit the formula, fill down and name the column Model_2:
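The deployed formula can be sanity-checked outside the spreadsheet. This minimal sketch hard-codes the constant and the two coefficients from the formula above, with the cell references (N2, W2) replaced by function arguments:

```python
def model_2(kurtosis_value, point100_value):
    """Deployed two-variable model: constant plus one
    coefficient-weighted term per independent variable."""
    return (0.000284873
            + kurtosis_value * 0.00017277
            + point100_value * 0.039631235)
```

With both inputs at zero the function returns the constant alone, which is a quick check that the formula has been transcribed correctly.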
Repeat this Blog entry for the next strongest correlation until the Multiple R ceases to increase or the P Values rise beyond 5% (or an acceptable threshold determined by the analyst). At that point, stop, with a view to improving the model using analyst best judgement only (i.e. adding and removing independent variables that make intuitive sense to garner improvement). Using this Blog entry, it is expected that a model will improve up to around five variables, but it could be many more depending on the complexity (indeed creativity) of the abstraction process.
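The repeat-until-no-improvement procedure above is a form of forward selection. As a sketch, the following Python function greedily adds whichever candidate variable most improves the Multiple R and stops when the improvement falls below a threshold; the function name, threshold value and synthetic data are all illustrative assumptions, not part of the tool:

```python
import numpy as np

def forward_select(y, candidates, names, threshold=0.01):
    """Greedy forward selection: repeatedly add the candidate variable
    that most improves Multiple R; stop once the improvement is below
    `threshold` (a stand-in for the analyst's stopping judgement)."""
    n = len(y)
    chosen, remaining, best_r = [], list(range(len(candidates))), 0.0
    ss_tot = ((y - y.mean()) ** 2).sum()
    while remaining:
        scores = []
        for j in remaining:
            cols = [np.ones(n)] + [candidates[k] for k in chosen + [j]]
            X = np.column_stack(cols)
            coef, *_ = np.linalg.lstsq(X, y, rcond=None)
            ss_res = ((y - X @ coef) ** 2).sum()
            scores.append(((1 - ss_res / ss_tot) ** 0.5, j))
        r_best, j_best = max(scores)
        if r_best - best_r < threshold:
            break  # model has stopped improving meaningfully
        best_r = r_best
        chosen.append(j_best)
        remaining.remove(j_best)
    return [names[k] for k in chosen], best_r

# Hypothetical usage: y depends strongly on A, weakly on B, not on C
rng = np.random.default_rng(2)
x = rng.normal(size=(3, 60))
y = 2.0 * x[0] + 0.5 * x[1] + rng.normal(scale=0.3, size=60)
selected, r = forward_select(y, list(x), ["A", "B", "C"])
```

Note this automates only the mechanical part of the Blog entry; as stated above, the final refinement still rests on analyst best judgement, and p-value checks (omitted here for brevity) would accompany each step in practice.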