4) Fitting a distribution

This Blog entry is from the Monte Carlo Model section in Learn Palisade.

A triangular distribution is extremely useful for modelling a variety of business scenarios, especially when it is created through elicitation in the absence of data. 

Where data does exist, however, it is more useful to set a distribution that reflects that data. 

@Risk can handle a very wide range of distributions and has a means to perform statistical tests to see which distribution reflects the data most closely.

Start by opening a dataset for which you wish to perform fitting.  In this example the underlying dataset used to create the Regression Model will be used, the GBPUSD with 5M bars available at \Data\FX\GBPUSD with file name GBPUSD_Abstraction_Example_Clean.csv:

1.png

Start by highlighting the column to fit:

J:J

Click the button Distribution Fitting in the Palisade @Risk Ribbon:

Then, on the sub menu, click Fit to open the fitting options:

4.png

Retaining the default options click the Fit button:

5.png

The tool will try to fit the selected data against a series of distributions, before selecting the distribution with the best fit:

6.png

In this example the data appears to be distributed relatively normally about the average, yet the tool has determined that a Weibull distribution is the best fit.  Scores are presented for several other distributions towards the left-hand side of the window, and it may be appropriate to select one of these on a more subjective or intuitive basis:

7.png

In this example, the best-fitting distribution will be written out as an @Risk function which can be used natively in future simulations. 

Click Write to Cell to review the distribution formula:

8.png

It is quite common that the dataset being used for fitting is not the same dataset that the @Risk model has been deployed to.  It follows that it is often easier to copy or make a note of the formula for continued and future use.  As such, highlight the formula in the text box:

9.png

=RiskWeibull(6.8331,0.049509,RiskShift(-0.045066),RiskName("Mean"))

An alternative approach is to click Next, where a target cell can be specified:

10.png

Selecting a cell to write the formula is a less flexible approach in practical use and thus the former method is recommended, keeping a journal of fitted distributions for future reference.
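
For those who wish to cross-check or reuse the fit outside of @Risk, a minimal Python sketch is shown below.  It assumes the fitted column has been exported to a CSV file (the column name here is illustrative) and uses scipy's weibull_min, whose shape, loc and scale parameters correspond to the alpha, RiskShift and beta values in the RiskWeibull formula above:

import pandas as pd
from scipy import stats

# Hypothetical export of the column that was fitted in @Risk
data = pd.read_csv("GBPUSD_Abstraction_Example_Clean.csv")["Mean"].dropna()

# Fit a three-parameter (shifted) Weibull: shape (alpha), loc (shift), scale (beta)
shape, loc, scale = stats.weibull_min.fit(data)
print(f"RiskWeibull({shape:.4f},{scale:.6f},RiskShift({loc:.6f}))")

# A goodness-of-fit statistic, in the spirit of the fit ranking in @Risk
ks_stat, p_value = stats.kstest(data, "weibull_min", args=(shape, loc, scale))
print(f"Kolmogorov-Smirnov statistic: {ks_stat:.4f} (p={p_value:.4f})")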

3) Performing Optimisation and Mitigation on Monte Carlo Simulation Data.

This Blog entry is from the Monte Carlo Model section in Learn Palisade.

To Optimise and Mitigate we will need to review the Simulation Data.  Click on the Excel Reports button, then select Simulation Data (Combined):

1.png

On clicking OK a new spreadsheet will be created which will show each simulation:

Owing to this spreadsheet showing the inputs and the output for each simulation, we are able to filter to identify optimal and/or troublesome scenarios.  In this example the filter will be for all records where the forecast is less than -0.7, being a large move:

3.png

Using Summary Statistics we are now able to identify the scenarios that would cause a large move, which can be compared with other scenarios. 

In the below example we can infer that the average skew for an optimal trade would be -23 vs the more typical value of -8 which would include losing trades.

4.png
5.png

It follows that in terms of optimisation we may determine to focus on trades where the skew is more pronounced.
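
The same filtering and comparison can be sketched in Python with pandas, assuming the Simulation Data (Combined) report has been exported to a CSV and that the forecast and skew columns carry the illustrative names used below:

import pandas as pd

# Hypothetical export of the Simulation Data (Combined) report
sims = pd.read_csv("simulation_data_combined.csv")

# Filter for the optimal / troublesome scenarios: forecast below -0.7
large_moves = sims[sims["Forecast"] < -0.7]

# Compare summary statistics of an input, e.g. skew, against all simulations
print("Average skew, large moves:", large_moves["_4_Skew"].mean())
print("Average skew, all simulations:", sims["_4_Skew"].mean())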

2) Running Monte Carlo Simulations.

This Blog entry is from the Monte Carlo Model section in Learn Palisade.

Set the simulation iterations to a suitably large number, for example 100,000:

1.png

Click Start Simulation to begin the simulation.  The probability density function will start to take shape:

2.png

In the above example we can determine that in all probability the market is going to move downward; in fact it is 90% likely to move downward by between 0 and -0.7. 

The market is extremely unlikely to go up based on the last 8000 5M observations (although this could just as well indicate a correction). 

The shape of the distribution and the lack of normality would be cause for concern and would, in all likelihood, lead us to reconsider promoting this model and to revisit its validity.  It is fine for the purposes of our example, albeit with the caveat that the value of the optimisation and mitigation process is closely linked to the predictive performance of the model in the first instance.
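
The probabilities read from the chart can also be recovered from the raw simulation outputs; a minimal sketch, assuming the simulated forecasts are available as a single-column file with no header:

import numpy as np

# Hypothetical single-column export of the simulated model outputs
forecasts = np.loadtxt("simulated_forecasts.csv")

p_down = np.mean((forecasts > -0.7) & (forecasts < 0))
p_up = np.mean(forecasts > 0)

print(f"P(move between -0.7 and 0): {p_down:.1%}")
print(f"P(upward move): {p_up:.1%}")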

1) Configure a Linear Regression Model and Define Triangular Distributions.

This Blog entry is from the Monte Carlo Model section in Learn Palisade.

In this guide we have implemented a Linear Regression Model in an Excel Spreadsheet across four strongly correlating variables from the training dataset.  We are using intraday GBPUSD with 5M candlesticks whereby we have a chart configured with 700 of the most recent candlesticks. 

The model comprises the following intercept and independent variable coefficients:

Name            Value
Intercept       -0.000810676
_4_Mode         -0.000200686
_4_Median        0.03422778
_4_Skew         -0.000148064
_4_HighPrice    -0.070443585

1.png

The triangular distributions are determined as follows, based on the data:

Field           Max         Min         Most Likely
_4_Mode         0.00191     -0.03038    -0.010
_4_Median       0.023948    -0.01913    -0.00078
_4_Skew         1.542245    -26.2921    -0.10329
_4_HighPrice    0.00191     -0.03038    -0.01086

Although we are using an explainable model, the function could just as easily be a call to a Neural Network DLL (although in the case of the Jube Capital optimisation software this is performed automatically upon Neural Network retraining, as shown in the conclusion):

2.png

Launch @Risk and click on the value cells:

3.png

Click on Define Assumption, then click Triang when the dialog opens:

4.png

Click Select Distribution to be presented with the configuration.  Using the values as described in the above table, enter the Min, Max and Most Likely for each variable:

5.png

Click OK to place the distribution assumption and then repeat for each variable:

6.png

Click on the formula output cell in yellow and click on the Add Output button:

7.png

Specify an appropriate name or retain the default, then click OK.
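
Outside of @Risk, the same simulation can be sketched with numpy, using the intercept, coefficients and triangular parameters from the tables above (numpy's triangular takes the min, most likely and max in that order); this is an illustrative equivalent, not the @Risk mechanism itself:

import numpy as np

rng = np.random.default_rng()
n = 100_000  # iterations, as used when running the simulation

# Triangular distributions: (min, most likely, max) from the table above
mode      = rng.triangular(-0.03038, -0.010,   0.00191,  n)
median    = rng.triangular(-0.01913, -0.00078, 0.023948, n)
skew      = rng.triangular(-26.2921, -0.10329, 1.542245, n)
highprice = rng.triangular(-0.03038, -0.01086, 0.00191,  n)

# Linear regression output: intercept plus coefficient * sampled value
forecast = (-0.000810676
            + (-0.000200686) * mode
            + 0.03422778 * median
            + (-0.000148064) * skew
            + (-0.070443585) * highprice)

print("Mean forecast:", forecast.mean())
print("5th / 95th percentiles:", np.percentile(forecast, [5, 95]))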

2) Output Logistic Regression Model as Probability and set Threshold Function.

This Blog entry is from the Linear Regression section in Learn Palisade.

Logistic regression output is constructed in exactly the same manner as in Linear Regression, insofar as there is a constant representing the starting point, to which the independent variable values multiplied by their coefficients are added; in this example:

= -0.899434346 + (High_Risk_Country * 2.926138785)

1.png

Fill down and name the column Model_1:

2.png

The output ranges from roughly -5 to +5; however, it is not a probability, rather it is the logarithm of the odds (the log-odds):

3.png

The output is substantially more intuitive if converted to a probability, which ranges from 0 to 100 (or 0 to 1 if represented as a pure probability).  The formula to convert the current output to a probability is:

P = exp(Output) / (1 + exp(Output))

The formula above uses the EXP function in Excel.  As when creating a model output, select the last cell in the spreadsheet, in this example AW2:

4.png

The function will reference the output of the first model, which in our example is cell AV2.  Start entering the formula:

=exp(

Select cell AV2 as the model output in its raw state:

5.png

Then complete the formula referencing the output in the same manner:

=exp(AV2) / (1 + exp(AV2))

6.png

Fill down and label the column Model_1_P:

7.png

Unlike the Linear Regression models, which simply give a numeric value as output, classification models, whether creating a score, probability or otherwise, rely on the setting of a threshold as an activation function to declare, in this example, fraud.  In our example, the threshold is an 80% probability of fraud for the prediction to be considered as such.  Once again, the IF function will be brought to bear for the purposes of creating an activation function.

Select the last cell in the spreadsheet, in this example AX2, and begin an IF function referencing the Model_1_P value in cell AW2:

=IF(AW2>

8.png

A probability is expressed between 0 and 1, therefore .8 would represent 80% likelihood.  It follows that the threshold value would be .8, which would complete the IF function:

=IF(AW2>0.8

9.png

Enter the remaining parameters: the value to return if true as 1, then the value to return if false as 0:

=IF(AW2>0.8,1,0

10.png

Complete the formula by closing the parentheses, fill down and name the column Model_1_Is_Fraud.  Accordingly, any example with a value of one would be considered activated:

11.png
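
The probability conversion and activation threshold can also be expressed as a small Python sketch of the formulas used in this Blog entry:

import math

def model_1(high_risk_country: int) -> float:
    """Raw logistic regression output: constant plus coefficient * value."""
    return -0.899434346 + (high_risk_country * 2.926138785)

def model_1_p(output: float) -> float:
    """Convert the raw output to a probability: exp(x) / (1 + exp(x))."""
    return math.exp(output) / (1 + math.exp(output))

def model_1_is_fraud(probability: float, threshold: float = 0.8) -> int:
    """Activation function: 1 if the probability exceeds the threshold, else 0."""
    return 1 if probability > threshold else 0

p = model_1_p(model_1(1))
print(p, model_1_is_fraud(p))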

1) Forward Stepwise Logistic Regression

This Blog entry is from the Linear Regression section in Learn Palisade.

The Blog entry to create a Logistic Regression model is almost identical to that of creating a Linear Regression model, in that default options suffice while the concepts of Dependent and Independent variables are used for the purposes of creating the model, rather than the X and Y specifications that had previously been used in other analyses.

Logistic Regression is available by clicking the Regression and Classification menu on the StatTools ribbon, then clicking Logistic Regression on the sub menu:

1.png

The logistic regression window will open:

2.png

The concept of stepwise Logistic Regression exists in the same manner as it does in Linear Regression and although not explicitly mentioned, this Blog entry assumes that correlation analysis has been performed on all variables and the variable with the strongest correlation is carried forward as the starting independent variable, in this case High_Risk_Country (a pivoted categorical variable):

3.png

The dependent variable in this dataset is titled Dependent and represents the transaction being fraudulent or not:

4.png
5.png

While it is the default option, it is important to ensure that the ‘Include Classification Summary’ option is selected, as this provides an important performance measure for stepwise Logistic Regression.

Clicking OK will produce the Logistic Regression output:

6.png

Stepwise Linear Regression has now become familiar, and the same concepts exist in Logistic Regression.  The performance measures in Logistic Regression differ from those of Linear Regression; P-Values need to be optimised in the same way and ideally should never exceed 5%, while further optimisation relates to the classification accuracy of the logistic regression model, for which improvement should always be sought:

7.png

It follows that the Logistic Regression model should be improved by adding the next strongest correlating variable seeking improvement in the classification accuracy while maintaining good P-Values.
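
The forward stepwise procedure can also be sketched in Python with statsmodels, which reports coefficient P-Values and allows a simple classification accuracy to be computed.  The file and candidate variable names are illustrative; the loop adds candidates in order of correlation strength and retains them only while P-Values stay below 5% and accuracy improves:

import pandas as pd
import statsmodels.api as sm

df = pd.read_csv("fraud_dataset.csv")  # hypothetical export of the dataset
y = df["Dependent"]

# Candidates ordered by absolute correlation strength, strongest first
candidates = ["High_Risk_Country", "Velocity", "Amount"]  # illustrative names
selected, best_accuracy = [], 0.0

for var in candidates:
    trial = selected + [var]
    X = sm.add_constant(df[trial])
    result = sm.Logit(y, X).fit(disp=0)
    accuracy = ((result.predict(X) > 0.5).astype(int) == y).mean()
    # Keep the variable only if all P-Values stay below 5% and accuracy improves
    if (result.pvalues < 0.05).all() and accuracy > best_accuracy:
        selected, best_accuracy = trial, accuracy

print("Selected variables:", selected, "accuracy:", best_accuracy)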

8) Create a Heat Map Correlation Matrix

This Blog entry is from the Linear Regression section in Learn Palisade.

Multicollinearity refers to an Independent Variable that, while having a strong correlation to the Dependent Variable, also has an often unhelpful correlation to another Independent Variable, with that variable also being quite well correlated to the Dependent Variable. 

Multicollinearity can cause several issues, the most significant being the understatement of Independent Variable coefficients that would otherwise make a notable contribution to a model.

We identify Multicollinearity with the help of a Correlation Matrix, which has hitherto been used to identify the relationship between just the Independent Variables and the Dependent Variable. 

Start by creating a correlation matrix using Pearson or Spearman:

1.png

Highlight the body of the correlation table, which is where the conditional formatting rules will be applied:

2.png

The heat map is created by using a Not Between conditional formatting rule.  Click on Conditional Formatting to present the menu:

3.png

Then select New Rule.

Select the option ‘Format only cells that contain’:

4.png

Select the function ‘Not Between’:

5.png

Enter the acceptable range, in this case -0.7 and 0.7:

6.png

Click on the Format button to specify the highlighting for the heat map:

7.png

In this example the highlighting is simple red text, so change the font colour to Red from its default of Automatic:

8.png

Click OK to commit the formatting:

9.png

Then click OK to commit the rule:

10.png

Then click OK to commit all rules together:

11.png

The correlation matrix will now highlight in red those correlations that unacceptably covary:

12.png
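
The same heat-map logic can be sketched in pandas, flagging any pair of variables whose correlation falls outside the acceptable -0.7 to 0.7 range (the file name is illustrative):

import numpy as np
import pandas as pd

df = pd.read_csv("abstracted_dataset.csv")  # hypothetical export of the variables

corr = df.corr(method="pearson", numeric_only=True)  # or method="spearman"

mask = corr.abs() > 0.7
np.fill_diagonal(mask.values, False)  # ignore each variable's correlation with itself
print(corr.where(mask).stack())       # the pairs that unacceptably covary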

7) Create a Stepwise Linear Regression Model.

This Blog entry is from the Linear Regression section in Learn Palisade.

Linear Regression models are not limited to a single independent variable; rather, they can have many.  As a methodology, the goal is to keep adding independent variables and observe the model improving in performance, stopping at the point the model begins to degrade as more independent variables are added.

Create a one-way Linear Regression model as before, yet in this Blog entry also specify the next strongest correlating variable, which in this example is Point 100, as a further independent variable for the analysis (i.e. both Kurtosis and Point 100 are selected in the I column):

1.png
2.png

Clicking OK will produce the same analysis, retaining just the single constant (i.e. starting point) yet calculating a coefficient for each of the two independent variables specified:

3.png

Note two values from the first model created, the Multiple R and the P Values:

4.png

By adding more Independent Variables as per this Blog entry, the goal is to get a higher Multiple R value (similar in this regard to an absolute correlation) while maintaining P Values of less than 5% across the entire model (this would be values less than 0.05, although it is open to some subjective interpretation):

5.png

In this example, an improvement in the Multiple R has been observed, while the P Values look to be infinitesimally small; thus, procedurally, it could be said that the new model is the champion, having challenged the previous model.  Deploying the model is identical to the process described for Linear Regression, except that the formula is extended as below to include the new independent variable coefficient:

= 0.000284873 + (Kurtosis Value * 0.00017277) + (Point100 Value * 0.039631235)

Where Kurtosis Value is in cell N2 and Point100 Value is in cell W2 in this example, set out this formula in the same manner as Linear Regression:

6.png

Commit the formula, fill down and name the column Model_2:

7.png

Repeat this Blog entry for the next strongest correlation until such time that the Multiple R ceases to increase or the P Values increase beyond 5% (or an acceptable threshold as determined by the analyst), at which point stop with a view to improving the model using only analyst best judgement (i.e. adding and removing independent variables that make intuitive sense to garner improvement).  Using this Blog entry, it is expected that a model will improve up to around five variables, but there could be many more depending on the complexity (indeed creativity) of the abstraction process.
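
A minimal Python sketch of the same champion/challenger comparison using statsmodels (the file and column names are illustrative); the Multiple R corresponds to the square root of the reported R-squared:

import numpy as np
import pandas as pd
import statsmodels.api as sm

df = pd.read_csv("abstracted_dataset.csv")  # hypothetical export of the dataset
y = df["Dependent"]

for variables in (["Kurtosis"], ["Kurtosis", "Point100"]):
    X = sm.add_constant(df[variables])
    result = sm.OLS(y, X).fit()
    print(variables,
          "Multiple R:", round(np.sqrt(result.rsquared), 4),
          "max P-Value:", round(result.pvalues.max(), 4))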

6) Deploying a One Way Linear Regression Model in Excel.

This Blog entry is from the Linear Regression section in Learn Palisade.

The one-way linear regression output contains, at this stage, two values to be concerned with for the purpose of deployment: the Constant value and the Coefficient for the single independent variable:

1.png

The Constant represents a static value that will always be the starting point of a linear regression model, whereas the coefficient for the Independent Variable is multiplied by the Independent Variable's prevailing value and added to the constant. 

In this example, the one-way linear regression model will be deployed as a further abstracted variable in the original dataset (a precursor to ensemble modelling, including boosting).  Making note of the two values of interest in the Linear Regression output, navigate back to the dataset and select the cell in the first row immediately after the last variable, in this example cell AD1:

2.png

The formula that will be built in this Blog entry takes the form:

= Constant + (Kurtosis Value * Kurtosis Coefficient)

Put differently, using the values output from this Blog entry's Linear Regression model:

= 0.000244006 + (Kurtosis Value * 0.000147134)

Kurtosis Value, in this example, is the dynamic value contained in each example, which in this case is located in cell N2.  To construct the formula, start by entering the formula in cell AD1:

= 0.000244006 + (

3.png

Then reference the variable Kurtosis in the exemplar, in this example cell N2:

=0.000244006 + (N2

4.png

Multiply the Kurtosis Value by the coefficient specified, indeed calculated, in the Linear Regression model:

=0.000244006 + (N2 * 0.000147134

5.png

Close the parenthesis to complete the formula, fill down, and name the column Model_1:

6.png

The model has now been deployed across all examples in the file.  The scientific notation in the output, the E value, indicates that the forecast changes are very small, hence the many decimal places. 

Scientific notation (the E) can be removed by formatting the column to halt at, for example, the fifth decimal place using the Format Cells functionality of Excel (a procedure outside the scope of explicit documentation in this guide):

7.png
8.png
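
The fill-down step is equivalent to applying the formula to every row; a minimal pandas sketch, assuming the dataset has been exported with a Kurtosis column:

import pandas as pd

df = pd.read_csv("abstracted_dataset.csv")  # hypothetical export of the dataset

# Model_1 = Constant + (Kurtosis Value * Kurtosis Coefficient)
df["Model_1"] = 0.000244006 + (df["Kurtosis"] * 0.000147134)

# Show five decimal places rather than scientific notation
print(df["Model_1"].head().round(5))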

5) Creating a One Way Linear Regression Model.

This Blog entry is from the Linear Regression section in Learn Palisade.

A one-way linear regression model is the first model created in this Blog that has predictive power.  In actuality, a linear regression model was brought to bear in drawing a trend line; however, this Blog entry introduces a more detailed means to create a predictive model via linear regression.

Having ranked the correlation strength of the independent variables vs the dependent variable, note the strongest correlating value which is in this case the variable Kurtosis.  The strongest correlating variable will always be the starting point in the development of a predictive model in this methodology.

For the dataset, in the StatTools ribbon, select the menu Regression and Classification towards the centre right of the ribbon, then click on the sub menu Regression (referring to linear regression):

1.png

The regression window will appear, shifting from the nomenclature of X and Y (referring to the axes on a chart) to Dependent and Independent (D and I respectively):

2.png

The dependent variable for the purpose of these Blog entries is intuitively labelled Dependent in the datasets.  For the variable Dependent, ensure that the D column is selected (only one can be selected):

3.png

The Independent Variable, being the first and strongest correlating value, in this example Kurtosis, will be selected in the I column:

4.png

The default parameters are adequate for the model, thus clicking OK will set about calculating a Linear Regression model.  StatTools will return a linear regression output:

5.png
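
For reference, the same one-way linear regression can be reproduced in Python with statsmodels, which reports the Constant and Coefficient later used for deployment (the file and column names are illustrative):

import pandas as pd
import statsmodels.api as sm

df = pd.read_csv("abstracted_dataset.csv")  # hypothetical export of the dataset

X = sm.add_constant(df[["Kurtosis"]])        # single independent variable
result = sm.OLS(df["Dependent"], X).fit()

print(result.params)   # the Constant and the Kurtosis coefficient
print(result.pvalues)  # P-Values for each term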

4) Adding a Trend Line to a Scatter Plot.

This Blog entry is from the Linear Regression section in Learn Palisade.

The scatter plot example created is visually difficult to inspect; there is just about a relationship depicted, but it is shallow and the direction of the relationship is not immediately obvious. 

Excel can help by adding a trend line which, at this stage at least, will give a general indication of relationship and direction of that relationship.  In fact, this is an implementation of one-way linear regression, covered in more depth in this module.

 Hover over the chart created by StatTools, then right click:

1.png

Click the sub menu item Trend Line:

2.png

A trend line is drawn that shows a very slight relationship; however, it would be reasonable to conclude that the relationship is slim to non-existent in this data.  Financial market data is, however, notoriously difficult to forecast, hence performing this analysis across a wider array of variables and bringing several variables together is required for the purposes of predictive modelling.
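
A trend line of this kind is a one-way linear regression fit; a minimal numpy sketch, assuming the Skew and Dependent columns are available in an exported file:

import numpy as np
import pandas as pd

df = pd.read_csv("abstracted_dataset.csv")  # hypothetical export of the dataset

# Degree-1 polynomial fit: the slope and intercept of the trend line
slope, intercept = np.polyfit(df["Skew"], df["Dependent"], 1)
print("Trend line: y =", slope, "* x +", intercept)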

3) Ranking Correlation by Absolute Strength.

This Blog entry is from the Linear Regression section in Learn Palisade.

Correlation, and its strength, is used predominantly to find the strongest correlating independent variables for the purposes of carrying those variables through to predictive analytics.  The variables with the strongest correlations, in one direction or another, are carried forward to predictive analytics techniques in subsequent Blog entries.

Firstly, select the Dependent variable and select all of the independent variables correlating, to whatever degree, to that Dependent Variable.  In this example it would involve selecting the first two columns in the correlation matrix, where Dependent is across the top, the independent variables are along the left-hand side, and of course the values to the right of the names.  Once selected, right click on this selection, then copy:

1.png

Create a new worksheet by clicking the tab with a plus sign at the base of the workbook currently in focus:

2.png

Once the new worksheet is in focus, select cell A1, right click and paste the values:

3.png

A more succinct comparison between the Dependent and Independent variables is presented, although it is not ordered by the strength of correlation:

4.png

Both -1 and +1 are equally interesting for the purposes of variable selection, and so we cannot simply order by the correlation without first levelling it to an absolute – that is, positive – value.  Start by creating a new column; in this example select cell C1, setting the value to ‘ABS Value’, although the naming is quite unimportant:

5.png

For each of the Independent Variables, apply the Excel ABS Function to the value.  In this example, the first cell of interest is in C3, being the Mean.  In cell C3, start by typing the formula:

=ABS(

6.png

Select the Independent Variable correlation value, which is in cell B3 in this example:

=ABS(B3

7.png

Close the parenthesis to complete the formula, then fill down:

8.png

At this stage, as the Independent Variable correlations have been set to an absolute value, only values close to +1 are deemed interesting. It is helpful to order the absolute correlations from high to low, where the strongest correlations are at the top, the weakest at the bottom. 

Click cell A1 to reset the selection, if anything is selected, then navigate to the Data ribbon and select the Sort icon towards the middle of the ribbon, which will infer the dataset:

9.png

Select the parameters Sort By ‘ABS Value’ then the order ‘Largest to Smallest’:

10.png

On clicking OK, the values are ordered based on their strength of correlation:

11.png

It is extremely likely that those correlations towards the top of the table will be included in predictive modelling, while those correlations towards the bottom of the table will be disregarded. 

It is not, at this stage at least, advisable to make a determination of what variables should or should not be in a predictive analytics model, instead relying on stepwise approaches as covered in subsequent Blog entries.
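
The absolute-value ranking can be sketched in pandas in a few lines, assuming the Dependent column of the correlation matrix has been exported as a two-column file of variable names and correlations:

import pandas as pd

# Hypothetical Series: correlation of each independent variable with Dependent
correlations = pd.read_csv("dependent_correlations.csv", index_col=0).squeeze()

# Order by absolute strength while keeping the original signed values visible
ranked = correlations.reindex(correlations.abs().sort_values(ascending=False).index)
print(ranked)  # strongest correlations, in either direction, at the top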

2) Create a Correlation Matrix using Spearman and Pearson.

This Blog entry is from the Linear Regression section in Learn Palisade.

Correlation is a measure of relationship and the direction of that relationship.  It is a single value that ranges from -1 to +1, signalling the direction and strength of a relationship.  Both -1 and +1 in their extremes are equally interesting.  A correlation matrix takes all the variables together and produces the correlation value between each pair of variables.  This matrix will be the bedrock, and starting point, for many of the techniques used in the following Blog entries.

In the StatTools ribbon, towards the centre of the ribbon, select Summary Statistics then the sub menu Correlation and Covariance:

1.png

The correlation and covariance window will open:

2.png

In this Blog entry perform correlation analysis against all abstracted variables, that is, the Independent and Dependent variables created.  Start by selecting all variables: towards the top left of the variables grid, right click, then click Select All:

3.png

Upon having selected all the variables, deselect the raw variables in the file – this is all variables before the dependent variable in this example:

4.png
5.png

Notice that the Correlation analysis is set to Pearson Linear, while Covariances is also selected.  Covariances are superfluous to this Blog entry and can be deselected:

6.png

Click OK to produce the Correlation matrix.  NOTE THAT THIS MAY TAKE SOME TIME IN EXTREMELY LARGE DATASETS.  The correlation matrix will return after some time processing:

7.png
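
The equivalent matrix can be produced in pandas, where the method argument switches between Pearson and Spearman (the file name is illustrative):

import pandas as pd

df = pd.read_csv("abstracted_dataset.csv")  # hypothetical export of the variables

pearson = df.corr(method="pearson", numeric_only=True)    # linear correlation
spearman = df.corr(method="spearman", numeric_only=True)  # rank correlation
print(pearson["Dependent"].sort_values())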

1) Creating a Scatter Plot to inspect a potential relationship between Dependent and Independent Variable.

This Blog entry is from the Linear Regression section in Learn Palisade.

This Blog thus far has focused on performing analysis of just a single independent variable.  This would be referred to as Univariate Analysis.  Bivariate analysis would refer to bringing two values together to estimate their relationship.  In terms of visualisation of a relationship, this could be achieved by visually inspecting a scatter plot comparing one value to another as dots on a chart with two axes.

The Scatter Plot Analysis is available in the StatTools Ribbon, click the button Summary Graphs towards the centre of the ribbon, then ScatterPlot in the sub menu:

1.png

The ScatterPlot window will open which will ask for parameters for the X and Y:

2.png
3.png

Y is the vertical axis; X is the horizontal axis.  In this example, the Dependent Variable (titled Dependent, being the actual price change observed) is to be on the Y axis, with the Skew (this is the prices tending towards higher or lower) on the X axis.

4.png

At this stage it is not necessary to display the Correlation Coefficient; however, the default, where Display Correlation Coefficient is checked, is acceptable.  Click OK to perform the analysis; a Scatter Plot will be drawn and returned:

5.png

The intention is to visually inspect the scatter plot to see whether the mass of points tends upward or downward with an increase in the Skew.  Furthermore, it shows how the values are clustered together; in some instances distinct groups may emerge which need to be split (i.e. perhaps a different model for one cluster).

In this example, it is visually quite hard to draw any conclusion as the scatter plot is far too dense, having too many dots overlaying each other.  A solution is to take a random sample, to reduce the size of the dataset without losing any significance of meaning.  Creating a scatter plot on the smaller dataset makes the Scatter Plot much easier to understand.

6.png

In the above example, while there are far fewer points on the scatter plot, the overall shape is broadly the same.
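
A minimal matplotlib sketch of the same idea, taking a random sample to thin out an overly dense scatter plot (the file and column names are illustrative):

import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_csv("abstracted_dataset.csv")   # hypothetical export of the dataset
sample = df.sample(n=1000, random_state=1)   # thin the plot without losing the shape

plt.scatter(sample["Skew"], sample["Dependent"], s=5)
plt.xlabel("Skew")
plt.ylabel("Dependent")
plt.show()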

5) Creating Time Series Analysis in StatTools.

This Blog entry is from the Basic Statistics section in Learn Palisade.

Start by opening the spreadsheet containing the dataset loaded into the StatTools realm:

1.png

The Time Series Analysis is available in the StatTools ribbon towards the centre, titled Time Series and Forecasting, thereafter in the sub menu Time Series Graph:

2.png

Clicking on the sub menu item Time Series Graph will open a window to configure a line chart showing the prices, in our example Interval_Close, plotted over time:

3.png

If not already selected, select the value Time Series (with Label) from the drop down towards the top right hand side of the window:

4.png

By selecting a label, the process will use a date order without requiring the dataset to be ordered prior (the example is ordered new to old).

Both the label, being the Interim_Buffer_Date, and the target of the analyses, being the Interval_Close, need to be checked under the columns Lbl and Val respectively:

5.png

Click OK to begin the analysis:

6.png

The analysis is output in the form of a line chart showing the oldest records plotted towards the axis, with the newest records plotted away from the axis.  Overall the analysis is intended to visually cue trend and provide some directional insight to the summary statistics.
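
The same line chart can be sketched with pandas and matplotlib, assuming the label and close columns carry the names used in this Blog entry (the file name is illustrative):

import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_csv("gbpusd_5m.csv", parse_dates=["Interim_Buffer_Date"])
df = df.sort_values("Interim_Buffer_Date")  # plot oldest to newest

plt.plot(df["Interim_Buffer_Date"], df["Interval_Close"])
plt.xlabel("Interim_Buffer_Date")
plt.ylabel("Interval_Close")
plt.show()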

4) Calculating a Z-Score in Excel.

This Blog entry is from the Basic Statistics section in Learn Palisade.

The measure of how many standard deviations from the mean a value lies is calculated via the following formula:

= (Test Value – Mean) / Standard Deviation.

The values Mean and Standard Deviation would be taken from the StatTools output.  The Mean value is contained in cell B9 and the Standard Deviation is contained in cell B11.

Start by selecting any cell in the output, in this case D9, then type the formula:

= (1.0000 – B9) / B11

1.png

The hardcoded value of 1.0000 would be the prevailing price and thus a variable.  It would therefore be possible to replace the value 1.0000 with a cell reference, a technique that will be used extensively in the abstraction process, especially in creating Behavioral Analytics and Abstraction Deviations.
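
The same calculation expressed as a small Python function, with the prevailing price passed in as a variable rather than hardcoded (the mean and standard deviation below are illustrative stand-ins for the values in cells B9 and B11):

def z_score(test_value: float, mean: float, std_dev: float) -> float:
    """Number of standard deviations the test value lies from the mean."""
    return (test_value - mean) / std_dev

# Illustrative values standing in for the StatTools Mean (B9) and Std. Dev. (B11)
print(z_score(1.0000, mean=0.9875, std_dev=0.0042))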

3) Creating Summary Statistics in Palisade StatTools.

This Blog entry is from the Basic Statistics section in Learn Palisade.

On the StatTools ribbon, select the menu item Summary Statistics to the centre left of the ribbon, an action that will yield sub menu items, of which one will be One-Variable Summary:

1.png

Clicking on the sub menu item One-Variable Summary will open a window to specify the summary statistics to be performed and the variable to perform summary statistics upon:

2.png

Computationally, the cost of calculating the summary statistics is immaterial, thus it is suggested that the default options, which specify all, remain selected. 

Select a check box next to the target variable, in this case, Interval_Close:

3.png

Clicking OK will initiate the calculation of the summary statistics for the selected variable:

4.png

For each summary statistic, there is a single numeric value representing the result of that calculation.
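
The equivalent one-variable summary can be sketched with pandas, including the skewness and kurtosis that StatTools also reports (the file name is illustrative):

import pandas as pd

df = pd.read_csv("gbpusd_5m.csv")  # hypothetical export of the dataset

close = df["Interval_Close"]
print(close.describe())            # count, mean, std, min, quartiles, max
print("Skewness:", close.skew())
print("Kurtosis:", close.kurt())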

2) Creating a Histogram in StatTools.

This Blog entry is from the Basic Statistics section in Learn Palisade.

On the StatTools ribbon in Excel, click on the menu item towards the centre left which is titled Summary Graphs.  In the drop down menu there is a sub menu item titled Histogram:

1.png

Clicking on Histogram will open the Histogram configuration window:

2.png

For the purposes of this example create a histogram based on just one variable, in this case Interval_Close, while allowing StatTools to infer the most appropriate number of bins. 

To select the variable, simply check the checkbox next to the variable name:

3.png

Clicking the OK button will create the Histogram in a new Excel workbook:

4.png

Two outputs are created: a tabular representation of the Histogram, which includes the probability density, and a bar chart providing a visualisation of the distribution.
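
A minimal sketch of the same two outputs in Python: a tabular histogram with numpy and a bar-chart style visualisation with matplotlib (the file name is illustrative):

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

close = pd.read_csv("gbpusd_5m.csv")["Interval_Close"]  # hypothetical export

counts, bin_edges = np.histogram(close, bins="auto")    # let numpy infer the bins
print(pd.DataFrame({"bin_start": bin_edges[:-1], "count": counts}))

plt.hist(close, bins=len(counts))
plt.xlabel("Interval_Close")
plt.show()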

1) Configure a Dataset in Palisade StatTools.

This Blog entry is from the Summary Statistics section in Learn Palisade.

To begin using Palisade StatTools, or indeed most Palisade products which rely upon datasets, the dataset needs to be configured.  StatTools does a very good job of inferring the dataset and then loading it into the StatTools realm, making this a trivial procedure.

Click anywhere in the dataset, although in this example, click on cell A1:

1.png

On the StatTools ribbon, click on the icon Dataset Manager towards the left hand side of the ribbon:

2.png

StatTools will infer the dataset boundaries, asking for confirmation:

3.png

Click Yes, an action that will load the Dataset Manager window based on the inferred Dataset:

4.png

The default, inferred, dataset is almost always acceptable.  Notice that the Apply Cell Formatting check box is set.  Cell formatting is a useful feature to determine if a Dataset has been loaded into the StatTools realm after completing the process.

Click OK based on the default, inferred, settings:

5.png

When the formatting of the cells changes colour, with the grid being emphasised and the header text having a blue background, the Dataset can be taken to exist in the StatTools realm and is eligible for further analysis.

All subsequent StatTools Blog entries rely upon the Dataset being loaded into StatTools in the above manner.

Introduction to Monte Carlo Model

This Blog entry is from the Monte Carlo Model section in Learn Palisade.

Monte Carlo Simulation is a technique to create many random simulations based upon a random case (i.e. a transaction).  The random values can be forced to obey certain statistical assumptions, which in this example will be a triangular distribution.  Monte Carlo simulation is an enormous topic in its own right, yet these procedures are intended to give just a basic overview of the tool and to allow for the simulation of models created in these procedures.

Simulation for Communication refers to being able to run models based on explainable statistical assumptions so as to facilitate expectation setting for the model's impact.  Furthermore, given that millions of random simulations will be exposed to the model, where records of both the randomly generated inputs and the output are retained, Monte Carlo simulation can help identify scenarios where there is potential for optimisation or risk mitigation.