After you have created a dataset report, the next step is to create a predictive model, which will then be imported into MicroStrategy. You can create a predictive model in one of the following two ways:
Using third-party applications
There are many companies that offer data mining applications and workbenches. These tools are specialized for the data mining workflow and are usually rich in data mining features and options. MicroStrategy Data Mining Services integrates seamlessly with these applications in the following ways:
You can use one of the following two options to provide the MicroStrategy dataset to third-party applications:
Using MicroStrategy
While there are many sophisticated data mining algorithms, some data mining techniques are fairly simple and do not require specialized tools. The MicroStrategy TrainRegression function can be used to create predictive models using the regression data mining technique. This technique should be familiar to you if you have ever tried to extrapolate or interpolate data, or tried to find the line that best fits a series of data points, or used Microsoft Excel’s LINEST or LOGEST functions.
Regression analyzes the relationship between several predictive inputs, or independent variables, and a dependent variable that is to be predicted. It does this by finding the line that best fits the data, with a minimum of error.
For example, you have a dataset with just two variables, X and Y, which are plotted as in the following chart:
Using the Regression technique, it is relatively simple to find the straight line that best fits this data (see the following chart). The line is represented by a linear equation in the classic y = mx + b format, where m is the slope and b is the y-intercept.
Alternatively, you can also fit an exponential line through this data (see the following chart). This line has an equation in the y = b * m^x format.
So, how can you tell which line has the better fit? There are many statistics used in the Regression technique. One basic statistic is an indicator of the “goodness-of-fit,” meaning how well the line fits the relationship among the variables. It is also called the Coefficient of Determination, whose symbol is R². The higher the R², the better the fit. In the charts, the linear predictor has R² = 0.7177 and the exponential predictor has R² = 0.7459; therefore, the exponential predictor is statistically the better fit.
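The comparison above can be reproduced outside MicroStrategy with a few lines of Python. This is a minimal sketch, not the TrainRegression implementation: the data points are made up for illustration (the values behind the charts are not given here), and the function names are hypothetical. It fits both y = mx + b and y = b * m^x by least squares and computes R² for each on the original y scale. Some tools report R² for the exponential fit on the log-transformed values instead, so numbers can differ between tools.

```python
import math


def fit_line(xs, ys):
    """Ordinary least squares for y = m*x + b; returns (m, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    m = sxy / sxx
    b = mean_y - m * mean_x
    return m, b


def fit_exponential(xs, ys):
    """Fit y = b * m**x by fitting a line to ln(y); every y must be > 0."""
    slope, intercept = fit_line(xs, [math.log(y) for y in ys])
    return math.exp(slope), math.exp(intercept)  # (m, b)


def r_squared(ys, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, predicted))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# Made-up sample data (an assumption, not the data behind the charts).
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [2.1, 2.9, 4.2, 5.8, 8.5, 11.9, 17.0, 24.1]

m, b = fit_line(xs, ys)
r2_linear = r_squared(ys, [m * x + b for x in xs])

em, eb = fit_exponential(xs, ys)
r2_exponential = r_squared(ys, [eb * em ** x for x in xs])

print(f"linear:      y = {m:.4f}x + {b:.4f}, R2 = {r2_linear:.4f}")
print(f"exponential: y = {eb:.4f} * {em:.4f}^x, R2 = {r2_exponential:.4f}")
```

For this sample, which grows by a roughly constant factor at each step, the exponential fit has the higher R², so it would be the model to keep.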
With just one independent variable, this example is considered a “univariable regression” model. In reality, the Regression technique can work with any number of independent variables, but always with only one dependent variable. While “multivariable regression” models are not as easy to visualize as the univariable model, the technique still generates statistics that let you determine the goodness-of-fit.
The MicroStrategy TrainRegression function supports both types of Regression technique. Both the linear and the exponential Regression predictors can be created easily, with just one change to set the type of model.
Example: Predicting quarterly online sales
First, let us create a metric using the TrainRegression function. As you can see from the following, the Train Regression metric has a set of metrics as its inputs, including the On-line Sales metric we are trying to predict (the dependent variable) and several metrics that describe each quarter (the independent variables). The order of these inputs in the metric’s expression does not matter.
Next, set the parameters for the TrainRegression metric for it to work properly. Highlight the TrainRegression portion of the expression, right-click it and view the TrainRegression parameters, shown as follows.
The TrainRegression Parameters have the following meanings:
Multiple Linear Regression (MLR): If this type of regression is specified, the function attempts to calculate the coefficients of a straight line that best fits the input data. The calculated formula has the following format: y = b0 + b1*x1 + b2*x2 + … + bn*xn, where y is the target value, and x1 … xn are the independent variables. The MLR technique finds the b0 … bn values that best fit the data. To use MLR, set RegressionType = 0.
Multiple Exponential Regression (MER): If this type of regression is specified, the function attempts to calculate the coefficients of an exponential curve that best fits the input data. This is accomplished by calculating the natural log (ln) of the target values and then performing the same calculations used for MLR. Once the straight-line coefficients are calculated, MER takes the natural exponential of each coefficient, which results in the coefficients for the formula of an exponential curve. The calculated formula has the following format: y = b0 * (b1^x1) * (b2^x2) * … * (bn^xn), where y is the target value, and x1 … xn are the independent variables. The MER technique finds the b0 … bn values that best fit the data. To use MER, set RegressionType = 1.
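The two regression types described above can be sketched in plain Python. This is an illustration of the technique under stated assumptions, not MicroStrategy's implementation: the function names (solve, fit_mlr, fit_mer, train) are hypothetical, the coefficients are found by solving the least-squares normal equations, and the regression_type argument only mirrors the 0 and 1 values of the RegressionType parameter. The sample data is generated from known coefficients so the result can be checked.

```python
import math


def solve(a, b):
    """Solve the linear system a * x = b by Gaussian elimination with pivoting."""
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        known = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - known) / aug[i][i]
    return x


def fit_mlr(rows, ys):
    """Least-squares b0..bn for y = b0 + b1*x1 + ... + bn*xn."""
    design = [[1.0] + list(row) for row in rows]  # leading 1.0 carries b0
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * y for d, y in zip(design, ys)) for i in range(k)]
    return solve(xtx, xty)


def fit_mer(rows, ys):
    """b0..bn for y = b0 * b1**x1 * ... * bn**xn; every y must be > 0."""
    log_coefficients = fit_mlr(rows, [math.log(y) for y in ys])
    return [math.exp(c) for c in log_coefficients]


def train(rows, ys, regression_type):
    """Hypothetical wrapper: 0 selects MLR, 1 selects MER."""
    if regression_type == 0:
        return fit_mlr(rows, ys)
    if regression_type == 1:
        return fit_mer(rows, ys)
    raise ValueError("regression_type must be 0 (MLR) or 1 (MER)")


# Made-up inputs: two independent variables per row.
rows = [(1, 2), (2, 1), (3, 4), (4, 3), (5, 6), (6, 5)]
ys_linear = [10 + 2 * x1 + 3 * x2 for x1, x2 in rows]
ys_exponential = [5 * 1.5 ** x1 * 0.8 ** x2 for x1, x2 in rows]

mlr_coefficients = train(rows, ys_linear, 0)
mer_coefficients = train(rows, ys_exponential, 1)
print("MLR b0..b2:", [round(c, 6) for c in mlr_coefficients])
print("MER b0..b2:", [round(c, 6) for c in mer_coefficients])
```

Because the sample targets are exact, the fits recover the generating coefficients (10, 2, 3 for MLR and 5, 1.5, 0.8 for MER) up to floating-point error. Real data contains noise, so the coefficients would then be the best-fit values rather than exact ones.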
Once you have created a metric with one regression type, it is easy to make another metric of the other type by simply copying the metric and changing the RegressionType parameter. You should also change the ModelFileName and ModelName parameters so that the copy does not conflict with the original training metric.
To use the training metric, add it to your dataset report and run the report. When the report execution finishes, the training metric’s column will contain the results of the Regression model, which is saved to the location specified by the ModelFileName parameter.
To create a predictive model using MicroStrategy
It is recommended that reports containing training metrics have report caching disabled. This ensures that the PMML model is always generated. If the report is in the report cache, MicroStrategy does not need to execute the training metric, since the results have already been cached, and therefore the PMML model file is not generated. To disable report caching for a training report, open the report with the Report Editor and select Report Caching Options from the Data menu. This opens the Report Caching Options dialog box, which allows you to disable report caching for the report.
When the report has completed execution, a file containing the generated PMML model can be found at the location specified by the training metric. This model can now be imported into a MicroStrategy project, which is the subject of the next section.
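Before importing the file, it can be useful to check that it contains the coefficients you expect. The following is a minimal sketch using Python's standard XML parser. It assumes the file contains standard PMML RegressionTable and NumericPredictor elements; the embedded sample is a trimmed, hand-written stand-in (not schema-complete and not actual MicroStrategy output), and the model and field names in it are made up. To inspect a real file, read its text from the location given by ModelFileName and pass it to the same function.

```python
import xml.etree.ElementTree as ET

# Hand-written, trimmed sample for illustration only (an assumption);
# a real PMML file also carries a data dictionary and mining schema.
SAMPLE_PMML = """
<PMML version="3.0" xmlns="http://www.dmg.org/PMML-3_0">
  <RegressionModel modelName="SampleModel" functionName="regression">
    <RegressionTable intercept="10.0">
      <NumericPredictor name="X1" coefficient="2.0"/>
      <NumericPredictor name="X2" coefficient="3.0"/>
    </RegressionTable>
  </RegressionModel>
</PMML>
"""


def local_name(tag):
    """Strip the '{namespace}' prefix so the code works across PMML versions."""
    return tag.rsplit("}", 1)[-1]


def summarize_regression(pmml_text):
    """Return a list of (intercept, {predictor name: coefficient}) tuples."""
    root = ET.fromstring(pmml_text.strip())
    summaries = []
    for element in root.iter():
        if not isinstance(element.tag, str):
            continue  # skip any non-element nodes
        if local_name(element.tag) != "RegressionTable":
            continue
        coefficients = {
            child.get("name"): float(child.get("coefficient"))
            for child in element
            if local_name(child.tag) == "NumericPredictor"
        }
        summaries.append((float(element.get("intercept")), coefficients))
    return summaries


summary = summarize_regression(SAMPLE_PMML)
for intercept, coefficients in summary:
    print("intercept:", intercept)
    for name, value in coefficients.items():
        print(f"  {name}: {value}")
```

Matching on the local element name rather than a fixed namespace is a deliberate choice here, since the PMML version written by a given MicroStrategy release is not stated in this section.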