
March 25, 2021

Predictive Models Performance Evaluation


Artur Haponik

CEO & Co-Founder


Predictive analytics is one of the most fascinating aspects of our work and of the machine learning discipline as a whole. Predictive models help forecast situations in your company or assess the probability of a given scenario or course of events. However, there is one thing we all have to keep in mind: the performance evaluation of predictive models. A model is only useful when you know how accurate and reliable its predictions are. What do we mean by performance evaluation? In this article, we take a closer look at this subject.

In general, predictive models are based on so-called supervised machine learning techniques. These techniques are primarily:

  • Regression
  • Classification
  • Ensemble methods (which combine several models of the types mentioned above)

First off, let’s take a look at these techniques. A short reminder will help us understand why performance evaluation is critical and how to do it.

Predictive models: Supervised machine learning techniques

As an introduction to this blog post, let’s remind ourselves that supervised machine learning methods are used primarily when you need to predict or explain an outcome from data you possess. Unlike unsupervised techniques, which group and interpret data based exclusively on input data, a supervised algorithm (it doesn’t matter which one) takes a known set of input data and known responses to that data (the output) and trains a model to produce predictions for new inputs.

The most extensively used supervised ML techniques are classification and regression. For instance, supervised ML techniques can be used to assess whether the company will win a specific contract (classification) or to predict how many users will sign up for the newsletter over the next year (regression). There are also ensemble methods, which combine several supervised models, often several regression or several classification models of the same kind, to improve on the accuracy of any single one.

Regression


In our past blog posts, we have mentioned regression several times. This technique is used primarily to predict or explain a specific numeric value based on prior data. Generally speaking, five types of regression are used most often:

  • Simple linear regression
  • Polynomial regression
  • Support vector regression
  • Decision tree regression
  • Random forest regression

These techniques serve different purposes. For instance, decision trees support decision-making and are commonly used in operations research, business intelligence, and strategic planning. With a decision tree, you can assess the probability of a specific event or scenario and devise a strategy to deal with it or to prevent it from happening.

Moreover, you can use regression techniques to predict salary levels, disease spread, property values, and many other things. The key is always the same: you need a set of prior data that serves as the basis for the predictive model.
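To make the idea of learning from prior data concrete, here is a minimal sketch of simple linear regression, the first technique on the list above, fitted by ordinary least squares in plain Python. The salary figures and the function names are made up for illustration; they are not taken from a real dataset or library.

```python
def fit_simple_linear_regression(xs, ys):
    """Least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums (the 1/n factors cancel out).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def predict(model, x):
    """Apply a fitted (intercept, slope) pair to a new input."""
    intercept, slope = model
    return intercept + slope * x


# Hypothetical prior data: years of experience -> salary (in thousands).
years = [1, 2, 3, 4, 5]
salary = [35, 40, 45, 50, 55]

model = fit_simple_linear_regression(years, salary)
predicted = predict(model, 6)
print(predicted)  # 60.0
```

The made-up data lie exactly on a straight line, so the prediction is exact. Real data are noisy, which is precisely why the evaluation techniques discussed later in this article are needed.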

Classification


In short, classification helps predict or explain a class value. Classification techniques help companies estimate the probability that a specific event will occur based on one or more inputs. For instance, classification enables companies to predict whether a given customer will buy a product.

Such a prediction is typically based on the customer’s behavior on the website and historical data regarding their past purchases. Let’s take another example: classification helps assess whether the company will win a contract. In such a situation, the output is typically a number between 0 and 1 that can be read as a probability, where 0 means “no” and 1 means “yes”. A common convention is to treat everything above 0.5 as “yes” and everything else as “no”, although the threshold can be moved depending on the cost of each type of mistake.

What’s characteristic of classification models? The output is assigned to one of two classes (yes, no) or, in multi-class problems, to one of three or more classes. And while classification predicts a discrete class label, the aforementioned regression predicts a quantity.
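The contract example can be sketched in a few lines of Python. The snippet assumes a logistic model, which is only one of many possible classifiers, and the weights, bias, and feature values are hypothetical numbers chosen for illustration rather than the result of real training.

```python
import math


def sigmoid(z):
    """Squash any real number into the (0, 1) range."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_probability(weights, bias, features):
    """Probability of the positive class under a logistic model."""
    z = bias + sum(w * f for w, f in zip(weights, features))
    return sigmoid(z)


def classify(probability, threshold=0.5):
    """Turn a probability into a discrete class label."""
    return "yes" if probability >= threshold else "no"


# Hypothetical, hand-picked parameters (a real model would learn them).
weights = [0.8, -1.2]
bias = -0.5

# Hypothetical bid: [past wins with this client, relative price].
bid = [2.0, 0.5]

p = predict_probability(weights, bias, bid)
label = classify(p)
print(round(p, 3), label)
```

The model itself outputs a quantity between 0 and 1; the discrete label only appears once a threshold is applied, which is why the choice of threshold matters when the model is evaluated.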

And then, we have the ensemble methods, which combine several regression or classification models.


Ensemble methods

The purpose of ensemble methods is to improve on the accuracy of the previously analyzed techniques and obtain higher-quality results. The main idea behind ensemble methods is to reduce the variance and bias that are typical of any single machine learning model.

You see, every single ML model can turn out to be accurate under certain circumstances but inaccurate under others. Because ensemble methods combine the predictions of at least two models, the errors of individual models tend to cancel out, provided those models do not all make the same mistakes. What does it look like in real life? Take the example of random forest, which is a textbook ensemble method. A random forest combines many decision trees (for regression or for classification), each trained on a different random sample of the data. As a result, the random forest is usually more accurate than a single decision tree, mainly because averaging over many trees reduces variance.
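The effect of combining models can be illustrated with a simple majority vote, which is how a random forest typically aggregates its trees for classification. This is a sketch only: the three "models" below are hard-coded prediction lists invented for the example, constructed so that each one errs on different samples.

```python
from collections import Counter


def majority_vote(predictions_per_model):
    """Combine per-model label lists into one label per sample."""
    combined = []
    for sample_votes in zip(*predictions_per_model):
        # most_common(1) returns [(label, count)] for the winning label.
        combined.append(Counter(sample_votes).most_common(1)[0][0])
    return combined


def accuracy(predicted, actual):
    """Share of samples whose predicted label matches the actual one."""
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)


# Hypothetical ground truth and three imperfect models.
actual = [1, 0, 1, 1, 0, 1, 0, 0]
model_a = [1, 0, 1, 0, 0, 1, 0, 1]  # wrong on samples 3 and 7
model_b = [1, 1, 1, 1, 0, 1, 0, 0]  # wrong on sample 1
model_c = [0, 0, 1, 1, 0, 0, 0, 0]  # wrong on samples 0 and 5

ensemble = majority_vote([model_a, model_b, model_c])

for name, preds in [("a", model_a), ("b", model_b), ("c", model_c),
                    ("ensemble", ensemble)]:
    print(name, accuracy(preds, actual))
```

The vote corrects every individual mistake here only because no two models are wrong on the same sample. If the models shared their errors, voting would not help, which is why random forests deliberately train each tree on different data.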


What are predictive models used for?

In today’s business environment, predictive models play a crucial role. They help companies make more informed decisions and analyze various scenarios. We have made a list of industries and sectors that extensively use predictive models:

  • HR: Identifying which job candidates will perform the best on the job, identifying behaviors that lead to high performance, analyzing the effectiveness of managers across the organization.
  • Healthcare: Making quicker, more accurate diagnoses, diagnosing rare diseases, improving diagnostics.
  • Finance and banking: Anomaly detection, fraud and money laundering prevention.
  • Logistics: Route optimization, cost reduction, drivers’ performance evaluation.
  • Customer service: Analyzing and forecasting total lifetime customer value, customer churn prevention.
  • Decision Support Systems: These are digital information systems designed to organize, compile, and present data for decision-makers. Here, predictive models provide managers and CEOs with a range of possible outcomes and their potential consequences for the company. It’s a typically BI-related field.

Of course, the list of industries and sectors that commonly use predictive models is much longer. Similar solutions are used in stock markets, real estate, marketing, software development, production, and many other branches of business.


What is predictive model performance evaluation?

For obvious reasons, predictive models are useful only when they produce accurate, reliable outcomes. And this is what, in short, performance evaluation is all about. There are various evaluation metrics, and the right choice depends on the machine learning technique you use.

They come in handy especially when you are working with supervised ML techniques, because the true outcomes are known and can be compared directly with the model’s predictions.

What you need to know is that there is a fundamental difference between predictive model performance evaluation in regression and in classification. In a few moments, we will show you the most popular evaluation methods for both of these ML techniques.

  • When it comes to regression, you’re dealing with continuous values where it is possible to identify the difference between the actual and predicted output.
  • On the other hand, when you’re evaluating a classification model, you ought to concentrate on the number of predictions that have been classified correctly.


Various predictive model evaluation techniques

Model evaluation is an important step in the creation of a predictive model. It aids in the discovery of the best model that fits the data you have. It also considers how well the selected model will perform in the future. In general, there are two major methods of evaluating predictive models:

  • Hold-out
  • Cross-validation.

Now, we are going to analyze both of these methods.

Hold-out performance evaluation

With the hold-out method, you can work with up to three subsets of data:

  1. A training set (used to build predictive models)
  2. A validation set (used to assess the performance of the model built in the training phase and to tune it)
  3. A test set (used to estimate the future performance of a model)

The main idea is to split your dataset so that part of it is never seen during training. In the simplest version, you split the data into a training set and a test set only, and the test set shows how well your predictive model performs on unseen data. Typically, you use 80% of your data for training and the remaining 20% for testing[1].
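A hold-out split along these lines can be sketched with the Python standard library alone. The function name, the fixed seed, and the toy dataset of 100 integers are assumptions made for the example; in practice each row would be a record with features and a known outcome.

```python
import random


def hold_out_split(rows, test_fraction=0.2, seed=42):
    """Shuffle a copy of the rows and cut off a test set."""
    shuffled = rows[:]                      # leave the caller's list intact
    random.Random(seed).shuffle(shuffled)   # seeded for reproducibility
    n_test = int(round(len(shuffled) * test_fraction))
    test = shuffled[:n_test]
    train = shuffled[n_test:]
    return train, test


# Hypothetical dataset: 100 rows, represented here by their ids.
rows = list(range(100))
train, test = hold_out_split(rows)
print(len(train), len(test))  # 80 20
```

Shuffling before the cut matters when the rows are stored in a meaningful order, for example sorted by date or by class. If a validation set is needed as well, the same function can be applied a second time to the training portion.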



Cross-validation performance evaluation

The cross-validation technique comes in handy when only a limited amount of data is available. Here, you divide the data into k groups (folds). One of the k groups is used as the test set, and the rest are used as the training set: the predictive model is trained on the training set and then scored on the test set. The procedure is repeated k times, so that every group serves as the test set exactly once, and the k scores are averaged.

We could say that cross-validation is frequently the preferred performance evaluation method. That’s because it offers the possibility to train your models on multiple splits, which gives a more thorough insight into how your predictive models will perform in the future (on unseen data).
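The k-fold procedure can be sketched as follows. This is an illustrative implementation rather than a reference one: the helper names are invented, the folds are taken in order without shuffling, and the "model" is a deliberately trivial baseline that always predicts the mean of the training targets.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) for each of k consecutive folds."""
    # Spread any remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size


def cross_validate(xs, ys, k, fit, score):
    """Train on k-1 folds, score on the remaining fold, k times over."""
    scores = []
    for train_idx, test_idx in k_fold_indices(len(xs), k):
        model = fit([xs[i] for i in train_idx], [ys[i] for i in train_idx])
        scores.append(score(model,
                            [xs[i] for i in test_idx],
                            [ys[i] for i in test_idx]))
    return scores


def fit_mean(train_x, train_y):
    """Hypothetical baseline: remember the mean of the training targets."""
    return sum(train_y) / len(train_y)


def mean_absolute_error(model, test_x, test_y):
    """Average absolute gap between the constant prediction and the truth."""
    return sum(abs(y - model) for y in test_y) / len(test_y)


xs = list(range(10))
ys = [2.0 * x for x in xs]

fold_scores = cross_validate(xs, ys, k=5, fit=fit_mean,
                             score=mean_absolute_error)
cv_score = sum(fold_scores) / len(fold_scores)
print(fold_scores, cv_score)
```

Passing `fit` and `score` as functions keeps the splitting logic independent of the model, so the same loop can compare several candidate models on identical folds. With real data, the rows would normally be shuffled before the folds are formed.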

Predictive models: Regression model evaluation techniques

When it comes to regression model evaluation, it’s all about predicting a quantity. Therefore, you can use several metrics to measure your model’s performance:

  • R-squared (R²): A statistical measure of how close the data are to the fitted regression line, also known as the coefficient of determination. R² summarizes the explanatory power of the regression model and is computed from the sums-of-squares terms: one minus the ratio of the residual sum of squares to the total sum of squares. Bear in mind, however, that R-squared does not take into account any biases that might be present in the data.
  • Average error: The mean of the differences between the predicted and the actual values. Positive and negative errors cancel out, so it mainly shows whether the model systematically over- or under-predicts.
  • Average Absolute Error (AAE): Similar to the average error, only here you take the absolute value of each difference, so that positive and negative errors no longer cancel each other out.
  • Mean Square Error (MSE): One of the most common ways of evaluating a regression model. MSE is the mean of the squared prediction errors, so large errors are penalized more heavily than small ones.
  • Root Mean Square Error (RMSE): The square root of MSE. It is expressed in the same units as the predicted quantity and is used to compare models whose errors are measured in the same units.
  • Relative Squared Error (RSE): The total squared error divided by the total squared error of a model that always predicts the mean of the actual values. Because it is unitless, it can be used to compare models whose errors were measured in different units.
  • Mean Absolute Error (MAE): Here, too, we take the difference between the predicted and the actual value (as in MSE), sum over all values, and divide by the number of values. However, we take the absolute value of each difference, not its square. It is the same quantity as the AAE above.
  • Median Absolute Error (MedAE): Here, you take the absolute differences between the predicted and actual values and then consider their median, which makes the metric robust to outliers.
  • Median error: The median of all differences between the predicted and the actual values.
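Most of these metrics are short formulas over the list of prediction errors, as the following sketch shows. It uses only the Python standard library; the function name, the dictionary keys, and the four data points are invented for the example, and the error is taken as predicted minus actual.

```python
import math
import statistics


def regression_metrics(actual, predicted):
    """Compute common regression metrics from paired values."""
    errors = [p - a for a, p in zip(actual, predicted)]
    abs_errors = [abs(e) for e in errors]
    n = len(errors)
    mean_actual = sum(actual) / n
    ss_res = sum(e ** 2 for e in errors)                  # residual sum of squares
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)  # total sum of squares
    mse = ss_res / n
    return {
        "average_error": sum(errors) / n,
        "median_error": statistics.median(errors),
        "mean_absolute_error": sum(abs_errors) / n,
        "median_absolute_error": statistics.median(abs_errors),
        "mse": mse,
        "rmse": math.sqrt(mse),
        "rse": ss_res / ss_tot,
        "r_squared": 1 - ss_res / ss_tot,
    }


# Hypothetical actual values and model predictions.
actual = [3.0, 5.0, 7.0, 9.0]
predicted = [2.0, 5.0, 8.0, 11.0]

metrics = regression_metrics(actual, predicted)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

In this toy case the average error is 0.5 while the mean absolute error is 1.0: the single under-prediction partly cancels the over-predictions in the first metric but not in the second. Note also that, with these definitions, R² is simply one minus the RSE.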


Predictive statistical models: Classification model evaluation techniques

Concerning classification, we try to predict or explain a class value. Therefore, we can use several evaluation techniques:

  • Confidence intervals: They are used to assess how reliable a statistical estimate, such as a model’s measured accuracy, is. Typically, a wide confidence interval means that the estimate is uncertain, for example because it is based on too little data.
  • Confusion matrix: This technique shows the number of correct and incorrect predictions made by the classification model compared to the actual outcomes. The higher the concentration of observations on the diagonal of the confusion matrix, the higher the accuracy of your model.
  • Gain and lift: A measure of the model’s effectiveness, calculated as the ratio between the results obtained with and without the model.
  • Kolmogorov-Smirnov chart (K-S chart): It is based on a non-parametric statistical test that compares two distributions in order to assess how close they are to each other. In other words, the K-S chart measures the degree of separation between the score distributions of the positive and negative classes.
  • Chi-squared test: Like Kolmogorov-Smirnov, it is a statistical hypothesis test; it is valid to perform when the test statistic is chi-squared distributed under the null hypothesis.
  • ROC curve/chart: The ROC chart is similar to the gain and lift charts in that it provides a means of comparison between classification models. It is a graphical plot of the true positive rate against the false positive rate that illustrates the performance of a binary classifier as its discrimination threshold is varied[2].
  • Cross-validation: We have already shown you this method. It consists of splitting your data into test and control (training) data sets, training your algorithm on the control data set, and testing it on the test data set[3].
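Several of these techniques can be derived from the same ingredients: the actual labels and the scores a model assigns. The sketch below builds a confusion matrix at a fixed threshold, traces the ROC curve by sweeping the threshold, and computes the area under that curve and the K-S statistic (the largest gap between the true positive rate and the false positive rate). All function names and the eight labelled scores are assumptions made for illustration.

```python
def confusion_matrix(actual, scores, threshold=0.5):
    """Count true/false positives and negatives at a given threshold."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for y, s in zip(actual, scores):
        predicted = 1 if s >= threshold else 0
        if predicted == 1:
            counts["tp" if y == 1 else "fp"] += 1
        else:
            counts["tn" if y == 0 else "fn"] += 1
    return counts


def roc_points(actual, scores):
    """(FPR, TPR) pairs obtained by using every score as a threshold."""
    positives = sum(actual)
    negatives = len(actual) - positives
    points = [(0.0, 0.0)]  # threshold above every score: nothing flagged
    for threshold in sorted(set(scores), reverse=True):
        cm = confusion_matrix(actual, scores, threshold)
        points.append((cm["fp"] / negatives, cm["tp"] / positives))
    return points


def area_under_curve(points):
    """Trapezoidal area under a curve given as ordered (x, y) points."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


def ks_statistic(points):
    """Largest vertical gap between TPR and FPR along the ROC curve."""
    return max(tpr - fpr for fpr, tpr in points)


# Hypothetical labels (1 = positive) and model scores.
actual = [1, 1, 0, 1, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.55, 0.2]

cm = confusion_matrix(actual, scores)
accuracy = (cm["tp"] + cm["tn"]) / len(actual)
points = roc_points(actual, scores)
auc = area_under_curve(points)
ks = ks_statistic(points)
print(cm, accuracy, auc, ks)
```

The confusion matrix describes the model at one particular threshold, whereas the ROC curve, its area, and the K-S statistic summarize how well the scores separate the two classes across all thresholds. That is why the latter are often preferred for comparing models before a threshold has been chosen.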

As you can see, predictive modeling is quite an extensive field that’s used to support a wide range of companies and organizations. And thanks to properly evaluated predictive models, companies can improve their results and make more informed decisions. If you’d like to find out how predictive analytics can help you with everyday work and development, feel free to contact us.

We are an AI consulting company. We deal with predictive models every day and know how to use them for your company’s good and growth. The Addepto team is at your service!


[1] Eijaz Allibhai. Hold-out vs. Cross-validation in Machine Learning. Oct 3, 2018. URL: Accessed Mar 25, 2021.
[2] Divya Singh. What is Predictive Model Performance Evaluation. Mar 19, 2019. URL: Accessed Mar 25, 2021.
[3] L.V. 11 Important Model Evaluation Techniques Everyone Should Know. February 20, 2016. URL: Accessed Mar 25, 2021.

