This article is the fourth in a series of articles called “Opening the Black Box: How to Assess Machine Learning Models.” The prior columns were What Kind of Problems Can Machine Learning Solve?, Selecting and Preparing Data for Machine Learning Projects, and Understanding and Assessing Machine Learning Algorithms.
Some of the most important data in an organization flows across the CFO’s desktop, so it’s no surprise that machine learning is becoming an important part of the financial function of the firm. Asking the right questions, choosing the best data, and understanding how algorithms predict and classify can help financial executives make better decisions and more effectively communicate with staff.
If you managed to build a machine learning model and find yourself with the right problem to solve and the right data and algorithms to apply, what’s next? Genuinely understanding and properly communicating the model’s accuracy is essential to ensuring that it is effectively deployed within your organization. Senior business leaders will want to develop their own measure of accuracy and will need to spend a lot of time understanding that measure.
To make sure you’re getting the most out of your machine learning model, consider the following questions:
1. How do you understand the accuracy of your model and communicate it to your team?
Different machine learning algorithms have different measures of accuracy built into them. For instance, Random Forest Regressors, a very popular machine learning algorithm, typically rely on either a “Mean Squared Error” or a “Mean Absolute Error” calculation to measure how far the model’s predictions fall from the actual values. If you haven’t come across either of those calculations in practice, or only vaguely remember them from a statistics class years ago, then you are not alone. As a result, even though the model will report scores based on these calculations, the numbers themselves may be difficult for teams to interpret.
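For readers who want to see what those two calculations actually do, here is a minimal sketch in plain Python. The function names and the four sample values are hypothetical and chosen only for illustration; they are not drawn from any real model.

```python
def mean_squared_error(actual, predicted):
    """Average of the squared differences between actual and predicted values."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def mean_absolute_error(actual, predicted):
    """Average of the absolute differences between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical sample data: the prediction errors are -1, 0, 2 and -4.
actual = [10.0, 12.0, 15.0, 11.0]
predicted = [11.0, 12.0, 13.0, 15.0]

mse = mean_squared_error(actual, predicted)   # (1 + 0 + 4 + 16) / 4 = 5.25
mae = mean_absolute_error(actual, predicted)  # (1 + 0 + 2 + 4) / 4 = 1.75

print(f"MSE: {mse}, MAE: {mae}")
```

Because the errors are squared, the single four-point miss accounts for most of the Mean Squared Error, while the Mean Absolute Error treats every point of error equally. Neither number, on its own, tells a team whether the model is good enough for the decision at hand.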
In order to trust your model results, you need to understand the inputs, the test data, and the relationship to the outputs. That means testing out-of-sample data (data that was not included in the training of the model) and developing criteria that define what you’re trying to solve. This extra effort will allow your team to rely on the results.
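One common way to set aside out-of-sample data is a simple holdout split: shuffle the records, train on most of them, and reserve the rest for testing. The sketch below assumes a 20% holdout and a fixed random seed; both are illustrative choices, and the helper name `holdout_split` is hypothetical.

```python
import random


def holdout_split(records, test_fraction=0.2, seed=0):
    """Shuffle records and split them into (training, out_of_sample) lists."""
    shuffled = list(records)               # copy so the caller's data is untouched
    random.Random(seed).shuffle(shuffled)  # fixed seed makes the split repeatable
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


# Hypothetical data set: 100 record identifiers.
records = list(range(100))
training, out_of_sample = holdout_split(records)

print(len(training), len(out_of_sample))
```

The model is fit only on `training`; its predictions on `out_of_sample` are what the team should judge, because the model has never seen those records.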
If the model’s users don’t fully understand its accuracy and limitations, they may have one of two attitudes: they either completely trust the output, as if the tool were doing magic, or they completely mistrust it and won’t rely on the outcomes. There are problems with both attitudes, of course. Those who trust too much and those who don’t trust machine learning enough will both miss out on potentially better accuracy.
As an illustrative example, we can use the proprietary tool our firm created for predicting and classifying corporate credit ratings. We developed the Sample Credit Rating Estimator (SCRE) to determine whether a machine learning model could improve on existing ways to predict credit ratings. We wanted to fully understand our model’s accuracy to see if we could improve our prior linear regression model’s results, so we began an in-depth assessment to determine what measure of accuracy would satisfy our team.
We had a model with relatively high prediction test scores as compared with our linear regression model. The test score was assessing a perfect match on a credit rating, which was not the traditional testing methodology. So, we spent time defining what “accuracy” looked like to us and the marketplace, which was a predicted rating within two “notches” of actual ratings. We then tested out-of-sample data to generate the model’s predictions and used those predictions to assess accuracy by our own measures.
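The difference between an exact-match score and a within-two-notches score can be sketched as follows. The rating scale shown is a commonly used letter-grade scale, assumed here for illustration, and the five sample ratings are hypothetical; none of this reproduces SCRE or its results.

```python
# Assumed letter-grade scale, ordered from strongest to weakest credit.
RATING_SCALE = [
    "AAA", "AA+", "AA", "AA-", "A+", "A", "A-",
    "BBB+", "BBB", "BBB-", "BB+", "BB", "BB-",
    "B+", "B", "B-", "CCC+", "CCC", "CCC-", "CC", "C", "D",
]
NOTCH = {rating: position for position, rating in enumerate(RATING_SCALE)}


def exact_match_accuracy(actual, predicted):
    """Share of predictions that hit the actual rating exactly."""
    hits = sum(a == p for a, p in zip(actual, predicted))
    return hits / len(actual)


def within_notches_accuracy(actual, predicted, tolerance=2):
    """Share of predictions within `tolerance` notches of the actual rating."""
    hits = sum(abs(NOTCH[a] - NOTCH[p]) <= tolerance
               for a, p in zip(actual, predicted))
    return hits / len(actual)


# Hypothetical out-of-sample ratings and model predictions.
actual = ["A", "BBB+", "BB", "AA-", "B+"]
predicted = ["A", "BBB-", "BB+", "A", "BB+"]

exact = exact_match_accuracy(actual, predicted)       # 1 of 5 matches exactly
within_two = within_notches_accuracy(actual, predicted)  # 4 of 5 within two notches

print(f"Exact match: {exact:.0%}, within two notches: {within_two:.0%}")
```

On this made-up sample, the same set of predictions scores 20% on an exact-match test and 80% on a within-two-notches test, which is why the definition of accuracy needs to be settled before the number is shared.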
That blueprint for understanding and defining accuracy will help CFOs who want to more fully grasp their machine learning model’s performance and communicate it to their team.
2. Does your team understand the tool and its purposes?
In the finance function, decisions are often complicated and risky. While it would be nice to have a model that makes these decisions, the reality is that human judgement is still essential to getting things right. As such, machine learning can be most effective when deployed to augment your team’s analyses, not replace them. When training the team, you need to communicate what the model is good at and bad at by genuinely understanding the accuracy of the tool and examining when it may not produce optimal results.
In our SCRE example, had we trained our team by telling them “according to a prediction test, this is over 70% accurate,” then the use of SCRE might have run the risk of doing more harm than good in our analyses. Instead, we developed a measure that made sense to our team, then explored in depth what potential issues the team should look for in order to understand when the output may be flawed. This allowed the tool to be optimized to be more than 90% accurate and used as a supplement to our analyses, not as a replacement for them.
Nothing is perfect, and your machine learning model will be no different. Ensure your team understands this reality and looks for ways to comprehend and test the data to improve accuracy.
3. How quickly can machine learning be embedded in a process?
One of the benefits of using machine learning tools is how easy they are to embed in almost any environment. You don’t need special software or a newly created user web interface. Many tools can be implemented in Microsoft Excel, an app that teams are already familiar with, and accessed easily through a custom function.
Teams who are learning and adapting to a new way of working and performing their analysis have a large enough task as it is. However, many organizations attempt to deploy machine learning models alongside a larger technology rollout. It may be best to resist this urge; working with machine learning outputs for the first time can be uncomfortable, so you may want to consider keeping your team in a familiar environment while they do so.
Chandu Chilakapati is a managing director and Devin Rochford a director with Alvarez & Marsal Valuation Services.