xspliner: An R Package to Build Explainable Surrogate ML Models

[This article was first published on python – Appsilon Data Science | End-to-End Data Science Solutions, and kindly contributed to python-bloggers.]
This talk was presented virtually at eRum 2020 by Appsilon engineer Krystian Igras. Here is a direct link to the video.

Why Should We Explain Black Box ML Models?

The vast majority of state-of-the-art ML algorithms are black boxes: their inner workings are difficult to understand. The more these algorithms are used as decision support systems in everyday life, the more important it becomes to understand the decision rules behind them. There are many reasons for this, including regulatory requirements and the need to confirm that the model has learned sensible features. For instance, a particular ML algorithm might discriminate against a minority group for an arbitrary reason, and this sort of problem is difficult to catch if your model is a black box. I have created an R package, xspliner, that helps build explainable surrogate models to better understand black box ML algorithms.

Figure: Marginal response and PDP curves

One of the most promising methods to explain black box ML models is to build an explainable surrogate model. This can be achieved by inferring Partial Dependence Plot (PDP) curves from the black box model and building Generalized Linear Models based on these curves. The advantage of this approach is that it is model agnostic, which means you can use it regardless of what methods you used to create your model.
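
To make the idea of a PDP curve concrete, here is a minimal sketch in Python with NumPy. It is an illustration of the general technique, not xspliner's API: the function names `partial_dependence` and `black_box` are hypothetical, and the "black box" is a simple stand-in function rather than a trained model. For each grid value, the chosen feature is overwritten in every row and the predictions are averaged.

```python
import numpy as np

def partial_dependence(predict, X, feature, grid):
    """Average prediction over the data with one feature forced to each grid value."""
    X = np.asarray(X, dtype=float)
    curve = []
    for value in grid:
        X_mod = X.copy()
        X_mod[:, feature] = value          # overwrite the feature in every row
        curve.append(predict(X_mod).mean())
    return np.array(curve)

# Hypothetical stand-in for a black box: any function mapping rows to predictions.
def black_box(X):
    return X[:, 0] ** 2 + 3.0 * X[:, 1]

rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(200, 2))
grid = np.linspace(-1.0, 1.0, 5)

# PDP curve of feature 0: one averaged prediction per grid value.
pdp = partial_dependence(black_box, X, feature=0, grid=grid)
```

Because `partial_dependence` only needs a `predict` function, the same code works for any model, which is what "model agnostic" means here.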

Figure: Construction of Generalized Linear Model with spline-based approximated PDP transformations

In this presentation, you will learn what PDP curves and GLMs are and how you can calculate them based on black box models. I’ll also show you a custom visualization of how PDP curves are constructed. We will then take a look at a credit-scoring use case in which we take a GBM model as the black box and build an explainable GLM as its surrogate. Finally, the new model is used to create a user-friendly credit scoring tool that also gives the creditor a detailed report explaining the final decision on whether to grant credit. Want to use xspliner? It is available on CRAN!
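
The black box to PDP to GLM pipeline can be sketched roughly as follows. This is a simplified illustration in Python with NumPy, not the xspliner implementation and not the credit-scoring model from the talk. It makes several assumptions: `black_box` is a hypothetical stand-in for a trained model's predict function, linear interpolation of the PDP curve replaces the spline approximation, and the GLM is the simplest case (Gaussian family, identity link), fitted by least squares.

```python
import numpy as np

def black_box(X):
    # Hypothetical stand-in for a trained black box (e.g. a GBM's predict function).
    return np.sin(3.0 * X[:, 0]) + X[:, 1] ** 2

def partial_dependence(predict, X, feature, grid):
    """Average prediction with one feature forced to each grid value."""
    curve = np.empty(len(grid))
    for i, value in enumerate(grid):
        X_mod = X.copy()
        X_mod[:, feature] = value
        curve[i] = predict(X_mod).mean()
    return curve

rng = np.random.default_rng(42)
X = rng.uniform(-1.0, 1.0, size=(500, 2))
y_black_box = black_box(X)
n_features = X.shape[1]

# 1. One PDP curve per feature, on a grid spanning the observed range.
grids = [np.linspace(X[:, j].min(), X[:, j].max(), 50) for j in range(n_features)]
curves = [partial_dependence(black_box, X, j, grids[j]) for j in range(n_features)]

# 2. Transform each feature through its PDP curve
#    (linear interpolation here, in place of a fitted spline).
Z = np.column_stack(
    [np.interp(X[:, j], grids[j], curves[j]) for j in range(n_features)]
)

# 3. Fit a linear model on the transformed features
#    (a Gaussian GLM with identity link reduces to least squares).
design = np.column_stack([np.ones(len(Z)), Z])
coef, *_ = np.linalg.lstsq(design, y_black_box, rcond=None)
surrogate_pred = design @ coef

# How closely the surrogate reproduces the black box predictions (R^2).
ss_res = np.sum((y_black_box - surrogate_pred) ** 2)
ss_tot = np.sum((y_black_box - y_black_box.mean()) ** 2)
r_squared = 1.0 - ss_res / ss_tot
```

The surrogate stays explainable because each feature enters through a single one-dimensional curve and a single coefficient in `coef`. For a yes/no decision such as granting credit, a binomial GLM (logistic regression) would be the natural replacement for step 3.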
