Stockholm University

Research project: Improvement of Machine Learning Interpretability Algorithms

Machine learning has been developed and adopted in different areas where both accuracy and interpretability are required. However, most high-performance models lack interpretability, so additional algorithms are needed to make these complex models interpretable.

Machine learning should be able to provide both high task performance and trust to humans. Photo: Pavel Danilyuk/Pexels.

Machine learning models are increasingly used in real-life applications where potentially high-stakes decisions may be made based on their outputs.

It is therefore important for these models to be explainable, i.e., able to explain why a given outcome was produced and to suggest actionable paths, either to correct the model itself or to plan procedures based on its predictions and explanations.

Different research perspectives have been adopted. Some researchers aim to properly define interpretability itself, since it is an open concept closely related to explainability. Others focus on algorithms that provide trust in the models. In this regard, many metrics have been studied for measuring and optimizing interpretability, with these metrics serving as proxies for building trust among users and developers.

The aim of this thesis project is to contribute to the improvement of machine learning interpretability techniques and their application in different industrial areas.

This is Alejandro Kuratomi Hernández's PhD thesis project.
Tony Lindgren is the supervisor and Panagiotis Papapetrou is the co-supervisor.

Project description

The following are ongoing research themes in the project:
1.    Counterfactual Explanations with Justification: A counterfactual explanation describes how an instance should change so that a given model changes its prediction from an undesired to a desired output. Current algorithms do not always preserve faithfulness to the observations in the dataset. Justification is a property that can increase the faithfulness of counterfactual explanations; it has not been researched thoroughly for mixed-feature datasets and is one of the project's intended contributions (a minimal sketch of the counterfactual idea appears after this list).

2.    Local interpretable models, such as LIME (Local Interpretable Model-agnostic Explanations), rely on a set of synthetic data points that do not necessarily follow the distribution of the observations in the dataset. These points are generated randomly, with little regard for anything other than weights based on their distance to the local area of interest. Because the generation of these synthetic points is paramount for the success of the local surrogate models that explain the black-box model in the region of interest, the project aims to propose alternative methods for synthetic instance generation (see the LIME-style sketch after this list).

3.    Inherently interpretable models, i.e., models that do not require additional algorithms for explainability, are increasingly desired. Deep neural networks (DNNs), the basic structure behind many highly complex and accurate models, may be able to provide such explainability in the original feature space of the dataset by extracting information from the weights in their inner layers (a toy illustration follows the list).

4.    The application of interpretable and accurate machine learning models in industry (a research definition is currently being developed for a roller bearing friction coefficient prediction project).
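
The following minimal sketch illustrates the basic counterfactual idea from theme 1 under a simplifying assumption: the search is restricted to observed data points, which is one crude way to keep the explanation faithful to the dataset. All names here (nearest_counterfactual, desired_class, the toy data) are illustrative and not part of the project's method.

import numpy as np
from sklearn.ensemble import RandomForestClassifier

def nearest_counterfactual(model, x, X_candidates, desired_class):
    # Return the candidate closest to x (Euclidean distance) that the
    # model assigns to the desired class, or None if no candidate does.
    preds = model.predict(X_candidates)
    mask = preds == desired_class
    if not mask.any():
        return None
    candidates = X_candidates[mask]
    distances = np.linalg.norm(candidates - x, axis=1)
    return candidates[np.argmin(distances)]

# Toy usage with synthetic data (illustrative only).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
y = (X[:, 0] + X[:, 1] > 0).astype(int)
model = RandomForestClassifier(random_state=0).fit(X, y)

x = X[0]
if model.predict(x.reshape(1, -1))[0] == 0:        # undesired outcome
    print(nearest_counterfactual(model, x, X, desired_class=1))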
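
Theme 2 concerns how LIME builds the synthetic neighbourhood it explains. A simplified sketch of that mechanism is shown below: points are sampled from an isotropic Gaussian around the instance of interest, weighted by their distance to it, and a weighted linear surrogate is fitted. The Gaussian sampling step is exactly what the project questions, since it ignores the data distribution; the function name, parameters, and default values are illustrative assumptions, not LIME's exact implementation.

import numpy as np
from sklearn.linear_model import Ridge

def explain_locally(black_box_predict, x, n_samples=1000, scale=0.5, kernel_width=1.0, seed=0):
    # Sample synthetic points around x, weight them by an exponential
    # kernel on their distance to x, and fit a weighted linear surrogate
    # whose coefficients act as a local explanation.
    rng = np.random.default_rng(seed)
    Z = x + rng.normal(scale=scale, size=(n_samples, x.shape[0]))
    y_z = black_box_predict(Z)                     # black-box outputs on the synthetic points
    dists = np.linalg.norm(Z - x, axis=1)
    weights = np.exp(-(dists ** 2) / kernel_width ** 2)
    surrogate = Ridge(alpha=1.0).fit(Z, y_z, sample_weight=weights)
    return surrogate.coef_                         # local feature importances

# Usage, assuming the model and X from the previous sketch:
# coef = explain_locally(lambda Z: model.predict_proba(Z)[:, 1], X[0])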
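
Finally, as a toy illustration of theme 3, the snippet below assumes a purely linear network with no activation functions: in that special case the chained weight matrices collapse into a single input-space weight vector, so the inner-layer weights directly yield an explanation in the original feature space. With non-linear activations this no longer holds exactly, which is what makes the research question non-trivial. The layer sizes are arbitrary.

import numpy as np

rng = np.random.default_rng(0)
W1 = rng.normal(size=(4, 8))    # input -> hidden layer 1
W2 = rng.normal(size=(8, 8))    # hidden layer 1 -> hidden layer 2
W3 = rng.normal(size=(8, 1))    # hidden layer 2 -> output

# With no non-linearities the composition is a single linear map,
# giving one coefficient per input feature in the original space.
w_effective = W1 @ W2 @ W3      # shape (4, 1)
print("input-space weights:", w_effective.ravel())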

Project members

Project managers

Tony Lindgren

Unit head SAS

Department of Computer and Systems Sciences

Panagiotis Papapetrou

Professor, deputy head of department

Department of Computer and Systems Sciences

Members

Alejandro Kuratomi Hernandez

Research assistant

Department of Computer and Systems Sciences

Publications