
Page 1: Machine Learning and AI for the sciences – Towards

1/20 Müller, Montavon, Samek | ICML Workshop XAI

Machine Learning and AI for the sciences – Towards Understanding

Part 2: Theory and Extensions

Klaus-Robert Müller, Grégoire Montavon, Wojciech Samek

ICML 2021 Workshop on XAI

Page 2: Machine Learning and AI for the sciences – Towards

Why A Theory of XAI?

1. Better understand explanation methods: their strengths and weaknesses, the underlying assumptions they make, and the ways they connect.

2. A good theoretical understanding can be a starting point for new developments (e.g. higher-order explanations).

Page 3: Machine Learning and AI for the sciences – Towards

Example: Can LRP Be Justified Theoretically?

Page 4: Machine Learning and AI for the sciences – Towards

Deep Taylor Decomposition (DTD)

Montavon et al. Explaining nonlinear classification decisions with deep Taylor decomposition, Pattern Recognition 2017

Page 5: Machine Learning and AI for the sciences – Towards

Deep Taylor Decomposition (DTD)

Taylor expansion:
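The expansion itself did not survive the transcript. As a sketch reconstructed from the cited DTD paper, the first-order Taylor expansion of the function f at a root point x̃ (chosen such that f(x̃) = 0) reads:

```latex
f(x) = \underbrace{f(\tilde{x})}_{=\,0}
     + \sum_i \underbrace{(x_i - \tilde{x}_i)\,
       \frac{\partial f}{\partial x_i}\Big|_{x=\tilde{x}}}_{R_i}
     + \text{higher-order terms}
```

The first-order terms R_i serve as relevance scores; DTD applies this expansion layer by layer rather than to the whole network at once.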

Montavon et al. Explaining nonlinear classification decisions with deep Taylor decomposition, Pattern Recognition 2017

Page 6: Machine Learning and AI for the sciences – Towards

Deep Taylor Decomposition (DTD)

Taylor expansion:

Montavon et al. Explaining nonlinear classification decisions with deep Taylor decomposition, Pattern Recognition 2017

LRP
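As a sketch of the connection this slide draws (reconstructed from the cited paper, not from the transcript): performing the Taylor expansion at a suitably chosen root point in a ReLU layer yields a propagation rule of the LRP family, e.g. the z⁺-rule:

```latex
R_j = \sum_k \frac{a_j\, w_{jk}^{+}}{\sum_{j'} a_{j'}\, w_{j'k}^{+}} \, R_k
```

where a_j are the activations of the lower layer, w⁺ denotes the positive part of the weights, and R_k are the relevances of the layer above.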

Page 7: Machine Learning and AI for the sciences – Towards

Two Views on Propagation

DTD view LRP view
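The LRP view can be made concrete in a few lines of code. The sketch below is not from the slides: the network, its weights, and the helper `lrp_linear` are all illustrative. It runs an LRP-0/ϵ backward pass through a small bias-free ReLU network and checks that relevance is conserved from layer to layer.

```python
import numpy as np

# Hypothetical two-layer bias-free ReLU network (weights are arbitrary).
rng = np.random.default_rng(0)
W1 = rng.normal(size=(4, 6))
W2 = rng.normal(size=(6, 1))
x = rng.normal(size=4)

a = np.maximum(0.0, x @ W1)    # hidden ReLU activations
y = (a @ W2).item()            # scalar prediction to be explained

def lrp_linear(a_in, W, R_out, eps=1e-9):
    """LRP-0/epsilon rule: R_j = a_j * sum_k w_jk * R_k / z_k."""
    z = a_in @ W                               # pre-activations z_k
    z = z + eps * np.where(z >= 0, 1.0, -1.0)  # stabilizer
    return a_in * (W @ (R_out / z))

R_hidden = lrp_linear(a, W2, np.array([y]))  # output layer -> hidden layer
R_input = lrp_linear(x, W1, R_hidden)        # hidden layer -> input layer
```

Up to the stabilizer, `R_input.sum()` and `R_hidden.sum()` both recover the prediction `y`: relevance is conserved as it is propagated down, which is the defining property of the LRP view.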

Montavon et al. Explaining nonlinear classification decisions with deep Taylor decomposition, Pattern Recognition 2017

Page 9: Machine Learning and AI for the sciences – Towards

Understanding Methods' Strengths and Weaknesses

Question: What makes VGG-16 predict the image below to be a viaduct?

[Figure: input image passed through XAI, with heatmaps produced by the LRP-0 and composite LRP-0/γ/ϵ rules]

Montavon et al. Gradient-Based vs. Propagation-Based Explanations: An Axiomatic Comparison, in Explainable AI, Springer LNCS 2019

Page 10: Machine Learning and AI for the sciences – Towards

Understanding Methods' Strengths and Weaknesses

Question: What makes VGG-16 predict the image below to be a viaduct?

[Figure: input image passed through XAI, with heatmaps produced by the LRP-0 and composite LRP-0/γ/ϵ rules]

General statement?

Montavon et al. Gradient-Based vs. Propagation-Based Explanations: An Axiomatic Comparison, in Explainable AI, Springer LNCS 2019

Page 11: Machine Learning and AI for the sciences – Towards

Understanding Methods' Strengths and Weaknesses

Montavon et al. Gradient-Based vs. Propagation-Based Explanations: An Axiomatic Comparison, in Explainable AI, Springer LNCS 2019

Page 12: Machine Learning and AI for the sciences – Towards

Understanding Methods' Strengths and Weaknesses

LRP is structurally similar to gradient propagation, but with some modifications. → LRP propagation can be seen as a smoothed/biased form of gradient propagation.
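This observation can be checked numerically. The sketch below uses a hypothetical bias-free two-layer ReLU network (not from the slides): LRP-0 reproduces gradient × input exactly on such a network, while LRP-γ keeps the same backward structure but biases the weights via ρ(w) = w + γw⁺ while still conserving relevance.

```python
import numpy as np

# Hypothetical bias-free two-layer ReLU network (weights are arbitrary).
rng = np.random.default_rng(1)
W1 = rng.normal(size=(4, 6))
W2 = rng.normal(size=(6, 1))
x = rng.normal(size=4)

a = np.maximum(0.0, x @ W1)    # hidden ReLU activations
y = (a @ W2).item()            # scalar prediction

def lrp(a_in, W, R_out, gamma=0.0, eps=1e-9):
    """Generalized LRP rule with rho(w) = w + gamma * w^+."""
    Wr = W + gamma * np.clip(W, 0.0, None)
    z = a_in @ Wr
    z = z + eps * np.where(z >= 0, 1.0, -1.0)  # stabilizer
    return a_in * (Wr @ (R_out / z))

# LRP-0 backward pass (gamma = 0).
R0 = lrp(x, W1, lrp(a, W2, np.array([y])))

# Gradient x input for the same prediction.
grad = W1 @ ((a > 0) * W2.ravel())
gxi = x * grad

# LRP-gamma: same propagation structure, modified weights.
Rg = lrp(x, W1, lrp(a, W2, np.array([y]), gamma=0.25), gamma=0.25)
```

On this bias-free ReLU network, `R0` matches `gxi` up to the stabilizer, and `Rg.sum()` still recovers `y`: the γ-modification biases the propagation without breaking conservation.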

Montavon et al. Gradient-Based vs. Propagation-Based Explanations: An Axiomatic Comparison, in Explainable AI, Springer LNCS 2019

Page 13: Machine Learning and AI for the sciences – Towards

Explaining Deep Similarity Models

Similarity prediction is typically implemented with a product operation in feature space. With this product structure, the similarity score becomes (locally) bilinear in its inputs.
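In symbols, writing φ for the feature map of the similarity model, the product structure described above is:

```latex
y(x, x') = \langle \phi(x), \phi(x') \rangle = \sum_k \phi_k(x)\, \phi_k(x')
```

Fixing one argument makes y locally linear in the other, hence the (local) bilinearity.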

Eberle et al. Building and Interpreting Deep Similarity Models, IEEE TPAMI 2020

Page 14: Machine Learning and AI for the sciences – Towards

Second-Order Explanations

Idea: Use the DTD framework to propagate second-order terms of the Taylor expansions at each layer instead of the first-order terms.

Eberle et al. Building and Interpreting Deep Similarity Models, IEEE TPAMI 2020

Page 15: Machine Learning and AI for the sciences – Towards

Second-Order DTD Cast into an LRP Algorithm

Second-order DTD can be cast into multiple LRP computations, followed by an outer product computation. The algorithm is called BiLRP.
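As an illustrative sketch of this structure (assuming, for simplicity, a plain linear feature map φ(x) = Wx rather than the deep models used in the paper), BiLRP reduces to one LRP map per feature dimension, combined by outer products:

```python
import numpy as np

# Hypothetical linear feature map phi(x) = W x (weights are arbitrary).
rng = np.random.default_rng(2)
W = rng.normal(size=(5, 3))
x1 = rng.normal(size=3)
x2 = rng.normal(size=3)

# Dot-product similarity in feature space.
y = float((W @ x1) @ (W @ x2))

# BiLRP: one LRP computation per feature dimension k, combined by an
# outer product and summed -- relevance of *pairs* of input features.
R = np.zeros((3, 3))
for k in range(W.shape[0]):
    lrp_1 = W[k] * x1    # relevance of x1's inputs for feature phi_k
    lrp_2 = W[k] * x2    # relevance of x2's inputs for feature phi_k
    R += np.outer(lrp_1, lrp_2)
```

The second-order relevance map `R` sums back to the similarity score `y`, the second-order analogue of LRP's conservation property.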

Eberle et al. Building and Interpreting Deep Similarity Models, IEEE TPAMI 2020

Page 16: Machine Learning and AI for the sciences – Towards

Examples of Second-Order BiLRP Explanations

BiLRP explanations identify relevant pairs of pixels/patches rather than individual pixels/patches.

Eberle et al. Building and Interpreting Deep Similarity Models, IEEE TPAMI 2020

Page 17: Machine Learning and AI for the sciences – Towards

Explaining GNN Predictions

Graph classification can be achieved using Graph Neural Networks (GNNs). In such architectures, the input graph enters the computation repeatedly at each layer, and the prediction becomes (locally) polynomial in the input graph.

Schnake et al. Higher-Order Explanations of Graph Neural Networks via Relevant Walks, CoRR 2020

Page 18: Machine Learning and AI for the sciences – Towards

Higher-Order Explanations with DTD

Idea: Use the DTD framework to propagate higher-order terms of the Taylor expansions at each layer instead of the first-order terms.

Schnake et al. Higher-Order Explanations of Graph Neural Networks via Relevant Walks, CoRR 2020

Page 19: Machine Learning and AI for the sciences – Towards

Higher-Order DTD Cast into an LRP Algorithm

Higher-order DTD can be cast into multiple LRP computations, one per walk in the input graph. The resulting algorithm is called GNN-LRP.
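For a toy linear GNN (no nonlinearities, so the walk decomposition is exact; the graph and weights below are illustrative, and real GNN-LRP handles nonlinear layers with LRP rules along each walk), the prediction splits over length-2 walks i → j → k, which is exactly the structure GNN-LRP extracts:

```python
import numpy as np

# Toy graph: adjacency matrix with self-loops, random node features.
rng = np.random.default_rng(3)
n, d = 3, 2
A = np.array([[1., 1., 0.],
              [1., 1., 1.],
              [0., 1., 1.]])
X = rng.normal(size=(n, d))
W1 = rng.normal(size=(d, d))
W2 = rng.normal(size=(d, 1))

# Two-layer linear GNN with sum readout.
H1 = A @ X @ W1
f = float((A @ H1 @ W2).sum())

# For a linear GNN the prediction decomposes exactly over walks
# (i -> j -> k): R[i,j,k] = A[k,j] * A[j,i] * (X[i] @ W1 @ W2).
R = np.zeros((n, n, n))
for i in range(n):
    for j in range(n):
        for k in range(n):
            R[i, j, k] = A[k, j] * A[j, i] * (X[i] @ W1 @ W2).item()
```

Summing the relevance over all walks recovers the prediction `f`; GNN-LRP then visualizes the most relevant walks on the input graph.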

Schnake et al. Higher-Order Explanations of Graph Neural Networks via Relevant Walks, CoRR 2020

Page 20: Machine Learning and AI for the sciences – Towards

Example of a GNN-LRP Explanation

Schnake et al. Higher-Order Explanations of Graph Neural Networks via Relevant Walks, CoRR 2020

The GNN-LRP explanation accounts for the prediction in terms of relevant walks in the input graph.