Explainable AI

Explainable AI methods fall into a few broad categories: post-hoc, intrinsic, and distillation.

The post-hoc paradigm usually provides a heat map highlighting the regions most important to the decision (e.g. [31, 30]). The heat map is computed in addition to the forward path of the model. The intrinsic paradigm looks for the important pieces of information within the forward path of the model itself, e.g., as attention maps.
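As a rough illustration of the post-hoc idea, the sketch below builds a heat map by occlusion: each pixel is masked in turn and the drop in the model's score is recorded. Occlusion is only one possible post-hoc technique and is not necessarily what [31, 30] use. `toy_model`, `occlusion_heatmap`, and the 4x4 input are hypothetical names and data chosen for this example.

```python
def toy_model(image):
    """Hypothetical stand-in for a trained model: mean brightness of the centre 2x2 patch."""
    return sum(image[r][c] for r in (1, 2) for c in (1, 2)) / 4.0


def occlusion_heatmap(model, image, baseline=0.0):
    """Score drop caused by masking each pixel; a larger drop means a more important pixel."""
    base_score = model(image)
    heatmap = []
    for r, row in enumerate(image):
        heat_row = []
        for c, _ in enumerate(row):
            occluded = [list(x) for x in image]  # copy so the input is left unchanged
            occluded[r][c] = baseline            # mask a single pixel
            heat_row.append(base_score - model(occluded))
        heatmap.append(heat_row)
    return heatmap


image = [[1.0] * 4 for _ in range(4)]  # assumed all-white 4x4 input
heatmap = occlusion_heatmap(toy_model, image)
for heat_row in heatmap:
    print(heat_row)
```

Only the four centre pixels get a non-zero value, which matches what the toy model actually looks at. The explanation is computed outside the model: the model is queried repeatedly but never modified.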

  1. Post-hoc uses the forward path of the model to calculate (usually) a heat map that highlights important regions in an image.

  2. Intrinsic looks at attention maps, exploring the important pieces of information within the forward path of the model.

  3. Distillation tries to rebuild the neural network into a transparent model.
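
The distillation idea in item 3 can be sketched in miniature: a transparent "student" is fitted to reproduce the predictions of an opaque "teacher". Here the teacher is a hypothetical one-dimensional function rather than a real neural network, and the student is a single threshold rule. All names (`black_box`, `fit_stump`) and the sampled data are assumptions made for illustration only.

```python
import random


def black_box(x):
    """Hypothetical opaque teacher: predicts 1 when a nonlinear score is positive."""
    return 1 if (x ** 3 + 0.5 * x - 0.2) > 0 else 0


def fit_stump(xs, labels):
    """Fit the transparent student: the rule 'predict 1 if x > threshold'."""
    pairs = sorted(zip(xs, labels))
    # Candidate thresholds: below all points, between neighbours, above all points.
    candidates = [pairs[0][0] - 1.0]
    candidates += [(a[0] + b[0]) / 2 for a, b in zip(pairs, pairs[1:])]
    candidates.append(pairs[-1][0] + 1.0)
    best_threshold, best_agreement = candidates[0], -1
    for t in candidates:
        agreement = sum((1 if x > t else 0) == y for x, y in pairs)
        if agreement > best_agreement:
            best_threshold, best_agreement = t, agreement
    # Fidelity: fraction of samples where the student matches the teacher.
    return best_threshold, best_agreement / len(pairs)


random.seed(0)
xs = [random.uniform(-1.0, 1.0) for _ in range(200)]
labels = [black_box(x) for x in xs]  # the teacher's predictions, not ground truth
threshold, fidelity = fit_stump(xs, labels)
print(f"student rule: predict 1 if x > {threshold:.3f} (fidelity {fidelity:.2f})")
```

The student is trained on the teacher's outputs, so its quality is measured as fidelity to the teacher rather than accuracy on real labels. On a real network the transparent model would typically be richer (for example a decision tree), and fidelity would usually fall short of perfect.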