Lipton, Z. C. (2016). The Mythos of Model Interpretability. ICML Workshop on Human Interpretability in Machine Learning (WHI). pdf
Because the literature on interpretability comes with different definitions and ways of evaluating it, both of these position papers underline the need to formalize this term.
Often, interpretability serves as a proxy for various characteristics we seek in a machine learning model. These auxiliary or complementary characteristics, which might compete with each other, include reliability, causality, and trust.
Unfortunately, none of these has a formal criterion: they cannot be enforced explicitly through a mathematical equation or included in a cost function.
LIME - local + global explanations, evaluating trust, proxy to detect bias, robustness, can evaluate a multitude of models (a minimal sketch of its local-surrogate idea follows this list)
Ribeiro, M. T., Singh, S., & Guestrin, C. (2016). “Why Should I Trust You?”: Explaining the Predictions of Any Classifier. pdf
Grad-CAM - local explanations, evaluating reliability/trust, can evaluate convolutional neural networks
Selvaraju, R. R., Cogswell, M., Das, A., Vedantam, R., Parikh, D., & Batra, D. (2017). Grad-CAM: Visual Explanations from Deep Networks via Gradient-Based Localization. Proceedings of the IEEE International Conference on Computer Vision (ICCV), 618–626. pdf
LRP - local explanations, evaluating relevant features, can evaluate neural networks
Montavon, G., Samek, W., & Müller, K. R. (2018). Methods for interpreting and understanding deep neural networks. Digital Signal Processing: A Review Journal, 73, 1–15. pdf
DeepLIFT - local explanations, counterfactual, can evaluate neural networks
Shrikumar, A., Greenside, P., & Kundaje, A. (2017). Learning Important Features Through Propagating Activation Differences. pdf
Network Dissection - local explanations, model interpretability, interpretability = alignment with semantic concepts, can evaluate convolutional neural networks
Bau, D., Zhou, B., Khosla, A., Oliva, A., & Torralba, A. (2017). Network Dissection: Quantifying Interpretability of Deep Visual Representations. pdf
Influence functions - local explanations, counterfactual, can evaluate a multitude of models
Koh, P. W., & Liang, P. (2017). Understanding Black-box Predictions via Influence Functions. pdf
RRR - local and global explanations, can evaluate gradient-based (differentiable) models such as neural networks
Ross, A. S., Hughes, M. C., & Doshi-Velez, F. (2017). Right for the right reasons: Training differentiable models by constraining their explanations. IJCAI International Joint Conference on Artificial Intelligence, 2662–2670. pdf
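To make the local-surrogate idea behind LIME concrete, here is a minimal sketch, not the authors' implementation: it perturbs a single instance, queries the black-box model on the perturbations, weights the samples by their proximity to the instance, and fits a regularized linear model whose coefficients act as the local explanation. The `predict_proba` interface and the Gaussian perturbation scheme are simplifying assumptions for illustration.

```python
import numpy as np
from sklearn.linear_model import Ridge

def lime_style_explanation(instance, predict_proba, n_samples=5000,
                           kernel_width=0.75, rng=None):
    """Local linear surrogate around `instance` (LIME-style sketch).

    instance      : 1-D numpy array with the features of the point to explain
    predict_proba : black-box function mapping an (n, d) array to the
                    probability of the positive class (assumed interface)
    """
    rng = np.random.default_rng(rng)
    d = instance.shape[0]

    # 1. Perturb the instance with Gaussian noise around its feature values.
    perturbations = instance + rng.normal(scale=1.0, size=(n_samples, d))

    # 2. Query the black-box model on the perturbed points.
    labels = predict_proba(perturbations)

    # 3. Weight each perturbation by its proximity to the original instance.
    distances = np.linalg.norm(perturbations - instance, axis=1)
    weights = np.exp(-(distances ** 2) / (kernel_width ** 2))

    # 4. Fit a weighted, regularized linear model as the local surrogate.
    surrogate = Ridge(alpha=1.0)
    surrogate.fit(perturbations, labels, sample_weight=weights)

    # The coefficients are the local explanation: each feature's approximate
    # contribution to the prediction in the neighbourhood of `instance`.
    return surrogate.coef_
```

With a scikit-learn classifier `clf`, a call would look like `lime_style_explanation(x, lambda X: clf.predict_proba(X)[:, 1])`.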
Interpretation is highly valued in strictly regulated fields such as medicine. Rather than aiming at an end-to-end solution where the machine learning algorithm outputs a diagnosis, ideally replacing the doctor, medical AI targets sub-tasks that can assist doctors in making a more informed decision or help them become more efficient, for instance by filtering out some true positives or false negatives. By targeting sub-tasks, medical AI operates at a lower semantic level, which makes it more interpretable.
Due to complex interactions within the data, which might yield biases or spurious correlations, doctors favour a more interpretable machine learning method such as linear regression, even though its performance is not as high as that of deep learning methods. One example here:
Caruana, R., Lou, Y., Gehrke, J., Koch, P., Sturm, M., & Elhadad, N. (2015). Intelligible Models for HealthCare. Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining - KDD ’15, 1721–1730. pdf
Some medical AI research suffers from the same issues present in the whole AI field: exaggerated claims (e.g. claiming to solve a task while solving only a proxy sub-task), small datasets, and evaluation issues. Particularly in medicine, where diseases are rare, taking prevalence into account helps in assessing the robustness of a system.
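To illustrate why prevalence matters, here is a small worked example (the numbers are hypothetical, not taken from the cited papers): even with high sensitivity and specificity, the positive predictive value of a system collapses when the disease is rare.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """P(disease | positive prediction), via Bayes' rule."""
    true_positives = sensitivity * prevalence
    false_positives = (1 - specificity) * (1 - prevalence)
    return true_positives / (true_positives + false_positives)

# A hypothetical system with 95% sensitivity and 95% specificity:
print(positive_predictive_value(0.95, 0.95, 0.10))    # ~0.68 at 10% prevalence
print(positive_predictive_value(0.95, 0.95, 0.001))   # ~0.02 at 0.1% prevalence
```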
However, 2016–2017 brought a few breakthroughs in medical AI:
Gulshan, V., Peng, L., Coram, M., Stumpe, M. C., Wu, D., Narayanaswamy, A., … Webster, D. R. (2016). Development and validation of a deep learning algorithm for detection of diabetic retinopathy in retinal fundus photographs. JAMA - Journal of the American Medical Association, 316(22), 2402–2410. pdf
Salehinejad, H., Valaee, S., Mnatzakanian, A., Dowdell, T., Barfett, J., & Colak, E. (2017). Interpretation of Mammogram and Chest X-Ray Reports Using Deep Neural Networks - Preliminary Results. pdf
Merkow, J., Lufkin, R., Nguyen, K., Soatto, S., Tu, Z., & Vedaldi, A. (2017). DeepRadiologyNet: Radiologist Level Pathology Detection in CT Head Images, 1–22. pdf
These papers provide large datasets and a solid evaluation, which involves a ROC sensitivity/specificity analysis and a comparison with doctors. In contrast to other AI applications (e.g. self-driving cars, recommendation systems), none of these medical AI systems has been deployed in hospitals as a medical solution.
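As a sketch of this kind of evaluation (the labels, scores, and the doctor's operating point below are placeholders, not data from the cited papers), sensitivity/specificity pairs can be read off the model's ROC curve and compared against a doctor's single operating point:

```python
import numpy as np
from sklearn.metrics import roc_curve, roc_auc_score

# Hypothetical ground-truth labels and model scores (placeholders).
y_true = np.array([0, 0, 1, 1, 0, 1, 0, 1, 1, 0])
y_score = np.array([0.1, 0.3, 0.8, 0.7, 0.2, 0.9, 0.65, 0.6, 0.55, 0.35])

fpr, tpr, thresholds = roc_curve(y_true, y_score)
sensitivity = tpr          # true positive rate at each threshold
specificity = 1 - fpr      # true negative rate at each threshold
print("AUC:", roc_auc_score(y_true, y_score))

# A doctor's single operating point (hypothetical numbers) can be compared
# against the curve: what sensitivity does the model reach while matching
# the doctor's specificity?
doctor_sensitivity, doctor_specificity = 0.80, 0.90
model_sens_at_doctor_spec = sensitivity[specificity >= doctor_specificity].max()
print("Model sensitivity at the doctor's specificity:", model_sens_at_doctor_spec)
```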
People express greater confidence in a hypothesis, even when it is false, when asked to generate explanations for it.
Koehler, D. J. (1991). Explanation, imagination, and confidence in judgment. Psychological Bulletin, 110(3), 499–519. pdf
Explanations:
Explanations override:
Lombrozo, T. (2006). The structure and function of explanations. Trends in Cognitive Sciences, 10(10), 464–470. pdf
Explanations can be used to manipulate people’s decisions or to facilitate learning:
Baumeister, R. F., & Newman, L. S. (1994). Self-Regulation of Cognitive Inference and Decision Processes. Personality and Social Psychology Bulletin, 20(1), 3–19. pdf
Interpretability in machine learning advocates for human-centric explanations of black-box models. In simple tasks, the explanation is straightforward, for instance highlighting the pixels corresponding to a detected object in an image. For other tasks, which require slow judgements, a personalized explanation might reinforce one's confirmation biases; e.g. in topic detection, a personalized interpretation underlines the words that match one's own vocabulary. Therefore, in certain slow-judgement tasks, human-centric explanations used as evaluation baselines for interpretability can be biased towards certain individuals.