Region Grouping Based Counterfactual Visual Explanations for Fine-Grained Visual Categorization

Document Type



In this thesis, we consider counterfactual visual explanations for fine-grained classification tasks. Counterfactual instances were initially formulated within structural causal models, in which variables are interconnected via causal networks; given specified changes to some variables, the likelihood of outcomes for the remaining variables can be estimated under those conditions. However, in the vast majority of machine learning models, variables are not associated through structural causal models. We therefore consider a more relaxed problem statement involving non-causal counterfactuals. A non-causal counterfactual query asks what should be altered to obtain a desired result: if input X generated output Y, how should we modify X to generate output Z? We propose to use an adversarial attack to explain model outputs by generating counterfactual examples. The attack is applied to segmented image parts; in particular, we show that certain parts of an input image are the most informative of the predicted class. In addition, we propose a different approach for generating discriminant attributive maps. Our modified approach for discriminant explanations differentiates between the regions of the predicted and counterfactual classes. Specifically, using heatmaps of both the input query image and a distractor counterfactual image, the method emphasizes the regions that belong only to the predicted class in the original image and that pertain to the counterfactual class in the distractor image.
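The core idea of a region-restricted adversarial counterfactual can be illustrated with a minimal sketch. The snippet below is a toy illustration only, not the thesis's method: it assumes a linear two-class classifier, a flattened "image" vector, and a hypothetical binary region mask standing in for a segmented image part. The gradient ascent step toward the counterfactual class is confined to the masked region, so only that region is perturbed.

```python
import numpy as np

# Toy setup (all names here are illustrative assumptions):
# a linear 2-class classifier over a 16-dim flattened "image".
rng = np.random.default_rng(0)
W = rng.normal(size=(2, 16))            # classifier weights, one row per class
x = rng.normal(size=16)                 # flattened query "image"
mask = np.zeros(16)
mask[:4] = 1.0                          # hypothetical segmented region

def predict(v):
    """Return the argmax class of the linear classifier."""
    return int(np.argmax(W @ v))

source = predict(x)                     # originally predicted class
target = 1 - source                     # desired counterfactual class

# For a linear model, the gradient of the logit gap
# (target logit - source logit) w.r.t. x is constant.
# Zero it outside the region mask so only the segmented
# part of the input is allowed to change.
g = (W[target] - W[source]) * mask

x_cf = x.copy()
for _ in range(200):                    # small masked ascent steps
    if predict(x_cf) == target:
        break
    x_cf += 0.1 * g

# Pixels outside the mask are untouched; the perturbed region
# is what "explains" the flip to the counterfactual class.
changed = np.abs(x_cf - x) > 1e-9
```

In a real fine-grained setting the linear model would be replaced by a deep network, the mask by segmented object parts, and the fixed-step ascent by a proper attack, but the structure is the same: restrict the perturbation support to a candidate region and check whether that region alone can flip the prediction.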

First Page


Last Page


Publication Date



Thesis submitted to the Deanship of Graduate and Postdoctoral Studies

In partial fulfillment of the requirements for the M.Sc. degree in Machine Learning

Advisors: Dr. Karthik Nandakumar, Dr. Hang Dai

Online access provided for MBZUAI patrons