Why Our Prototype-based AI Beats Traditional Concept Learning Approaches

by Activation Function, April 8th, 2025

Too Long; Didn't Read

Explore a novel AI framework for concept interoperability, using self-supervised learning and prototypes to ensure concept stability across domains.


Authors:

(1) Sanchit Sinha, University of Virginia ([email protected]);

(2) Guangzhi Xiong, University of Virginia ([email protected]);

(3) Aidong Zhang, University of Virginia ([email protected]).

Abstract and 1 Introduction

2 Related Work

3 Methodology and 3.1 Representative Concept Extraction

3.2 Self-supervised Contrastive Concept Learning

3.3 Prototype-based Concept Grounding

3.4 End-to-end Composite Training

4 Experiments and 4.1 Datasets and Networks

4.2 Hyperparameter Settings

4.3 Evaluation Metrics and 4.4 Generalization Results

4.5 Concept Fidelity and 4.6 Qualitative Visualization

5 Conclusion and References

Appendix

5 Conclusion

In this paper, we study the relatively under-explored problem of concept interoperability, which involves learning domain-invariant concepts that generalize to similar tasks across domains. We introduce a novel Representative Concept Extraction framework that improves on existing self-explaining neural architectures by incorporating a Salient Concept Selection Network. We propose a Self-Supervised Contrastive Learning-based training paradigm to learn domain-invariant concepts, and subsequently propose a Concept Prototype-based regularization to minimize concept shift and maintain high fidelity. Empirical results on domain adaptation performance and fidelity scores show the efficacy of our approach in learning generalizable concepts and improving concept interoperability. Additionally, qualitative analysis demonstrates that our methodology not only learns domain-aligned concepts but also explains samples from both domains equally well. We hope our research helps the community utilize self-explainable models in domain alignment problems in the future.


Figure 5: Top-5 most important prototypes associated with randomly chosen concepts for a model trained with our methodology on the VisDA [TOP] and OfficeHome [BOTTOM] datasets, for the 3D → Real and Art (A) → Real (R) domain pairs respectively. The prototypes on the left are chosen from the training set of the source domain and those on the right from the target domain. In the VisDA dataset, Concept #6 captures samples with wings, namely airplanes and oddly shaped cars, while in OfficeHome, Concept #44 captures training samples with rounded faces in both domains, including alarm clocks, rotary telephones, etc. Similarly, Concept #29 captures flat screens such as TVs and monitors.


Figure 6: Top-5 most important prototypes associated with the highest-activated concept for a correctly predicted input sample from the target domain. The top sample is correctly predicted as Alarm Clock, and the prototypes associated with its most important concept are distinctly circular objects. Similarly, the prototypes associated with the sample from the bed class are mostly flat.


References

[Aggarwal et al., 2021] Ravi Aggarwal, Viknesh Sounderajah, Guy Martin, Daniel SW Ting, Alan Karthikesalingam, Dominic King, Hutan Ashrafian, and Ara Darzi. Diagnostic accuracy of deep learning in medical imaging: A systematic review and meta-analysis. NPJ Digital Medicine, 4(1):1–23, 2021.


[Alvarez-Melis and Jaakkola, 2018] David Alvarez-Melis and Tommi S Jaakkola. Towards robust interpretability with self-explaining neural networks. arXiv preprint arXiv:1806.07538, 2018.


[Bahadori and Heckerman, 2020] Mohammad Taha Bahadori and David E Heckerman. Debiasing concept bottleneck models with instrumental variables. arXiv preprint arXiv:2007.11500, 2020.


[Bracke et al., 2019] Philippe Bracke, Anupam Datta, Carsten Jung, and Shayak Sen. Machine learning explainability in finance: an application to default risk analysis. 2019.


[Chen et al., 2019] Runjin Chen, Hao Chen, Jie Ren, Ge Huang, and Quanshi Zhang. Explaining neural networks semantically and quantitatively. In ICCV, pages 9187–9196, 2019.


[Chen, 2020] Ting Chen. A simple framework for contrastive learning of visual representations. In ICML. PMLR, 2020.


[D’Amour et al., 2020] Alexander D’Amour, Katherine Heller, Dan Moldovan, Ben Adlam, Babak Alipanahi, Alex Beutel, Christina Chen, Jonathan Deaton, Jacob Eisenstein, Matthew D Hoffman, et al. Underspecification presents challenges for credibility in modern machine learning. JMLR, 2020.


[Elbaghdadi, 2020] Omar Elbaghdadi. DiSENN: Self-explaining neural networks: A review. https://github.com/AmanDaVinci/SENN, 2020.


[Ghorbani and Zou, 2019] Amirata Ghorbani and James Zou. Data shapley: Equitable valuation of data for machine learning. In ICML, pages 2242–2251. PMLR, 2019.


[Ghorbani et al., 2019] Amirata Ghorbani, James Wexler, James Zou, and Been Kim. Towards automatic concept-based explanations. NeurIPS, 2019.


[Gidaris et al., 2018] Spyros Gidaris, Praveer Singh, and Nikos Komodakis. Unsupervised representation learning by predicting image rotations. arXiv preprint arXiv:1803.07728, 2018.


[Goyal et al., 2019] Yash Goyal, Amir Feder, Uri Shalit, and Been Kim. Explaining classifiers with causal concept effect (cace). arXiv preprint arXiv:1907.07165, 2019.


[Huang et al., 2022] Jinbin Huang, Aditi Mishra, Bum-Chul Kwon, and Chris Bryan. Conceptexplainer: Understanding the mental model of deep learning algorithms via interactive concept-based explanations. arXiv preprint arXiv:2204.01888, 2022.


[Hull, 1994] Jonathan J. Hull. A database for handwritten text recognition research. IEEE Transactions on Pattern Analysis and Machine Intelligence, 16(5):550–554, 1994.


[Hutchinson and Mitchell, 2019] Ben Hutchinson and Margaret Mitchell. 50 years of test (un) fairness: Lessons for machine learning. In FAccT, pages 49–58, 2019.


[Jeyakumar et al., 2021] Jeya Vikranth Jeyakumar, Luke Dickens, Yu-Hsi Cheng, Joseph Noor, Luis Antonio Garcia, Diego Ramirez Echavarria, Alessandra Russo, Lance M Kaplan, and Mani Srivastava. Automatic concept extraction for concept bottleneck-based video classification. 2021.


[Kim et al., 2018] Been Kim, Martin Wattenberg, Justin Gilmer, Carrie Cai, James Wexler, Fernanda Viegas, et al. Interpretability beyond feature attribution: Quantitative testing with concept activation vectors (tcav). In ICML, pages 2668–2677. PMLR, 2018.


[Koh and Liang, 2017] Pang Wei Koh and Percy Liang. Understanding black-box predictions via influence functions. In ICML. PMLR, 2017.


[Koh et al., 2020] Pang Wei Koh, Thao Nguyen, Yew Siang Tang, Stephen Mussmann, Emma Pierson, Been Kim, and Percy Liang. Concept bottleneck models. In ICML, pages 5338–5348. PMLR, 2020.


[LeCun et al., 1998] Yann LeCun, Léon Bottou, Yoshua Bengio, and Patrick Haffner. Gradient-based learning applied to document recognition. Proceedings of the IEEE, 86(11):2278–2324, 1998.


[Leemann et al., 2022] Tobias Leemann, Yao Rong, Stefan Kraft, Enkelejda Kasneci, and Gjergji Kasneci. Coherence evaluation of visual concepts with objects and language. In ICLR 2022 Workshop on the Elements of Reasoning: Objects, Structure and Causality, 2022.


[Liu et al., 2021] Xiaoqing Liu, Kunlun Gao, Bo Liu, Chengwei Pan, Kongming Liang, Lifeng Yan, Jiechao Ma, Fujin He, Shu Zhang, Siyuan Pan, et al. Advances in deep learning-based medical image analysis. Health Data Science, 2021, 2021.


[Mincu et al., 2021] Diana Mincu, Eric Loreaux, Shaobo Hou, Sebastien Baur, Ivan Protsyuk, Martin Seneviratne, Anne Mottram, Nenad Tomasev, Alan Karthikesalingam, and Jessica Schrouff. Concept-based model explanations for electronic health records. In CHIL, pages 36–46, 2021.


[Murty et al., 2020] Shikhar Murty, Pang Wei Koh, and Percy Liang. Expbert: Representation engineering with natural language explanations. In ACL, pages 2106–2113, 2020.


[Netzer et al., 2011] Yuval Netzer, Tao Wang, Adam Coates, Alessandro Bissacco, Bo Wu, and Andrew Y Ng. Reading digits in natural images with unsupervised feature learning. 2011.


[Peng et al., 2017] Xingchao Peng, Ben Usman, Neela Kaushik, Judy Hoffman, Dequan Wang, and Kate Saenko. Visda: The visual domain adaptation challenge. arXiv preprint arXiv:1710.06924, 2017.


[Peng et al., 2019] Xingchao Peng, Qinxun Bai, Xide Xia, Zijun Huang, Kate Saenko, and Bo Wang. Moment matching for multi-source domain adaptation. In ICCV, pages 1406–1415, 2019.


[Pittino et al., 2021] Federico Pittino, Vesna Dimitrievska, and Rudolf Heer. Hierarchical concept bottleneck models for explainable images segmentation, objects fine classification and tracking. 2021.


[Raji et al., 2020] Inioluwa Deborah Raji, Andrew Smart, Rebecca N White, Margaret Mitchell, Timnit Gebru, Ben Hutchinson, Jamila Smith-Loud, Daniel Theron, and Parker Barnes. Closing the ai accountability gap: Defining an end-to-end framework for internal algorithmic auditing. In Proceedings of the 2020 conference on fairness, accountability, and transparency, pages 33–44, 2020.


[Ribeiro et al., 2016] Marco Tulio Ribeiro, Sameer Singh, and Carlos Guestrin. "Why should I trust you?": Explaining the predictions of any classifier. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2016.


[Rudin, 2019] Cynthia Rudin. Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead. Nature Machine Intelligence, 1(5):206–215, 2019.


[Saito et al., 2020] Kuniaki Saito, Donghyun Kim, Stan Sclaroff, and Kate Saenko. Universal domain adaptation through self supervision. NeurIPS, 33:16282–16292, 2020.


[Sarkar et al., 2022] Anirban Sarkar, Deepak Vijaykeerthy, Anindya Sarkar, and Vineeth N Balasubramanian. A framework for learning ante-hoc explainable models via concepts. In CVPR, pages 10286–10295, 2022.


[Sawada, 2022a] Yoshihide Sawada. C-SENN: Contrastive SENN. 2022.


[Sawada, 2022b] Yoshihide Sawada. Concept bottleneck model with additional unsupervised concepts. 2022.


[Sinha et al., 2021] Sanchit Sinha, Hanjie Chen, Arshdeep Sekhon, Yangfeng Ji, and Yanjun Qi. Perturbing inputs for fragile interpretations in deep natural language processing. In Proceedings of the Fourth BlackboxNLP Workshop on Analyzing and Interpreting Neural Networks for NLP, pages 420–434, 2021.


[Sinha et al., 2023] Sanchit Sinha, Mengdi Huai, Jianhui Sun, and Aidong Zhang. Understanding and enhancing robustness of concept-based models. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 37, pages 15127–15135, 2023.


[Sundararajan et al., 2017] Mukund Sundararajan, Ankur Taly, and Qiqi Yan. Axiomatic attribution for deep networks. In ICML, pages 3319–3328. PMLR, 2017.


[Szepannek and Lübke, 2021] Gero Szepannek and Karsten Lübke. Facing the challenges of developing fair risk scoring models. Frontiers in Artificial Intelligence, 4, 2021.


[Thota and Leontidis, 2021] Mamatha Thota and Georgios Leontidis. Contrastive domain adaptation. In CVPR, pages 2209–2218, 2021.


[Varoquaux and Cheplygina, 2022] Gaël Varoquaux and Veronika Cheplygina. Machine learning for medical imaging: methodological failures and recommendations for the future. NPJ Digital Medicine, 5(1):1–8, 2022.


[Venkateswara et al., 2017] Hemanth Venkateswara, Jose Eusebio, Shayok Chakraborty, and Sethuraman Panchanathan. Deep hashing network for unsupervised domain adaptation. In CVPR, pages 5018–5027, 2017.


[Wang and Liu, 2021] Feng Wang and Huaping Liu. Understanding the behaviour of contrastive loss. In CVPR, pages 2495–2504, 2021.


[Wang, 2023] Bowen Wang. Learning bottleneck concepts in image classification. In CVPR, pages 10962–10971, 2023.


[Weller, 2019] Adrian Weller. Transparency: motivations and challenges. In Explainable AI: interpreting, explaining and visualizing deep learning, pages 23–40. Springer, 2019.


[Wu et al., 2020] Weibin Wu, Yuxin Su, Xixian Chen, Shenglin Zhao, Irwin King, Michael R Lyu, and Yu-Wing Tai. Towards global explanations of convolutional neural networks with concept attribution. In CVPR, pages 8652–8661, 2020.


[Xu et al., 2019] Jiaolong Xu, Liang Xiao, and Antonio M. López. Self-supervised domain adaptation for computer vision tasks. IEEE Access, 7:156694–156706, 2019.


[Yeh et al., 2019] Chih-Kuan Yeh, Been Kim, Sercan O Arik, Chun-Liang Li, Tomas Pfister, and Pradeep Ravikumar. On completeness-aware concept-based explanations in deep neural networks. arXiv preprint arXiv:1910.07969, 2019.


[Yu and Lin, 2023] Yu-Chu Yu and Hsuan-Tien Lin. Semisupervised domain adaptation with source label adaptation. In CVPR, pages 24100–24109, 2023.


[Yuksekgonul et al., 2022] Mert Yuksekgonul, Maggie Wang, and James Zou. Post-hoc concept bottleneck models. arXiv preprint arXiv:2205.15480, 2022.


[Zaeem and Komeili, 2021] Mohammad Nokhbeh Zaeem and Majid Komeili. Cause and effect: Concept-based explanation of neural networks. In 2021 IEEE International Conference on Systems, Man, and Cybernetics (SMC), pages 2730–2736. IEEE, 2021.


[Zhou et al., 2018] Bolei Zhou, Yiyou Sun, David Bau, and Antonio Torralba. Interpretable basis decomposition for visual explanation. In ECCV, pages 119–134, 2018.


A Appendix

Structure of Appendix


Following the discussions in the main text, the Appendix is organized as follows:


• Dataset descriptions and visual samples


• Detailed discussion around RCE and algorithmic details for CCL and PCG (Pseudocode)


• More experimental results on key hyperparameters utilized in RCE and PCG


• Concept Fidelity Analysis


• Details on Baseline Replication


• Additional visual results - selected prototypes


• Additional visual results - domain-aligned prototypes

A.1 Dataset Description

Examples from the training sets of the datasets utilized in our approach are shown in Figures 7 (Digits), 8 (VisDA), 9 (DomainNet) and 10 (OfficeHome).



Figure 7: Visual examples of the same digit classes (top: 0, bottom: 9) from the digit classification datasets - MNIST, USPS and SVHN. All samples are drawn from the training set of each dataset.




Figure 8: Visual examples from the VisDA dataset corresponding to three classes - airplane, car and train. The top row shows computer-rendered 3D images from the training set, while the bottom row shows three examples of real images from the same classes.


A.2 Training Procedure - Details

Algorithm 1 depicts the overall pseudocode for training the Representative Concept Extraction (RCE) framework with Contrastive (CCL) and Prototype-based Grounding (PCG) regularization. The finer details of each part are as follows:



Figure 9: Visual examples from the DomainNet dataset corresponding to three classes - apple, binoculars and diamond. The rows show images sampled from the Real (R), Clipart (C), Painting (P) and Sketch (S) domains.




Figure 10: Some visual examples from the OfficeHome dataset corresponding to three classes - Alarm Clock, Calculator and Kettle. The rows demonstrate sample images from Real (R), Art (A), Clipart (C) and Product (P).



RCE: For networks F and H, we utilize ResNet34 architectures initialized with pre-trained ImageNet-1k weights. For network A, we first take the element-wise product between the outputs of F and H and then pass the result through A, a shallow 2-layer fully connected network. For network T, we utilize a 3-layer fully connected network that outputs a prediction from the necessary and sufficient concepts. The final prediction is a weighted sum of the outputs of networks A and T followed by a softmax layer. The prediction loss is the standard cross-entropy loss.
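
To make the wiring concrete, here is a minimal PyTorch sketch of this forward pass. Only the ResNet34 backbones, the element-wise product feeding A, the 2-layer A and 3-layer T heads, and the weighted-sum prediction come from the text; the class name, hidden widths, and the fusion weight alpha are illustrative assumptions.

```python
import torch.nn as nn
from torchvision.models import resnet34, ResNet34_Weights

class RCE(nn.Module):
    """Sketch of the RCE forward pass; hidden widths and alpha are assumptions."""

    def __init__(self, num_concepts, num_classes, alpha=0.5):
        super().__init__()
        # F and H: ResNet34 backbones with pre-trained ImageNet-1k weights.
        self.F = resnet34(weights=ResNet34_Weights.IMAGENET1K_V1)
        self.H = resnet34(weights=ResNet34_Weights.IMAGENET1K_V1)
        self.F.fc = nn.Linear(512, num_concepts)  # concept activations
        self.H.fc = nn.Linear(512, num_concepts)  # relevance scores
        # A: shallow 2-layer FC network over the element-wise product.
        self.A = nn.Sequential(nn.Linear(num_concepts, 128), nn.ReLU(),
                               nn.Linear(128, num_classes))
        # T: 3-layer FC network predicting from the concepts alone.
        self.T = nn.Sequential(nn.Linear(num_concepts, 256), nn.ReLU(),
                               nn.Linear(256, 128), nn.ReLU(),
                               nn.Linear(128, num_classes))
        self.alpha = alpha  # weight of A's output in the final sum

    def forward(self, x):
        concepts = self.F(x)
        relevances = self.H(x)
        fused = concepts * relevances  # element-wise product of F and H outputs
        logits = self.alpha * self.A(fused) + (1 - self.alpha) * self.T(concepts)
        # Softmax + cross-entropy are fused in nn.CrossEntropyLoss at train time.
        return logits, concepts
```

At training time, applying nn.CrossEntropyLoss to the returned logits reproduces the softmax-plus-cross-entropy prediction loss described above.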


PCG: For the selection of prototypical samples, we combine samples from the source and target domains, selecting 5 samples from the source domain and 1 from the target domain. Note that the PCG regularization starts only after the first step. Grounding ensures that outlying concept representations in the target domain are pulled toward the abundant source domain representations. For the concept bank, we utilize these 6 (5+1) prototypes for each class.
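
The grounding step can be pictured as a penalty that pulls each target-domain concept vector toward its nearest per-class prototype. The sketch below is our reading of that idea, not the paper's released code; the concept_bank layout (num_classes × 6 prototypes × concept dimension) and the nearest-prototype choice are assumptions.

```python
import torch

def pcg_loss(concepts: torch.Tensor,      # (B, C) concept vectors of a batch
             labels: torch.Tensor,        # (B,) class labels
             concept_bank: torch.Tensor,  # (num_classes, 6, C) prototype bank
             ) -> torch.Tensor:
    """Mean distance of each sample to its nearest same-class prototype."""
    protos = concept_bank[labels]                       # (B, 6, C)
    dists = torch.cdist(concepts.unsqueeze(1), protos)  # (B, 1, 6) pairwise L2
    nearest = dists.squeeze(1).min(dim=1).values        # (B,) nearest prototype
    return nearest.mean()
```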




A.3 Results on Key Hyperparameters

Number of Concepts


The first 3 columns of Table 6 list the domain adaptation performance on the OfficeHome dataset across 12 different data settings (listed in rows). We evaluate performance by varying the number of concepts C (and, by extension, the relevance scores S). We choose the base setting where the number of concepts equals the number of classes because we want each class to be represented by at least one concept. We observe that increasing the number of concepts has no significant effect on performance, which indicates that the relevant concept information is encoded in a small number of concepts. In other words, the concept vector is sparse.


Concept Dimensionality


The last 3 columns of Table 6 list performance when varying the concept dimensionality d (dim). Note that non-unit-dimensional concepts are not directly interpretable and remain an active area of research [Sarkar et al., 2022]. Nevertheless, we report the performance numbers across concept dimensionalities. We observe that with increasing concept dimensionality, the performance on target domains increases in almost all settings. This is expected for two reasons: 1) increasing concept dimensionality increases the richness of information encoded in each concept during contrastive learning, and 2) increased dimensionality increases the expressiveness of the architecture itself.



Table 6: Effect of the most important hyperparameters - number of concepts [LEFT] and dimensionality of concepts [RIGHT] - on the domain adaptation performance. The asterisk (*) indicates that non-unit concept dimensionalities are not directly interpretable.



Size of Representative set for PCG


Table 7 shows the performance on the OfficeHome dataset for two settings of the pre-selected representative prototypes. For all experiments in the main paper, we utilize 5 prototypes from the source domain and 1 from the target domain, for a total of 6 prototype samples for grounding. Note that it is usually not possible to use many prototypes from the target domain, as our setting corresponds to the 3-shot setting in [Yu and Lin, 2023]. We show the performance with 5 and 7 selected prototypes from the source domain in Table 7. We observe that increasing the number of prototypes does not improve performance: in most cases performance does not change significantly, and in a few cases it drops. This implies that only a minimal set of necessary and sufficient grounding prototypes is required, which is consistent with intuition, since selecting more than the requisite number of prototypes only increases the computation time for concept representations.


Distances from the Concept Representation prototypes


Table 7 also lists the average normalized distance between the concept representations of the target domain and the concept representations of the selected prototypes for varying values of λ1, which controls the strength of supervision in PCG. We demonstrate the ablation at the two extremes: λ1 = 0 for no regularization and λ1 = 1 for very strong regularization. We observe that both extremes hurt generalization performance, implying a tradeoff between regularization and generalization.
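
For reference, one plausible implementation of the reported distance is the mean L2 distance from each unit-normalized target concept vector to its nearest prototype representation. The unit-normalization choice here is an assumption; the text does not pin it down.

```python
import torch
import torch.nn.functional as F

def avg_normalized_distance(target_concepts: torch.Tensor,  # (N, C)
                            proto_concepts: torch.Tensor    # (M, C)
                            ) -> float:
    """Mean nearest-prototype L2 distance between unit-normalized vectors."""
    t = F.normalize(target_concepts, dim=-1)  # (N, C) unit vectors
    p = F.normalize(proto_concepts, dim=-1)   # (M, C) unit vectors
    d = torch.cdist(t, p)                     # (N, M) pairwise L2 distances
    return d.min(dim=1).values.mean().item()
```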

A.4 Concept Fidelity Analysis

Table 8 lists the consolidated concept fidelity scores on all four datasets. Note: this table is the complete version of Table 5 in the main text. Concept overlap on all datasets is highest for either our approach or BotCL, the two approaches with explicit fidelity regularizations. This demonstrates the efficacy of our approach in maintaining concept fidelity.

A.5 Baseline Replication

We compare our approach against 4 baselines - SENN [Alvarez-Melis and Jaakkola, 2018], DiSENN [Elbaghdadi, 2020], BotCL [Wang, 2023] and UnsupervisedCBM [Sawada, 2022b]. Although none of these approaches use domain adaptation as an evaluation method, we apply each proposed methodology directly in our settings. Care has been taken to stay close to each method's intended use; the modifications we introduce are listed below:


• SENN and DiSENN: We utilize the well-tested publicly available code [2] as the basic framework. We modify SENN and DiSENN to use ResNet34 (for objects), and the decoder is kept the same for all setups discussed. Because computing the robustness loss Lh on larger networks such as ResNet34 is very slow, we only compute it once every 10 steps.
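
The speed-up amounts to gating the expensive Jacobian-based term on the step counter. A minimal sketch of one training epoch under that scheme follows; robustness_loss stands in for SENN's Lh, and the weight lambda_h is a placeholder, not a value from the paper.

```python
def train_epoch(model, loader, optimizer, criterion,
                robustness_loss, lambda_h: float = 2e-4):
    """One epoch; robustness_loss is a stand-in for SENN's L_h term."""
    for step, (x, y) in enumerate(loader):
        optimizer.zero_grad()
        logits, concepts = model(x)          # model returns (logits, concepts)
        loss = criterion(logits, y)          # standard cross-entropy
        if step % 10 == 0:                   # costly Jacobian term, computed sparsely
            loss = loss + lambda_h * robustness_loss(model, x, concepts)
        loss.backward()
        optimizer.step()
```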


• BotCL: We utilize the publicly available code [3] with the same network architectures - LeNet for digits and ResNet34 for objects. Additionally, we amend the LeNet architecture to fit the BotCL framework.


• UnsupervisedCBM: UnsupervisedCBM is hard to train as it contains a mixture of supervised and unsupervised concepts. Since our approach does not utilize concept supervision, we only consider the unsupervised concepts and replace the supervised concepts with the one-hot encoding of the image classes. We utilize a fully connected layer for the discriminator network while simultaneously training a decoder. Though no publicly available implementation of UnsupCBM exists, we successfully replicate its main results.
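
The substitution itself is straightforward: the supervised concept slots are replaced by a one-hot encoding of the class label, concatenated with the unsupervised concepts. The function name and shapes below are our own illustration, not code from the paper.

```python
import torch
import torch.nn.functional as F

def build_concept_vector(unsup_concepts: torch.Tensor,  # (B, K_u) unsupervised
                         labels: torch.Tensor,          # (B,) class labels
                         num_classes: int) -> torch.Tensor:
    """Concatenate unsupervised concepts with a one-hot class encoding."""
    one_hot = F.one_hot(labels, num_classes).float()    # (B, num_classes)
    return torch.cat([unsup_concepts, one_hot], dim=1)  # (B, K_u + num_classes)
```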



Table 7: Effect of the size of the selected prototype set (5 vs. 7 source prototypes). The Dist columns give the average normalized L2 distance between concept representations and concept prototype representations in concept space, while Perf gives the performance. Note the balance between prototype distance and generalization: very small distances (λ1 = 1) fail to generalize effectively to the target domain, while no distance regularization (λ1 = 0) performs very close to unregularized approaches, underscoring the need for PCG.




Table 8: Average Intra-class Concept Fidelity scores for each domain for all settings where the domain is the target. Rows S, D, B and U respectively correspond to SENN, DiSENN, BotCL and UnsupCBM. Similarly, R, P and C correspond to RCE, RCE+PCG and RCE+PCG+CCL. The columns show the domains in each dataset.


A.6 Additional Visual Results - Selected Prototypes

Figures 11a and 11b showcase the top-5 most important prototypes for a given query image on the OfficeHome and DomainNet datasets respectively. In each row, both the query and the prototypes come from the target domain. We show results on all domains of both datasets to demonstrate that our proposed approach generalizes to all of them. Note that RCE is an overparameterized version of SENN, hence its performance relative to the baselines remains identical. Our proposed approach explains each query image with relevant prototypes, whereas the prototypes retrieved by the baselines are barely relevant.

A.7 Additional Visual Results - Domain Aligned

Figures 12a and 12b showcase the top-5 most important prototypes on the Digits and VisDA datasets respectively. In each row, the prototypes on the left come from the source domain and those on the right from the target domain. Our proposed approach explains each concept with relevant prototypes across domains.



Figure 11: Selected prototypes for the (a) OfficeHome and (b) DomainNet datasets respectively.








Figure 12: Domain-aligned prototype selection for [TOP] Digits - MNIST and USPS and [BOTTOM] the PACS dataset.







This paper is available on arXiv under a CC BY 4.0 DEED license.


[2] https://github.com/AmanDaVinci/SENN


[3] https://github.com/wbw520/BotCL

