4 Steps to Achieve Domain-Invariant Concept Learning in AI Systems

Authors:

(1) Sanchit Sinha, University of Virginia;

(2) Guangzhi Xiong, University of Virginia;

(3) Aidong Zhang, University of Virginia.

Table of Links

Abstract and 1 Introduction

3.2 Self-supervised Contrastive Concept Learning

3.3 Prototype-based Concept Grounding

3.4 End-to-end Composite Training

4 Experiments and 4.1 Datasets and Networks

4.2 Hyperparameter Settings

4.3 Evaluation Metrics and 4.4 Generalization Results

4.5 Concept Fidelity and 4.6 Qualitative Visualization

5 Conclusion and References

Appendix

3 Methodology

In this section, we first provide a detailed description of our proposed learning pipeline, including (a) the Representative Concept Extraction (RCE) framework, which incorporates a novel Salient Concept Selection Network in addition to the Concept and Relevance Networks, (b) Self-supervised Contrastive Concept Learning (CCL), which enforces domain invariance among learned concepts, and (c) a Prototype-based Concept Grounding (PCG) regularizer that mitigates the problem of concept shift across domains. We then detail the end-to-end training procedure, which adds a Concept Fidelity regularization to ensure concept consistency among similar samples.

3.1 Representative Concept Extraction

3.2 Self-supervised Contrastive Concept Learning

Although the RCE framework generates representative concepts, the extracted concepts are adulterated with domain noise, which limits their generalization. In addition, with limited training data, the concept extraction process is not robust. Self-supervised contrastive training objectives are the most commonly used paradigm [Thota and Leontidis, 2021] for learning robust visual features from images. We therefore incorporate self-supervised contrastive learning, termed CCL, to learn domain-invariant concepts.
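The contrastive idea can be illustrated with a small sketch. The exact CCL objective is not given in this section, so the code below assumes a generic NT-Xent-style loss applied to concept vectors: two augmented views of the same sample form a positive pair, and all other concept vectors in the batch act as negatives. The function names, the temperature value, and the use of cosine similarity are illustrative assumptions, not the authors' formulation.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-12)


def contrastive_concept_loss(view_a, view_b, temperature=0.5):
    """NT-Xent-style loss over two lists of concept vectors (assumed form).

    view_a[i] and view_b[i] hold the concepts of two augmentations of
    sample i (the positive pair); every other vector is a negative.
    """
    concepts = view_a + view_b
    n = len(view_a)
    total = 0.0
    for i, anchor in enumerate(concepts):
        positive = (i + n) % (2 * n)  # index of the other view of the same sample
        # Similarities to every other vector, including the positive one.
        logits = [cosine(anchor, other) / temperature
                  for j, other in enumerate(concepts) if j != i]
        pos_logit = cosine(anchor, concepts[positive]) / temperature
        # -log(exp(pos) / sum(exp(all))) written as log-sum minus pos.
        total += math.log(sum(math.exp(l) for l in logits)) - pos_logit
    return total / (2 * n)
```

Under this assumed form, the loss is small when the two views of each sample map to nearby concept vectors and large when they do not, which is the pressure that would push concepts toward invariance under augmentation.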

3.3 Prototype-based Concept Grounding

Concept Fidelity Regularization. Concept fidelity enforces similarity, under a similarity measure s(·, ·), between the concepts of data instances from the same class in the same domain. Formally,
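As a rough illustration of what such a term might compute, the sketch below averages the dissimilarity 1 - s(c_i, c_j) over all pairs of samples that share both a class label and a domain. Choosing cosine similarity for s(·, ·), averaging over pairs, and the function and argument names are all assumptions made for this example; the paper's own definition may differ.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity, used here as a stand-in for s(., .)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-12)


def concept_fidelity_penalty(concepts, labels, domains, similarity=cosine):
    """Mean dissimilarity over pairs that share both class and domain."""
    penalties = [
        1.0 - similarity(concepts[i], concepts[j])
        for i, j in combinations(range(len(concepts)), 2)
        if labels[i] == labels[j] and domains[i] == domains[j]
    ]
    # With no qualifying pair in the batch there is nothing to penalize.
    return sum(penalties) / len(penalties) if penalties else 0.0
```

Pairs from different classes or different domains are skipped entirely, so under this reading the penalty only constrains samples that are expected to share concepts.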

3.4 End-to-end Composite Training

Overall, the training objective can be formalized as a weighted sum of CCL and PCG objectives:

where λ1 and λ2 are tunable hyperparameters controlling the strength of contrastive learning and prototype grounding regularization. The end-to-end training objective can be represented as:

The tunable hyperparameter β controls the effect of generalization and robustness on the RCE framework. Note that a higher value of β makes the concept learning procedure brittle and unable to adapt to target domains. However, a very low value of β makes the concept learning procedure overfit on the source domain, implying a tradeoff between concept generalization and performance.
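The way the three hyperparameters interact can be sketched in a few lines. The text states only that the regularizer is a weighted sum of the CCL and PCG objectives; the assumption here is that β scales that regularizer and adds it to the RCE loss. The function names, default values, and the additive form are illustrative, not taken from the paper.

```python
def regularization_objective(ccl_loss, pcg_loss, lambda_1, lambda_2):
    """Weighted sum of the contrastive (CCL) and grounding (PCG) terms."""
    return lambda_1 * ccl_loss + lambda_2 * pcg_loss


def end_to_end_objective(rce_loss, ccl_loss, pcg_loss,
                         lambda_1=1.0, lambda_2=1.0, beta=0.5):
    """Assumed composite loss: RCE loss plus a beta-scaled regularizer."""
    regularizer = regularization_objective(ccl_loss, pcg_loss,
                                           lambda_1, lambda_2)
    # Assumption: beta enters additively; beta = 0 recovers the plain RCE loss.
    return rce_loss + beta * regularizer
```

In this reading, setting β to zero removes both regularizers and trains RCE alone on the source domain, which matches the overfitting behavior described for very low β.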

This paper is available on arXiv under the CC BY 4.0 DEED license.