
CFM: Language-aligned Concept Foundation Model for Vision

Language-aligned vision foundation models perform strongly across diverse downstream tasks. Yet their learned representations remain opaque, making their decision-making difficult to interpret. Recent work decomposes these representations into …

FaCT: Faithful Concept Traces for Explaining Neural Network Decisions

Deep networks have shown remarkable performance across a wide range of tasks, yet obtaining a global, concept-level understanding of how they function remains a key challenge. Many post-hoc concept-based approaches have been introduced to understand …