inherent-interpretability

Insight: Interpretable Semantic Hierarchies in Vision-Language Encoders

Language-aligned vision foundation models perform strongly across diverse downstream tasks. Yet their learned representations remain opaque, making their decision-making difficult to interpret. Recent works decompose these representations into …

FaCT: Faithful Concept Traces for Explaining Neural Network Decisions

Deep networks have shown remarkable performance across a wide range of tasks, yet obtaining a global, concept-level understanding of how they function remains a key challenge. Many post-hoc concept-based approaches have been introduced to understand …