foundation-models

Insight: Interpretable Semantic Hierarchies in Vision-Language Encoders

Language-aligned vision foundation models perform strongly across diverse downstream tasks. Yet, their learned representations remain opaque, making interpreting their decision-making hard. Recent works decompose these representations into …