ALOE is a method to transform large-scale ViT-based foundation models such as DINOv3 and SigLIP2 into inherently interpretable B-cos variants at a fraction of the cost of training from scratch, bringing interpretability while retaining strong performance across a range of downstream tasks and datasets.
B-cos LMs extend B-cos networks to language models, providing more faithful and human-interpretable explanations than post-hoc methods while maintaining comparable task performance.
B-cosification is a method to transform existing pre-trained models into inherently interpretable B-cos variants at a fraction of the cost of training from scratch, yielding models that are interpretable while often outperforming their from-scratch B-cos counterparts in classification performance. We also apply B-cosification to CLIP and show that the B-cosified version remains competitive in performance while being interpretable.