We propose three novel evaluation schemes to better understand the faithfulness of attribution methods and the differences between them, and use these schemes to study the strengths and shortcomings of some widely used attribution methods.
We show that location optimization significantly strengthens adversarial patch attacks, and then show that adversarial training on these stronger attacks significantly improves robustness without reducing accuracy.