SpRAy (Spectral Relevance Analysis) clusters explanation heatmaps to reveal systematic attention patterns that can indicate Clever Hans behavior.
This implementation follows Lapuschkin et al. (2019) and is designed to work with any heatmap generator (e.g., GradCAM). You can provide precomputed heatmaps directly or let SpRAy generate them via a pluggable heatmap generator.
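
The clustering idea described above can be sketched in plain NumPy. This is a minimal illustration, not the code inside `SpRAyDetector`: it assumes a cosine affinity, a symmetric normalized graph Laplacian, an eigengap heuristic to pick the number of clusters, and a small hand-written k-means. All function names here are hypothetical.

```python
import numpy as np

def cosine_affinity(X):
    """Pairwise cosine similarity between flattened heatmaps, clipped to [0, 1]."""
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    Xn = X / np.maximum(norms, 1e-12)
    return np.clip(Xn @ Xn.T, 0.0, 1.0)

def laplacian_spectrum(A):
    """Eigenvalues (ascending) and eigenvectors of the normalized Laplacian."""
    d = A.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(d, 1e-12))
    L = np.eye(len(A)) - d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]
    return np.linalg.eigh(L)

def eigengap_k(eigvals, max_clusters=8):
    """Choose k at the largest gap among the smallest eigenvalues."""
    gaps = np.diff(eigvals[: max_clusters + 1])
    return int(np.argmax(gaps)) + 1

def kmeans(Z, k, n_iter=50):
    """Lloyd's k-means with deterministic farthest-point initialization."""
    centers = [Z[0]]
    for _ in range(1, k):
        d2 = np.min([((Z - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(Z[int(np.argmax(d2))])
    centers = np.array(centers)
    for _ in range(n_iter):
        d2 = ((Z[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        assign = d2.argmin(axis=1)
        new_centers = np.array([
            Z[assign == j].mean(axis=0) if np.any(assign == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return assign

def spectral_cluster(heatmaps, max_clusters=8):
    """Cluster (N, H, W) heatmaps; returns (cluster ids, chosen k). Needs N >= 2."""
    X = heatmaps.reshape(len(heatmaps), -1).astype(float)
    eigvals, eigvecs = laplacian_spectrum(cosine_affinity(X))
    k = eigengap_k(eigvals, max_clusters)
    Z = eigvecs[:, :k]
    # Row-normalize the spectral embedding before k-means
    Z = Z / np.maximum(np.linalg.norm(Z, axis=1, keepdims=True), 1e-12)
    return kmeans(Z, k), k

# Synthetic demo: two groups of heatmaps that focus on different regions
rng = np.random.default_rng(0)
demo = 0.05 * rng.random((20, 16, 16))
demo[:10, 0:4, 0:4] += 1.0     # group A attends to the top-left corner
demo[10:, 6:10, 6:10] += 1.0   # group B attends to the center
cluster_ids, k = spectral_cluster(demo)
print(k, cluster_ids)
```

On real data the affinity is rarely this clean, so the eigengap can be ambiguous; capping the search with a parameter like `max_clusters` keeps the choice within a range you can inspect by hand.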
Use SpRAy when:
- You can compute explanation heatmaps for a set of samples.
- You want to discover systematic attention artifacts.
- You need an interpretable, cluster-based view of model focus.
Do not use SpRAy when:
- You have very few samples (spectral clustering becomes unstable).
- Heatmaps are extremely noisy or poorly aligned across samples.

Using precomputed heatmaps:

    import numpy as np
    from shortcut_detect import SpRAyDetector

    # heatmaps: (N, H, W)
    heatmaps = np.load("heatmaps.npy")
    labels = np.load("labels.npy")

    detector = SpRAyDetector(
        affinity="cosine",
        cluster_selection="auto",
        downsample_size=32,
    )
    detector.fit(heatmaps=heatmaps, labels=labels)

    # Inspect the Clever Hans section of the report
    report = detector.get_report()
    print(report["report"]["clever_hans"])

Letting SpRAy generate heatmaps from a model:

    import torch
    from shortcut_detect import SpRAyDetector

    # weights_only=False unpickles arbitrary objects; load only trusted checkpoints
    model = torch.load("model.pt", map_location="cpu", weights_only=False)
    inputs = ...  # tensor of shape (N, C, H, W)

    detector = SpRAyDetector(
        affinity="rbf",
        cluster_selection="eigengap",
        max_clusters=8,
    )
    detector.fit(
        inputs=inputs,
        model=model,
        target_layer="backbone.layer4",
        head="logits",
    )
    print(detector.summary())

The report includes:

- clusters: size, purity, focus (localization) stats per cluster
- clever_hans: heuristic flags + risk level
- representative_heatmaps: mean heatmap per cluster for inspection
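
To make the per-cluster statistics listed above concrete, here is a hedged sketch of how size, purity, and a focus score could be computed from cluster assignments. The definitions are assumptions for illustration: purity is the share of the most common label in a cluster, and focus is the fraction of the mean heatmap's mass that falls in its top 5% of pixels. The thresholds in `flag_suspicious` are also hypothetical, and none of this is taken from the library's internals.

```python
import numpy as np

def cluster_stats(heatmaps, labels, cluster_ids, top_frac=0.05):
    """Size, purity, and focus per cluster for (N, H, W) heatmaps."""
    flat = np.abs(heatmaps.reshape(len(heatmaps), -1)).astype(float)
    n_top = max(1, int(round(top_frac * flat.shape[1])))
    stats = {}
    for c in np.unique(cluster_ids):
        mask = cluster_ids == c
        _, counts = np.unique(labels[mask], return_counts=True)
        mean_map = flat[mask].mean(axis=0)
        # Mass held by the strongest pixels of the cluster's mean heatmap
        top_mass = np.sort(mean_map)[-n_top:].sum()
        stats[int(c)] = {
            "size": int(mask.sum()),
            "purity": float(counts.max() / mask.sum()),
            "focus": float(top_mass / max(mean_map.sum(), 1e-12)),
        }
    return stats

def flag_suspicious(stats, n_total, max_size_frac=0.4, min_purity=0.9, min_focus=0.5):
    """Return ids of small, label-pure, tightly focused clusters (assumed thresholds)."""
    return [
        c for c, s in stats.items()
        if s["size"] / n_total <= max_size_frac
        and s["purity"] >= min_purity
        and s["focus"] >= min_focus
    ]

# Tiny example: a diffuse mixed-label cluster and a sharply focused single-label one
maps = np.zeros((6, 10, 10))
maps[:4] = 1.0            # cluster 0: relevance spread over the whole image
maps[4:, 0, 0] = 1.0      # cluster 1: all relevance on one corner pixel
example_labels = np.array([0, 0, 1, 1, 1, 1])
example_clusters = np.array([0, 0, 0, 0, 1, 1])

stats = cluster_stats(maps, example_labels, example_clusters)
flagged = flag_suspicious(stats, n_total=len(maps))
print(stats)
print(flagged)
```

A flag from a heuristic like this is only a prompt to look at the underlying inputs; it does not by itself show that the model relies on an artifact.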
Tips:
- Start with `affinity="cosine"` and `cluster_selection="auto"`.
- Use `downsample_size=32` or `64` to speed up clustering.
- If you see small, high-purity clusters with very localized attention, investigate the corresponding inputs for artifacts.
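
To illustrate why downsampling speeds up clustering, the sketch below average-pools heatmaps to a smaller grid before they are flattened into feature vectors. How `downsample_size` is actually applied inside the detector is not documented here, so treat block averaging as one plausible choice; `downsample_heatmaps` is a hypothetical helper.

```python
import numpy as np

def downsample_heatmaps(heatmaps, size):
    """Average-pool (N, H, W) heatmaps to (N, size, size).

    Assumes H and W are exact multiples of `size`.
    """
    n, h, w = heatmaps.shape
    if h % size or w % size:
        raise ValueError("H and W must be divisible by the target size")
    # Split each axis into (blocks, pixels per block) and average within blocks
    blocks = heatmaps.reshape(n, size, h // size, size, w // size)
    return blocks.mean(axis=(2, 4))

# Two 4x4 heatmaps pooled down to 2x2
small_demo = np.arange(2 * 4 * 4, dtype=float).reshape(2, 4, 4)
pooled = downsample_heatmaps(small_demo, 2)
print(pooled.shape)
print(pooled[0])
```

Pooling a 224x224 heatmap to 32x32 cuts each feature vector from 50,176 to 1,024 values, which makes the pairwise affinity computation much cheaper while keeping the coarse location of the relevance.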