Fix F.pad axis swap in pad_to_training_size by sidd462 · Pull Request #8 · MrGiovanni/R-Super

sidd462 · 2026-06-21T17:29:05Z

Closes #7.

Summary

pad_to_training_size() in rsuper_train/predict_abdomenatlas.py had two F.pad(...) calls whose padding tuples were ordered for the wrong axis: the z-axis branch padded W, and the x-axis branch padded D. When an input volume was too small on the z- or x-axis, that axis stayed too small and the next layer rejected the shape. The postprocess block's bare except then swallowed the exception, so the only signal was a single FAILED postprocess line with no context. 25 of 901 PanTS-te cases were silently lost this way.

This PR makes two changes in one file:

Fixes the F.pad axis order so each branch pads the axis it claims to.
Upgrades the bare-except in the postprocess block so a future silent failure of this class isn't possible.

Why the original code was wrong

torch.nn.functional.pad reads its padding tuple last-dim-first. For a 5D tensor of shape (N, C, D, H, W) the tuple has to be ordered:

(W_left, W_right, H_left, H_right, D_left, D_right)

So the tuple controls W first, then H, then D — not the other way round.

In the original code:

The z-axis branch (where z < args.training_size[0]) intended to pad D, but passed (diff, diff, 0,0, 0,0). By the last-dim-first rule that pads W (and leaves D untouched).
The x-axis branch (where x < args.training_size[2]) intended to pad W, but passed (0,0, 0,0, diff, diff). By the same rule, that pads D.

So each branch was padding the axis that the other branch was supposed to handle. The axis that was actually too small was never enlarged, and the next layer threw on a shape mismatch.
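The effect of the last-dim-first rule on the z-axis branch can be sketched without PyTorch. The helper below, padded_shape, is a hypothetical pure-Python model of how F.pad changes a tensor's shape under constant padding; it is not part of the repository, and the shape and diff values are illustrative rather than taken from a real case.

```python
def padded_shape(shape, pad):
    """Model the output shape of F.pad for a given padding tuple.

    pad[0:2] applies to the last dimension, pad[2:4] to the
    second-to-last, and so on (last-dim-first ordering).
    """
    if len(pad) % 2 != 0 or len(pad) // 2 > len(shape):
        raise ValueError("pad must hold (left, right) pairs for trailing dims")
    out = list(shape)
    for i in range(len(pad) // 2):
        dim = len(shape) - 1 - i  # walk backwards from the last dimension
        out[dim] += pad[2 * i] + pad[2 * i + 1]
    return tuple(out)


# Illustrative volume whose D (z) extent is below an assumed target of 128.
shape = (1, 1, 100, 128, 128)  # (N, C, D, H, W)
diff = 14                      # illustrative: 100 + 2 * 14 == 128

buggy = padded_shape(shape, (diff, diff, 0, 0, 0, 0))  # original z-branch tuple
fixed = padded_shape(shape, (0, 0, 0, 0, diff, diff))  # tuple after the swap

print(buggy)  # (1, 1, 100, 128, 156): W grew, D is still 100
print(fixed)  # (1, 1, 128, 128, 128): D reaches 128
```

With the original tuple the too-small D axis is left at 100 while W is enlarged, which matches the shape mismatch described above.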

The fix (1/2) — swap the tuples between the branches

@@ pad_to_training_size — z-axis branch (line 256)
-            tensor_img = F.pad(tensor_img, (diff, diff, 0,0, 0,0))
+            tensor_img = F.pad(tensor_img, (0,0, 0,0, diff, diff))

@@ pad_to_training_size — x-axis branch (line 274)
-            tensor_img = F.pad(tensor_img, (0,0, 0,0, diff, diff))
+            tensor_img = F.pad(tensor_img, (diff, diff, 0,0, 0,0))

After the swap:

z-axis branch passes (0,0, 0,0, diff, diff) → pads D (the z axis). Correct.
x-axis branch passes (diff, diff, 0,0, 0,0) → pads W (the x axis). Correct.

Each branch now pads the axis named in its own guard. The other axes get (0, 0) so they're untouched.
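One way to make this class of mistake harder to reintroduce is to build the tuple from an axis name instead of writing it by hand. The sketch below is not in this PR; pad_tuple and its axis names are hypothetical, and it assumes the (N, C, D, H, W) layout with z = D and x = W as described above, plus y = H by extension.

```python
# Position of each axis's (left, right) pair inside the 6-element tuple.
# F.pad orders pairs last-dim-first, so W (x) comes first and D (z) last.
_PAIR_INDEX = {"x": 0, "y": 1, "z": 2}


def pad_tuple(axis, before, after):
    """Return a 6-element F.pad tuple that pads only the named spatial axis."""
    if axis not in _PAIR_INDEX:
        raise ValueError(f"unknown axis {axis!r}; expected 'x', 'y' or 'z'")
    pad = [0] * 6
    i = 2 * _PAIR_INDEX[axis]
    pad[i], pad[i + 1] = before, after
    return tuple(pad)


print(pad_tuple("z", 14, 14))  # (0, 0, 0, 0, 14, 14)
print(pad_tuple("x", 14, 14))  # (14, 14, 0, 0, 0, 0)
```

The two printed tuples have the same form as the corrected z-axis and x-axis calls in the hunks above.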

The fix (2/2) — surface postprocess failures instead of swallowing them

The bug was easy to miss for one reason: the postprocess block above the padding code had except: (catches everything) and printed only FAILED postprocess. No error type, no traceback, no case ID. That's why an axis-order bug that triggers on ~3% of cases shipped unnoticed.

-        except:
-            print('FAILED postprocess')
+        except Exception as e:
+            import traceback
+            print(f'FAILED postprocess for {img_name}: {type(e).__name__}: {e}')
+            traceback.print_exc()

Why this is bundled with the F.pad fix and not a separate PR: the bare-except is what hid this bug. Replacing it with a diagnosable handler tackles the same failure from the detection side: if another shape-related bug crops up in pad_to_training_size (or anywhere else in the postprocess pipeline), it will be reported with its case ID and traceback instead of silently dropping cases. I'd rather land both together than ship the F.pad fix and leave the silent-swallowing scaffolding in place.

If the maintainers prefer the bare-except upgrade to be a separate PR, happy to split it out.
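The handler pattern from the hunk above can be sketched in isolation. The version below goes slightly beyond the PR by also collecting the failed case names so a run can report them at the end; run_postprocess and toy_postprocess are hypothetical stand-ins, not functions from the repository.

```python
import traceback


def run_postprocess(cases, postprocess):
    """Run postprocess on each (name, data) pair and return the names that failed."""
    failed = []
    for img_name, data in cases:
        try:
            postprocess(data)
        except Exception as e:
            # Unlike a bare `except:`, this lets KeyboardInterrupt and
            # SystemExit propagate, and it reports what actually went wrong.
            print(f"FAILED postprocess for {img_name}: {type(e).__name__}: {e}")
            traceback.print_exc()
            failed.append(img_name)
    return failed


def toy_postprocess(shape):
    # Stand-in for a layer that rejects undersized volumes.
    if min(shape) < 128:
        raise ValueError(f"expected every axis >= 128, got {shape}")


failed = run_postprocess(
    [("case_a", (128, 128, 128)), ("case_b", (100, 128, 128))],
    toy_postprocess,
)
print(failed)  # ['case_b']
```

Returning the list of failures makes the count of dropped cases visible to the caller, rather than leaving it buried in the log.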

Diff stats

1 file changed, 6 insertions(+), 4 deletions(-), all in rsuper_train/predict_abdomenatlas.py. 3 hunks total: 2 single-line F.pad swaps + the except-block expansion. No other files changed, no dependencies added.

Verification

Tested with the R-Super checkpoint on PanTS-te (n=901):

Before: 25 of 901 cases failed with FAILED postprocess. No prediction outputs were written for those cases, so a downstream evaluator that globs for predictions/*.nii.gz simply never saw them — exactly the loss-by-silence the bare-except was producing.
After: all 901 cases produce a full predictions/<class>.nii.gz tree.
The 25 previously-failing cases all share the property that their post-preprocessing shape is < 128 on the z- or x-axis — i.e. they hit exactly the broken branch.
python -m py_compile rsuper_train/predict_abdomenatlas.py passes.
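A completeness check along these lines can catch silently dropped cases before evaluation. This is a minimal sketch under an assumed output layout of <output_root>/<case_id>/predictions/<class>.nii.gz; the function name, the directory layout, and the class name pancreas used in the demo are all assumptions, not taken from the repository.

```python
import tempfile
from pathlib import Path


def cases_without_predictions(output_root, case_ids):
    """Return case IDs that have no predictions/*.nii.gz under output_root/<case_id>/."""
    missing = []
    for case_id in case_ids:
        pred_dir = Path(output_root) / case_id / "predictions"
        if not pred_dir.is_dir() or not any(pred_dir.glob("*.nii.gz")):
            missing.append(case_id)
    return missing


# Demo on a throwaway directory: one case with output, one without.
with tempfile.TemporaryDirectory() as root:
    ok_dir = Path(root) / "case_ok" / "predictions"
    ok_dir.mkdir(parents=True)
    (ok_dir / "pancreas.nii.gz").touch()   # hypothetical class name
    (Path(root) / "case_lost").mkdir()     # case directory with no predictions
    missing = cases_without_predictions(root, ["case_ok", "case_lost"])

print(missing)  # ['case_lost']
```

Comparing against the full list of expected case IDs, rather than globbing for whatever exists, is what makes a missing case show up.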

Commit 505408e: Fix F.pad axis swap in pad_to_training_size

sidd462 wants to merge 1 commit into MrGiovanni:main from sidd462:fix-pad-to-training-size
