feat: polish mode — refine an existing figure with style-guided suggestions by dippatel1994 · Pull Request #247 · llmsresearch/paperbanana

dippatel1994 · 2026-06-11T20:06:42Z

Fixes #238 (rollback half shipped in #243; this is polish mode).

paperbanana polish --input figure.png — bring your own figure:

Suggest: a VLM audits the figure against the venue style guide (--venue; picks up the output of guidelines synthesize) and proposes ≤10 concrete improvements. Parsing is robust to numbered, bulleted, and fenced lists; a NO_SUGGESTIONS sentinel exits with the figure unchanged.
Apply: true guided edit — GoogleImagenGen gained an optional images kwarg so Gemini edits the actual figure rather than regenerating from text. Providers without guided-edit support are rejected with an actionable error (capability detected by signature; contract documented in the base class). No silent fallbacks.
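
To make the Suggest step's parsing concrete, here is a minimal sketch of how a VLM reply in numbered, bulleted, or fenced form might be reduced to at most 10 suggestions. The function name `parse_suggestions`, the regex, and the constants are illustrative assumptions, not the code in paperbanana/agents/polish.py.

```python
import re

MAX_SUGGESTIONS = 10          # cap described in the PR
SENTINEL = "NO_SUGGESTIONS"   # sentinel named in the PR

# Hypothetical pattern: a leading "1." / "2)" / "-" / "*" / "•" list marker.
_ITEM_RE = re.compile(r"^\s*(?:\d+[.)]|[-*•])\s+(.*\S)\s*$")


def parse_suggestions(raw: str) -> list[str]:
    """Extract up to MAX_SUGGESTIONS list items from a VLM reply."""
    # Drop code-fence delimiter lines but keep whatever sits between them.
    lines = [ln for ln in raw.splitlines() if not ln.strip().startswith("```")]
    if "\n".join(lines).strip() == SENTINEL:
        return []  # nothing to improve: the caller leaves the figure unchanged
    suggestions = []
    for line in lines:
        match = _ITEM_RE.match(line)
        if match:
            suggestions.append(match.group(1))
    return suggestions[:MAX_SUGGESTIONS]
```

In this sketch, lines that carry no list marker (preamble, closing remarks) are ignored rather than treated as suggestions; the real parser may be more lenient.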

--iterations N repeats suggest→apply; --num-candidates fans out the apply step in parallel; budget guard + cost summary wired like generate.
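
The control flow described above could look roughly like the following sketch. It assumes a thread pool for the fan-out and treats `suggest`, `apply`, and `apply_once` as hypothetical callables standing in for the agent's steps; whether the PR uses threads or asyncio is not stated here.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Optional


def polish_loop(image: bytes,
                suggest: Callable[[bytes], list],
                apply: Callable[[bytes, list], bytes],
                iterations: int) -> bytes:
    """Repeat suggest -> apply, feeding each result into the next round."""
    for _ in range(iterations):
        suggestions = suggest(image)
        if not suggestions:   # e.g. the NO_SUGGESTIONS case: stop early
            break
        image = apply(image, suggestions)
    return image


def fan_out(apply_once: Callable[[int], bytes],
            num_candidates: int) -> list[Optional[bytes]]:
    """Run apply_once(i) for every candidate in parallel; a failure becomes None."""
    def safe(i: int) -> Optional[bytes]:
        try:
            return apply_once(i)
        except Exception:
            return None       # one failed candidate should not sink the rest

    with ThreadPoolExecutor(max_workers=num_candidates) as pool:
        return list(pool.map(safe, range(num_candidates)))  # order preserved


def pick_primary(results: list[Optional[bytes]]) -> Optional[bytes]:
    """Return the first successful candidate, or None if all failed."""
    return next((r for r in results if r is not None), None)
```

Swallowing exceptions inside `safe` is a simplification for the sketch; a real implementation would presumably log or report each failure.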

Design note: the issue assumed the refinement loop already passed images to image-gen — it doesn't (the loop conditions on images only via the Critic's VLM call), hence the small additive provider extension. 2K/4K upscaling is out of scope (provider-dependent, tracked separately if wanted).
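
As an illustration of capability detection by signature, the sketch below checks whether a provider's `generate` method declares an `images` parameter and rejects it otherwise. The class names, helper names, and error text are hypothetical; only the `images` kwarg and the "reject with an actionable error" behavior come from the description above.

```python
import inspect


class TextOnlyGen:
    """Hypothetical provider without guided-edit support."""
    def generate(self, prompt):
        return b""


class EditCapableGen:
    """Hypothetical provider whose generate() accepts reference images."""
    def generate(self, prompt, images=None):
        return b""


def supports_guided_edit(provider) -> bool:
    """True if provider.generate explicitly declares an `images` parameter."""
    return "images" in inspect.signature(provider.generate).parameters


def require_guided_edit(provider) -> None:
    """Fail loudly instead of silently falling back to text-only generation."""
    if not supports_guided_edit(provider):
        raise ValueError(
            f"{type(provider).__name__} does not support guided edits; "
            "choose an image provider whose generate() accepts `images`."
        )
```

This version deliberately ignores providers that only accept `**kwargs`, since a catch-all signature says nothing about whether the images would actually be used.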

26 new tests; suite at 853 passing.

…stions

Adds `paperbanana polish --input figure.png`: a two-step flow where a VLM audits the user-supplied figure against the venue style guide (--venue, neurips default) and produces up to 10 concrete, actionable suggestions, which are then applied to the original figure as a guided image edit (the figure and the numbered suggestions both go to the image provider).

- PolishAgent (paperbanana/agents/polish.py): suggest() VLM step with robust list parsing (numbered/bulleted/fenced, NO_SUGGESTIONS sentinel, capped at 10) and apply() guided-edit step; prompts in prompts/polish/.
- Guided edits: GoogleImagenGen.generate gains an optional images kwarg (image-conditioned generation); callers detect support by signature. Providers without it are rejected with a clear error.
- CLI: --input (validated as a readable image), --venue, --output, --iterations (repeat suggest→apply on the result), --aspect-ratio, provider/model/budget/seed flags consistent with generate; suggestions printed to the console; cost tracked and reported with budget guard.
- Multi-candidate: --num-candidates fans the apply step out in parallel with per-candidate output dirs; first successful candidate is primary.

Out of scope: 2K/4K upscaling (depends on provider support; separate concern).

Fixes #238

dippatel1994 and others added 3 commits June 11, 2026 16:05

Merge branch 'main' into feat/polish-mode

6feaab1

Merge branch 'main' into feat/polish-mode

0a2d542

dippatel1994 merged commit 41b8bfe into main Jun 11, 2026
11 checks passed

dippatel1994 merged 3 commits into main from feat/polish-mode
