feat: polish mode — refine an existing figure with style-guided suggestions#247
Merged
…stions

Adds `paperbanana polish --input figure.png`: a two-step flow where a VLM audits the user-supplied figure against the venue style guide (--venue, neurips default) and produces up to 10 concrete, actionable suggestions, which are then applied to the original figure as a guided image edit (the figure and the numbered suggestions both go to the image provider).

- PolishAgent (paperbanana/agents/polish.py): suggest() VLM step with robust list parsing (numbered/bulleted/fenced, NO_SUGGESTIONS sentinel, capped at 10) and apply() guided-edit step; prompts in prompts/polish/.
- Guided edits: GoogleImagenGen.generate gains an optional images kwarg (image-conditioned generation); callers detect support by signature. Providers without it are rejected with a clear error.
- CLI: --input (validated as a readable image), --venue, --output, --iterations (repeat suggest→apply on the result), --aspect-ratio, provider/model/budget/seed flags consistent with generate; suggestions printed to the console; cost tracked and reported with budget guard.
- Multi-candidate: --num-candidates fans the apply step out in parallel with per-candidate output dirs; first successful candidate is primary.

Out of scope: 2K/4K upscaling (depends on provider support; separate concern).

Fixes #238
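For readers who have not opened the diff, the list parsing described in the commit message could look roughly like the sketch below. This is not the code in `paperbanana/agents/polish.py`: the function name `parse_suggestions`, the regex, and the continuation-line handling are assumptions made for illustration. Only the behaviour is taken from the description (numbered, bulleted, or fenced input; the `NO_SUGGESTIONS` sentinel; a cap of 10).

```python
import re

MAX_SUGGESTIONS = 10          # cap stated in the commit message
SENTINEL = "NO_SUGGESTIONS"   # sentinel stated in the commit message
FENCE = "`" * 3               # code-fence delimiter, built indirectly

# Leading list marker: "1.", "2)", "-", "*", or a bullet character.
_MARKER = re.compile(r"^\s*(?:\d+[.)]|[-*•])\s+")


def parse_suggestions(raw, limit=MAX_SUGGESTIONS):
    """Extract up to `limit` suggestions from a free-form VLM reply.

    Hypothetical helper; the real parser may differ.
    """
    text = raw.strip()
    if not text or SENTINEL in text:
        return []
    suggestions = []
    for line in text.splitlines():
        stripped = line.strip()
        # Skip blank lines and fence delimiters (with or without a language tag).
        if not stripped or stripped.startswith(FENCE):
            continue
        match = _MARKER.match(line)
        if match:
            suggestions.append(line[match.end():].strip())
        elif suggestions:
            # An unmarked line is treated as a continuation of the previous item.
            suggestions[-1] += " " + stripped
    return suggestions[:limit]
```

Lines that appear before the first list marker (for example a preamble such as "Here are my suggestions:") are dropped in this sketch, which is one plausible reading of "robust".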
Fixes #238 (rollback half shipped in #243; this is polish mode).
`paperbanana polish --input figure.png`: bring your own figure.

- A VLM audits the figure against the venue style guide (`--venue`, picks up `guidelines synthesize` output) and proposes ≤10 concrete improvements (robust parsing: numbered/bulleted/fenced; `NO_SUGGESTIONS` sentinel exits unchanged).
- `GoogleImagenGen` gained an optional `images` kwarg so Gemini edits the actual figure rather than regenerating from text. Providers without guided-edit support are rejected with an actionable error (capability detected by signature; contract documented in the base class). No silent fallbacks.
- `--iterations N` repeats suggest→apply; `--num-candidates` fans out the apply step in parallel; budget guard + cost summary wired like `generate`.

Design note: the issue assumed the refinement loop already passed images to image-gen. It doesn't (the loop conditions on images only via the Critic's VLM call), hence the small additive provider extension. 2K/4K upscaling is out of scope (provider-dependent, tracked separately if wanted).
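As a rough illustration of "capability detected by signature", here is a minimal sketch using the standard library's `inspect.signature`. The provider classes `TextOnlyGen` and `EditCapableGen` and the helpers `supports_guided_edit` and `require_guided_edit` are hypothetical stand-ins, not the project's actual classes or error text. The only thing taken from the PR is the idea that a provider supports guided edits when its `generate` accepts an `images` keyword.

```python
import inspect
from typing import Optional, Sequence


class TextOnlyGen:
    """Hypothetical provider without guided-edit support."""

    def generate(self, prompt: str) -> bytes:
        return b"generated"


class EditCapableGen:
    """Hypothetical provider whose generate() accepts conditioning images."""

    def generate(self, prompt: str, images: Optional[Sequence[bytes]] = None) -> bytes:
        return b"edited" if images else b"generated"


def supports_guided_edit(provider) -> bool:
    """True if provider.generate declares an explicit `images` parameter."""
    params = inspect.signature(provider.generate).parameters
    return "images" in params


def require_guided_edit(provider) -> None:
    """Fail loudly instead of silently falling back to text-only generation."""
    if not supports_guided_edit(provider):
        name = type(provider).__name__
        raise ValueError(
            f"{name} does not support guided edits: its generate() has no "
            "'images' parameter. Choose a provider that accepts images."
        )
```

This sketch only recognises an explicitly named `images` parameter. A provider that swallows arbitrary `**kwargs` would be rejected, which errs on the side of the "no silent fallbacks" rule.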
26 new tests; suite at 853 passing.
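For reference, the `--num-candidates` fan-out could be sketched as below. This is an assumption-laden illustration, not the merged implementation: `fan_out` and `apply_fn` are hypothetical names, a thread pool is one possible way to run the apply step in parallel, and "first successful candidate is primary" is read here as "lowest candidate index that succeeded".

```python
from concurrent.futures import ThreadPoolExecutor


def fan_out(apply_fn, num_candidates):
    """Run apply_fn(index) for every candidate in parallel.

    Returns (primary_index, results). results[i] holds the output of
    candidate i, or None if that candidate raised. primary_index is the
    lowest index with a result, or None if every candidate failed.
    """
    if num_candidates < 1:
        raise ValueError("num_candidates must be at least 1")

    def safe_apply(index):
        # One failing candidate must not abort the others.
        try:
            return apply_fn(index)
        except Exception:
            return None

    with ThreadPoolExecutor(max_workers=num_candidates) as pool:
        # pool.map preserves input order, so results[i] belongs to candidate i.
        results = list(pool.map(safe_apply, range(num_candidates)))

    primary = next((i for i, r in enumerate(results) if r is not None), None)
    return primary, results
```

A caller would pass a closure that performs the guided edit and writes into a per-candidate output directory, then report the primary index alongside the cost summary.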