
feat: user-provided reference/sketch images guide diagram generation #248

Merged
dippatel1994 merged 2 commits into main from feat/image-input on Jun 11, 2026

Conversation

@dippatel1994

Member

Fixes #223. A repeatable --image PATH option lets a hand-drawn sketch, whiteboard photo, or prior figure guide generation:

  • GenerationInput.input_images: list[str]; validated pre-pipeline (exists + PIL-verifiable), rejected on continue runs
  • Planner receives the sketches as image parts appended after the retrieved exemplars, which preserves the existing '[See reference image N]' indexing; the prompt gains an explicit 'user-provided reference/sketch' section, which the prompt recorder captures
  • Visualizer gets a one-line sketch-guided note via a flag rather than description text, so the Critic never sees the sketch and keeps judging against the source text only
  • Survives the --optimize GenerationInput rebuild; recorded in run_input.json; MCP generate_diagram gains the optional input_images param with validation
  • Follow-up (per design sketch): Context Enricher integration under --optimize is deliberately out of scope

Suite at 841 passing with the new coverage (planner attachment, critic blindness, CLI/MCP validation).
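
To illustrate the ordering and the flag described in the bullets above, here is a minimal sketch. It is not the project's code: `PlannerRequest`, `build_planner_request`, `build_visualizer_prompt`, the `[User sketch K]` label, and the note wording are all hypothetical names chosen for the example. It only shows why appending sketches after the exemplars keeps the exemplar numbering stable, and why a flag keeps the sketch note out of the description the Critic reviews.

```python
from dataclasses import dataclass, field


@dataclass
class PlannerRequest:
    """Hypothetical container for what the Planner sends to the model."""

    prompt: str
    image_paths: list[str] = field(default_factory=list)


def build_planner_request(source_text, exemplar_images, input_images):
    # Exemplars are listed first, so "[See reference image N]" still
    # refers to the N-th retrieved exemplar after sketches are added.
    lines = [source_text]
    for n, _ in enumerate(exemplar_images, start=1):
        lines.append(f"[See reference image {n}]")
    if input_images:
        # Assumed wording for the separate prompt section.
        lines.append("User-provided reference/sketch images:")
        for k, _ in enumerate(input_images, start=1):
            lines.append(f"[User sketch {k}]")
    # Image parts follow the same order: exemplars, then sketches.
    return PlannerRequest("\n".join(lines), [*exemplar_images, *input_images])


def build_visualizer_prompt(description, sketch_guided=False):
    # The note is derived from a flag at prompt-build time; `description`
    # itself is never modified, so a reviewer of it sees no sketch mention.
    note = "\nNote: a user-provided sketch guided this plan." if sketch_guided else ""
    return f"Render this diagram:\n{description}{note}"
```

Because the note lives only in the Visualizer prompt, anything that later consumes the bare description (such as a critic step) is unaffected by whether a sketch was supplied.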

dippatel1994 and others added 2 commits June 11, 2026 16:06
Allow passing existing images (hand-drawn sketch, whiteboard photo,
prior figure version) as guidance alongside the methodology text:

- GenerationInput gains input_images: list[str] (paths)
- Repeatable --image PATH on `paperbanana generate`; files are
  validated (existence + PIL-openable raster) before the pipeline
  starts, and rejected when combined with --continue/--continue-run
- Planner attaches the images as additional image parts after the
  retrieved exemplar images, labeled as user-provided reference/sketch
  in the prompt (exemplar "reference image N" indexing is preserved)
- Visualizer diagram prompt gets a one-line note that a user sketch
  guided the plan (carried via a sketch_guided flag so the note never
  leaks into the description the Critic reviews)
- The Critic never sees the sketch: it keeps judging against the
  source text only
- input_images survives the --optimize input rebuild and is recorded
  in run_input.json for reproducibility
- MCP: generate_diagram gains optional input_images (validated paths)

Fixes #223
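
The validation rules in this commit message (repeatable flag, existence check, PIL check, rejection on continue runs) could look roughly like the sketch below. This is an illustration under assumptions, not the merged implementation: it uses `argparse` rather than whatever CLI framework the project actually uses, models only a boolean `--continue` (not `--continue-run`), and `validate_input_images` and `build_parser` are hypothetical names. The Pillow calls `Image.open` and `Image.verify` are real; Pillow must be installed for the raster check to run.

```python
import argparse
from pathlib import Path


def validate_input_images(paths, continuing=False):
    """Return the paths as a list if all are usable; raise ValueError otherwise."""
    if paths and continuing:
        raise ValueError("--image cannot be combined with a continue run")
    for p in paths:
        if not Path(p).is_file():
            raise ValueError(f"image not found: {p}")
        # Imported lazily so the checks above work even without Pillow.
        from PIL import Image

        try:
            with Image.open(p) as im:
                im.verify()  # integrity check without decoding the full image
        except Exception as exc:
            raise ValueError(f"not a readable raster image: {p}") from exc
    return list(paths)


def build_parser():
    # Stand-in for the real `paperbanana generate` command.
    parser = argparse.ArgumentParser(prog="paperbanana generate")
    # action="append" makes the flag repeatable: each --image adds one path.
    parser.add_argument(
        "--image", action="append", default=[], metavar="PATH", dest="input_images"
    )
    parser.add_argument("--continue", action="store_true", dest="continuing")
    return parser
```

Running the validation before the pipeline starts means a bad path fails fast, before any model calls are made.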
@dippatel1994 dippatel1994 merged commit 5b36f66 into main Jun 11, 2026
11 checks passed

Development

Successfully merging this pull request may close these issues.

[Feature]: Reference/sketch image input to guide diagram generation