Skip to content

Improve regex error messages to surface pattern in output#85

Closed
campoy wants to merge 1 commit intomasterfrom
fix/regex-error-messages
Closed

Improve regex error messages to surface pattern in output#85
campoy wants to merge 1 commit intomasterfrom
fix/regex-error-messages

Conversation

@campoy
Copy link
Copy Markdown
Owner

@campoy campoy commented Apr 11, 2026

Summary

  • Wrap regexp.CompilePOSIX errors to include the full pattern string (e.g. invalid regexp /(/: error parsing regexp: ...), making it straightforward to identify which embed directive failed and to audit patterns in PR reviews.
  • Add a security note to the package-level godoc clarifying that embed directives are treated as trusted input and should not be sourced from untrusted contributors without review.
  • Update the two existing test cases whose expected error strings changed with the new format.
  • Add three new TestExtract cases: invalid regexp shows pattern in error, missing slashes reports value, and a valid-pattern regression test.

Test plan

  • go test ./... passes with all existing and new test cases.
  • "bad start regexp" and "bad end regexp" cases updated to match new error format.
  • New cases verify the pattern appears in compile errors, missing-slashes message, and that valid patterns still match correctly.

Closes #83

Wrap regexp compile errors to include the full pattern string, making
it easy to identify which directive failed and audit patterns in PR
reviews. Update existing tests and add new cases covering the improved
error format, missing-slashes validation, and a valid-pattern
regression. Add a security note to the package godoc.

Closes #83
@campoy
Copy link
Copy Markdown
Owner Author

campoy commented Apr 11, 2026

Closing this — the original security analysis was wrong.

The issue assumed an attacker who can write to the markdown file but cannot read the source files being embedded. That combination is implausible: anyone with write access to the repo's docs almost certainly has read access to the source too. If they can edit the markdown they can just copy the file contents directly — no regex tricks needed.

The "error message leaks path information" finding is even weaker: the path in the error comes from the directive the attacker wrote themselves, so it tells them nothing they don't already know.

The only scenario where controlling embed directives would be a genuine privilege escalation is if embedmd were exposed as a service processing untrusted markdown from external users. That's not what this tool is for, and the path traversal fix in #84 is the right defence for that threat model anyway.

The improved error messages (including the pattern in the output) are still useful for debugging, so that commit can stay — but it's a usability improvement, not a security fix.

@campoy campoy closed this Apr 11, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

User-supplied regex compiled without validation enables regex injection

1 participant