[regression test] /[a-]/v should be a SyntaxError (#29003) #29006

Open
robobun wants to merge 6 commits into main from farm/a392b8ac/yarr-v-mode-trailing-hyphen

Conversation

@robobun
Collaborator

@robobun robobun commented Apr 8, 2026

In UnicodeSets mode (/v), - is a ClassSetSyntaxCharacter per ECMA-262 and is only legal between two ClassSetCharacters as part of a ClassSetRange. Bun currently parses /[a-]/v successfully (and also /[\d-]/v, /[\w-]/v, /[a-z\d-]/v). Node, Deno, and the spec all reject these.

$ bun -e 'new RegExp("[a-]", "v")'
# parses OK, should throw SyntaxError

$ bun -e 'new RegExp("[-a]", "v")'
SyntaxError: Invalid regular expression: invalid class set character  # correct

Root cause

In vendor/WebKit/Source/JavaScriptCore/yarr/YarrParser.h, ClassSetParserDelegate::flushCachedCharacterIfNeeded() only flushes the CachedCharacter state. end() silently accepts CachedCharacterHyphen by emitting both the cached character and a literal -, and falls through AfterCharacterClassHyphen without any error. Both states represent an incomplete ClassSetRange (a - with no RHS yet) and must be errors in /v.

Fix

WebKit PR: oven-sh/WebKit#180

The WebKit fix raises ErrorCode::InvalidClassSetCharacter in both flushCachedCharacterIfNeeded() and end() when either incomplete-range state is hit. The valid-range path (a-z) is untouched because it goes straight from CachedCharacterHyphen into the range-completion branch of atomPatternCharacter() without routing through either helper.
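The two incomplete-range states can be illustrated with a small TypeScript model. This is a sketch only, not the Yarr source: the state names come from the description above, while Token, tokenize(), isIncompleteRange(), and acceptsClassBody() are hypothetical names, and the set operators -- and && are deliberately not modeled.

```typescript
// Simplified, hypothetical model of the class-set states named in the PR.
// It is NOT the Yarr parser; it only mirrors the state names.
type State =
  | "Empty"
  | "CachedCharacter"
  | "CachedCharacterHyphen"
  | "AfterCharacterClass"
  | "AfterCharacterClassHyphen";

// "char": literal or escaped character (a, \-); "class": \d, \w, \s; "hyphen": bare -
type Token = "char" | "class" | "hyphen";

// Tokenize the text between [ and ] (assumption: only these three token kinds exist).
function tokenize(body: string): Token[] {
  const tokens: Token[] = [];
  for (let i = 0; i < body.length; i++) {
    if (body[i] === "\\") {
      const escaped = body[++i];
      tokens.push(escaped === "d" || escaped === "w" || escaped === "s" ? "class" : "char");
    } else {
      tokens.push(body[i] === "-" ? "hyphen" : "char");
    }
  }
  return tokens;
}

// Both states mean "a - was seen but the range has no right-hand side yet".
function isIncompleteRange(state: State): boolean {
  return state === "CachedCharacterHyphen" || state === "AfterCharacterClassHyphen";
}

function acceptsClassBody(body: string): boolean {
  let state: State = "Empty";
  for (const token of tokenize(body)) {
    if (token === "char") {
      // A class escape cannot start a range, so \d-a is rejected.
      if (state === "AfterCharacterClassHyphen") return false;
      // Valid-range path: a-z completes here without touching the error checks.
      state = state === "CachedCharacterHyphen" ? "Empty" : "CachedCharacter";
    } else if (token === "class") {
      // Analogue of the flush step: an incomplete range before \d is an error.
      if (isIncompleteRange(state)) return false;
      state = "AfterCharacterClass";
    } else {
      if (state === "CachedCharacter") state = "CachedCharacterHyphen";
      else if (state === "AfterCharacterClass") state = "AfterCharacterClassHyphen";
      else return false; // leading or repeated - (set subtraction is not modeled)
    }
  }
  // Analogue of end(): a dangling - at the closing bracket is an error.
  return !isIncompleteRange(state);
}
```

In this model the bodies a-, \d-, and a-z\d- are rejected at the end-of-class check, while a-z, a\-, and \-a are accepted, which matches the behaviour the description asks for.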

Once the WebKit PR merges and a new autobuild tarball exists, a follow-up WEBKIT_VERSION bump in cmake/tools/SetupWebKit.cmake will make this test pass.

Patterns covered by the test

Newly rejected (will throw after WebKit fix):

  • /[a-]/v (the reported case)
  • /[\d-]/v
  • /[\w-]/v
  • /[a-z\d-]/v

Already rejected (regression guard):

  • /[-a]/v
  • /[-]/v

Must still parse (regression guard):

  • /[a-z]/v, /[a\-]/v, /[\-a]/v, /[a--b]/v, /[a&&b]/v, /[\w--\d]/v
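The three groups above can be probed on any engine with a few lines of TypeScript. This is a minimal sketch, not the contents of test/regression/issue/29003.test.ts: compiles() and unexpected() are hypothetical helper names, and the printed result depends on the runtime (an engine without /v support rejects every pattern).

```typescript
// Hypothetical helper: true when the engine accepts the pattern, false on SyntaxError.
function compiles(source: string, flags: string): boolean {
  try {
    new RegExp(source, flags);
    return true;
  } catch (error) {
    if (error instanceof SyntaxError) return false;
    throw error; // anything else is unexpected, so surface it
  }
}

// Hypothetical helper: the sources whose result differs from the expectation.
function unexpected(sources: string[], shouldCompile: boolean, flags = "v"): string[] {
  return sources.filter((source) => compiles(source, flags) !== shouldCompile);
}

// Pattern groups from the PR description, written as RegExp source strings.
const newlyRejected = ["[a-]", "[\\d-]", "[\\w-]", "[a-z\\d-]"];
const alreadyRejected = ["[-a]", "[-]"];
const mustStillParse = ["[a-z]", "[a\\-]", "[\\-a]", "[a--b]", "[a&&b]", "[\\w--\\d]"];

// Each list is empty when the engine matches the expected behaviour.
console.log("accepted but should throw:", unexpected([...newlyRejected, ...alreadyRejected], false));
console.log("rejected but should parse:", unexpected(mustStillParse, true));
```

On an engine with the dangling-hyphen bug, the first list would be expected to contain the four newly rejected patterns.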

Fixes #29003.

See oven-sh/WebKit#180 for the JSC-side parser fix.

In UnicodeSets mode, - is a ClassSetSyntaxCharacter that is only
legal as part of a full ClassSetRange. A dangling - (e.g. /[a-]/v,
/[\d-]/v, /[\w-]/v, /[a-z\d-]/v) must throw a SyntaxError, matching
V8/SpiderMonkey and the spec.

The fix lives in vendor/WebKit (not tracked in this repo) in
yarr/YarrParser.h. The companion WebKit PR raises
ErrorCode::InvalidClassSetCharacter in flushCachedCharacterIfNeeded()
and end() when the parser is still in a CachedCharacterHyphen or
AfterCharacterClassHyphen state at a class-set transition point.
@robobun
Collaborator Author

robobun commented Apr 8, 2026

@coderabbitai
Contributor

coderabbitai bot commented Apr 8, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 73877af0-f543-4f71-a070-e6b0d651b3d1

📥 Commits

Reviewing files that changed from the base of the PR and between ca369ab and c49d5c7.

📒 Files selected for processing (1)
  • test/regression/issue/29003.test.ts

Walkthrough

Added a regression test that exercises /v (UnicodeSets) character-class parsing around dangling hyphens: it asserts specific forms throw SyntaxError, checks several valid /v patterns still compile and match as expected, and includes pending test.todo cases for additional invalid forms.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Regression Test for UnicodeSets hyphen cases: test/regression/issue/29003.test.ts | New test file. Adds assertions that specific dangling-hyphen /v forms (e.g., [-a], [-]) throw SyntaxError. Adds a non-regression test verifying multiple valid /v patterns compile and .test(...) behaves as expected (including escaped hyphens and intersections). Introduces two test.todo entries for additional invalid patterns (e.g., eval("/[a-]/v") and other dangling-hyphen variants) pending upstream parser fixes. |
🚥 Pre-merge checks | ✅ 4
✅ Passed checks (4 passed)
| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title clearly and concisely describes the main change: adding a regression test for rejecting dangling hyphens in UnicodeSets mode (/v flag). |
| Description check | ✅ Passed | The PR description thoroughly explains what the change does, the root cause, the fix location, and all patterns being tested, matching the template's requirements. |
| Linked Issues check | ✅ Passed | The PR adds test coverage for issue #29003 by creating regression tests that validate the expected behavior of rejecting dangling hyphens in UnicodeSets character classes. |
| Out of Scope Changes check | ✅ Passed | The PR adds only a regression test file with no unrelated changes; all modifications are scoped to validating dangling-hyphen behavior in /v mode as specified in issue #29003. |

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@test/regression/issue/29003.test.ts`:
- Around line 5-8: In the regression test file issue/29003.test.ts remove the
extra multi-line explanatory comments and leave only the single-line GitHub
issue URL comment; specifically trim the multi-line bug-context prose around the
UnicodeSets note (the comment shown in the diff) and the other comment blocks
referenced (lines noted in the review) so the test body follows the repo's
single-line issue URL style; keep test code and assertion names intact (e.g.,
the test that asserts that trailing/dangling `-` in /v mode throws SyntaxError)
and ensure only one-line issue URL comments remain adjacent to those tests.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 33d1b7b2-229f-4523-9d7a-139428cd2114

📥 Commits

Reviewing files that changed from the base of the PR and between 1afabdd and ca369ab.

📒 Files selected for processing (1)
  • test/regression/issue/29003.test.ts

- Drop the multi-line prose comments in favor of the single-line
  issue URL (coderabbit nit).
- Split the already-passing regression guards ([-a], [-]) into
  their own test() block so they keep running even if the pending
  /v fix lands late.
- Wrap the not-yet-landed assertions (/[a-]/v, /[\d-]/v, /[\w-]/v,
  /[a-z\d-]/v) in test.todo(). These need oven-sh/WebKit#180 to
  merge and a WEBKIT_VERSION bump; the todos get promoted to test()
  in the same commit as the bump.
@coderabbitai
Contributor

coderabbitai bot commented Apr 8, 2026

No merge conflicts detected when merging into main.

Your branch is good to go!

Contributor

@claude claude bot left a comment

My prior feedback was addressed — the structure is now correct.

Extended reasoning...

Overview

This PR adds a single regression test file (test/regression/issue/29003.test.ts) for issue #29003. No production code is modified.

Security risks

None. This is a test-only change.

Level of scrutiny

Low. The change is a test file only, with no impact on runtime behaviour. The structure is straightforward and follows established patterns in the regression test suite.

Resolution of prior feedback

My previous review flagged that the not-yet-fixed assertions would break CI and that the regression guards were structurally ineffective. Commit c49d5c7 addressed both issues: already-passing guards are now in a live test() block, and the pending-WebKit assertions are wrapped in test.todo(). The author verified 2 pass, 2 todo, 0 fail with USE_SYSTEM_BUN=1 bun test.

Contributor

@claude claude bot left a comment

LGTM — feedback addressed: live guards split out, pending assertions properly wrapped in test.todo().

Extended reasoning...

Overview

Single new test file (test/regression/issue/29003.test.ts, 35 lines). Adds regression coverage for issue #29003 — dangling hyphens in UnicodeSets (/v) mode. Two live test() blocks cover already-correct behavior (rejected forms and valid patterns); two test.todo() blocks document the pending WebKit parser fix.

Security risks

None. This is a test-only change with no production code paths touched.

Level of scrutiny

Low. Test-only addition following established repo patterns. No logic changes, no CODEOWNER-owned files, no security-sensitive code.

Other factors

My prior review flagged that the original commit had live assertions for not-yet-fixed behavior that would break CI. The author addressed both issues raised: split the already-passing guards into their own test() block and wrapped the unimplemented assertions in test.todo(). The author verified USE_SYSTEM_BUN=1 bun test gives 2 pass, 2 todo, 0 fail — exactly correct.

@robobun
Collaborator Author

robobun commented Apr 8, 2026

CI failure on debian-13-x64-asan-test-bun (build #44466) is an unrelated pre-existing JSC flake, not caused by this PR.

This PR only adds test/regression/issue/29003.test.ts. It does not touch any of the affected paths.

From the failing job log:

--- [161/193] test/js/third_party/@azure/service-bus/azure-service-bus.test.ts
ERROR: Unchecked JS exception:
    This scope can throw a JS exception: pushInline @ .JavaScriptCore/PrivateHeaders/JavaScriptCore/JSArrayInlines.h:176
    But the exception was unchecked as of this scope: getIfPropertyExists @ cache/webkit-fc9f2fa7272fec64-asan/include/JavaScriptCore/JSObjectInlines.h:293
ASSERTION FAILED: exception check validation failed
assertionFailureDueToUnreachableCode
vendor/WebKit/Source/JavaScriptCore/runtime/VM.cpp(1523) : void JSC::VM::verifyExceptionCheckNeedIsSatisfied(...)
...
--- [161/193] test/js/third_party/@azure/service-bus/azure-service-bus.test.ts - SIGILL

All four retries SIGILL in the same JSC spot under ASan. This is a JSC getIfPropertyExists → pushInline unchecked-exception assertion, wholly unrelated to the /v regex parser. Pushed an empty rekick commit.

Contributor

@claude claude bot left a comment

LGTM — my previous concerns about CI-breaking live assertions were fully addressed in c49d5c7.

Extended reasoning...

Overview

Single new test file covering dangling-hyphen behaviour in UnicodeSets (/v) mode.

What changed since my last review

My previous review flagged that several assertions would fail CI because the underlying WebKit fix hasn't landed yet. The author addressed this in commit c49d5c7:

  • Already-rejected patterns (/[-a]/v, /[-]/v) are in a live test() block, keeping regression guards active.
  • Not-yet-fixed patterns (/[a-]/v, /[\d-]/v, /[\w-]/v, /[a-z\d-]/v) are wrapped in test.todo(), so they don't break CI.
  • Valid patterns are verified in a separate live test() block.

Security risks

None — this is a pure test addition with no production code changes.

Level of scrutiny

Low. Single test file, no logic changes, well-structured with appropriate test.todo() usage matching repo conventions.

Other factors

The CI failure noted in the timeline is a pre-existing JSC flake in an unrelated azure-service-bus test, not caused by this PR. All inline review comments are resolved.

@robobun
Collaborator Author

robobun commented Apr 8, 2026

CI build #44476 failed on all 11 test lanes (alpine, darwin, debian, ubuntu) with exit 2 — the common root cause is test/js/valkey/valkey.test.ts timing out × 4 attempts on each lane with RedisError: Connection closed and ci-remap server did not start: timeout. This is a docker-compose Valkey/Redis infrastructure flake, unrelated to this PR (which only adds test/regression/issue/29003.test.ts).

From the asan lane (019d6dce-6711):

--- [168/194] test/js/valkey/valkey.test.ts - timeout
--- [168/194] test/js/valkey/valkey.test.ts [attempt #2]
--- [168/194] test/js/valkey/valkey.test.ts - timeout
--- [168/194] test/js/valkey/valkey.test.ts [attempt #3]
--- [168/194] test/js/valkey/valkey.test.ts - timeout
--- [168/194] test/js/valkey/valkey.test.ts [attempt #4]
--- [168/194] test/js/valkey/valkey.test.ts - timeout

(Different flake than the azure-service-bus SIGILL on build #44466 but same pattern: unrelated infra failure.) Pushed empty rekick db1cae206c.

Contributor

@claude claude bot left a comment

LGTM — my prior feedback was addressed in c49d5c7.

Extended reasoning...

Overview

Single new test file for issue #29003 (dangling hyphen in UnicodeSets mode). No production code is touched.

Previous review

My prior review flagged that the original test structure would break CI because assertions for not-yet-landed patterns were in live test() blocks. The author split the already-passing guards into their own test() block, moved the not-yet-fixed forms into test.todo(), and confirmed USE_SYSTEM_BUN=1 bun test gives 2 pass / 2 todo / 0 fail.

Security risks

None — this is a pure test addition with no runtime or library code changes.

Level of scrutiny

Low. Test-only, single file, follows repo conventions (test.todo() for pending upstream fixes), and CI failures in the timeline are unrelated infrastructure flakes (JSC ASan SIGILL, Valkey timeout).

Other factors

No bugs found by the bug hunting system. All inline comments from other reviewers are resolved. The test structure is now correct and will not block CI.

@robobun
Collaborator Author

robobun commented Apr 8, 2026

CI build #44505 failed on debian-13-x64-asan-test-bun with test/js/web/broadcastchannel/broadcast-channel-worker-gc.test.ts hitting ASSERTION FAILED: m_normalWorld->hasOneRef() in src/bun.js/bindings/BunClientData.cpp:81 (JSVMClientData dtor). This is a known pre-existing JSVMClientData refcount bug — see commit 0100c35011 which allowlisted the sibling broadcast-channel.test.ts for the same assertion in test/no-validate-leaksan.txt. The -worker-gc variant just isn't in the allowlist yet.

Nothing in this PR touches BunClientData, workers, or BroadcastChannel. Pushed rekick b03f1e6f06.

Contributor

@claude claude bot left a comment

LGTM: my earlier feedback was fully addressed in c49d5c7. Live regression guards are in their own test() block and the pending WebKit assertions are correctly wrapped in test.todo().

Extended reasoning...

Overview

Single new test file (test/regression/issue/29003.test.ts) documenting issue #29003: dangling hyphens in /v (UnicodeSets) mode should throw SyntaxError per ECMA-262 but currently don't due to a JSC/WebKit parser bug.

Security risks

None. This is a pure test addition with no production code changes.

Level of scrutiny

Low. The change is a single test file with no logic changes. The test structure is mechanically straightforward and follows existing repo conventions.

Other factors

My previous review flagged that originally-failing assertions were not wrapped in test.todo(), which would have broken CI. The author addressed both issues (split the live guards into their own test() block, wrapped pending assertions in test.todo()), and verified the result yields 2 pass / 2 todo / 0 fail. CI failures on recent builds are pre-existing unrelated infra flakes (JSC ASan assertion, Valkey connection timeout, BroadcastChannel worker GC) that the author has documented and rekicked.

@robobun
Collaborator Author

robobun commented Apr 8, 2026

Build #44514 failed on debian-13-x64-asan-test-bun with the same deterministic azure-service-bus.test.ts SIGILL (JSC exception check validation failed in vendor/WebKit/.../VM.cpp:1523, seen on 4 consecutive attempts) that gated build #44466. This is not a flake — it is a real pre-existing JSC unchecked-exception bug in getIfPropertyExists → pushInline that reproduces every run on the ASan lane.

I'm going to stop rekicking. This PR is test-only (2 live regression guards + 2 test.todo pending oven-sh/WebKit#180) and will fail this ASan lane on every CI run until either:

  1. A maintainer approves the Preview Build CI for the WebKit PR (oven-sh/WebKit#180), that PR merges, an autobuild tarball is published, and this bun PR gets a WEBKIT_VERSION bump in cmake/tools/SetupWebKit.cmake, at which point the test.todos become live tests and this test file exercises the JSC fix; or
  2. Someone allowlists test/js/third_party/@azure/service-bus/azure-service-bus.test.ts in test/no-validate-leaksan.txt (sibling allowlist pattern in commit 0100c35) so the ASan lane can pass unrelated to /v mode.

Either action is outside a test-only PR's scope. The live regression guards already cover [-a]/v and [-]/v; the WebKit-side fix in oven-sh/WebKit#180 is what needs attention.

Development

Successfully merging this pull request may close these issues.

Inconsistent escape character requirement for - in vnicode regular expressions

1 participant