feat(jsonish): handle special unicode quote chars #3381
📝 Walkthrough
When fixes are allowed, parsing runs a strict ASCII-only pass and, if the input contains configured non-ASCII quote characters, a conditional Unicode-aware pass. Unicode candidates are prepended, deduplicated by structural …
Sequence Diagram
sequenceDiagram
participant Entry as Entry Parser
participant Detector as Unicode Detector
participant Strict as Fixing Parser\n(AsciiOnly)
participant Unicode as Fixing Parser\n(AllUnicode)
participant Merger as Result Merger
Entry->>Detector: contains_unicode_quote_char(input)?
Detector-->>Entry: bool
Entry->>Strict: parse(input, options, AsciiOnly)
Strict-->>Entry: Result(strict_candidates / error)
alt contains unicode quotes
Entry->>Unicode: parse(input, options, AllUnicode)
Unicode-->>Entry: Result(unicode_candidates / error)
end
Entry->>Merger: merge(unicode_candidates?, strict_candidates?)
Merger->>Merger: prepend unicode, dedupe by Value equality
Merger-->>Entry: merged candidates or chosen error
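The merge step in the diagram ("prepend unicode, dedupe by Value equality") can be sketched roughly as follows. This is a minimal illustration, not the code in `entry.rs`: the `Value` enum here is a two-variant stand-in, `merge_candidates` is a hypothetical name, and the real candidates also carry a `Vec<Fixes>` and error handling that are omitted.

```rust
// Hypothetical stand-in for the parser's value type; the real `Value` is richer.
#[derive(Debug, Clone, PartialEq)]
enum Value {
    String(String),
    Array(Vec<Value>),
}

/// Merge candidates from the two passes: Unicode-pass candidates come first,
/// followed by strict-pass candidates whose value has not been seen yet.
/// A pass that failed or did not run is represented as `None`.
fn merge_candidates(unicode: Option<Vec<Value>>, strict: Option<Vec<Value>>) -> Vec<Value> {
    let mut merged: Vec<Value> = Vec::new();
    let all = unicode
        .into_iter()
        .flatten()
        .chain(strict.into_iter().flatten());
    for candidate in all {
        // Dedupe by structural equality (derived `PartialEq`).
        if !merged.contains(&candidate) {
            merged.push(candidate);
        }
    }
    merged
}
```

The linear `contains` check keeps the sketch free of a `Hash` requirement on `Value`; with a handful of candidates per input that is likely cheap enough.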
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
engine/baml-lib/jsonish/src/jsonish/parser/entry.rs (1)
202-269: ⚠️ Potential issue | 🟠 Major
Don’t synthesize arrays from cross-pass alternatives.
After merging `strict` and `unicode`, `items.len() > 1` can mean “alternative repairs of the same input”, not “multiple JSON objects were found”. The existing branch then adds `Value::Array(items.clone(), ...)`, which can introduce a list candidate that was never present in the input and may be selected during list coercion. Preserve the “multiple JSON objects as a list” behavior per parser pass before merging, or track whether the merged items came from one pass before adding the array candidate.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@engine/baml-lib/jsonish/src/jsonish/parser/entry.rs` around lines 202 - 269, The bug is that after merging strict and unicode candidates (merged), the code always synthesizes a Value::Array candidate for items.len() > 1 even when the multiple items come from different passes; fix by preserving each item’s origin and only synthesizing the array candidate when all items originated from the same parser pass. Concretely: change the merged type to carry an origin tag (e.g. enum Origin { Strict, Unicode }) so merged is Result<Vec<(Value, Vec<Fixes>, Origin)>> (or keep parallel boolean flags), populate Origin when constructing strict_items/unicode_items, and then in the multi-item branch only create items_clone = Value::Array(...) and append it to items when items.iter().all(|(_,_,o)| o == same_origin) (or when the original single-pass vector was used). Keep the rest of the Value::FixedJson and Value::AnyOf construction the same.
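The origin-tagging idea in this comment could look something like the sketch below. It assumes a simplified `Value` and drops the `Vec<Fixes>` component; `Origin` follows the name suggested in the comment, while `synthesize_array` is a hypothetical helper, not a function in the crate.

```rust
/// Which parser pass produced a candidate (name taken from the review comment).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Origin {
    Strict,
    Unicode,
}

// Hypothetical stand-in for the parser's value type.
#[derive(Debug, Clone, PartialEq)]
enum Value {
    String(String),
    Array(Vec<Value>),
}

/// Build the "multiple JSON objects as a list" candidate only when there is
/// more than one item and every item came from the same pass. Items from
/// different passes are alternative repairs, so no array is synthesized.
fn synthesize_array(items: &[(Value, Origin)]) -> Option<Value> {
    let (_, first_origin) = items.first()?;
    if items.len() > 1 && items.iter().all(|(_, o)| o == first_origin) {
        Some(Value::Array(items.iter().map(|(v, _)| v.clone()).collect()))
    } else {
        None
    }
}
```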
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In
`@engine/baml-lib/jsonish/src/jsonish/parser/fixing_parser/json_parse_state.rs`:
- Around line 85-94: The current AllUnicode branch counts any
UNICODE_QUOTE_CHARS (which includes single-curly quotes like U+2019) as
unescaped quote flips, causing ASCII double-quoted strings to treat apostrophes
like It’s as quote content and break parity; change the check so that when
tracking an ASCII double-quoted string we test membership against a
double-quote-specific unicode set (e.g. DOUBLE_QUOTE_UNICODE_CHARS) instead of
the broad UNICODE_QUOTE_CHARS. Add DOUBLE_QUOTE_UNICODE_CHARS (containing only
codepoints that should flip double-quote parity) in fixing_parser.rs and replace
the condition in the QuoteParityMode::AllUnicode branch that increments
string_quote_tracking.unescaped_quote_count to use
DOUBLE_QUOTE_UNICODE_CHARS.contains(&token) when the current ASCII quote is '"'
(leave UNICODE_QUOTE_CHARS for any fast-path uses that need broader detection).
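A rough sketch of the split this comment asks for is shown below. The contents of `DOUBLE_QUOTE_UNICODE_CHARS` are an illustrative subset chosen for the example, not the crate's actual table, and `flips_parity` is a hypothetical free function standing in for the branch in `json_parse_state.rs`.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum QuoteParityMode {
    AsciiOnly,
    AllUnicode,
}

// Illustrative subset only (assumption): marks that play a double-quote role.
// Single-quote-role marks such as U+2018/U+2019 are deliberately left out so
// that an apostrophe in "It’s" does not flip parity.
const DOUBLE_QUOTE_UNICODE_CHARS: &[char] = &[
    '\u{201C}', // left double quotation mark
    '\u{201D}', // right double quotation mark
    '\u{201E}', // double low-9 quotation mark (German opener)
    '\u{00AB}', // left-pointing double angle quotation mark
    '\u{00BB}', // right-pointing double angle quotation mark
];

/// Whether `token` should count toward quote parity inside an ASCII
/// double-quoted string under the given mode.
fn flips_parity(token: char, mode: QuoteParityMode) -> bool {
    token == '"'
        || (mode == QuoteParityMode::AllUnicode && DOUBLE_QUOTE_UNICODE_CHARS.contains(&token))
}
```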
af82f85 to cc06235
🧹 Nitpick comments (2)
engine/baml-lib/jsonish/src/jsonish/parser/entry.rs (1)
185-192: Unicode-pass error is silently dropped.
`fixing_parser::parse(..., AllUnicode).ok()` discards any error from the unicode pass with no log breadcrumb, while the strict pass error is debug-logged downstream (Line 274). If the AllUnicode pass starts regressing (e.g., panics-turned-errors from new quote handling), you'll have no trace for inputs where strict still succeeds.
🔎 Suggested tweak
- fixing_parser::parse(str, &options, QuoteParityMode::AllUnicode).ok()
+ fixing_parser::parse(str, &options, QuoteParityMode::AllUnicode)
+     .map_err(|e| {
+         log::debug!("AllUnicode parity pass failed: {e:?}");
+         e
+     })
+     .ok()
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@engine/baml-lib/jsonish/src/jsonish/parser/entry.rs` around lines 185 - 192, The AllUnicode parity parse currently swallows errors via fixing_parser::parse(str, &options, QuoteParityMode::AllUnicode).ok(); change this so parse's Err is not silently discarded: call fixing_parser::parse and, on Err, emit a debug (or warn) log via log::debug!/log::warn! that includes the error and context (e.g., the input indicator and that it was the AllUnicode pass) before leaving unicode as None; on Ok keep the parsed result in the unicode variable as before. Reference contains_unicode_quote_char, fixing_parser::parse, QuoteParityMode::AllUnicode and the unicode variable to locate where to add the error logging.
engine/baml-lib/jsonish/src/jsonish/parser/fixing_parser.rs (1)
18-22: Narrow visibility to `pub(crate)`.
`QuoteParityMode`, `contains_unicode_quote_char`, and `parse()` are consumed only by `entry.rs` within the same crate. Exposing them as `pub` widens the crate's public API unnecessarily and makes future changes (e.g., adding `QuoteParityMode` variants) a breaking change.
🔧 Suggested changes
-pub enum QuoteParityMode {
+pub(crate) enum QuoteParityMode {
     AsciiOnly,
     AllUnicode,
 }
-pub fn contains_unicode_quote_char(s: &str) -> bool {
+pub(crate) fn contains_unicode_quote_char(s: &str) -> bool {
     s.chars().any(|c| UNICODE_QUOTE_CHARS.contains(&c))
 }
-pub fn parse(
+pub(crate) fn parse(
     str: &str,
     _options: &ParseOptions,
     quote_parity: QuoteParityMode,
 ) -> Result<Vec<(Value, Vec<Fixes>)>> {
Also applies to: 53-55
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@engine/baml-lib/jsonish/src/jsonish/parser/fixing_parser.rs` around lines 18 - 22, The public visibility of internal items should be narrowed to crate-only: change the enum QuoteParityMode to pub(crate) and likewise change the functions contains_unicode_quote_char and parse to pub(crate) so they are only exposed within the crate; update any references in the same module or entry.rs to use the now-crate-visible names and run tests to ensure no external usage breaks.
cc06235 to c6db767
c6db767 to 4dbe641
🧹 Nitpick comments (3)
engine/baml-lib/jsonish/src/jsonish/parser/fixing_parser.rs (3)
45-61: Consider including U+2032/U+2033 (prime/double prime) (optional).
The table covers the major language conventions well. One minor gap: `″` (U+2033 DOUBLE PRIME) occasionally appears in LLM output as a substitute for `"`. Not required for the reported bug; mentioning only as a potential follow-up if you see further reports.
Also note `‘`/`’` (U+2018/U+2019) are single quotation marks. Their inclusion is intentional, since models do sometimes use them as ASCII `'` stand-ins, but it is worth confirming this matches the `should_close_unescaped_string` semantics (where ASCII `'` is not tracked). If single-quote parity is deliberately lumped into the "double-quoted-string parity" counter, a brief note in the doc comment above would help future readers.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@engine/baml-lib/jsonish/src/jsonish/parser/fixing_parser.rs` around lines 45 - 61, UNICODE_QUOTE_CHARS currently omits prime/double-prime characters; add U+2033 (DOUBLE PRIME) and optionally U+2032 (PRIME) to the UNICODE_QUOTE_CHARS array to better catch LLM output that uses ″/′ as substitutes for ASCII quotes, and update the doc comment above UNICODE_QUOTE_CHARS to explicitly state why U+2018/U+2019 (single quotes) are included and how that interacts with the should_close_unescaped_string semantics so future readers understand whether single-quote parity is treated as part of double-quoted-string parity.
163-283: Missing unit test for `AllUnicode` mode in this file.
All updated tests pass `QuoteParityMode::AsciiOnly`, so the new enum variant has no direct unit coverage in `fixing_parser.rs` itself; it's only exercised via higher-level tests in `test_class.rs`/`test_lists.rs`. A small unit test here (e.g., parsing `"intent": { "reasoning": "Blindtext „eins zwei drei\", um …" }` with `AllUnicode`) would pin the contract of this module and guard against regressions without relying on the entry cascade. As per coding guidelines: "Prefer writing Rust unit tests over integration tests where possible".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@engine/baml-lib/jsonish/src/jsonish/parser/fixing_parser.rs` around lines 163 - 283, Add a unit test in the tests mod that covers QuoteParityMode::AllUnicode by calling parse (the same way other tests do) with a JSON snippet containing Unicode smart quotes and ellipses (e.g., a small object like {"intent": {"reasoning": "Blindtext „eins zwei drei\", um …"}}) and assert the parsed Value (use Value::Object / Value::String and CompletionState as appropriate); target the parse function and use QuoteParityMode::AllUnicode instead of QuoteParityMode::AsciiOnly to ensure the parser’s Unicode quote-handling branch is exercised (name the test e.g. test_partial_unicode_mode or similar and follow the existing pattern for assertions).
66-68: Micro-optimization available but not needed.
`UNICODE_QUOTE_CHARS.contains(&c)` is O(n) per char scan; with 15 entries and typical input sizes it's negligible, but if profiles ever show this hot, a `matches!(c, '\u{00AB}' | '\u{00BB}' | …)` generated alongside the const (or a small `phf`/sorted binary search) would inline better. Safe to defer.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@engine/baml-lib/jsonish/src/jsonish/parser/fixing_parser.rs` around lines 66 - 68, The current contains_unicode_quote_char function uses UNICODE_QUOTE_CHARS.contains(&c) inside s.chars().any(...), which does an O(n) scan per character; replace that check with a direct pattern match (e.g., use matches!(c, '\u{00AB}' | '\u{00BB}' | ... ) listing the same 15 unicode quote codepoints) so the per-char test in contains_unicode_quote_char is inlined and constant-time; update the match to mirror the entries in UNICODE_QUOTE_CHARS and keep the function signature contains_unicode_quote_char(s: &str) -> bool and its callers unchanged.
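For illustration, the `matches!`-based variant suggested here might look like the sketch below. Only five code points are listed, as an assumption for the example; the real pattern would have to mirror all 15 entries of `UNICODE_QUOTE_CHARS`, which this page does not show in full. `is_unicode_quote_char` is a hypothetical helper name.

```rust
/// Per-character test written as a pattern match instead of a slice scan.
/// The listed code points are an illustrative subset (assumption), not the
/// full table used by the crate.
fn is_unicode_quote_char(c: char) -> bool {
    matches!(
        c,
        '\u{00AB}' | '\u{00BB}' | '\u{201C}' | '\u{201D}' | '\u{201E}'
    )
}

/// Same signature as the function named in the review comment.
fn contains_unicode_quote_char(s: &str) -> bool {
    s.chars().any(is_unicode_quote_char)
}
```

One trade-off of this form is that the pattern and the const table must be kept in sync by hand unless both are generated from a single macro.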
4dbe641 to 0a232cb
0a232cb to bc28c41
bc28c41 to a195d87
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
engine/baml-lib/jsonish/src/jsonish/parser/fixing_parser/json_parse_state.rs (1)
462-490: ⚠️ Potential issue | 🟠 Major
Apply quote parity when the comma follows whitespace.
The immediate comma path checks `closing_char_count`, but the whitespace lookahead closes unconditionally on `,`. Inputs like `"Blindtext „eins" , um ..."` still close early under `AllUnicode`.
Proposed fix
- ',' if in_object_value => return true,
- ',' | ']' if in_array => return true,
+ ',' if in_object_value => return closing_char_count % 2 == 0,
+ ',' if in_array => return closing_char_count % 2 == 0,
+ ']' if in_array => return true,
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@engine/baml-lib/jsonish/src/jsonish/parser/fixing_parser/json_parse_state.rs` around lines 462 - 490, The whitespace lookahead branch incorrectly treats a comma as an unconditional close; update the match arm inside the while-let in the function/method that uses next and closing_char_count so that when encountering ',' it applies the same parity check as the direct ',' arm (i.e., only return true if closing_char_count % 2 == 0) and still respects in_object_value/in_array/in_object_key conditions; adjust the ',', ',' | ']' and ',' if in_array cases to use closing_char_count where appropriate so inputs like `"Blindtext „eins" , um ..."` do not prematurely close.
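To make the proposed change concrete, here is a standalone sketch of a whitespace lookahead that applies the parity check on `,`. It is an assumption-heavy simplification: `should_close_after_whitespace` is a hypothetical function, the real code is a method that walks an iterator named `next`, and the behavior for other characters and for end of input is guessed here as "do not close".

```rust
/// Look past whitespace after a candidate closing quote and decide whether
/// the string should close there. `closing_char_count` is the number of
/// unescaped quote characters counted so far (per the review comment).
fn should_close_after_whitespace(
    rest: &str,
    closing_char_count: usize,
    in_object_value: bool,
    in_array: bool,
) -> bool {
    for next in rest.chars() {
        match next {
            c if c.is_whitespace() => continue,
            // A comma only closes the string when quote parity is even.
            ',' if in_object_value || in_array => return closing_char_count % 2 == 0,
            // A closing bracket still closes unconditionally inside arrays.
            ']' if in_array => return true,
            // Assumption: anything else means the quote was not a terminator.
            _ => return false,
        }
    }
    // Assumption: reaching end of input without a delimiter does not close.
    false
}
```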
} else if quote_parity == QuoteParityMode::AllUnicode
    && UNICODE_QUOTE_CHARS.contains(&token)
{
    // Under AllUnicode, double-quote-role unicode marks (e.g.
    // `„`, `"`, `»`, `「`) also flip parity so a stray opener
    // inside an ASCII-quoted string prevents early close on
    // the next `,`. Single-quote-role marks are intentionally
    // excluded — see `UNICODE_QUOTE_CHARS` for why.
    self.string_quote_tracking.unescaped_quote_count += 1;
Honor escaping before Unicode quote marks.
The AllUnicode branch increments unescaped_quote_count without the even-backslash guard used for ASCII ". In malformed-but-repairable strings, \„ will flip parity even though it is escaped.
Proposed fix
} else if quote_parity == QuoteParityMode::AllUnicode
&& UNICODE_QUOTE_CHARS.contains(&token)
+ && self
+ .string_quote_tracking
+ .trailing_backslashes
+ .is_multiple_of(2)
{
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@engine/baml-lib/jsonish/src/jsonish/parser/fixing_parser/json_parse_state.rs`
around lines 84 - 92, The AllUnicode branch (QuoteParityMode::AllUnicode)
incorrectly increments self.string_quote_tracking.unescaped_quote_count for
unicode quote characters from UNICODE_QUOTE_CHARS without verifying they are
unescaped; update the branch in the parser where UNICODE_QUOTE_CHARS is checked
to apply the same even-backslash guard used for the ASCII double-quote handling
(i.e., count preceding backslashes and only treat the unicode quote as unescaped
when the count is even) so escaped unicode quote marks like `\„` do not flip
parity.
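The even-backslash guard described in this comment can be sketched in isolation as below. `count_unescaped_quotes` is a hypothetical helper written for illustration; the actual parser tracks `trailing_backslashes` incrementally in `string_quote_tracking` rather than rescanning a string.

```rust
/// Count quote characters in `s` that are not escaped, i.e. that are
/// preceded by an even number (including zero) of consecutive backslashes.
fn count_unescaped_quotes(s: &str, quote_chars: &[char]) -> usize {
    let mut trailing_backslashes = 0usize;
    let mut count = 0usize;
    for c in s.chars() {
        if c == '\\' {
            trailing_backslashes += 1;
            continue;
        }
        // Same guard for ASCII and Unicode quote characters alike.
        if quote_chars.contains(&c) && trailing_backslashes % 2 == 0 {
            count += 1;
        }
        // Any non-backslash character ends the current backslash run.
        trailing_backslashes = 0;
    }
    count
}
```

With this guard, `\„` leaves the count untouched while `\\„` (an escaped backslash followed by a quote) still counts.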
User report in #3307: opus-4.6 returned a JSON object with interestingly-quoted German, which in turn caused the outer parses to fail:
The rule that jsonish uses is: if a string contains an even number of `"` chars, it ingests the entire string; if it contains an odd number of `"` chars, it is ambiguous where to terminate the string, so jsonish chooses the first one. This does not work in this case, where a German string start quote is paired with an ASCII string start/end quote.
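One possible reading of that rule, reduced to a toy function, is sketched below. It is an assumption-laden model, not jsonish's implementation: `closing_quote_index` is a hypothetical name, `raw` is assumed to start at the opening `"` and run to the end of the candidate text, and the real parser also looks at what follows each quote (commas, brackets, and so on).

```rust
/// Returns the byte index of the `"` chosen as the closing quote.
/// Even number of ASCII quotes: take the whole text up to the last quote.
/// Odd number: the end is ambiguous, so stop at the first quote after the opener.
fn closing_quote_index(raw: &str) -> Option<usize> {
    let quotes: Vec<usize> = raw.match_indices('"').map(|(i, _)| i).collect();
    if quotes.len() < 2 {
        return None; // no closing quote at all
    }
    if quotes.len() % 2 == 0 {
        quotes.last().copied()
    } else {
        Some(quotes[1])
    }
}
```

Under this model, text such as `"a „b" c"` contains three ASCII quotes, so the string is cut at the quote after `b`, because `„` is not counted.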
Solution: change string value parsing to use either the existing strategy or a new string parsing strategy, where any ASCII or Unicode quote character (any of `"`, `«`, `„`, etc.) is allowed to contribute to the parity count.
This will allow us to handle the user's case, but also allow jsonish to parse something like `"items": ["„eins", "zwei"]` (which, to a human, has a very obvious parse, and therefore should have an obvious parse in jsonish).
Alternatives considered
`"items": ["„eins", "zwei"]` case
problem: when explaining a jsonish/SAP parse result to a user, this would be impossible to explain