refactor(escape): use regex replace, remove Node < 4 polyfill by fb55 · Pull Request #2197 · fb55/entities

fb55 · 2026-03-29T11:28:28Z

Summary

- Replaces the getEscaper factory pattern (manual regex.exec loop) with direct String.prototype.replace calls using a shared escapeReplacer switch function for escapeUTF8, escapeAttribute, and escapeText.
- Removes the getCodePoint Node < 4 polyfill and uses native String.prototype.codePointAt directly.
- Updates encodeXML and encodeHTMLTrieRe (in encode.ts) to call codePointAt instead of the removed getCodePoint wrapper.
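
The shared-replacer approach described above can be sketched as follows. This is a minimal illustration, not the PR's code: the names escapeReplacer, xmlEscapeRegex, attributeEscapeRegex, and textEscapeRegex appear in the PR, but the switch body is an assumption pieced together from the entity tables in the removed lines, and the `'` to `&apos;` mapping in particular is assumed.

```typescript
// Sketch only: the switch body is an assumption based on the entity tables
// visible in the diff; the "'" -> "&apos;" case is not shown in the PR.
function escapeReplacer(match: string): string {
    switch (match) {
        case '"':
            return "&quot;";
        case "&":
            return "&amp;";
        case "'":
            return "&apos;";
        case "<":
            return "&lt;";
        case ">":
            return "&gt;";
        case "\u00A0":
            return "&nbsp;";
        default:
            // Not expected to be reached with the regexes below.
            return match;
    }
}

const xmlEscapeRegex = /["&'<>]/g;
const attributeEscapeRegex = /["&\u00A0]/g;
const textEscapeRegex = /[&<>\u00A0]/g;

// Each escaper differs only in which characters its regex selects.
function escapeUTF8(data: string): string {
    return data.replace(xmlEscapeRegex, escapeReplacer);
}

function escapeAttribute(data: string): string {
    return data.replace(attributeEscapeRegex, escapeReplacer);
}

function escapeText(data: string): string {
    return data.replace(textEscapeRegex, escapeReplacer);
}

console.log(escapeText("a < b & c")); // a &lt; b &amp; c
console.log(escapeAttribute('say "hi"')); // say &quot;hi&quot;
```

Because the regex decides which characters reach the callback, one replacer can serve all three functions even though it knows more entities than any single regex matches.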

Breaking changes

- getCodePoint is no longer exported from escape.ts. Any downstream code importing this helper will break. This was an internal/undocumented export.
- The escapeUTF8, escapeAttribute, and escapeText exports change from const arrow functions to named function declarations. This should not affect normal usage, but code relying on .name or identity checks may notice.
- Drops the Node < 4 codePointAt polyfill; Node < 4 is no longer supported.
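
For downstream code that imported getCodePoint, a local replacement built on the native method might look like the sketch below. Both getCodePointCompat and toNumericEntity are hypothetical names introduced here for illustration; neither is part of the package, and the hexadecimal entity format is an assumption about what a caller might want.

```typescript
// Hypothetical downstream shim for the removed getCodePoint export.
function getCodePointCompat(input: string, index: number): number {
    // Native codePointAt reads a full surrogate pair when one starts at index.
    const codePoint = input.codePointAt(index);
    if (codePoint === undefined) {
        throw new RangeError(`index ${index} is out of range`);
    }
    return codePoint;
}

// Hypothetical helper: emit a hexadecimal numeric character reference.
function toNumericEntity(input: string, index: number): string {
    return `&#x${getCodePointCompat(input, index).toString(16)};`;
}

console.log(toNumericEntity("\u{1F600}", 0)); // &#x1f600;
console.log(toNumericEntity("\u00E9", 0)); // &#xe9;
```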

Benchmark results

Geometric mean across 12 text variants (short/medium/long × low/high entity density × named/numeric), via html-entity-benchmarks compare:

	Geo mean (ops/s)
entities v8 (published)	2.23M
escape-perf	1.81M
Δ	−19%

escapeUTF8 regresses on short inputs where the old getEscaper bitset loop was faster than String.prototype.replace. Longer inputs are closer to parity. The change is primarily a code simplification (−45 lines, −51% escape bundle size).
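
As a quick check of the reported delta, the arithmetic can be reproduced from the two geometric means in the table. The helper names below are illustrative, and the per-variant numbers passed to geometricMean are made up, since the PR only reports the aggregated values.

```typescript
// Geometric mean via logarithms, which avoids overflow for large products.
function geometricMean(values: number[]): number {
    const logSum = values.reduce((sum, value) => sum + Math.log(value), 0);
    return Math.exp(logSum / values.length);
}

// Relative change of `after` with respect to `before`, as a fraction.
function relativeChange(before: number, after: number): number {
    return (after - before) / before;
}

// Values from the benchmark table: 2.23M ops/s (published) vs 1.81M (branch).
const delta = relativeChange(2.23e6, 1.81e6);
console.log(`${(delta * 100).toFixed(1)}%`); // -18.8%

// Made-up per-variant throughputs, only to show how the mean is formed.
console.log(geometricMean([2e6, 8e6]));
```

The result rounds to the −19% shown in the table.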

Bundle size (esbuild --minify --tree-shaking, esm)

Export	Before	After	Δ
`escapeUTF8`	483 B (319 gz)	238 B (162 gz)	−51% (−49% gz)
`encodeHTML`	14,455 B (6,242 gz)	14,273 B (6,153 gz)	−1%
Full bundle	51,718 B (28,091 gz)	51,483 B (27,969 gz)	−0.5%

🤖 Generated with Claude Opus (Claude Code)

Summary by CodeRabbit

Refactor
- Simplified internal implementation of encoding and escaping utilities by removing intermediate helper functions and consolidating escape logic into dedicated function implementations.
Style
- Adjusted formatting in test type annotations.

Copilot

Pull request overview

Refactors escaping/encoding internals by simplifying escape routines to use String.prototype.replace with a shared replacer callback and by removing legacy Node < 4 codePointAt compatibility code, aligning with the package’s current Node engine requirements.

Changes:

Replace the getEscaper factory + manual regex.exec loop with replace(..., escapeReplacer) for escapeUTF8, escapeAttribute, and escapeText.
Remove the exported getCodePoint polyfill and switch callers to native String.prototype.codePointAt.
Update encodeXML and encodeHTMLTrieRe to use codePointAt directly.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

File	Description
src/escape.ts	Removes `getCodePoint` polyfill/export; refactors escape helpers to regex `replace` with shared `escapeReplacer`.
src/encode.ts	Drops `getCodePoint` import and uses `codePointAt` for numeric entity emission.

src/escape.ts

Copilot

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated no new comments.

fb55 · 2026-03-30T09:31:23Z

@coderabbitai review

coderabbitai · 2026-03-30T09:31:29Z

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

coderabbitai · 2026-03-30T09:31:39Z

📝 Walkthrough

Walkthrough

The pull request refactors character encoding functions throughout the module. It removes the getCodePoint helper function and replaces a generic escaper factory with a shared callback-based approach. Public functions are converted from const assignments to function declarations, and the native codePointAt() API is used directly.

Changes

Cohort / File(s)	Summary
Test Formatting `src/decode.spec.ts`	Type annotation reformatted across multiple lines for improved readability; no runtime logic changes.
Character Encoding Refactoring `src/encode.ts`, `src/escape.ts`	Removed `getCodePoint` helper; replaced generic `getEscaper` factory with dedicated regex patterns and shared `escapeReplacer` callback. Public functions converted from `const` to `function` declarations. Native `input.codePointAt(index)` used directly instead of helper function.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~22 minutes

Poem

🐰 Hopping through the escape routes,
Code points now hop on their own,
No helper needed, just native speed,
Functions declared, not assigned—
A cleaner warren for all to see! ✨

🚥 Pre-merge checks | ✅ 3

✅ Passed checks (3 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title accurately summarizes the main changes: refactoring escape functions to use regex replace instead of a factory pattern, and removing the Node < 4 polyfill.
Docstring Coverage	✅ Passed	Docstring coverage is 83.33% which is sufficient. The required threshold is 80.00%.

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/escape.ts`:
- Around line 110-134: Add a fast no-match guard to avoid the replace(callback)
overhead for short/unchanged strings: in escapeUTF8, escapeAttribute, and
escapeText, check quickly for the presence of any characters that would require
escaping using simple string.indexOf/lookups (not regex.test on a global regex)
and return the original input early if none found, then fall back to the
existing return data.replace(..., escapeReplacer). For escapeAttribute check for
'"' '&' and '\u00A0'; for escapeText check for '&' '<' '>' and '\u00A0'; for
escapeUTF8 use the minimal set of characters that xmlEscapeRegex would match.
Ensure you do not call regex.test on a /g regex to avoid stateful behavior.
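
The warning about regex.test on a /g regex refers to the lastIndex state that global regexes carry between calls. The sketch below shows that behavior and one possible indexOf-based guard for escapeText; needsTextEscape is a hypothetical helper name, and whether such a guard is actually faster is not measured here.

```typescript
const textEscapeRegex = /[&<>\u00A0]/g;

// test() on a /g regex stores its position in lastIndex, so two identical
// calls can return different results.
const firstCall = textEscapeRegex.test("a&b"); // match at index 1, lastIndex becomes 2
const secondCall = textEscapeRegex.test("a&b"); // resumes at index 2, finds nothing
console.log(firstCall, secondCall); // true false

// Hypothetical stateless guard built from plain indexOf lookups.
function needsTextEscape(data: string): boolean {
    return (
        data.indexOf("&") !== -1 ||
        data.indexOf("<") !== -1 ||
        data.indexOf(">") !== -1 ||
        data.indexOf("\u00A0") !== -1
    );
}

console.log(needsTextEscape("plain text")); // false
console.log(needsTextEscape("a&b")); // true
```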

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 2547eb3b-fb6c-4cf8-9e71-0815353d400e

📥 Commits

Reviewing files that changed from the base of the PR and between 1dad19d and 4831be9.

📒 Files selected for processing (3)

src/decode.spec.ts
src/encode.ts
src/escape.ts

coderabbitai · 2026-03-30T09:35:47Z

src/escape.ts

+export function escapeUTF8(data: string): string {
+    return data.replace(xmlEscapeRegex, escapeReplacer);
+}
+
+const attributeEscapeRegex = /["&\u00A0]/g;

 /**
 * Encodes all characters that have to be escaped in HTML attributes,
 * following {@link https://html.spec.whatwg.org/multipage/parsing.html#escapingString}.
 * @param data String to escape.
 */
-export const escapeAttribute: (data: string) => string =
-    /* #__PURE__ */ getEscaper(
-        /["&\u00A0]/g,
-        new Map([
-            [34, "&quot;"],
-            [38, "&amp;"],
-            [160, "&nbsp;"],
-        ]),
-    );
+export function escapeAttribute(data: string): string {
+    return data.replace(attributeEscapeRegex, escapeReplacer);
+}
+
+const textEscapeRegex = /[&<>\u00A0]/g;

 /**
 * Encodes all characters that have to be escaped in HTML text,
 * following {@link https://html.spec.whatwg.org/multipage/parsing.html#escapingString}.
 * @param data String to escape.
 */
-export const escapeText: (data: string) => string = /* #__PURE__ */ getEscaper(
-    /[&<>\u00A0]/g,
-    new Map([
-        [38, "&amp;"],
-        [60, "&lt;"],
-        [62, "&gt;"],
-        [160, "&nbsp;"],
-    ]),
-);
+export function escapeText(data: string): string {
+    return data.replace(textEscapeRegex, escapeReplacer);
+}
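
For comparison, the removed factory pattern can be approximated as below. The getEscaper body is a reconstruction inferred from the deleted lines (a regex plus a Map from char code to entity) and the PR description ("manual regex.exec loop"); the actual removed implementation may differ. The lookup-table replacer in replaceBasedEscapeText is a stand-in for the PR's switch-based escapeReplacer.

```typescript
// Reconstruction (assumption) of the removed factory: walk matches with
// regex.exec and stitch the output together by hand.
function getEscaper(
    regex: RegExp,
    map: Map<number, string>,
): (data: string) => string {
    return function escaper(data: string): string {
        let match: RegExpExecArray | null;
        let lastIndex = 0;
        let result = "";

        while ((match = regex.exec(data)) !== null) {
            // Copy the untouched run before this match, then the entity.
            result += data.substring(lastIndex, match.index);
            result += map.get(data.charCodeAt(match.index)) ?? match[0];
            lastIndex = match.index + 1;
        }

        return result + data.substring(lastIndex);
    };
}

const factoryEscapeText = getEscaper(
    /[&<>\u00A0]/g,
    new Map([
        [38, "&amp;"],
        [60, "&lt;"],
        [62, "&gt;"],
        [160, "&nbsp;"],
    ]),
);

// Stand-in for the replace-based version introduced by the PR.
const textEntities: Record<string, string> = {
    "&": "&amp;",
    "<": "&lt;",
    ">": "&gt;",
    "\u00A0": "&nbsp;",
};

function replaceBasedEscapeText(data: string): string {
    return data.replace(/[&<>\u00A0]/g, (char) => textEntities[char] ?? char);
}

const sample = "1 < 2 && 3 > 2";
console.log(factoryEscapeText(sample)); // 1 &lt; 2 &amp;&amp; 3 &gt; 2
console.log(replaceBasedEscapeText(sample)); // 1 &lt; 2 &amp;&amp; 3 &gt; 2
```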


⚠️ Potential issue | 🟠 Major

Short-input performance regression in new replace-based hot paths

The PR benchmark shows a measurable throughput drop (~19% geometric mean). The new callback replace path is likely the source for short strings in escapeUTF8, escapeAttribute, and escapeText.

⚡ Proposed mitigation (fast no-match guard)

 const xmlEscapeRegex = /["&'<>]/g;
+const xmlEscapeCheckRegex = /["&'<>]/;
@@
 export function escapeUTF8(data: string): string {
-    return data.replace(xmlEscapeRegex, escapeReplacer);
+    return xmlEscapeCheckRegex.test(data)
+        ? data.replace(xmlEscapeRegex, escapeReplacer)
+        : data;
 }
 
 const attributeEscapeRegex = /["&\u00A0]/g;
+const attributeEscapeCheckRegex = /["&\u00A0]/;
@@
 export function escapeAttribute(data: string): string {
-    return data.replace(attributeEscapeRegex, escapeReplacer);
+    return attributeEscapeCheckRegex.test(data)
+        ? data.replace(attributeEscapeRegex, escapeReplacer)
+        : data;
 }
 
 const textEscapeRegex = /[&<>\u00A0]/g;
+const textEscapeCheckRegex = /[&<>\u00A0]/;
@@
 export function escapeText(data: string): string {
-    return data.replace(textEscapeRegex, escapeReplacer);
+    return textEscapeCheckRegex.test(data)
+        ? data.replace(textEscapeRegex, escapeReplacer)
+        : data;
 }
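
Whether the guard recovers the short-input regression would need to be measured. A rough, hypothetical timing harness for one of the three functions is sketched below; the names unguardedEscapeText, guardedEscapeText, and measure are introduced here for illustration, the replacer body is an assumption, and Date.now() timing is far coarser than the html-entity-benchmarks tool used in the PR.

```typescript
const textEscapeRegex = /[&<>\u00A0]/g;
// Non-global twin used only for test(); without /g, test() keeps no state.
const textEscapeCheckRegex = /[&<>\u00A0]/;

// Assumed replacer body, limited to the characters textEscapeRegex matches.
function escapeReplacer(match: string): string {
    switch (match) {
        case "&":
            return "&amp;";
        case "<":
            return "&lt;";
        case ">":
            return "&gt;";
        default:
            return "&nbsp;";
    }
}

function unguardedEscapeText(data: string): string {
    return data.replace(textEscapeRegex, escapeReplacer);
}

function guardedEscapeText(data: string): string {
    return textEscapeCheckRegex.test(data)
        ? data.replace(textEscapeRegex, escapeReplacer)
        : data;
}

// Crude wall-clock timing; returns elapsed milliseconds for `iterations` calls.
function measure(
    escapeFn: (data: string) => string,
    input: string,
    iterations: number,
): number {
    const start = Date.now();
    let totalLength = 0;
    for (let i = 0; i < iterations; i++) {
        // Accumulate a result so the call cannot be trivially discarded.
        totalLength += escapeFn(input).length;
    }
    if (totalLength < 0) throw new Error("unreachable");
    return Date.now() - start;
}

const shortInput = "hello world";
console.log("unguarded ms:", measure(unguardedEscapeText, shortInput, 200000));
console.log("guarded ms:", measure(guardedEscapeText, shortInput, 200000));
```

Numbers from a loop like this vary by engine and run, so they are only a first signal before rerunning the full benchmark suite.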

refactor(escape): use regex replace, remove Node < 4 polyfill

91e61ac

Copilot AI review requested due to automatic review settings March 29, 2026 11:28

Copilot started reviewing on behalf of fb55 March 29, 2026 11:28

Copilot AI reviewed Mar 29, 2026

src/escape.ts (outdated)

docs(escape): fix stale JSDoc on escapeReplacer

4831be9

fb55 force-pushed the escape-perf branch from 4967c66 to 4831be9 Compare March 29, 2026 17:26

fb55 requested a review from Copilot March 29, 2026 17:28

Copilot started reviewing on behalf of fb55 March 29, 2026 17:28

Copilot AI reviewed Mar 29, 2026

coderabbitai bot reviewed Mar 30, 2026

fb55 wants to merge 2 commits into main from escape-perf

2 participants
