Skip to content

Bugfix: Text/Replace becomes a no-op when its input contains appends#2711

Open
nikita-volkov wants to merge 9 commits intodhall-lang:mainfrom
nikita-volkov:text-replace-noop-bug
Open

Bugfix: Text/Replace becomes a no-op when its input contains appends#2711
nikita-volkov wants to merge 9 commits intodhall-lang:mainfrom
nikita-volkov:text-replace-noop-bug

Conversation

@nikita-volkov
Copy link
Copy Markdown
Contributor

@nikita-volkov nikita-volkov commented Mar 2, 2026

I've managed to manually reproduce the bug and added a regression test. Then I fed it to an LLM to fix it.

In this run you'll see the reproduction causing a failure.

Following is the report about the fix produced by the LLM.

Root Cause

The bug was in Normalize.hs, in two places inside the Text/replace normalization logic.

The scenario: Text/replace "." "!" (x ++ "!") is normalized with x as a free variable. x ++ "!" becomes TextLit (Chunks [("", Var "x")] "!") — a text literal with one interpolated chunk.

Bug 1 (lines 332–345): The case that handles (needle: plain text, replacement: plain text, haystack: any TextLit) applied Text.replace to all static string parts of the chunks but left the interpolated expressions (y in each (x, y) pair) untouched and returned the result as if the replacement was complete. Since neither "" nor "!" contains ".", the result was identical to the input — the Text/replace was silently dropped. When x = "..." was substituted later, there was no longer a Text/replace to handle the dots.

Bug 2 (line 378): In the chunked-haystack case (lines 363–385), when the needle is not found in firstText, the code emitted firstInterpolation verbatim without wrapping it in Text/replace. Same problem: the replacement was never applied to the interpolated expression's eventual value.

Fix

Two targeted edits to Normalize.hs:

  1. Case 332–345: Narrowed the pattern from TextLit (Chunks xys z) to TextLit (Chunks [] z) — it now only fires for pure-text haystacks (no interpolations). Haystacks with interpolations fall through to the chunked case.

  2. Line 378 (if Text.null suffix then): Changed firstInterpolation to App (App (App TextReplace needle) replacement) firstInterpolation — wrapping it in a Text/replace application so that when the interpolation becomes a concrete text, the replacement is applied to it.

## Root Cause

The bug was in `Normalize.hs`, in two places inside the `Text/replace` normalization logic.

**The scenario:** `Text/replace "." "!" (x ++ "!")` is normalized with `x` as a free variable. `x ++ "!"` becomes `TextLit (Chunks [("", Var "x")] "!")` — a text literal with one interpolated chunk.

**Bug 1 (lines 332–345):** The case that handles `(needle: plain text, replacement: plain text, haystack: any TextLit)` applied `Text.replace` to all static string parts of the chunks but left the interpolated expressions (`y` in each `(x, y)` pair) untouched and returned the result as if the replacement was complete. Since neither `""` nor `"!"` contains `"."`, the result was identical to the input — the `Text/replace` was silently dropped. When `x = "..."` was substituted later, there was no longer a `Text/replace` to handle the dots.

**Bug 2 (line 378):** In the chunked-haystack case (lines 363–385), when the needle is not found in `firstText`, the code emitted `firstInterpolation` verbatim without wrapping it in `Text/replace`. Same problem: the replacement was never applied to the interpolated expression's eventual value.

## Fix

Two targeted edits to `Normalize.hs`:

1. **Case 332–345**: Narrowed the pattern from `TextLit (Chunks xys z)` to `TextLit (Chunks [] z)` — it now only fires for pure-text haystacks (no interpolations). Haystacks with interpolations fall through to the chunked case.

2. **Line 378 (`if Text.null suffix then`)**: Changed `firstInterpolation` to `App (App (App TextReplace needle) replacement) firstInterpolation` — wrapping it in a `Text/replace` application so that when the interpolation becomes a concrete text, the replacement is applied to it.
@nikita-volkov nikita-volkov changed the title Text/Replace becomes a no-op when its input contains appends Bugfix: Text/Replace becomes a no-op when its input contains appends Mar 2, 2026
@winitzki
Copy link
Copy Markdown
Collaborator

winitzki commented Mar 2, 2026

I wonder - is this bug also present in the standard specification for Text/replace, or is it just an implementation bug?

@nikita-volkov
Copy link
Copy Markdown
Contributor Author

@Gabriella439 Please take a look. This is getting stale

@winitzki
Copy link
Copy Markdown
Collaborator

The standard says https://github.com/dhall-lang/dhall-lang/blob/master/standard/beta-normalization.md#text that Text/replace requires its target argument to be in normal form. It appears that Text/replace "." "!" (x ++ "!") should not have been reduced because x remains free. According to the standard, the normal form of λ(x : Text) → Text/replace "." "a" (x ++ "b") is λ(x : Text) → Text/replace "." "a" "${x}b". And indeed this is what I see:

Welcome to the Dhall v1.42.3 REPL! Type :help for more information.
 λ(x : Text)  Text/replace "." "a" (x ++ "b")

λ(x : Text)  Text/replace "." "a" "${x}b"

 (λ(x : Text)  Text/replace "." "a" (x ++ "b") ) "..."

"aaab"

 λ(x : Text)  Text/replace "." "a" ("c" ++ x ++ "d")

λ(x : Text)  Text/replace "." "a" "c${x}d"

 ( λ(x : Text)  Text/replace "." "a" ("c" ++ x ++ "d") ) "..."

"caaad"

I don't seem to reproduce a bug in this way.
Is this bug only found when importing Dhall values via the Haskell bindings?

@Gabriella439
Copy link
Copy Markdown
Collaborator

Okay, I think I understand better what's going on. The standard is a bit unclearly worded, but the reference implementation is that Text/replace only β-normalizes further if the haystack is a string literal without any interpolations. The unclear wording in the standard is:

The needle and the haystack must have reached normal form before any further beta normalization can be applied

It probably meant to say something more like this:

The needle and the haystack must have not have any interpolated expression before any further beta normalization can be applied

Given that, I think the correct change here is to modify all of the Text/replace β-normalization branches to match the standard and require that the haystack have no interpolated expressions (i.e. Chunk [] lastText). I'll add my remaining comments on the diff related to that.

Comment thread dhall/src/Dhall/Normalize.hs Outdated
Comment thread dhall/src/Dhall/Normalize.hs Outdated
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants