Fix O(N²) parsing of large decimal Natural literals #2732
Open
nikita-volkov wants to merge 1 commit into dhall-lang:main
The previous 'decimal' parser in 'naturalLiteral' (Token.hs) read digits
one by one via 'many (satisfy digit)' and then converted with:

    foldl' (\acc x -> acc * 10 + x) 0 digits

For an N-digit number this performs N big-integer multiplications by 10.
The k-th multiplication operates on an accumulator of about k digits, and
multiplying a big integer by a small constant is linear in its size, so
it costs O(k) and the total is O(1+2+…+N) = O(N²). For a 1.26 M-digit
literal this caused 0.79 s of parse time per number; with 8+ such literals
in a single file, parsing alone took ~13 s out of the observed ~39 s total.
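
To make the quadratic cost concrete, here is a minimal, self-contained sketch of the naive conversion. It works on a plain `String` rather than on the parser's input type, and the name `naiveDecimal` is hypothetical, not taken from Token.hs.

```haskell
import Data.Char (digitToInt)
import Data.List (foldl')
import Numeric.Natural (Natural)

-- | Hypothetical stand-in for the old conversion step.
-- Assumes the input contains only the ASCII digits '0'..'9'.
naiveDecimal :: String -> Natural
naiveDecimal = foldl' step 0
  where
    -- After k digits the accumulator has about k digits, so this
    -- multiplication by 10 costs O(k); summed over N digits: O(N²).
    step acc c = acc * 10 + fromIntegral (digitToInt c)
```

Each step touches the whole accumulator, which is why the running time grows with the square of the literal's length.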
Fix
---
Replace the naive left-fold with a divide-and-conquer conversion:

1. Capture all digits in one shot using 'takeWhileP' (single O(N) scan).
2. Recursively split the digit string in half: convert the high half and
   the low half independently, then combine as

       hi_value * 10^lo_len + lo_value

   For strings ≤ 18 digits the plain left-fold is used (the result fits
   in 64 bits, since 10^18 < 2^63).

This reduces the work to at most O(M(N)·log N), where M(N) is the cost of
one N-digit multiplication (for example O(N^1.585) with Karatsuba; GMP
switches to asymptotically faster algorithms for larger operands), which
is far better than O(N²).
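
The two steps above can be sketched as follows. This is an illustration only, not the code from the PR: it works on `String` instead of the parser's input type, and `decimalDC` and `smallFold` are hypothetical names. The 18-digit cutoff is the one stated in the description.

```haskell
import Data.Char (digitToInt)
import Data.List (foldl')
import Data.Word (Word64)
import Numeric.Natural (Natural)

-- | Hypothetical divide-and-conquer decimal conversion.
-- Assumes the input contains only the ASCII digits '0'..'9'.
decimalDC :: String -> Natural
decimalDC digits = go (length digits) digits
  where
    go :: Int -> String -> Natural
    go len ds
      -- Up to 18 digits fit in a 64-bit word, so use machine arithmetic.
      | len <= 18 = fromIntegral (smallFold ds)
      | otherwise =
          let loLen    = len `div` 2
              hiLen    = len - loLen
              (hi, lo) = splitAt hiLen ds
          in  -- hi_value * 10^lo_len + lo_value
              go hiLen hi * 10 ^ loLen + go loLen lo

    smallFold :: String -> Word64
    smallFold = foldl' (\acc c -> acc * 10 + fromIntegral (digitToInt c)) 0
```

For any all-digit string, `decimalDC s` should agree with `read s :: Natural`. The large multiplications now happen on operands of similar size, which is the case big-integer libraries handle fastest.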
Measurements (aarch64-osx, GHC 9.12.2):

    Single 1.26 M-digit literal:   0.79 s → 0.17 s   (4.6×)
    All 120 large literals:        13.2 s → 2.0 s    (6.6×)
    Full resolved.dhall (type):    ~39 s  → 7.4 s    (5.3×)

Regression test and benchmark
------------------------------
* Added 'largeNaturalLiteralParsing' to Dhall.Test.Regression: parses a
  100,000-digit literal and asserts completion within 10 seconds.
* Added 'Large natural number literal (1M digits)' to the parser benchmark
  so future regressions are visible in benchmark runs.
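
A time-bounded check like the regression test described above could look roughly like this. It is a sketch using only `base`: `completesWithin`, `parseNatural`, and `largeLiteralCheck` are hypothetical names, and `read` stands in for the Dhall parser, so this shows the timeout pattern rather than the actual test in Dhall.Test.Regression.

```haskell
import Control.Exception (evaluate)
import Data.Maybe (isJust)
import Numeric.Natural (Natural)
import System.Timeout (timeout)

-- | Hypothetical helper: force a value to weak head normal form and
-- report whether that finished within the given number of microseconds.
completesWithin :: Int -> a -> IO Bool
completesWithin micros x = isJust <$> timeout micros (evaluate x)

-- | Stand-in for the real parser (an assumption made for this sketch).
parseNatural :: String -> Natural
parseNatural = read

-- | Parse a 100,000-digit literal with a 10-second budget.
largeLiteralCheck :: IO Bool
largeLiteralCheck =
  completesWithin (10 * 1000000) (parseNatural (replicate 100000 '9'))
```

A wall-clock bound of this kind is deliberately generous: it catches a return to quadratic behaviour without being sensitive to ordinary variation between machines.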
I ran into this issue when trying to load large resolved files: Dhall got stuck in the loading phase. I then fed a reproduction of the problem to an LLM, and the analysis and fix described here are what it came up with. I have confirmed that the fix works for my use case.