
perf(gnovm): avoid heap-boxed byte access in copy, range and index reads #5812

Draft
thehowl wants to merge 1 commit into dev/morgan/gnovm-test-parallel from dev/morgan/gnovm-byte-fastpaths

Conversation

@thehowl

@thehowl thehowl commented Jun 11, 2026

Member

Note

Part 2 of 6 of the gnovm performance stack (split from #5800), to be merged in order:

  1. perf(gnovm): parallelize test suites and add gno test -jobs (#5811)
  2. perf(gnovm): avoid heap-boxed byte access in copy, range and index reads (#5812)
  3. perf(gnovm): recycle runtime blocks through a per-machine pool (#5813)
  4. perf(gnovm): share interface-held values when copying arrays (#5814, gas-visible)
  5. ci(gnovm): skip print-only coverage instrumentation (#5815)
  6. perf(gnovm): reduce per-call and per-op allocations (#5816)

Each PR is based on the previous one's branch; this one diffs against part 1. Together: ci / gnovm ~14m → 6m18s; pkg/gnolang test time −64%; VM heap allocations −84% on the heaviest suite.

Summary

Profiling the gnovm test suites (dominated by interpreted Gno) showed ~50% of CPU in Go GC/malloc, with 67% of all heap allocations (694M objects in the bytes stdlib suite alone) coming from ArrayValue.GetPointerAtIndexInt2 materializing a heap *TypedValue + boxed DataByteValue per byte accessed:

  • the copy() builtin allocated two boxes per byte copied (554M objects; the code carried a TODO: consider an optimization if dstv.Data != nil);
  • for i, c := range byteslice allocated three objects per iteration (the index TypedValue escaping through GetPointerAtIndex(&iv), plus the view box) that Deref immediately discarded;
  • b[i] reads in doOpIndex1 did the same box-then-Deref dance.

This adds TypedValue.GetValueAtIntIndex — a read-only fast path mirroring GetPointerAtIndex's checks and panics for strings and Data-backed arrays/slices — used by doOpIndex1 and the range loop, and gives copy() direct byte copies when both sides are Data-backed (or the source is a string).
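The difference between the boxed path and a read-only fast path can be sketched as follows. This is a minimal illustration, not the gnovm code: `typedValue`, `byteArray`, `pointerAtIndex` and `valueAtIndex` are hypothetical, heavily simplified stand-ins, and the real `TypedValue.GetValueAtIntIndex` signature may differ.

```go
package main

import "fmt"

// typedValue is a hypothetical, simplified stand-in for gnovm's TypedValue;
// here it only carries a single byte.
type typedValue struct {
	b byte
}

// byteArray is a hypothetical stand-in for a Data-backed array value.
type byteArray struct {
	data []byte
}

// checkIndex holds the bounds check shared by both access paths, so the
// fast path panics in exactly the same way as the pointer path.
func (a *byteArray) checkIndex(i int) {
	if i < 0 || i >= len(a.data) {
		panic(fmt.Sprintf("index out of range [%d] with length %d", i, len(a.data)))
	}
}

// pointerAtIndex mimics the boxed path: it returns a pointer to a fresh
// value, which can force a heap allocation whenever that pointer escapes.
func (a *byteArray) pointerAtIndex(i int) *typedValue {
	a.checkIndex(i)
	return &typedValue{b: a.data[i]}
}

// valueAtIndex mimics the read-only fast path: same check and panic, but
// the element is returned by value, so no box is needed for a plain read.
func (a *byteArray) valueAtIndex(i int) typedValue {
	a.checkIndex(i)
	return typedValue{b: a.data[i]}
}

func main() {
	a := &byteArray{data: []byte("gno")}
	boxed := a.pointerAtIndex(1) // box, then dereference
	direct := a.valueAtIndex(1)  // read without a box
	fmt.Println(boxed.b == direct.b, string(direct.b))
}
```

Sharing the bounds check between both paths is what keeps the observable behaviour (results and panics) identical while only the allocation pattern changes.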

Gas is unchanged: the view boxes were raw Go allocations never charged to the VM allocator, and CPU gas for copy() was already charged before the per-element loops. Verified empirically: all 2344 filetest goldens byte-identical (including Gas: and MAXALLOC-sensitive alloc tests), gno.land/pkg/sdk/vm gas tests, txtar suite, examples, cmd/gno.

Measurements

| | before | after |
|---|---|---|
| TestStdlibs/bytes solo | 151.5s | 105.2s (−31%) |
| heap objects allocated (bytes suite) | 1.03G | 0.30G (−71%) |
| BenchmarkOpIndex1_ByteArray | 185.7ns | 130.0ns |
| full pkg/gnolang long mode, 16 cores | 245.0s | 184.6s |

Untouched paths benchmark flat (OpRangeIter_1000 30.9µs → 30.2µs, OpIndex1_MapHit_100 187.0ns → 187.7ns). Byte writes (b[i] = x) still box — the pointer protocol spans multiple ops; left as follow-up.

Profiling the gnovm test suites (dominated by interpreted Gno) showed
~50% of CPU in Go GC/malloc, with 67% of all heap allocations (694M
objects in the bytes stdlib suite alone) coming from
ArrayValue.GetPointerAtIndexInt2 materializing a *TypedValue +
DataByteValue box per byte accessed:

- the copy() builtin allocated two boxes per byte copied;
- range over a byte slice allocated three objects per iteration (the
  index TypedValue escaping through GetPointerAtIndex, plus the view
  box) that Deref immediately discarded;
- b[i] reads in doOpIndex1 did the same box-then-Deref dance.

Add TypedValue.GetValueAtIntIndex, a read-only fast path mirroring
GetPointerAtIndex's checks and panics for strings and Data-backed
arrays/slices, and use it in doOpIndex1 and the range loop. Give the
copy() builtin direct byte copies when both sides are Data-backed (or
the source is a string); bounds, readonly checks, DidUpdate and CPU gas
are unchanged (charged before the loop, as before), and Go's copy is
overlap-safe so the backward-copy setup only remains for the List
fallback.
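The overlap-safety point can be illustrated with plain Go, independent of the VM. In this sketch `directCopy`, `copyFromString` and `naiveForwardCopy` are hypothetical helper names chosen for illustration; only the builtin `copy` is real, and it is documented to handle overlapping source and destination as well as a string source.

```go
package main

import "fmt"

// directCopy delegates to Go's builtin copy, which handles overlapping
// source and destination slices correctly.
func directCopy(dst, src []byte) int {
	return copy(dst, src)
}

// copyFromString uses the special case of the builtin that copies bytes
// from a string into a byte slice.
func copyFromString(dst []byte, src string) int {
	return copy(dst, src)
}

// naiveForwardCopy copies element by element from the front. On overlapping
// ranges where dst starts after src, it re-reads bytes it already overwrote.
func naiveForwardCopy(dst, src []byte) int {
	n := len(dst)
	if len(src) < n {
		n = len(src)
	}
	for i := 0; i < n; i++ {
		dst[i] = src[i]
	}
	return n
}

func main() {
	a := []byte("abcdef")
	n := directCopy(a[2:], a[:4]) // overlapping ranges in one backing array
	fmt.Println(n, string(a))     // 4 ababcd

	b := []byte("abcdef")
	naiveForwardCopy(b[2:], b[:4])
	fmt.Println(string(b)) // ababab: the tail was read after being overwritten

	dst := make([]byte, 3)
	fmt.Println(copyFromString(dst, "gnoland"), string(dst)) // 3 gno
}
```

The naive loop is why an element-wise fallback needs a backward-copy setup for overlapping ranges, while a direct byte copy does not.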

Gas is unchanged: the view boxes were raw Go allocations, never charged
to the VM allocator. All 2344 filetest goldens (including Gas: and
MAXALLOC-sensitive alloc tests) pass unmodified, as do the gno.land vm
Gas tests and the txtar integration suite.

bytes stdlib suite: 151.5s -> 105.2s; full pkg/gnolang long mode:
245.0s -> 184.6s; allocated objects in the bytes suite: 1.03G -> 0.30G;
BenchmarkOpIndex1_ByteArray: 185.7ns -> 130.0ns.
@Gno2D2

Gno2D2 commented Jun 11, 2026

Collaborator

🛠 PR Checks Summary

All Automated Checks passed. ✅

Manual Checks (for Reviewers):
  • IGNORE the bot requirements for this PR (force green CI check)

🤖 This bot helps streamline PR reviews by verifying automated checks and providing guidance for contributors and reviewers.

✅ Automated Checks (for Contributors):

No automated checks match this pull request.

☑️ Contributor Actions:
  1. Fix any issues flagged by automated checks.
  2. Follow the Contributor Checklist to ensure your PR is ready for review.
    • Add new tests, or document why they are unnecessary.
    • Provide clear examples/screenshots, if necessary.
    • Update documentation, if required.
    • Ensure no breaking changes, or include BREAKING CHANGE notes.
    • Link related issues/PRs, where applicable.
☑️ Reviewer Actions:
  1. Complete manual checks for the PR, including the guidelines and additional checks if applicable.
📚 Resources:
Debug
Manual Checks
**IGNORE** the bot requirements for this PR (force green CI check)

If

🟢 Condition met
└── 🟢 On every pull request

Can be checked by

  • Any user with comment edit permission

Labels

📦 🤖 gnovm Issues or PRs gnovm related

2 participants