
Use flat arrays for BLAS traversal (fix Metal pointer-in-buffer issue) #13

Merged
SimonDanisch merged 1 commit into JuliaGeometry:sd/multitype-vec from jkrumbiegel:jk/fix-metal-blas-pointers
Mar 2, 2026

Conversation

@jkrumbiegel

Summary

  • On Metal, device pointers (Core.LLVMPtr) stored inside GPU buffers cannot be reliably dereferenced by kernels: inline data reads correctly, but following the embedded pointers returns zeros
  • Replace pointer-based BLAS architecture in StaticTLAS with flat concatenated arrays (all_blas_nodes, all_blas_prims) and BLASDescriptor structs containing offsets
  • closest_hit and any_hit now use offset-based indexing instead of dereferencing per-BLAS pointer arrays
  • Management kernels still use blas_array but only read root_aabb (inline data, unaffected)
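
The flat-array layout described above can be sketched on the CPU as follows. This is a minimal illustration, not the code from this PR: the field names `nodes_offset`, `primitives_offset`, and `root_aabb` and the array names `all_blas_nodes` and `all_blas_prims` come from the description, while `Node`, `Prim`, `flatten_blas`, `blas_node`, `blas_prim`, the field types, the zero-based offsets, and the assumption that the first node of each BLAS is its root are all hypothetical.

```julia
# Hypothetical stand-ins for the real node and primitive types.
struct Node
    aabb::NTuple{6,Float32}
end

struct Prim
    id::UInt32
end

# Field names follow the PR description; the concrete types are assumed.
struct BLASDescriptor
    nodes_offset::UInt32       # zero-based start of this BLAS in all_blas_nodes
    primitives_offset::UInt32  # zero-based start of this BLAS in all_blas_prims
    root_aabb::NTuple{6,Float32}  # stored inline, so no pointer is needed
end

# Concatenate per-BLAS arrays and record where each BLAS begins.
function flatten_blas(blas_nodes::Vector{Vector{Node}}, blas_prims::Vector{Vector{Prim}})
    all_blas_nodes = Node[]
    all_blas_prims = Prim[]
    descriptors = BLASDescriptor[]
    for (nodes, prims) in zip(blas_nodes, blas_prims)
        # Assumption: the first node of each BLAS is its root.
        desc = BLASDescriptor(length(all_blas_nodes), length(all_blas_prims), nodes[1].aabb)
        push!(descriptors, desc)
        append!(all_blas_nodes, nodes)
        append!(all_blas_prims, prims)
    end
    return descriptors, all_blas_nodes, all_blas_prims
end

# Offset-based lookup: `i` is the 1-based index local to one BLAS.
blas_node(all_nodes, desc, i) = all_nodes[desc.nodes_offset + i]
blas_prim(all_prims, desc, i) = all_prims[desc.primitives_offset + i]

# Usage: two small BLASes flattened into one array each.
box(v) = ntuple(_ -> Float32(v), 6)
blas_a = [Node(box(1)), Node(box(2))]
blas_b = [Node(box(3))]
prims_a = [Prim(10), Prim(11)]
prims_b = [Prim(20)]

descs, all_blas_nodes, all_blas_prims =
    flatten_blas([blas_a, blas_b], [prims_a, prims_b])

first_node_b = blas_node(all_blas_nodes, descs[2], 1)
first_prim_b = blas_prim(all_blas_prims, descs[2], 1)
```

With this layout a kernel only needs the two flat arrays and the descriptor array as arguments, so no pointer is ever read out of a buffer.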

How these changes were tested

  • CPU and Metal renders of the 3-sphere test scene produce identical mean pixel values (~0.327)
  • Metal render of 20-material gallery scene (glass, volumetrics, metals, coated, emissive) produces correct results matching CPU
  • Stable across multiple renders and sample counts (4, 8, 16, 32 spp)
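
A mean-pixel comparison like the one above could look roughly like this. It is only a sketch: `means_match` and its tolerance are hypothetical, and the arrays are synthetic stand-ins rather than real renders from either backend.

```julia
using Statistics

# Hypothetical helper: compare two images by their mean pixel value.
function means_match(img_a, img_b; atol=1f-3)
    return isapprox(mean(img_a), mean(img_b); atol=atol)
end

# Synthetic stand-ins for a CPU render and a GPU render.
cpu_img = fill(0.5f0, 4, 4)
gpu_img = copy(cpu_img)
gpu_img[1, 1] += 1f-4            # small floating-point difference

# An all-zero image, the kind of output reads of zeros could produce.
zero_img = zeros(Float32, 4, 4)

close_enough = means_match(cpu_img, gpu_img)
broken = means_match(cpu_img, zero_img)
```

A matching mean is a coarse check: it catches failures such as an all-black image but not localized differences, so a per-pixel comparison is a stricter complement.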

Created with the help of Claude Code

On Metal, device pointers (Core.LLVMPtr) stored inside GPU buffers
cannot be reliably dereferenced by kernels. The inline data (root_aabb)
reads correctly, but following embedded pointers to per-BLAS node/primitive
arrays returns zeros.

Replace the pointer-based BLAS architecture in StaticTLAS with:
- BLASDescriptor: lightweight struct with nodes_offset, primitives_offset, root_aabb
- Flat concatenated arrays (all_blas_nodes, all_blas_prims) built from per-BLAS GPU arrays
- Offset-based indexing in closest_hit/any_hit traversal

Management kernels (update_tlas_leaf_aabbs_kernel!, etc.) still use
blas_array but only read root_aabb (inline, unaffected).

Verified: CPU and Metal produce identical results (mean pixel ~0.327
on 3-sphere test scene).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@jkrumbiegel jkrumbiegel force-pushed the jk/fix-metal-blas-pointers branch from b063cf7 to 811a580 Compare March 1, 2026 20:02
@SimonDanisch SimonDanisch merged commit b8b2be5 into JuliaGeometry:sd/multitype-vec Mar 2, 2026
1 of 2 checks passed
2 participants