
GPU: WebGPU back-end #1356

Open: slomp wants to merge 18 commits into master from slomp/tracy-webgpu

Collaborator

@slomp slomp commented May 5, 2026

Prototype status:

  • 99.99% reliance on WebGPU C API (webgpu.h)
  • works with both Google Dawn and Rust's wgpu-native implementations
  • multi-threaded command encoding support (but not thoroughly tested)
  • CPU-GPU clock calibration happens ad-hoc during initialization (no WebGPU API support)
  • Timestamp resolve/copy commands are "batched" (32 queries, due to WebGPU alignment requirements)
  • Host access to timestamp buffer is "demultiplexed" across 3 readback buffers operating in round-robin (due to WebGPU async buffer mapping restrictions)
  • [TODO] Timestamp queries may not be fully collected during context shutdown (to be addressed by future PRs)
  • [TODO] Only one context object allowed until SetupDevice is made re-entrant (to be addressed by future PRs)
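The round-robin readback scheme from the list above can be modelled in a few lines. This is only a sketch under stated assumptions: `ReadbackReel`, `WriteSlot`, `MapSlot`, `CollectSlot` and `Rotate` are hypothetical names, not identifiers from TracyWebGPU.hpp, and the real back-end also has to track the WebGPU buffers and map state attached to each slot.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical model of a three-slot readback reel. The names below are
// illustrative only and are not taken from the PR.
class ReadbackReel {
public:
    static constexpr uint32_t kSlots = 3;

    // Slot that instrumentation currently records resolve/copy commands into.
    uint32_t WriteSlot() const {
        return m_write.load(std::memory_order_acquire);
    }

    // Slot that was written last and whose asynchronous map is in flight.
    uint32_t MapSlot() const {
        return (WriteSlot() + kSlots - 1) % kSlots;
    }

    // Slot that was mapped earlier and is being read by the collector.
    uint32_t CollectSlot() const {
        return (WriteSlot() + kSlots - 2) % kSlots;
    }

    // Called by the collector only: the collected slot becomes the new write
    // slot, the old write slot becomes the map slot, and so on.
    uint32_t Rotate() {
        const uint32_t next =
            (m_write.load(std::memory_order_relaxed) + 1) % kSlots;
        m_write.store(next, std::memory_order_release);
        return next;
    }

private:
    std::atomic<uint32_t> m_write{0};
};
```

With the write index at 0, the map slot is 2 and the collect slot is 1. After one `Rotate()` the roles shift by one position, and after three rotations the reel is back in its initial state. Having a single writer of the atomic index (the collector) keeps the rotation simple, since instrumentation threads only read it.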

The most convoluted portions of the back-end are the "readback buffer reel" and the "clock calibration".

  • For readbacks, due to byte-alignment and due to buffer mapping restrictions, I had to resort to a "triple-buffering" scheme to manage timestamp queries. At any given time, one readback buffer is available for instrumentation (QueryID+Resolve+Copy), another is being mapped to the host (wgpuBufferMapAsync), and the other buffer is being processed by Collect(). They rotate in round-robin as needed (Collect orchestrates the rotation by updating an atomic variable).
  • For clock calibration, I used an incremental regression scheme to estimate the anchoring cpu-gpu timestamp pair, and the GPU tick period (ns / gpu-tick). The regression is not updated when there's too much uncertainty in the cpu-tick range.
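To make the calibration idea concrete, here is a minimal sketch of an incremental least-squares fit of CPU time against GPU ticks. It is not the code from this PR: `ClockFit` and its members are hypothetical, the CPU timestamp is assumed to be the midpoint of a begin/end bracket around the GPU sample, and the running update is a standard Welford-style formulation chosen here for illustration. Samples whose CPU bracket is wider than a threshold are skipped, mirroring the "too much uncertainty" rule described above.

```cpp
#include <cmath>
#include <cstdint>

// Hypothetical incremental linear regression: cpuNs ~ anchor + slope * gpuTicks.
struct ClockFit {
    int64_t  maxCpuSpanNs = 1000;  // assumed rejection threshold
    uint64_t count = 0;
    uint64_t gpuOrigin = 0;        // first accepted sample, used as origin
    int64_t  cpuOrigin = 0;        //   to keep the doubles small
    double   meanGpu = 0.0, meanCpu = 0.0;
    double   sxx = 0.0, sxy = 0.0; // running sums of squared/cross deviations

    // Returns false (and leaves the fit untouched) if the CPU bracket is too wide.
    bool AddSample(uint64_t gpuTicks, int64_t cpuBeginNs, int64_t cpuEndNs) {
        if (cpuEndNs - cpuBeginNs > maxCpuSpanNs) return false;
        const int64_t cpuMid = cpuBeginNs + (cpuEndNs - cpuBeginNs) / 2;
        if (count == 0) { gpuOrigin = gpuTicks; cpuOrigin = cpuMid; }
        const double x = static_cast<double>(static_cast<int64_t>(gpuTicks - gpuOrigin));
        const double y = static_cast<double>(cpuMid - cpuOrigin);
        ++count;
        const double dx = x - meanGpu;             // deviation from the old mean
        meanGpu += dx / static_cast<double>(count);
        meanCpu += (y - meanCpu) / static_cast<double>(count);
        sxx += dx * (x - meanGpu);                 // uses the updated mean
        sxy += dx * (y - meanCpu);
        return true;
    }

    bool Valid() const { return count >= 2 && sxx > 0.0; }

    // GPU tick period in nanoseconds per tick (0 until the fit is valid).
    double NsPerTick() const { return Valid() ? sxy / sxx : 0.0; }

    // The point (gpuOrigin + meanGpu, cpuOrigin + meanCpu) lies on the fitted
    // line and serves as the anchoring CPU-GPU pair.
    int64_t ToCpuNs(uint64_t gpuTicks) const {
        const double x = static_cast<double>(static_cast<int64_t>(gpuTicks - gpuOrigin));
        return cpuOrigin + std::llround(meanCpu + NsPerTick() * (x - meanGpu));
    }
};
```

Feeding it samples that lie exactly on a line with 2 ns per tick, such as (1000, 5000), (2000, 7000) and (3000, 9000), yields `NsPerTick()` of 2 and maps GPU tick 4000 to 11000 ns. Each new sample costs O(1) and no history has to be stored.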

@slomp slomp force-pushed the slomp/tracy-webgpu branch 4 times, most recently from f8a7334 to fc33759 Compare May 18, 2026 21:34
Contributor

@starmole starmole left a comment

Just two small comments around the event loop.

Comment thread public/tracy/TracyWebGPU.hpp Outdated
static_cast<uint64_t>(m_queryLimit) * sizeof(uint64_t), cbInfo);

// Optimistic immediate poll: deliver any already-completed callbacks.
wgpuInstanceProcessEvents(m_instance);

I would recommend not calling process events here.
A client integrating it will call process events in its own frame loop,
and it will be very confusing if their callbacks start to come from within Collect.

Comment thread public/tracy/TracyWebGPU.hpp Outdated
m_writeIdx = newWriteIdx;

WGPUBufferMapCallbackInfo cbInfo = {};
cbInfo.mode = WGPUCallbackMode_AllowSpontaneous;

I would recommend AllowProcessEvents only. AllowSpontaneous is quite dangerous: it can fire on arbitrary threads.
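The two suggestions in this review amount to one pattern: the map callback only sets a flag, the client's frame loop is the only place that pumps events, and Collect just checks the flag. The following toy model illustrates that pattern. It deliberately does not use webgpu.h: `ToyInstance`, `ToyProfiler` and all their methods are hypothetical stand-ins, and the model is single-threaded.

```cpp
#include <atomic>
#include <functional>
#include <utility>
#include <vector>

// Toy stand-in for a WebGPU instance. This is NOT the webgpu.h API; it only
// mimics "callbacks fire while the client processes events".
class ToyInstance {
public:
    void MapAsync(std::function<void()> onMapped) {
        m_pending.push_back(std::move(onMapped));
    }

    // Called by the client from its own frame loop.
    void ProcessEvents() {
        std::vector<std::function<void()>> ready;
        ready.swap(m_pending);
        for (auto& cb : ready) cb();
    }

private:
    std::vector<std::function<void()>> m_pending;
};

// Toy stand-in for the profiler context (hypothetical names).
class ToyProfiler {
public:
    explicit ToyProfiler(ToyInstance& instance) : m_instance(instance) {}

    void RequestReadback() {
        m_mapped.store(false, std::memory_order_relaxed);
        // The callback does the bare minimum: it publishes a flag.
        m_instance.MapAsync([this] {
            m_mapped.store(true, std::memory_order_release);
        });
    }

    // Never pumps events itself; returns true only if data was ready.
    bool Collect() {
        if (!m_mapped.load(std::memory_order_acquire)) return false;
        m_mapped.store(false, std::memory_order_relaxed);
        ++m_collected;
        return true;
    }

    int Collected() const { return m_collected; }

private:
    ToyInstance& m_instance;
    std::atomic<bool> m_mapped{false};
    int m_collected = 0;
};
```

In this model, `Collect()` returns false until the client has called `ProcessEvents()`, so no client callback ever runs from inside the profiler. The cost is that the readback can arrive one frame later than with an optimistic poll.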

@slomp slomp force-pushed the slomp/tracy-webgpu branch 4 times, most recently from 430d601 to 44112fb Compare May 22, 2026 22:34
@slomp slomp marked this pull request as ready for review May 22, 2026 22:44
@slomp slomp changed the title from "initial prototype for WebGPU back-end" to "GPU: WebGPU back-end" May 23, 2026
@slomp slomp force-pushed the slomp/tracy-webgpu branch from e3dd7b7 to bf5d92b Compare May 24, 2026 17:07
@slomp slomp force-pushed the slomp/tracy-webgpu branch from 04ceb48 to 400f587 Compare May 24, 2026 21:15
@slomp slomp force-pushed the slomp/tracy-webgpu branch from 03f3ba0 to 1f391bb Compare May 25, 2026 01:23