Open Bug 1870699 Opened 1 year ago Updated 11 days ago

Don't poll WebGPU from a timer

Categories

(Core :: Graphics: WebGPU, defect, P3)

defect

Tracking

()

People

(Reporter: jimb, Unassigned)

References

(Blocks 3 open bugs)

Details

Right now, when Firefox creates a PWebGPU IPDL parent, the WebGPUParent constructor starts a timer thread calling wgpu_core::global::device::Global::poll_all_devices, to respond to the completion of work on the GPU. This polling runs ten times a second, and probably has an unacceptable impact on battery life.

Instead, creating a WebGPUParent should start a separate thread that polls the wgpu devices only as necessary. Submitting a command buffer to a queue should dispatch a runnable to that thread (in classic Gecko style, using nsIEventTarget::Dispatch or something like it) that calls Global::device_poll, passing Maintain::WaitForSubmissionIndex with the submission index returned by the submission. This will cause the thread to block until the GPU has completed the submitted work.

Various other WebGPU operations require calling device_poll as well, and these can be handled by sending the thread appropriate runnables.

This approach should ensure that WebGPU consumes the CPU's attention only when we are actually waiting for work.

Blocks: webgpu-v1
Severity: -- → S3
Priority: -- → P3

Linking this other report with an attached test case and some discussion: https://bugzilla.mozilla.org/show_bug.cgi?id=1900273 and the issue on wgpu github with a link to a solution to this problem on the servo browser: https://github.com/gfx-rs/wgpu/issues/6660 https://github.com/servo/servo/pull/32266

In terms of severity, this makes compute shaders so slow as to be unusuable on firefox.

Just pinging here to see if there is any chance of fixing this sometime soon? :) Thanks!

This is definitely a priority for before the release we're hoping for by June. However, we haven't gotten more fine-grained with our planning than that right now, sorry!

Here's your periodic ping to see if there is any update on when this might be fixed :)

Unfortunately, I think this task is going to wait until closer to EOY. It's not clear, given our current backlog and decision to ship (which will, in turn, impose more backlog that we haven't found yet).

NI'ing :jimb for his thoughts as the team's tech lead.

Flags: needinfo?(jimb)

Actually, I think we may want to raise the priority here.

I just tried lowering POLL_MS_TIME in WebGPUParent.cpp from 100ms to 5ms, and it lowers the time needed to run the WebGPU CTS test webgpu:shader,execution,expression,call,builtin,abs from 12s to 1.2s. So it seems plausible that our current polling approach is behind some of our CTS timeouts.

Polling at 100ms is ridiculous: that's an average 50s of latency, unless you're submitting continuously and taking advantage of the implicit poll that submission does.

Flags: needinfo?(jimb)

We did this in servo last year and let me tell you it was worth it. We slashed CTS run times in half.

Polling at 100ms is ridiculous: that's an average 50s of latency, unless you're submitting continuously and taking advantage of the implicit poll that submission does.

That is probably true for most rendering scenarios where you have requestAnimationFrame doing submits for every frame.

So is the plan to just change this POLL_MS_TIME constant or do the solution outlined above? Any timeframe for this happening? :)

Randall: I know that :jimb has been doing some implementation design research, and has converged on a design for wait-polling in a separate thread when Firefox knows that some GPU work is in flight. I don't know when we'll be getting to implementing it (minus what I said in comment 5 for before EOY). The changes seem straightforward—we just need to figure out when it will happen compared to the other priorities we have.

I was looking at this a bit today. Things are a bit hairy around WebGPUParent::GetFrontBufferSnapshot because it calls ffi::wgpu_server_poll_all_devices itself.

We talked about the best way to fit this into the solution proposed above. JimB proposed having the callback set a condition variable and have GetFrontBufferSnapshot wait on that condition variable and then do the readback. This will keep all the polling (expect maybe for the shutdown poll) on the polling thread.

Blocks: 1993333
You need to log in before you can comment on or make changes to this bug.