WebGPU makes too many IPDL calls
Categories
(Core :: Graphics: WebGPU, enhancement, P2)
Tracking
()
Tracking | Status | |
---|---|---|
firefox142 | --- | fixed |
People
(Reporter: jimb, Assigned: teoxoy)
References
(Blocks 4 open bugs, Regressed 1 open bug)
Details
Attachments
(29 files, 1 obsolete file)
48 bytes,
text/x-phabricator-request
|
Details | Review | |
48 bytes,
text/x-phabricator-request
|
Details | Review | |
48 bytes,
text/x-phabricator-request
|
Details | Review | |
48 bytes,
text/x-phabricator-request
|
Details | Review | |
48 bytes,
text/x-phabricator-request
|
Details | Review | |
48 bytes,
text/x-phabricator-request
|
Details | Review | |
48 bytes,
text/x-phabricator-request
|
Details | Review | |
48 bytes,
text/x-phabricator-request
|
Details | Review | |
48 bytes,
text/x-phabricator-request
|
Details | Review | |
48 bytes,
text/x-phabricator-request
|
Details | Review | |
48 bytes,
text/x-phabricator-request
|
Details | Review | |
48 bytes,
text/x-phabricator-request
|
Details | Review | |
48 bytes,
text/x-phabricator-request
|
Details | Review | |
48 bytes,
text/x-phabricator-request
|
Details | Review | |
48 bytes,
text/x-phabricator-request
|
Details | Review | |
48 bytes,
text/x-phabricator-request
|
Details | Review | |
48 bytes,
text/x-phabricator-request
|
Details | Review | |
48 bytes,
text/x-phabricator-request
|
Details | Review | |
48 bytes,
text/x-phabricator-request
|
Details | Review | |
48 bytes,
text/x-phabricator-request
|
Details | Review | |
48 bytes,
text/x-phabricator-request
|
Details | Review | |
48 bytes,
text/x-phabricator-request
|
Details | Review | |
48 bytes,
text/x-phabricator-request
|
Details | Review | |
48 bytes,
text/x-phabricator-request
|
Details | Review | |
48 bytes,
text/x-phabricator-request
|
Details | Review | |
48 bytes,
text/x-phabricator-request
|
Details | Review | |
48 bytes,
text/x-phabricator-request
|
Details | Review | |
48 bytes,
text/x-phabricator-request
|
Details | Review | |
48 bytes,
text/x-phabricator-request
|
Details | Review |
Firefox's WebGPU bindings spend a quarter of their time in the kernel, simply sending IPDL messages.
This profile view shows a 283ms requestAnimationFrame
call, with about a quarter of that time spent in IPDL-generated code. In the content process, if we focus on the frame
JS function, 27% of samples have IPC::Channel::ChannelImpl::ProcessOutgoingMessages
near the top of their stack.
Our WebGPU implementation should buffer messages and send them in large blocks, flushing the buffer at calls to GPUQueue
methods.
As a less ambitious optimization, we could make command buffer recording entirely local to the content process, rather than sending a separate IPDL CommandEncoderAction
for each call to a GPUCommandEncoder
method. Among the above call to frame
's 4,387 samples, 1083 are in GPUCommandEncoder
methods, and 691 of those are in PWebGPU
methods. I think this means that local command buffer recording would save about 16% of frame
's time.
Reporter | ||
Comment 1•3 months ago
|
||
(I first heard this idea from Nical. But WebGL also does its own serialization, buffering, and flushing.)
Assignee | ||
Updated•3 months ago
|
Reporter | ||
Comment 2•3 months ago
|
||
Bug 1969364 shows that it is not sufficient to flush buffered calls on operations like submit
and mapAsync
. We must place a bound on how long calls stay buffered in the content process --- ideally, a deterministic bound. It seems like putting flushing jobs on the microtask queue would work.
Reporter | ||
Updated•3 months ago
|
Assignee | ||
Comment 3•3 months ago
|
||
When I was thinking about the buffering/flushing strategy I was concerned that if we flush only on submit + the other async methods, we would increase the latency of some future work in situations where the application tries to create resources in advance.
Using a microtask to flush buffered messages sounds like the best way to ensure that we don't wait too long to flush since as far as I understand the microtask would run after let's say the requestAnimationFrame
callback right?
Assignee | ||
Comment 4•3 months ago
|
||
Jamie mentioned that external texture expiry will also use a microtask, we should ensure that the expiry microtask runs after our flushing microtask.
Comment 5•3 months ago
|
||
I've looked at this a bit from the IPC side, and some of this overhead is potentially avoidable.
A good chunk of the time in the profile (though not the bulk of it - 5.1% in the sample shown above) is spent in NtQueryVolumeInformationFile
which is being run from the WinIOAutoObservation
constructor to determine the file type (https://searchfox.org/mozilla-central/rev/3b58bde321bf705a91c32c7a6574a3427ce81d3f/xpcom/build/PoisonIOInterposerWin.cpp#133). If we were able to avoid interposing these particular calls to WriteFile
withthe IOInterposer, it would be possible to avoid this part of the overhead. In addition, if the interposer avoided calling GetFileType
when it is suppressed (as we do for ipc_channel_win.cc
here: https://searchfox.org/mozilla-central/rev/3b58bde321bf705a91c32c7a6574a3427ce81d3f/ipc/chromium/src/chrome/common/ipc_channel_win.cc#463,466), we could also avoid the unnecessary call.
The actual call to WriteFile
though is somewhat tricky to reduce the overhead of. Before bug 1792474, we actually would always do the WriteFile
call on the IPC I/O thread, but we moved the overlapped IO call to the main thread to reduce latency for IPC messages. The assumption was that due to an OVERLAPPED
being specified, the write would largely be happening async, and would therefore have minimal overhead on the sending thread, instead just avoiding some thread hops.
Not knowing whether this operation is slow because we're doing a lot of individual system calls vs. because the actual OVERLAPPED WriteFile
calls are expensive does make this a bit tricky unfortunately. Would be nice if we could see what Windows is doing under the hood which is taking time within NtWriteFile
.
Assignee | ||
Updated•2 months ago
|
Assignee | ||
Comment 6•2 months ago
|
||
Assignee | ||
Comment 7•2 months ago
|
||
Assignee | ||
Comment 8•2 months ago
|
||
Assignee | ||
Comment 9•2 months ago
|
||
Assignee | ||
Comment 10•2 months ago
|
||
Assignee | ||
Comment 11•2 months ago
|
||
Assignee | ||
Comment 12•2 months ago
|
||
Assignee | ||
Comment 13•2 months ago
|
||
Assignee | ||
Comment 14•2 months ago
|
||
Assignee | ||
Comment 15•2 months ago
|
||
Assignee | ||
Comment 16•2 months ago
|
||
Assignee | ||
Comment 17•2 months ago
|
||
Assignee | ||
Comment 18•2 months ago
|
||
Assignee | ||
Comment 19•2 months ago
|
||
Assignee | ||
Comment 20•2 months ago
|
||
Assignee | ||
Comment 21•2 months ago
|
||
Assignee | ||
Comment 22•2 months ago
|
||
Assignee | ||
Comment 23•2 months ago
|
||
Assignee | ||
Comment 24•2 months ago
|
||
Assignee | ||
Comment 25•2 months ago
|
||
Assignee | ||
Comment 26•2 months ago
|
||
Assignee | ||
Comment 27•2 months ago
|
||
Assignee | ||
Comment 28•2 months ago
|
||
Assignee | ||
Comment 29•2 months ago
|
||
Assignee | ||
Comment 30•2 months ago
|
||
Assignee | ||
Comment 31•2 months ago
|
||
Assignee | ||
Comment 32•2 months ago
|
||
Updated•2 months ago
|
Assignee | ||
Comment 33•2 months ago
|
||
I've run all the demos/apps we are tracking for release (Bug 1956044) with the patch stack applied and I didn't find any new unexpected behavior.
Assignee | ||
Comment 34•2 months ago
|
||
Jobs part of recent CTS runs have been timing out, this is probably due to recent passes being promoted out of backlog.
Assignee | ||
Comment 35•2 months ago
|
||
I had to demote 2 tests because they are testing promise ordering that
is not guaranteed by the specification. I will open an issue on the
spec repo to clarify the situation.
The new timeouts are actually existing timeouts that were not surfaced
by the harness. I've looked at a recent CTS run without the patch stack
and the tests time out but the harness reports TEST-OK. Locally,
the tests also run at the same speed with and without the patch stack.
I'm not sure what change in the patch stack caused the harness to now
report the timeout but I think this is now working as intended.
Assignee | ||
Comment 36•2 months ago
|
||
We were previously always creating and returning async pipelines even
though there was a validation error, the tests were running but when
reading back the results all they got were 0s, we now reject the
promise which changes the test outcome from FAIL to ERROR.
Assignee | ||
Comment 37•2 months ago
|
||
Comment 38•2 months ago
|
||
Comment 39•2 months ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/7bde5c633884
https://hg.mozilla.org/mozilla-central/rev/6d6355a3b148
https://hg.mozilla.org/mozilla-central/rev/e44206c0b7cc
https://hg.mozilla.org/mozilla-central/rev/9532eabe579b
https://hg.mozilla.org/mozilla-central/rev/34c823ba8400
https://hg.mozilla.org/mozilla-central/rev/b7a17e072855
https://hg.mozilla.org/mozilla-central/rev/ae534cb1b5b3
https://hg.mozilla.org/mozilla-central/rev/eda9efca62e6
https://hg.mozilla.org/mozilla-central/rev/6dad2b345750
https://hg.mozilla.org/mozilla-central/rev/83a72bfebf3e
https://hg.mozilla.org/mozilla-central/rev/510a7b4f6203
https://hg.mozilla.org/mozilla-central/rev/84f895015c01
https://hg.mozilla.org/mozilla-central/rev/b7cd4914c875
https://hg.mozilla.org/mozilla-central/rev/c9e106a02c59
https://hg.mozilla.org/mozilla-central/rev/ef3ca838d33c
https://hg.mozilla.org/mozilla-central/rev/a0f2c6fe8d03
https://hg.mozilla.org/mozilla-central/rev/a5f7b9eaf0c8
https://hg.mozilla.org/mozilla-central/rev/64a6525d29f1
https://hg.mozilla.org/mozilla-central/rev/283fa57cd389
https://hg.mozilla.org/mozilla-central/rev/9305dede4d1c
https://hg.mozilla.org/mozilla-central/rev/17df0a2dba00
https://hg.mozilla.org/mozilla-central/rev/98c197ab32a8
https://hg.mozilla.org/mozilla-central/rev/ac748e201657
https://hg.mozilla.org/mozilla-central/rev/38d2580426fa
https://hg.mozilla.org/mozilla-central/rev/c47b7c9db6b0
https://hg.mozilla.org/mozilla-central/rev/85f42f42bb8d
https://hg.mozilla.org/mozilla-central/rev/6a034a351c96
https://hg.mozilla.org/mozilla-central/rev/12619b90dd5e
https://hg.mozilla.org/mozilla-central/rev/29605bf21ee0
Updated•1 month ago
|
Description
•