Web Worker Canvas Memory Leak
Categories
(Core :: Graphics: Canvas2D, defect)
People
(Reporter: tomxor, Assigned: aosmond)
References
(Regressed 1 open bug)
Details
(Keywords: memory-leak, perf-alert)
Attachments
(6 files)
User Agent: Mozilla/5.0 (X11; Linux x86_64; rv:120.0) Gecko/20100101 Firefox/120.0
Steps to reproduce:
Set up a Web Worker with an OffscreenCanvas and a requestAnimationFrame loop that uses the 2D context API to modify the canvas in any way on every frame.
The issue has some kind of dependence on browser resource usage. For the simplified test case, a very large canvas of 8192 x 8192 pixels was sufficient on my machine; below a certain size threshold the leak does not occur at all.
It's possible to reproduce the issue under realistic conditions with multiple small offscreen canvases on a single page, which is how I discovered the issue. Even when all but one worker is 100% idle, the leak can be triggered by painting to only a single small canvas at a time; i.e. the dependence on resource usage seems to be shared across offscreen canvases on the same page regardless of active use.
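A minimal sketch of the setup described above (the function names and worker bootstrapping are illustrative assumptions, not the reporter's actual test case):

```javascript
// Minimal repro sketch (assumed shape; not the reporter's exact test case).
// A worker drives a requestAnimationFrame loop that draws to an
// OffscreenCanvas via the 2D context on every frame.
const workerSource = `
  let ctx;
  self.onmessage = (e) => {
    ctx = e.data.canvas.getContext("2d");
    requestAnimationFrame(frame);
  };
  function frame() {
    // Any 2D draw call per frame is enough to exercise the leaking path.
    ctx.fillStyle = "#000";
    ctx.fillRect(0, 0, ctx.canvas.width, ctx.canvas.height);
    requestAnimationFrame(frame);
  }
`;

function startRepro() {
  const canvas = document.createElement("canvas");
  // 8192 x 8192 was enough to cross the size threshold on the reporter's machine.
  canvas.width = canvas.height = 8192;
  document.body.appendChild(canvas);
  const offscreen = canvas.transferControlToOffscreen();
  const worker = new Worker(
    URL.createObjectURL(new Blob([workerSource], { type: "text/javascript" }))
  );
  worker.postMessage({ canvas: offscreen }, [offscreen]);
  return worker;
}
```

Calling startRepro() in a page on an affected build should show content-process memory climbing within seconds.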
Actual results:
The browser consumes all available RAM and crashes the tab. In my case 16 GiB is consumed in around 15 seconds.
Expected results:
No memory leaks.
Reporter
Comment 1•2 years ago
Online reduced test case for convenience:
Comment 2•2 years ago
The Bugbug bot thinks this bug should belong to the 'Core::Graphics: Canvas2D' component, and is moving the bug to that component. Please correct in case you think the bot is wrong.
Comment 3•2 years ago
Can you type "about:support" in your browser and copy-paste its contents to this bug?
Reporter
Comment 4•2 years ago
Reporter
Comment 5•2 years ago
I've also tried this with a fresh Firefox config (i.e. defaults and no extensions) and found the behaviour to be slightly different: while the tab is open, no leak occurs, but as soon as I close the tab, all RAM is quickly consumed and the whole browser crashes instead of just the tab.
When I restore my original Firefox config, the behaviour reverts to what I initially described. Both configurations result in a leak and a crash.
Comment 6•2 years ago
I can't reproduce on macOS with either the attachment or the jsfiddle. We'll discuss in triage.
Assignee
Comment 8•2 years ago
This seems like more my area than Lee's. Did you reproduce this on release, or on a recent nightly? We've been making a lot of changes in the past week.
Assignee
Comment 9•2 years ago
Also, if on release, does disabling accelerated canvas help? Flip gfx.canvas.accelerated to false, restart, and try to reproduce as normal.
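For repeated testing, the same flip can be made persistent via a user.js file in the profile directory (standard Firefox pref syntax; this just restates the step above):

```js
// user.js in the Firefox profile directory — applied on every startup.
user_pref("gfx.canvas.accelerated", false);
```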
Comment 10•2 years ago
I found a version of the leak on Windows with GPU-canvas and filed bug 1871207.
Filed bug 1871208 for the d2d-canvas leak.
Reporter
Comment 11•2 years ago
Andrew, I've tried setting gfx.canvas.accelerated to false and restarting; the leak still occurs for me.
Note, I do not know whether this is a regression or has been here all along, since I discovered it while developing something new.
Reporter
Comment 12•2 years ago
Andrew, this is on release, via Flatpak from Flathub:
Name Application ID Version Branch Arch Origin Ref Active commit
Firefox org.mozilla.firefox 120.0.1 stable x86_64 flathub org.mozilla.firefox/x86_64/stable f75dc4d15a98
Reporter
Comment 13•2 years ago
Noticed there was an update, can also reproduce on v121:
Name Application ID Version Branch Arch Origin Ref Active commit
Firefox org.mozilla.firefox 121.0 stable x86_64 flathub org.mozilla.firefox/x86_64/stable b4ed37eec155
Reporter
Comment 14•2 years ago
I've noticed the behaviour is a little slower when nothing else is open in the browser.
Using the test case HTML, I have to wait 20 to 30 seconds, during which my available RAM hovers around 13-14 GiB; after that initial period, available RAM suddenly plummets and the tab crashes just before hitting 0.
Comment 15•2 years ago
The severity field is not set for this bug.
:lsalzman, could you have a look please?
For more information, please visit BugBot documentation.
Assignee
Comment 16•2 years ago
Sorry for the delay due to the holidays. Could you provide an about:memory report? That might shed some light at least on where the allocations are. Thanks!
Comment 17•2 years ago
(In reply to Andrew Osmond [:aosmond] (he/him) from comment #16)
> Sorry for the delay due to the holidays, could you provide an about:memory
> report? That might shed some light at least on where the allocations are. Thanks!

If it helps, bug 1871208 has an about:memory report.
Reporter
Comment 18•2 years ago
I captured this after the attached test case consumed around 12 GB, with only a couple of GB of RAM left (it crashes when out of RAM).
Assignee
Comment 19•2 years ago
Looks like it is mostly consumed by shmems in the content process:
12,288.19 MB ── shmem-allocated
12,288.19 MB ── shmem-mapped
Reporter
Comment 20•2 years ago
I've narrowed the difference in behaviour between defaults and my config down to layers.acceleration.disabled.
When setting this to true, the test case leaks while open in the foreground, as per my initial description.
When setting this to false (the default), no leak occurs while in the foreground, but the tab eventually crashes on its own; if instead I close the tab prematurely after ~10 seconds, all RAM is consumed and the whole browser crashes.
I've also played with the timing by using setInterval instead of a rAF loop, and the leak seems very sensitive to the interval: if I make it greater or less than the refresh interval (~16 ms on my machine), it doesn't occur as easily, except for some magic numbers like 4 ms, where it occurs as easily as at 16 ms.
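The timing experiment above can be sketched as follows (the function name is illustrative, not from the reporter's test case):

```javascript
// Swap the rAF loop for setInterval at a chosen period. Per the report,
// periods near the refresh interval (~16 ms) reproduce the leak most
// readily, along with a few "magic numbers" such as 4 ms.
function startIntervalLoop(ctx, periodMs) {
  return setInterval(() => {
    // Any per-tick 2D draw call is enough to exercise the leaking path.
    ctx.fillRect(0, 0, ctx.canvas.width, ctx.canvas.height);
  }, periodMs);
}
```

For example, startIntervalLoop(ctx, 16) would approximate the rAF cadence, while a much larger period reportedly avoids the leak.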
Comment 21•2 years ago
Profile with all threads, IPC and allocation: https://share.firefox.dev/3HhWXhW
Assignee
Comment 22•2 years ago
I can reproduce this with SW-WR.
Assignee
Comment 23•2 years ago
If we use the buffer provider, the problem goes away. I can land that patch to fix this.
Assignee
Comment 24•2 years ago
Assignee
Comment 25•2 years ago
This patch adds support for allocating shmem sections for ImageBridgeChild. The recording infrastructure depends on it.
Assignee
Comment 26•2 years ago
This allows us to create a TextureClient on a different thread than the
actor without special effort on the part of the allocator. Similarly, it
also allows us to destroy a TextureClient on a different thread if it
has a readlock bound to it.
Comment 27•2 years ago
Comment 28•2 years ago
bugherder
https://hg.mozilla.org/mozilla-central/rev/a8db51f72457
https://hg.mozilla.org/mozilla-central/rev/a3e17caaa159
https://hg.mozilla.org/mozilla-central/rev/cf3ce7d3c82d
Comment 31•2 years ago
(In reply to Pulsebot from comment #27)
Pushed by aosmond@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/a8db51f72457
Part 1. Implement ImageBridgeChild::GetTileLockAllocator.
r=gfx-reviewers,lsalzman
https://hg.mozilla.org/integration/autoland/rev/a3e17caaa159
Part 2. Ensure TextureClient's mReadLock is only created on the IPDL actor
thread. r=lsalzman
https://hg.mozilla.org/integration/autoland/rev/cf3ce7d3c82d
Part 3. Make OffscreenCanvas use PersistentBufferProvider on the display
pipeline. r=lsalzman
We've recorded a large improvement in CI from your patches!
== Change summary for alert #41064 (as of Thu, 18 Jan 2024 23:29:28 GMT) ==
Improvements:
Ratio | Test | Platform | Options | Absolute values (old vs new)
---|---|---|---|---
19% | offscreencanvas_webcodecs_worker_2d_vp9 (mean time across 100 frames) | windows10-64-ref-hw-2017-qr | e10s fission stylo webgl-ipc webrender | 25.73 -> 20.81
19% | offscreencanvas_webcodecs_worker_2d_av1 (mean time across 100 frames) | windows10-64-ref-hw-2017-qr | e10s fission stylo webgl-ipc webrender | 26.53 -> 21.53
19% | offscreencanvas_webcodecs_main_2d_vp9 (mean time across 100 frames) | windows10-64-ref-hw-2017-qr | e10s fission stylo webgl-ipc webrender | 25.81 -> 20.98
18% | offscreencanvas_webcodecs_main_2d_av1 (mean time across 100 frames) | windows10-64-ref-hw-2017-qr | e10s fission stylo webgl-ipc webrender | 26.50 -> 21.64
18% | offscreencanvas_webcodecs_main_2d_vp9 (mean time across 100 frames) | windows10-64-shippable-qr | e10s fission stylo webgl-ipc webrender | 13.65 -> 11.16
... | ... | ... | ... | ...
2% | offscreencanvas_webcodecs_main_2d_av1 (mean time across 100 frames) | windows10-64-shippable-qr | e10s fission stylo webrender-sw | 13.76 -> 13.48
For up to date results, see: https://treeherder.mozilla.org/perfherder/alerts?id=41064