GPU acceleration of canvas creates a memory leak that can rapidly crash Firefox
Categories
(Core :: Graphics: Canvas2D, defect, P1)
Tracking
Tracking | Status
---|---
firefox-esr78 | unaffected
firefox86 | wontfix
firefox87 | wontfix
firefox88 | wontfix
firefox89 | verified
People
(Reporter: Zolhungaj, Assigned: bobowen)
References
Details
Attachments
(6 files, 1 obsolete file)
User Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:86.0) Gecko/20100101 Firefox/86.0
Steps to reproduce:
Prerequisites: GPU acceleration must be enabled.
Tested on Windows 10, on Firefox 86 and the latest Nightly release.
GPU: NVIDIA RTX 2080.
- open the attached index.html in a new tab
- press the "Click me!" button
- observe the memory usage of Firefox increasing
- be aware that if memory usage exceeds available memory the system might hang
Actual results:
The memory usage of the GPU process will increase rapidly, at an accelerating rate, potentially to the point where the user's machine hangs.
In my tests it levels out at around 9.5 GB of used memory (which was when the memory usage on my system reached 100%). I also managed to hang the system once, when another application tried to allocate from the already exhausted memory.
Occasionally the GC will come in and free memory, but not often enough to prevent Firefox from accumulating RAM.
Expected results:
The memory usage should not increase at such a rapid rate, and the browser should not be able to hang the system.
Comment 1•3 years ago
The Bugbug bot thinks this bug should belong to the 'Core::Canvas: 2D' component, and is moving the bug to that component. Please revert this change in case you think the bot is wrong.
Narrowed down the source of the problem to the creation of a CanvasGradient with at least one color stop; of the methods returning CanvasGradient, CanvasRenderingContext2D.createRadialGradient() is the one leaking fastest.
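The attached index.html is not reproduced in this report, but the leaking call pattern described above can be sketched as follows. This is a hypothetical reconstruction: the helper name, canvas geometry, and loop shape are assumptions; only the createRadialGradient()/addColorStop() calls come from the report.

```javascript
// Sketch of the minimal leaking pattern (hypothetical reconstruction of the
// attached index.html, which is not shown here). A distinct color per
// iteration guarantees a distinct CanvasGradient every time.
function colorForIteration(i) {
  // Cycle through 24-bit RGB so consecutive iterations never repeat early.
  return `rgb(${i % 256}, ${(i >> 8) % 256}, ${(i >> 16) % 256})`;
}

// Browser-only part (requires a <canvas> element, so it cannot run outside
// a page):
//
//   const ctx = document.querySelector("canvas").getContext("2d");
//   let i = 0;
//   function step() {
//     const g = ctx.createRadialGradient(75, 75, 10, 75, 75, 100);
//     g.addColorStop(0, colorForIteration(i++)); // one stop suffices to leak
//     ctx.fillStyle = g;
//     ctx.fillRect(0, 0, 150, 150);
//     requestAnimationFrame(step);
//   }
//   step();
```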
Updated•3 years ago
Comment 3•3 years ago (Assignee)
Looks like this is down to the CanvasGradient holding onto a reference to the GradientStops, coupled with not using the GradientStops cache for recording DrawTargets.
Comment 4•3 years ago (Assignee)
After some more digging, it appears that it is this call to CreateRadialGradientBrush that uses a lot of memory (128 KB on my machine).
The brush appears to live as long as the GradientStops.
One way of cleaning these up more quickly in the remote canvas case is to not cache the stops on the CanvasGradient (patch attached), although that would be worse in the case where the gradient isn't changing, and there would be no "global" caching.
So we probably need to look at caching for recording DrawTargets as well.
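To put that figure in perspective, here is a back-of-envelope helper. It is illustrative only: the function and constant names are made up, and the only input taken from the report is the ~128 KB measured per brush.

```javascript
// Rough estimate of memory retained by leaked gradient brushes, assuming
// the ~128 KB per Direct2D radial gradient brush measured above.
const KB_PER_BRUSH = 128;

function retainedMB(brushCount) {
  return (brushCount * KB_PER_BRUSH) / 1024;
}

// e.g. a page creating 1000 unique gradients per frame at 60 fps retains
// roughly retainedMB(1000 * 60) = 7500 MB per second of animation.
```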
Comment 5•3 years ago (Assignee)
jgilbert pointed out on Element that we could cause a similar issue even with the cache, so I tweaked the script to deliberately miss the cache.
This causes us to use lots of memory in the content process (with remote canvas disabled).
It copes a bit better (at least on my machine, which has a lot of memory), possibly because the memory pressure in the content process causes GC and maybe cache invalidation.
If you let it run for a long time, then stop it with a refresh and let it clean up, it doesn't seem to reclaim all the memory, which probably needs looking into as well.
Comment 7•3 years ago (Assignee)
This is so that we can use it in the canvas worker threads.
It also sets a maximum number of entries because on Windows the associated
Direct2D objects can be fairly big.
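The actual GradientCache lives in Gecko's C++ graphics code; purely as an illustrative sketch of the "cache with a maximum number of entries" idea described above, a size-capped cache keyed on stop data might look like this in JavaScript (all names here are invented; Map's insertion order gives cheap oldest-first eviction):

```javascript
// Illustrative sketch of a size-capped cache keyed on gradient stop data
// (not the actual Gecko GradientCache, which is C++). Capping entries
// matters because each cached value may pin a large backend object
// (e.g. a ~128 KB Direct2D brush on Windows).
class CappedGradientCache {
  constructor(maxEntries = 256) {
    this.maxEntries = maxEntries;
    this.map = new Map(); // key -> value; Map preserves insertion order
  }

  keyFor(stops) {
    // Serialize [offset, color] pairs into a lookup key.
    return stops.map(([o, c]) => `${o}:${c}`).join("|");
  }

  getOrCreate(stops, create) {
    const key = this.keyFor(stops);
    let value = this.map.get(key);
    if (value === undefined) {
      value = create(stops);
      if (this.map.size >= this.maxEntries) {
        // Evict the oldest entry so a script spamming unique stops
        // cannot grow the cache (and its backend objects) unboundedly.
        const oldest = this.map.keys().next().value;
        this.map.delete(oldest);
      }
      this.map.set(key, value);
    }
    return value;
  }
}
```

Under this scheme a script that deliberately misses the cache still churns entries, but total retained memory stays bounded by maxEntries rather than by how fast the script runs.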
Comment 8•3 years ago (Assignee)
Depends on D109790
Comment 9•3 years ago (Assignee)
In the DrawTargetRecording case we create a new GradientStopsRecording each time, and holding onto them in the content process can mean they take up a very large amount of memory in the GPU process if a script deliberately creates lots of unique stops.
In the non-recording case, the GradientStops are cached in the content process anyway.
Depends on D109791
Comment 10•3 years ago (Assignee)
Comment 11•3 years ago
Pushed by bobowencode@gmail.com:
https://hg.mozilla.org/integration/autoland/rev/c991ad5e4a43
p1: Make GradientCache thread safe. r=jrmuizel
https://hg.mozilla.org/integration/autoland/rev/b335fb3dfdea
p2: Use the gradient cache in CanvasTranslator. r=jrmuizel
https://hg.mozilla.org/integration/autoland/rev/76a3bcdeaa9f
p3: Don't hold the GradientStops object on CanvasGradient. r=jrmuizel
Comment 12•3 years ago
bugherder
https://hg.mozilla.org/mozilla-central/rev/c991ad5e4a43
https://hg.mozilla.org/mozilla-central/rev/b335fb3dfdea
https://hg.mozilla.org/mozilla-central/rev/76a3bcdeaa9f
Comment 13•3 years ago
The patch landed in nightly and beta is affected.
:bobowen, is this bug important enough to require an uplift? If not, please set status_beta to wontfix.
For more information, please visit the auto_nag documentation.
Comment 14•3 years ago (Assignee)
(In reply to Release mgmt bot [:sylvestre / :calixte / :marco for bugbug] from comment #13)
> The patch landed in nightly and beta is affected.
> :bobowen, is this bug important enough to require an uplift? If not, please set status_beta to wontfix.
> For more information, please visit the auto_nag documentation.
While the issue is easier to cause and more severe with remote canvas, part of it has essentially been around for a long time.
So, I think given the size of the change, it is probably best to just let this roll out normally.
Comment 15•3 years ago
I used two machines, one with a GTX 1070ti and another with an RTX 2070 Super, and I got the following results:
- Firefox 86.0:
-- GTX 1070ti: memory grows constantly, I stopped it at around 9000 MB; GPU percentage spikes between 30-70-100%
-- RTX 2070 Super: memory grows constantly, I stopped it at around 9000 MB; GPU percentage was under 20% without spikes
- Firefox 89.0:
-- GTX 1070ti: memory stays at a max of 400 MB; GPU percentage does not go over 35% (constant under 20 but had spikes to 35)
-- RTX 2070 Super: memory stays at a max of 400 MB; GPU percentage does not go over 35% (constant under 20 but had spikes to 35)
I did not get any hangs/crashes because I ended the task at 9000 MB and had plenty of memory left.
Based on the above I'll mark this as verified fixed.
Based on the above I'll mark this as verified fixed.