Long css filter crash the whole firefox process group
Categories
(Core :: Graphics: WebRender, defect, P3)
Tracking
()
Tracking | Status | |
---|---|---|
firefox-esr60 | --- | unaffected |
firefox64 | --- | unaffected |
firefox65 | --- | disabled |
firefox66 | --- | fixed |
People
(Reporter: mmis1000, Assigned: nical)
References
(Blocks 1 open bug, )
Details
(4 keywords, Whiteboard: 66b-qatriage-p2)
Crash Data
Attachments
(1 file, 1 obsolete file)
User Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.14; rv:66.0) Gecko/20100101 Firefox/66.0
Steps to reproduce:
https://codepen.io/mmis1000/pen/WLPzpm
-
WebRender enabled
-
make a very long filter consists of many many blur pass
.test {
/*repeat over 1000x*/
filter: blur(1px) blur(1px) blur(1px)....... blur(1px);
}
- apply it on some element
<div class="test">123</div>
Actual results:
The whole firefox crashed (include the main process and the ui, literally everything)
Expected results:
If firefox really can't handle such malformed filter, it at least should limit the crash to content process instead of the whole process group.
Or if it should be able to handle that, it will render them without problem.
Comment 1•5 years ago
•
|
||
Thanks! :)
Command used
mozregression --pref gfx.webrender.all:true --bad=2019-01-17 --good=2017-11-01
Comment 4•5 years ago
|
||
Thank you for finding the regression range!
Confirmed on Win10 (GTX1060) and Debian Testing (Macbook Pro; with layers.gpu-process.enabled:true).
Win10: bp-b767ef29-b312-4c27-b7aa-95d280190117
Last good: Nightly just gets slow, but shows blur. The graphics card is bleeping.
mozregression --repo autoland --launch 5a1beccacebf371da00b6622586396df368bb245 --pref gfx.webrender.all:true -a https://codepen.io/mmis1000/pen/WLPzpm
First bad: Crash.
mozregression --repo autoland --launch 68aa8f295bd5bd25ad05b003972cedb977e5a570 --pref gfx.webrender.all:true -a https://codepen.io/mmis1000/pen/WLPzpm
https://github.com/servo/webrender/pull/3249
https://github.com/servo/webrender/pull/3252
https://github.com/servo/webrender/pull/3244 Most likely this one?
Updated•5 years ago
|
Comment 5•5 years ago
|
||
MacOS: bp-a6b8ff03-3bad-4875-8a83-e00fe0190117 "OS: Unknown" [@ EMPTY: no crashing thread identified; ERROR_NO_THREAD_LIST ]
Updated•5 years ago
|
Comment 6•5 years ago
|
||
None of those referenced patches seem likely to be the specific cause of this - it might just be chance that it starts to fall over around then for unrelated reasons.
Matt, Bobby, do you know if Gecko starts to drop filters if there are a huge number on an element (> 1000 in this case)? Is there a reasonable constant we could pick, and start to drop filters beyond that?
Comment 7•5 years ago
|
||
I'm not aware of any Gecko code that drops parts of long filter chains. But, looking at that page in non-WebRender Firefox, it looks like non-WebRender Firefox hangs the content process completely, and sometimes even crashes it. So the big difference is that WebRender moves the bad behavior into the parent process. This bug was reported on macOS where we don't have a GPU process.
Chrome doesn't seem to handle this page very well either, but it does stop hanging at some point and it doesn't crash on my machine.
Maybe the expected behavior here should be "crash and restart the GPU process without taking down the browser", or in other words, ship a GPU process on macOS.
Comment 8•5 years ago
|
||
I'm not familiar with filter management on the Gecko side. I'll defer to Markus and Matt.
Comment 9•5 years ago
|
||
Win10, GTX1060 | Behavior |
---|---|
Chrome Dev 73 | Clicking on "Destroy my firefox" instantly shows the blur, it may flicker once or twice in red, coil whine. Resizing preview area or window is slow and causes coil whine. Scrolling the "CSS" or "JS" text fields, switching tabs, etc. is smooth. |
Direct3D 11 (Advanced Layers) | Tab crash: bp-d5842d26-720f-48c8-a4a4-22e710190118 |
WebRender, "Last good" | Memory usage explosion: Whole browser + Win10 desktop are slowed down, coil whine, but blur is shown. |
WebRender, today | GPU process crash: bp-e06ee825-1ccb-4120-a737-2753e0190118, bp-9ef7a1bd-aba1-46af-b2a9-be22b0190118, Fallback to Basic compositing |
Windows users only get a fallback to Basic, but Linux and Mac users a browser crash.
non-WR does not fall back to Basic. But the current behavior might be better as Firefox keeps usable, so this could be even P4 or stage-wr-next if falling back to Basic is acceptable?
Since bug 1415020 is fixed, you could enable layers.gpu-process.enabled for MOZ_X11. (Wayland looked bad 5 months ago: bug 1481405 comment 10)
A GPU process on Mac would be nice.
Updated•5 years ago
|
Updated•5 years ago
|
Updated•5 years ago
|
Comment 10•5 years ago
|
||
I'm tempted to add a limit to WR which limits the depth of the render task tree, dropping any elements after that.
A new tree level is only introduced when there is an off-screen surface involved (e.g. mix-blend-mode, CSS filters, perspective rasterization roots).
I think it's probably reasonable, even if not technically spec compliant, to drop any elements with a render task tree of > 64 depth. I can't imagine any real world web content needing more than that, and if it does, it will render very slowly and / or exhaust memory anyway.
If we do add a limit for this, we could potentially revisit this later (esp. once we support full SVG filters), but it seems like a reasonable workaround for now.
Thoughts?
Assignee | ||
Comment 11•5 years ago
|
||
Adding a limit sounds good to me. Not sure what the number should be but making it configurable through a pref sounds good (to be able to quickly check whether the limit is causing a page to render incorrectly). If we hit this limit in meaningful ways we'll probably have to rethink how we render long filter chains anyway.
Assignee | ||
Comment 12•5 years ago
|
||
Actually there was this test case with a heap of drop shadows to create that "long shadow" effect (I don't have the bug number handy) which would break with a "small" limit on the number of filters, so we may want to pick a more conservative number.
If web authors really do chain filters such as blurs in a way similar to this test case, we could also look into simplifying filter chains by collapsing filters when possible.
Comment 13•5 years ago
|
||
Luckily that test case uses text-shadow and not filter: drop-shadow. So we don't need to take it into consideration here.
(In reply to Glenn Watson [:gw] from comment #10)
I think it's probably reasonable, even if not technically spec compliant, to drop any elements with a render task tree of > 64 depth. I can't imagine any real world web content needing more than that, and if it does, it will render very slowly and / or exhaust memory anyway.
I agree, such a limit sound reasonable. And as Nical said, let's have it be controlled by a pref so that we can identify visual issues caused by it more easily.
Assignee | ||
Comment 15•5 years ago
|
||
Updated•5 years ago
|
Comment 16•5 years ago
|
||
(In reply to Markus Stange [:mstange] from comment #7)
I'm not aware of any Gecko code that drops parts of long filter chains. But, looking at that page in non-WebRender Firefox, it looks like non-WebRender Firefox hangs the content process completely, and sometimes even crashes it. So the big difference is that WebRender moves the bad behavior into the parent process. This bug was reported on macOS where we don't have a GPU process.
Chrome doesn't seem to handle this page very well either, but it does stop hanging at some point and it doesn't crash on my machine.
Maybe the expected behavior here should be "crash and restart the GPU process without taking down the browser", or in other words, ship a GPU process on macOS.
Having a GPU process is definitely something we should aim for on all platforms for reasons like this.
That said, if we restart the GPU process, then we'll try to render the filter again, and crash again. If the eventual fallback to software (regardless of which process that happens is) then causes a hang, then we're not much better off.
Adding the pref controlled limit sounds like a great idea :)
Comment 17•5 years ago
|
||
(In reply to Matt Woodrow (:mattwoodrow) from comment #16)
That said, if we restart the GPU process, then we'll try to render the filter again, and crash again.
Hmm, unless we also crash the content process. That was the original idea in bug 1510900 - not sure if that's actually what landed?
Comment 18•5 years ago
|
||
(In reply to Markus Stange [:mstange] from comment #17)
(In reply to Matt Woodrow (:mattwoodrow) from comment #16)
That said, if we restart the GPU process, then we'll try to render the filter again, and crash again.
Hmm, unless we also crash the content process. That was the original idea in bug 1510900 - not sure if that's actually what landed?
It isn't, we just report URLs from the GPU process directly.
Comment 19•5 years ago
|
||
Comment 20•5 years ago
|
||
Pushed by mwoodrow@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/768f85969239 Limit the amount of filter ops per stacking context in WebRender. r=jrmuizel
Comment 21•5 years ago
|
||
bugherder |
Updated•5 years ago
|
Updated•5 years ago
|
Description
•