Crash in [@ core::option::expect_failed | webrender::spatial_tree::SpatialTree::get_relative_transform]
Categories
(Core :: Graphics: WebRender, defect, P3)
Tracking
()
Tracking | Status | |
---|---|---|
firefox-esr68 | --- | unaffected |
firefox76 | --- | wontfix |
firefox77 | --- | wontfix |
firefox78 | --- | fixed |
People
(Reporter: gsvelto, Assigned: gw)
References
(Regression)
Details
(Keywords: crash, regression, testcase)
Crash Data
Attachments
(5 files)
This bug is for crash report bp-4d67e4a7-c97a-4c22-8244-f7e6a0200223.
Top 10 frames of crashing thread:
0 xul.dll RustMozCrash mozglue/static/rust/wrappers.cpp:16
1 xul.dll mozglue_static::panic_hook mozglue/static/rust/lib.rs:89
2 xul.dll core::ops::function::Fn::call<fn src/libcore/ops/function.rs:232
3 xul.dll std::panicking::rust_panic_with_hook src/libstd/panicking.rs:475
4 xul.dll std::panicking::begin_panic_handler src/libstd/panicking.rs:375
5 xul.dll core::panicking::panic_fmt src/libcore/panicking.rs:84
6 xul.dll core::option::expect_failed src/libcore/option.rs:1188
7 xul.dll webrender::spatial_tree::SpatialTree::get_relative_transform gfx/wr/webrender/src/spatial_tree.rs:297
8 xul.dll webrender::prim_store::SpaceMapper<webrender_api::units::LayoutPixel, webrender_api::units::PicturePixel>::set_target_spatial_node<webrender_api::units::LayoutPixel, webrender_api::units::PicturePixel> gfx/wr/webrender/src/prim_store/mod.rs:286
9 xul.dll webrender::picture::PicturePrimitive::post_update gfx/wr/webrender/src/picture.rs:5100
This seems to have started in buildid 20200130095303. The raw crash reason is: invalid parent!
Comment 2•4 years ago
•
|
||
(In reply to Jeff Muizelaar [:jrmuizel] from comment #1)
I wonder if this is related to bug 1589800
It is difficult to say from just this information. It might be possible if the crashing sites use blend modes (or break RDL in other ways).
Comment 3•4 years ago
|
||
This looks like it might be fixed.
Comment 4•4 years ago
|
||
It looks like it might not be fixed, Miko - any other info that might help here?
Comment 5•4 years ago
|
||
(In reply to Jessie [:jbonisteel] pls NI from comment #4)
It looks like it might not be fixed, Miko - any other info that might help here?
The four crash reports we have received in the last week seem to be from the same person and within two minutes. Besides checking which site crashed, I do not think there is much we can do here.
Comment 6•4 years ago
|
||
Updated•4 years ago
|
Updated•4 years ago
|
Comment 7•4 years ago
|
||
Miko, it looks like this crash has increased a bit again
Updated•4 years ago
|
Comment 8•4 years ago
|
||
Let's continue to wait and see
Updated•4 years ago
|
Updated•4 years ago
|
Comment 9•4 years ago
|
||
Comment 10•4 years ago
|
||
(In reply to Jeff Muizelaar [:jrmuizel] from comment #9)
Some more urls:
https://destroyallhumansgame.com/
https://www.crehana.com/home/
https://www.rickandmorty.com/videos/story-lord
Unfortunately none of these URLs crash the latest Nightly (WR enabled) on my Windows desktop (Nvidia GPU) or MBP.
Looking at the crash reports, it seems plausible that this is caused by bad interaction with WebRender and video decoding. For example, this crash1 has GraphicsCriticalError: "updating unknown shared surface" and it's followed by another crash2 four seconds later with "DXVA2D3D11? DXVA2D3D11+ " flags missing. More pairs like this: (3, 4), (5, 6), (7, 8).
Alternatively, this could be just GPU crash recovery not working properly.
Comment 11•4 years ago
•
|
||
I can repro on https://destroyallhumansgame.com/
https://crash-stats.mozilla.org/report/index/05c9f596-ac6a-4c90-86f5-7e3580200515#tab-bugzilla
The browser flashed white before the page fully loads. So to repro it 100%, the trick might be to keep the tab laoding that URL as he active tab(keep the tab in the foreground)
Edit: The page contents also end up on top of the URL bar. I tried to file a new bug, but bugzilla had some error)
Edit2: If I manually set gfx.webrender.compositor to false, I dont get the crash
Comment 12•4 years ago
|
||
Comment 13•4 years ago
|
||
Regression points to :
2020-05-16T11:42:35.242000: DEBUG : Found commit message:
Bug 1631312. Move the check for UseWebRenderDCompWin(). r=sotaro
Bug 1630629 moved the check UseWebRenderDCompWin before it had been
initialized. This moves it below so that DirectComposition isn't
disabled everywhere.
Differential Revision: https://phabricator.services.mozilla.com/D71498
2020-05-16T11:42:35.242000: DEBUG : Did not find a branch, checking all integration branches
2020-05-16T11:42:35.242000: INFO : The bisection is done.
2020-05-16T11:42:35.242000: INFO : Stopped
Comment 14•4 years ago
|
||
Updated•4 years ago
|
Comment 15•4 years ago
•
|
||
This is the minimized testcase from https://destroyallhumansgame.com/ . This crashes reliably for me if I enable WR compositor (which is enabled default for me)
If I manually disable the WR compositor, the tab just... freezes
Edit: testcase fixed by https://phabricator.services.mozilla.com/D75465 , which landed today
Updated•4 years ago
|
Comment 16•4 years ago
|
||
further regression with compositor force enabled, and gfx.webrender.quality.force-subpixel-aa-where-possible set to true
2020-05-17T17:45:07.048000: DEBUG : Found commit message:
Bug 1592891 - Disable picture caching when picture caches have complex transforms. r=kvark,nical
With the recent changes to compositing in WR, the scene must either
only produce produce cache tiles, or disable picture caching and
rasterize directly.
This patch removes the (currently broken) path where WR would attempt
to disable only the picture cache slice(s) that have a complex
transform.
In future, we should disable creating picture cache slices with
complex transforms at the API level, and remove this path completely.
Differential Revision: https://phabricator.services.mozilla.com/D51228
2020-05-17T17:45:07.048000: DEBUG : Did not find a branch, checking all integration branches
2020-05-17T17:45:07.048000: INFO : The bisection is done.
2020-05-17T17:45:07.064000: INFO : Stopped
Assignee | ||
Comment 17•4 years ago
|
||
I wasn't able to reproduce this on any of the URLs or the attached test case.
I think I had all the config values set correctly, but it's possible I missed one, or there is something else different between our systems causing it to not occur here.
Could you change layout.display-list.retain
in about:config
to false, restart the browser and see if you can still reproduce the bug with that setting changed? That might help us identify if it's similar to https://bugzilla.mozilla.org/show_bug.cgi?id=1589800.
Comment 18•4 years ago
•
|
||
(In reply to Glenn Watson [:gw] from comment #17)
I wasn't able to reproduce this on any of the URLs or the attached test case.
I think I had all the config values set correctly, but it's possible I missed one, or there is something else different between our systems causing it to not occur here.
Could you change
layout.display-list.retain
inabout:config
to false, restart the browser and see if you can still reproduce the bug with that setting changed? That might help us identify if it's similar to https://bugzilla.mozilla.org/show_bug.cgi?id=1589800.
layout.display-list.retain = false or true has no effect on the crash. What makes Nightly crash are 2 options:
gfx.webrender.picture-caching = true (If I disable picture caching, I dont get any crash)
gfx.webrender.quality.force-subpixel-aa-where-possible = true (this pref wasn't needed before https://phabricator.services.mozilla.com/D75465)
and the compositor enabled.
Other factor could be intel iGPU notebook, screen resolution 1360x768
Assignee | ||
Comment 19•4 years ago
|
||
I've re-tested with the attached test case and URLs, and ensuring that those values are set. I tested at the 1366 x 768 resolution you listed, and some other resolutions, on both Linux and Windows. I also tried with both compositor enabled/disabled.
So far, it still doesn't reproduce locally for me, so there must be some other factor.
Miko, Jeff, could you see if you can reproduce locally?
Comment 20•4 years ago
|
||
I was able to repro the crash using https://destroyallhumansgame.com/. The other URLs given produced no crashes for me.
Comment 21•4 years ago
|
||
(In reply to Mayank Bansal from comment #18)
gfx.webrender.picture-caching = true (If I disable picture caching, I dont get any crash)
gfx.webrender.quality.force-subpixel-aa-where-possible = true (this pref wasn't needed before https://phabricator.services.mozilla.com/D75465)
and the compositor enabled.
Other factor could be intel iGPU notebook, screen resolution 1360x768
This was great information. I can reproduce the crash with the minified testcase on my desktop machine with Nvidia/Win10, when I enable gfx.webrender.quality.force-subpixel-aa-where-possible = true
.
Comment 22•4 years ago
|
||
I was able to repro the crash using the minimized testcase. I've attached my about:support output for debugging purposes.
Updated•4 years ago
|
Assignee | ||
Comment 23•4 years ago
|
||
I'm not sure what changed overnight, but I can now reproduce this on my current local build with the test case, investigating today.
Assignee | ||
Comment 24•4 years ago
|
||
I can see what is occurring, and why it only triggers in these very specific circumstances.
For reference, the spatial tree for the test case is below.
We end up with an invalid state for picture caching, due to finding a scrollframe embedded within a reference frame that contains a perspective transform.
We can't detect this during scene building, as the value of the transform may be animated by property animation during frame building.
Normally, when we encounter this case we temporarily disable picture caching and draw straight to the framebuffer. However, when we have native OS compositor enabled, the code to disable picture caching fails, since native compositor mode relies on picture caching being enabled.
There are a number of potential fixes for this case, including:
- At the start of the frame build, change the spatial node of the problematic picture cache to be the root spatial node. This would invalidate things but mean we don't hit the assertion in this case.
- Support a mode with native compositor enabled where we create a single framebuffer sized tile, and draw to that with picture caching disabled, similar to how we currently do without native compositing.
I'll experiment with these and see if I run into any unforeseen issues.
┌ spatial tree
│ ├─ ReferenceFrame
│ │ ├─ kind: Transform
│ │ ├─ transform_style: Flat
│ │ ├─ source_transform: Value([I])
│ │ ├─ origin_in_parent_reference_frame: (0.0,0.0)
│ │ ├─ index: SpatialNodeIndex(0)
│ │ ├─ content_transform: ScaleOffset { scale: (1.0,1.0), offset: (0.0,0.0) }
│ │ ├─ viewport_transform: ScaleOffset { scale: (1.0,1.0), offset: (0.0,0.0) }
│ │ ├─ snapping_transform: Some(ScaleOffset { scale: (1.0,1.0), offset: (0.0,0.0) })
│ │ ├─ coordinate_system_id: CoordinateSystemId(0)
│ │ ├─ ScrollFrame
│ │ │ ├─ viewport: Rect(1920.0×2099.0 at (0.0,0.0))
│ │ │ ├─ scrollable_size: 0.0×0.0
│ │ │ ├─ scroll offset: (0.0,0.0)
│ │ │ ├─ external_scroll_offset: (0.0,0.0)
│ │ │ ├─ kind: PipelineRoot
│ │ │ ├─ index: SpatialNodeIndex(1)
│ │ │ ├─ content_transform: ScaleOffset { scale: (1.0,1.0), offset: (0.0,0.0) }
│ │ │ ├─ viewport_transform: ScaleOffset { scale: (1.0,1.0), offset: (0.0,0.0) }
│ │ │ ├─ snapping_transform: Some(ScaleOffset { scale: (1.0,1.0), offset: (0.0,0.0) })
│ │ │ ├─ coordinate_system_id: CoordinateSystemId(0)
│ │ │ ├─ ReferenceFrame
│ │ │ │ ├─ kind: Transform
│ │ │ │ ├─ transform_style: Flat
│ │ │ │ ├─ source_transform: Value([I])
│ │ │ │ ├─ origin_in_parent_reference_frame: (0.0,112.0)
│ │ │ │ ├─ index: SpatialNodeIndex(2)
│ │ │ │ ├─ content_transform: ScaleOffset { scale: (1.0,1.0), offset: (0.0,112.0) }
│ │ │ │ ├─ viewport_transform: ScaleOffset { scale: (1.0,1.0), offset: (0.0,112.0) }
│ │ │ │ ├─ snapping_transform: Some(ScaleOffset { scale: (1.0,1.0), offset: (0.0,112.0) })
│ │ │ │ ├─ coordinate_system_id: CoordinateSystemId(0)
│ │ │ │ ├─ ScrollFrame
│ │ │ │ │ ├─ viewport: Rect(1920.0×1989.0 at (0.0,0.0))
│ │ │ │ │ ├─ scrollable_size: 0.0×0.0
│ │ │ │ │ ├─ scroll offset: (0.0,0.0)
│ │ │ │ │ ├─ external_scroll_offset: (0.0,0.0)
│ │ │ │ │ ├─ kind: PipelineRoot
│ │ │ │ │ ├─ index: SpatialNodeIndex(3)
│ │ │ │ │ ├─ content_transform: ScaleOffset { scale: (1.0,1.0), offset: (0.0,112.0) }
│ │ │ │ │ ├─ viewport_transform: ScaleOffset { scale: (1.0,1.0), offset: (0.0,112.0) }
│ │ │ │ │ ├─ snapping_transform: Some(ScaleOffset { scale: (1.0,1.0), offset: (0.0,112.0) })
│ │ │ │ │ ├─ coordinate_system_id: CoordinateSystemId(0)
│ │ │ │ │ ├─ ScrollFrame
│ │ │ │ │ │ ├─ viewport: Rect(1920.0×1989.0 at (0.0,0.0))
│ │ │ │ │ │ ├─ scrollable_size: 0.0×0.0
│ │ │ │ │ │ ├─ scroll offset: (0.0,0.0)
│ │ │ │ │ │ ├─ external_scroll_offset: (0.0,0.0)
│ │ │ │ │ │ ├─ kind: Explicit
│ │ │ │ │ │ ├─ index: SpatialNodeIndex(4)
│ │ │ │ │ │ ├─ content_transform: ScaleOffset { scale: (1.0,1.0), offset: (0.0,112.0) }
│ │ │ │ │ │ ├─ viewport_transform: ScaleOffset { scale: (1.0,1.0), offset: (0.0,112.0) }
│ │ │ │ │ │ ├─ snapping_transform: Some(ScaleOffset { scale: (1.0,1.0), offset: (0.0,112.0) })
│ │ │ │ │ │ └─ coordinate_system_id: CoordinateSystemId(0)
│ │ │ │ │ ├─ ReferenceFrame
│ │ │ │ │ │ ├─ kind: Transform
│ │ │ │ │ │ ├─ transform_style: Flat
│ │ │ │ │ │ ├─ source_transform: Value([0.8660254, 0.5, 0.0, 0.0, -0.5, 0.8660254, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 133.58435, -478.66864, 0.0, 1.0])
│ │ │ │ │ │ ├─ origin_in_parent_reference_frame: (12.0,12.0)
│ │ │ │ │ │ ├─ index: SpatialNodeIndex(5)
│ │ │ │ │ │ ├─ content_transform: ScaleOffset { scale: (1.0,1.0), offset: (0.0,0.0) }
│ │ │ │ │ │ ├─ viewport_transform: ScaleOffset { scale: (1.0,1.0), offset: (0.0,0.0) }
│ │ │ │ │ │ ├─ snapping_transform: None
│ │ │ │ │ │ ├─ coordinate_system_id: CoordinateSystemId(1)
│ │ │ │ │ │ ├─ ScrollFrame
│ │ │ │ │ │ │ ├─ viewport: Rect(1920.0×19.875 at (0.0,0.0))
│ │ │ │ │ │ │ ├─ scrollable_size: 0.0×1908.0
│ │ │ │ │ │ │ ├─ scroll offset: (0.0,0.0)
│ │ │ │ │ │ │ ├─ external_scroll_offset: (0.0,0.0)
│ │ │ │ │ │ │ ├─ kind: Explicit
│ │ │ │ │ │ │ ├─ index: SpatialNodeIndex(6)
│ │ │ │ │ │ │ ├─ content_transform: ScaleOffset { scale: (1.0,1.0), offset: (0.0,0.0) }
│ │ │ │ │ │ │ ├─ viewport_transform: ScaleOffset { scale: (1.0,1.0), offset: (0.0,0.0) }
│ │ │ │ │ │ │ ├─ snapping_transform: None
│ │ │ │ │ │ │ └─ coordinate_system_id: CoordinateSystemId(1)
│ │ │ │ │ ├─ ReferenceFrame
│ │ │ │ │ │ ├─ kind: Transform
│ │ │ │ │ │ ├─ transform_style: Flat
│ │ │ │ │ │ ├─ source_transform: Value([0.8660254, 0.5, 0.0, 0.0, -0.5, 0.8660254, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 140.6156, -476.7846, 0.0, 1.0])
│ │ │ │ │ │ ├─ origin_in_parent_reference_frame: (12.0,12.0)
│ │ │ │ │ │ ├─ index: SpatialNodeIndex(7)
│ │ │ │ │ │ ├─ content_transform: ScaleOffset { scale: (1.0,1.0), offset: (0.0,0.0) }
│ │ │ │ │ │ ├─ viewport_transform: ScaleOffset { scale: (1.0,1.0), offset: (0.0,0.0) }
│ │ │ │ │ │ ├─ snapping_transform: None
│ │ │ │ │ │ └─ coordinate_system_id: CoordinateSystemId(2)
Assignee | ||
Comment 25•4 years ago
|
||
Previously, WR would attempt to detect at the start of frame
building if the spatial node of any picture cache contained
a non-axis-aligned transform, and disable picture caching in
that edge case.
However, picture caching can't (currently) be disabled when the
native compositor is active. In this mode, picture caching was
force enabled, causing an assertion failure due to unexpected
coordinate systems when updating pictures.
This patch changes the way the detection of scroll root logic
works such that we don't consider any scroll frame inside a
reference frame to be a valid scroll root for picture caching
purposes. Thus it's not possible to create a picture cache
where the reference spatial node has a non-axis-aligned transform.
Assignee | ||
Comment 26•4 years ago
|
||
I think the attached patch is a good solution, although I need to do a try run and make sure it doesn't break anything else.
The only issue I can see this causing is if we ever insert zooming reference frame transforms as parents of scroll frames. From what I can tell, I don't think we do this (I think the zooming reference frame is a child of the scroll root). If we do, we can still use this method, but we'd need to add an additional tag to reference frames, which says "This is a zoom transform, it will always be axis-aligned" or similar.
kats, botond, jnicol, does that sound fine to you?
Updated•4 years ago
|
Comment 27•4 years ago
|
||
If I understand correctly (and it's quite possible that I don't), I think this is ok. My understanding is that the zooming reference frame is descendant of the root scrollframe for a pipeline, because there are things inside that scrollframe (e.g. scrollbars) that are outside the zooming reference frame and don't get affected by zooming. So your code that discards any scrollframes inside reference frames should be ok, as it will still find the root scrollframe for the pipeline as being outside any reference frames.
(In reply to Glenn Watson [:gw] from comment #26)
If we do, we can still use this method, but we'd need to add an additional tag to reference frames, which says "This is a zoom transform, it will always be axis-aligned" or similar.
This should be easy to do if it's needed, as we can easily identify (on the gecko side) the stacking context/reference frame that was inserted for the zoom transform.
Assignee | ||
Comment 28•4 years ago
|
||
(In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #27)
If I understand correctly (and it's quite possible that I don't), I think this is ok. My understanding is that the zooming reference frame is descendant of the root scrollframe for a pipeline, because there are things inside that scrollframe (e.g. scrollbars) that are outside the zooming reference frame and don't get affected by zooming. So your code that discards any scrollframes inside reference frames should be ok, as it will still find the root scrollframe for the pipeline as being outside any reference frames.
(In reply to Glenn Watson [:gw] from comment #26)
If we do, we can still use this method, but we'd need to add an additional tag to reference frames, which says "This is a zoom transform, it will always be axis-aligned" or similar.
This should be easy to do if it's needed, as we can easily identify (on the gecko side) the stacking context/reference frame that was inserted for the zoom transform.
It does seem that on android there are reference frames inserted between the main scroll frame and the pipeline root. Below is the spatial tree for a reftest on android that is failing. It's not clear to me yet why there are two reference frames with property animations. I'll do some more investigations on the Gecko side of what's setting them.
ReferenceFrame
kind: Transform
transform_style: Flat
source_transform: Value([I])
origin_in_parent_reference_frame: (0.0,0.0)
index: SpatialNodeIndex(0)
ScrollFrame
viewport: Rect(800.0×1136.0 at (0.0,0.0))
scrollable_size: 0.0×0.0
scroll offset: (0.0,0.0)
external_scroll_offset: (0.0,0.0)
kind: PipelineRoot
index: SpatialNodeIndex(1)
ReferenceFrame
kind: Transform
transform_style: Flat
source_transform: Value([I])
origin_in_parent_reference_frame: (0.0,0.0)
index: SpatialNodeIndex(2)
ScrollFrame
viewport: Rect(800.0×1000.0 at (0.0,0.0))
scrollable_size: 0.0×0.0
scroll offset: (0.0,0.0)
external_scroll_offset: (0.0,0.0)
kind: PipelineRoot
index: SpatialNodeIndex(3)
ReferenceFrame
kind: Transform
transform_style: Flat
source_transform: Binding(PropertyBindingKey { id: PropertyBindingId { namespace: IdNamespace(11895), uid: 1 }, _phantom: PhantomData }, [I])
origin_in_parent_reference_frame: (0.0,0.0)
index: SpatialNodeIndex(4)
ScrollFrame
viewport: Rect(325.0×406.25 at (0.0,0.0))
scrollable_size: 0.0×7837.783
scroll offset: (0.0,0.0)
external_scroll_offset: (0.0,0.0)
kind: Explicit
index: SpatialNodeIndex(5)
ReferenceFrame
kind: Transform
transform_style: Flat
source_transform: Binding(PropertyBindingKey { id: PropertyBindingId { namespace: IdNamespace(11895), uid: 2 }, _phantom: PhantomData }, [I])
origin_in_parent_reference_frame: (0.0,0.0)
index: SpatialNodeIndex(6)
ReferenceFrame
kind: Transform
transform_style: Flat
source_transform: Value([I])
origin_in_parent_reference_frame: (0.0,1000.0)
index: SpatialNodeIndex(7)
ScrollFrame
viewport: Rect(800.0×136.0 at (0.0,0.0))
scrollable_size: 0.0×0.0
scroll offset: (0.0,0.0)
external_scroll_offset: (0.0,0.0)
kind: PipelineRoot
index: SpatialNodeIndex(8)
Assignee | ||
Comment 29•4 years ago
|
||
Patch is updated with support for Zoom
reference frame kinds, which fixes the android test failures and retains the original bug fix.
Comment 30•4 years ago
|
||
Pushed by gwatson@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/b04340bad99d Fix crash in get_relative_transform edge case. r=jnicol
Comment 31•4 years ago
|
||
bugherder |
Updated•4 years ago
|
Description
•