Closed
Bug 1502717
Opened 6 years ago
Closed 6 years ago
Crash in webrender::clip_scroll_tree::ClipScrollTree::get_relative_transform
Categories
(Core :: Graphics: WebRender, defect, P2)
Tracking
()
RESOLVED
FIXED
mozilla65
Version | Tracking | Status |
---|---|---|
firefox-esr60 | --- | unaffected |
firefox63 | --- | unaffected |
firefox64 | --- | unaffected |
firefox65 | --- | fixed |
People
(Reporter: calixte, Assigned: gw)
References
(Blocks 2 open bugs)
Details
(Keywords: crash, regression)
Crash Data
Attachments
(2 files)
3.08 KB, patch
47 bytes, text/x-phabricator-request
This bug was filed from the Socorro interface and is
report bp-a8e5a4cc-a547-4cdb-a51c-bc9240181027.
=============================================================
Top 10 frames of crashing thread:
0 xul.dll static void std::panicking::rust_panic_with_hook src/libstd/panicking.rs:485
1 xul.dll static void std::panicking::continue_panic_fmt src/libstd/panicking.rs:390
2 xul.dll static void std::panicking::rust_begin_panic src/libstd/panicking.rs:325
3 xul.dll static void core::panicking::panic_fmt src/libcore/panicking.rs:77
4 xul.dll static void core::panicking::panic_bounds_check src/libcore/panicking.rs:59
5 xul.dll static union core::option::Option<euclid::transform3d::TypedTransform3D<f32, webrender_api::units::LayoutPixel, webrender_api::units::LayoutPixel>> webrender::clip_scroll_tree::ClipScrollTree::get_relative_transform gfx/webrender/src/clip_scroll_tree.rs:158
6 xul.dll static void webrender::prim_store::PrimitiveStore::update_picture gfx/webrender/src/prim_store.rs:1667
7 xul.dll static void webrender::prim_store::PrimitiveStore::update_picture gfx/webrender/src/prim_store.rs:1672
8 xul.dll static void webrender::prim_store::PrimitiveStore::update_picture gfx/webrender/src/prim_store.rs:1672
9 xul.dll static void webrender::prim_store::PrimitiveStore::update_picture gfx/webrender/src/prim_store.rs:1672
=============================================================
There are 45 crashes (from 32 installations) in nightly 65, starting with build ID 20181026100128. Analysis of the backtrace suggests the regression may have been introduced by patch [1], which landed to fix bug 1501675.
[1] https://hg.mozilla.org/mozilla-central/rev?node=13372afaba77
Flags: needinfo?(kats)
Comment 1•6 years ago
Flags: needinfo?(gwatson)
Updated•6 years ago
Blocks: stage-wr-trains, wr-stability
Priority: -- → P2
Updated•6 years ago
Flags: needinfo?(kats)
Assignee
Comment 2•6 years ago
From the backtraces, this looks to be an out-of-bounds array indexing bug.
It's one of those bugs that is probably a trivial fix once we have a repro, but very difficult to work out what's going on otherwise.
Do we have any URL and/or repro steps for this?
Flags: needinfo?(gwatson)
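For readers unfamiliar with the signature: frame 4 of the backtrace, `core::panicking::panic_bounds_check`, is what Rust calls when a slice or `Vec` is indexed past its length. The following is a minimal sketch of that behaviour, not WebRender code; the helper names `lookup_panicking` and `lookup_checked` are hypothetical and exist only for illustration.

```rust
/// Indexing with `[]` is bounds-checked: if `index >= nodes.len()`,
/// the program panics instead of reading past the end of the slice.
fn lookup_panicking(nodes: &[u32], index: usize) -> u32 {
    nodes[index]
}

/// `get` performs the same check but reports an out-of-range index
/// as `None`, leaving the caller to decide how to handle it.
fn lookup_checked(nodes: &[u32], index: usize) -> Option<u32> {
    nodes.get(index).copied()
}
```

With a three-element slice, `lookup_checked(&nodes, 3)` yields `None`, whereas `lookup_panicking(&nodes, 3)` would panic, which in the GPU process shows up as a crash like the one reported here.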
Assignee
Comment 3•6 years ago
I think it's also likely that this crash existed previously, but has just been moved around to a different part of the code, as a result of the patch referenced above.
I vaguely recall seeing this crash signature in a previous bug, but I don't see it anymore in the P2/P3 bug lists - I might just have missed it.
Reporter
Comment 4•6 years ago
Unfortunately, there are neither URLs nor user comments in the crash reports.
All the crashes are in the GPU process (and it seems that we don't have URLs in general for GPU process crashes).
Comment 5•6 years ago
(In reply to Glenn Watson [:gw] from comment #3)
> I think it's also likely that this crash existed previously, but has just
> been moved around to a different part of the code, as a result of the patch
> referenced above.
>
> I vaguely recall seeing this crash signature in a previous bug, but I don't
> see it anymore in the P2/P3 bug lists - I might just have missed it.
Bug 1486218 is probably it. It is blocked against wr-stage-next, not wr-stage-trains :).
See Also: → 1486218
Comment 6•6 years ago
I'll take this for now, and see if I can puzzle it out the hard way. It is the top crash WR users experience on nightly, so it is only going to get worse on the next beta cycle.
Assignee: nobody → aosmond
Comment 7•6 years ago
I have STR from nightly on macOS:
- Go to https://www.theburgerspriest.com/secret-menu/
- Click on "370 days"
- Click on the "Religious Hypocrite" or the "Jarge Style Shake"
https://crash-stats.mozilla.com/report/index/578e4e4b-4704-4bee-98d2-df64e0181109
Comment 8•6 years ago
(This is in my regular browsing profile; I didn't try on a clean one. Let me know if you can or cannot repro. Will check back after I pick up some burgers :))
Comment 9•6 years ago
Thanks kats! I was able to reproduce as well. Looks like it is the signature variant from bug 1486218, but I suspect the same root cause.
Comment 10•6 years ago
Examining it in gdb (unfortunately no luck reproducing it under rr) has convinced me they are indeed the same bug. In the signature for this bug, we ran past the end of the buffer. In the signature for the other bug, the underlying allocated buffer was large enough to hold the dereferenced element, but fewer values had actually been stored in it (i.e. we just read garbage).
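The difference between the two signatures can be pictured with a `Vec` whose allocation is larger than its initialized length. This is a minimal sketch, not WebRender code; `IndexStatus` and `classify_index` are hypothetical names used only to label the cases described above.

```rust
#[derive(Debug, PartialEq)]
enum IndexStatus {
    /// The index refers to a value that was actually stored.
    Valid,
    /// The index is inside the allocation but past `len`:
    /// an unchecked read here would return garbage.
    InsideCapacityOnly,
    /// The index is past the end of the allocation itself.
    OutsideAllocation,
}

/// Classifies `index` against both the length and the capacity of `buf`.
fn classify_index<T>(buf: &Vec<T>, index: usize) -> IndexStatus {
    if index < buf.len() {
        IndexStatus::Valid
    } else if index < buf.capacity() {
        IndexStatus::InsideCapacityOnly
    } else {
        IndexStatus::OutsideAllocation
    }
}
```

Note that safe indexing (`buf[index]`) panics in both non-`Valid` cases, since it checks against `len`, not `capacity`. The sketch only illustrates why the same bad index can look different depending on how large the backing buffer happens to be.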
Comment 11•6 years ago
Comment 13•6 years ago
This fixes kats' STR, although I doubt it is the proper fix. I got a reproduction in rr by modifying the STR to iterate through all of the burgers until the crash happens, in case it is not hit immediately. Unfortunately rr isn't cooperating: apart from the one time it hit my breakpoint, I haven't been able to get back to where it adds the spatial nodes in the first place (to figure out why the IDs are bad).
Comment 14•6 years ago
So it looks like the spatial node hierarchy is correct, and the invariant of the method holds relative to it. It seems to be the coordinate system ID hierarchy that violates that assumption.
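To make the distinction concrete, here is a hypothetical sketch of walking a coordinate-system parent chain with checked lookups. It is not the WebRender implementation: `CoordSystem`, `chain_to_ancestor`, and the data layout are assumptions made purely for illustration. The walk succeeds only if the target really is an ancestor in this hierarchy; otherwise it reports failure instead of indexing out of bounds.

```rust
/// Hypothetical coordinate-system entry: it only records its parent's index.
struct CoordSystem {
    parent: Option<usize>,
}

/// Collects the ids visited while walking from `child` up to `ancestor`.
/// Returns `None` if an id is out of range or the root is reached first,
/// i.e. when the "ancestor" assumption does not hold for this hierarchy.
fn chain_to_ancestor(
    systems: &[CoordSystem],
    child: usize,
    ancestor: usize,
) -> Option<Vec<usize>> {
    let mut chain = Vec::new();
    let mut current = child;
    // Bound the walk so a malformed (cyclic) hierarchy cannot loop forever.
    for _ in 0..=systems.len() {
        if current == ancestor {
            return Some(chain);
        }
        chain.push(current);
        // Checked lookup: a bad id or a missing parent ends the walk.
        current = systems.get(current)?.parent?;
    }
    None
}
```

In this sketch, two nodes can be correctly related in one tree (say, the spatial node tree) while a walk over a second, parallel hierarchy never reaches the expected ancestor. A version that indexes unconditionally would then panic or read a bad entry.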
Assignee
Comment 15•6 years ago
Opened a PR with a fix - https://github.com/servo/webrender/pull/3316 - still needs a try run to see if it breaks anything, but it fixes the https://www.theburgerspriest.com/secret-menu/ case for me locally at least.
Assignee
Comment 16•6 years ago
The try run is good, and this is merging now, so should be in the next WR update for nightly.
Comment 17•6 years ago
I'm going to land servo/webrender#3316 out-of-band as part of this bug, as it resolves the crashtest failures that are blocking the regular WR update process.
Assignee: aosmond → gwatson
See Also: → https://github.com/servo/webrender/pull/3316
Comment 18•6 years ago
Comment 19•6 years ago
Comment 20•6 years ago
Pushed by kgupta@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/0f0b77b3d36e
Cherry pick webrender commit 7f889ccf165ef0bcbf3688ccb1c51bddd84a7b6f (WR PR #3316). r=kats
Comment 22•6 years ago
bugherder
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla65
Comment 24•6 years ago
I am still getting this issue. As mentioned in bug 1507323, which was marked as a duplicate of this bug, visiting https://drive.google.com causes an instant crash that takes the entire browser down. Today's crash is https://crash-stats.mozilla.com/report/index/3d29810a-ba7d-4d77-b337-17cd30181114
Status: RESOLVED → REOPENED
Flags: needinfo?(gwatson)
Resolution: FIXED → ---
Comment 25•6 years ago
I'll un-dupe bug 1507323 to track the remaining crashes.
Status: REOPENED → RESOLVED
Closed: 6 years ago → 6 years ago
Flags: needinfo?(gwatson)
Resolution: --- → FIXED
Comment 26•6 years ago
Still crashes on https://www.rondomark.jp/
Firefox nightly 66.0a1 (2018-12-20) (64-bit), Build ID 20181220215605. Windows 10 1809 64-bit.
Comment 27•6 years ago
Same crash signature as resolved duplicate bug 1507705.
bp-c681d5be-233d-45e7-941c-351040181221