Frequent Beta Crash in [@ core::option::expect_failed | core::ops::function::FnOnce::call_once<T>]
Categories
(Toolkit :: Telemetry, defect, P1)
Tracking
()
Release | Tracking | Status |
---|---|---|
firefox-esr78 | --- | unaffected |
firefox83 | --- | unaffected |
firefox84 | --- | unaffected |
firefox85 | blocking | verified |
firefox86 | --- | verified |
People
(Reporter: aryx, Assigned: janerik)
Details
(Keywords: crash, regression, topcrash)
Attachments
(2 files)
42 bytes, text/x-github-pull-request
48 bytes, text/x-phabricator-request (jcristau: approval-mozilla-beta+)
This only affects Firefox 85.0b1 so far. It's frequent.
Crash report: https://crash-stats.mozilla.org/report/index/13e1daf2-1669-48ff-9114-90f0f0201215
MOZ_CRASH Reason: Global Glean object not initialized
Top 10 frames of crashing thread:
0 xul.dll RustMozCrash mozglue/static/rust/wrappers.cpp:16
1 xul.dll mozglue_static::panic_hook mozglue/static/rust/lib.rs:89
2 xul.dll core::ops::function::Fn::call<fn ../7eac88abb2e57e752f3302f02be5f3ce3d7adfb4/library/core/src/ops/function.rs:227
3 xul.dll std::panicking::rust_panic_with_hook ../7eac88abb2e57e752f3302f02be5f3ce3d7adfb4//library/std/src/panicking.rs:581
4 xul.dll std::panicking::begin_panic_handler::{{closure}} ../7eac88abb2e57e752f3302f02be5f3ce3d7adfb4//library/std/src/panicking.rs:484
5 xul.dll std::sys_common::backtrace::__rust_end_short_backtrace<closure-0, !> ../7eac88abb2e57e752f3302f02be5f3ce3d7adfb4//library/std/src/sys_common/backtrace.rs:153
6 xul.dll std::panicking::begin_panic_handler ../7eac88abb2e57e752f3302f02be5f3ce3d7adfb4//library/std/src/panicking.rs:483
7 xul.dll core::panicking::panic_fmt ../7eac88abb2e57e752f3302f02be5f3ce3d7adfb4//library/core/src/panicking.rs:85
8 xul.dll core::option::expect_failed ../7eac88abb2e57e752f3302f02be5f3ce3d7adfb4//library/core/src/option.rs:1226
9 xul.dll core::ops::function::FnOnce::call_once<closure-0, tuple<>> ../7eac88abb2e57e752f3302f02be5f3ce3d7adfb4/library/core/src/ops/function.rs:227
Comment 2•4 years ago
Jeepers, we need to get a better stack than this somehow.
Alrighty, so this crash can happen in one of three places: one in FOG and two in the Glean SDK.
FOG
FOG can call expect() and crash with that message within with_glean in the fog crate, if that runs before init.
From searchfox, calling with_glean (regardless of before or after init) is only done in datetimes, string lists, and timing distributions, none of which exist to be called (see metrics.yaml).
Verdict: Likely isn't FOG
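For readers following along, here is a minimal sketch of how an accessor in the style of with_glean can produce this exact panic. It is not the Glean or FOG source: the Glean struct, setup_glean, and try_with_glean are hypothetical names, and only the panic message is taken from the crash report. The core::option::expect_failed frame in the stack is consistent with an expect() on an empty Option like the one below.

```rust
use std::sync::{Mutex, OnceLock};

/// Hypothetical stand-in for the real Glean object.
pub struct Glean {
    pub upload_enabled: bool,
}

/// Global slot that stays empty until initialization has completed.
static GLEAN: OnceLock<Mutex<Glean>> = OnceLock::new();

/// Store the global object. Returns false if it was already set.
/// (Hypothetical helper, named for illustration only.)
pub fn setup_glean(glean: Glean) -> bool {
    GLEAN.set(Mutex::new(glean)).is_ok()
}

/// Run `f` with the global object.
/// Panics via `Option::expect` while the slot is still empty.
pub fn with_glean<F, R>(f: F) -> R
where
    F: FnOnce(&Glean) -> R,
{
    let lock = GLEAN.get().expect("Global Glean object not initialized");
    let glean = lock.lock().unwrap();
    f(&glean)
}

/// Non-panicking variant (an assumption, not something from this bug):
/// returns None while the slot is empty or the lock is poisoned.
pub fn try_with_glean<F, R>(f: F) -> Option<R>
where
    F: FnOnce(&Glean) -> R,
{
    let glean = GLEAN.get()?.lock().ok()?;
    Some(f(&glean))
}
```

Calling with_glean before setup_glean panics with the message from the report, while try_with_glean returns None in that same window.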
Glean SDK
Same thing, but for both with_glean and with_glean_mut. A lot of these are dealt with by being launched by, or happening after a block against, the Dispatcher, which doesn't come out of its hole until after init.
There are a few exceptions, mostly inside named threads (like glean.uploader and glean.init), but the crashing thread in the report (37) is nameless.
There are a few suspicious cases around things like set_debug_view_tag, set_source_tags, and set_log_pings, where we only check whether initialize was called, not that initialize had completed and the global Glean object is present.
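To make the "called" versus "completed" distinction concrete, here is a small sketch. Everything in it is hypothetical (begin_initialize, finish_initialize, set_tag_weak_guard, and the String used as a stand-in for the global object); it only illustrates the window in which a guard on "initialize was called" passes while the global object is still absent.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::OnceLock;

/// Set as soon as initialization starts (hypothetical flag).
static INITIALIZE_CALLED: AtomicBool = AtomicBool::new(false);
/// Set only once initialization has finished; a String stands in for Glean.
static GLOBAL: OnceLock<String> = OnceLock::new();

/// First step of initialization: only records that it was requested.
pub fn begin_initialize() {
    INITIALIZE_CALLED.store(true, Ordering::SeqCst);
}

/// Last step of initialization: the global object becomes available.
pub fn finish_initialize(app_id: &str) {
    let _ = GLOBAL.set(app_id.to_string());
}

/// A setter guarded only by the weaker "was initialize called" check.
/// It returns an error where panicking code would crash, so the window is visible.
pub fn set_tag_weak_guard(tag: &str) -> Result<String, &'static str> {
    if !INITIALIZE_CALLED.load(Ordering::SeqCst) {
        return Err("not initialized yet, value kept for later");
    }
    match GLOBAL.get() {
        Some(app_id) => Ok(format!("{app_id}:{tag}")),
        // Initialization was requested but has not completed.
        None => Err("global object missing"),
    }
}
```

Between begin_initialize and finish_initialize the weak guard passes, yet the global object is not there; code that used expect() at that point would panic.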
I'm at a loss. Jan-Erik and Alessio are the two I'd ask for help on this, and they'll be back tomorrow morning CET.
Comment 3•4 years ago (Assignee)
I think I know what's going on, looking at the other threads:
Thread 0 is calling glean::shutdown(), so all this is happening while the browser is shutting down and we, quite correctly, try to shut down Glean.
I assume what is happening is that the browser starts, FOG is not initialized (for whatever reason), then we try to shut down, which unblocks the queue before actually ending it. If the queue contains any recordings, they will try to access the Glean object.
If we never flushed, what we should do is skip the queue rather than flush it now. This will require some changes to the dispatcher; I'm taking a look now.
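The idea of skipping the queue at shutdown can be sketched roughly as follows. This is a single-threaded simplification under stated assumptions, not the actual Glean dispatcher (whose code is not shown in this bug): Dispatcher, launch, flush_init, and shutdown are illustrative names on a hypothetical type.

```rust
use std::collections::VecDeque;

type Task = Box<dyn FnOnce() + Send>;

/// Hypothetical dispatcher that buffers tasks until initialization is done.
pub struct Dispatcher {
    preinit: VecDeque<Task>,
    flushed: bool,
}

impl Dispatcher {
    pub fn new() -> Self {
        Dispatcher { preinit: VecDeque::new(), flushed: false }
    }

    /// Run the task right away after the flush, otherwise buffer it.
    pub fn launch(&mut self, task: impl FnOnce() + Send + 'static) {
        if self.flushed {
            task();
        } else {
            self.preinit.push_back(Box::new(task));
        }
    }

    /// Called once initialization has completed: run everything buffered so far.
    pub fn flush_init(&mut self) {
        self.flushed = true;
        while let Some(task) = self.preinit.pop_front() {
            task();
        }
    }

    /// Shut down. If the queue was never flushed, the buffered tasks are
    /// dropped without running, so none of them can touch a missing global
    /// object. Returns the number of dropped tasks.
    pub fn shutdown(&mut self) -> usize {
        if self.flushed {
            return 0;
        }
        let dropped = self.preinit.len();
        self.preinit.clear();
        dropped
    }
}
```

With this shape, a shutdown that arrives before flush_init discards the buffered recordings instead of running them against an uninitialized global.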
Comment 4•4 years ago
Comment 5•4 years ago (Assignee)
(In reply to Alexandru Trif, QA [:atrif] from comment #4)
> Hello! It seems that I can reproduce the crash on Windows 10x64/x86 with Firefox 85.0b2 (20201215185920) by restarting or closing Firefox while opened with this profile: link
> This is one of the crash reports: link
Thanks, that will be helpful! Was this profile newly created or did that profile exist before upgrading to 85 beta 2?
Comment 6•4 years ago
(In reply to Jan-Erik Rediger [:janerik] (Away 2020-12-21 to 2021-01-04) from comment #5)
> (In reply to Alexandru Trif, QA [:atrif] from comment #4)
> > Hello! It seems that I can reproduce the crash on Windows 10x64/x86 with Firefox 85.0b2 (20201215185920) by restarting or closing Firefox while opened with this profile: link
> > This is one of the crash reports: link
> Thanks, that will be helpful! Was this profile newly created or did that profile exist before upgrading to 85 beta 2?
This was newly created today with 85.0b2, from what I can remember, and only some YouTube navigation was performed while testing PiP functionality on Windows 10x86. I first observed this when I closed Firefox, and then I saw that it is reproducible when restarting it too. I then transferred the profile and used it on another two machines with Windows 10x64, and I could reproduce the crash there as well. If more information is needed, please let me know. Thank you!
Comment 7•4 years ago (Assignee)
Thanks, that information and the profile were very helpful, and I have at least identified why this leads to crashes such as the above.
For some reason the database file is empty and that is not properly handled by FOG (why the database file is empty is another mystery I need to solve).
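As an illustration of what "handling an empty database file" could look like, here is a small sketch that detects a zero-length file and removes it so a fresh one can be created. This is an assumption about one possible approach, not the change that landed for this bug; DbFileState, inspect_db_file, and discard_if_empty are hypothetical names.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// What was found at the database path (hypothetical type).
#[derive(Debug, PartialEq)]
pub enum DbFileState {
    Missing,
    Empty,
    HasData,
}

/// Classify the file without opening it as a database.
pub fn inspect_db_file(path: &Path) -> io::Result<DbFileState> {
    match fs::metadata(path) {
        Ok(meta) if meta.len() == 0 => Ok(DbFileState::Empty),
        Ok(_) => Ok(DbFileState::HasData),
        Err(e) if e.kind() == io::ErrorKind::NotFound => Ok(DbFileState::Missing),
        Err(e) => Err(e),
    }
}

/// Remove a zero-length file so that initialization can start from scratch.
/// Returns true if a file was removed.
pub fn discard_if_empty(path: &Path) -> io::Result<bool> {
    if inspect_db_file(path)? == DbFileState::Empty {
        fs::remove_file(path)?;
        return Ok(true);
    }
    Ok(false)
}
```

A caller could run discard_if_empty before opening the database and then proceed as if the profile were new.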
Comment 8•4 years ago
Comment 9•4 years ago
FWIW I'd like to get a fix or backout in beta today or tomorrow if at all possible, so we don't leave this crash there over the holiday break.
Comment 10•4 years ago (Assignee)
(In reply to Julien Cristau [:jcristau] from comment #9)
> FWIW I'd like to get a fix or backout in beta today or tomorrow if at all possible, so we don't leave this crash there over the holiday break.
We're working on it. The above pull request was done yesterday and has been reviewed; I also kicked off some try runs last night. I will now take this to m-c and prepare it to land and then be uplifted.
Comment 11•4 years ago (Assignee)
Comment 12•4 years ago
Comment 13•4 years ago
Thanks, much appreciated.
Comment 14•4 years ago (Assignee)
Comment on attachment 9193713 [details]
Bug 1682638 - Update to Glean v33.9.1. r?dexter
Beta/Release Uplift Approval Request
- User impact if declined: Potential crashes on shutdown.
(it's about to land in Nightly, so it's only verified manually by me on a local build there for now)
- Is this code covered by automated tests?: Yes
- Has the fix been verified in Nightly?: No
- Needs manual test from QE?: No
- If yes, steps to reproduce:
- List of other uplifts needed: None
- Risk to taking this patch: Low
- Why is the change risky/not risky? (and alternatives if risky): The changes are covered by automated tests in Glean (external repository); I also did manual tests on a local build.
- String changes made/needed:
Comment 15•4 years ago
Comment on attachment 9193713 [details]
Bug 1682638 - Update to Glean v33.9.1. r?dexter
topcrash fix, approved for 85.0b3
Comment 16•4 years ago
bugherder uplift
Comment 17•4 years ago
bugherder
Comment 18•4 years ago
Hello! Verified the fix using the profile attached in comment 4 with Firefox 85.0b3 (20201217185930) and 86.0a1 (20201217214927) on Windows 10x86/x64 and Windows 7x64. Firefox no longer crashes when closing or restarting with the mentioned profile. On Windows 10x64 I also updated from 85.0b2 to 85.0b3 while using the profile and restarted/closed Firefox after the update was applied; no crashes were encountered with 85.0b3.