Crash in shutdownhang | nsThread::Shutdown | nsThreadManager::ShutdownNonMainThreads
Categories
(Core :: XPCOM, defect, P2)
Tracking
()
Tracking | Status | |
---|---|---|
firefox-esr60 | --- | wontfix |
firefox-esr68 | --- | wontfix |
firefox-esr78 | --- | wontfix |
firefox63 | --- | wontfix |
firefox64 | --- | wontfix |
firefox65 | --- | wontfix |
firefox67 | --- | wontfix |
firefox68 | --- | wontfix |
firefox69 | --- | wontfix |
firefox70 | --- | wontfix |
firefox71 | --- | wontfix |
firefox72 | --- | wontfix |
firefox84 | --- | wontfix |
firefox85 | --- | wontfix |
firefox86 | --- | wontfix |
People
(Reporter: skywalker333, Assigned: jstutte)
References
(Depends on 4 open bugs, Blocks 2 open bugs)
Details
(4 keywords, Whiteboard: [Thunderbird see bug 1524247])
Crash Data
Attachments
(1 file)
Comment 1•6 years ago
|
||
Jim, should we put shutdown hangs in a different component?
![]() |
||
Updated•6 years ago
|
Comment 2•6 years ago
|
||
FYI I observed two crashes today in TB version 65.0b2 (32-bit)
bp-dddcf79f-eab6-451c-a9f6-923c10190115 15/01/2019, 15:58
bp-9d9da253-2cbf-4896-8864-4542e0190115 15/01/2019, 12:42
I believe the first one occurred as I was trying to open a .eml attachment from a received email (opened in separate tab). IMAP/SMTP mailbox setup set to synchronise only 3 days worth of emails.
The second time when it happened I was not even actively using Thunderbird, but it was opened and close by itself... without my intervention :-)
Wayne I don't need info from you, it was just a way for me to let you know about crashes with my TB for your information as you seems to follow them up...
Comment 3•6 years ago
|
||
Your first crash better aligns with bug 1526127 - bp-dddcf79f-eab6-451c-a9f6-923c10190115 [@ shutdownhang | nsThread::Shutdown | nsThreadManager::Shutdown ] - trying to open a .eml attachment from a received email (opened in separate tab). IMAP/SMTP mailbox setup set to synchronise only 3 days worth of emails.
Your second crash is bug 1534119 - bp-9d9da253-2cbf-4896-8864-4542e0190115 [@ nsImapProtocol::HandleMessageDownLoadLine ] - not actively using Thunderbird
![]() |
||
Updated•6 years ago
|
Reporter | ||
Comment 4•6 years ago
|
||
Top Crashers for Firefox 69.0a1 (Nightly) - 7 days ago
#13 0.45% 0.39% shutdownhang | nsThread::Shutdown | nsThreadManager::Shutdown 37 37 0 0 34 0 2017-09-27
Signature report for shutdownhang | nsThread::Shutdown | nsThreadManager::Shutdown
Showing results from 7 days ago
2,710 Results
Windows 10 1401 51.7%
Windows 7 1013 37.4%
Windows 8.1 187 6.9%
Windows Vista 81 3.0%
Windows 8 27 1.0%
Windows XP 1 0.0%
Product
Firefox 69.0a1 37 1.4% 46
Thunderbird 69.0a1 7 0.3% 5
Firefox 68.0b7 131 4.8% 113
Firefox 68.0b8 130 4.8% 133
Firefox 68.0b6 71 2.6% 81
Firefox 68.0b5 32 1.2% 34
Firefox 68.0b4 17 0.6% 20
Firefox 68.0b9 15 0.6% 13
Firefox 68.0b3 10 0.4% 9
Firefox 67.0.1 376 13.9% 234
Firefox 67.0 227 8.4% 179
Firefox 67.0.2 43 1.6% 42
Firefox 60.7.0esr 289 10.7% 285
Thunderbird 60.7.0 425 15.7% 313
Thunderbird 60.7.1 2 0.1% 2
Firefox 60.6.2esr 43 1.6% 43
Firefox 60.6.3esr 42 1.5% 37
Firefox 60.6.1esr 28 1.0% 21
Firefox 52.9.0esr 81 3.0% 41
Architecture
amd64 1465 54.1%
x86 1245 45.9%
Comment 5•6 years ago
|
||
Judging by release date of firefox 68, it looks like crash rate more than doubled.
https://crash-stats.mozilla.org/signature/?signature=shutdownhang%20%7C%20nsThread%3A%3AShutdown%20%7C%20nsThreadManager%3A%3AShutdown&date=>%3D2019-01-26T00%3A00%3A00.000Z&date=<2019-07-26T23%3A59%3A00.000Z#graphs
Reporter | ||
Comment 6•5 years ago
|
||
(In reply to Wayne Mery (:wsmwk) from comment #5)
Judging by release date of firefox 68, it looks like crash rate more than doubled.
https://crash-stats.mozilla.org/signature/?signature=shutdownhang%20%7C%20nsThread%3A%3AShutdown%20%7C%20nsThreadManager%3A%3AShutdown&date=>%3D2019-01-26T00%3A00%3A00.000Z&date=<2019-07-26T23%3A59%3A00.000Z#graphs
[@ shutdownhang | nsThread::Shutdown | nsThreadManager::Shutdown ]
There is still a high volume of crashes with this signature. It is the #2 Top Crasher on versions 68.0.1 and 68.0 (second only to the [@ OOM | small] signature). It is #3 Top Crasher on versions 69.0b13 and 60.7.2esr.
It is affecting the currently Nightly version of Firefox (70.0a1) and is the #19 Top Crasher for that version.
Could the priority for this bug be re-evaluated? I would suggest raising the priority from P2 to P1.
Could this bug be assigned to someone?
Thanks
Top Crashers for Firefox 68.0.1
Top 50 Crashing Signatures. 7 days ago
#1 9.01% -7.36% shutdownhang | nsThread::Shutdown | nsThreadManager::Shutdown 2225 2225 0 0 2101 0 2017-09-27
Top Crashers for Firefox 68.0.2
Top 50 Crashing Signatures. 7 days ago
#15 0.6% -0.02% shutdownhang | nsThread::Shutdown | nsThreadManager::Shutdown 645 645 0 0 647 0 2017-09-27
Top Crashers for Firefox 60.8.0esr
Top 50 Crashing Signatures. 7 days ago
#7 1.15% 0.02% shutdownhang | nsThread::Shutdown | nsThreadManager::Shutdown 318 318 0 0 274 0 2017-09-27
Top Crashers for Firefox 69.0b14
Top 50 Crashing Signatures. 7 days ago
#5 1.86% -1.59% shutdownhang | nsThread::Shutdown | nsThreadManager::Shutdown 193 193 0 0 180 0 2017-09-27
Top Crashers for Firefox 69.0b13
Top 50 Crashing Signatures. 7 days ago
#2 6.69% -5.61% shutdownhang | nsThread::Shutdown | nsThreadManager::Shutdown 177 177 0 0 149 0 2017-09-27
Top Crashers for Firefox 60.7.2esr
Top 50 Crashing Signatures. 7 days ago
#2 6.23% 1.25% shutdownhang | nsThread::Shutdown | nsThreadManager::Shutdown 168 168 0 0 155 0 2017-09-27
Top Crashers for Firefox 68.0
Top 50 Crashing Signatures. 7 days ago
#1 7.21% -0.95% shutdownhang | nsThread::Shutdown | nsThreadManager::Shutdown 148 148 0 0 139 0 2017-09-27
Top Crashers for Firefox 70.0a1
Top 50 Crashing Signatures. 7 days ago
#18 0.76% 0.63% shutdownhang | nsThread::Shutdown | nsThreadManager::Shutdown 55 55 0 0 54 0 2017-09-27
Signature report for shutdownhang | nsThread::Shutdown | nsThreadManager::Shutdown
Showing results from 7 days ago
Windows 10 2637 47.5%
Windows 7 1910 34.4%
Windows 8.1 886 15.9%
Windows Vista 62 1.1%
Windows 8 60 1.1%
Windows XP 2 0.0%
Firefox 68.0.1 2225 40.0% 1026
Firefox 68.0.2 646 11.6% 725
Firefox 60.8.0esr 318 5.7% 215
Firefox 69.0b14 192 3.5% 111
Firefox 69.0b13 182 3.3% 125
Firefox 60.7.2esr 168 3.0% 149
Firefox 68.0 147 2.6% 152
Firefox 69.0b15 82 1.5% 90
Firefox 60.6.2esr 74 1.3% 56
Firefox 52.9.0esr 65 1.2% 71
Firefox 69.0b12 65 1.2% 47
Firefox 70.0a1 55 1.0% 55
Firefox 69.0b11 51 0.9% 36
Uptime Range
1 hour 1643 29.6%
1-5 min 1507 27.1%
5-15 min 1287 23.2%
15-60 min 1117 20.1%
< 1 min 3 0.1%
Architecture
amd64 3924 70.6%
x86 1633 29.4%
Updated•5 years ago
|
![]() |
||
Comment 7•5 years ago
|
||
The rate of shutdown hangs has recently doubled. Jimm, is the reason for this known?
Comment 8•5 years ago
•
|
||
(In reply to Sebastian Hengst [:aryx] (needinfo on intermittent or backout) from comment #7)
The rate of shutdown hangs has recently doubled. Jimm, is the reason for this known?
It is doubled mainly from increased crashes of Thunderbird 68.x, probably due to users updating from 60.x, being tracked in bug 1526127.
Per the graph https://crash-stats.mozilla.org/signature/?signature=shutdownhang%20%7C%20nsThread%3A%3AShutdown%20%7C%20nsThreadManager%3A%3AShutdown&date=%3E%3D2019-10-21T01%3A45%3A00.000Z&date=%3C2020-01-21T01%3A45%3A00.000Z#graphs
![]() |
||
Updated•5 years ago
|
Comment 9•5 years ago
|
||
Thunderbird crashes should be referred to bug 1524247
Assignee | ||
Comment 10•4 years ago
•
|
||
Looking at the aggregation of MOZ_CRASH_REASON I see here now:
1 MOZ_CRASH(Shutdown too long, probably frozen, causing a crash.) 24732 100.00 %
2 MOZ_CRASH(Shutdown hanging before starting.) 1 0.00 %
For the vast majority then, looking at the RunWatchdog this seems to indicate that:
- All observers have been successfully notified and unblocked (
sShutdownNotified
istrue
) - No workers are hanging (
runtimeService->CrashIfHanging();
did not crash)
To me this seems to indicate that there is nothing left worth waiting for (otherwise we would report it here) and we are just bloating our crash statistics. Still this is unexpected behavior, but for the end-user it should be handled without harm and I do not see how we could transform those crash reports into something more actionable if there is nothing concrete left we wait for.
A possible fix might be, to replace this MOZ_CRASH with a MOZ_ASSERT_UNREACHABLE
and an exit(0) ? Or to dispatch a last "kill yourself now" message to the main thread and wait another second or two?
Assignee | ||
Updated•4 years ago
|
Updated•4 years ago
|
Assignee | ||
Comment 12•4 years ago
•
|
||
The signature just added by the release bot seems to be a different situation. If I look at MOZ_CRASH_REASON
there, I see only the MOZ_CRASH(Shutdown hanging before starting.)
case, which is quite the opposite extreme (not even the first shutdown phase succeeded).
Sylvestre, can we prevent the bot somehow from adding this signature here?
Comment 13•4 years ago
|
||
Jens, the bot is doing that because bug 1678330 is marked as dup of this bug.
So, either you undup (and reopen) bug 1678330 or you remove the signature from bug 1678330
Assignee | ||
Comment 14•4 years ago
|
||
Sorry, that was too easy...
Assignee | ||
Updated•4 years ago
|
Assignee | ||
Comment 15•4 years ago
|
||
Updated•4 years ago
|
Updated•4 years ago
|
Updated•4 years ago
|
Updated•4 years ago
|
Updated•4 years ago
|
Updated•4 years ago
|
Updated•4 years ago
|
Comment 16•4 years ago
|
||
Comment 17•4 years ago
|
||
bugherder |
Updated•4 years ago
|
Assignee | ||
Comment 18•4 years ago
|
||
While the fix is expected to potentially reduce the number of crashes, it is not necessary a complete solution. We expect QM shutdown to be faster in general now thanks to bug 1666214, and we hope to have better diagnostics about the remaining cases through this patch here and will continue to observe this.
And for some of these crashes the MOZ_CRASH_REASON might just change slightly to be more explicit about the phase we are stuck in.
Comment 19•4 years ago
|
||
The patch landed in nightly and beta is affected.
:jstutte, is this bug important enough to require an uplift?
If not please set status_beta
to wontfix
.
For more information, please visit auto_nag documentation.
Assignee | ||
Comment 20•4 years ago
|
||
Comment on attachment 9192576 [details]
Bug 1505660: Promote profile-change-net-teardown and -before-change-qm to have a shutdown timer reset. r?#dom-workers-and-storage,dthayer
Beta/Release Uplift Approval Request
- User impact if declined: Some users might experience less shutdown crashes thanks to this patch. And at least we might gain some more diagnostic insights on the different shutdown hangs in different phases.
- Is this code covered by automated tests?: No
- Has the fix been verified in Nightly?: Yes
- Needs manual test from QE?: No
- If yes, steps to reproduce:
- List of other uplifts needed: None
- Risk to taking this patch: Low
- Why is the change risky/not risky? (and alternatives if risky): The patch does only extend the explicit shutdown handling on two more shutdown phases. Normal operation (including successful shutdown) is unaffected.
- String changes made/needed:
Updated•4 years ago
|
Comment 21•4 years ago
|
||
Comment on attachment 9192576 [details]
Bug 1505660: Promote profile-change-net-teardown and -before-change-qm to have a shutdown timer reset. r?#dom-workers-and-storage,dthayer
approved for 85.0b6
Comment 22•4 years ago
|
||
uplift |
Comment on attachment 9192576 [details]
Bug 1505660: Promote profile-change-net-teardown and -before-change-qm to have a shutdown timer reset. r?#dom-workers-and-storage,dthayer
https://hg.mozilla.org/releases/mozilla-beta/rev/f6ecda1b2d689320f4a3ce86893afb9efbbdef1a
[clearing approval to get this off the needs-uplift queries]
Assignee | ||
Comment 23•3 years ago
|
||
FWIW, the current statistics show two interesting data points:
- Thunderbird is not appearing any more since mid-november. Is this just a change in reporting or did they actually solve something here?
- Since FF 94 each release causes a high spike, presumably right after the updates happened.
![]() |
||
Comment 24•3 years ago
|
||
Thunderbird crash reports: https://crash-stats.thunderbird.net/ is used for crash reports now, and https://crash-stats.mozilla.org/ rejects submissions for Thunderbird (bug 1608971).
Comment 25•3 years ago
|
||
(In reply to Jens Stutte [:jstutte] from comment #23)
FWIW, the current statistics show two interesting data points:
- Thunderbird is not appearing any more since mid-november. Is this just a change in reporting or did they actually solve something here?
Sadly, yes. crash-stats terminated supporting Thunderbird crashes. Collection stopped Nov 9.
There is public access to individual crash reports. But the detailed data on backtrace.io requires a paid account.
Comment 26•3 years ago
|
||
There is public access to individual crash reports. But the detailed data on backtrace.io requires a paid account.
How expensive is such an account? How do you get one? What's the public interface for searching this information, once you have an account?
I'm not necessarily going to go through with this. But the https://backtrace.io site has no information on this, or on Thunderbird at all. So it'd be nice to know.
Comment 29•3 years ago
|
||
Copying crash signatures from duplicate bugs.
Comment 30•3 years ago
|
||
crash-stats has the XPCOMSpinEventLoopStack
field (under crash annotations) that is useful in this context. These signatures are going to be popular and will change over time but, as of today, I'm seeing:
@ shutdownhang | nsThread::Shutdown | nsThreadManager::Shutdown
By far the most popular cause is nsThread::Shutdown: BitsCommander
, which is trying to stop a thread that is in the middle of a COM operation to do a background download. There are also BitsMonitor {some_guid}
threads that are also related to downloads. I don't know if we expect to wait for or to kill incomplete downloads at shutdown. There are a few other stray thread names under this hang but most of the rest (about half) are just empty names. With those, it seems common to find BitsCommander doing COM on some thread (like with this crash ).
@ shutdownhang | mozilla::TaskController::GetRunnableForMTTask | nsThread::Shutdown | nsThreadManager::Shutdown
This only goes to Fx 84 -- all of the others are Thunderbird crashes. Most of the labels are empty -- maybe Tb doesn't label the relevant threads.
@ shutdownhang | nsThread::Shutdown | nsThreadManager::ShutdownNonMainThreads
Again, mostly BitsCommander
threads not shutting down, although this signature brings in a healthy number of Update Watcher
and TelemetryModule
threads. There isn't a strong pattern here but the Telemetry hangs suggest AuthenticodeImpl::GetBinaryOrgName is too slow. See e.g. thread 24 of this crash. The Update Watcher thread is stuck waiting for the updater process to finish in nsUpdateProcessor::WaitForProcess.
Updated•3 years ago
|
Assignee | ||
Comment 31•3 years ago
|
||
I scraped crashstats for nightly instances of this signature where some of the thread names contain SHDRCV
and/or SHDACK
(added by bug 1770451).
FWIW, here is the result.
There is definitely some correlation with the Updater process being waited on, as in bug 1772908. But we should also look at the other ones.
Assignee | ||
Comment 32•3 years ago
•
|
||
There is also one interesting instance concerning the BHMgr Monitor
and Cookie
threads.
- We enter phase
ShutdownPhase::AppShutdown
- (
MainThread
) notifies theCookieService
to shutdown. - (
MainThread
) TheCookieService
starts a synchronousshutdown()
. - (
MainThread
)nsThread::Shutdown()
is spinning the event loop, waiting for the thread to finish - (
BHMgr Monitor
)BackgroundHangManager::MonitorThread
detects a hang and wants to report it (might have happened even earlier) - (
BHMgr Monitor
)BackgroundHangThread::ReportHang
is called withmManager->mLock IS locked
- (
BHMgr Monitor
) For whatever cosmic reason, writing to file takes some time insidensLocalFile::OpenNSPRFileDesc
- (
MainThread
)ProcessNextEvent
wants toBackgroundHangMonitor().NotifyWait();
which wants to lock the samemManager->mLock
already held byBackgroundHangThread::ReportHang
and starves - (
Cookie
thread) In the meantime, theCookie
thread reached its end and is just waiting for thensThreadShutdownAckEvent
to be processed on theMainThread
.
Also for performance reasons it seems questionable to me that a lock used by frequent event processing is held on the background hang monitor thread for long-running tasks like hang-report disk writings.
Bob, :jya, the lock that blocks the main thread has been introduced relatively recently (2 years ago) by bug 1634253.
Dragana, on a different side note the synchronous nsThread::Shutdown()
in CookiePersistentStorage::Close
could maybe be avoided in favor of nsThread::AsyncShutdown()
? It would not stop this from happening, of course, but there would be less blocking layers potentially stacking on the main thread.
Comment 33•3 years ago
|
||
Doug, you you recall any of this stuff. Bug 1594577 changed some of the relevant code, though https://hg.mozilla.org/mozilla-central/annotate/2426bd765f8b74744fa7042826423ec5ddca53a7/xpcom/threads/nsThread.cpp#l1094 was added later.
Comment 34•3 years ago
|
||
Andrew, I know this is a necro, but would you mind having a cursory look at this when you get a chance just in case Jean-Yves fails to respond? (See comment 32)
Comment 35•3 years ago
|
||
I will open a bug to investigate a possible async shutdown of the cookie storage.
Comment 36•3 years ago
|
||
A bit of additional archaeology and a bit of additional correlation.
(In reply to David Parks [:handyman] from comment #30)
crash-stats has the
XPCOMSpinEventLoopStack
field (under crash annotations) that is useful in this context. These signatures are going to be popular and will change over time
And indeed:
@ shutdownhang | nsThread::Shutdown | nsThreadManager::Shutdown
@ shutdownhang | nsThread::Shutdown | nsThreadManager::ShutdownNonMainThreads
These are effectively the same callstack, only before and after bug 1764119.
@ shutdownhang | mozilla::TaskController::GetRunnableForMTTask | nsThread::Shutdown | nsThreadManager::Shutdown
This only goes to Fx 84 -- all of the others are Thunderbird crashes. Most of the labels are empty -- maybe Tb doesn't label the relevant threads.
This is also the same callstack.
The primary difference — the presence of mozilla::TaskController::GetRunnableForMTTask — is spurious. The call stack is essentially identical in Fx and Tb, as can be seen in Visual Studio in the minidumps; but this frame is consistently inlined in Fx and consistently not inlined in Tb. Presumably this is due to some locally-uninteresting choice of compiler flags.
The secondary difference -- the absence of XPCOMSpinEventLoopStack annotations -- is mostly due to the fact that Tb didn't start writing these until about v95. As with Fx, the signature has changed to @ shutdownhang | mozilla::TaskController::GetRunnableForMTTask | nsThread::Shutdown | nsThreadManager::ShutdownNonMainThreads in recent versions, and XPCOMSpinEventLoopStack is generally present.
Thunderbird crashes with this signature are being investigated in bug 1768344 and bug 1749142.
(I've added the latter signature to the crash data, above, but perhaps I ought to have removed the former?)
More than 80% of these shutdown hangs appear to be in Thunderbird; of those, about 80% seem to be blocked on the "IMAP" thread. The remainder are stuck on "Update Watcher" and "TelemetryModule".
On the Firefox side (post-bug 1764119 only, for simplicity's sake), the vast majority of hangs are in "Update Watcher" and "BitsCommander", which are very nearly identical — it's not surprising that they're correlated, given that the only thing we use BITS for is downloading updates, but I have no explanation for why their ratio is approximately 1. The other hangs with this signature are mostly attributed to "TelemetryModule" and "Wifi Monitor", with "BitsMonitor {various uuids}" and "BHMgr Processor" being distant runners-up.
Updated•3 years ago
|
Updated•3 years ago
|
Assignee | ||
Comment 37•2 years ago
|
||
(In reply to Jens Stutte [:jstutte] from comment #31)
I scraped crashstats for nightly instances of this signature where some of the thread names contain
SHDRCV
and/orSHDACK
(added by bug 1770451).FWIW, here is the result.
I updated the sheet.
Comment 38•2 years ago
|
||
Moving to XPCOM, since many of the hangs are in the parent process.
Assignee | ||
Comment 39•2 years ago
|
||
(In reply to Olli Pettay [:smaug][bugs@pettay.fi] from comment #38)
Moving to XPCOM, since many of the hangs are in the parent process.
Just to be precise: All of them.
Updated•2 years ago
|
Assignee | ||
Comment 40•2 years ago
|
||
Moved ni?s to bug 1796060.
Assignee | ||
Comment 41•2 years ago
|
||
(In reply to Jens Stutte [:jstutte] from comment #31)
I scraped crashstats for nightly instances of this signature where some of the thread names contain
SHDRCV
and/orSHDACK
(added by bug 1770451).FWIW, here is the result.
I updated the sheet. (Unfortunately this is not a simple query in crash-stats)
Comment 42•2 years ago
|
||
On the Firefox side (post-bug 1764119 only, for simplicity's sake), the vast majority of hangs are in "Update Watcher" and "BitsCommander", which are very nearly identical — it's not surprising that they're correlated, given that the only thing we use BITS for is downloading updates, but I have no explanation for why their ratio is approximately 1. The other hangs with this signature are mostly attributed to "TelemetryModule" and "Wifi Monitor", with "BitsMonitor {various uuids}" and "BHMgr Processor" being distant runners-up.
Jens observed that some of these have the BackgroundTaskName: backgroundupdate
annotation. Such crashes during shutdown are exactly the kind of crash we would expect from background tasks: lots of Gecko doesn't have clean shutdown and background tasks can exit in a rush; there's often not much work to do. (And there's a timeout mechanism but last I checked, which was a while back, we don't see timeouts frequently.)
The two listed threads are very likely to be involved: backgroundupdate
tasks install updates and interact with BITS (to start downloads and check on transfer progress).
Let me know if I can be of assistance debugging or reproducing these crashes. Thanks!
![]() |
||
Updated•2 years ago
|
Comment 43•2 years ago
•
|
||
I hit this on 107.0b3 a bit ago. TB had presented me with the OAuth2-type username and PW dialogue but I was in the middle of shutting down TB when it hung and then crashed. So either an OAuth2 or PW manager issue.
https://crash-stats.mozilla.org/report/index/0a3b7415-dc4d-48d4-ba59-1eccf0221104
Comment 44•2 years ago
|
||
And a second shortly afterwards: https://crash-stats.mozilla.org/report/index/5dd1a9ea-54a5-4db2-8d65-1163e0221104
Maybe related to bug 1768344 since that's what I did before it crashed this time.
Comment 45•2 years ago
|
||
Added a signature
Comment 46•2 years ago
|
||
@gsvelto: would it be feasible to bucket these crashes by their XPCOMSpinEventLoopStack
value, rather than by their callstack? It should be both more stable and more indicative of the actual problem.
(Although at the moment it'd require some special-casing for BitsMonitor
threads, due to the unfortunate embedded GUID.)
Assignee | ||
Comment 47•2 years ago
|
||
(In reply to Nick Alexander :nalexander [he/him] from comment #42)
On the Firefox side (post-bug 1764119 only, for simplicity's sake), the vast majority of hangs are in "Update Watcher" and "BitsCommander", which are very nearly identical — it's not surprising that they're correlated, given that the only thing we use BITS for is downloading updates, but I have no explanation for why their ratio is approximately 1. The other hangs with this signature are mostly attributed to "TelemetryModule" and "Wifi Monitor", with "BitsMonitor {various uuids}" and "BHMgr Processor" being distant runners-up.
Jens observed that some of these have the
BackgroundTaskName: backgroundupdate
annotation. Such crashes during shutdown are exactly the kind of crash we would expect from background tasks: lots of Gecko doesn't have clean shutdown and background tasks can exit in a rush; there's often not much work to do. (And there's a timeout mechanism but last I checked, which was a while back, we don't see timeouts frequently.)The two listed threads are very likely to be involved:
backgroundupdate
tasks install updates and interact with BITS (to start downloads and check on transfer progress).Let me know if I can be of assistance debugging or reproducing these crashes. Thanks!
From recent nightly numbers (1 week) I see:
- 15 non-background task instances. All I looked at have to do with
BitsCommander
orUpdateWatcher
. - 172 background task ones, all of which have "backgroundupdate" as task name.
We should probably re-triage (and maybe rename) bug 1741675.
Comment 48•2 years ago
|
||
(In reply to Ray Kraesig [:rkraesig] from comment #46)
@gsvelto: would it be feasible to bucket these crashes by their
XPCOMSpinEventLoopStack
value, rather than by their callstack? It should be both more stable and more indicative of the actual problem.
Yes, it's possible and I'll file a bug for it.
(Although at the moment it'd require some special-casing for
BitsMonitor
threads, due to the unfortunate embedded GUID.)
That's a problem. For something to be in the signature it should be stable over time and crashes, otherwise bucketing won't work. We have steps within Socorro to sanitize stuff (mostly for privacy reason) but they add complexity and I don't think we want that in signature generation.
Assignee | ||
Comment 49•2 years ago
|
||
(In reply to Jens Stutte [:jstutte] from comment #47)
From recent nightly numbers (1 week) I see:
- 15 non-background task instances. All I looked at have to do with
BitsCommander
orUpdateWatcher
.- 172 background task ones, all of which have "backgroundupdate" as task name.
We should probably re-triage (and maybe rename) bug 1741675.
This is still the case, it seems. And this is our second top-most crash in nightly, currently.
Updated•2 years ago
|
Assignee | ||
Comment 50•2 years ago
|
||
Updating signatures to the active ones for current versions.
Assignee | ||
Updated•2 years ago
|
Comment 51•2 years ago
|
||
FWIW, most Thunderbird crashes are German locale (German 21%, English 19%, Japanese 15%, in that order).
Firefox crashes are overwhelmingly English - 38%
Assignee | ||
Comment 52•1 years ago
|
||
Currently ~95% of the Firefox crashes are from backgroundupdate
, see bug 1741675.
![]() |
||
Updated•1 year ago
|
Comment hidden (off-topic) |
Comment hidden (off-topic) |
Updated•1 year ago
|
Comment 55•1 year ago
|
||
I'm going to unmark this main bug as S2, it makes more sense to prioritize the actual signatures.
Assignee | ||
Comment 56•1 year ago
|
||
(In reply to Gian-Carlo Pascutto [:gcp] from comment #55)
I'm going to unmark this main bug as S2, it makes more sense to prioritize the actual signatures.
I agree, but we need some Sx value to make the triage bot happy.
Comment 57•1 year ago
|
||
The severity field for this bug is set to S3
. However, the following bug duplicate has higher severity:
- Bug 1769211: S2
:jstutte, could you consider increasing the severity of this bug to S2
?
For more information, please visit BugBot documentation.
Comment 59•9 months ago
|
||
The leave-open keyword is there and there is no activity for 6 months.
:jstutte, maybe it's time to close this bug?
For more information, please visit BugBot documentation.
Assignee | ||
Updated•9 months ago
|
Comment 60•7 months ago
|
||
Based on the topcrash criteria, the crash signatures linked to this bug are not in the topcrash signatures anymore.
For more information, please visit BugBot documentation.
Description
•