Bug 1377277
Opened 8 years ago
Closed 8 years ago
Crash in shutdownhang | NtWaitForAlertByThreadId | RtlSleepConditionVariableSRW | SleepConditionVariableSRW | mozilla::detail::ConditionVariableImpl::wait | mozilla::CondVar::Wait | nsEventQueue::GetEvent | nsThread::nsChainedEventQueue::GetEvent | nsT...
Categories
(Core :: Networking, defect, P1)
RESOLVED
WORKSFORME
| | Tracking | Status |
|---|---|---|
| firefox-esr52 | --- | unaffected |
| firefox54 | --- | unaffected |
| firefox55 | --- | wontfix |
| firefox56 | --- | wontfix |
People
(Reporter: calixte, Unassigned)
Details
(Keywords: crash, topcrash-thunderbird, Whiteboard: [necko-active][tbird topcrash])
Crash Data
This bug was filed from the Socorro interface and is
report bp-8dc5b2ff-2bcd-446f-8552-ea31d0170629.
=============================================================
There are 68 crashes in beta 55 and 9 in nightly 56; they all appeared on 2017-06-29.
:erahm, could you investigate, please?
Flags: needinfo?(erahm)
Comment 1•8 years ago
This looks like a mozilla::net::nsSocketTransportService::Shutdown hang, not sure I can add much here.
Component: XPCOM → Networking
Flags: needinfo?(erahm)
Comment 2•8 years ago
Dragana might have some thoughts?
Flags: needinfo?(dd.mozilla)
Whiteboard: [necko-active]
Comment 3•8 years ago
I think this is our standard shutdown hang.
I looked at a couple of crashes and noticed a lot of NSS and PSM hangs:
(in mozilla::psm::StopSSLServerCertVerificationThreads())
https://crash-stats.mozilla.com/report/index/8dc5b2ff-2bcd-446f-8552-ea31d0170629#allthreads
https://crash-stats.mozilla.com/report/index/75c92d0f-1aaf-484e-9492-313400170711#allthreads
https://crash-stats.mozilla.com/report/index/c719826d-fc6a-4683-b6cd-cf7940170711#allthreads
https://crash-stats.mozilla.com/report/index/64a66245-22c6-48e6-b791-2a5c00170711#allthreads
(more nss hangs):
https://crash-stats.mozilla.com/report/index/a65a1e27-1440-47bf-a0f4-d96ec0170711#allthreads
https://crash-stats.mozilla.com/report/index/80526016-c258-407f-be40-c51970170711#allthreads
https://crash-stats.mozilla.com/report/index/ac858f02-12de-4d4f-b8e1-14df30170711#allthreads
socketThread is already shut down in these hangs:
https://crash-stats.mozilla.com/report/index/f5e03c42-6861-4bb4-9df4-cb78c0170711#allthreads
https://crash-stats.mozilla.com/report/index/a843582d-7d2f-43ba-99b2-f14a20170711#allthreads
https://crash-stats.mozilla.com/report/index/789873ed-a4bb-4a6e-86a0-14fe80170711#allthreads
https://crash-stats.mozilla.com/report/index/55b46a98-f916-499d-90ee-017f50170711#allthreads
https://crash-stats.mozilla.com/report/index/cd13db6f-8d29-4232-adcf-9e43e0170711#allthreads
ttaubert, keeler, can you please look at some of the PSM and NSS hangs?
Flags: needinfo?(ttaubert)
Flags: needinfo?(dkeeler)
Flags: needinfo?(dd.mozilla)
Comment 4•8 years ago
FWIW, this signature appeared on June 29 because that's when bug 1375511 was deployed on crash-stats; there was an even more generic signature for these issues before then.
Comment 5•8 years ago
(In reply to Dragana Damjanovic [:dragana] from comment #3)
> I think this is out standard shutdown hangs.
>
> I looked at couple of crashes and I notice a lot of nss and psm hangs:
> (in mozilla::psm::StopSSLServerCertVerificationThreads())
> https://crash-stats.mozilla.com/report/index/8dc5b2ff-2bcd-446f-8552-ea31d0170629#allthreads
> https://crash-stats.mozilla.com/report/index/75c92d0f-1aaf-484e-9492-313400170711#allthreads
> https://crash-stats.mozilla.com/report/index/c719826d-fc6a-4683-b6cd-cf7940170711#allthreads
> https://crash-stats.mozilla.com/report/index/64a66245-22c6-48e6-b791-2a5c00170711#allthreads
>
Some of these appear to be hanging in nsNSSHttpRequestSession::internal_send_receive_attempt, much like some of the reports in bug 1375726.
Others seem to be hanging while attempting to acquire a reentrant lock in NSS. Interestingly, these all have loaded the PKCS#11 module "aetpkss1.dll", which I've seen in many of these hangs. Either there's a bug in NSS and/or PSM that's exacerbated by having a PKCS#11 module or this particular module is misbehaving and causing these hangs.
> (more nss hangs):
> https://crash-stats.mozilla.com/report/index/a65a1e27-1440-47bf-a0f4-d96ec0170711#allthreads
> https://crash-stats.mozilla.com/report/index/80526016-c258-407f-be40-c51970170711#allthreads
> https://crash-stats.mozilla.com/report/index/ac858f02-12de-4d4f-b8e1-14df30170711#allthreads
These all have a PKCS#11 module loaded (aetpkss1.dll or bit4xpki.dll, although the latter doesn't directly show up in the stacks).
> socketThread is already shutdown at these hangs:
> https://crash-stats.mozilla.com/report/index/f5e03c42-6861-4bb4-9df4-cb78c0170711#allthreads
> https://crash-stats.mozilla.com/report/index/a843582d-7d2f-43ba-99b2-f14a20170711#allthreads
> https://crash-stats.mozilla.com/report/index/789873ed-a4bb-4a6e-86a0-14fe80170711#allthreads
> https://crash-stats.mozilla.com/report/index/55b46a98-f916-499d-90ee-017f50170711#allthreads
> https://crash-stats.mozilla.com/report/index/cd13db6f-8d29-4232-adcf-9e43e0170711#allthreads
These don't seem to have anything to do with NSS or PSM.
Flags: needinfo?(dkeeler)
Comment 6•8 years ago
(In reply to David Keeler [:keeler] (use needinfo?) from comment #5)
> (In reply to Dragana Damjanovic [:dragana] from comment #3)
> Others seem to be hanging while attempting to acquire a reentrant lock in
> NSS. Interestingly, these all have loaded the PKCS#11 module "aetpkss1.dll",
> which I've seen in many of these hangs. Either there's a bug in NSS and/or
> PSM that's exacerbated by having a PKCS#11 module or this particular module
> is misbehaving and causing these hangs.
Yeah, these seem similar to what I wrote in bug 1372505 comment #13.
The last two have the socket thread and multiple pkix threads hanging at nssSlot_EnterMonitor(). The SmartCard thread looks suspicious, and I think that `mod->refLock` as well as `PK11SlotInfo->sessionLock` and `nssSlot->lock` actually all refer to the same locks.
https://searchfox.org/nss/rev/54740990248e08713f43ce1ea0e0440ed28df2dc/lib/pk11wrap/pk11slot.c#365
https://searchfox.org/nss/rev/54740990248e08713f43ce1ea0e0440ed28df2dc/lib/pk11wrap/dev3hack.c#122
There's probably plenty of possibility to deadlock...
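To illustrate the aliasing concern above, here is a minimal Python sketch. It is not NSS code: `Module`, `Slot`, and `can_take_both` are hypothetical names, and whether the real `mod->refLock` and `PK11SlotInfo->sessionLock` share one lock is only the suspicion stated in this comment. It also uses a non-reentrant lock for simplicity, which differs from a reentrant monitor.

```python
import threading


class Module:
    """Hypothetical stand-in for a PKCS#11 module (not the real NSS struct)."""

    def __init__(self):
        self.ref_lock = threading.Lock()


class Slot:
    """Hypothetical slot that borrows its module's lock instead of owning one."""

    def __init__(self, module):
        # Assumption for illustration: the slot's lock is an alias, not a copy.
        self.session_lock = module.ref_lock


def can_take_both(first, second):
    """Return True if `second` can be acquired while `first` is held."""
    with first:
        # Non-blocking attempt, so the sketch never actually hangs.
        acquired = second.acquire(blocking=False)
        if acquired:
            second.release()
        return acquired
```

Code that treats two such names as independent locks would block forever at the second acquisition if the attempt were blocking, which is the kind of hang pattern the stacks suggest.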
Another thing I wondered about: we seem to call/use SECMOD_WaitForAnyTokenEvent(), but never use SECMOD_CancelWait(). Not sure if that's a problem.
https://searchfox.org/nss/search?q=symbol:_Z17SECMOD_CancelWait&redirect=false
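Why a missing cancel call could matter at shutdown can be sketched roughly as follows. This is a generic wait/cancel pattern in Python, not the NSS implementation: `TokenEventWaiter`, `wait_for_event`, `post_event`, `cancel_wait`, and `run_and_shutdown` are all hypothetical names chosen for illustration.

```python
import threading


class TokenEventWaiter:
    """Hypothetical blocking waiter with an explicit cancel operation."""

    def __init__(self):
        self._cond = threading.Condition()
        self._event = None
        self._cancelled = False

    def wait_for_event(self):
        """Block until an event is posted or the wait is cancelled.

        Returns the event, or None if the wait was cancelled.
        """
        with self._cond:
            self._cond.wait_for(
                lambda: self._event is not None or self._cancelled
            )
            if self._event is not None:
                event, self._event = self._event, None
                return event
            return None

    def post_event(self, event):
        """Deliver an event and wake any waiter."""
        with self._cond:
            self._event = event
            self._cond.notify_all()

    def cancel_wait(self):
        """Wake any waiter without delivering an event."""
        with self._cond:
            self._cancelled = True
            self._cond.notify_all()


def run_and_shutdown():
    """Start a waiting worker, cancel it, and report whether it exited."""
    waiter = TokenEventWaiter()
    results = []
    worker = threading.Thread(
        target=lambda: results.append(waiter.wait_for_event())
    )
    worker.start()
    waiter.cancel_wait()  # without this call, join() below would time out
    worker.join(timeout=5)
    return worker.is_alive(), results
```

In this sketch, a shutdown path that joins the worker without first calling `cancel_wait()` would wait on a thread that never wakes up unless an event happens to arrive.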
> > socketThread is already shutdown at these hangs:
> > https://crash-stats.mozilla.com/report/index/f5e03c42-6861-4bb4-9df4-cb78c0170711#allthreads
> > https://crash-stats.mozilla.com/report/index/a843582d-7d2f-43ba-99b2-f14a20170711#allthreads
> > https://crash-stats.mozilla.com/report/index/789873ed-a4bb-4a6e-86a0-14fe80170711#allthreads
> > https://crash-stats.mozilla.com/report/index/55b46a98-f916-499d-90ee-017f50170711#allthreads
> > https://crash-stats.mozilla.com/report/index/cd13db6f-8d29-4232-adcf-9e43e0170711#allthreads
>
> These don't seem to have anything to do with NSS or PSM.
I can't find anything that points to NSS/PSM either.
Flags: needinfo?(ttaubert)
Comment 7•8 years ago
This is the #2 crash for Thunderbird 55.0b2. Most users never crashed prior to 55.0b2. Some examples:
bp-4a5221ef-f6f9-49f4-a9a7-cd8670170726
bp-7ee2f376-c54c-49a1-a86f-7080f0170719
bp-8e663371-57c7-4b6e-b964-1241b0170719
bp-ae4e7ffc-69d3-4a18-be8f-fe38a0170726
A few use frontier. Not sure what to make of it.
Keywords: topcrash-thunderbird
Whiteboard: [necko-active] → [necko-active][tbird topcrash]
Comment 8•8 years ago
(In reply to Wayne Mery (:wsmwk, NI for questions) from comment #7)
> #2 crash for Thunderbird 55.0b2. most of users never crashed prior to
> 55.0b2. Some examples
> bp-4a5221ef-f6f9-49f4-a9a7-cd8670170726
> bp-7ee2f376-c54c-49a1-a86f-7080f0170719
> bp-8e663371-57c7-4b6e-b964-1241b0170719
> bp-ae4e7ffc-69d3-4a18-be8f-fe38a0170726
>
> A few use frontier. Not sure what to make of it.
bp-7ee2f376-c54c-49a1-a86f-7080f0170719 and bp-ae4e7ffc-69d3-4a18-be8f-fe38a0170726 are the well-known IMAP password dialog issue: the user doesn't close the password dialog, so shutdown isn't processed.
Comment 9•8 years ago
Bulk priority update: https://bugzilla.mozilla.org/show_bug.cgi?id=1399258
Priority: -- → P1
Comment 10•8 years ago
Comment 11•8 years ago
Can I work around this in my (legacy XPCOM/embedded web) extension? Am I supposed to be calling a shut down in addition to a start up?
This is happening in Greasemonkey 3.12, but not 3.11, where ~the only change is embedding a webext, for migrating data into.
Or is there a way for me to verify that this is the real cause of the failure we're seeing (comment #10)?
Comment 12•8 years ago
I don't think the Greasemonkey issue is related to the NSS issue that's discussed at the top of this bug. The Greasemonkey stacks are hanging while a synchronous XMLHttpRequest spins the event loop, e.g.: https://crash-stats.mozilla.com/report/index/da1e71e9-3d7d-47c4-a237-7cf3a1170922
Comment 13•8 years ago
A crash report from the Greasemonkey issue: https://crash-stats.mozilla.com/report/index/7c18a99c-428b-40a8-8e30-360e81170921
If you click the "Bugzilla" tab for that report, it lists two related bugs: bug 1388370 (marked as fixed over a month ago) and this one.
Also, the crash report signature is the same as this bug, that's why I mentioned it as related in the issue in GitHub.
Comment 14•8 years ago
(In reply to Kostas from comment #13)
> Also, the crash report signature is the same as this bug, that's why I
> mentioned it as related in the issue in GitHub.
Yes, the crash signature matches, but from a quick look the underlying cause is probably different. My comment was addressed at comment 11, which sounds like it didn't realize the initial comments are about a problem that's probably not the same as the one greasemonkey is seeing.
(I might also just be wrong and confusing matters for everyone, so I'll shut up until someone more informed can comment instead)
Comment 15•8 years ago
I filed another bug (bug 1402201) for the GM crash.
I didn't seem to get the same signature as this one (one of the reports: https://crash-stats.mozilla.com/report/index/01f2d2f5-415b-4c6c-aca3-b9fa30170922).
Comment 16•8 years ago
> I didn't seem to get the same signature as this one (one of report: https://crash-stats.mozilla.com/report/index/01f2d2f5-415b-4c6c-aca3-b9fa30170922).
In that report you use FF 56 beta.
That's why you got a different signature ("shutdownhang | NtWaitForKeyedEvent | RtlSleepConditionVariableCS | SleepConditionVariableCS").
I use FF 55.0.3 stable.
I've recreated this repeatedly, and in no case did I get that signature,
it's always the same as this bug title:
https://crash-stats.mozilla.com/report/index/bp-ff15a46d-e0e1-4e3e-8293-9eac41170922
https://crash-stats.mozilla.com/report/index/bp-f5292a96-cd71-47e4-82b8-385af0170922
https://crash-stats.mozilla.com/report/index/bp-0e4c18d1-6106-47ee-80c7-063dc1170922
https://crash-stats.mozilla.com/report/index/bp-82af81d5-0122-4455-a9fa-86b7d1170922
https://crash-stats.mozilla.com/report/index/bp-ad897f29-0a40-4f7b-beeb-ffa0d1170922
Comment 17•8 years ago
(In reply to Kostas from comment #16)
> > I didn't seem to get the same signature as this one (one of report: https://crash-stats.mozilla.com/report/index/01f2d2f5-415b-4c6c-aca3-b9fa30170922).
>
> In that report you use FF 56 beta.
> That's why you got a different signature ("shutdownhang |
> NtWaitForKeyedEvent | RtlSleepConditionVariableCS |
> SleepConditionVariableCS").
I can't get the same signature in stable either.
https://crash-stats.mozilla.com/report/index/dee75707-6a2c-4dca-b6ea-126a41170922
https://crash-stats.mozilla.com/report/index/85e8ed57-45ee-4aba-82ae-630411170922
https://crash-stats.mozilla.com/report/index/efaf3519-1246-4768-9f25-9935c1170922
All GM crashes have the signature "shutdownhang | NtWaitForKeyedEvent | RtlSleepConditionVariableSRW | SleepConditionVariableSRW | ...".
Comment 18•8 years ago
Clarification: by "can't get the same" I mean even in stable my signature is different from the title here.
Comment 19•8 years ago
> I mean even in stable my signature is different from the title here.
They are almost the same (except NtWaitForAlertByThreadId -> NtWaitForKeyedEvent ):
shutdownhang | NtWaitForAlertByThreadId | RtlSleepConditionVariableSRW | SleepConditionVariableSRW | mozilla::detail::ConditionVariableImpl::wait | mozilla::CondVar::Wait | nsEventQueue::GetEvent | nsThread::nsChainedEventQueue::GetEvent | nsThread::Ge..
shutdownhang | NtWaitForKeyedEvent | RtlSleepConditionVariableSRW | SleepConditionVariableSRW | mozilla::detail::ConditionVariableImpl::wait | mozilla::CondVar::Wait | nsEventQueue::GetEvent | nsThread::nsChainedEventQueue::GetEvent | nsThread::GetEven...
I see in your reports that you use Windows 7. Maybe that's why it's different (I use Windows 10).
Or maybe you used the same profile from FF 56 in FF 55?
If so, do you have a backup of your FF 55 profile that you could restore to try to recreate the issue?
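The frame-by-frame comparison of the two signatures above can be sketched as below. This is only an illustration: `split_signature` and `differing_frames` are hypothetical helpers, not part of crash-stats, and the sample strings use just the leading frames that are not truncated in the reports.

```python
def split_signature(signature):
    """Split a pipe-delimited crash signature into a list of frame names."""
    return [frame.strip() for frame in signature.split("|")]


def differing_frames(sig_a, sig_b):
    """Return (index, frame_a, frame_b) for each position where frames differ.

    Only the frames present in both signatures are compared.
    """
    frames_a = split_signature(sig_a)
    frames_b = split_signature(sig_b)
    return [
        (index, a, b)
        for index, (a, b) in enumerate(zip(frames_a, frames_b))
        if a != b
    ]


# Leading frames of the signature in this bug's title.
SIG_TITLE = (
    "shutdownhang | NtWaitForAlertByThreadId | "
    "RtlSleepConditionVariableSRW | SleepConditionVariableSRW"
)
# Leading frames of the signature from the other reports.
SIG_OTHER = (
    "shutdownhang | NtWaitForKeyedEvent | "
    "RtlSleepConditionVariableSRW | SleepConditionVariableSRW"
)

if __name__ == "__main__":
    for index, a, b in differing_frames(SIG_TITLE, SIG_OTHER):
        print(f"frame {index}: {a} != {b}")
```

Only the first frame after `shutdownhang` differs, which is a low-level wait primitive rather than anything in Firefox's own code.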
Comment 20•8 years ago
To test, I use a brand new profile with minimal files:
extensions\{e4a8a97b-f2ed-450b-b12d-ee082ba24781}.xpi (GM)
better_better_booru.user.js (the script with which I encountered the problem)
config.xml (GM config)
extensions.json (so I don't need to re-install it every time)
Comment 21•8 years ago
I don't see any crashes with this signature in the last month.
Jason, can we close this bug?
Flags: needinfo?(jduell.mcbugs)
Comment 22•8 years ago
Most probably the signature has changed. We can close this bug.
Status: NEW → RESOLVED
Closed: 8 years ago
Flags: needinfo?(jduell.mcbugs)
Resolution: --- → WORKSFORME