Closed
Bug 1273475
Opened 8 years ago
Closed 8 years ago
Crash in shutdownhang | WaitForSingleObjectEx | WaitForSingleObject | PR_WaitCondVar | mozilla::CondVar::Wait | nsNSSActivityState::restrictActivityToCurrentThread
Categories
(Core :: Security: PSM, defect, P1)
Core
Security: PSM
Tracking
()
RESOLVED
FIXED
mozilla51
People
(Reporter: mcmanus, Assigned: keeler)
References
Details
(Keywords: crash, Whiteboard: [psm-assigned])
Crash Data
Attachments
(4 files)
58 bytes,
text/x-review-board-request
|
jcj
:
review+
ttaubert
:
review+
lizzard
:
approval-mozilla-aurora+
lizzard
:
approval-mozilla-beta+
|
Details |
58 bytes,
text/x-review-board-request
|
ttaubert
:
review+
|
Details |
9.58 KB,
patch
|
Details | Diff | Splinter Review | |
3.81 KB,
patch
|
Details | Diff | Splinter Review |
just hit https://crash-stats.mozilla.com/report/index/fa3c765d-623c-4d4f-8883-d161e2160517 - on nightly 49 There seem to 100+ of them in crash stats without a bug that I can find. Perhaps low hanging project uptime fruit. It appears to be older than 45 https://crash-stats.mozilla.com/signature/?signature=shutdownhang+%7C+WaitForSingleObjectEx+%7C+WaitForSingleObject+%7C+PR_WaitCondVar+%7C+mozilla%3A%3ACondVar%3A%3AWait+%7C+nsNSSActivityState%3A%3ArestrictActivityToCurrentThread#summary
Assignee | ||
Updated•8 years ago
|
Whiteboard: [psm-backlog]
Comment 1•8 years ago
|
||
(Updating summary so that hopefully this bug shows up on crash-stats.)
Crash Signature: [@ shutdownhang | WaitForSingleObjectEx | WaitForSingleObject | PR_WaitCondVar | mozilla::CondVar::Wait | nsNSSActivityState::restrictActivityToCurrentThread]
Keywords: crash
Summary: restrictActivityToCurrentThread shutdown hang → Crash in shutdownhang | WaitForSingleObjectEx | WaitForSingleObject | PR_WaitCondVar | mozilla::CondVar::Wait | nsNSSActivityState::restrictActivityToCurrentThread
Comment 3•8 years ago
|
||
Crash volume for signature 'shutdownhang | WaitForSingleObjectEx | WaitForSingleObject | PR_WaitCondVar | mozilla::CondVar::Wait | nsNSSActivityState::restrictActivityToCurrentThread': - nightly (version 50): 60 crashes from 2016-06-06. - aurora (version 49): 579 crashes from 2016-06-07. - beta (version 48): 5339 crashes from 2016-06-06. - release (version 47): 0 crashes from 2016-05-31. - esr (version 45): 0 crashes from 2016-04-07. Crash volume on the last weeks: W. N-1 W. N-2 W. N-3 W. N-4 W. N-5 W. N-6 W. N-7 - nightly 6 12 8 7 6 8 5 - aurora 72 85 86 71 85 78 66 - beta 907 767 746 699 713 628 555 - release 0 0 0 0 0 0 0 - esr 0 0 0 0 0 0 0 Affected platform: Windows
Assignee | ||
Comment 5•8 years ago
|
||
Well, bad news is we introduced a deadlock. Good news is I'm fairly sure I've identified what it is and that the patch I'm working on fixes it. I'll have that ready tomorrow.
Assignee: nobody → dkeeler
Flags: needinfo?(dkeeler)
Priority: -- → P1
Whiteboard: [psm-backlog] → [psm-assigned]
Assignee | ||
Comment 6•8 years ago
|
||
This fixes two issues: 1. nsNSSShutDownList::evaporateAllNSSResources could deadlock by acquiring sListLock and then the singleton's mNSSActivityStateLock in nsNSSActivityState::restrictActivityToCurrentThread. 2. Calling UnloadLoadableRoots before nsNSSShutDownList::evaporateAllNSSResources could result in removing modules that were still in use, causing assertion failures and potential crashes. Review commit: https://reviewboard.mozilla.org/r/69044/diff/#index_header See other reviews: https://reviewboard.mozilla.org/r/69044/
Attachment #8777512 -
Flags: review?(ttaubert)
Attachment #8777512 -
Flags: review?(jjones)
Comment 7•8 years ago
|
||
Comment on attachment 8777512 [details] bug 1273475 - fix deadlock and potential crash when PSM shuts down NSS https://reviewboard.mozilla.org/r/69044/#review66200 I guess this would uplift into Aurora. Since it won't get a full cycle in Nightly, perhaps this should use MOZ_RELEASE_ASSERT so we collect destructor-wrong-thread crashes in the Beta cycle too, instead of hanging. I'd love to get Tim's feedback on that. Otherwise, r+, and please carry the r+ through if we're moving to RELEASE_ASSERT.
Attachment #8777512 -
Flags: review?(jjones) → review+
Comment 8•8 years ago
|
||
Patrick - is the 5000 figure for 48 up there in Comment 3 high? Like, should we expect to need to uplift this all the way to firefox48?
Flags: needinfo?(mcmanus)
Reporter | ||
Comment 9•8 years ago
|
||
(In reply to J.C. Jones [:jcj] from comment #8) > Patrick - is the 5000 figure for 48 up there in Comment 3 high? Like, should > we expect to need to uplift this all the way to firefox48? its certainly possible as a ride along.. sylvestre would be the person I would ask.
Flags: needinfo?(mcmanus) → needinfo?(sledru)
Comment 10•8 years ago
|
||
Yes, probably. Could you fill the uplift request to Aurora, beta and release? Thanks
Flags: needinfo?(sledru)
Updated•8 years ago
|
Attachment #8777512 -
Flags: review?(ttaubert) → review+
Comment 11•8 years ago
|
||
Comment on attachment 8777512 [details] bug 1273475 - fix deadlock and potential crash when PSM shuts down NSS https://reviewboard.mozilla.org/r/69044/#review66608 LGTM
Comment 12•8 years ago
|
||
(In reply to J.C. Jones [:jcj] from comment #7) > I guess this would uplift into Aurora. Since it won't get a full cycle in > Nightly, perhaps this should use MOZ_RELEASE_ASSERT so we collect > destructor-wrong-thread crashes in the Beta cycle too, instead of hanging. > I'd love to get Tim's feedback on that. Yeah, that might indeed be a good idea to find misbehaving code earlier.
Comment 13•8 years ago
|
||
The volume is not very big on release. As it is a shutdown issue, I won't take that as a ride along.
status-firefox51:
--- → affected
Comment 14•8 years ago
|
||
OK, having trouble getting my follow-on patch to add-in to Keeler's MozReview (it's opening a whole new review w/ his patch included), so let's get the first patch landed and I'll add the follow-on after.
Keywords: checkin-needed,
leave-open
Comment 15•8 years ago
|
||
Pushed by cbook@mozilla.com: https://hg.mozilla.org/integration/mozilla-inbound/rev/990954b29382 fix deadlock and potential crash when PSM shuts down NSS. r=ttaubert, r=jcj
Keywords: checkin-needed
Comment hidden (mozreview-request) |
Comment 17•8 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/990954b29382
Comment 18•8 years ago
|
||
mozreview-review |
Comment on attachment 8778981 [details] bug 1273475 - use release assertions for PSM->NSS shutdown https://reviewboard.mozilla.org/r/70054/#review67618
Attachment #8778981 -
Flags: review?(ttaubert) → review+
Comment 19•8 years ago
|
||
Marking checkin-needed for the release assertions patch. Leaving it open, still. The try run is here: https://treeherder.mozilla.org/#/jobs?repo=try&revision=028ec8bac2c4
Keywords: checkin-needed
Comment 20•8 years ago
|
||
Pushed by cbook@mozilla.com: https://hg.mozilla.org/integration/mozilla-inbound/rev/f7e33d3604f5 use release assertions for PSM->NSS shutdown. r=ttaubert
Keywords: checkin-needed
Comment 21•8 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/f7e33d3604f5
Assignee | ||
Updated•8 years ago
|
Keywords: leave-open
Assignee | ||
Updated•8 years ago
|
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Assignee | ||
Comment 22•8 years ago
|
||
Comment on attachment 8777512 [details] bug 1273475 - fix deadlock and potential crash when PSM shuts down NSS Approval Request Comment [Feature/regressing bug #]: bug 1235634 appears to have made this worse, although I think this issue has been present for a long time [User impact if declined]: small potential for a shutdown crash and/or hang [Describe test coverage new/current, TreeHerder]: has a test [Risks and why]: low - the worst that will happen is we traded one kind of crash for another kind of crash [String/UUID change made/needed]: none
Attachment #8777512 -
Flags: approval-mozilla-beta?
Attachment #8777512 -
Flags: approval-mozilla-aurora?
Comment 23•8 years ago
|
||
I don't see any crashes with this signature for the last week on beta. I checked back to 2016-08-01 and still don't see anything (on any channel). David, do you still think we need to uplift this? Maybe some other uplift shifted the crash signature. I do remember this being a high volume signature a while back.
Flags: needinfo?(dkeeler)
Assignee | ||
Comment 24•8 years ago
|
||
I think users are still hitting this fairly often: https://crash-stats.mozilla.com/signature/?signature=shutdownhang%20%7C%20mozilla%3A%3ACondVar%3A%3AWait%20%7C%20nsNSSActivityState%3A%3ArestrictActivityToCurrentThread&_columns=date&_columns=product&_columns=version&_columns=build_id&_columns=platform&_columns=reason&_columns=address&_sort=-date&page=1#reports (looks like the signature changed a bit, but those stacks looks like exactly what this patch is intended to fix)
Flags: needinfo?(dkeeler)
Comment 25•8 years ago
|
||
Yes, looks like there are a lot of shutdown hangs where nsNSSActivityState::restrictActivityToCurrentThread is on the stack: https://crash-stats.mozilla.com/search/?product=Firefox&proto_signature=~nsNSSActivityState%3A%3ArestrictActivityToCurrentThread&_sort=-date&_facets=signature&_facets=version&_columns=date&_columns=signature&_columns=product&_columns=version&_columns=build_id&_columns=platform#facet-signature.
Crash Signature: [@ shutdownhang | WaitForSingleObjectEx | WaitForSingleObject | PR_WaitCondVar | mozilla::CondVar::Wait | nsNSSActivityState::restrictActivityToCurrentThread] → [@ shutdownhang | WaitForSingleObjectEx | WaitForSingleObject | PR_WaitCondVar | mozilla::CondVar::Wait | nsNSSActivityState::restrictActivityToCurrentThread]
[@ shutdownhang | mozilla::CondVar::Wait | nsNSSActivityState::restrictActivityToCurrentThread]…
Comment 26•8 years ago
|
||
Comment on attachment 8777512 [details] bug 1273475 - fix deadlock and potential crash when PSM shuts down NSS Let's uplift this, as we are still getting a few hundred crashes per beta release per week. It should land for beta 8. Thanks David and Marco!
Attachment #8777512 -
Flags: approval-mozilla-beta?
Attachment #8777512 -
Flags: approval-mozilla-beta+
Attachment #8777512 -
Flags: approval-mozilla-aurora?
Attachment #8777512 -
Flags: approval-mozilla-aurora+
Updated•8 years ago
|
Keywords: checkin-needed
Updated•8 years ago
|
Severity: normal → critical
Target Milestone: --- → mozilla51
This hit conflicts for me uplifting it to aurora, so it likely doesn't apply to beta. Could we get rebased patches for this?
Flags: needinfo?(dkeeler)
Updated•8 years ago
|
Keywords: checkin-needed
Assignee | ||
Comment 28•8 years ago
|
||
I applied the following: https://hg.mozilla.org/mozilla-central/rev/990954b29382 https://hg.mozilla.org/mozilla-central/rev/f7e33d3604f5 on top of https://hg.mozilla.org/releases/mozilla-aurora/rev/285149724c55 without conflicts. Is that the right base? What were the conflicts you encountered? In terms of beta, there definitely are conflicts. I had forgotten that bug 1284946 hasn't gotten as far as beta yet. I'll investigate making a branch patch for that (and if it will even be a good idea to do so).
Flags: needinfo?(dkeeler) → needinfo?(wkocher)
Comment 29•8 years ago
|
||
bugherder uplift |
https://hg.mozilla.org/releases/mozilla-aurora/rev/97cd2fba07b3 https://hg.mozilla.org/releases/mozilla-aurora/rev/d52d39c46763
Flags: in-testsuite+
Assignee | ||
Comment 30•8 years ago
|
||
Assignee | ||
Comment 31•8 years ago
|
||
Assignee | ||
Comment 32•8 years ago
|
||
Ok - the beta conflicts were nowhere near as considerable as I thought they were going to be. I think we're fine to move forward with uplifting this to beta.
Keywords: checkin-needed
Updated•8 years ago
|
Flags: needinfo?(wkocher)
Comment 33•8 years ago
|
||
bugherder uplift |
https://hg.mozilla.org/releases/mozilla-beta/rev/2406bffc6370 https://hg.mozilla.org/releases/mozilla-beta/rev/e2d17f2657b8
Keywords: checkin-needed
You need to log in
before you can comment on or make changes to this bug.
Description
•