Closed Bug 1389252 Opened 2 years ago Closed 2 years ago

Crash in nsThread::ShutdownInternal | nsThread::Shutdown | nsThreadManager::NewNamedThread

Categories

(Core :: XPCOM, defect, critical)

56 Branch
All
Windows
defect
Not set
critical

Tracking

()

RESOLVED FIXED
mozilla57
Tracking Status
firefox-esr52 --- unaffected
firefox55 --- unaffected
firefox56 --- fixed
firefox57 --- fixed

People

(Reporter: philipp, Assigned: Nika)

Details

(Keywords: crash, regression)

Crash Data

Attachments

(1 file)

This bug was filed from the Socorro interface and is 
report bp-dbe584d3-876e-444d-ba56-bcb670170810.
=============================================================
Crashing Thread (13), Name: BgHangManager
Frame 	Module 	Signature 	Source
0 	xul.dll 	nsThread::ShutdownInternal(bool) 	xpcom/threads/nsThread.cpp:963
1 	xul.dll 	nsThread::Shutdown() 	xpcom/threads/nsThread.cpp:1039
2 	xul.dll 	nsThreadManager::NewNamedThread(nsACString const&, unsigned int, nsIThread**) 	xpcom/threads/nsThreadManager.cpp:286
3 	xul.dll 	NS_NewNamedThread(nsACString const&, nsIThread**, nsIRunnable*, unsigned int) 	xpcom/threads/nsThreadUtils.cpp:104
4 	xul.dll 	nsThreadPool::PutEvent(already_AddRefed<nsIRunnable>, unsigned int) 	xpcom/threads/nsThreadPool.cpp:107
5 	xul.dll 	nsThreadPool::Dispatch(already_AddRefed<nsIRunnable>, unsigned int) 	xpcom/threads/nsThreadPool.cpp:274
6 	xul.dll 	mozilla::net::nsStreamTransportService::Dispatch(already_AddRefed<nsIRunnable>, unsigned int) 	netwerk/base/nsStreamTransportService.cpp:490
7 	xul.dll 	mozilla::BackgroundHangThread::ReportHang(unsigned int) 	xpcom/threads/BackgroundHangMonitor.cpp:656
8 	xul.dll 	mozilla::BackgroundHangManager::RunMonitorThread() 	xpcom/threads/BackgroundHangMonitor.cpp:368
9 	xul.dll 	mozilla::BackgroundHangManager::MonitorThread(void*) 	xpcom/threads/BackgroundHangMonitor.cpp:72
10 	nss3.dll 	_PR_NativeRunThread 	nsprpub/pr/src/threads/combined/pruthr.c:397
11 	nss3.dll 	pr_root 	nsprpub/pr/src/md/windows/w95thred.c:95
12 	ucrtbase.dll 	_o__CIpow 	
13 	kernel32.dll 	BaseThreadInitThunk 	
14 	mozglue.dll 	patched_BaseThreadInitThunk 	mozglue/build/WindowsDllBlocklist.cpp:815
15 	ntdll.dll 	__RtlUserThreadStart 	
16 	ntdll.dll 	_RtlUserThreadStart

reports about these crashes during shutdown are newly showing up from windows users in 56.0b1 (only on the aurora channel).
I see BHR in all of these stacks. Any ideas, Michael?
Flags: needinfo?(michael)
OK, I have some idea as to what is going on here (I think), just based on reading the code:
Basically, when creating a thread for the nsStreamTransportService from the BackgroundHangMonitor (as the BackgroundHangMonitor dispatches to STS), we detect that the browser is shutting down. Being good citizens, the threading code decides to clean up the newly created thread.

This causes us to enter ShutdownInternal() for the newly created thread, which contains these lines:
http://searchfox.org/mozilla-central/rev/4b79f3b23aebb4080ea85e94351fd4046116a957/xpcom/threads/nsThread.cpp#962-963 

These try to construct a NotNull<nsThread*> out of GetCurrentThread(), which _should_ work, however the BackgroundHangManager thread is not a nsThread, it is a raw PRThread, so we get a NULL and trigger a crash :-/.

I'm not sure what the best strategy here is. The easiest solution would be to turn off BHR on the beta channel (we don't look at beta BHR data anyway and it's being turned off in bug 1380081 anyway).
Flags: needinfo?(michael)
This is a bandaid for this crash.

MozReview-Commit-ID: K98DceJ2uHz
Attachment #8896401 - Flags: review?(nfroyd)
Comment on attachment 8896401 [details] [diff] [review]
Disable BHR in beta

Review of attachment 8896401 [details] [diff] [review]:
-----------------------------------------------------------------

Seems sort of bonkers that we don't look at the BHR data on beta, since that's a bigger pool of users (isn't it?), but if we're turning it off soon in other bugs, this is OK.  This patch is also significantly less risky than uplifting the stack of patches from the other bugs.

Please nominate this for beta uplift after it lands.
Attachment #8896401 - Flags: review?(nfroyd) → review+
(In reply to Nathan Froyd [:froydnj] from comment #4)
> Please nominate this for beta uplift after it lands.

I was planning to only land this on beta, are you OK with that?
Flags: needinfo?(nfroyd)
(In reply to Michael Layzell [:mystor] from comment #5)
> (In reply to Nathan Froyd [:froydnj] from comment #4)
> > Please nominate this for beta uplift after it lands.
> 
> I was planning to only land this on beta, are you OK with that?

Uh, that's a little weird?  The thinking is that this will be landed on beta, and then the code that it would have been patching will go away before we uplift 57 to beta, and we wouldn't have needed this patch anyway?

If you can convince relman of this line of thinking, then OK, but I don't see why landing it on central first would necessarily be a bad thing--other than churning the other patches that haven't landed yet.
Flags: needinfo?(nfroyd)
https://hg.mozilla.org/mozilla-central/rev/5f5f44e4699a
Status: NEW → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla57
Assignee: nobody → michael
Looks like this needs a Beta approval request still?
Flags: needinfo?(michael)
Comment on attachment 8896401 [details] [diff] [review]
Disable BHR in beta

Approval Request Comment
[Feature/Bug causing the regression]: BHR
[User impact if declined]: Crashes on Windows
[Is this code covered by automated tests?]: Disabling code
[Has the fix been verified in Nightly?]: Yes
[Needs manual test from QE? If yes, steps to reproduce]: No 
[List of other uplifts needed for the feature/fix]: None
[Is the change risky?]: No
[Why is the change risky/not risky?]: Disables independent telemetry code on beta.
[String changes made/needed]: None
Flags: needinfo?(michael)
Attachment #8896401 - Flags: approval-mozilla-beta?
Comment on attachment 8896401 [details] [diff] [review]
Disable BHR in beta

Disable BHR in beta. Beta56+.
Attachment #8896401 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
(In reply to Michael Layzell [:mystor] from comment #10)
> [Is this code covered by automated tests?]: Disabling code
> [Has the fix been verified in Nightly?]: Yes
> [Needs manual test from QE? If yes, steps to reproduce]: No 

Setting qe-verify- based on Michael's assessment on manual testing needs.
Flags: qe-verify-
You need to log in before you can comment on or make changes to this bug.