Closed Bug 1329449 Opened 7 years ago Closed 6 years ago

[e10s] Crash in mozilla::SingletonThreadHolder::ReleaseUse

Categories

(Core :: WebRTC: Networking, defect, P2)

50 Branch
x86
Windows
defect

Tracking

()

RESOLVED WORKSFORME
Tracking Status
firefox-esr45 --- wontfix
firefox50 --- wontfix
firefox51 --- wontfix
firefox52 --- wontfix
firefox-esr52 --- affected
firefox53 --- wontfix
firefox54 --- fix-optional
firefox55 --- affected

People

(Reporter: philipp, Unassigned)

References

Details

(Keywords: crash, csectype-nullptr)

Crash Data

Attachments

(1 file)

This bug was filed from the Socorro interface and is 
report bp-c88c116b-8854-44b3-8142-3dbbf2170107.
=============================================================
Crashing Thread (14)
Frame 	Module 	Signature 	Source
0 	xul.dll 	mozilla::SingletonThreadHolder::ReleaseUse() 	media/mtransport/nr_socket_prsock.cpp:235
1 	xul.dll 	mozilla::detail::RunnableFunction<void (*)(void)>::Run() 	obj-firefox/dist/include/nsThreadUtils.h:282
2 	xul.dll 	nsThread::ProcessNextEvent(bool, bool*) 	xpcom/threads/nsThread.cpp:1067
3 	xul.dll 	NS_ProcessNextEvent(nsIThread*, bool) 	xpcom/glue/nsThreadUtils.cpp:311
4 	xul.dll 	mozilla::net::nsSocketTransportService::Run() 	netwerk/base/nsSocketTransportService2.cpp:939
5 	xul.dll 	nsThread::ProcessNextEvent(bool, bool*) 	xpcom/threads/nsThread.cpp:1067
6 	xul.dll 	NS_ProcessNextEvent(nsIThread*, bool) 	xpcom/glue/nsThreadUtils.cpp:311
7 	xul.dll 	mozilla::ipc::MessagePumpForNonMainThreads::Run(base::MessagePump::Delegate*) 	ipc/glue/MessagePump.cpp:338
8 	xul.dll 	MessageLoop::RunHandler() 	ipc/chromium/src/base/message_loop.cc:225
9 	xul.dll 	MessageLoop::Run() 	ipc/chromium/src/base/message_loop.cc:205
10 	xul.dll 	nsThread::ThreadFunc(void*) 	xpcom/threads/nsThread.cpp:465
11 	nss3.dll 	_PR_NativeRunThread 	nsprpub/pr/src/threads/combined/pruthr.c:397
12 	nss3.dll 	pr_root 	nsprpub/pr/src/md/windows/w95thred.c:95
13 	ucrtbase.dll 	_o__CIpow 	
14 	kernel32.dll 	BaseThreadInitThunk 	
15 	ntdll.dll 	__RtlUserThreadStart 	
16 	ntdll.dll 	_RtlUserThreadStart

this is a crash in the content process that happens on 32bit versions on windows in a codepath from bug 1145354. overall it's rather low volume though (10-50 daily reports).

Correlations for Firefox Beta
(100.0% in signature vs 19.28% overall) address = 0x0
(87.32% in signature vs 26.73% overall) Module "dhcpcsvc6.DLL" = true
(100.0% in signature vs 51.04% overall) reason = EXCEPTION_ACCESS_VIOLATION_READ
Assignee: nobody → rjesup
Severity: critical → normal
Rank: 15
Component: General → WebRTC: Networking
Priority: -- → P1
This is very odd; I have trouble seeing how mThread could be null if we held a use that we're now releasing...  though clearly it must be, and the logic that ends up calling ReleaseUse() is involved.
a nullptr check can't hurt...
Attachment #8824862 - Flags: review?(drno)
Comment on attachment 8824862 [details] [diff] [review]
null-check mThread before shutting it down

Review of attachment 8824862 [details] [diff] [review]:
-----------------------------------------------------------------

LGTM with Nits.

::: media/mtransport/nr_socket_prsock.cpp
@@ +228,5 @@
>      nsrefcnt count = --mUseCount;
>      MOZ_ASSERT(int32_t(mUseCount) >= 0, "illegal refcnt");
>      if (count == 0) {
>        // in-use -> idle -- no one forcing it to remain instantiated
> +      if (mThread) {

How about we add another MOZ_ASSERT(mThread) before the if, or add an else path below with logging and MOZ_CRASH?

All the ways I can come up this happening are scary. But all of the crashes happen on machines with 2-4G RAM. So most likely this is the aftermath of some OOM problem (which screws up the ref counting?).

The other thing which comes to my mind is if "Working Offline" could cause this, since it shuts down the STS thread in the parent. But that would not explain why this happens on 32bit machines with little RAM only.
Attachment #8824862 - Flags: review?(drno) → review+
Trading the current nullptr crash for a RELEASE_ASSERT would do little, just change the signature.  Landing this patch would quiet things ... but this shouldn't be happening.  Yes, we can armor against it, but as you point out something has got to be wrong... so maybe we leave the nullptr crash until we can find something actionable as a theory.
See Also: → 1310302
Mass wontfix for bugs affecting firefox 52.
nils - bouncing this to you for now
Mass change P1->P2 to align with new Mozilla triage process
Priority: P1 → P2
I'm not seeing any crashes here since 53.
Status: NEW → RESOLVED
Closed: 6 years ago
Flags: needinfo?(drno)
Resolution: --- → WORKSFORME
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: