Closed Bug 1451262 Opened 6 years ago Closed 6 years ago

Intermittent Assertion failure: OnSocketThread() (not on socket thread), at z:/build/build/src/netwerk/base/nsSocketTransport2.cpp:278

Categories

(Core :: Networking, defect, P3)

defect

Tracking

()

RESOLVED FIXED
mozilla62
Tracking Status
firefox-esr52 --- unaffected
firefox-esr60 --- unaffected
firefox60 --- unaffected
firefox61 --- fixed
firefox62 --- fixed

People

(Reporter: intermittent-bug-filer, Assigned: valentin)

Details

(Keywords: assertion, intermittent-failure, Whiteboard: [necko-triaged])

Attachments

(1 file)

The assertion stack is not that pretty helpful but for postery here the details:

07:29:20     INFO -  #01: ???[Z:\task_1522824245\build\application\firefox\xul.dll +0xccb787]
07:29:20     INFO -  #02: ???[Z:\task_1522824245\build\application\firefox\xul.dll +0xcd6583]
07:29:20     INFO -  #03: ???[Z:\task_1522824245\build\application\firefox\xul.dll +0xbe5bdb]
07:29:20     INFO -  #04: ???[Z:\task_1522824245\build\application\firefox\xul.dll +0xbf3b5b]
07:29:20     INFO -  #05: ???[Z:\task_1522824245\build\application\firefox\xul.dll +0xbebe5c]
07:29:20     INFO -  #06: ???[Z:\task_1522824245\build\application\firefox\xul.dll +0xbf5a67]
07:29:20     INFO -  #07: DumpFrameArray[Z:\task_1522824245\build\application\firefox\xul.dll +0x4ada77a]
07:29:20     INFO -  #08: ???[Z:\task_1522824245\build\application\firefox\xul.dll +0xcd6e72]
07:29:20     INFO -  #09: ???[Z:\task_1522824245\build\application\firefox\xul.dll +0xbe5bdb]
07:29:20     INFO -  #10: ???[Z:\task_1522824245\build\application\firefox\xul.dll +0xbf3b5b]
07:29:20     INFO -  #11: soundtouch::SoundTouch::operator=[Z:\task_1522824245\build\application\firefox\xul.dll +0x1157a2e]
07:29:20     INFO -  #12: ???[Z:\task_1522824245\build\application\firefox\xul.dll +0x111bf3e]
07:29:20     INFO -  #13: ???[Z:\task_1522824245\build\application\firefox\xul.dll +0x111bb12]
07:29:20     INFO -  #14: ???[Z:\task_1522824245\build\application\firefox\xul.dll +0xbed21e]
07:29:20     INFO -  #15: PR_NativeCreateThread[Z:\task_1522824245\build\application\firefox\nss3.dll +0xc6b1e]
07:29:20     INFO -  #16: PR_MD_WAIT_CV[Z:\task_1522824245\build\application\firefox\nss3.dll +0xb827e]
07:29:20     INFO -  #17: o__strtoui64[C:\Windows\System32\ucrtbase.dll +0x20369]
07:29:20     INFO -  #18: BaseThreadInitThunk[C:\Windows\System32\KERNEL32.DLL +0x12774]
07:29:20     INFO -  #19: RtlUserThreadStart[C:\Windows\SYSTEM32\ntdll.dll +0x70d61]
07:2
I extracted the trace from the logs:

Thread 9 (crashed)
mozilla::net::nsSocketInputStream::OnSocketReady(nsresult) [nsSocketTransport2.cpp:278]
mozilla::net::nsSocketTransport::OnSocketEvent(unsigned int,nsresult,nsISupports *) [nsSocketTransport2.cpp:2070]
mozilla::net::nsSocketEvent::Run() [nsSocketTransport2.cpp:90]
nsThread::ProcessNextEvent(bool,bool *) [nsThread.cpp:1096]
NS_ProcessNextEvent(nsIThread *,bool) [nsThreadUtils.cpp:519]
nsThread::Shutdown() [nsThread.cpp:814]
nsThreadPool::Shutdown() [nsThreadPool.cpp:344]
mozilla::psm::StopSSLServerCertVerificationThreads() [SSLServerCertVerification.cpp:196]
mozilla::net::nsSocketTransportService::Run() [nsSocketTransportService2.cpp:1072]
nsThread::ProcessNextEvent(bool,bool *) [nsThread.cpp:1096]
NS_ProcessNextEvent(nsIThread *,bool) [nsThreadUtils.cpp:519]
mozilla::ipc::MessagePumpForNonMainThreads::Run(base::MessagePump::Delegate *) [MessagePump.cpp:364]
MessageLoop::RunHandler() [message_loop.cc:319]
MessageLoop::Run() [message_loop.cc:299]
nsThread::ThreadFunc(void *) [nsThread.cpp:425]
nss3.dll!PR_NativeRunThread [pruthr.c:397]
nss3.dll!static unsigned int pr_root(void *) [w95thred.c:137]
ucrtbase.dll!RtlGetFullPathName_Ustr

From what I can tell, when shutting down, in nsSocketTransportService::Run we first do
gSocketThread = nullptr;
and then we call psm::StopSSLServerCertVerificationThreads()
This causes us to hit the assertion when shutting down the thread, as the current thread no longer equals gSocketTread.

Not sure why this crash is only popping up now. That code hasn't changed for about 5 years.
I'll submit a quick fix to Honza for review.
Assignee: nobody → valentin.gosu
Priority: P5 → P3
Whiteboard: [necko-triaged]
Comment on attachment 8964833 [details]
Bug 1451262 - Call NS_ProcessPendingEvents after StopSSLServerCertVerificationThreads

https://reviewboard.mozilla.org/r/233584/#review239634
Attachment #8964833 - Flags: review?(honzab.moz) → review+
Pushed by valentin.gosu@gmail.com:
https://hg.mozilla.org/integration/autoland/rev/1093016046ef
Clear gSocketThread after finishing all operations so we don't trigger thread assertions r=mayhemer
https://hg.mozilla.org/mozilla-central/rev/1093016046ef
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla61
This is still happening.
Recent log:
https://treeherder.mozilla.org/logviewer.html#?job_id=172598100&repo=autoland&lineNumber=62564
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Target Milestone: mozilla61 → ---
Please also note the line before the assertion:

> 23:34:01     INFO -  [Parent 5316, Socket Thread] WARNING: cannot post event if not initialized: file z:/build/build/src/netwerk/protocol/http/nsHttpConnectionMgr.cpp, line 301
> 23:34:01     INFO -  Assertion failure: OnSocketThread() (not on socket thread), at z:/build/build/src/netwerk/base/nsSocketTransport2.cpp:278
2
(In reply to Henrik Skupin (:whimboo) from comment #9)
> Please also note the line before the assertion:
> 
> > 23:34:01     INFO -  [Parent 5316, Socket Thread] WARNING: cannot post event if not initialized: file z:/build/build/src/netwerk/protocol/http/nsHttpConnectionMgr.cpp, line 301
> > 23:34:01     INFO -  Assertion failure: OnSocketThread() (not on socket thread), at z:/build/build/src/netwerk/base/nsSocketTransport2.cpp:278
> 2

we might turn the warning to an assertion temporarily to get a stack or maybe MOZ_ASSERT in nsHttpConnectionMgr.cpp all the time at that place?
Flags: needinfo?(valentin.gosu)
Pushed by valentin.gosu@gmail.com:
https://hg.mozilla.org/integration/autoland/rev/b859a16d7450
Call NS_ProcessPendingEvents after StopSSLServerCertVerificationThreads r=mayhemer
Flags: needinfo?(valentin.gosu)
https://hg.mozilla.org/mozilla-central/rev/b859a16d7450
Status: REOPENED → RESOLVED
Closed: 6 years ago6 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla62
Please request Beta approval on this when you get a chance.
Flags: needinfo?(valentin.gosu)
Keywords: assertion
Comment on attachment 8964833 [details]
Bug 1451262 - Call NS_ProcessPendingEvents after StopSSLServerCertVerificationThreads

Approval Request Comment
[Feature/Bug causing the regression]: Unknown.
[User impact if declined]: Debug assertions may be triggered in some conditions.
[Is this code covered by automated tests?]: Yes.
[Has the fix been verified in Nightly?]: It just landed.
[Needs manual test from QE? If yes, steps to reproduce]: No. 
[List of other uplifts needed for the feature/fix]: None.
[Is the change risky?]: No.
[Why is the change risky/not risky?]: It just reorders some events in order to avoid assertions. No actual functionality should be changed.
[String changes made/needed]: None.
Flags: needinfo?(valentin.gosu)
Attachment #8964833 - Flags: approval-mozilla-beta?
Comment on attachment 8964833 [details]
Bug 1451262 - Call NS_ProcessPendingEvents after StopSSLServerCertVerificationThreads

Minor code reordering to avoid an assertion from triggering intermittently in CI. Approved for 61.0b5.
Attachment #8964833 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
You need to log in before you can comment on or make changes to this bug.