Closed Bug 846137 Opened 13 years ago Closed 13 years ago

intermittent test_secureAll.html, test_unsecureIframeRedirect.html, test_stricttransportsecurity.html, test_bug215405.xul | Assertion count 1 is greater than expected range 0-0 assertions. ASSERTION: wrong thread: 'PR_GetCurrentThread() == gSocketThread

Categories

(Core :: Networking, defect)

defect
Not set
critical

Tracking

()

RESOLVED FIXED
mozilla23
Tracking Status
firefox22 --- fixed
firefox23 --- fixed

People

(Reporter: dbaron, Assigned: jesup)

References

Details

(Keywords: intermittent-failure)

Attachments

(1 file, 1 obsolete file)

I believe this is a low-enough frequency intermittent assertion that it should be filed and starred rather than annotated as known to assert. https://tbpl.mozilla.org/php/getParsedLog.php?id=20166809&tree=Mozilla-Inbound Rev4 MacOSX Lion 10.7 mozilla-inbound debug test mochitest-5 on 2013-02-27 18:46:17 PST for push 2186eacb635c slave: talos-r4-lion-084 18:57:42 INFO - 163435 ERROR TEST-UNEXPECTED-FAIL | /tests/security/ssl/mixedcontent/test_unsecureIframeRedirect.html | Assertion count 1 is greater than expected range 0-0 assertions. 18:57:40 INFO - ^G###!!! ASSERTION: wrong thread: 'PR_GetCurrentThread() == gSocketThread', file ../../../../netwerk/base/src/NetworkActivityMonitor.cpp, line 273 18:57:40 INFO - ssl_DefSend [security/nss/lib/ssl/ssldef.c:95] 18:57:40 INFO - ssl3_SendRecord [security/nss/lib/ssl/ssl3con.c:2556] 18:57:40 INFO - SSL3_SendAlert [security/nss/lib/ssl/ssl3con.c:2847] 18:57:40 INFO - ssl_SecureClose [security/nss/lib/ssl/sslsecur.c:1061] 18:57:40 INFO - ssl_Close [security/nss/lib/ssl/sslsock.c:2050] 18:57:40 INFO - nsSSLIOLayerClose [security/manager/ssl/src/nsNSSIOLayer.cpp:721] 18:57:40 INFO - nsSocketTransport::GetSelfAddr(mozilla::net::NetAddr*) [netwerk/base/src/nsSocketTransport2.cpp:1432] 18:57:40 INFO - mozilla::net::nsHttpChannel::OnTransportStatus(nsITransport*, tag_nsresult, unsigned long long, unsigned long long) [obj-firefox/dist/include/nsCOMPtr.h:783] 18:57:40 INFO - _ZThn776_N7mozilla3net13nsHttpChannel17OnTransportStatusEP12nsITransport12tag_nsresultyy [netwerk/protocol/http/nsHttpChannel.cpp:5192] 18:57:40 INFO - nsTransportStatusEvent::Run() [netwerk/base/src/nsTransportUtils.cpp:78] 18:57:40 INFO - nsThread::ProcessNextEvent(bool, bool*) [xpcom/threads/nsThread.cpp:627] 18:57:40 INFO - NS_ProcessPendingEvents_P(nsIThread*, unsigned int) [obj-firefox/xpcom/build/nsThreadUtils.cpp:188] 18:57:40 INFO - nsBaseAppShell::NativeEventCallback() [widget/xpwidgets/nsBaseAppShell.cpp:98] 18:57:40 INFO - nsAppShell::ProcessGeckoEvents(void*) [widget/cocoa/nsAppShell.mm:388] 18:57:40 INFO - CoreFoundation + 0x12b51 18:57:40 INFO - CoreFoundation + 0x123bd 18:57:40 INFO - CoreFoundation + 0x391a9 18:57:40 INFO - CoreFoundation + 0x38ae6 18:57:40 INFO - HIToolbox + 0x23d3 18:57:40 INFO - HIToolbox + 0x963d 18:57:40 INFO - HIToolbox + 0x94ca 18:57:40 INFO - AppKit + 0x93f1 18:57:40 INFO - -AppKit + 0x8cf5 18:57:40 INFO - -[GeckoNSApplication nextEventMatchingMask:untilDate:inMode:dequeue:] [widget/cocoa/nsAppShell.mm:164] 18:57:40 INFO - -AppKit + 0x562d 18:57:40 INFO - nsAppShell::Run() [widget/cocoa/nsAppShell.mm:741] 18:57:40 INFO - nsAppStartup::Run() [toolkit/components/startup/nsAppStartup.cpp:288] 18:57:40 INFO - XREMain::XRE_mainRun() [toolkit/xre/nsAppRunner.cpp:3885] 18:57:40 INFO - XREMain::XRE_main(int, char**, nsXREAppData const*) [toolkit/xre/nsAppRunner.cpp:3952] 18:57:40 INFO - XRE_main [toolkit/xre/nsAppRunner.cpp:4155] 18:57:40 INFO - main [browser/app/nsBrowserApp.cpp:224]
https://tbpl.mozilla.org/php/getParsedLog.php?id=20173190&tree=Mozilla-Inbound Rev3 Fedora 12x64 mozilla-inbound debug test mochitest-5 on 2013-02-27 22:24:16 PST for push 49b4afb3ab2f slave: talos-r3-fed64-051 22:32:15 INFO - 163534 ERROR TEST-UNEXPECTED-FAIL | /tests/security/ssl/mixedcontent/test_secureAll.html | Assertion count 1 is greater than expected range 0-0 assertions. 22:32:13 INFO - ^G###!!! ASSERTION: wrong thread: 'PR_GetCurrentThread() == gSocketThread', file ../../../../netwerk/base/src/NetworkActivityMonitor.cpp, line 273 22:32:13 INFO - nsNetMon_Send [netwerk/base/src/NetworkActivityMonitor.cpp:89] 22:32:14 INFO - ssl_DefSend [security/nss/lib/ssl/ssldef.c:95] 22:32:14 INFO - ssl3_SendRecord [security/nss/lib/ssl/ssl3con.c:2556] 22:32:14 INFO - SSL3_SendAlert [security/nss/lib/ssl/ssl3con.c:2847] 22:32:14 INFO - ssl_SecureClose [security/nss/lib/ssl/sslsecur.c:1063] 22:32:14 INFO - ssl_Close [security/nss/lib/ssl/sslsock.c:2050] 22:32:14 INFO - nsNSSSocketInfo::CloseSocketAndDestroy(nsNSSShutDownPreventionLock const&) [security/manager/ssl/src/nsNSSIOLayer.cpp:729] 22:32:14 INFO - nsSSLIOLayerClose [security/manager/ssl/src/nsNSSIOLayer.cpp:702] 22:32:14 INFO - nsSocketTransport::ReleaseFD_Locked(PRFileDesc*) [netwerk/base/src/nsSocketTransport2.cpp:1432] 22:32:14 INFO - nsSocketTransport::GetSelfAddr(mozilla::net::NetAddr*) [netwerk/base/src/nsSocketTransport2.cpp:1957] 22:32:14 INFO - mozilla::net::nsHttpChannel::OnTransportStatus(nsITransport*, tag_nsresult, unsigned long, unsigned long) [netwerk/protocol/http/nsHttpChannel.cpp:5166] 22:32:14 INFO - nsTransportStatusEvent::Run() [netwerk/base/src/nsTransportUtils.cpp:81] 22:32:14 INFO - nsThread::ProcessNextEvent(bool, bool*) [xpcom/threads/nsThread.cpp:627] 22:32:14 INFO - NS_ProcessNextEvent_P(nsIThread*, bool) [obj-firefox/xpcom/build/nsThreadUtils.cpp:238] 22:32:14 INFO - mozilla::ipc::MessagePump::Run(base::MessagePump::Delegate*) [ipc/glue/MessagePump.cpp:83] 22:32:14 INFO - MessageLoop::RunInternal() [ipc/chromium/src/base/message_loop.cc:217] 22:32:14 INFO - MessageLoop::Run() [ipc/chromium/src/base/message_loop.cc:503] 22:32:14 INFO - nsBaseAppShell::Run() [widget/xpwidgets/nsBaseAppShell.cpp:165] 22:32:14 INFO - nsAppStartup::Run() [toolkit/components/startup/nsAppStartup.cpp:289] 22:32:14 INFO - XREMain::XRE_mainRun() [toolkit/xre/nsAppRunner.cpp:3886] 22:32:14 INFO - XREMain::XRE_main(int, char**, nsXREAppData const*) [toolkit/xre/nsAppRunner.cpp:3952] 22:32:14 INFO - XRE_main [toolkit/xre/nsAppRunner.cpp:4156] 22:32:14 INFO - do_main [browser/app/nsBrowserApp.cpp:216] 22:32:14 INFO - main [browser/app/nsBrowserApp.cpp:524]
Summary: intermittent test_unsecureIframeRedirect.html | Assertion count 1 is greater than expected range 0-0 assertions. → intermittent test_secureAll.html, test_unsecureIframeRedirect.html | Assertion count 1 is greater than expected range 0-0 assertions. ASSERTION: wrong thread: 'PR_GetCurrentThread() == gSocketThread
https://tbpl.mozilla.org/php/getParsedLog.php?id=20197164&tree=Mozilla-Inbound Rev5 MacOSX Mountain Lion 10.8 mozilla-inbound debug test mochitest-other on 2013-02-28 14:14:41 PST for push 57fc2a37629f slave: talos-mtnlion-r5-040 14:16:45 INFO - 945 ERROR TEST-UNEXPECTED-FAIL | chrome://mochitests/content/chrome/content/base/test/chrome/test_csp_bug768029.html | Assertion count 1 is greater than expected range 0-0 assertions. 14:16:29 INFO - ^G###!!! ASSERTION: wrong thread: 'PR_GetCurrentThread() == gSocketThread', file ../../../../netwerk/base/src/NetworkActivityMonitor.cpp, line 273 14:16:29 INFO - ssl_DefSend [security/nss/lib/ssl/ssldef.c:95] 14:16:29 INFO - ssl3_SendRecord [security/nss/lib/ssl/ssl3con.c:2556] 14:16:29 INFO - SSL3_SendAlert [security/nss/lib/ssl/ssl3con.c:2847] 14:16:29 INFO - ssl_SecureClose [security/nss/lib/ssl/sslsecur.c:1061] 14:16:29 INFO - ssl_Close [security/nss/lib/ssl/sslsock.c:2050] 14:16:45 INFO - nsSSLIOLayerClose [security/manager/ssl/src/nsNSSIOLayer.cpp:721] 14:16:45 INFO - nsSocketTransport::GetSelfAddr(mozilla::net::NetAddr*) [netwerk/base/src/nsSocketTransport2.cpp:1432] 14:16:45 INFO - mozilla::net::nsHttpChannel::OnTransportStatus(nsITransport*, tag_nsresult, unsigned long long, unsigned long long) [obj-firefox/dist/include/nsCOMPtr.h:783] 14:16:45 INFO - _ZThn776_N7mozilla3net13nsHttpChannel17OnTransportStatusEP12nsITransport12tag_nsresultyy [netwerk/protocol/http/nsHttpChannel.cpp:5192] 14:16:45 INFO - nsTransportStatusEvent::Run() [netwerk/base/src/nsTransportUtils.cpp:78] 14:16:45 INFO - nsThread::ProcessNextEvent(bool, bool*) [xpcom/threads/nsThread.cpp:627] 14:16:45 INFO - NS_ProcessPendingEvents_P(nsIThread*, unsigned int) [obj-firefox/xpcom/build/nsThreadUtils.cpp:188] 14:16:45 INFO - nsBaseAppShell::NativeEventCallback() [widget/xpwidgets/nsBaseAppShell.cpp:98] 14:16:45 INFO - nsAppShell::ProcessGeckoEvents(void*) [widget/cocoa/nsAppShell.mm:388] 14:16:45 INFO - CoreFoundation + 0x12841 14:16:45 INFO - CoreFoundation + 0x12165 14:16:45 INFO - CoreFoundation + 0x354e5 14:16:45 INFO - CoreFoundation + 0x34dd2 14:16:45 INFO - HIToolbox + 0x5f774 14:16:45 INFO - HIToolbox + 0x5f512 14:16:45 INFO - HIToolbox + 0x5f3a3 14:16:45 INFO - AppKit + 0x156fa3 14:16:45 INFO - -AppKit + 0x156862 14:16:45 INFO - -[GeckoNSApplication nextEventMatchingMask:untilDate:inMode:dequeue:] [widget/cocoa/nsAppShell.mm:164] 14:16:45 INFO - -AppKit + 0x14dc03 14:16:45 INFO - nsAppShell::Run() [widget/cocoa/nsAppShell.mm:741] 14:16:45 INFO - nsAppStartup::Run() [toolkit/components/startup/nsAppStartup.cpp:288] 14:16:45 INFO - XREMain::XRE_mainRun() [toolkit/xre/nsAppRunner.cpp:3885] 14:16:45 INFO - XREMain::XRE_main(int, char**, nsXREAppData const*) [toolkit/xre/nsAppRunner.cpp:3952] 14:16:45 INFO - XRE_main [toolkit/xre/nsAppRunner.cpp:4155] 14:16:45 INFO - main [browser/app/nsBrowserApp.cpp:224]
Summary: intermittent test_secureAll.html, test_unsecureIframeRedirect.html | Assertion count 1 is greater than expected range 0-0 assertions. ASSERTION: wrong thread: 'PR_GetCurrentThread() == gSocketThread → intermittent test_secureAll.html, test_unsecureIframeRedirect.html, test_stricttransportsecurity.html | Assertion count 1 is greater than expected range 0-0 assertions. ASSERTION: wrong thread: 'PR_GetCurrentThread() == gSocketThread
Summary: intermittent test_secureAll.html, test_unsecureIframeRedirect.html, test_stricttransportsecurity.html | Assertion count 1 is greater than expected range 0-0 assertions. ASSERTION: wrong thread: 'PR_GetCurrentThread() == gSocketThread → intermittent test_secureAll.html, test_unsecureIframeRedirect.html, test_stricttransportsecurity.html, test_bug215405.xul | Assertion count 1 is greater than expected range 0-0 assertions. ASSERTION: wrong thread: 'PR_GetCurrentThread() == gSocketThread
Blocks: 851962
Blocks: 854519
I landed: https://hg.mozilla.org/integration/mozilla-inbound/rev/db744bd0ec5b adding a comment reflecting that one of the more common places this bug occurs is actually annotated.
Assignee: dbaron → nobody
Assignee: nobody → michal.novotny
Some investigation: Initial read is someone is playing fast-and-loose with thread requirements for NSS sockets by having Transport Events go to the consumer on MainThread, which will seem to work "usually" until NSS has to close a socket and still has data to send. NSS *has* to run on the STS thread, and this violates that contract. The actual consumer is hidden behind a couple of layers of proxys/runnables, they need to ensure that the target is the STS thread. Or, if you prefer, make it impossible for OnTransportStatus to do any actions that require being on the STS thread (like closing a FD). Adding NS_ASSERTION(PR_GetCurrentThread() == gSocketThread, "wrong thread"); to ReleaseFD_Locked() will help ferret out anyone causing PR_Close off STS and may help find most or all of the MainThread watchers of Transport status. Right now this looks like the HTTP transport is the culprit. Bad things can happen when NSS is run off-main-thread.
Severity: normal → critical
So, is such an assertion in ReleaseFD_Locked() correct? If so, we've got problems; loading a simple page generated like 10 assertions and a bunch more off a timer later.
Per discussion with Patrick in #necko: This is a race between Socket shutdown and other things, and the Runnable has the last pseudo-reference to the socket, so when it's done with GetSelfAddr(), it calls PR_Close(): but it's not on STS thread, which is a no-no. Patrick's suggestion is to set mFD to nullptr, but dispatch a runnable to STS to actually do the PR_Close(). I'll take a quick crack at a patch
Proposed fix to thunk PR_Close() to STS thread
Attachment #735409 - Flags: review?(mcmanus)
Component: Security: PSM → Networking
Attachment #735409 - Flags: review?(mcmanus) → review+
Target Milestone: --- → mozilla23
move nested class out of function to avoid visibility warning from NS_IMETHOD
Attachment #735409 - Attachment is obsolete: true
Comment on attachment 735510 [details] [diff] [review] Don't call PR_Close() off STS thread carry forward r=mcmanus review on bustage to smaug
Attachment #735510 - Flags: review?(bugs)
Comment on attachment 735510 [details] [diff] [review] Don't call PR_Close() off STS thread >+class thunkPRClose : public nsRunnable Class names tend to start with capital letter. >+{ >+public: >+ thunkPRClose(PRFileDesc *fd) : mFD(fd) {} >+ >+ NS_IMETHOD Run() { { should apparently be on the next line even in necko. Can't comment other coding style :)
Attachment #735510 - Flags: review?(bugs) → review+
I mean, if there isn't ns* or some other prefix, class names tend to start with capital letter.
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
Assignee: michal.novotny → rjesup
Comment on attachment 735510 [details] [diff] [review] Don't call PR_Close() off STS thread [Approval Request Comment] Bug caused by (feature/regressing bug #): N/A -- though something might have made this more likely to be hit User impact if declined: Random oranges and shutdown-time crashes. Possible security impacts Testing completed (on m-c, etc.): on m-c for some time; oranges are gone. Risk to taking this patch (and alternatives if risky): low risk; few things would depend on a network socket actually closing synchronously when released on the MainThread (which is an odd edge-case to begin with) String or IDL/UUID changes made by this patch: none
Attachment #735510 - Flags: approval-mozilla-aurora?
(In reply to Randell Jesup [:jesup] from comment #35) > Comment on attachment 735510 [details] [diff] [review] > Don't call PR_Close() off STS thread > > [Approval Request Comment] > Bug caused by (feature/regressing bug #): N/A -- though something might have > made this more likely to be hit > > User impact if declined: Random oranges and shutdown-time crashes. Possible > security impacts > > Testing completed (on m-c, etc.): on m-c for some time; oranges are gone. > > Risk to taking this patch (and alternatives if risky): low risk; few things > would depend on a network socket actually closing synchronously when > released on the MainThread (which is an odd edge-case to begin with) Based on the risk here, this looks ok to be uplifted on aurora given it fixes the intermittent failure. But before than, can you please further explain on how (un)common is a user scenario as mentioned above. We have often seen enterprise users hitting edge cases like these where a regular FX user may not hit day-day . If there is any additional testing needed or not? > > String or IDL/UUID changes made by this patch: none
It's very hard for something to depend on when the actual socket closes, since the code that causes this case (release on MainThread) is supposed to happen, and in particular, this must be happening in a runnable dispatched with a "reference" (not an XPCOM reference, but equivalent) from STS thread to MainThread, and before the Main Thread can process the runnable, the STS thread manages to release it's own "reference" to the socket (perhaps only possible at shutdown). I can't really think of any way something could depend on this. It may change some timings slightly, but releasing the socket on main-thread could cause random crashes and other undefined behavior, as NSS doesn't want to be called off-STS-thread. This makes me concerned from a security aspect, though I have no proof anything can come of it. If you want a second opinion, I'd suggest Patrick McManus. The sheriffs want it to reduce random oranges.
Comment on attachment 735510 [details] [diff] [review] Don't call PR_Close() off STS thread Thanks for the explanation Randell.Based on your comment's this looks good to be uplifted on aurora.
Attachment #735510 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: