Closed
Bug 846137
Opened 13 years ago
Closed 13 years ago
intermittent test_secureAll.html, test_unsecureIframeRedirect.html, test_stricttransportsecurity.html, test_bug215405.xul | Assertion count 1 is greater than expected range 0-0 assertions. ASSERTION: wrong thread: 'PR_GetCurrentThread() == gSocketThread
Categories
(Core :: Networking, defect)
Core
Networking
Tracking
()
RESOLVED
FIXED
mozilla23
People
(Reporter: dbaron, Assigned: jesup)
References
Details
(Keywords: intermittent-failure)
Attachments
(1 file, 1 obsolete file)
1.95 KB,
patch
|
smaug
:
review+
bajaj
:
approval-mozilla-aurora+
|
Details | Diff | Splinter Review |
I believe this is a low-enough frequency intermittent assertion that it should be filed and starred rather than annotated as known to assert.
https://tbpl.mozilla.org/php/getParsedLog.php?id=20166809&tree=Mozilla-Inbound
Rev4 MacOSX Lion 10.7 mozilla-inbound debug test mochitest-5 on 2013-02-27 18:46:17 PST for push 2186eacb635c
slave: talos-r4-lion-084
18:57:42 INFO - 163435 ERROR TEST-UNEXPECTED-FAIL | /tests/security/ssl/mixedcontent/test_unsecureIframeRedirect.html | Assertion count 1 is greater than expected range 0-0 assertions.
18:57:40 INFO - ^G###!!! ASSERTION: wrong thread: 'PR_GetCurrentThread() == gSocketThread', file ../../../../netwerk/base/src/NetworkActivityMonitor.cpp, line 273
18:57:40 INFO - ssl_DefSend [security/nss/lib/ssl/ssldef.c:95]
18:57:40 INFO - ssl3_SendRecord [security/nss/lib/ssl/ssl3con.c:2556]
18:57:40 INFO - SSL3_SendAlert [security/nss/lib/ssl/ssl3con.c:2847]
18:57:40 INFO - ssl_SecureClose [security/nss/lib/ssl/sslsecur.c:1061]
18:57:40 INFO - ssl_Close [security/nss/lib/ssl/sslsock.c:2050]
18:57:40 INFO - nsSSLIOLayerClose [security/manager/ssl/src/nsNSSIOLayer.cpp:721]
18:57:40 INFO - nsSocketTransport::GetSelfAddr(mozilla::net::NetAddr*) [netwerk/base/src/nsSocketTransport2.cpp:1432]
18:57:40 INFO - mozilla::net::nsHttpChannel::OnTransportStatus(nsITransport*, tag_nsresult, unsigned long long, unsigned long long) [obj-firefox/dist/include/nsCOMPtr.h:783]
18:57:40 INFO - _ZThn776_N7mozilla3net13nsHttpChannel17OnTransportStatusEP12nsITransport12tag_nsresultyy [netwerk/protocol/http/nsHttpChannel.cpp:5192]
18:57:40 INFO - nsTransportStatusEvent::Run() [netwerk/base/src/nsTransportUtils.cpp:78]
18:57:40 INFO - nsThread::ProcessNextEvent(bool, bool*) [xpcom/threads/nsThread.cpp:627]
18:57:40 INFO - NS_ProcessPendingEvents_P(nsIThread*, unsigned int) [obj-firefox/xpcom/build/nsThreadUtils.cpp:188]
18:57:40 INFO - nsBaseAppShell::NativeEventCallback() [widget/xpwidgets/nsBaseAppShell.cpp:98]
18:57:40 INFO - nsAppShell::ProcessGeckoEvents(void*) [widget/cocoa/nsAppShell.mm:388]
18:57:40 INFO - CoreFoundation + 0x12b51
18:57:40 INFO - CoreFoundation + 0x123bd
18:57:40 INFO - CoreFoundation + 0x391a9
18:57:40 INFO - CoreFoundation + 0x38ae6
18:57:40 INFO - HIToolbox + 0x23d3
18:57:40 INFO - HIToolbox + 0x963d
18:57:40 INFO - HIToolbox + 0x94ca
18:57:40 INFO - AppKit + 0x93f1
18:57:40 INFO - -AppKit + 0x8cf5
18:57:40 INFO - -[GeckoNSApplication nextEventMatchingMask:untilDate:inMode:dequeue:] [widget/cocoa/nsAppShell.mm:164]
18:57:40 INFO - -AppKit + 0x562d
18:57:40 INFO - nsAppShell::Run() [widget/cocoa/nsAppShell.mm:741]
18:57:40 INFO - nsAppStartup::Run() [toolkit/components/startup/nsAppStartup.cpp:288]
18:57:40 INFO - XREMain::XRE_mainRun() [toolkit/xre/nsAppRunner.cpp:3885]
18:57:40 INFO - XREMain::XRE_main(int, char**, nsXREAppData const*) [toolkit/xre/nsAppRunner.cpp:3952]
18:57:40 INFO - XRE_main [toolkit/xre/nsAppRunner.cpp:4155]
18:57:40 INFO - main [browser/app/nsBrowserApp.cpp:224]
Reporter | ||
Comment 1•13 years ago
|
||
https://tbpl.mozilla.org/php/getParsedLog.php?id=20173190&tree=Mozilla-Inbound
Rev3 Fedora 12x64 mozilla-inbound debug test mochitest-5 on 2013-02-27 22:24:16 PST for push 49b4afb3ab2f
slave: talos-r3-fed64-051
22:32:15 INFO - 163534 ERROR TEST-UNEXPECTED-FAIL | /tests/security/ssl/mixedcontent/test_secureAll.html | Assertion count 1 is greater than expected range 0-0 assertions.
22:32:13 INFO - ^G###!!! ASSERTION: wrong thread: 'PR_GetCurrentThread() == gSocketThread', file ../../../../netwerk/base/src/NetworkActivityMonitor.cpp, line 273
22:32:13 INFO - nsNetMon_Send [netwerk/base/src/NetworkActivityMonitor.cpp:89]
22:32:14 INFO - ssl_DefSend [security/nss/lib/ssl/ssldef.c:95]
22:32:14 INFO - ssl3_SendRecord [security/nss/lib/ssl/ssl3con.c:2556]
22:32:14 INFO - SSL3_SendAlert [security/nss/lib/ssl/ssl3con.c:2847]
22:32:14 INFO - ssl_SecureClose [security/nss/lib/ssl/sslsecur.c:1063]
22:32:14 INFO - ssl_Close [security/nss/lib/ssl/sslsock.c:2050]
22:32:14 INFO - nsNSSSocketInfo::CloseSocketAndDestroy(nsNSSShutDownPreventionLock const&) [security/manager/ssl/src/nsNSSIOLayer.cpp:729]
22:32:14 INFO - nsSSLIOLayerClose [security/manager/ssl/src/nsNSSIOLayer.cpp:702]
22:32:14 INFO - nsSocketTransport::ReleaseFD_Locked(PRFileDesc*) [netwerk/base/src/nsSocketTransport2.cpp:1432]
22:32:14 INFO - nsSocketTransport::GetSelfAddr(mozilla::net::NetAddr*) [netwerk/base/src/nsSocketTransport2.cpp:1957]
22:32:14 INFO - mozilla::net::nsHttpChannel::OnTransportStatus(nsITransport*, tag_nsresult, unsigned long, unsigned long) [netwerk/protocol/http/nsHttpChannel.cpp:5166]
22:32:14 INFO - nsTransportStatusEvent::Run() [netwerk/base/src/nsTransportUtils.cpp:81]
22:32:14 INFO - nsThread::ProcessNextEvent(bool, bool*) [xpcom/threads/nsThread.cpp:627]
22:32:14 INFO - NS_ProcessNextEvent_P(nsIThread*, bool) [obj-firefox/xpcom/build/nsThreadUtils.cpp:238]
22:32:14 INFO - mozilla::ipc::MessagePump::Run(base::MessagePump::Delegate*) [ipc/glue/MessagePump.cpp:83]
22:32:14 INFO - MessageLoop::RunInternal() [ipc/chromium/src/base/message_loop.cc:217]
22:32:14 INFO - MessageLoop::Run() [ipc/chromium/src/base/message_loop.cc:503]
22:32:14 INFO - nsBaseAppShell::Run() [widget/xpwidgets/nsBaseAppShell.cpp:165]
22:32:14 INFO - nsAppStartup::Run() [toolkit/components/startup/nsAppStartup.cpp:289]
22:32:14 INFO - XREMain::XRE_mainRun() [toolkit/xre/nsAppRunner.cpp:3886]
22:32:14 INFO - XREMain::XRE_main(int, char**, nsXREAppData const*) [toolkit/xre/nsAppRunner.cpp:3952]
22:32:14 INFO - XRE_main [toolkit/xre/nsAppRunner.cpp:4156]
22:32:14 INFO - do_main [browser/app/nsBrowserApp.cpp:216]
22:32:14 INFO - main [browser/app/nsBrowserApp.cpp:524]
Summary: intermittent test_unsecureIframeRedirect.html | Assertion count 1 is greater than expected range 0-0 assertions. → intermittent test_secureAll.html, test_unsecureIframeRedirect.html | Assertion count 1 is greater than expected range 0-0 assertions. ASSERTION: wrong thread: 'PR_GetCurrentThread() == gSocketThread
Comment hidden (Legacy TBPL/Treeherder Robot) |
Reporter | ||
Comment 3•13 years ago
|
||
https://tbpl.mozilla.org/php/getParsedLog.php?id=20197164&tree=Mozilla-Inbound
Rev5 MacOSX Mountain Lion 10.8 mozilla-inbound debug test mochitest-other on 2013-02-28 14:14:41 PST for push 57fc2a37629f
slave: talos-mtnlion-r5-040
14:16:45 INFO - 945 ERROR TEST-UNEXPECTED-FAIL | chrome://mochitests/content/chrome/content/base/test/chrome/test_csp_bug768029.html | Assertion count 1 is greater than expected range 0-0 assertions.
14:16:29 INFO - ^G###!!! ASSERTION: wrong thread: 'PR_GetCurrentThread() == gSocketThread', file ../../../../netwerk/base/src/NetworkActivityMonitor.cpp, line 273
14:16:29 INFO - ssl_DefSend [security/nss/lib/ssl/ssldef.c:95]
14:16:29 INFO - ssl3_SendRecord [security/nss/lib/ssl/ssl3con.c:2556]
14:16:29 INFO - SSL3_SendAlert [security/nss/lib/ssl/ssl3con.c:2847]
14:16:29 INFO - ssl_SecureClose [security/nss/lib/ssl/sslsecur.c:1061]
14:16:29 INFO - ssl_Close [security/nss/lib/ssl/sslsock.c:2050]
14:16:45 INFO - nsSSLIOLayerClose [security/manager/ssl/src/nsNSSIOLayer.cpp:721]
14:16:45 INFO - nsSocketTransport::GetSelfAddr(mozilla::net::NetAddr*) [netwerk/base/src/nsSocketTransport2.cpp:1432]
14:16:45 INFO - mozilla::net::nsHttpChannel::OnTransportStatus(nsITransport*, tag_nsresult, unsigned long long, unsigned long long) [obj-firefox/dist/include/nsCOMPtr.h:783]
14:16:45 INFO - _ZThn776_N7mozilla3net13nsHttpChannel17OnTransportStatusEP12nsITransport12tag_nsresultyy [netwerk/protocol/http/nsHttpChannel.cpp:5192]
14:16:45 INFO - nsTransportStatusEvent::Run() [netwerk/base/src/nsTransportUtils.cpp:78]
14:16:45 INFO - nsThread::ProcessNextEvent(bool, bool*) [xpcom/threads/nsThread.cpp:627]
14:16:45 INFO - NS_ProcessPendingEvents_P(nsIThread*, unsigned int) [obj-firefox/xpcom/build/nsThreadUtils.cpp:188]
14:16:45 INFO - nsBaseAppShell::NativeEventCallback() [widget/xpwidgets/nsBaseAppShell.cpp:98]
14:16:45 INFO - nsAppShell::ProcessGeckoEvents(void*) [widget/cocoa/nsAppShell.mm:388]
14:16:45 INFO - CoreFoundation + 0x12841
14:16:45 INFO - CoreFoundation + 0x12165
14:16:45 INFO - CoreFoundation + 0x354e5
14:16:45 INFO - CoreFoundation + 0x34dd2
14:16:45 INFO - HIToolbox + 0x5f774
14:16:45 INFO - HIToolbox + 0x5f512
14:16:45 INFO - HIToolbox + 0x5f3a3
14:16:45 INFO - AppKit + 0x156fa3
14:16:45 INFO - -AppKit + 0x156862
14:16:45 INFO - -[GeckoNSApplication nextEventMatchingMask:untilDate:inMode:dequeue:] [widget/cocoa/nsAppShell.mm:164]
14:16:45 INFO - -AppKit + 0x14dc03
14:16:45 INFO - nsAppShell::Run() [widget/cocoa/nsAppShell.mm:741]
14:16:45 INFO - nsAppStartup::Run() [toolkit/components/startup/nsAppStartup.cpp:288]
14:16:45 INFO - XREMain::XRE_mainRun() [toolkit/xre/nsAppRunner.cpp:3885]
14:16:45 INFO - XREMain::XRE_main(int, char**, nsXREAppData const*) [toolkit/xre/nsAppRunner.cpp:3952]
14:16:45 INFO - XRE_main [toolkit/xre/nsAppRunner.cpp:4155]
14:16:45 INFO - main [browser/app/nsBrowserApp.cpp:224]
Comment 4•13 years ago
|
||
Summary: intermittent test_secureAll.html, test_unsecureIframeRedirect.html | Assertion count 1 is greater than expected range 0-0 assertions. ASSERTION: wrong thread: 'PR_GetCurrentThread() == gSocketThread → intermittent test_secureAll.html, test_unsecureIframeRedirect.html, test_stricttransportsecurity.html | Assertion count 1 is greater than expected range 0-0 assertions. ASSERTION: wrong thread: 'PR_GetCurrentThread() == gSocketThread
Comment 5•13 years ago
|
||
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment 7•13 years ago
|
||
Summary: intermittent test_secureAll.html, test_unsecureIframeRedirect.html, test_stricttransportsecurity.html | Assertion count 1 is greater than expected range 0-0 assertions. ASSERTION: wrong thread: 'PR_GetCurrentThread() == gSocketThread → intermittent test_secureAll.html, test_unsecureIframeRedirect.html, test_stricttransportsecurity.html, test_bug215405.xul | Assertion count 1 is greater than expected range 0-0 assertions. ASSERTION: wrong thread: 'PR_GetCurrentThread() == gSocketThread
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Reporter | ||
Comment 14•13 years ago
|
||
I landed:
https://hg.mozilla.org/integration/mozilla-inbound/rev/db744bd0ec5b
adding a comment reflecting that one of the more common places this bug occurs is actually annotated.
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment 16•13 years ago
|
||
Assignee: nobody → dbaron
Reporter | ||
Updated•13 years ago
|
Assignee: dbaron → nobody
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Updated•13 years ago
|
Assignee: nobody → michal.novotny
Assignee | ||
Comment 21•13 years ago
|
||
Some investigation:
Initial read is someone is playing fast-and-loose with thread requirements for NSS sockets by having Transport Events go to the consumer on MainThread, which will seem to work "usually" until NSS has to close a socket and still has data to send. NSS *has* to run on the STS thread, and this violates that contract.
The actual consumer is hidden behind a couple of layers of proxys/runnables, they need to ensure that the target is the STS thread. Or, if you prefer, make it impossible for OnTransportStatus to do any actions that require being on the STS thread (like closing a FD).
Adding
NS_ASSERTION(PR_GetCurrentThread() == gSocketThread, "wrong thread");
to ReleaseFD_Locked() will help ferret out anyone causing PR_Close off STS and may help find most or all of the MainThread watchers of Transport status.
Right now this looks like the HTTP transport is the culprit. Bad things can happen when NSS is run off-main-thread.
Severity: normal → critical
Assignee | ||
Comment 22•13 years ago
|
||
So, is such an assertion in ReleaseFD_Locked() correct? If so, we've got problems; loading a simple page generated like 10 assertions and a bunch more off a timer later.
Assignee | ||
Comment 23•13 years ago
|
||
Per discussion with Patrick in #necko:
This is a race between Socket shutdown and other things, and the Runnable has the last pseudo-reference to the socket, so when it's done with GetSelfAddr(), it calls PR_Close(): but it's not on STS thread, which is a no-no. Patrick's suggestion is to set mFD to nullptr, but dispatch a runnable to STS to actually do the PR_Close().
I'll take a quick crack at a patch
Assignee | ||
Comment 24•13 years ago
|
||
Proposed fix to thunk PR_Close() to STS thread
Assignee | ||
Updated•13 years ago
|
Attachment #735409 -
Flags: review?(mcmanus)
Updated•13 years ago
|
Component: Security: PSM → Networking
Updated•13 years ago
|
Attachment #735409 -
Flags: review?(mcmanus) → review+
Assignee | ||
Comment 25•13 years ago
|
||
https://hg.mozilla.org/integration/mozilla-inbound/rev/e2d34d74ca35
I think the affects all targets...
Target Milestone: --- → mozilla23
Bounced due to build bustage.
https://hg.mozilla.org/integration/mozilla-inbound/rev/b467e7ef9d56
Assignee | ||
Comment 27•13 years ago
|
||
move nested class out of function to avoid visibility warning from NS_IMETHOD
Assignee | ||
Updated•13 years ago
|
Attachment #735409 -
Attachment is obsolete: true
Assignee | ||
Comment 28•13 years ago
|
||
Comment on attachment 735510 [details] [diff] [review]
Don't call PR_Close() off STS thread
carry forward r=mcmanus
review on bustage to smaug
Attachment #735510 -
Flags: review?(bugs)
Comment 29•13 years ago
|
||
Comment on attachment 735510 [details] [diff] [review]
Don't call PR_Close() off STS thread
>+class thunkPRClose : public nsRunnable
Class names tend to start with capital letter.
>+{
>+public:
>+ thunkPRClose(PRFileDesc *fd) : mFD(fd) {}
>+
>+ NS_IMETHOD Run() {
{ should apparently be on the next line even in necko. Can't comment other coding style :)
Attachment #735510 -
Flags: review?(bugs) → review+
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment 31•13 years ago
|
||
I mean, if there isn't ns* or some other prefix, class names tend to start with capital letter.
Assignee | ||
Comment 32•13 years ago
|
||
![]() |
||
Comment 33•13 years ago
|
||
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
Updated•13 years ago
|
Assignee: michal.novotny → rjesup
Comment 34•13 years ago
|
||
https://tbpl.mozilla.org/php/getParsedLog.php?id=21698848&full=1&branch=mozilla-aurora#error0
Can we uplift this to Aurora please?
Assignee | ||
Comment 35•13 years ago
|
||
Comment on attachment 735510 [details] [diff] [review]
Don't call PR_Close() off STS thread
[Approval Request Comment]
Bug caused by (feature/regressing bug #): N/A -- though something might have made this more likely to be hit
User impact if declined: Random oranges and shutdown-time crashes. Possible security impacts
Testing completed (on m-c, etc.): on m-c for some time; oranges are gone.
Risk to taking this patch (and alternatives if risky): low risk; few things would depend on a network socket actually closing synchronously when released on the MainThread (which is an odd edge-case to begin with)
String or IDL/UUID changes made by this patch: none
Attachment #735510 -
Flags: approval-mozilla-aurora?
Comment 36•13 years ago
|
||
(In reply to Randell Jesup [:jesup] from comment #35)
> Comment on attachment 735510 [details] [diff] [review]
> Don't call PR_Close() off STS thread
>
> [Approval Request Comment]
> Bug caused by (feature/regressing bug #): N/A -- though something might have
> made this more likely to be hit
>
> User impact if declined: Random oranges and shutdown-time crashes. Possible
> security impacts
>
> Testing completed (on m-c, etc.): on m-c for some time; oranges are gone.
>
> Risk to taking this patch (and alternatives if risky): low risk; few things
> would depend on a network socket actually closing synchronously when
> released on the MainThread (which is an odd edge-case to begin with)
Based on the risk here, this looks ok to be uplifted on aurora given it fixes the intermittent failure. But before than, can you please further explain on how (un)common is a user scenario as mentioned above. We have often seen enterprise users hitting edge cases like these where a regular FX user may not hit day-day . If there is any additional testing needed or not?
>
> String or IDL/UUID changes made by this patch: none
Assignee | ||
Comment 37•13 years ago
|
||
It's very hard for something to depend on when the actual socket closes, since the code that causes this case (release on MainThread) is supposed to happen, and in particular, this must be happening in a runnable dispatched with a "reference" (not an XPCOM reference, but equivalent) from STS thread to MainThread, and before the Main Thread can process the runnable, the STS thread manages to release it's own "reference" to the socket (perhaps only possible at shutdown).
I can't really think of any way something could depend on this. It may change some timings slightly, but releasing the socket on main-thread could cause random crashes and other undefined behavior, as NSS doesn't want to be called off-STS-thread. This makes me concerned from a security aspect, though I have no proof anything can come of it.
If you want a second opinion, I'd suggest Patrick McManus.
The sheriffs want it to reduce random oranges.
Comment 38•13 years ago
|
||
Comment on attachment 735510 [details] [diff] [review]
Don't call PR_Close() off STS thread
Thanks for the explanation Randell.Based on your comment's this looks good to be uplifted on aurora.
Attachment #735510 -
Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Comment 39•13 years ago
|
||
status-firefox22:
--- → fixed
status-firefox23:
--- → fixed
Reporter | ||
Comment 40•12 years ago
|
||
More test cleanup: https://hg.mozilla.org/integration/mozilla-inbound/rev/d986ae112cb6
Comment 41•12 years ago
|
||
You need to log in
before you can comment on or make changes to this bug.
Description
•