Closed Bug 255141 Opened 20 years ago Closed 1 year ago

PSM attempts to access socket notificationCallbacks on main thread [was: ###!!! ASSERTION: wrong thread: 'PR_GetCurrentThread() == gSocketThread', file r:/mozilla/netwerk/protocol/http/src/nsHttpConnection.cpp, line 784]

Categories

(Core :: Networking, defect, P3)

Other Branch
x86
Windows XP
defect

Tracking

()

RESOLVED WORKSFORME

People

(Reporter: timeless, Unassigned)

References

()

Details

(Whiteboard: [kerh-cuz][psm-logic][necko-backlog])

>	xpcom.dll!nsDebugImpl::Assertion(const char * aStr=0x011ba2dc, const char *
aExpr=0x011ba3f4, const char * aFile=0x011d1ae0, int aLine=0x00000310)  Line 205	C++
 	xpcom.dll!nsDebug::Assertion(const char * aStr=0x011ba2dc, const char *
aExpr=0x011ba3f4, const char * aFile=0x011d1ae0, int aLine=0x00000310)  Line 109	C++
 	necko.dll!nsHttpConnection::GetInterface(const nsID & iid={...}, void * *
result=0x0140f868)  Line 784 + 0x28	C++
 	xpcom.dll!XPTC_InvokeByIndex(nsISupports * that=0x02ba9fa4, unsigned int
methodIndex=0x00000003, unsigned int paramCount=0x00000002, nsXPTCVariant *
params=0x02c557c0)  Line 102	C++
 	xpcom.dll!nsProxyObject::Post(unsigned int methodIndex=0x00000003,
nsXPTMethodInfo * methodInfo=0x00f00720, nsXPTCMiniVariant * params=0x0212fbc0,
nsIInterfaceInfo * interfaceInfo=0x02b71c50)  Line 475 + 0x14	C++
 	xpcom.dll!nsProxyEventObject::CallMethod(unsigned short methodIndex=0x0003,
const nsXPTMethodInfo * info=0x00f00720, nsXPTCMiniVariant * params=0x0012fbc0)
 Line 550	C++
 	xpcom.dll!PrepareAndDispatch(nsXPTCStubBase * self=0x02bcf120, unsigned int
methodIndex=0x00000003, unsigned int * args=0x0012fc84, unsigned int *
stackBytesToPop=0x0012fc74)  Line 117 + 0x1a	C++
 	xpcom.dll!SharedStub()  Line 147	C++
 	pipnss.dll!nsNSSSocketInfo::GetInterface(const nsID & uuid={...}, void * *
result=0x0140f868)  Line 317 + 0x11	C++
 	xpcom.dll!XPTC_InvokeByIndex(nsISupports * that=0x02bcdf10, unsigned int
methodIndex=0x00000003, unsigned int paramCount=0x00000002, nsXPTCVariant *
params=0x02c4a398)  Line 102	C++
 	xpcom.dll!EventHandler(PLEvent * self=0x02c4a3e8)  Line 561 + 0x16	C++
 	xpcom.dll!PL_HandleEvent(PLEvent * self=0x02c4a3e8)  Line 693	C
 	xpcom.dll!PL_ProcessPendingEvents(PLEventQueue * self=0x00f5c328)  Line 628	C
 	xpcom.dll!_md_TimerProc(HWND__ * hwnd=0x017e039c, unsigned int
uMsg=0x00000113, unsigned int idEvent=0x00000000, unsigned long
dwTime=0xf57ab4ab)  Line 998 + 0x6	C
 	user32.dll!77d43a50() 	
 	user32.dll!GetSysColor()  + 0x10f	
 	user32.dll!TranslateMessage()  + 0x8d	
 	user32.dll!DispatchMessageW()  + 0xb	
 	appshell.dll!nsAppShellService::Run()  Line 489	C++
 	mozilla.exe!main1(int argc=0x00000004, char * * argv=0x02bcf170, nsISupports *
nativeApp=0x00000001)  Line 1322	C++
 	mozilla.exe!main(int argc=0x00000004, char * * argv=0x003f7c50)  Line 1812 +
0x16	C++
 	mozilla.exe!mainCRTStartup()  Line 400 + 0x11	C
 	kernel32.dll!GetCurrentDirectoryW()  + 0x44
hmm... how can i reproduce this stack?  looks to me like PSM, on the main
thread, is accessing the nsIInterfaceRequestor passed to it on the socket
transport thread.  that seems like something we should not do.  hmm...

-> PSM
Status: NEW → ASSIGNED
Component: Networking: HTTP → Client Library
Product: Browser → PSM
Summary: ###!!! ASSERTION: wrong thread: 'PR_GetCurrentThread() == gSocketThread', file r:/mozilla/netwerk/protocol/http/src/nsHttpConnection.cpp, line 784 → PSM attempts to access socket notificationCallbacks on main thread [was: ###!!! ASSERTION: wrong thread: 'PR_GetCurrentThread() == gSocketThread', file r:/mozilla/netwerk/protocol/http/src/nsHttpConnection.cpp, line 784]
Version: Trunk → unspecified
QA Contact: core.networking.http → kaie
Product: PSM → Core
Whiteboard: [kerh-cuz]
Assignee: darin → kengert
Status: ASSIGNED → NEW
QA Contact: kengert
QA Contact: ui
(In reply to comment #1)
> looks to me like PSM, on the main
> thread, is accessing the nsIInterfaceRequestor passed to it on the socket
> transport thread.  that seems like something we should not do.

Not sure if this bug is still valid.
In my recent attempts to fix bug 420187 I saw the code that is mentioned here.

The code used as of today, nsNSSSocketInfo::SetNotificationCallbacks, will proxy all accesses to nsIInterfaceRequestor to the main thread.

Is that right or wrong?
Putting all that stuff on the main thread sounds right to me.
Assignee: kaie → nobody
Whiteboard: [kerh-cuz] → [kerh-cuz][psm-logic]
I think there is still a problem here. As of today the code looks like:

..

// not called on the socket transport thread
NS_IMETHODIMP
nsHttpConnection::GetInterface(const nsIID &iid, void **result)
{
    // NOTE: This function is only called on the UI thread via sync proxy from
    //       the socket transport thread.  If that weren't the case, then we'd
    //       have to worry about the possibility of mTransaction going away
    //       part-way through this function call.  See CloseTransaction.
    NS_ASSERTION(PR_GetCurrentThread() != gSocketThread, "wrong thread");
[..]
        mTransaction->GetSecurityCallbacks(getter_AddRefs(callbacks));

...

That comment indicates that this code should be executed on the UI thread but the network thread should be blocked waiting for the sync proxy callout to the UI thread to return. If the network thread is running concurrently bad things(tm) can happen.

That makes sense.

But I started looking at it because it can call nsHttpPipeline::GetSecurityCallbacks() and that asserts if it is not running on the network thread (mTransaction can be a pipeline object). Once I understood what was going on, I was just going to change the assert in pipeline::GSC() to be a comment to reflect that the network thread was paused..

Unfortunately, I didn't find the network thread to be blocked..

nsHttpConnection::GetInterface() was indeed being executed on the UI thread... but the corresponding proxy call was being made from security/manager/ssl/src/nsNSSIOLayer.cpp:384 on the  nsPSMBackgroundThread thread.

0  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:162
#1  0x00007ffff3c6ad2a in PR_WaitCondVar (cvar=0x7fffce8b0b80, timeout=4294967295) at /home/mcmanus/src/mozilla2/wd/pipemq/nsprpub/pr/src/pthreads/ptsynch.c:417
#2  0x00007ffff3c6b4bc in PR_Wait (mon=0x7fffcf199c80, timeout=4294967295) at /home/mcmanus/src/mozilla2/wd/pipemq/nsprpub/pr/src/pthreads/ptsynch.c:614
#3  0x00007ffff4e916c4 in nsAutoMonitor::Wait (this=0x7fffe03f7ea0, interval=4294967295) at ../../../dist/include/nsAutoLock.h:346
#4  0x00007ffff64f2ad9 in nsEventQueue::GetEvent (this=0x7fffd052c6b0, mayWait=1, result=0x7fffe03f7f60) at /home/mcmanus/src/mozilla2/wd/pipemq/xpcom/threads/nsEventQueue.cpp:85
#5  0x00007ffff64f5a18 in nsThread::nsChainedEventQueue::GetEvent (this=0x7fffd052c6a0, mayWait=1, event=0x7fffe03f7f60)
    at /home/mcmanus/src/mozilla2/wd/pipemq/xpcom/threads/nsThread.h:112
#6  0x00007ffff64f509b in nsThread::ProcessNextEvent (this=0x7fffce443430, mayWait=1, result=0x7fffe03f7fcc)
    at /home/mcmanus/src/mozilla2/wd/pipemq/xpcom/threads/nsThread.cpp:530
#7  0x00007ffff647a0b8 in NS_ProcessNextEvent_P (thread=0x7fffce443430, mayWait=1) at nsThreadUtils.cpp:250
#8  0x00007ffff6502267 in nsProxyEventObject::CallMethod (this=0x7fffce8b0b00, methodIndex=3, methodInfo=0x7fffeb2c21c0, params=0x7fffe03f80d0)
    at /home/mcmanus/src/mozilla2/wd/pipemq/xpcom/proxy/src/nsProxyEventObject.cpp:260
#9  0x00007ffff65147ff in PrepareAndDispatch (self=0x7fffce439c00, methodIndex=3, args=0x7fffe03f8260, gpregs=0x7fffe03f81e0, fpregs=0x7fffe03f8210)
    at /home/mcmanus/src/mozilla2/wd/pipemq/xpcom/reflect/xptcall/src/md/unix/xptcstubs_x86_64_linux.cpp:153
#10 0x00007ffff6514893 in SharedStub () at /home/mcmanus/src/mozilla2/wd/pipemq/xpcom/reflect/xptcall/src/md/unix/xptcstubs_x86_64_linux.cpp:159
#11 0x00007ffff64729d2 in nsGetInterface::operator() (this=0x7fffe03f8350, aIID=..., aInstancePtr=0x7fffe03f82f8) at nsIInterfaceRequestorUtils.cpp:52
#12 0x00007ffff507cb03 in nsCOMPtr<nsIDocShellTreeItem>::assign_from_helper (this=0x7fffe03f8420, helper=..., aIID=...) at ../../dist/include/nsCOMPtr.h:1272
#13 0x00007ffff5079c37 in nsCOMPtr (this=0x7fffe03f8420, helper=...) at ../../dist/include/nsCOMPtr.h:644
#14 0x00007ffff5f144cb in nsNSSSocketInfo::EnsureDocShellDependentStuffKnown (this=0x7fffcf143300)
    at /home/mcmanus/src/mozilla2/wd/pipemq/security/manager/ssl/src/nsNSSIOLayer.cpp:384
#15 0x00007ffff5f15b43 in nsNSSSocketInfo::GetPreviousCert (this=0x7fffcf143300, _result=0x7fffe03f8790)
    at /home/mcmanus/src/mozilla2/wd/pipemq/security/manager/ssl/src/nsNSSIOLayer.cpp:844
#16 0x00007ffff5f00e38 in HandshakeCallback (fd=0x7fffd0044190, client_data=0x7fffcf143300)
    at /home/mcmanus/src/mozilla2/wd/pipemq/security/manager/ssl/src/nsNSSCallbacks.cpp:946
#17 0x00007fffefbb49e0 in ssl3_HandleFinished (ss=0x7fffcf184000, 
    b=0x7fffcf1a6004 "j,\273\017me\372;\365E<\a\263\023\305\331Ug\353\067\023\203c\335\346\020\375f\vy\224\275\003\003\003\003A0\r\006\t*\206H\206\367\r\001\001\005\005", 
    length=12, hashes=0x7fffe03f8930) at ssl3con.c:8503
#18 0x00007fffefbb5097 in ssl3_HandleHandshakeMessage (ss=0x7fffcf184000, 
    b=0x7fffcf1a6004 "j,\273\017me\372;\365E<\a\263\023\305\331Ug\353\067\023\203c\335\346\020\375f\vy\224\275\003\003\003\003A0\r\006\t*\206H\206\367\r\001\001\005\005", 
    length=12) at ssl3con.c:8659
#19 0x00007fffefbb52e5 in ssl3_HandleHandshake (ss=0x7fffcf184000, origBuf=0x7fffcf184368) at ssl3con.c:8727
#20 0x00007fffefbb5f95 in ssl3_HandleRecord (ss=0x7fffcf184000, cText=0x7fffe03f8af0, databuf=0x7fffcf184368) at ssl3con.c:9066
#21 0x00007fffefbb7148 in ssl3_GatherCompleteHandshake (ss=0x7fffcf184000, flags=0) at ssl3gthr.c:209
#22 0x00007fffefbb9f19 in ssl_GatherRecord1stHandshake (ss=0x7fffcf184000) at sslcon.c:1258
#23 0x00007fffefbc63db in ssl_Do1stHandshake (ss=0x7fffcf184000) at sslsecur.c:151
#24 0x00007fffefbc8877 in ssl_SecureSend (ss=0x7fffcf184000, 
    buf=0x7fffcf11cc00 "GET / HTTP/1.1\r\nHost: www.paypal.com\r\nUser-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:2.0b7pre) Gecko/20100929 Firefox/4.0b7pre\r\nAccept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8\r"..., len=881, flags=0) at sslsecur.c:1213
#25 0x00007fffefbc8a34 in ssl_SecureWrite (ss=0x7fffcf184000, 
    buf=0x7fffcf11cc00 "GET / HTTP/1.1\r\nHost: www.paypal.com\r\nUser-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:2.0b7pre) Gecko/20100929 Firefox/4.0b7pre\r\nAccept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8\r"..., len=881) at sslsecur.c:1258
#26 0x00007fffefbd0e29 in ssl_Write (fd=0x7fffd0044190, buf=0x7fffcf11cc00, len=881) at sslsock.c:1652
#27 0x00007ffff5efa64b in nsSSLThread::Run (this=0x7fffd9bdbdc0) at /home/mcmanus/src/mozilla2/wd/pipemq/security/manager/ssl/src/nsSSLThread.cpp:1045
#28 0x00007ffff5ef84d7 in nsPSMBackgroundThread::nsThreadRunner (arg=0x7fffd9bdbdc0) at /home/mcmanus/src/mozilla2/wd/pipemq/security/manager/ssl/src/nsPSMBackgroundThread.cpp:44
#29 0x00007ffff3c72517 in _pt_root (arg=0x7fffda6e99b0) at /home/mcmanus/src/mozilla2/wd/pipemq/nsprpub/pr/src/pthreads/ptthread.c:228
#30 0x00007ffff7bc59ca in start_thread (arg=<value optimized out>) at pthread_create.c:300
#31 0x00007ffff04006fd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
#32 0x0000000000000000 in ?? ()

The network thread was sitting in PR_Poll as usual - ready to run at any time and cause havoc. 

Its not clear to me how to fix it.

You can reproduce pretty much at will by just breaking on the httpConnection code and pulling up an https url.
Blocks: 599164
Blocks: 603503
Patrick: I believe your comment 4 is out of original scope of this *6* years old bug.  The code has been changed a lot since that time.  However, what you have reported should be investigated in a different bug, looks like there may be a potential race condition.

timeless, kaie: Shouldn't we close this (and potentially open new bugs if there are still some issues) ?  I don't think the description is reproducible any more.
(In reply to comment #5)
 However, what you
> have reported should be investigated in a different bug, looks like there may
> be a potential race condition.
> 

sure - 615342.
No longer blocks: 599164
No longer blocks: 603503
PSM pretty much has to access some of the callbacks on the main thread, because some of the callbacks may be implemented in JS and that JS code has to run on the main thread.

Now, the question is whether PSM should also be blocking the socket transport thread from doing anything while it is playing with the callbacks. That seems to be what bug 615342 is about.

I am resolving this as invalid based on the above thinking. Please re-open if I am wrong about something.
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → INVALID
to resolve this I would think minimally the assertion needs to be removed because it is asserting something that is not universally true.

right?
Status: RESOLVED → REOPENED
Resolution: INVALID → ---
If this is still relevant, it seems like this should be in Networking, not Security: UI (fwiw, nsPSMBackgroundThread is now dead code that is being removed in bug 1301157).
Component: Security: UI → Networking
Whiteboard: [kerh-cuz][psm-logic] → [kerh-cuz][psm-logic][necko-backlog]
Bulk change to priority: https://bugzilla.mozilla.org/show_bug.cgi?id=1399258
Priority: -- → P1
Bulk change to priority: https://bugzilla.mozilla.org/show_bug.cgi?id=1399258
Priority: P1 → P3
Severity: normal → S3

(In reply to Dana Keeler (she/her) (use needinfo) (:keeler for reviews) from comment #9)

If this is still relevant, it seems like this should be in Networking, not
Security: UI (fwiw, nsPSMBackgroundThread is now dead code that is being
removed in bug 1301157).

The thread is now gone, and nsHttpConnection::GetInterface seems to have no code coverage anymore (I assume it is dead code now).
We'll remove it (or at least the warning in bug 615342)

Status: REOPENED → RESOLVED
Closed: 13 years ago1 year ago
Resolution: --- → WORKSFORME
You need to log in before you can comment on or make changes to this bug.