Closed Bug 588931 Opened 9 years ago Closed 9 years ago

Crash in [@ nssTrustDomain_LockCertCache ]


(Core :: Security: PSM, defect, critical)

(Reporter: marcia, Unassigned)



(Keywords: crash, topcrash, Whiteboard: See comment 2)


Seen while sifting through trunk crash stats. links to all the crashes. No comments in any of the reports. Crash data indicates it started showing up in 2010081800 builds and there have been a few using today's nightly as well.

Apologies in advance if I selected the incorrect location for this bug.

Frame  	Module  	Signature  	Source
0 	libnss3.dylib 	nssTrustDomain_LockCertCache 	tdcache.c:405
1 	libnss3.dylib 	nssCertificate_Destroy 	certificate.c:144
2 	libssl3.dylib 	ssl3_CleanupPeerCerts 	ssl3con.c:7744
3 	libssl3.dylib 	ssl3_DestroySSL3Info 	ssl3con.c:9452
4 	libssl3.dylib 	ssl_DestroySocketContents 	sslsock.c:407
5 	libssl3.dylib 	ssl_FreeSocket 	sslsock.c:478
6 	libssl3.dylib 	ssl_DefClose 	ssldef.c:233
7 	XUL 	nsNSSSocketInfo::CloseSocketAndDestroy 	security/manager/ssl/src/nsNSSIOLayer.cpp:1825
8 	XUL 	nsSSLThread::requestClose 	security/manager/ssl/src/nsSSLThread.cpp:437
9 	XUL 	nsSSLIOLayerClose 	security/manager/ssl/src/nsNSSIOLayer.cpp:1814
10 	XUL 	nsSocketTransport::OnSocketDetached 	netwerk/base/src/nsSocketTransport2.cpp:1395
11 	XUL 	nsSocketTransportService::DetachSocket 	netwerk/base/src/nsSocketTransportService2.cpp:187
12 	XUL 	nsSocketTransportService::DoPollIteration 	netwerk/base/src/nsSocketTransportService2.cpp:658
13 	XUL 	nsSocketTransportService::OnProcessNextEvent 	netwerk/base/src/nsSocketTransportService2.cpp:543
14 	XUL 	nsThread::ProcessNextEvent 	xpcom/threads/nsThread.cpp:517
15 	XUL 	NS_ProcessPendingEvents_P 	nsThreadUtils.cpp:200
16 	XUL 	nsSocketTransportService::Run 	netwerk/base/src/nsSocketTransportService2.cpp:579
17 	XUL 	nsThread::ProcessNextEvent 	xpcom/threads/nsThread.cpp:547
18 	XUL 	NS_ProcessNextEvent_P 	nsThreadUtils.cpp:250
19 	XUL 	nsThread::ThreadFunc 	xpcom/threads/nsThread.cpp:263
20 	libnspr4.dylib 	_pt_root 	nsprpub/pr/src/pthreads/ptthread.c:228
21 	libSystem.B.dylib 	_pthread_start 	
22 	libSystem.B.dylib 	thread_start
I believe that this occurs after Firefox Sync connects to the server.

Before this crash, Firefox shows a dialog with the message "The operation can not be completed because of an internal failure. A secure network communication has not been cleaned up correctly."
Oh, so this is due to the browser shutting down NSS while some SSL socket 
is still open, then trying to shut down that SSL socket.  

I thought PSM was supposed to prevent that somehow. (?)

Most likely I'll give this bug to PSM.
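To illustrate the failure mode described above (a library shut down while an object that still points into it is alive), here is a minimal toy model in C. Every name in it is hypothetical; it is not NSS or PSM code. It sketches one possible guard, where shutdown reports "busy" while certificates are outstanding instead of freeing state that a later destroy call would lock.

```c
#include <stddef.h>
#include <stdlib.h>

/* Toy stand-ins; none of these names are NSS or PSM APIs. */
typedef struct {
    int cache_lock;   /* stands in for the cert cache lock */
    int live_certs;   /* certificates still referencing this domain */
} ToyTrustDomain;

typedef struct {
    ToyTrustDomain *td;
} ToyCert;

static ToyTrustDomain *g_td = NULL;

/* Returns 0 on success, -1 on allocation failure. */
int toy_init(void) {
    if (g_td != NULL) return 0;
    g_td = calloc(1, sizeof *g_td);
    return g_td != NULL ? 0 : -1;
}

/* Returns NULL if the library is not initialized. */
ToyCert *toy_cert_new(void) {
    if (g_td == NULL) return NULL;
    ToyCert *c = malloc(sizeof *c);
    if (c == NULL) return NULL;
    c->td = g_td;
    g_td->live_certs++;
    return c;
}

void toy_cert_destroy(ToyCert *c) {
    if (c == NULL) return;
    c->td->cache_lock = 1;  /* this access is what breaks if the domain was freed */
    c->td->live_certs--;
    c->td->cache_lock = 0;
    free(c);
}

/* Returns 0 on success, -1 ("busy") while certificates are outstanding. */
int toy_shutdown(void) {
    if (g_td == NULL) return 0;
    if (g_td->live_certs > 0) return -1;
    free(g_td);
    g_td = NULL;
    return 0;
}
```

In this sketch, calling toy_shutdown() while a ToyCert exists returns -1 and leaves the domain intact, so a later toy_cert_destroy() is still safe. Without the live_certs check, the destroy call would touch freed memory, which is the shape of the stack above.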
The error message referred to in Comment 1 is filed as Bug 588511.
#2 top crash in early firefox 4.0b5 data
blocking2.0: --- → ?
Assignee: nobody → nobody
Component: Libraries → Security: PSM
Product: NSS → Core
QA Contact: libraries → psm
Whiteboard: See comment 2
Version: trunk → Trunk
I've seen this twice today. Both times it was after installing b5 after b4 had been open; I speculate it is related to the tabs I had open to Yahoo and Gmail. It looked like it tried to restore the tabs after install, but something went really awry. It opened a blank browser and I couldn't do anything until I shut the program down. After a restart, it seemed to be fine. The error box said "The operation can not be completed because of an internal failure. A secure network communication has not been cleaned up correctly."
Who can we talk to about this? It looks like someone is suddenly shutting down NSS, both in this bug and in bug 517615. Maybe we can put an assert in shutdown and find out who is causing this?

(Of course we should comment out the shutdown call in PSM.) Do we have any indication whether these are happening when we are trying to shut down the browser? Is there some new profile switching feature in FF 4.0?
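The idea of instrumenting shutdown to find the caller could look roughly like the following. This is only a sketch with hypothetical names (traced_init, TRACED_SHUTDOWN, traced_check_alive); it is not an NSS or PSM API. It records the call site of the shutdown so that any later use can report who shut the library down.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical tracing state; not part of NSS or PSM. */
static int         g_lib_alive = 0;
static const char *g_shutdown_file = NULL;
static int         g_shutdown_line = 0;

void traced_init(void) {
    g_lib_alive = 1;
    g_shutdown_file = NULL;
    g_shutdown_line = 0;
}

/* Marks the library as shut down and remembers where the call came from. */
void traced_shutdown_at(const char *file, int line) {
    g_lib_alive = 0;
    g_shutdown_file = file;
    g_shutdown_line = line;
}

/* Captures the caller's file and line automatically. */
#define TRACED_SHUTDOWN() traced_shutdown_at(__FILE__, __LINE__)

/* Returns 1 if the library is usable. Otherwise logs the recorded
 * shutdown site to stderr and returns 0. */
int traced_check_alive(const char *what) {
    if (g_lib_alive) return 1;
    fprintf(stderr, "%s used after shutdown (shutdown called at %s:%d)\n",
            what,
            g_shutdown_file ? g_shutdown_file : "(never initialized)",
            g_shutdown_line);
    return 0;
}
```

A debug build could turn the 0 return into a hard assert. The softer form shown here keeps running, so the offending call site ends up in the log rather than only in a crash report.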

Here are some user comments from the crash reports:

"It restarted after applying the update to b5 and it promptly crashed."

"Running firefox from remote desktop"

"Updated to the lastest version of 4b5. Crashes after trying to restore previous session."

I have never hit this crash stack myself, but I have definitely seen the error message in Bug 588511 when testing on trunk Win XP and Vista, and this is after I had applied a software update.
Bob, As you know, FF is still using cert8.db, the Berkeley DB version.
IIRC, that DB has the property that, whenever the DB is enlarged, it MUST
be closed properly or it becomes corrupt.  It's not enough to force the 
flush of the dirty data pages (as we do before each softoken operation 
completes) when the DB enlarges, because this does not write the DB header
(page 0), IIRC.  Only file close does that, IIRC.  So, I'm worried that 
removing the NSS_Shutdown call will guarantee the creation of corrupt 
cert8.DBs in new profiles.
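The concern above can be pictured with a small toy model. It models only the behavior as recalled in this comment (a flush writes data pages, and only a close rewrites the header) and is not Berkeley DB code or its on-disk format; all names are hypothetical.

```c
#include <stddef.h>

/* Toy model only; this is not Berkeley DB code or its file format. */
typedef struct {
    int mem_pages;          /* pages the open handle knows about */
    int disk_data_pages;    /* data pages present on disk */
    int disk_header_pages;  /* page count recorded in the on-disk header (page 0) */
} ToyDb;

void toydb_open(ToyDb *db) {
    db->mem_pages = 1;
    db->disk_data_pages = 1;
    db->disk_header_pages = 1;
}

/* Enlarges the database in memory. */
void toydb_grow(ToyDb *db) {
    db->mem_pages++;
}

/* Writes dirty data pages but leaves the header untouched. */
void toydb_flush(ToyDb *db) {
    db->disk_data_pages = db->mem_pages;
}

/* A proper close flushes the data pages and then rewrites the header. */
void toydb_close(ToyDb *db) {
    toydb_flush(db);
    db->disk_header_pages = db->mem_pages;
}

/* Returns 1 when the on-disk header agrees with the on-disk data pages. */
int toydb_disk_consistent(const ToyDb *db) {
    return db->disk_header_pages == db->disk_data_pages;
}
```

In this model, growing and flushing without a close leaves the header one page behind the data, which is the inconsistent state the comment worries about if the shutdown call were removed.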
I could reproduce this reliably a few days ago with these steps:

- Run Firefox 4 (with some add-ons) and close it
- Run Firefox 3.6 with same profile (it will check for add-on compatibility) and close it
- Run Firefox 4 again with the same profile. After it checks add-on compatibility again, the error dialog from bug 588511 is shown. After clicking OK, the browser opens (sometimes it can't load any data from the web) and after a few seconds it crashes.

Today, with the same steps, I've been unable to reproduce the crash, although I still get webpages failing to load after bug 588511 happens.
mconner/ed, can you have a look from the sync side? Other comments talk about an update being involved, but maybe that is just a bystander.
I'm betting this is tied to bug 588511.  We're way way out of the startup path on trunk (no more binary component) so I'm betting it's tied to the addon manager invoking NSS really early.
Depends on: 588511
Agreed -- haven't seen this since the fix for bug 588511 landed (was first in the sep 15 nightly). DUP!
Closed: 9 years ago
Resolution: --- → DUPLICATE
Duplicate of bug: 588511
blocking2.0: ? → ---
Crash Signature: [@ nssTrustDomain_LockCertCache ]