Bug 522041 - Crash on startup with FIPS mode enabled [@pthread_mutex_lock | nsSSLIOLayerAddToSocket] when NSS .chk files are missing
Status: VERIFIED DUPLICATE of bug 521849
Keywords: crash, regression, relnote
Product: Core
Classification: Components
Component: Security: PSM
Version: 1.9.1 Branch
Hardware: x86 Mac OS X
Importance: P2 critical with 4 votes
Target Milestone: mozilla1.9.3a1
Assigned To: Honza Bambas (:mayhemer)
Mentors:
honzab.moz@firemni.cz
Depends on: 522220
Blocks: 509319 515645
Reported: 2009-10-13 08:47 PDT by Henrik Skupin (:whimboo)
Modified: 2011-06-09 14:58 PDT (History)
mbeltzner: blocking1.9.2-
anthony.s.hughes: in‑litmus-


Attachments
v1 (1001 bytes, patch)
2009-12-28 03:53 PST, Honza Bambas (:mayhemer)

Description Henrik Skupin (:whimboo) 2009-10-13 08:47:25 PDT
Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.5; en-US; rv:1.9.2b1pre) Gecko/20091011 Namoroka/3.6b1pre ID:20091011033822

Starting Namoroka with a profile in which FIPS mode was enabled by a 1.9.1 branch version (e.g. Firefox 3.5.1) always crashes the browser.

Steps:
1. Start Firefox 3.5.1 and set a Master Password
2. Go to Preferences | Advanced | Encryption | Security Devices and enable FIPS mode
3. Start Namoroka

At step 3, Namoroka always crashes on my machine within 4 seconds.

Crash report: bp-2189e99b-4e76-4665-9760-ac5702091013

First 10 frames:
0  	libSystem.B.dylib  	pthread_mutex_lock  	
1 	libnspr4.dylib 	PR_Lock 	nsprpub/pr/src/pthreads/ptsynch.c:206
2 	XUL 	nsSSLIOLayerAddToSocket 	
3 	XUL 	nsSSLIOLayerNewSocket 	
4 	XUL 	nsSSLSocketProvider::NewSocket 	
5 	XUL 	nsSocketTransport::BuildSocket 	
6 	XUL 	nsSocketTransport::InitiateSocket 	
7 	XUL 	nsSocketTransport::OnSocketEvent 	
8 	XUL 	nsSocketEvent::Run 	
9 	XUL 	nsThread::ProcessNextEvent 	
10 	XUL 	NS_ProcessNextEvent_P
Comment 1 Henrik Skupin (:whimboo) 2009-10-13 08:50:56 PDT
Btw, it only happens with optimized builds. Sadly I'm not able to crash my debug build.
Comment 2 Henrik Skupin (:whimboo) 2009-10-13 08:51:15 PDT
Oh and I have seen this while checking bug 521878.
Comment 3 Henrik Skupin (:whimboo) 2009-10-13 08:54:42 PDT
Regression between 090901 and 091001. I will bisect to find the regression range.
Comment 4 Henrik Skupin (:whimboo) 2009-10-13 09:28:09 PDT
Regressed between the builds 09092103 and 09092203.

Pass: http://hg.mozilla.org/releases/mozilla-1.9.2/rev/25e1253030f4
Fail: http://hg.mozilla.org/releases/mozilla-1.9.2/rev/1616267e8153

Changesets: http://hg.mozilla.org/releases/mozilla-1.9.2/pushloghtml?fromchange=25e1253030f4&tochange=1616267e8153

I can see only one change in this range that could have caused this crash: http://hg.mozilla.org/releases/mozilla-1.9.2/rev/fb2192ebeff0.

Looks like another fallout from bug 516396.
Comment 5 Henrik Skupin (:whimboo) 2009-10-13 09:46:13 PDT
Looks like comment 4 has the wrong range. While the crash always happens with builds starting from 090922, it is trickier to get earlier builds to crash. I tried builds from the days before, and they crashed too. At the moment I can't get 09091603 to crash, while 09091703 crashes:

Pass: http://hg.mozilla.org/releases/mozilla-1.9.2/rev/0ee58a54c5d6
Fail: http://hg.mozilla.org/releases/mozilla-1.9.2/rev/34da890d632d

http://hg.mozilla.org/releases/mozilla-1.9.2/pushloghtml?fromchange=0ee58a54c5d6&tochange=34da890d632d

Probably related bugs: bug 509558, bug 509319.
Comment 6 Henrik Skupin (:whimboo) 2009-10-13 10:11:57 PDT
Given bug 521878, this looks more like a regression from bug 509319.
Comment 7 Henrik Skupin (:whimboo) 2009-10-13 17:09:19 PDT
I have updated my Shiretoko builds to the most recent version, and those are crashing now too. Running with Firefox 3.5.4 doesn't crash, so it looks like we are safe for 3.5.4. But I would like to test with the patch on bug 521878.
Comment 9 Henrik Skupin (:whimboo) 2009-10-13 17:13:04 PDT
The stack is a bit different: bp-d91d86be-2aca-4870-b6a9-6f69d2091013

0  	libSystem.B.dylib  	pthread_mutex_lock  	
1 	libnspr4.dylib 	PR_Lock 	nsprpub/pr/src/pthreads/ptsynch.c:206
2 	XUL 	nsSSLIOLayerHelpers::isKnownAsIntolerantSite 	nsAutoLock.h:219
3 	XUL 	nsSSLIOLayerAddToSocket 	security/manager/ssl/src/nsNSSIOLayer.cpp:3421
4 	XUL 	nsSSLIOLayerNewSocket 	security/manager/ssl/src/nsNSSIOLayer.cpp:2174
5 	XUL 	nsSSLSocketProvider::NewSocket 	security/manager/ssl/src/nsSSLSocketProvider.cpp:72
6 	XUL 	nsSocketTransport::BuildSocket 	netwerk/base/src/nsSocketTransport2.cpp:1016
7 	XUL 	nsSocketTransport::InitiateSocket 	netwerk/base/src/nsSocketTransport2.cpp:1118
8 	XUL 	nsSocketTransport::OnSocketEvent 	netwerk/base/src/nsSocketTransport2.cpp:1447
9 	XUL 	nsSocketEvent::Run 	netwerk/base/src/nsSocketTransport2.cpp:98
10 	XUL 	nsThread::ProcessNextEvent 	xpcom/threads/nsThread.cpp:521
Comment 10 Henrik Skupin (:whimboo) 2009-10-13 17:16:46 PDT
As discussed with Nick on IRC, this only crashes nightly builds of Firefox.
Comment 11 Nick Thomas [:nthomas] 2009-10-13 18:20:23 PDT
Turns out there aren't any chk files in the nightly builds, but there are in the "3.5.4 build 1" release build. Comparing mozconfigs [1] implicates the --disable-install-strip in the nightly config, because of
  http://mxr.mozilla.org/mozilla1.9.2/source/toolkit/mozapps/installer/packager.mk#386
On mac we compile ppc and i386, remove the chk files for both, create the universal build, and then should recreate the chk files for the fat binary. Unless you set --disable-install-strip that is.

The mozconfig change landed between 2009-09-16 and 2009-09-17 nightlies, so that matches the regression window in comment #5.

There are two bugs here I think:
1) --disable-install-strip is overloaded, and probably shouldn't control creating nss checksums. A Core:Build Config bug.
2) NSS should handle missing chk files more gracefully

------
[1] http://hg.mozilla.org/build/buildbot-configs/file/b135980dd22b/mozilla2/macosx/mozilla-central/nightly/mozconfig (yes really, there's a symlink)
http://hg.mozilla.org/build/buildbot-configs/file/b135980dd22b/mozilla2/macosx/mozilla-1.9.2/release/mozconfig
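
The packaging problem described in this comment comes down to shared libraries shipping without their sibling .chk files. A minimal sketch of such a check follows; it is not part of any Mozilla tooling. The function names `chkNameFor` and `missingChkFiles` are hypothetical, the directory listing is passed in as plain file names rather than read from disk, and the assumption that a library's checksum file shares its base name with a `.chk` extension is taken from the `libnssdbm3.chk` file name mentioned later in this bug.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Replaces the final extension of a file name with ".chk",
// e.g. "libnssdbm3.dylib" -> "libnssdbm3.chk".
// Assumption: the checksum file shares the library's base name.
std::string chkNameFor(const std::string& library) {
    const std::string::size_type dot = library.rfind('.');
    const std::string base =
        (dot == std::string::npos) ? library : library.substr(0, dot);
    return base + ".chk";
}

// Returns the .chk file names that are expected for `libraries`
// but do not appear in `directoryListing`.
std::vector<std::string> missingChkFiles(
        const std::vector<std::string>& directoryListing,
        const std::vector<std::string>& libraries) {
    std::vector<std::string> missing;
    for (const std::string& library : libraries) {
        const std::string chk = chkNameFor(library);
        const bool present =
            std::find(directoryListing.begin(), directoryListing.end(), chk)
            != directoryListing.end();
        if (!present) {
            missing.push_back(chk);
        }
    }
    return missing;
}
```

A packaging step could run a check like this after the universal binary is assembled and fail the build if the returned list is not empty, independent of whether stripping is enabled.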
Comment 12 Nelson Bolyard (seldom reads bugmail) 2009-10-13 19:29:51 PDT
I'm not yet convinced there's any NSS bug here.
I have yet to see a stack with any NSS functions on it.
The stacks in comment 0 and comment 9 contain no NSS functions whatsoever.
What they DO contain is lots of PSM code, which is browser code. 
If PSM is calling NSPR to lock a PRLock with a NULL PRLock pointer, that's
not necessarily an NSS fault nor an NSPR fault.  

Maybe some function is being consistently omitted from these stack traces.
If and when it comes to light, that will be one thing.

I believe that the absent .chk files cause NSS functions to fail gracefully.
I suspect that some PSM code doesn't notice that NSS has reported failure,
and plows on ahead into the abyss.
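
The failure mode suspected in this comment can be sketched in a few lines. This is an illustration only, not PSM code: `IntolerantSiteList` and its methods are hypothetical names, and `std::mutex` stands in for a `PRLock`. The sketch assumes a helper whose lock is allocated only when initialization succeeds, and shows lookups that report failure instead of locking through a null pointer.

```cpp
#include <memory>
#include <mutex>
#include <set>
#include <string>

// Hypothetical stand-in for a helper that owns a lock.
// std::mutex plays the role of a PRLock here.
class IntolerantSiteList {
public:
    // Returns false when the security library failed to initialize;
    // in that case the lock is never allocated.
    bool init(bool securityLibraryOk) {
        if (!securityLibraryOk) {
            return false;
        }
        lock_.reset(new std::mutex);
        return true;
    }

    bool isInitialized() const { return lock_ != nullptr; }

    // Silently ignores the request when the helper was never initialized.
    void remember(const std::string& site) {
        if (!lock_) {
            return;
        }
        std::lock_guard<std::mutex> guard(*lock_);
        sites_.insert(site);
    }

    // Writes the answer to *known and returns true on success.
    // Returns false, leaving *known untouched, when there is no lock,
    // rather than dereferencing a null pointer.
    bool isKnownIntolerant(const std::string& site, bool* known) const {
        if (!lock_) {
            return false;
        }
        std::lock_guard<std::mutex> guard(*lock_);
        *known = sites_.count(site) != 0;
        return true;
    }

private:
    std::unique_ptr<std::mutex> lock_;
    std::set<std::string> sites_;
};
```

The point of the sketch is that the caller gets an explicit failure it can propagate (for example, by refusing to create the socket) when initialization did not happen.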
Comment 13 Ted Mielczarek [:ted.mielczarek] 2009-10-14 03:59:20 PDT
(In reply to comment #11)
> 1) --disable-install-strip is overloaded, and probably shouldn't control
> creating nss checksums. A Core:Build Config bug.

I have filed bug 522220 on this issue.
Comment 14 glen beasley 2009-10-20 16:10:03 PDT
Installing Namoroka/3.6b1pre I was able to reproduce this error. 

Installing  Firefox 3.6beta1 I no longer see the error. 

http://ftp.mozilla.org/pub/mozilla.org/firefox/nightly/3.6b1-candidates/build1/

I also have older Firefox 3.5.3 installed on my system.

If I manually corrupt Firefox 3.5.3 or the newer Firefox 3.6beta1 by removing a
.chk file from the installed location, each browser version behaves the same way: it shows
an alert window on startup stating that the security component was not initialized and that the user needs to
correct the issue (not the exact wording).

I believe the fix was done in bug 509319, so marking as duplicate.

*** This bug has been marked as a duplicate of bug 509319 ***
Comment 15 Wan-Teh Chang 2009-10-20 20:10:25 PDT
Glen, this bug is about PSM crashing when NSS can not be initialized (in FIPS mode).
Ideally, PSM should not crash if NSS initialization fails.  So this bug is not a duplicate
of bug 509319.
Comment 16 glen beasley 2009-10-21 11:03:59 PDT
(In reply to comment #15)
> Glen, this bug is about PSM crashing when NSS can not be initialized (in FIPS
> mode).
> Ideally, PSM should not crash if NSS initialization fails.  So this bug is not
> a duplicate
> of bug 509319.

I agree that PSM requires a fix for the case where it is not able to initialize NSS in FIPS mode,
but I believe bug 503418 is the more appropriate test case for that issue.
This description has FIPS mode enabled by a previous working version of Firefox, then you
install a newer version of Firefox such as Namoroka and the new Firefox version
is unable to launch. 

When I duped this bug I had only tested with Firefox 3.6beta1, which installs the .chk files correctly.

I now have tested 

Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.6; en-US; rv:1.9.2b2pre)
Gecko/20091020 Namoroka/3.6b2pre

Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.6; en-US; rv:1.9.1.5pre)
Gecko/20091020 Shiretoko/3.5.5pre

Both of these recent nightly builds are missing the required .chk files and therefore
will not be able to initialize FIPS mode.

However, on startup both show an alert window that states:

"Could not initialize the application's security component. The most likely cause is problems with files in your application's profile directory. Please check that this directory has no read/write restrictions and your hard disk is not full or close to full. It is recommended that you exit the application and fix the problem. If you continue to use this session, you might see incorrect application behaviour when accessing security features."


If you then try to go to a site that requires SSL you get an Alert stating 

Secure Connection Failed
        
An error occurred during a connection to www.wellsfargo.com.

Can't connect securely because the SSL protocol has been disabled.

(Error code: ssl_error_ssl_disabled)

I would say that this bug is fixed for the test case in its description. The security component is
not initialized, and the initialization alert should help the user fix the problem. Granted, if the .chk files were not installed the user will have to get a new version of Firefox, but at least the alert, and the fact that the security component was not initialized, is correct behavior.

There are two open issues that need to be addressed for bug 503418:

1) Nightly builds require .chk files. I believe bug 522220 should
address this issue.
2) PSM should not crash if it is unable to put NSS in FIPS mode.
I plan to address this issue in bug 511320.
Comment 17 Wan-Teh Chang 2009-10-22 13:51:43 PDT
Glen, since you have examined these bugs in depth, please
mark this bug with the right resolution.  Thanks!
Comment 18 Henrik Skupin (:whimboo) 2009-10-23 02:15:18 PDT
(In reply to comment #16)
> I would say that this bug is Fixed for the test case description of this bug.
> The security component is
> not initialized; the initialization alert should help the user in fixing the
> problem. Granted if the .chk files were not installed the user will have to get
> a new version of Firefox but at least the Alert and the fact that the security
> component was not initialized is correct behavior.

This bug is not fixed. The most recent nightly build of Namoroka still crashes for me with the given steps. See bp-8cd93a7f-269f-4e2b-b939-366692091023
Comment 19 glen beasley 2009-10-23 09:39:25 PDT
Now that the nightly trunk builds should include .chk files thanks to bug 522220, closing this bug.
Comment 20 glen beasley 2009-10-23 09:43:04 PDT
(In reply to comment #18)
> (In reply to comment #16)
> > I would say that this bug is Fixed for the test case description of this bug.
> > The security component is
> > not initialized; the initialization alert should help the user in fixing the
> > problem. Granted if the .chk files were not installed the user will have to get
> > a new version of Firefox but at least the Alert and the fact that the security
> > component was not initialized is correct behavior.
> 
> This bug is not fixed. The most recent nightly build of Namoroka still crashes
> for me with the given steps. See bp-8cd93a7f-269f-4e2b-b939-366692091023

Sorry, I missed comment 18. Did you not get any alert windows?

I am unable to reproduce, since my Namoroka builds with the missing .chk files
pop up the alert window and the security component is not initialized.
Comment 21 Henrik Skupin (:whimboo) 2009-10-23 11:15:05 PDT
The alert window pops up for a split second before we crash, with both Minefield and Namoroka.
Comment 22 Henrik Skupin (:whimboo) 2009-11-04 17:53:49 PST
Now that bug 522220 is fixed I cannot reproduce this crash anymore. Marking verified fixed with Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.5; en-US; rv:1.9.3a1pre) Gecko/20091104 Minefield/3.7a1pre ID:20091104031046.
Comment 23 Nelson Bolyard (seldom reads bugmail) 2009-11-05 07:25:54 PST
Is this fixed at all?

This bug says "PSM crashes whenever NSS fails to initialize in FIPS mode."
There are many possible causes for NSS to fail to initialize in FIPS mode.
Recently one such cause was found (a build problem, bug 522220) and corrected.
But, as far as I can tell, NOTHING was done about the problem that, when NSS 
fails to initialize in FIPS mode, which is not a bug in itself, PSM crashes.  

So, are you sure this bug is verified/fixed?  
Or do you only care about the particular test case that you were experiencing?
Comment 24 Ted Mielczarek [:ted.mielczarek] 2009-11-05 07:53:44 PST
Yeah, I agree that the underlying cause of the crash here is not fixed. We simply fixed the build bug that was causing missing .chk files which exposed it.
Comment 25 Henrik Skupin (:whimboo) 2009-11-05 09:36:16 PST
Oh, that's true. So it's definitely not fixed. Thanks Nelson.
Comment 26 Henrik Skupin (:whimboo) 2009-11-12 14:42:13 PST
The crash seems to have returned with recent Namoroka builds:
http://crash-stats.mozilla.com/report/index/8f93f102-0eca-4126-a6e8-7a6432091112?p=1
Comment 27 Nick Thomas [:nthomas] 2009-11-12 14:46:44 PST
(In reply to comment #26)
Bug 522220 only landed in Namoroka builds for the 20091113 nightly.
Comment 28 David Baron :dbaron: ⌚️UTC-7 (review requests must explain patch) 2009-11-30 13:14:13 PST
In a Minefield debug build, I'm seeing the same as comment 20; if I enable FIPS and remove the .chk files, I get the alert dialog described in comment 16, and SSL is disabled.
Comment 29 Boris Zbarsky [:bz] (Out June 25-July 6) 2009-12-01 00:00:58 PST
Given that we're shipping the .chk files now (and hence the crash should be pretty difficult to reproduce), is this still a release blocker?
Comment 30 Mike Beltzner [:beltzner, not reading bugmail] 2009-12-02 22:24:33 PST
Yes, quite right, bz; no longer a blocker.
Comment 31 Henrik Skupin (:whimboo) 2009-12-04 21:39:31 PST
(In reply to comment #28)
> In a Minefield debug build, I'm seeing the same as comment 20; if I enable FIPS
> and remove the .chk files, I get the alert dialog described in comment 16, and
> SSL is disabled.

For me it's reproducible all the time. Here are some updated steps:

1. Create a profile with Shiretoko and enable FIPS mode there.
2. Start a Minefield build => no crash
3. Remove the libnssdbm3.chk file from the application folder
4. Start the Minefield build again => crash
Comment 32 Honza Bambas (:mayhemer) 2009-12-16 14:07:06 PST
Henrik, thanks, these steps seem to work; I have it in the debugger. Going to take a look at that.
Comment 33 Honza Bambas (:mayhemer) 2009-12-17 15:14:05 PST
Looks like a regression from bug 456705. We display an alert dialog (letting other events be handled on the main thread) while we are in the middle of instantiating the nsNSSComponent service (which is responsible for NSS initialization and is checked for before any security component or SSL socket is created, to ensure we have NSS).

However, during the instantiation process we do not make other callers that check for nsNSSComponent fail. It was done this way because I found some components that check for nsNSSComponent while we are instantiating it. So, while the dialog is displayed, we let nsSSLSocketProvider be initialized even though we do not have NSS yet. This causes the crash.

The simplest solution for this bug is to invoke the dialog asynchronously. That way we exit from the nsNSSComponent instantiation, socket provider creation fails, and there is no crash.

For a correct long-term solution I have to think about it more deeply. Ideally, other threads should wait until nsNSSComponent initialization is complete.
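
The re-entrancy described in this comment can be modeled with a toy single-threaded event queue. This is a sketch under stated assumptions, not the actual PSM or XPCOM code: `EventLoop`, `Simulation`, `AlertMode`, and `simulateStartup` are hypothetical names, NSS initialization is assumed to fail, and a socket request is assumed to be already queued on the main thread when the component is instantiated. A synchronous alert spins the queue while the component is still half-built; an asynchronous alert is merely queued, so instantiation finishes first.

```cpp
#include <deque>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical model of the main-thread event queue.
struct EventLoop {
    std::deque<std::function<void()>> pending;

    void post(std::function<void()> event) {
        pending.push_back(std::move(event));
    }

    // Runs queued events until the queue is empty. A modal dialog
    // effectively does this while it is on screen.
    void spin() {
        while (!pending.empty()) {
            std::function<void()> event = std::move(pending.front());
            pending.pop_front();
            event();
        }
    }
};

enum class AlertMode { Sync, Async };

struct Simulation {
    EventLoop loop;
    bool instantiating = false;
    bool nssReady = false;  // stays false: initialization is assumed to fail
    std::vector<std::string> log;

    // A socket request handled on the main thread.
    void requestSocket() {
        if (nssReady) {
            log.push_back("socket created");
        } else if (instantiating) {
            // Lenient check: callers are not turned away while the
            // component is still being instantiated.
            log.push_back("socket built without NSS");
        } else {
            log.push_back("socket refused");
        }
    }

    void showAlert(AlertMode mode) {
        if (mode == AlertMode::Sync) {
            log.push_back("alert shown");
            loop.spin();  // nested loop: pending events run right now
        } else {
            loop.post([this] { log.push_back("alert shown"); });
        }
    }

    void instantiateComponent(AlertMode mode) {
        instantiating = true;
        showAlert(mode);  // reports the initialization failure
        instantiating = false;
    }
};

// Queues one socket request, instantiates the component, then drains
// the queue. Returns the order in which things happened.
std::vector<std::string> simulateStartup(AlertMode mode) {
    Simulation sim;
    sim.loop.post([&sim] { sim.requestSocket(); });
    sim.instantiateComponent(mode);
    sim.loop.spin();
    return sim.log;
}
```

In this model the synchronous mode lets the socket request through while instantiation is in progress, which corresponds to the crashing path; the asynchronous mode refuses the socket and shows the alert afterwards.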
Comment 34 Honza Bambas (:mayhemer) 2009-12-28 03:53:13 PST
Created attachment 419290 [details] [diff] [review]
v1

No crash, the same behavior.
Comment 35 Honza Bambas (:mayhemer) 2010-01-21 13:15:15 PST

*** This bug has been marked as a duplicate of bug 521849 ***
Comment 36 Anthony Hughes (:ashughes) [GFX][QA][Mentor] 2010-02-03 09:43:02 PST
(In reply to comment #35)
> 
> *** This bug has been marked as a duplicate of bug 521849 ***

in-litmus-, see dupe bug
Comment 37 CAROL 2010-08-05 06:24:11 PDT
Comment on attachment 419290 [details] [diff] [review]
v1

>diff --git a/security/manager/ssl/src/nsNSSComponent.cpp b/security/manager/ssl/src/nsNSSComponent.cpp
>--- a/security/manager/ssl/src/nsNSSComponent.cpp
>+++ b/security/manager/ssl/src/nsNSSComponent.cpp
>@@ -2245,21 +2245,21 @@ void nsNSSComponent::ShowAlert(AlertIden
>   else {
>     nsCOMPtr<nsIPrompt> prompter;
>     wwatch->GetNewPrompter(0, getter_AddRefs(prompter));
>     if (!prompter) {
>       PR_LOG(gPIPNSSLog, PR_LOG_DEBUG, ("can't get window prompter\n"));
>     }
>     else {
>       nsCOMPtr<nsIPrompt> proxyPrompt;
>       NS_GetProxyForObject(NS_PROXY_TO_MAIN_THREAD,
>                            NS_GET_IID(nsIPrompt),
>-                           prompter, NS_PROXY_SYNC,
>+                           prompter, NS_PROXY_ASYNC,
>                            getter_AddRefs(proxyPrompt));
>       if (!proxyPrompt) {
>         PR_LOG(gPIPNSSLog, PR_LOG_DEBUG, ("can't get proxy for nsIPrompt\n"));
>       }
>       else {
>         proxyPrompt->Alert(nsnull, message.get());
>       }
>     }
>   }
> }
Comment 38 timeless 2010-08-05 10:14:08 PDT
carol: i'm not sure why you touched that attachment
