Closed Bug 522041 Opened 15 years ago Closed 15 years ago

Crash on startup with FIPS mode enabled [@pthread_mutex_lock | nsSSLIOLayerAddToSocket] when NSS .chk files are missing

Categories

(Core :: Security: PSM, defect, P2)

1.9.1 Branch
x86
macOS
defect

Tracking

VERIFIED DUPLICATE of bug 521849
mozilla1.9.3a1
Tracking Status
status1.9.1 --- wanted

People

(Reporter: whimboo, Assigned: mayhemer)

Details

(Keywords: crash, regression, relnote)

Attachments

(1 file)

Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.5; en-US; rv:1.9.2b1pre) Gecko/20091011 Namoroka/3.6b1pre ID:20091011033822

Starting Namoroka with FIPS mode enabled by a 1.9.1 branch version (e.g. Firefox 3.5.1) always crashes the browser.

Steps:
1. Start Firefox 3.5.1 and set a Master Password
2. Go to Preferences | Advanced | Encryption | Security Devices and enable FIPS mode
3. Start Namoroka

In step 3 Namoroka always crashes on my machine within 4s.

Crash report: bp-2189e99b-4e76-4665-9760-ac5702091013

First 10 frames:
0  	libSystem.B.dylib  	pthread_mutex_lock  	
1 	libnspr4.dylib 	PR_Lock 	nsprpub/pr/src/pthreads/ptsynch.c:206
2 	XUL 	nsSSLIOLayerAddToSocket 	
3 	XUL 	nsSSLIOLayerNewSocket 	
4 	XUL 	nsSSLSocketProvider::NewSocket 	
5 	XUL 	nsSocketTransport::BuildSocket 	
6 	XUL 	nsSocketTransport::InitiateSocket 	
7 	XUL 	nsSocketTransport::OnSocketEvent 	
8 	XUL 	nsSocketEvent::Run 	
9 	XUL 	nsThread::ProcessNextEvent 	
10 	XUL 	NS_ProcessNextEvent_P
Flags: blocking1.9.2?
Btw, it only happens with optimized builds. Sadly I'm not able to crash my debug build.
Oh and I have seen this while checking bug 521878.
Regression between 090901 and 091001. I will bisect to find the regression range.
Flags: in-litmus?
Regressed between the builds 09092103 and 09092203.

Pass: http://hg.mozilla.org/releases/mozilla-1.9.2/rev/25e1253030f4
Fail: http://hg.mozilla.org/releases/mozilla-1.9.2/rev/1616267e8153

Changesets: http://hg.mozilla.org/releases/mozilla-1.9.2/pushloghtml?fromchange=25e1253030f4&tochange=1616267e8153

I can see only one fix in this time range which could have caused this regression: http://hg.mozilla.org/releases/mozilla-1.9.2/rev/fb2192ebeff0.

Looks like another fallout from bug 516396.
Flags: blocking1.9.2? → blocking1.9.2+
Priority: -- → P2
Looks like comment 4 has the wrong range. While the crash always happens with builds starting from 090922, it is trickier to get earlier builds to crash. I retried builds from the preceding days and they now crash too. At the moment I can't get 09091603 to crash, while 09091703 does:

Pass: http://hg.mozilla.org/releases/mozilla-1.9.2/rev/0ee58a54c5d6
Fail: http://hg.mozilla.org/releases/mozilla-1.9.2/rev/34da890d632d

http://hg.mozilla.org/releases/mozilla-1.9.2/pushloghtml?fromchange=0ee58a54c5d6&tochange=34da890d632d

Probably related bugs: bug 509558, bug 509319.
Given bug 521878, it's more likely a regression from bug 509319.
Blocks: 509319
No longer blocks: CVE-2009-0689
I have updated my Shiretoko builds to the recent version and those are crashing too now. Running with Firefox 3.5.4 doesn't crash. So it looks like we are safe for 3.5.4. But I would like to test with the patch on bug 521878.
blocking1.9.1: --- → ?
http://hg.mozilla.org/releases/mozilla-1.9.1/rev/b7dd9891657f (9/16)
http://hg.mozilla.org/releases/mozilla-1.9.1/rev/f3f8aeecc2bd (9/17)
The stack is a bit different: bp-d91d86be-2aca-4870-b6a9-6f69d2091013

0  	libSystem.B.dylib  	pthread_mutex_lock  	
1 	libnspr4.dylib 	PR_Lock 	nsprpub/pr/src/pthreads/ptsynch.c:206
2 	XUL 	nsSSLIOLayerHelpers::isKnownAsIntolerantSite 	nsAutoLock.h:219
3 	XUL 	nsSSLIOLayerAddToSocket 	security/manager/ssl/src/nsNSSIOLayer.cpp:3421
4 	XUL 	nsSSLIOLayerNewSocket 	security/manager/ssl/src/nsNSSIOLayer.cpp:2174
5 	XUL 	nsSSLSocketProvider::NewSocket 	security/manager/ssl/src/nsSSLSocketProvider.cpp:72
6 	XUL 	nsSocketTransport::BuildSocket 	netwerk/base/src/nsSocketTransport2.cpp:1016
7 	XUL 	nsSocketTransport::InitiateSocket 	netwerk/base/src/nsSocketTransport2.cpp:1118
8 	XUL 	nsSocketTransport::OnSocketEvent 	netwerk/base/src/nsSocketTransport2.cpp:1447
9 	XUL 	nsSocketEvent::Run 	netwerk/base/src/nsSocketTransport2.cpp:98
10 	XUL 	nsThread::ProcessNextEvent 	xpcom/threads/nsThread.cpp:521
As discussed with Nick on IRC, this only crashes nightly builds of Firefox.
Turns out there aren't any .chk files in the nightly builds, but there are in the "3.5.4 build 1" release build. Comparing mozconfigs [1] implicates the --disable-install-strip in the nightly config, because of
  http://mxr.mozilla.org/mozilla1.9.2/source/toolkit/mozapps/installer/packager.mk#386
On Mac we compile ppc and i386, remove the .chk files for both, create the universal build, and then should recreate the .chk files for the fat binary. Unless you set --disable-install-strip, that is.

The mozconfig change landed between 2009-09-16 and 2009-09-17 nightlies, so that matches the regression window in comment #5.

There are two bugs here I think:
1) --disable-install-strip is overloaded, and probably shouldn't control creating NSS checksums. A Core:Build Config bug.
2) NSS should handle missing .chk files more gracefully

------
[1] http://hg.mozilla.org/build/buildbot-configs/file/b135980dd22b/mozilla2/macosx/mozilla-central/nightly/mozconfig (yes really, there's a symlink)
http://hg.mozilla.org/build/buildbot-configs/file/b135980dd22b/mozilla2/macosx/mozilla-1.9.2/release/mozconfig
I'm not yet convinced there's any NSS bug here.
I have yet to see a stack with any NSS functions on it.
The stacks in comment 0 and comment 9 contain no NSS functions whatsoever.
What they DO contain is lots of PSM code, which is browser code. 
If PSM is calling NSPR to lock a PRLock with a NULL PRLock pointer, that's
not necessarily an NSS fault nor an NSPR fault.  

Maybe some function is being consistently omitted from these stack traces.
If and when it comes to light, that will be one thing.  

I believe that the absent .chk files cause NSS functions to fail gracefully.
I suspect that some PSM code doesn't notice that NSS has reported failure,
and plows on ahead into the abyss.
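
To illustrate the failure mode suspected above, here is a minimal sketch in standard C++ rather than NSPR or PSM code. SslHelpers, init and Lookup are hypothetical names, and std::mutex stands in for a PRLock. It assumes the lock is only created when the crypto library started successfully, and shows a lookup that checks for that before locking instead of locking through a null pointer.

```cpp
#include <cassert>
#include <memory>
#include <mutex>
#include <set>
#include <string>

// Hypothetical stand-in for PSM's SSL helper state; not the real nsSSLIOLayerHelpers.
struct SslHelpers {
    std::unique_ptr<std::mutex> lock;       // stays null if initialization failed
    std::set<std::string> intolerantSites;

    // Returns false (and creates no lock) when the crypto library failed to start.
    bool init(bool cryptoInitSucceeded) {
        if (!cryptoInitSucceeded) return false;
        lock.reset(new std::mutex());
        return true;
    }
};

enum class Lookup { Known, Unknown, NotInitialized };

// Checks the lock pointer before locking instead of locking through null.
Lookup isKnownAsIntolerantSite(SslHelpers& h, const std::string& host) {
    if (!h.lock) return Lookup::NotInitialized;
    std::lock_guard<std::mutex> guard(*h.lock);
    return h.intolerantSites.count(host) ? Lookup::Known : Lookup::Unknown;
}
```

A caller that gets NotInitialized back can refuse to build the socket, which is the graceful outcome this comment argues for.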
Assignee: nobody → kaie
Component: Libraries → Security: PSM
Product: NSS → Core
QA Contact: libraries → psm
Version: unspecified → 1.9.1 Branch
(In reply to comment #11)
> 1) --disable-install-strip is overloaded, and probably shouldn't control
> creating nss checksums. A Core:Build Config bug.

I have filed bug 522220 on this issue.
Summary: Crash on startup with FIPS mode enabled [@pthread_mutex_lock | nsSSLIOLayerAddToSocket] → Crash on startup with FIPS mode enabled [@pthread_mutex_lock | nsSSLIOLayerAddToSocket] when NSS .chk files are missing
blocking1.9.1: ? → .5+
Installing Namoroka/3.6b1pre I was able to reproduce this error. 

Installing  Firefox 3.6beta1 I no longer see the error. 

http://ftp.mozilla.org/pub/mozilla.org/firefox/nightly/3.6b1-candidates/build1/

I also have older Firefox 3.5.3 installed on my system.

If I corrupt Firefox 3.5.3 or the newer Firefox 3.6beta1 by manually removing a
.chk file from the installed location, each browser version behaves the same way: it shows
an alert window on startup stating that the security component was not initialized and that the user needs to
correct the issue (not the exact wording).

I believe the fix was done in bug 509319, so marking as duplicate.
Status: NEW → RESOLVED
Closed: 15 years ago
Resolution: --- → DUPLICATE
Glen, this bug is about PSM crashing when NSS can not be initialized (in FIPS mode).
Ideally, PSM should not crash if NSS initialization fails.  So this bug is not a duplicate
of bug 509319.
Status: RESOLVED → REOPENED
Resolution: DUPLICATE → ---
(In reply to comment #15)
> Glen, this bug is about PSM crashing when NSS can not be initialized (in FIPS
> mode).
> Ideally, PSM should not crash if NSS initialization fails.  So this bug is not
> a duplicate
> of bug 509319.

I agree that PSM requires a fix when it is not able to initialize NSS in fips mode
but I believe bug 503418 is the more appropriate test case for that issue.
This description has FIPS mode enabled by a previous working version of Firefox, then you
install a newer version of Firefox such as Namoroka and the new Firefox version
is unable to launch. 

When I duped this bug I had only tested with firefox 3.6beta1 which installs the .chk files correctly.

I now have tested 

Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.6; en-US; rv:1.9.2b2pre)
Gecko/20091020 Namoroka/3.6b2pre

Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.6; en-US; rv:1.9.1.5pre)
Gecko/20091020 Shiretoko/3.5.5pre

Both of these recent nightly builds are missing the required .chk files and therefore
will not be able to initialize FIPS mode.

However, on startup both come up with an alert window that states:

"Could not initialize the application's security component. The most likely cause is problems with files in your application's profile directory. Please check that this directory has no read/write restrictions and your hard disk is not full or close to full. It is recommended that you exit the application and fix the problem. If you continue to use this session, you might see incorrect application behaviour when accessing security features."


If you then try to go to a site that requires SSL you get an Alert stating 

Secure Connection Failed
        
An error occurred during a connection to www.wellsfargo.com.

Can't connect securely because the SSL protocol has been disabled.

(Error code: ssl_error_ssl_disabled)

I would say that this bug is fixed for the test case in its description. The security component is
not initialized, and the initialization alert should help the user fix the problem. Granted, if the .chk files were not installed the user will have to get a new version of Firefox, but at least showing the alert and leaving the security component uninitialized is correct behavior.

There are two open issues that need to be addressed for bug 503418:

1) Nightly builds require .chk files. I believe bug 522220 should
address this issue.
2) PSM should not crash if it is unable to put NSS in FIPS mode.
I plan to address this issue in bug 511320.
Glen, since you have examined these bugs in depth, please
mark this bug with the right resolution.  Thanks!
(In reply to comment #16)
> I would say that this bug is fixed for the test case in its description.
> The security component is
> not initialized, and the initialization alert should help the user fix the
> problem. Granted, if the .chk files were not installed the user will have to get
> a new version of Firefox, but at least showing the alert and leaving the security
> component uninitialized is correct behavior.

This bug is not fixed. The most recent nightly build of Namoroka still crashes for me with the given steps. See bp-8cd93a7f-269f-4e2b-b939-366692091023
Status: REOPENED → NEW
Now that the nightly trunk builds should have .chk files thanks to bug 522220, closing this bug.
Status: NEW → RESOLVED
Closed: 15 years ago → 15 years ago
Depends on: 522220
Resolution: --- → FIXED
(In reply to comment #18)
> (In reply to comment #16)
> > I would say that this bug is fixed for the test case in its description.
> > The security component is
> > not initialized, and the initialization alert should help the user fix the
> > problem. Granted, if the .chk files were not installed the user will have to get
> > a new version of Firefox, but at least showing the alert and leaving the security
> > component uninitialized is correct behavior.
> 
> This bug is not fixed. The most recent nightly build of Namoroka still crashes
> for me with the given steps. See bp-8cd93a7f-269f-4e2b-b939-366692091023

Sorry, I missed comment 18. Did you not get any alert windows?

I am unable to reproduce, since my Namoroka builds with the missing .chk files
pop up the alert window and the security component is not initialized.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
The alert window pops up for a split second before we crash with Minefield and Namoroka.
Status: REOPENED → NEW
Whiteboard: [fixed by 522220]
Status: NEW → RESOLVED
Closed: 15 years ago → 15 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla1.9.3a1
Now that bug 522220 is fixed I cannot reproduce this crash anymore. Marking verified fixed with Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.5; en-US; rv:1.9.3a1pre) Gecko/20091104 Minefield/3.7a1pre ID:20091104031046.
Status: RESOLVED → VERIFIED
Is this fixed at all?

This bug says "PSM crashes whenever NSS fails to initialize in FIPS mode."
There are many possible causes for NSS to fail to initialize in FIPS mode.
Recently one such cause was found (a build problem, bug 522220) and corrected.
But, as far as I can tell, NOTHING was done about the problem that, when NSS 
fails to initialize in FIPS mode, which is not a bug in itself, PSM crashes.  

So, are you sure this bug is verified/fixed?  
Or do you only care about the particular test case that you were experiencing?
Yeah, I agree that the underlying cause of the crash here is not fixed. We simply fixed the build bug that was causing missing .chk files which exposed it.
Oh, that's true. So it's definitely not fixed. Thanks Nelson.
Status: VERIFIED → REOPENED
Resolution: FIXED → ---
Whiteboard: [fixed by 522220]
The crash seems to have returned with recent Namoroka builds:
http://crash-stats.mozilla.com/report/index/8f93f102-0eca-4126-a6e8-7a6432091112?p=1
(In reply to comment #26)
Bug 522220 only landed in Namoroka builds for the 20091113 nightly.
blocking1.9.1: .6+ → ---
In a Minefield debug build, I'm seeing the same as comment 20; if I enable FIPS and remove the .chk files, I get the alert dialog described in comment 16, and SSL is disabled.
Given that we're shipping the .chk files now (and hence the crash should be pretty difficult to reproduce), is this still a release blocker?
Yes, quite right, bz; no longer a blocker.
Flags: blocking1.9.2+ → blocking1.9.2-
(In reply to comment #28)
> In a Minefield debug build, I'm seeing the same as comment 20; if I enable FIPS
> and remove the .chk files, I get the alert dialog described in comment 16, and
> SSL is disabled.

For me it's reproducible all the time. Here are some updated steps:

1. Create a profile with Shiretoko and enable FIPS mode there.
2. Start a Minefield build => no crash
3. Remove the libnssdbm3.chk file from the application folder
4. Start the Minefield build again => crash
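
As a rough aid for checking the precondition in step 3, the following sketch lists which checksum files are absent from an application folder listing. It is plain C++ and not part of NSS or the build system; chkNameFor and missingChkFiles are hypothetical helpers, and the rule that the checksum file shares the library's base name with a .chk extension is an assumption taken from the libnssdbm3.chk name above.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Derive the checksum file name by swapping the library extension for ".chk".
// (Assumed naming rule, based on libnssdbm3.chk in the steps above.)
std::string chkNameFor(const std::string& library) {
    const std::string::size_type dot = library.rfind('.');
    return (dot == std::string::npos ? library : library.substr(0, dot)) + ".chk";
}

// Return the .chk files that are expected but absent from the directory listing.
std::vector<std::string> missingChkFiles(const std::vector<std::string>& libraries,
                                         const std::set<std::string>& present) {
    std::vector<std::string> missing;
    for (const std::string& lib : libraries) {
        const std::string chk = chkNameFor(lib);
        if (present.count(chk) == 0) missing.push_back(chk);
    }
    return missing;
}
```

Passing the folder listing in as a set keeps the helper free of filesystem calls, so it can be tried without touching a real install.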
Henrik, thanks, it seems to work; I have it in the debugger. Going to take a look at that.
Status: REOPENED → ASSIGNED
Looks like a regression from bug 456705. We display an alert dialog (letting other events be handled on the main thread) while we are in the middle of instantiating the nsNSSComponent service. That service is responsible for NSS initialization, and it is checked for before any security component or SSL socket is created, to ensure we have NSS.

However, while instantiation is in progress, callers that check for nsNSSComponent are not made to fail. I did it that way because I found some components that check for nsNSSComponent while we are instantiating it. So, while the dialog is displayed, nsSSLSocketProvider can be initialized even though we do not have NSS yet. This causes the crash.

The simplest solution for this bug is to show the dialog asynchronously. That way we exit nsNSSComponent instantiation first, socket provider creation then fails, and there is no crash.

For a correct long-term solution I have to think about this more deeply. Really, other threads should wait until nsNSSComponent initialization has finished.
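
The ordering problem described above can be modelled with a small simulation. This is only a sketch under simplifying assumptions: a single event queue stands in for the threads involved, and MainThread, NssState, createSslSocket, initComponent and run are hypothetical names, not PSM code. It compares a modal alert that spins the event loop during instantiation with an alert that is merely queued.

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// All names here are hypothetical; this models the event ordering only.
enum class NssState { NotStarted, Initializing, Failed };

struct MainThread {
    std::deque<std::function<void()>> queue;  // pending events
    std::vector<std::string> log;
    NssState state = NssState::NotStarted;

    // Run every pending event, as a modal dialog's nested event loop would.
    void drain() {
        while (!queue.empty()) {
            std::function<void()> ev = std::move(queue.front());
            queue.pop_front();
            ev();
        }
    }

    // While the component is still instantiating, the check does not fail,
    // so socket code carries on with state that was never set up.
    void createSslSocket() {
        if (state == NssState::Failed) log.push_back("socket refused");
        else log.push_back("socket built without NSS");
    }

    // Component instantiation in which NSS initialization fails.
    void initComponent(bool asyncAlert) {
        state = NssState::Initializing;
        if (asyncAlert) {
            // Queue the alert and return; nothing else runs before init ends.
            queue.push_back([this] { log.push_back("alert shown"); });
        } else {
            log.push_back("alert shown");
            drain();  // modal dialog spins the event loop before init returns
        }
        state = NssState::Failed;
    }
};

// One socket request is already pending when instantiation starts.
std::vector<std::string> run(bool asyncAlert) {
    MainThread t;
    t.queue.push_back([&t] { t.createSslSocket(); });
    t.initComponent(asyncAlert);
    t.drain();
    return t.log;
}
```

In this model run(false) lets the socket request through while the state is still Initializing, and run(true) lets instantiation finish as Failed first, so the request is refused before the alert is shown.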
Attached patch v1
No crash, the same behavior.
Assignee: kaie → honzab.moz
Attachment #419290 - Flags: review?(kaie)
Status: ASSIGNED → RESOLVED
Closed: 15 years ago → 15 years ago
Resolution: --- → DUPLICATE
(In reply to comment #35)
> 
> *** This bug has been marked as a duplicate of bug 521849 ***

in-litmus-, see dupe bug
Flags: in-litmus? → in-litmus-
Comment on attachment 419290
v1

>diff --git a/security/manager/ssl/src/nsNSSComponent.cpp b/security/manager/ssl/src/nsNSSComponent.cpp
>--- a/security/manager/ssl/src/nsNSSComponent.cpp
>+++ b/security/manager/ssl/src/nsNSSComponent.cpp
>@@ -2245,21 +2245,21 @@ void nsNSSComponent::ShowAlert(AlertIden
>   else {
>     nsCOMPtr<nsIPrompt> prompter;
>     wwatch->GetNewPrompter(0, getter_AddRefs(prompter));
>     if (!prompter) {
>       PR_LOG(gPIPNSSLog, PR_LOG_DEBUG, ("can't get window prompter\n"));
>     }
>     else {
>       nsCOMPtr<nsIPrompt> proxyPrompt;
>       NS_GetProxyForObject(NS_PROXY_TO_MAIN_THREAD,
>                            NS_GET_IID(nsIPrompt),
>-                           prompter, NS_PROXY_SYNC,
>+                           prompter, NS_PROXY_ASYNC,
>                            getter_AddRefs(proxyPrompt));
>       if (!proxyPrompt) {
>         PR_LOG(gPIPNSSLog, PR_LOG_DEBUG, ("can't get proxy for nsIPrompt\n"));
>       }
>       else {
>         proxyPrompt->Alert(nsnull, message.get());
>       }
>     }
>   }
> }
Attachment #419290 - Flags: review?(kaie)
carol: i'm not sure why you touched that attachment
Status: RESOLVED → VERIFIED
Crash Signature: [@pthread_mutex_lock | nsSSLIOLayerAddToSocket]