Crash on startup with FIPS mode enabled [@pthread_mutex_lock | nsSSLIOLayerAddToSocket] when NSS .chk files are missing

VERIFIED DUPLICATE of bug 521849

Status

()

Core
Security: PSM
P2
critical
8 years ago
6 years ago

People

(Reporter: whimboo, Assigned: mayhemer)

Tracking

({crash, regression, relnote})

1.9.1 Branch
mozilla1.9.3a1
x86
Mac OS X
crash, regression, relnote
Bug Flags:
blocking1.9.2 -
in-litmus -

Firefox Tracking Flags

(status1.9.1 wanted)

Details

(crash signature, URL)

Attachments

(1 attachment)

(Reporter)

Description

8 years ago
Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.5; en-US; rv:1.9.2b1pre) Gecko/20091011 Namoroka/3.6b1pre ID:20091011033822

Starting Namoroka with FIPS mode enabled by a 1.9.1 branch version (e.g. Firefox 3.5.1) always crashes the browser.

Steps:
1. Start Firefox 3.5.1 and set a Master Password
2. Go to Preferences | Advanced | Encryption | Security Devices and enable FIPS mode
3. Start Namoroka

In step 3 Namoroka always crashes on my machine within 4s.

Crash report: bp-2189e99b-4e76-4665-9760-ac5702091013

First 10 frames:
0  	libSystem.B.dylib  	pthread_mutex_lock  	
1 	libnspr4.dylib 	PR_Lock 	nsprpub/pr/src/pthreads/ptsynch.c:206
2 	XUL 	nsSSLIOLayerAddToSocket 	
3 	XUL 	nsSSLIOLayerNewSocket 	
4 	XUL 	nsSSLSocketProvider::NewSocket 	
5 	XUL 	nsSocketTransport::BuildSocket 	
6 	XUL 	nsSocketTransport::InitiateSocket 	
7 	XUL 	nsSocketTransport::OnSocketEvent 	
8 	XUL 	nsSocketEvent::Run 	
9 	XUL 	nsThread::ProcessNextEvent 	
10 	XUL 	NS_ProcessNextEvent_P
Flags: blocking1.9.2?
(Reporter)

Comment 1

8 years ago
Btw, it only happens with optimized builds. Sadly I'm not able to crash my debug build.
(Reporter)

Comment 2

8 years ago
Oh and I have seen this while checking bug 521878.
(Reporter)

Comment 3

8 years ago
Regression between 090901 and 091001. I will bisect to find the regression range.
Keywords: regression, regressionwindow-wanted
Flags: in-litmus?
(Reporter)

Comment 4

8 years ago
Regressed between the builds 09092103 and 09092203.

Pass: http://hg.mozilla.org/releases/mozilla-1.9.2/rev/25e1253030f4
Fail: http://hg.mozilla.org/releases/mozilla-1.9.2/rev/1616267e8153

Changesets: http://hg.mozilla.org/releases/mozilla-1.9.2/pushloghtml?fromchange=25e1253030f4&tochange=1616267e8153

I can see only one fix in this time range which could have caused this crash. It's http://hg.mozilla.org/releases/mozilla-1.9.2/rev/fb2192ebeff0.

Looks like another fallout from bug 516396.
Blocks: 516396
Keywords: regressionwindow-wanted
Flags: blocking1.9.2? → blocking1.9.2+
Priority: -- → P2
Keywords: relnote
(Reporter)

Comment 5

8 years ago
Looks like comment 4 has the wrong range. While the crash always happens with builds starting from 090922, it is trickier to get earlier builds to crash. I tried builds from the days before, and those crashed too. ATM I can't get 09091603 to crash, while 09091703 crashes:

Pass: http://hg.mozilla.org/releases/mozilla-1.9.2/rev/0ee58a54c5d6
Fail: http://hg.mozilla.org/releases/mozilla-1.9.2/rev/34da890d632d

http://hg.mozilla.org/releases/mozilla-1.9.2/pushloghtml?fromchange=0ee58a54c5d6&tochange=34da890d632d

Probably related bugs: bug 509558, bug 509319.
(Reporter)

Comment 6

8 years ago
Given bug 521878, this looks more like a regression from bug 509319.
Blocks: 509319
No longer blocks: 516396
(Reporter)

Comment 7

8 years ago
I have updated my Shiretoko builds to the recent version and those are crashing too now. Running with Firefox 3.5.4 doesn't crash. So it looks like we are safe for 3.5.4. But I would like to test with the patch on bug 521878.
blocking1.9.1: --- → ?
http://hg.mozilla.org/releases/mozilla-1.9.1/rev/b7dd9891657f (9/16)
http://hg.mozilla.org/releases/mozilla-1.9.1/rev/f3f8aeecc2bd (9/17)
(Reporter)

Comment 9

8 years ago
The stack is a bit different: bp-d91d86be-2aca-4870-b6a9-6f69d2091013

0  	libSystem.B.dylib  	pthread_mutex_lock  	
1 	libnspr4.dylib 	PR_Lock 	nsprpub/pr/src/pthreads/ptsynch.c:206
2 	XUL 	nsSSLIOLayerHelpers::isKnownAsIntolerantSite 	nsAutoLock.h:219
3 	XUL 	nsSSLIOLayerAddToSocket 	security/manager/ssl/src/nsNSSIOLayer.cpp:3421
4 	XUL 	nsSSLIOLayerNewSocket 	security/manager/ssl/src/nsNSSIOLayer.cpp:2174
5 	XUL 	nsSSLSocketProvider::NewSocket 	security/manager/ssl/src/nsSSLSocketProvider.cpp:72
6 	XUL 	nsSocketTransport::BuildSocket 	netwerk/base/src/nsSocketTransport2.cpp:1016
7 	XUL 	nsSocketTransport::InitiateSocket 	netwerk/base/src/nsSocketTransport2.cpp:1118
8 	XUL 	nsSocketTransport::OnSocketEvent 	netwerk/base/src/nsSocketTransport2.cpp:1447
9 	XUL 	nsSocketEvent::Run 	netwerk/base/src/nsSocketTransport2.cpp:98
10 	XUL 	nsThread::ProcessNextEvent 	xpcom/threads/nsThread.cpp:521
(Reporter)

Comment 10

8 years ago
As discussed with Nick on IRC, this only crashes nightly builds of Firefox.
Turns out there aren't any .chk files in the nightly builds, but there are in the "3.5.4 build 1" release build. Comparing mozconfigs [1] implicates the --disable-install-strip in the nightly config, because of 
  http://mxr.mozilla.org/mozilla1.9.2/source/toolkit/mozapps/installer/packager.mk#386
On mac we compile ppc and i386, remove the chk files for both, create the universal build, and then should recreate the chk files for the fat binary. Unless you set --disable-install-strip that is.

The mozconfig change landed between 2009-09-16 and 2009-09-17 nightlies, so that matches the regression window in comment #5.

There are two bugs here I think:
1) --disable-install-strip is overloaded, and probably shouldn't control creating NSS checksums. A Core:Build Config bug.
2) NSS should handle missing .chk files more gracefully

------
[1] http://hg.mozilla.org/build/buildbot-configs/file/b135980dd22b/mozilla2/macosx/mozilla-central/nightly/mozconfig (yes really, there's a symlink)
http://hg.mozilla.org/build/buildbot-configs/file/b135980dd22b/mozilla2/macosx/mozilla-1.9.2/release/mozconfig
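
The packaging problem described in this comment (checksum files removed and never recreated) is the kind of thing a simple post-packaging check could catch. A minimal sketch, assuming a flat application directory: `MissingChkFiles` and its parameters are hypothetical names for illustration and are not part of NSS or of packager.mk. Only libnssdbm3.chk is named in this bug, so the list of library stems is left to the caller.

```cpp
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper: returns the names of the expected .chk files that
// cannot be opened in appDir. It only tests for presence; it does not
// validate the checksum contents the way NSS itself would.
std::vector<std::string> MissingChkFiles(const std::string& appDir,
                                         const std::vector<std::string>& stems) {
  std::vector<std::string> missing;
  for (const std::string& stem : stems) {
    const std::string name = stem + ".chk";
    std::ifstream probe(appDir + "/" + name);
    if (!probe.good()) {
      missing.push_back(name);  // file absent or unreadable
    }
  }
  return missing;
}
```

A packaging step could call this with the application directory and fail the build if the returned list is non-empty.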

Updated

8 years ago
Blocks: 515645
I'm not yet convinced there's any NSS bug here.
I have yet to see a stack with any NSS functions on it.
The stacks in comment 0 and comment 9 contain no NSS functions whatsoever.
What they DO contain is lots of PSM code, which is browser code. 
If PSM is calling NSPR to lock a PRLock with a NULL PRLock pointer, that's
not necessarily an NSS fault nor an NSPR fault.  

Maybe some function is being consistently omitted from these stack traces.
If and when it comes to light, that will be one thing.  

I believe that the absent .chk files cause NSS functions to fail gracefully.
I suspect that some PSM code doesn't notice that NSS has reported failure,
and plows on ahead into the abyss.
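
To illustrate the suspicion above, here is a minimal sketch of a caller that notices a failed initialization instead of locking through a null pointer. It uses `std::mutex` as a stand-in for a PRLock, and `SslHelpers`, `Init`, `AddSslLayer`, and `SocketResult` are hypothetical names, not the actual PSM code.

```cpp
#include <memory>
#include <mutex>

// Hypothetical stand-in for PSM-side helper state that owns a lock.
struct SslHelpers {
  std::unique_ptr<std::mutex> lock;  // stays null if initialization failed

  // Simulates setup that depends on the crypto library initializing.
  bool Init(bool cryptoInitOk) {
    if (!cryptoInitOk) {
      return false;  // graceful failure reported to the caller
    }
    lock.reset(new std::mutex());
    return true;
  }
};

enum class SocketResult { Ok, NotInitialized };

// Checks for the lock before using it, so a failed Init() turns into an
// error result rather than a lock call on a null pointer.
SocketResult AddSslLayer(SslHelpers& helpers) {
  if (!helpers.lock) {
    return SocketResult::NotInitialized;
  }
  std::lock_guard<std::mutex> guard(*helpers.lock);
  return SocketResult::Ok;
}
```

With this shape, a failed initialization surfaces as `SocketResult::NotInitialized` at socket creation time.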
Assignee: nobody → kaie
Component: Libraries → Security: PSM
Product: NSS → Core
QA Contact: libraries → psm
Version: unspecified → 1.9.1 Branch
(In reply to comment #11)
> 1) --disable-install-strip is overloaded, and probably shouldn't control
> creating nss checksums. A Core:Build Config bug.

I have filed bug 522220 on this issue.
Summary: Crash on startup with FIPS mode enabled [@pthread_mutex_lock | nsSSLIOLayerAddToSocket] → Crash on startup with FIPS mode enabled [@pthread_mutex_lock | nsSSLIOLayerAddToSocket] when NSS .chk files are missing
blocking1.9.1: ? → .5+
status1.9.1: --- → wanted

Comment 14

8 years ago
Installing Namoroka/3.6b1pre I was able to reproduce this error. 

Installing  Firefox 3.6beta1 I no longer see the error. 

http://ftp.mozilla.org/pub/mozilla.org/firefox/nightly/3.6b1-candidates/build1/

I also have older Firefox 3.5.3 installed on my system.

If I manually corrupt Firefox 3.5.3 or the newer Firefox 3.6beta1 by removing the 
.chk files from the installed locations, each browser version has the same behavior: it shows 
an alert window on startup stating the security component was not initialized, and that the user needs to 
correct the issue... (not the exact wording). 

I believe the fix was done by bug 509319, so marking as duplicate.
Status: NEW → RESOLVED
Last Resolved: 8 years ago
Resolution: --- → DUPLICATE
Duplicate of bug: 509319

Comment 15

8 years ago
Glen, this bug is about PSM crashing when NSS can not be initialized (in FIPS mode).
Ideally, PSM should not crash if NSS initialization fails.  So this bug is not a duplicate
of bug 509319.
Status: RESOLVED → REOPENED
Resolution: DUPLICATE → ---

Comment 16

8 years ago
(In reply to comment #15)
> Glen, this bug is about PSM crashing when NSS can not be initialized (in FIPS
> mode).
> Ideally, PSM should not crash if NSS initialization fails.  So this bug is not
> a duplicate
> of bug 509319.

I agree that PSM requires a fix when it is not able to initialize NSS in FIPS mode,
but I believe bug 503418 is the more appropriate test case for that issue.
In this bug's description, FIPS mode is enabled by a previous working version of Firefox, then you
install a newer version of Firefox such as Namoroka, and the new Firefox version
is unable to launch. 

When I duped this bug I had only tested with firefox 3.6beta1 which installs the .chk files correctly.

I now have tested 

Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.6; en-US; rv:1.9.2b2pre)
Gecko/20091020 Namoroka/3.6b2pre

Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.6; en-US; rv:1.9.1.5pre)
Gecko/20091020 Shiretoko/3.5.5pre

Both of these recent nightly builds are missing the required .chk files and therefore
will not be able to initialize FIPS mode.


But on startup both come up with an Alert window that states:

"Could not initialize the application's security component. The most likely cause is problems with files in your application's profile directory. Please check that this directory has no read/write restrictions and your hard disk is not full or close to full. It is recommended that you exit the application and fix the problem. If you continue to use this session, you might see incorrect application behaviour when accessing security features."


If you then try to go to a site that requires SSL you get an Alert stating 

Secure Connection Failed
        
An error occurred during a connection to www.wellsfargo.com.

Can't connect securely because the SSL protocol has been disabled.

(Error code: ssl_error_ssl_disabled)

I would say that this bug is Fixed for the test case in the description of this bug. The security component is 
not initialized, and the initialization alert should help the user in fixing the problem. Granted, if the .chk files were not installed the user will have to get a new version of Firefox, but at least the Alert and the fact that the security component was not initialized are correct behavior.

There are two open issues that need to be addressed for bug 503418:

1) Nightly builds require .chk files. I believe bug 522220 should 
address this issue. 
2) PSM should not crash if it is unable to put NSS in FIPS mode. 
I plan to address this issue in bug 511320.

Comment 17

8 years ago
Glen, since you have examined these bugs in depth, please
mark this bug with the right resolution.  Thanks!
(Reporter)

Comment 18

8 years ago
(In reply to comment #16)
> I would say that this bug is Fixed for the test case description of this bug.
> The security component is 
> not initialize, the initialize alert should help the user in fixing the
> problem. Granted if the .chk files were not installed the user will have to get
> a new version of Firefox but at least the Alert and the fact that the security
> component was not initialized is correct behavior.

This bug is not fixed. The most recent nightly build of Namoroka still crashes for me with the given steps. See bp-8cd93a7f-269f-4e2b-b939-366692091023
Status: REOPENED → NEW

Comment 19

8 years ago
Now that the nightly trunk builds should have .chk files built due to bug 522220, closing this bug.
Status: NEW → RESOLVED
Last Resolved: 8 years ago
Depends on: 522220
Resolution: --- → FIXED

Comment 20

8 years ago
(In reply to comment #18)
> (In reply to comment #16)
> > I would say that this bug is Fixed for the test case description of this bug.
> > The security component is 
> > not initialize, the initialize alert should help the user in fixing the
> > problem. Granted if the .chk files were not installed the user will have to get
> > a new version of Firefox but at least the Alert and the fact that the security
> > component was not initialized is correct behavior.
> 
> This bug is not fixed. The most recent nightly build of Namoroka still crashes
> for me with the given steps. See bp-8cd93a7f-269f-4e2b-b939-366692091023

Sorry, I missed comment 18. Did you not get any alert windows? 

I am unable to reproduce, since my Namoroka builds with the missing .chk files
pop up the alert windows and the security component is not initialized.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
(Reporter)

Comment 21

8 years ago
The alert window pops up for a split second before we crash with Minefield and Namoroka.
Status: REOPENED → NEW
Whiteboard: [fixed by 522220]
(Reporter)

Updated

8 years ago
Status: NEW → RESOLVED
Last Resolved: 8 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla1.9.3a1
(Reporter)

Comment 22

8 years ago
Now that bug 522220 is fixed I cannot reproduce this crash anymore. Marking verified fixed with Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.5; en-US; rv:1.9.3a1pre) Gecko/20091104 Minefield/3.7a1pre ID:20091104031046.
Status: RESOLVED → VERIFIED
Is this fixed at all?

This bug says "PSM crashes whenever NSS fails to initialize in FIPS mode."
There are many possible causes for NSS to fail to initialize in FIPS mode.
Recently one such cause was found (a build problem, bug 522220) and corrected.
But, as far as I can tell, NOTHING was done about the problem that, when NSS 
fails to initialize in FIPS mode, which is not a bug in itself, PSM crashes.  

So, are you sure this bug is verified/fixed?  
Or do you only care about the particular test case that you were experiencing?
Yeah, I agree that the underlying cause of the crash here is not fixed. We simply fixed the build bug that was causing missing .chk files which exposed it.
(Reporter)

Comment 25

8 years ago
Oh, that's true. So it's definitely not fixed. Thanks Nelson.
Status: VERIFIED → REOPENED
Resolution: FIXED → ---
Whiteboard: [fixed by 522220]
(Reporter)

Comment 26

8 years ago
The crash seems to have returned with recent Namoroka builds:
http://crash-stats.mozilla.com/report/index/8f93f102-0eca-4126-a6e8-7a6432091112?p=1
(In reply to comment #26)
Bug 522220 only landed in Namoroka builds for the 20091113 nightly.
blocking1.9.1: .6+ → ---
In a Minefield debug build, I'm seeing the same as comment 20; if I enable FIPS and remove the .chk files, I get the alert dialog described in comment 16, and SSL is disabled.
Given that we're shipping the .chk files now (and hence the crash should be pretty difficult to reproduce), is this still a release blocker?
Yes, quite right, bz; no longer a blocker.
Flags: blocking1.9.2+ → blocking1.9.2-
(Reporter)

Comment 31

8 years ago
(In reply to comment #28)
> In a Minefield debug build, I'm seeing the same as comment 20; if I enable FIPS
> and remove the .chk files, I get the alert dialog described in comment 16, and
> SSL is disabled.

For me it's reproducible all the time. Here are some updated steps:

1. Create a profile with Shiretoko and enable FIPS mode there.
2. Start a Minefield build => no crash
3. Remove the libnssdbm3.chk file from the application folder
4. Start the Minefield build again => crash
(Assignee)

Comment 32

8 years ago
Henrik, thanks, it seems to work; I have it in the debugger. Going to take a look at that.
Status: REOPENED → ASSIGNED
(Assignee)

Comment 33

8 years ago
Looks like a regression from bug 456705. We display an alert dialog (letting other events be handled on the main thread) while we are in the middle of instantiating the nsNSSComponent service (which is responsible for NSS initialization and is checked for before any security component or SSL socket is created, to ensure we have NSS). 

However, during the instantiation process we do not let others checking for nsNSSComponent fail. It was done that way because I found some components that check for nsNSSComponent while we instantiate it. So, while we keep the dialog displayed, we let nsSSLSocketProvider be initialized even though we do not have NSS yet. This causes the crash.

The simplest solution for this bug is to invoke the dialog asynchronously. That way we exit from the nsNSSComponent instantiation and socket provider creation fails, with no crash.

For a correct long-term solution I have to think about it more deeply. Actually, other threads should wait until nsNSSComponent init is complete.
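
The ordering problem described above can be shown with a toy model. This is a simplified single-threaded sketch under stated assumptions, not the real PSM code: `Sim`, `Post`, `Drain`, `CreateSslSocket`, `InitComponent`, and `Run` are all hypothetical names, and the nested `Drain()` call stands in for a modal dialog letting other events run while component instantiation is still on the stack.

```cpp
#include <deque>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Toy event loop modelling the reentrancy; every name here is hypothetical.
struct Sim {
  std::deque<std::function<void()>> queue;
  std::vector<std::string> log;
  bool initInProgress = false;
  bool cryptoReady = false;  // initialization fails in this scenario

  void Post(std::function<void()> ev) { queue.push_back(std::move(ev)); }

  void Drain() {
    while (!queue.empty()) {
      std::function<void()> ev = std::move(queue.front());
      queue.pop_front();
      ev();
    }
  }

  void CreateSslSocket() {
    if (initInProgress) {
      // Checks are not failed while the component is being instantiated,
      // so the request proceeds even though nothing is initialized.
      log.push_back("socket used uninitialized state");
    } else if (!cryptoReady) {
      log.push_back("socket creation failed cleanly");
    } else {
      log.push_back("socket ok");
    }
  }

  void InitComponent(bool asyncAlert) {
    initInProgress = true;
    if (asyncAlert) {
      // Asynchronous: queue the alert and return from init first.
      Post([this] { log.push_back("alert shown"); });
    } else {
      // Synchronous: the dialog runs a nested event loop during init.
      log.push_back("alert shown");
      Drain();
    }
    initInProgress = false;
  }
};

std::vector<std::string> Run(bool asyncAlert) {
  Sim sim;
  sim.Post([&sim] { sim.CreateSslSocket(); });  // socket request already pending
  sim.InitComponent(asyncAlert);
  sim.Drain();
  return sim.log;
}
```

In this model, `Run(false)` lets the pending socket request run inside the nested loop while initialization is still in progress, whereas `Run(true)` finishes initialization first so the socket request fails cleanly before the alert is shown.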
(Assignee)

Comment 34

8 years ago
Created attachment 419290 [details] [diff] [review]
v1

No crash, the same behavior.
Assignee: kaie → honzab.moz
Attachment #419290 - Flags: review?(kaie)
(Assignee)

Updated

8 years ago
Status: ASSIGNED → RESOLVED
Last Resolved: 8 years ago
Resolution: --- → DUPLICATE
Duplicate of bug: 521849
(In reply to comment #35)
> 
> *** This bug has been marked as a duplicate of bug 521849 ***

in-litmus-, see dupe bug
Flags: in-litmus? → in-litmus-

Comment 37

7 years ago
Comment on attachment 419290 [details] [diff] [review]
v1

>diff --git a/security/manager/ssl/src/nsNSSComponent.cpp b/security/manager/ssl/src/nsNSSComponent.cpp
>--- a/security/manager/ssl/src/nsNSSComponent.cpp
>+++ b/security/manager/ssl/src/nsNSSComponent.cpp
>@@ -2245,21 +2245,21 @@ void nsNSSComponent::ShowAlert(AlertIden
>   else {
>     nsCOMPtr<nsIPrompt> prompter;
>     wwatch->GetNewPrompter(0, getter_AddRefs(prompter));
>     if (!prompter) {
>       PR_LOG(gPIPNSSLog, PR_LOG_DEBUG, ("can't get window prompter\n"));
>     }
>     else {
>       nsCOMPtr<nsIPrompt> proxyPrompt;
>       NS_GetProxyForObject(NS_PROXY_TO_MAIN_THREAD,
>                            NS_GET_IID(nsIPrompt),
>-                           prompter, NS_PROXY_SYNC,
>+                           prompter, NS_PROXY_ASYNC,
>                            getter_AddRefs(proxyPrompt));
>       if (!proxyPrompt) {
>         PR_LOG(gPIPNSSLog, PR_LOG_DEBUG, ("can't get proxy for nsIPrompt\n"));
>       }
>       else {
>         proxyPrompt->Alert(nsnull, message.get());
>       }
>     }
>   }
> }

Updated

7 years ago
Attachment #419290 - Flags: review?(kaie)

Comment 38

7 years ago
carol: i'm not sure why you touched that attachment
Status: RESOLVED → VERIFIED
Crash Signature: [@pthread_mutex_lock | nsSSLIOLayerAddToSocket]