Closed Bug 379582 Opened 13 years ago Closed 13 years ago

[FIX]SeaMonkey does not load/detect certificates from the profile anymore


(Core :: Security: PSM, defect, P1, major)

Windows 2000





(Reporter: mcsmurf, Assigned: bzbarsky)



(Keywords: regression)


(2 files)

To reproduce:
0. Get some custom (self-signed) certs or just import some certs from IE with a somewhat older build.
1. Observe in the older build that in the SeaMonkey preferences under Privacy&Security->Certificates and there under "Manage Certificates..." those certs appear in one of the categories (for example mail S/MIME certs from appear under "Your Certificates").
2. Observe with a current trunk build from today that all tabs in the "Manage Certificates..." dialog except the "Web Sites" are empty. Connecting to a mail server with SSL (self-signed cert) gives a security warning about an unknown CA, in a somewhat older build this worked since I have imported the certificate there.

I'll figure out the exact regression range, but I suspect that Bug 379012 is related here.
(In reply to comment #0)
> 2. Observe with a current trunk build from today that all tabs in the "Manage
> Certificates..." dialog except the "Web Sites" are empty. 

I meant the "Authorities" tab here.

This regressed between 2007-05-01 09:00:00 and 2007-05-02 09:00:00. No errors/warnings in the JS console when opening that manage certs... dialog.
Bonsai URL:
Strange, with a current Firefox trunk build it seems to work fine (copied cert8.db from SeaMonkey profile to Firefox profile).
There seems to be a more general problem, for example try this:
1. Create a new profile
2. Surf to and try to accept that certificate permanently

As soon as you click on OK, the same dialog appears again and again...this worked in a build from the 1st of Mai, no longer does in the build from the 2nd of Mai.
Re: Comment 3
I could reproduce that problem on two Win2k PCs, but not on a single WinXP PC (tested on WinXP PC and one WinXP Virtual Machine). Win2k-only bug? :-/
Ok, after a bit of testing and checkouts with MOZ_CO_DATE it looks like Bug 221428 caused this bug here...
Blocks: 221428
I think I know at least partly now why this happens:
nsNSSComponent::InitializeNSS is responsible for loading the certs from the profile, initing the path for the key3.db/cert8.db files, etc. For that it uses this code line:
rv = NS_GetSpecialDirectory(NS_APP_USER_PROFILE_50_DIR,
Without the patch that function (InitializeNSS) seems to get called when first using SSL (for example surfing to Now it first gets called when loading a CSS stylesheet here, which is before the profile manager appears/a profile is selected. Sample stack for this:
 	pipnss.dll!nsNSSComponent::InitializeNSS(int showWarningBox=1)  Line 1372	C++
 	pipnss.dll!nsNSSComponent::Init()  Line 1652	C++
 	pipnss.dll!nsNSSComponentConstructor(nsISupports * aOuter=0x00000000, const nsID & aIID={...}, void * * aResult=0x0012eea4)  Line 159 + 0x53 bytes	C++
 	xpcom_core.dll!nsGenericFactory::CreateInstance(nsISupports * aOuter=0x00000000, const nsID & aIID={...}, void * * aResult=0x0012eea4)  Line 80 + 0xe bytes	C++
 	xpcom_core.dll!nsComponentManagerImpl::CreateInstanceByContractID(const char * aContractID=0x02a7e430, nsISupports * aDelegate=0x00000000, const nsID & aIID={...}, void * * aResult=0x0012eea4)  Line 1801	C++
 	xpcom_core.dll!nsComponentManagerImpl::GetServiceByContractID(const char * aContractID=0x01a3996c, const nsID & aIID={...}, void * * result=0x0012eee0)  Line 2235	C++
 	xpcom_core.dll!CallGetService(const char * aContractID=0x01a3996c, const nsID & aIID={...}, void * * aResult=0x0012eee0)  Line 95	C++
 	xpcom_core.dll!nsGetServiceByContractIDWithError::operator()(const nsID & aIID={...}, void * * aInstancePtr=0x0012eee0)  Line 288 + 0xf bytes	C++
 	xpcom_core.dll!nsCOMPtr_base::assign_from_gs_contractid_with_error(const nsGetServiceByContractIDWithError & gs={...}, const nsID & iid={...})  Line 141 + 0xe bytes	C++
 	jar50.dll!nsCOMPtr<nsISignatureVerifier>::nsCOMPtr<nsISignatureVerifier>(const nsGetServiceByContractIDWithError & gs={...})  Line 686	C++
 	jar50.dll!nsJAR::GetCertificatePrincipal(const char * aFilename=0x02a7e0d8, nsIPrincipal * * aPrincipal=0x0012ef74)  Line 378	C++
 	jar50.dll!nsJARChannel::GetOwner(nsISupports * * result=0x0012f0a0)  Line 490	C++
 	caps.dll!nsScriptSecurityManager::GetChannelPrincipal(nsIChannel * aChannel=0x02a7dd38, nsIPrincipal * * aPrincipal=0x0012f128)  Line 367	C++
 	gklayout.dll!CSSLoaderImpl::LoadSheet(SheetLoadData * aLoadData=0x02a7dcc8, StyleSheetState aSheetState=eSheetStateUnknown)  Line 1318	C++
 	gklayout.dll!CSSLoaderImpl::InternalLoadNonDocumentSheet(nsIURI * aURL=0x02a7b8b0, int aAllowUnsafeRules=0, nsICSSStyleSheet * * aSheet=0x02a7b7c8, nsICSSLoaderObserver * aObserver=0x00000000)  Line 2039	C++
 	docshell.dll!nsDocShell::SetupNewViewer(nsIContentViewer * aNewViewer=0x01152c10)  Line 6075 + 0x33 bytes	C++

aUrl->mSpec in the CSSLoaderImpl::InternalLoadNonDocumentSheet frame is chrome://global/skin/xulscrollbars.css.
Odd...  The PSM code listens for profile change notifications and at least _seems_ to be trying to deal with them.  Do you hit nsNSSComponent::Observe after this point?
Assignee: dveditz → bzbarsky
Priority: -- → P1
Target Milestone: --- → seamonkey1.0alpha
Well, I can reproduce this by accepting permanently the cert from comment 3, then shutting down the browser, then starting up "seamonkey -ProfileManager", and selecting the profile.  The cert is missing from the cert viewer.

The problem is indeed that we init NSS to show the profile manager window, then do NOT get a profile-before-change when a profile is selected (see documentation in nsIProfileChangeStatus.idl), and then never start NSS with the right profile dir.

I tried shutting down NSS in profile-after-change, but that asserts about mNetworkIsDown being false and fails in SECMOD_Shutdown because secmod_PrivateModuleCount=1.

I can't test Firefox because it crashes on startup, but bsmedberg is guessing it should have similar issues.
Assignee: bzbarsky → kengert
Component: Security → Security: PSM
Product: Mozilla Application Suite → Core
QA Contact: seamonkey → psm
Target Milestone: seamonkey1.0alpha → ---
Attached patch What I triedSplinter Review
This still doesn't work.  I don't know who else is holding on to NSS stuff at this point, nor what we can do to deal with it (short of sending all the profile notifications even in the startup case or something?)

kaie, any idea how to debug who's still holding on to a module?
So I added some printfs, and the relevant code up to the point where we try to shut down NSS does the following (refcounts listed at the _entry_ to SECMOD_DestroyModule, so a "1" means it's going to go away:

Create module: "PSM Internal Crypto Services"
Create module: "NSS Internal PKCS #11 Module"
Destroy module: "NSS Internal PKCS #11 Module", refcount: 3
Destroy module: "PSM Internal Crypto Services", refcount: 4
Create module: "Builtin Roots Module"
Destroy module: "Builtin Roots Module", refcount: 3
Destroy module: "Builtin Roots Module", refcount: 2
Destroy module: "NSS Internal PKCS #11 Module", refcount: 2
Destroy module: "PSM Internal Crypto Services", refcount: 3
Destroy module: "NSS Internal PKCS #11 Module", refcount: 1
Destroy module: "PSM Internal Crypto Services", refcount: 2
Destroy module: "PSM Internal Crypto Services", refcount: 1

So someone's still holding on to the "Builtin Roots Module", which is screwing us over...
So the three references to "Builtin Roots Module" come from (looks like):

1) SECMOD_CreateModule, called by nsNSSComponent::InstallLoadableRoots via
   SECMOD_LoadUserModule and SECMOD_LoadModule
2) SECMOD_ReferenceModule, called by nsNSSComponent::InstallLoadableRoots via
   SECMOD_LoadUserModule, SECMOD_LoadModule, SECMOD_AddModuleToList
3) SECMOD_ReferenceModule, called by nsNSSComponent::UnloadLoadableRoots via

The two SECMOD_DestroyModule calls come from:

1) nsNSSComponent::UnloadLoadableRoots via SECMOD_UnloadUserModule,
   SECMOD_DeleteModuleEx, SECMOD_DestroyModuleListElement, 
2) nsNSSComponent::UnloadLoadableRoots directly (undoing the FindModule
   reference, I guess).

So the basic problem is that LoadUserModule() adds two references and only removes one.  I guess given that it returns a module, the caller is responsible for releasing it, right?
er, UnloadUserModule only removes one reference, is what I meant.
Attached patch Proposed patchSplinter Review
Frank, could you check whether this fixes the problems you see?  Nelson, is this the right thing to be doing with this NSS api?

kaie, can you just review the whole thing?  ;)
Attachment #264154 - Flags: review?(nelson)
Attachment #264154 - Flags: review?(bugzilla)
Attachment #264154 - Flags: review?(kengert)
Priority: P1 → --
Comment on attachment 264154 [details] [diff] [review]
Proposed patch

The patch seems to work fine :).
Attachment #264154 - Flags: review?(bugzilla) → review+
Comment on attachment 264154 [details] [diff] [review]
Proposed patch

This patch seems to contain two parts:
1) add a SECMOD_DestroyModule in nsNSSComponent::InstallLoadableRoots
2) big overhaul to the way profile switching is done.
I'm not qualified to review the latter.

To be honest, this proposed change to nsNSSComponent::InstallLoadableRoots
doesn't seem right to me.  IMO, we don't want to load a root cert module
and then immediately destroy the module just loaded.  Rather, we want to
destroy any previously loaded root certs module before loading a new one,
I think.  If this code ends up with two references to one module, then 
it loaded (or referenced) that module twice, and didn't destroy the old
reference before loading a new one.  

Bob Relyea really should review this patch, at least the first part.
He's the designer of all the PKCS11 module loading stuff.
I'm passing the review request on to him.
Attachment #264154 - Flags: review?(nelson) → review?(rrelyea)
> I'm not qualified to review the latter

That's fine.  I want Kai to review that one.

As for the rest... the question is what the ownership model of SECMOD_LoadUserModule is.  Right now, that function adds two references to the module (one directly, one by putting it into a list it keeps), and returns it.  SECMOD_UnloadUserModule removes the module from the list but does NOT remove the extra reference.  It would make sense if the consumer were responsible for that other reference, but perhaps it needs to do it when it calls SECMOD_UnloadUserModule.  I just don't know -- the relevant API is completely undocumented as far as I can see.  :(

Thanks for passing the review on in the right direction.
Comment on attachment 264154 [details] [diff] [review]
Proposed patch

r+ for the nsNSSComponent.cpp change.

SECMOD_LoadUserModule returns a reference you need to free when you are 'done'. This code accomplishies this.

Attachment #264154 - Flags: review?(rrelyea) → review+
bob, just to make sure I understand correctly...  secmod guarantees that the module will stick around until UnloadUserModule is called, right?
Excellent.  Taking bug, I guess.  ;)
Assignee: kengert → bzbarsky
Priority: -- → P1
Summary: SeaMonkey does not load/detect certificates from the profile anymore → [FIX]SeaMonkey does not load/detect certificates from the profile anymore
Target Milestone: --- → mozilla1.9alpha6
So after a lot of reading and scratching head I finally understand what's going on and why you are making these changes. :-) Let me summarize.

In the past bringing up profile manager did not bring up NSS.
But you say, now it does.
(Out of curiousity, do you know why?)

NSS probably gets initialized in read-only-no-db-files mode.

When the real profile gets selected, the attempt to init NSS with the real profile directory is ignored, because PSM ignores the attempt to init NSS twice.

For the entire application session, NSS still operates in its no-db-files mode, and therefore any cert imports are in-memory-only operations.

So in order to get it working correctly, we must shut down NSS first, so that a later attempt to init with the profile dir will succeed.

Most of the patch is moving code to separate functions, so you can call the code from multiple locations.

You researched that on initial startup we will get a profile-after-change notification with indicator string "startup", while on later calls it will be something else.

So at initial startup time, we need to bring down NSS, if it's already up.

You said earlier in this bug, that you made an attempt to bring it down, but it failed because of the reference count leak. Thanks a lot for tracking down that leak. The code where you add SECMOD_DestroyModule(RootsModule) got recently added with the patch for bug 176501, so that was a recent regression on trunk only.

In order to bring everything down, you call the functions in the same order as corresponding profile notifications are sent by profile manager.

While the veto-functionality inside the function doesn't make sense in this particular scenario, it doesn't hurt.

It makes sense to go through all the sanity checks before shutting down NSS. Should the profile manager UI do real SSL networking without cleaning up correctly, we'll get error message dialogs, which is good.

So I'm now convinced this code makes sense.

However, I'd like to propose a more obvious comment.
You wrote:

+      // We're starting up the application, so there was no
+      // PROFILE_BEFORE_CHANGE_TOPIC notification (we had no profile until
+      // now).  Handle shutting down NSS here, so we can restart it with the
+      // right profile.  We need to mimic the net-teardown and net-restore
+      // notifications we would normally get around profile-before-change.

What about something like this:

// The application initializes against a known profile directory for
// the first time during process execution.
// However, earlier code execution might have already triggered NSS init.
// We must ensure that NSS gets shut down prior to any attempt to init
// it again. We use the same cleanup functionality used when switching
// profiles. The order of function calls must correspond to the order
// of notifications sent by Profile Manager (nsProfile).
Comment on attachment 264154 [details] [diff] [review]
Proposed patch

r=kengert but please clarify the comment, possibly by taking my proposal
Attachment #264154 - Flags: review?(kengert) → review+
> (Out of curiousity, do you know why?)

Yes.  The profile manager needs stylesheets from a jar, we now track principals on stylesheets, which means we run the signature verifier on the jar in question.  Which needs NSS.

> While the veto-functionality inside the function doesn't make sense in this
> particular scenario, it doesn't hurt.

Actually, it does make _some_ sense.  If the veto happens (which is what I was seeing), the profile manager will not start the new profile and will instead put up an alert saying it couldn't do it.  That alert is what I was trying to get rid of with the secmod change.  ;)

> What about something like this:

Sure.  That makes sense to me.  I'll do that.
Checked in with the comment change.
Closed: 13 years ago
Resolution: --- → FIXED
Does this fix Bug 366440, by any chance ?
It should, yes.  At least if I understand that bug correctly.
Duplicate of this bug: 366440
Blocks: 380967
You need to log in before you can comment on or make changes to this bug.