PSM should remember valid intermediate CA certificates

RESOLVED FIXED

Status

defect
12 years ago
8 years ago

People

(Reporter: nelson, Assigned: kaie)

Tracking

(Blocks 1 bug, {memory-leak})

Trunk
Bug Flags:
blocking1.9 ?

Firefox Tracking Flags

(Not tracked)

Attachments

(1 attachment)

For years, we've received reports that go like this:
1. visit https server 'A' that sends incomplete cert chain.  SSL fails.
2. visit https server 'B' that sends complete cert chain, including the 
intermediate CA certs that were missing from server 'A'.  SSL works.
3. try again to visit https server 'A'.  Now it works.
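The three steps above can be reproduced with a toy chain builder. This is only a sketch under assumptions: the `Cert` type, the site names and `chain_validates` are made up for illustration and are not NSS or PSM code. The point is that any set of intermediates that outlives a single connection produces exactly this symptom.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cert:
    subject: str
    issuer: str


TRUSTED_ROOTS = {"Root CA"}  # hypothetical trust anchor


def chain_validates(sent, remembered):
    """Walk issuer links from the leaf (sent[0]) up to a trusted root.

    Issuers are looked up among the certs the server sent plus any
    certs remembered from earlier connections.
    """
    available = {c.subject: c for c in list(sent[1:]) + list(remembered)}
    current = sent[0]
    for _ in range(len(available) + 1):
        if current.issuer in TRUSTED_ROOTS:
            return True
        issuer_cert = available.get(current.issuer)
        if issuer_cert is None:
            return False  # incomplete chain: issuer not available
        current = issuer_cert
    return False  # issuer loop, give up


intermediate = Cert("Intermediate CA", "Root CA")
site_a = [Cert("site-a.example", "Intermediate CA")]                # incomplete chain
site_b = [Cert("site-b.example", "Intermediate CA"), intermediate]  # complete chain

remembered = set()
first_visit_a = chain_validates(site_a, remembered)
visit_b = chain_validates(site_b, remembered)
if visit_b:
    remembered.update(site_b[1:])  # intermediates outlive the connection
second_visit_a = chain_validates(site_a, remembered)
```

Here `first_visit_a` fails, `visit_b` succeeds, and `second_visit_a` succeeds only because the intermediate from site B is still around.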

A recent report like that is Bug 399019.  It reports that if you visit
    https://www.biglumber.com/x/web?mp=1  
you get a SSL cert failure, but if you then visit 
    https://www.godaddy.com/
and then try again to visit 
    https://www.biglumber.com/x/web?mp=1 
the subsequent visit is successful.  

This might be a leak of a reference to certificate struct, or it might
be some code intentionally holding on to a cert reference somewhere.

I can reproduce this behavior with some versions of SeaMonkey, but not with
others.  It is reportedly occurring with Firefox 3, also.
I'm worried about this confusing admins of sites like biglumber.com into thinking their sites are configured correctly when they aren't.

Are you worried about this causing additional problems?
In bug 399019 comment 5, Kai wrote:

> I think it works, because we are caching the intermediate certificate
> somewhere.  But I wonder, WHERE are we caching it?
> 
> Does NSS keep an in-memory-only cache of intermediate certs it sees?
> 
> Or is it more likely the cert is referenced by PSM and therefore still
> available in the in-memory-only db (temp db)?

When an application (or PSM) asks NSS to take a binary DER cert and parse
it, and return a handle (reference) to a parsed cert structure, NSS keeps
a copy of that returned handle in a table, so as not to lose track of it.
The cert structure is reference counted.  If the app asks for this same
cert to be parsed again (say, on another thread) while the first parsed 
struct is still in memory, NSS bumps the reference count, and returns the
same handle.  When the application calls the "destructor" on that handle,
NSS decrements the ref count, and if it goes to zero, NSS removes it from
the table of known cert struct handles.  
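A rough model of that bookkeeping, for readers who do not know the NSS internals. This is a sketch, not NSS code: `CertTable`, `acquire`, `release` and `knows` are hypothetical names, and the "parsed struct" is just a dict.

```python
class CertTable:
    """Toy model of a table of parsed certs keyed by their DER bytes."""

    def __init__(self):
        # DER bytes -> [parsed cert object, reference count]
        self._entries = {}

    def acquire(self, der: bytes) -> dict:
        """Parse `der`, or return the already-parsed struct with its count bumped."""
        entry = self._entries.get(der)
        if entry is None:
            entry = [{"der": der}, 0]  # stand-in for a real parsed cert struct
            self._entries[der] = entry
        entry[1] += 1
        return entry[0]

    def release(self, cert: dict) -> None:
        """The "destructor": drop one reference, forget the cert at zero."""
        entry = self._entries[cert["der"]]
        entry[1] -= 1
        if entry[1] == 0:
            del self._entries[cert["der"]]

    def knows(self, der: bytes) -> bool:
        return der in self._entries


table = CertTable()
first = table.acquire(b"intermediate-ca")
second = table.acquire(b"intermediate-ca")  # e.g. requested from another thread
same_handle = first is second

table.release(first)
still_known_after_one_release = table.knows(b"intermediate-ca")
table.release(second)
known_after_all_released = table.knows(b"intermediate-ca")
```

As long as any caller forgets (or delays) its `release`, the cert stays in the table and remains usable for later chain building.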

The behavior described for this bug could happen if the intermediate CA
cert handles were being leaked, or simply held for a long time (say, while
the page is in the cache, perhaps).  

I'm rather confident that this is not an NSS leak, as we do extensive 
automated leak testing on NSS, and the only known leaks (at present) are
one-time leaks in initialization, no per-connection leaks or per session 
leaks.  

In reply to comment 1, leaked cert refs (if that's what these are) cause
NSS_Shutdown to fail.  DBs may not get flushed and closed properly. 
Profile switches will fail (although maybe FF and TB don't do those any more).
I doubt we leak these through shutdown, since I expect that there's an XPCOM wrapper around the NSS certs and we'd see that leaking.  Please correct me if I'm wrong, though!

We should check that we don't leak these through the application lifetime.
Flags: blocking1.9?
Here's another example:
https://onlineid.bankofamerica.com/ serves an incomplete cert chain, and 
fails.  https://www.svmcards.net/ serves the complete chain including the 
missing cert. After visiting https://www.svmcards.net/ another try at 
https://onlineid.bankofamerica.com/ will succeed.
Keywords: mlk
Add www.blackhat.com to the list of sites with cert issues. 

Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.9a9pre) Gecko/2007100905 Minefield/3.0a9pre Firefox/3.0 ID:2007100905
Blocks: 399019
blackhat's issue has NOTHING to do with this problem.
This bug is not a collection point for all servers with bad certs.
I was only trying to point out that this invalid-chain issue is much larger than anyone has any guesses about at this point.  

Blackhat's issue is exactly the same error symptom as Biglumber, or your post in comment #4, so I don't see the need to get huffy about not being part of the problem.  As an end user/tester and not a coder the issue looks EXACTLY the same to me.  

thanks
General "bad cert" problems can be duped to bug 398915 or can become tech evangelism bugs (like bug 399019).  As Nelson said, this bug is more specific.
It's important that separate issues be kept separate in the bug tracking 
system.  Otherwise, there is no way to say "this bug has been fixed" or 
"this bug is invalid" if the bug report describes multiple separate issues.

This bug is about incorrectly configured sites that appear to be fixed by 
visiting another (correctly configured) site, then trying again to visit 
the one that is incorrectly configured, with successful results.  

Blackhat's problem is a cert domain name mismatch.  The cert doesn't have 
the host names in the right places.  It has a different error code than the 
one displayed by the examples in this bug.  Visiting another site, and then 
trying to visit blackhat again, won't fix it.  It's a separate issue.
Ok, back to the topic.

In comment 0, Nelson mentions that something "leaks" certs.
Reality is slightly different.

PSM registers its AuthCertificateCallback as a callback using SSL_AuthCertificateHook.

As you can see by reading the function, PSM will:
- call the NSS default action, to ensure the cert is valid
- if it's indeed valid, PSM will ask for the full chain
- PSM will walk through the chain.
  For each cert that is not the end entity cert,
  not assigned to a slot, and not marked as permanent storage,
  PSM will explicitly remember that cert in a PSM data structure,
  holding a reference to it.
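The steps above can be sketched roughly as follows. This is an illustration only, not the actual PSM callback: `ChainCert`, `remembered_certs`, `auth_certificate` and the `default_check` parameter are hypothetical stand-ins for the real structures and for the NSS default action.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChainCert:
    name: str
    on_token_slot: bool = False  # already assigned to a slot
    is_permanent: bool = False   # already in permanent storage


remembered_certs = []  # stand-in for the PSM-side data structure


def auth_certificate(chain, default_check):
    """chain[0] is the end entity cert; returns whether the handshake may proceed."""
    if not default_check(chain):  # default action: is the cert valid?
        return False
    for cert in chain[1:]:        # skip the end entity cert
        if cert.on_token_slot or cert.is_permanent:
            continue
        if cert not in remembered_certs:
            remembered_certs.append(cert)  # holding it keeps a reference alive
    return True


good_chain = [
    ChainCert("www.example.com"),
    ChainCert("Intermediate CA"),
    ChainCert("Root CA", is_permanent=True),
]
accepted = auth_certificate(good_chain, lambda chain: True)
rejected = auth_certificate(
    [ChainCert("bad.example"), ChainCert("Other CA")], lambda chain: False
)
remembered_names = [cert.name for cert in remembered_certs]
```

Only the intermediate from the accepted chain ends up remembered; nothing from a chain that failed validation is kept.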

The original intention of this was to make the certs available for "page info", so when you try to view the cert of a web page, you'd still be able to view the full chain, even though the intermediates are no longer required and would otherwise have already been released.

The testcase with biglumber and godaddy was a side effect. I see we don't try to free the remembered cert once you leave a page. But that wouldn't really help to "remove the side effect" (even if we wanted to), because if you had the godaddy page open in one browser window/tab and tried to open biglumber in a different window/tab, you'd still get the side effect (which is: it works).


In comment 2 Nelson suspects that this leak might cause a shutdown failure, but it will not. Before PSM asks NSS to shut down, PSM will release all references to the explicitly remembered certs.


Having said the above, we had a discussion over the last few days.

After discussing the arguments why in the past it had been decided to not permanently store intermediate certs, it was decided: We are willing to permanently store all intermediate certs that have passed the validity test.

I'll attach a patch that implements it at the PSM level.
Summary: Browser remembers intermediate CA certificates for process lifetime → PSM should remember valid intermediate CA certificates
Posted patch: Patch v1
The code in this patch will only get executed after the end entity cert has been successfully validated, so the processed signer certs are valid, too.
Attachment #284677 - Flags: review?(rrelyea)
Blocks: 399643
I'm confused by the change in summary.  What will the effect on memory usage and biglumber.com be?
Comment on attachment 284677
Patch v1

r+ I'm sure explicitly using the internal slot will bite us one day, but we don't really have a better solution right now.

bob

RE: memory usage. This patch stores the cert in the database. Once you visit a site with the required intermediate, that intermediate will be available for future validations.
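To illustrate the idea of storing validated intermediates so they are available for later validations, here is a small sketch. It is not the patch and does not use the NSS cert database: it assumes a plain SQLite table, and `open_store`, `remember_validated_chain` and `lookup` are hypothetical helpers.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open the store; pass a file path instead of ':memory:' to persist across runs."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS intermediates "
        "(subject TEXT PRIMARY KEY, der BLOB NOT NULL)"
    )
    return db


def remember_validated_chain(db, chain):
    """Store the signer certs of a chain whose end entity cert already validated.

    `chain` is a list of (subject, der) tuples; chain[0] is the end entity
    cert and is deliberately not stored.
    """
    with db:  # commit on success
        db.executemany(
            "INSERT OR IGNORE INTO intermediates (subject, der) VALUES (?, ?)",
            chain[1:],
        )


def lookup(db, subject):
    row = db.execute(
        "SELECT der FROM intermediates WHERE subject = ?", (subject,)
    ).fetchone()
    return row[0] if row else None


db = open_store()
remember_validated_chain(
    db, [("www.example.com", b"\x30\x01"), ("Intermediate CA", b"\x30\x82")]
)
stored_intermediate = lookup(db, "Intermediate CA")
stored_leaf = lookup(db, "www.example.com")
```

A later validation that is missing "Intermediate CA" could consult `lookup` before giving up.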

As far as biglumber.com, the site is currently broken as an ssl site. It should be sending its entire cert chain. This patch will mask that brokenness.

bob
Attachment #284677 - Flags: review?(rrelyea) → review+
Attachment #284677 - Flags: approval1.9?
Won't that cause footprint and/or privacy problems, as well as confusing the people who maintain sites like biglumber.com into thinking their sites are configured correctly?
(In reply to comment #14)
> Won't that cause footprint and/or privacy problems, 

Re footprint: It will be less footprint than today!
Before this patch, we remember all certs in RAM.
After this patch, certs are remembered on disk.

I don't understand where you see a privacy problem with storing an intermediate cert.


> as well as confusing the
> people who maintain sites like biglumber.com into thinking their sites are
> configured correctly?

Yes
My privacy concern is that it will be a partial record of what https sites you've visited.  Does one of the "clear private data" checkboxes clear these certs?

What is the advantage of storing these certs?
Attachment #284677 - Flags: approval1.9? → approval1.9+
Jesse Ruderman <jruderman@gmail.com> wrote:
> Won't that [confuse] the people who maintain sites like biglumber.com
> into thinking their sites are configured correctly?

I would say that the browsers that presently cache valid CA certs already 
deceive site admins into thinking their sites are configured correctly.  
Those admins don't become confused while everything seems to be working OK.  

The confusion begins when users of other browsers report that the site 
isn't working right for them.  The users then complain to us "Why doesn't 
FF like this web site?".  We tell them the server is misconfigured.  They 
report this to the site admins, who then file bugs or ask questions in newsgroups: "Why does FireFox not like my site?".  We educate them.  
Most of them then fix their server config.  Today this works pretty well, 
because all browsers except one agree that the sites are misconfigured.  
But a few admins become adamant that if it works for IE, then it must be 
right (see Bug 398210 for an example).

If FF follows IE, then the #1 and #2 browsers will be in accord, and this 
will change things a lot.  The remaining https clients that don't cache 
CA certs will be in such a small minority that admins will ignore them.  
Admins will say to the developers of those remaining https clients 
"my site works with IE and FF, so your product must be wrong", forcing 
them to also add this capability or lose out (one of the things the 
IETF TLS WG was trying to avoid, I believe).  This will impact Opera, 
*Minimo* and *FireFox portable*, and probably Safari and KDE.

Over time, fewer and fewer https servers will be properly configured.  
Eventually so few will be properly configured that active fetching of
previously unseen intermediate CA certs (via AIA extension URIs, see 
bug 399324) will become a necessity. 
 
I predict that will happen in less than 2 years.  The IETF TLS RFCs'
server requirements will have been pushed aside.  I predict the pace with 
which MS will introduce more "improvements" to TLS will increase.  The bar 
will continue to be raised.  The barrier to entry (or staying in the game) 
will continue to go up.  It will be "follow the leader" all the way.  

In fairness to them, I remember when they voiced the very same complaint 
against Netscape.  I've sat on both sides of the table, with and opposite 
market giants who maintain a lead by frequent introduction of non-standard 
features.  

I don't like forcing the minimum https client to become ever more complicated and have ever more storage.  OTOH, I wouldn't mind having to answer fewer inquiries like "Why doesn't FF like my https server?".

> What is the advantage of storing these certs?

#1) over time, it minimizes the number of sites that appear to not work,
because they are incorrectly configured.  It masks the sites' errors,
making happy users.  

#2) When AIA-driven cert fetching becomes a reality (which this will 
hasten), sites whose CA cert must be fetched will have a much longer 
start-up time.  The browser will, in effect, have to fetch one (or 
possibly several) other web page(s) first (containing the missing certs)
before it can display the https page the user requested.  This will be
perceived as a slow down.  Storing the fetched intermediate CA certs 
will minimize the number of times that each cert must be fetched 
(to just a single time per profile).  
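A minimal sketch of the "fetch once, then reuse" behavior described in #2. Everything here is assumed for illustration: `IntermediateCache` is a hypothetical class, the `fetch` callable stands in for whatever would retrieve a cert from an AIA URI, and the URI is a made-up example.

```python
class IntermediateCache:
    """Remember fetched intermediates so each URI is fetched at most once."""

    def __init__(self, fetch):
        self._fetch = fetch  # hypothetical callable: AIA URI -> DER bytes
        self._by_uri = {}
        self.fetch_count = 0

    def get(self, aia_uri):
        if aia_uri not in self._by_uri:
            self.fetch_count += 1  # the slow network round trip happens here
            self._by_uri[aia_uri] = self._fetch(aia_uri)
        return self._by_uri[aia_uri]


def fake_fetch(uri):
    # Stand-in for a real network fetch; returns placeholder bytes.
    return b"DER for " + uri.encode("ascii")


cache = IntermediateCache(fake_fetch)
first = cache.get("http://ca.example/intermediate.cer")
second = cache.get("http://ca.example/intermediate.cer")
fetches = cache.fetch_count
```

In this sketch the cache lives in memory; keeping it per profile would mean backing `_by_uri` with persistent storage.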

AIA-driven cert fetching is the subject of Bug 399324.  Please read it
to see the source of the impetus for all this.

OBTW, My MS contact said that licensing.microsoft.com will be fixed soon.
If the cert-database starts to grow, then it might be necessary to remove old certificates once in a while (if they're not used in a month or so), as long as they're not installed by the user. But I don't think it will be a problem for the near future, unless you happen to visit several unknown websites every week, each with a list of intermediates that you haven't seen before. But in most cases, the database will serve as a nice cache. Wake me up when your cert-database reaches 4MB.
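One possible shape for that cleanup, as a sketch only. The record layout (`name`, `last_used`, `user_installed`) and the 30-day threshold are assumptions made for the example, not anything the cert database records today.

```python
from datetime import datetime, timedelta


def prune(certs, now, max_idle=timedelta(days=30)):
    """Keep user-installed certs and any cert used within `max_idle`."""
    return [
        cert
        for cert in certs
        if cert["user_installed"] or now - cert["last_used"] <= max_idle
    ]


now = datetime(2007, 11, 1)
certs = [
    {"name": "recent-intermediate", "last_used": now - timedelta(days=3),
     "user_installed": False},
    {"name": "stale-intermediate", "last_used": now - timedelta(days=60),
     "user_installed": False},
    {"name": "imported-ca", "last_used": now - timedelta(days=400),
     "user_installed": True},
]
kept_names = [cert["name"] for cert in prune(certs, now)]
```

This only works if the store tracks a last-used time and whether the user installed the cert.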
(In reply to comment #16)
> My privacy concern is that it will be a partial record of what https sites
> you've visited.

You can't deduce that you visited Paypal.
You can only deduce that you visited one of several hundred sites that use a cert from Verisign.


> Does one of the "clear private data" checkboxes clear these certs?

No. This will be tricky to implement.
If you really wanted to "delete all intermediate CAs that were collected by surfing the web",  then we would have to distinguish them from "all other intermediate CA certs the user might have explicitly imported".
Patch checked in, marking fixed.

I propose that remaining issues are filed as separate bugs.
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Where will the intermediate certs show up in the Certificate Manager?
(In reply to comment #21)
> Where will the intermediate certs show up in the Certificate Manager?

In the authorities tab. See also bug 399643 for a proposal to introduce a separate tab for intermediates.
Reopening bug. All patches that got checked in to trunk yesterday are being backed out, because it's unclear which patch has caused a performance regression.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
checked in again, marking fixed.
Status: REOPENED → RESOLVED
Closed: 12 years ago → 12 years ago
Resolution: --- → FIXED
(In reply to comment #17)
> If FF follows IE, then the #1 and #2 browsers will be in accord, and this 
> will change things a lot.  The remaining https clients that don't cache 
> CA certs will be in such a small minority that admins will ignore them.  

Nelson, isn't one solution for this to add code that will check whether the intermediate certs used in the validation are really present in the response of the server, and if not, display a non-blocking warning that says the site is not correctly configured and that this can result in blocked access in some cases? The warning would include a web link to a complete technical description of the problem.

Even if everybody starts implementing the caching, if AIA is not supported or not present in the cert, a fresh install will still fail on those sites, so some users will still be blocked even if it's infrequent enough to be really hard to detect for those sites.

With the non-blocking warning, users are not blocked, and the web master can get to the explanation of his problem by starting Firefox just once.
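The check could look something like this sketch. Certs are represented by plain names, and `missing_from_server` and `config_warning` are hypothetical helpers, not existing PSM or NSS functions.

```python
def missing_from_server(validated_path, sent_by_server, trusted_roots):
    """Intermediates used in the validated path that the server did not send.

    validated_path[0] is the end entity cert. Trusted roots are excluded,
    since servers are not expected to send them.
    """
    sent = set(sent_by_server)
    return [
        name
        for name in validated_path[1:]
        if name not in sent and name not in trusted_roots
    ]


def config_warning(missing):
    """Return a non-blocking warning string, or None if the chain was complete."""
    if not missing:
        return None
    return "Site did not send required intermediate cert(s): " + ", ".join(missing)


path = ["www.example.com", "Intermediate CA", "Root CA"]
roots = {"Root CA"}

warning_for_broken_site = config_warning(
    missing_from_server(path, ["www.example.com"], roots)
)
warning_for_good_site = config_warning(
    missing_from_server(path, ["www.example.com", "Intermediate CA"], roots)
)
```

The connection would still proceed in both cases; only the misconfigured site produces a warning.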
Jean-Marc, I filed bug 402846 based on your idea.
This fix enables Web administrators to ignore instructions from certificate authorities on the proper configuration of their servers.  See <http://www.verisign.com/support/verisign-intermediate-ca/index.html> for VeriSign's instructions for configuring.  

I would be concerned about the overall security of a server if the administrator cannot follow simple instructions regarding the installation of an intermediate certificate.  This bug report would allow such administrators to continue in their generally sloppy (ignorant?) operations, placing users at risk of security vulnerabilities.  

Do you really want to do this?  Wouldn't it be better to prevent a valid intermediate certificate -- installed with the site certificate on a server -- from being used at all with a site certificate from a misconfigured server?  