Open Bug 324867 Opened 19 years ago Updated 1 year ago

Make certs sent by peer available to apps that use libSSL

Categories

(NSS :: Libraries, enhancement, P2)

enhancement

Tracking

(Not tracked)

People

(Reporter: julien.pierre, Unassigned)

References

(Blocks 1 open bug)

Details

Attachments

(3 files, 1 obsolete file)

There are several limitations to the current cert verification callback interface in libssl :

1) lack of support for a state object for non-blocking I/O

Currently, the SSL_AuthCertificate callback interface allows returning of a SECStatus which can have a value of SECWouldBlock .

However, no state object can be persisted accross multiple calls to the cert callback. So, the function has to basically start the operation from scratch. If there is I/O that would block, the function cannot return because it will lose its state, which will have to be destroyed (or leaked), but cannot be passed back during the next callback invocation.

One possible solution is to add a void** argument to the interface for saving this state. However, there would also need to be a way for the libssl caller to abort the cert verification in order to clean up the state. There would probably need to be an additional argument to specify state cleanup.

Nelson suggested also that we put this void* somewhere inside the SSL secret layer, and add an accessor to get/set it. This would help to create it and pass it back without changing the callback prototype, but this does not resolve the invocation of a a cleanup function.

2) the actual cert chain transmitted by the peer isn't passed to the callback

Only the PRFileDesc* is passed in, and then the callback function has to extract the leaf cert and reconstruct a chain, which may or may not be the one that came in over the network. It would be nice to have the actual raw cert chain that came in available. There have been cases in the past where some certs didn't decode and the SSL layer failed. The responsibility for decoding the certs could be moved to this new callback function instead, as long as the leaf cert which is used in the SSL protocol successfully decodes.

If we are going to change the callback interface prototype - or more accurately, offer the ability to register an alternate callback prototype, we may as well add a generic "catch-all" void* argument which could be used by SSL to pass any information needed in the future by the callback. New APIs would be exposed to access any additional data as needed.
Severity: normal → enhancement
Priority: -- → P1
Target Milestone: --- → 3.12
Isn't this bug effectively a duplicate of bug 294531 ?

As I mentioned yesterday, I think we can have a single callback API that
can be used to :
a) initiate a new cert path validation, or 
b) abort one already started.  

I think all our new callback APIs should be defined to use varargs, so 
that we can extend them in the future without needing to go through the 
pain we now face for the existing SSL callbacks.
Nelson,

This bug is about the SSL layer callback. Bug 294531 is about the new functions in the CERT (PKIX) layer. Using varargs is an interesting idea, but it has its own problems, such as how to specify the # of arguments that follow so the callback code knows how to pop them.
Depends on: 234130
Blocks: 327386
See also bug 151010 .
QA Contact: jason.m.reid → libraries
Whiteboard: PKIX
See also bug 288788 which requests other enhancements to the SSL callback interface.
I thought Steve Parkinson took this bug. (?)
Version: unspecified → trunk
Not P1 for 3.12.
Priority: P1 → P3
Blocks: 363477
This bug contains two separate RFEs which can be implemented independently.
I think it is likely that one of these will be implemented before the other.
So, I am narrowing the scope of this bug to just include the second issue
from comment 0, namely:

2) the actual cert chain transmitted by the peer isn't passed to the callback

We can address that problem without changing the callback API.  We can simply
provide a new libSSL function that the callback function can call to obtain
the certs sent by the peer.  

I will reopen bug 234135 to address the non-blocking issue.
No longer blocks: 327386
No longer depends on: 234130
Summary: enhancements to SSL cert auth callback interface for non-blocking I/O and PKIX → Make certs sent by peer available to apps that use libSSL
Whiteboard: PKIX
I propose to add this new function to libSSL:

extern CERTCertList * SSL_PeerCertificateList(PRFileDesc *fd);

It will return a CERTCertList containing the list of certs sent by the peer,
in original order, or NULL if it cannot or if it received no certs from the
peer.  (Should it return an empty list if it received no certs from the peer?)

The caller could then traverse/manipulate the list with any of the usual
functions declared in cert.h, and destroy the list with CERT_DestroyCertList
when done with it.
I think we'd like to add this for 3.12 if possible.  Patch forthcoming.
Priority: P3 → P2
Attached patch patch v1 (obsolete) — Splinter Review
I won't request review until I've tested this more.
There is a large overlap between this bug and bug 288788.
This patch also addresses that bug.
oops.  wrong patch was attached.
Attachment #313128 - Attachment is obsolete: true
This much larger patch eliminates libSSL's own private method of linking
CERTCertificates into a linked list, and uses libNSS's CERTCertList instead.

But I have one concern about this patch. For each cert in the old CERTCertList,
the new function CERT_DupCERTCertList dups the cert handle (bumping the ref 
count), but merely copies the "appdata" value.  While that is fine for libSSL's
purposes (because appData is always zero), it's probably not OK in the general case.  I suspect that there's no safe generic way to duplicate CERTCertLists,
and that each "application" of them must have its own dup function. :(

Julien, I'd appreciate your comments on that aspect of the new ssl3_SendRecord function in this patch.
Nelson,

There are several reasons behind enhancement 2) .

One of them is to be able to get at the exact cert chain sent by the server, as opposed to some reconstruction by the NSS PKIX code. I believe your proposal would solve this issue.

However, there was another reason too that you may not recall. We had run into problems before where the certs were undecodable by the NSS DER decoder. As a result, the SSL cert checker callback was not called, and there was no way to actually trace what was wrong with the peer's certs, except through painful manual debugging. You chose not to fix this problem, but agreed to the RFE.

I consider it a requirement to be able to get to the raw certs as a list or array of SECItems, rather than CERTCertificate. And it must be possible to get to them even if they are not valid ASN.1 so that we can trace what's wrong. I believe this requires a new callback.

I won't object to your proposal, but it doesn't solve the whole problem. The reason I had grouped it together with the other RFE about non-blocking is because I think the only way to fix both of those issues was with a new callback prototype.
Julien, 

As you will recall, NSS defines a CERTCertList, which is essentially a 
linked list of CERTCertificates, and a CERTCertificateList, which is essentially an array of SECItems each containing a DER cert.  I have chosen 
here to define a new libSSL function that returns a CERTCertList.  It would 
not be significantly more work to add ANOTHER function that returns a 
CERTCertificateList.  

My question for you (which you didn't answer) concerns the new function
CERT_DupCERTCertList, whose purpose is to duplicate a CERTCertList.  
I gave it that strange name, instead of the more obvious CERT_DupCertList,
because there is already a function named CERT_DupCertList which (sadly) 
takes and returns a CERTCertificateList. :(  To be consistent with the 
naming of all the other functions that use CERTCertificateLists, that old 
function should have been named CERT_DupCertificateList.  If I had named 
these functions originally, I would have used the term CertificateList in 
functions that manipulate the linked list of CERTCertificates, but alas 
it's a mess now.

IIRC, you added the appData member to the CERTCertList nodes so that
some other piece of info could be kept together with the CERTCertificates.
I don't recall what that other data was.  The question is: when duplicating
a CERTCertList, is it sufficient to copy the value of the appData member for each cert into the new CERTCertList?  or is it necessary to make a copy of 
the object to which that member refers (as is done for the CERTCertificate
object)?
I think the appData field was added to be able to get to all the instances of a particular DER cert that may exist in multiple slots. I'm not sure what the correct way to duplicate that field would be, if one exists. I would not try to go there.
Julien,
What function(s) return a CERTCertList in which the appData members contain
the info you describe?  

I think we probably should think of CERTCertLists not as a single type of 
object, but rather as a base class from which subsclasses are derived.
Some of those subclasses produce CERTCertLists in which the appData member
has meaning.  Others do not, or have different meanings.  So, maybe, rather
than having one function that dups a CERTCertList object, we need to have
separate duplication functions for each subclass.  

The subclass of CERTCertList used by libSSL for passing the list of peer 
certs to the app would not use appData, or would use it in its own way.
Perhaps it should have a function named SSL_DupCERTCertList that can 
rightfully make the correct assumption about the meaning of appData for 
CERTCertLists produced by libSSL.  

Before deciding if this approach to the problem is reasonable, I would like
to know what (some of) the other subclasses are.
Nelson,

I think PK11_ListCert was one such function, but I can't recall now. This is going to take me more time to research. In any case, I would much prefer a function that returns a CERTCertificateList (with SECItem*) rather than one that returns a CERTCertList (with CERTCertificate*), because the later list type implies that the peer certs are always decodable with our decoder, which is not necessarily the case.
Attachment #313129 - Attachment description: patch v2 → Alternative patch A, v2
Attachment #313200 - Attachment description: alternative patch v1 → Alternative patch B, v1
This patch provides two functions with which to access the received list
of certs.  One returns a CERTCertList, the other a CERTCertificateList.
A CERTCertList is what CERT_PKIXVerifyCert expects as input, which is why
I am providing it.  There is no software to use the CERTCertificateList.

In my next comment I will say a few words about the cost/benefit of 
CERTCertificateList.
Adding the ability to return a CERTCertificateList wasn't much more code.
It means that every incoming cert is copied in memory twice:
- once when it is copied into the CERTCertificateList and
- once when a copy is made for the CERTCertificate structure

IMO, in the vast vast majority of cases, 99+%, the CERTCertificateList form
(array of SECItems of DER certs) will not be used, so the memory and the 
time spent to copy the cert into that memory will have been wasted.  

The primary purpose of having the array of raw DER certs is, IMO, to make
them available for diagnostic purposes.  But since NSS cannot decode them,
any such diagnosis will have to be done with some other software, such as
dumpasn1, and that (in turn) implies that the raw DER certs will be output
to files.  So, I think the only use of the CERTCertificateList form is for
dumping certs to files, e.g. as we want to do in bug 363477.  Of course,
we have another program that will capture certs going over a connection
and will dump them to files: ssltap.  So, I wonder if we want to burden
all clients and all servers with the extra memory and memory cycle cost 
of creating the CERTCertificateList.  

In contrast, PSM could and *would* use the CERTCertList today, if it was
available in 3.12.

One other option (alternative D? :) is to invent yet another SSL socket 
option flag that enables or disables this behavior.  That may be the best
compromise.  
Nelson,

I think as you implemented it in "alternative patch C, 1", SSL_GetPeerDerCerts does not work in the general case. The reason is that CERT_NewTempCertificate may fail. If it does, there is a goto ambiguous_err and then everything goes away. So, the patch doesn't meet all the requirements to solve the problem originally reported. IMO, that doesn't meet the requirements for vfyserv in bug 363477.

The existing cert callback checker takes a CERTCertificate as one of its arguments. If we change libssl to ignore the decoding error and make it pass a NULL cert, we cannot presume that existing application callbakc code will check for NULL cert pointers, and it may cause crashes. Maybe we could pass a semi-bogus cert structure that will never verify to prevent these crashes ? This would be an expedient hack to get past the problem above, but I don't like it much. Otherwise, we have to provide a new callback interface that doesn't take a CERTCertificate* as an argument in the prototype as I originally proposed.

I agree that we should not burden all clients with 2 copies of the DER certs. I would prefer that we don't add another SSL option for this. The preferred way, IMO, would be to define a new callback interface, that would only take the raw cert list (CERTCertificateList).

The choice for libssl would be simple :

a) If this new callback was then set by the application, it would be up to the application to decode the raw certs, and call whatever PKIX function it wants. libssl would not decode the certs at all in this case. Preferably, this new callback prototype would also support non-blocking at the same time, since we don't have to have too many callback interfaces to do the same thing. This is why the 2 RFEs were grouped.

b) If this new callback was not set by the application, the old-style callback would be called, and libssl would automatically know that it didn't need to copy the DER certs, and would continue to decode them itself as it does today.

I understand we are not ready to deliver the non-blocking callback support right now, since there is still a dependency on an NSPR RFE unfortunately.

I just think it's not a good thing that we deliver new public APIs that can be used exclusively by PSM, but don't work in the general case, even if they can be used today. We are the API designers and we are the ones stuck with supporting those public APIs forever. As much as possible, we should try to implement something that will meet our goals in a way as generic as possible, rather than having to do it again. I think this is such a case since we have several programs faced with similar problems.

How critical is it for PSM to get to the exact certs that the server sent ? Is there a bug on this ? And why would it be acceptable for PSM not to get the raw certs if decoding fails ? Even though PSM is not a diagnostic tool, I think it would be a nice feature for it to have a way to save bogus certs eventually.

I don't see why PSM would be able to use SSL_GetPeerCertList but not SSL_GetPeerDerCerts. There would be very little difference in the implementation, just an extra loop to decode the DER certs.

Since I believe we don't want to change the callback interface today, here is a counter-proposal :

a) don't implement SSL_GetPeerCertList . Implement only SSL_GetPeerDerCerts .

b) add a new SSL option named something like SSL_GET_PEER_DER_CERTS .
SSL_GetPeerDerCerts will only work if that option is set.

c) if SSL_GET_PEER_DER_CERTS is set, libssl will stop decoding the certs.
It will pass a NULL argument for the CERTCertificate to the cert callback.
This should be OK, because any application that sets SSL_GET_PEER_DER_CERTS should have a callback ready to handle a NULL CERTCertificate argument.

d) Modify our default cert checker callback to deal with this case - don't crash if the CERTCertificate is NULL, and make it call SSL_GetPeerDerCerts if SSL_GET_PEER_DER_CERTS is set.

This proposal has the following benefits :
- existing applications are completely unaffected
- applications that want the peer certs can simply set SSL_GET_PEER_DER_CERTS and modify their cert checker callback to use SSL_GetPeerDerCerts
- there is always only one copy of the peer DER certs in memory
- peer certs are only decoded once, either by libssl or by the application callback
- this proposal should meet the requirements of both PSM and bug 363477
- there should also be no need for changing any code related to peer cert decoding when we implement the non-blocking callback interface in the future. One will be able to write those new callbacks using SSL_GetPeerDerCerts .
Comment on attachment 313129 [details] [diff] [review]
Alternative patch A, v2

This patch only provides the decoded peer certs, but not the raw certs.
Attachment #313129 - Flags: review-
Comment on attachment 313200 [details] [diff] [review]
Alternative patch B, v1

Different implementation, but still the same problem as patch A - no raw certs are returned.
Attachment #313200 - Flags: review-
Comment on attachment 313241 [details] [diff] [review]
Alternative patch C, v1

With this patch there is no way for the application to get the raw certs if they can't be decoded by NSS first. So the patch doesn't meet the requirements for bug 363477
Attachment #313241 - Flags: review-
Hmm. Looks like c) is not possible in my comment 21 . libssl still needs to at least decode the peer cert if not the whole chain. Maybe we can come up with some compromise for that one cert if the others aren't needed by the libssl code.
(In reply to comment #21)
> Nelson,
> 
> I think as you implemented it in "alternative patch C, 1", SSL_GetPeerDerCerts
> does not work in the general case. The reason is that CERT_NewTempCertificate
> may fail. If it does, there is a goto ambiguous_err and then everything goes
> away. 

Not in my patch.  That's how the code works without my patch, not with my patch.
If the server cert itself doesn't decode, then there's NO HOPE of an SSL 
handshake succeeding.  But if a CA cert doesn't decode, there is hope.
That's what bug 288788 asks for.
Here's the story on PSM's need for these certs today.

The server may send certs that are parts of multiple separate chains, e.g. 
to separate roots.  PSM does two cert validations, one without EV constraints,
and if that passes, a second one with EV constraints (e.g. using policy OIDs,
and a reduced set of EV-only trust anchors).  It may well be that the subset 
of certs received from the server that form the first chain are a different
subset than those that would be needed to succesfully form a validated chain 
in the second validation.  

PSM may perform the second validation after the handshake is complete 
(indeed, after the entire http transaction is complete and the connection is
closed), so PSM needs to keep the certs from the server around.  PSM attempts
to get references to the certs from the server and hold them for the duration
of the page view, for this purpose.  But it is only able to obtain the certs
that are actually used in the first chain validation.  (I can expound on that
if desired.) So, if the second validation happens after the connection is 
over and needs certs that were not in the first validated chain, the EV 
validation will fail.  With the API that returns a CERTCertList, PSM only 
needs to hold onto that CERTCertList for the duration of the page view to 
achieve its goal.
I like the idea of the two forms of decoding being mutually exclusive.
Rather than having a single path through the function that tries to do 
both, we would have two mutually exclusive paths, one of which doesn't
require any decoding.  The existing default auth cert callback would not 
be usable if the decoding was merely providing raw outputs, so we would
never attempt to use the "default" authcert callback function when the 
raw output option was selected.  The application would need to provide
an explicit callback of the function would fail.  
Nelson,

re: comment 26, there is one path that you left in ssl_HandleCertificate, in attachment 313241 [details] [diff] [review] lines 7200 - 7220, where it appears the problem is still after the CERT_NewTempCertificate failure .

re: comment 27, I realized that, and that's why I wrote comment 25.

re: comment 28, couldn't that be worked around by making libssl not destroy the certs it decoded, rather than creating an API to pass them in ?

re: comment 29, glad you like the suggestion. But I think hopefully we should be able to fix our default authecert callback to work with either the current model or raw output, in case the application chooses raw output mode and forgets to override the callback. Or we can make libssl check for this error and fail if it's not overriden in this case.

Blocks: 427668
Unsetting target milestone in unresolved bugs whose targets have passed.
Target Milestone: 3.12 → ---

The bug assignee is inactive on Bugzilla, and this bug has priority 'P2'.
:beurdouche, could you have a look please?

For more information, please visit auto_nag documentation.

Assignee: nelson → nobody
Flags: needinfo?(bbeurdouche)
Severity: normal → S3

We have modified the bot to only consider P1 as high priority, so I'm cancelling the needinfo here.

Flags: needinfo?(bbeurdouche)
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: