Last Comment Bug 327677 - cmsutil assertion failure, object reference leak
: cmsutil assertion failure, object reference leak
Status: RESOLVED FIXED
ECC
:
Product: NSS
Classification: Components
Component: Libraries (show other bugs)
: 3.11
: All All
: P2 normal (vote)
: 3.11.1
Assigned To: Alexei Volkov
: Jason Reid
Mentors:
Depends on: 232386
Blocks: 324887
  Show dependency treegraph
 
Reported: 2006-02-17 16:28 PST by Nelson Bolyard (seldom reads bugmail)
Modified: 2006-03-14 14:25 PST (History)
3 users (show)
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---


Attachments
mem leak fix patch (799 bytes, patch)
2006-03-02 14:23 PST, Alexei Volkov
nelson: review+
julien.pierre: superreview+
Details | Diff | Splinter Review

Description Nelson Bolyard (seldom reads bugmail) 2006-02-17 16:28:03 PST
This bug report is extracted from 
https://bugzilla.mozilla.org/show_bug.cgi?id=324887#c49 and 
https://bugzilla.mozilla.org/show_bug.cgi?id=324887#c50

The nightly QA test runs of NSS (built with ECC enabled) now fail in the 
same way on all platforms.  This problem is a blocker to further ECC 
progress with NSS.  I'm looking for a volunteer to debug/diagnose the 
problem to the point of being able to indict some specific code.  

There are actually two problems, one of which needs immediate attention.

1. The command 
  cmsutil -E -i alicecc.txt -d ../alicedir -o aliceve.env -r eve@bogus.net
fails, outputting these error messages:

  ERROR: cannot create CMS recipientInfo object.
  cmsutil: problem enveloping: security library: invalid algorithm.

We understand that "Enveloping" (Enrypting) of S/MIME email is not yet
supported, so this particular error has a good excuse for now.  But  ...

2. In Debug builds, the above error message is immediately followed with 
this assertion failure:

  Assertion failure: secmod_PrivateModuleCount == 0, at pk11util.c:120
  Abort - core dumped

This means that some reference counted object in NSS was leaked, probably
in an error path resulting from the invalid algorithm error above.
That leak problem needs to get fixed more urgently than the invalid algorithm
error itself.
Comment 1 Nelson Bolyard (seldom reads bugmail) 2006-02-27 12:52:33 PST
This is no longer occurring in nightly QA runs, because those runs are no
longer performing the case that triggered this assertion failure, IINM.
Christophe, can you confirm that?

Alexei, would you take a look at this?
Comment 2 Alexei Volkov 2006-03-02 14:23:36 PST
Created attachment 213802 [details] [diff] [review]
mem leak fix patch

Problem is still reproducible(as expected) when nss_cmsrecipientinfo_create tries to identify algorithm for EC cert:

cmsutil -E -i alicecc.txt -d ../alicedir -o aliceve.env -r dave-ec@bogus.com

(dave-ec@bogus.com with cert algorithm tag SEC_OID_ANSIX962_EC_PUBLIC_KEY).
cmsutil fails with the same error message.

nss_cmsrecipientinfo_create leaks a cert structure when error occures toward the end of the function and RecipientIDSelector type is set to NSSCMSRecipientID_IssuerSN. In this case cert parameter get duped into ri->cert (cmsrecinfo.c:130) and never get freed.
Comment 3 Nelson Bolyard (seldom reads bugmail) 2006-03-02 15:04:48 PST
Comment on attachment 213802 [details] [diff] [review]
mem leak fix patch

Good find!  and Quick, too!
Comment 4 Julien Pierre 2006-03-02 15:35:10 PST
Comment on attachment 213802 [details] [diff] [review]
mem leak fix patch

This patch looks good.

However, a quick inspection of the code shows that there may be another problem of the same type in this function. For example, on line 158 :

rid->id.issuerAndSN = CERT_GetCertIssuerAndSN(poolp, cert);

I don't see a corresponding CERT_DestroyCertificate. We probably don't actually run into this because the only things that can fail afterwards is ASN.1 encoding of integers, but if that happens I think we have another cert leak
Comment 5 Nelson Bolyard (seldom reads bugmail) 2006-03-02 16:07:12 PST
(In reply to comment #4)

> However, a quick inspection of the code shows that there may be another 
> problem of the same type in this function. For example, on line 158 :
> 
> rid->id.issuerAndSN = CERT_GetCertIssuerAndSN(poolp, cert);
> 
> I don't see a corresponding CERT_DestroyCertificate. 

I think you're referring to this code:

160 	rid->identifierType = type;
161 	if (type == NSSCMSRecipientID_IssuerSN) {
162   	    rid->id.issuerAndSN = CERT_GetCertIssuerAndSN(poolp, cert);
163   	    if (rid->id.issuerAndSN == NULL) {
164   	      break;
165   	    }

It seems to me that, with Alexei's patch, any path that goes to loser 
will not leak the cert.  But in the code snipped above, I notice that, 
whether rid->id.issuerAndSN is NULL or not, the action is the same.  
It breaks out of the big switch, and then proceeds down to the success case, 
returning the value of ri.  

When ri is returned, the cert isn't leaked, because the ri still holds a 
reference to it, and presumably that reference will be destroyed later 
when the ri is destroyed.  But it seems to me that if rid->id.issuerAndSN 
is NULL, that should be a failure case, no?  

I think the above break should be a "goto loser" or at least should set rv to SECFailure.  

But these issues seem quite separate from the subject of this bug.  
Alexei should go ahead and commit his fix (IMO) on trunk and branch.
Then if he wants to fix the rest... :-)
Comment 6 Julien Pierre 2006-03-02 16:37:14 PST
Nelson,

I was referring to that code indeed. However, it turns out the CERT_IssuerAndSNStr structure returned contains SECItems in the arena of the issuer subject and serial number, but no a CERTCertificate reference to the issuer cert, so there is no possible cert reference leak in this code path. The SECItem get freed properly when the arena gets released.

>It seems to me that, with Alexei's patch, any path that goes to loser 
>will not leak the cert.  But in the code snipped above, I notice that, 
>whether rid->id.issuerAndSN is NULL or not, the action is the same. 
>It breaks out of the big switch, and then proceeds down to the success case, 
>returning the value of ri.  

Before the function returns ri, there is another switch statement, which does some ASN.1 encoding of the version number. That could fail, for example with an out-of-memory error. The function goes to loser when this happens, so ri is not returned. But there is no problem because there is no cert reference to be leaked. Sorry I jumped the gun on this.

>But it seems to me that if rid->id.issuerAndSN  is NULL, that should be a failure case, no?  

Yes, that may be true, but I'm not certain. All the other case statements that contain errors set rv = SECFailure and call PORT_SetError except this one.
Comment 7 Nelson Bolyard (seldom reads bugmail) 2006-03-02 20:05:00 PST
Alexei's patch checked in.
Checking in cmsrecinfo.c;  new revision: 1.18; previous revision: 1.17
Checking in cmsrecinfo.c;  new revision: 1.16.2.1; previous revision: 1.16
Marking fixed.  

Thanks Alexei for that quick fix!

Note You need to log in before you can comment on or make changes to this bug.