Closed Bug 353910 Opened 15 years ago Closed 15 years ago

memory leak in RNG_RNGInit


(NSS :: Libraries, defect, P2)



(Not tracked)



(Reporter: julien.pierre, Assigned: julien.pierre)



(1 file, 2 obsolete files)

This freebl memory leak is seen on every NSS initialization .

        88       1  0x813c648  calloc < PR_Calloc < PR_NewLock < 0xd15041f9< PR_CallOnce < 0xd15042ea< RNG_RNGInit < nsc_CommonInitialize < NSC_Initialize < secmod_ModuleInit < SECMOD_LoadPKCS11Module < SECMOD_LoadModule < SECMOD_LoadModule < nss_Init < NSS_Initialize < main
        88       1  0x813c1a8  calloc < PR_Calloc < 0xd154d5aa< 0xd15041bf< PR_CallOnce < 0xd15042ea< RNG_RNGInit < nsc_CommonInitialize < NSC_Initialize < secmod_ModuleInit < SECMOD_LoadPKCS11Module < SECMOD_LoadModule < SECMOD_LoadModule < nss_Init < NSS_Initialize < main
Attached patch change default macro (obsolete) — Splinter Review
I was pleasantly surprised that the shutdown code for the RNG context was already written. But it was #ifdef LEAK_TEST . IMO, this was the wrong macro name to use. The Mozilla browser is no longer the main user of softoken. softoken gets delivered as part of several operating systems and used in many kinds of applications, many of which load and unload it dynamically. This memory needs to be cleaned up for these applications, and not just for testing purposes. I kept the code under ifdef KEEP_ENTROPY if the browser wants to have the existing behavior, but it should be the exception rather than the rule.
Attachment #239755 - Flags: superreview?(nelson)
Attachment #239755 - Flags: review?(rrelyea)
Comment on attachment 239755 [details] [diff] [review]
change default macro

r=nelson, with one change:

>-static PRCallOnceType coRNGInit = { 0, 0, 0 };
>+static PRCallOnceType pristineCallOnce;
>+static PRCallOnceType coRNGInit;

Declare pristineCallOnce to be const.
Attachment #239755 - Flags: superreview?(nelson) → review+
I know about this leak.  I don't know why I looked at it.
I can't find an existing bug, so perhaps I found it when
I reviewed the code for FIPS validation.

When I looked at this issue, I came up with the following

Add a static RNGContext structure and make globalrng point
to it:

  static RNGContext therngcxt;

Then, change rng_init to set globalrng to &therngcxt,
and change freeRNGContext to not call
  PORT_ZFree(globalrng, sizeof *globalrng);
Instead, manually zero the globalrng->xxx fields except
the globalrng->XKEY buffer, which holds entropy.

(Note: we can simply declare globalrng as:

  static RNGContext globalrng;

But this requires changing global->xxx to
and globalrng to &globalrng throughout the file, which
makes the patch larger.  It also requires a new way to
test if rng_init succeeded.  Right now we simply test
globalrng != NULL.)

Finally, remove the #ifdef LEAK_TEST from the softoken.
Attached patch patch with Wan-Teh's suggestion (obsolete) — Splinter Review
I really like Wan-Teh's suggestion of using a static RNGContext, and
having globalrng be set to its address in rng_init.  So I wrote the
attached patch, which incorporates Julien's previous patch plus 
Wan-Teh's suggestions (I think).  I have not done extensive testing
with this patch.
Attachment #239823 - Flags: review?(julien.pierre.bugs)
Comment on attachment 239823 [details] [diff] [review]
patch with Wan-Teh's suggestion

Thank you, Nelson.  I just thought of another issue.
Since the internal state of the RNG is a critical security
parameter (CSP) of the crypto module, let's hash it with
B_HASH_BUF as a form of zeroization without losing the

The 'Xj', 'avail' and 'seedCount' fields also need to be
zeroed during RNG shutdown or initialization.
Attachment #239823 - Flags: review?(julien.pierre.bugs) → review-
I'll let you write that patch, Wan-Teh.
It occurred to me over the week-end that as part of bug 115951, will be unloaded. Thus, all global variables relating to the RNG context, static or not, will be freed from memory during C_Finalize, and there will be no way to keep the entropy around after C_Finalize . When C_Initialize gets called a second time, it will load libfreebl again, and get a new "static" variable with a pristine state. Thus, attachment 239755 [details] [diff] [review] can be simplified to just call RNG_RNGShutdown unconditionally in pkcs11.c, and the long comment can be removed .

Even if libfreebl didn't get unloaded, in a properly-written NSS and PKCS#11 application, NSS_Initialize or softoken's C_Initialize should not be called concurrently from multiple threads. And C_Finalize's behavior is undefined if there are any pending calls to other PKCS#11 APIs in other threads, except for C_WaitForSlotEvent . As long as C_WaitForSlotEvent doesn't need to access the RNG context, I think we should be OK with the simpler patch.

I think the dynamic unloading of libfreebl will automatically take care of the zeroing of the context mentioned by Wan-Teh in issue 5, so there should be no zero'ing code needed to write if we fix bug 115951 .
Note that with the current architecture of softoken with a freebl shared library on all platforms, the entropy will always be lost in the  BL_Cleanup call that follows RNG_RNGShutdown, which will dlclose freebl once bug 115951 is fixed . So, it's not particularly useful that RNG_RNGShutdown preserves the entropy . But as a CSP, the entropy still needed to be zero'ed/reset since it is a CSP, even if it got unloaded.

If we ever change back to freebl static libraries, or some other architecture such as filtee library that doesn't require unloading freebl in C_Finalize, then this code will allow the entropy to be reused in programs that call C_Initialize / C_Finalize / C_Initialize again without unloading the softoken .
Attachment #239755 - Attachment is obsolete: true
Attachment #239823 - Attachment is obsolete: true
Attachment #240089 - Flags: superreview?(wtchang)
Attachment #240089 - Flags: review?(nelson)
Attachment #239755 - Flags: review?(rrelyea)
Attachment #240089 - Flags: review?(nelson) → review+
Comment on attachment 240089 [details] [diff] [review]
Implements Wan-Teh's suggestion in comment 5


I found that we actually documented that PRCallOnceType
variables should be initialized to all zeros:

So, it's okay to get rid of pristineCallOnce andjust use
memset to reset coRNGInit.

You may want to change the word "free" in the comment
and function name here:

  * Free the global RNG context
 static void

because it's no longer accurate.  Perhaps we can use
"clean up" or "shut down".
Attachment #240089 - Flags: superreview?(wtchang) → superreview+
Let's just stay with the "pristine" copy and doing a structure copy.
It's more efficient than using memset for such a small struct.
Thanks for the reviews. I changed the comment for freeRNGContext to say "Clean up" .

I checked the patch in to the tip :

Checking in freebl/prng_fips1861.c;
/cvsroot/mozilla/security/nss/lib/freebl/prng_fips1861.c,v  <--  prng_fips1861.c
new revision: 1.25; previous revision: 1.24
Checking in softoken/pkcs11.c;
/cvsroot/mozilla/security/nss/lib/softoken/pkcs11.c,v  <--  pkcs11.c
new revision: 1.133; previous revision: 1.132

And to NSS_3_11_BRANCH :
Checking in freebl/prng_fips1861.c;
/cvsroot/mozilla/security/nss/lib/freebl/prng_fips1861.c,v  <--  prng_fips1861.c
new revision:; previous revision:
Checking in softoken/pkcs11.c;
/cvsroot/mozilla/security/nss/lib/softoken/pkcs11.c,v  <--  pkcs11.c
new revision:; previous revision:

Christophe, this is also an init-time leak that we had chosen to ignore in previous leak tests. Please remove stacks that include RNG_RNGInit from the ignore list from future leak tests.
Closed: 15 years ago
Priority: -- → P2
Resolution: --- → FIXED
Target Milestone: --- → 3.11.4
You need to log in before you can comment on or make changes to this bug.