Closed Bug 353910 Opened 18 years ago Closed 18 years ago

memory leak in RNG_RNGInit

Categories

(NSS :: Libraries, defect, P2)

3.11.3
defect

Tracking

(Not tracked)

RESOLVED FIXED
3.11.4

People

(Reporter: julien.pierre, Assigned: julien.pierre)

Details

Attachments

(1 file, 2 obsolete files)

This freebl memory leak is seen on every NSS initialization . 88 1 0x813c648 calloc < PR_Calloc < PR_NewLock < 0xd15041f9< PR_CallOnce < 0xd15042ea< RNG_RNGInit < nsc_CommonInitialize < NSC_Initialize < secmod_ModuleInit < SECMOD_LoadPKCS11Module < SECMOD_LoadModule < SECMOD_LoadModule < nss_Init < NSS_Initialize < main 88 1 0x813c1a8 calloc < PR_Calloc < 0xd154d5aa< 0xd15041bf< PR_CallOnce < 0xd15042ea< RNG_RNGInit < nsc_CommonInitialize < NSC_Initialize < secmod_ModuleInit < SECMOD_LoadPKCS11Module < SECMOD_LoadModule < SECMOD_LoadModule < nss_Init < NSS_Initialize < main
Attached patch change default macro (obsolete) — Splinter Review
I was pleasantly surprised that the shutdown code for the RNG context was already written. But it was #ifdef LEAK_TEST . IMO, this was the wrong macro name to use. The Mozilla browser is no longer the main user of softoken. softoken gets delivered as part of several operating systems and used in many kinds of applications, many of which load and unload it dynamically. This memory needs to be cleaned up for these applications, and not just for testing purposes. I kept the code under ifdef KEEP_ENTROPY if the browser wants to have the existing behavior, but it should be the exception rather than the rule.
Attachment #239755 - Flags: superreview?(nelson)
Attachment #239755 - Flags: review?(rrelyea)
Comment on attachment 239755 [details] [diff] [review] change default macro r=nelson, with one change: >-static PRCallOnceType coRNGInit = { 0, 0, 0 }; >+static PRCallOnceType pristineCallOnce; >+static PRCallOnceType coRNGInit; Declare pristineCallOnce to be const.
Attachment #239755 - Flags: superreview?(nelson) → review+
I know about this leak. I don't know why I looked at it. I can't find an existing bug, so perhaps I found it when I reviewed the code for FIPS validation. When I looked at this issue, I came up with the following solution. Add a static RNGContext structure and make globalrng point to it: static RNGContext therngcxt; Then, change rng_init to set globalrng to &therngcxt, and change freeRNGContext to not call PORT_ZFree(globalrng, sizeof *globalrng); Instead, manually zero the globalrng->xxx fields except the globalrng->XKEY buffer, which holds entropy. (Note: we can simply declare globalrng as: static RNGContext globalrng; But this requires changing global->xxx to globalrng.xxx and globalrng to &globalrng throughout the file, which makes the patch larger. It also requires a new way to test if rng_init succeeded. Right now we simply test globalrng != NULL.) Finally, remove the #ifdef LEAK_TEST from the softoken.
Attached patch patch with Wan-Teh's suggestion (obsolete) — Splinter Review
I really like Wan-Teh's suggestion of using a static RNGContext, and having globalrng be set to its address in rng_init. So I wrote the attached patch, which incorporates Julien's previous patch plus Wan-Teh's suggestions (I think). I have not done extensive testing with this patch.
Attachment #239823 - Flags: review?(julien.pierre.bugs)
Comment on attachment 239823 [details] [diff] [review] patch with Wan-Teh's suggestion Thank you, Nelson. I just thought of another issue. Since the internal state of the RNG is a critical security parameter (CSP) of the crypto module, let's hash it with B_HASH_BUF as a form of zeroization without losing the entropy. The 'Xj', 'avail' and 'seedCount' fields also need to be zeroed during RNG shutdown or initialization.
Attachment #239823 - Flags: review?(julien.pierre.bugs) → review-
I'll let you write that patch, Wan-Teh.
It occurred to me over the week-end that as part of bug 115951, libfreebl3.so will be unloaded. Thus, all global variables relating to the RNG context, static or not, will be freed from memory during C_Finalize, and there will be no way to keep the entropy around after C_Finalize . When C_Initialize gets called a second time, it will load libfreebl again, and get a new "static" variable with a pristine state. Thus, attachment 239755 [details] [diff] [review] can be simplified to just call RNG_RNGShutdown unconditionally in pkcs11.c, and the long comment can be removed . Even if libfreebl didn't get unloaded, in a properly-written NSS and PKCS#11 application, NSS_Initialize or softoken's C_Initialize should not be called concurrently from multiple threads. And C_Finalize's behavior is undefined if there are any pending calls to other PKCS#11 APIs in other threads, except for C_WaitForSlotEvent . As long as C_WaitForSlotEvent doesn't need to access the RNG context, I think we should be OK with the simpler patch. I think the dynamic unloading of libfreebl will automatically take care of the zeroing of the context mentioned by Wan-Teh in issue 5, so there should be no zero'ing code needed to write if we fix bug 115951 .
Note that with the current architecture of softoken with a freebl shared library on all platforms, the entropy will always be lost in the BL_Cleanup call that follows RNG_RNGShutdown, which will dlclose freebl once bug 115951 is fixed . So, it's not particularly useful that RNG_RNGShutdown preserves the entropy . But as a CSP, the entropy still needed to be zero'ed/reset since it is a CSP, even if it got unloaded. If we ever change back to freebl static libraries, or some other architecture such as filtee library that doesn't require unloading freebl in C_Finalize, then this code will allow the entropy to be reused in programs that call C_Initialize / C_Finalize / C_Initialize again without unloading the softoken .
Attachment #239755 - Attachment is obsolete: true
Attachment #239823 - Attachment is obsolete: true
Attachment #240089 - Flags: superreview?(wtchang)
Attachment #240089 - Flags: review?(nelson)
Attachment #239755 - Flags: review?(rrelyea)
Attachment #240089 - Flags: review?(nelson) → review+
Comment on attachment 240089 [details] [diff] [review] Implements Wan-Teh's suggestion in comment 5 r=wtc. I found that we actually documented that PRCallOnceType variables should be initialized to all zeros: http://www.mozilla.org/projects/nspr/reference/html/prinit.html#15926 http://developer.mozilla.org/en/docs/PR_CallOnce So, it's okay to get rid of pristineCallOnce andjust use memset to reset coRNGInit. You may want to change the word "free" in the comment and function name here: /* * Free the global RNG context */ static void freeRNGContext() because it's no longer accurate. Perhaps we can use "clean up" or "shut down".
Attachment #240089 - Flags: superreview?(wtchang) → superreview+
Let's just stay with the "pristine" copy and doing a structure copy. It's more efficient than using memset for such a small struct.
Thanks for the reviews. I changed the comment for freeRNGContext to say "Clean up" . I checked the patch in to the tip : Checking in freebl/prng_fips1861.c; /cvsroot/mozilla/security/nss/lib/freebl/prng_fips1861.c,v <-- prng_fips1861.c new revision: 1.25; previous revision: 1.24 done Checking in softoken/pkcs11.c; /cvsroot/mozilla/security/nss/lib/softoken/pkcs11.c,v <-- pkcs11.c new revision: 1.133; previous revision: 1.132 done And to NSS_3_11_BRANCH : Checking in freebl/prng_fips1861.c; /cvsroot/mozilla/security/nss/lib/freebl/prng_fips1861.c,v <-- prng_fips1861.c new revision: 1.22.2.2; previous revision: 1.22.2.1 done Checking in softoken/pkcs11.c; /cvsroot/mozilla/security/nss/lib/softoken/pkcs11.c,v <-- pkcs11.c new revision: 1.112.2.18; previous revision: 1.112.2.17 done Christophe, this is also an init-time leak that we had chosen to ignore in previous leak tests. Please remove stacks that include RNG_RNGInit from the ignore list from future leak tests.
Status: NEW → RESOLVED
Closed: 18 years ago
Priority: -- → P2
Resolution: --- → FIXED
Target Milestone: --- → 3.11.4
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: