Closed Bug 391815 Opened 17 years ago Closed 8 years ago

leaks in new trunk code apparently undetected by NSS leak testing

Categories

(NSS :: Test, defect, P2)

defect

Tracking

(Not tracked)

RESOLVED WONTFIX

People

(Reporter: nelson, Assigned: slavomir.katuscak+mozilla)

References

(Blocks 1 open bug)

Details

(Keywords: memory-leak)

Attachments

(2 files, 1 obsolete file)

Today, L. David Baron filed several bugs reporting leaks in NSS on the trunk. As far as I know, none of these leaks were previously detected by NSS's automated leak checking in nightly builds or in tinderbox. Or, more importantly, I am unaware of bug reports having been filed about those leaks. So, our automated leak checking needs to be enhanced to detect all these leaks. This should be done BEFORE those leaks are fixed. Consequently, I am marking this bug as blocking those other bugs. The fixes to those other bugs should not be checked in until the leak checking is fixed to detect libPKIX bugs. Of course, if bugs have been filed about this already, then the new bugs filed today are duplicates of those. In that case, please mark the new bugs as duplicates of the older ones (if any).
NSS memory leak checking has now only very limited scope. We check only strsclnt->selfserv tests with all combinations of mode/freebl/ciphers. Adding new tests driven just by the leaks that were just found doesn't make much sense. It would be better to specify which areas should be covered and set priorities for them.
Slavo, at least two of the reported bugs should be seen in any process that initializes and shuts down NSS. If we are not leak checking at least one OCSP client, that needs to be remedied ASAP.
Priority: -- → P2
Nelson, AFAIK the tinderbox and nightlies aren't running with the PKIX code or even building it still. As long as that's the case, there is no chance for any memory leak checking tests to detect PKIX memory leaks.
Julien, please examine http://lxr.mozilla.org/security/source/security/nss/lib/nss/nssinit.c#519 There, in the common static initialization function, nss_Init(), we see a call to PKIX_Initialize(). That code is not conditionally compiled. It is compiled into all NSS trunk builds. All NSS programs that successfully initialize NSS also initialize PKIX. And, as David Baron has shown (in numerous bug reports) the result is several leaks. Now, please explain to me how there is "no chance" that our leak checking code will detect these leaks, unless it tests only programs that never initialize NSS (which I doubt).
Sorry, I didn't realize these were all initialization leaks. I meant that the runtime for verifying and building chains is not exercised currently in our leak tests with strsclnt and selfserv. You can disregard comment 3 .
Attached patch Patch v1. (obsolete) — Splinter Review
This patch contains: -OCSP memleak testing (wrapped standard OCSP tests by memleak tools) -added memleak wrapping functions to memleak.sh (run_command_dbx, run_command_valgrind) + changes in code to use this functions -new stacks (new bugs reported 393174, 393182, 393183 and 395366 + new stacks for bug 291225) -some small fixes Tested on all Tinderboxes running memleak tests on trunk.
Attachment #280050 - Flags: review?(nelson)
Comment on attachment 280050 [details] [diff] [review] Patch v1. I have some review comments, but I did not do a full review. I am asking Alexei to do the full review, since this is IOPR code. My comments are all about the file memleak/ignored > #291225 > */main/NSS_Initialize/*/SECMOD_LoadModule/** > */SECMOD_LoadModule/** > */SECMOD_LoadPKCS11Module/** > */*/SECMOD_LoadModule/** >+ocspclnt/NSS_Init/nss_Init/SECMOD_LoadModule/** >+ocspclnt/main/NSS_Init/SECMOD_LoadModule/** >+ocspclnt/main/NSS_Init/nss_Init/SECMOD_LoadModule/** There are no comments in bug 291225 about leaks in ocspclnt, nor how to reproduce them. Please add comments in that bug on EXACTLY how to reproduce these leaks. >+#393174 >+ocspclnt/get_cert_status/CERT_CheckOCSPStatus/ocsp_GetOCSPStatusFromNetwork/ocsp_GetEncodedOCSPResponseForSingleCert/ocsp_GetEncodedOCSPResponseFromRequest/fetchOcspHttpClientV1/pkix_pl_HttpDefaultClient_RequestCreateFcn/pkix_pl_HttpDefaultClient_RequestCreate/pkix_HttpCertStore_FindSocketConnection/pkix_pl_Socket_CreateByHostAndPort/pkix_pl_Socket_CreateClient/PR_NewTCPSocket/PR_Socket/pt_SetMethods/_PR_Getfd/PR_Malloc >+ocspclnt/main/get_cert_status/CERT_CheckOCSPStatus/ocsp_GetOCSPStatusFromNetwork/ocsp_GetEncodedOCSPResponseForSingleCert/ocsp_GetEncodedOCSPResponseFromRequest/fetchOcspHttpClientV1/pkix_pl_HttpDefaultClient_RequestCreateFcn/pkix_pl_HttpDefaultClient_RequestCreate/pkix_HttpCertStore_FindSocketConnection/pkix_pl_Socket_CreateByHostAndPort/PKIX_PL_Object_Alloc/PKIX_PL_Malloc/PR_Malloc >+ocspclnt/main/get_cert_status/CERT_CheckOCSPStatus/ocsp_GetOCSPStatusFromNetwork/ocsp_GetEncodedOCSPResponseForSingleCert/ocsp_GetEncodedOCSPResponseFromRequest/fetchOcspHttpClientV1/pkix_pl_HttpDefaultClient_RequestCreateFcn/pkix_pl_HttpDefaultClient_RequestCreate/pkix_HttpCertStore_FindSocketConnection/pkix_pl_Socket_CreateByHostAndPort/PKIX_PL_Object_Alloc/PR_NewLock/PR_Calloc/calloc >+ >+#393182 That is the wrong bug number. Should be 393181, I think. >+ocspclnt/main/CERT_SetOCSPDefaultResponder/** >+ >+#393183 >+ocspclnt/main/PL_CreateOptState/** >+ >+#395366 >+ocspclnt/main/PR_GetSpecialFD/_PR_ImplicitInitialization/**
Attachment #280050 - Flags: review?(nelson) → review?(alexei.volkov.bugs)
Comment on attachment 280050 [details] [diff] [review] Patch v1. r- for ocsp changes. Just too many dups. It will create a problem to support such code. Please use a new argument to the function to flag, the it will be operate in mem leak checking mode. On the other hand, modifications for memleak.sh seems to me resolved many duplications. So r+ for this file changes.
Attachment #280050 - Flags: review?(alexei.volkov.bugs) → review-
Two big changes: - DBX options changed from checking leaks to checking memuse (will detect also blocks in use when program finishing) - in Valgrind set leak resolution to high With these changes tools can detect more memory leaks than now.
Attachment #281752 - Flags: review?(julien.pierre.boogz)
Attachment #281752 - Flags: review?(nelson)
Attachment #281752 - Flags: review?(julien.pierre.boogz) → review+
Comment on attachment 281752 [details] [diff] [review] Enabled new DBX and Valgrind features. canceling second review request. One review is enough for the trunk.
Attachment #281752 - Flags: review?(nelson)
Comment on attachment 281752 [details] [diff] [review] Enabled new DBX and Valgrind features. Please prepare an additional patch with all the new leak stacks for the "ignored" file before comitting this patch. I don't want the tree to be orange for more than one day.
Checking in ignored; /cvsroot/mozilla/security/nss/tests/memleak/ignored,v <-- ignored new revision: 1.12; previous revision: 1.11 done Checking in memleak.sh; /cvsroot/mozilla/security/nss/tests/memleak/memleak.sh,v <-- memleak.sh new revision: 1.9; previous revision: 1.8 done For now I use temporary version of ignored leaks. I expect some failures (it's already failing), but I need results from all testing machines to update there all known leaks.
Attachment #281752 - Flags: superreview?(nelson)
Comment on attachment 281752 [details] [diff] [review] Enabled new DBX and Valgrind features. second r=nelson for branch
Attachment #281752 - Flags: superreview?(nelson) → review+
Can we re-enable the call to PKIX_Shutdown now? The PKIX init leaks are adding lots of noise to the memory leak logs.
The plan is to get all the existing leak stacks into the "ignored" file and see the tinderbox go green first, THEN to fix the PKIX_Shutdown bug. BTW, the tinderbox has gone green for Solaris x86, but not for Solaris sparc nor for Linux x86. This implies that there are leaks occurring on those other platforms that are not occurring on solaris x86. IMO, that is worthy of investigation. It MIGHT be the case that the remaining leaks that are keeping the Linux and Solaris/sparc trees orange are leaks in OS shared libs.
No longer blocks: 391770
Keywords: mlk
Replacement of patch v1. 1. Wrapped OCSP tests by debug tool. 2. Modified memleak script to use one function for testing different tools (selfserv, strsclnt, ocspclnt) + added OCSP tests.
Attachment #280050 - Attachment is obsolete: true
Attachment #283195 - Flags: review?(alexei.volkov.bugs)
Small fix for patch v3, on line 168 in memleak.sh "1" instead of 1.
Comment on attachment 283195 [details] [diff] [review] Added OCSP testing. I'm not going to do a full review of this patch, but here is one small change I request. Using else after return is generally a cause of unnecessary indentation. Instead of this: >+ rm -f $outFile >+ return $ret >+ else >+ OCSP_ATTR="-d $dbDir -S $cert $clntParam" >+ ${RUN_COMMAND_DBG} ocspclnt ${OCSP_ATTR} >+ fi please code this: >+ rm -f $outFile >+ return $ret >+ fi >+ OCSP_ATTR="-d $dbDir -S $cert $clntParam" >+ ${RUN_COMMAND_DBG} ocspclnt ${OCSP_ATTR}
Comment on attachment 283195 [details] [diff] [review] Added OCSP testing. r=alexei.
Attachment #283195 - Flags: review?(alexei.volkov.bugs) → review+
Checking in iopr/ocsp_iopr.sh; /cvsroot/mozilla/security/nss/tests/iopr/ocsp_iopr.sh,v <-- ocsp_iopr.sh new revision: 1.4; previous revision: 1.3 done Checking in memleak/memleak.sh; /cvsroot/mozilla/security/nss/tests/memleak/memleak.sh,v <-- memleak.sh new revision: 1.10; previous revision: 1.9 done
Comment on attachment 283195 [details] [diff] [review] Added OCSP testing. Successfully running OCSP tests in trunk. Need second review before checkin to branch.
Attachment #283195 - Flags: review?(nelson)
Comment on attachment 283195 [details] [diff] [review] Added OCSP testing. I'm passing this review request to Julien
Attachment #283195 - Flags: review?(nelson) → review?(julien.pierre.boogz)
Slavo, Could you please update the patch with better descriptions ? Having patch v1, v2, and v3 just doesn't help. You can go into details of implementation in the comment, but the patch description should state what the patch does, and if it's for the trunk, branch, or both.
I am still very confused by the scope of this bug and the attached patches. 1) The bug is targeted at the trunk, and the description is about the trunk. If we want to check in to the branch, they should both be updated. 2) Patch v1 contains changes relating to OCSP, memleak wrapping, new stacks, small functions. Do those really all belong in this bug ? 3) Patch v2 contains some changes to the way we run valgrind and dbx . It appears from comment 12 that it was checked in to the trunk . Was it checked in to the branch also ? Should it be ? 3) Patch v3 seems to be a hybrid of v1 and v2 . Is this for the branch or for the trunk ? Is it really an update ? There was a checkin made in comment 20 that appears to have been made to the trunk.
(In reply to comment #23) > Could you please update the patch with better descriptions ? Having patch v1, > v2, and v3 just doesn't help. You can go into details of implementation in the > comment, but the patch description should state what the patch does, and if > it's for the trunk, branch, or both. Usually I add description to comment when I attach some patch. For patch v3 there are changes described in comment 16 - there are 2 major things in this patch - 1. adding OCSP tests and 2. using one function to test all tools instead of having separate function for every tool (run_command_dbx instead of having run_selfserv_dbx, run_strsclnt_dbx,... and similar for valgrind). > 1) The bug is targeted at the trunk, and the description is about the trunk. If > we want to check in to the branch, they should both be updated. I updated target milestone. Changes in patch are also useful in branch, although problem was detected in trunk. > > 2) Patch v1 contains changes relating to OCSP, memleak wrapping, new stacks, > small functions. Do those really all belong in this bug ? When working on OCSP, I decided to use one function to wrap any tool (selfserv, strsclnt, ocspclnt), to prevent adding many new functions. So also testing of selfserv and strsclnt is affected. You can ignore stacks (and previous patches), they are not in patch v3. > 3) Patch v2 contains some changes to the way we run valgrind and dbx . It > appears from comment 12 that it was checked in to the trunk . Was it checked in > to the branch also ? Should it be ? Yes it was, I forgot to write it there. > 3) Patch v3 seems to be a hybrid of v1 and v2 . Is this for the branch or for > the trunk ? Is it really an update ? There was a checkin made in comment 20 > that appears to have been made to the trunk. Patch v1 was planned to add OCSP testing, but was not approved. Patch v2 fixed bugs in dbx and valgrind (this was checked to both trunk and branch) - this patch was critical, we needed to fix problem in both branches. Patch v3 adds OCSP testing (doesn't have anything with v2), it's better replacement for v1, and was written after v2 was checked into both branches. It contains changes which are also needed for branch.
Target Milestone: 3.12 → 3.11.8
Slavo, version numbers should be used for different revisions to the same conceptual patch. If you have multiple patches, each trying to make a different change, then they are not versions of the same patch. They need to have names that identify what they're doing, and (if they only apply to branch or only to trunk) they should identify whether they apply to branch or trunk. If there are multiple revisions to one of them, then it makes sense to use version numbers for the different versions of the same conceptual change. You can go back and change the names (descriptions) of the patches after they're attached. You should do that, using the info you supplied at the bottom of comment 25.
Attachment #281752 - Attachment description: Patch v2. → Enabled new DBX and Valgrind features.
Attachment #283195 - Attachment description: Patch v3. → Added OCSP testing.
Attachment #283195 - Flags: review?(julien.pierre.boogz) → review+
Checked into branch + updated ignored stacks. Checking in ignored; /cvsroot/mozilla/security/nss/tests/memleak/ignored,v <-- ignored new revision: 1.1.2.20; previous revision: 1.1.2.19 done Checking in memleak.sh; /cvsroot/mozilla/security/nss/tests/memleak/memleak.sh,v <-- memleak.sh new revision: 1.1.2.15; previous revision: 1.1.2.14 done
Nelson, the main problem why bugs were not detected is already fixed. Some leaks are not detected because they are not covered by our tests. Do you still insist on enhancing tests (not trivial, some leaking functions are just not used by NSS tools) or can I close this bug as fixed ?
Target Milestone: 3.11.8 → 3.11.9
This bug is a tracking bug for leaks found on the trunk. It is not fixed until the bugs that block it are fixed. If there are fixes needed for the branch, they should be in separate bugs. Slavo, yes, I insist on enhancing the tests. That's the purpose of QE.
Target Milestone: 3.11.9 → 3.12
This bug should remain open until object leak testing is integrated into the trunk.
Alexei, can we close this bug, or is there anything more to do ?
Unsetting target milestone in unresolved bugs whose targets have passed.
Target Milestone: 3.12 → ---
OBE
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → WONTFIX
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: