Closed Bug 492385 Opened 12 years ago Closed 12 years ago
crash freeing named CRL entry on shutdown
Steps to reproduce: 1) Launch Fennec with js console 2) Install Add-on 3) Close Fennec Window (don't click the restart button) 4) Using Today Screen's task manager, close the error console The steps maybe simpler than that, but I've reproduced the same crash twice with these steps. stack: 0x03f9ff64 > nssutil3.dll!PORT_ZFree_Util(void* ptr = 0x79734165, unsigned int len = 2768266094) Line: 161, Byte Offsets: 0x28 C nssutil3.dll!SECITEM_ZfreeItem_Util(SECItemStr* zap = 0x68ea9570, int freeit = 1) Line: 278, Byte Offsets: 0x34 C nss3.dll!NamedCRLCacheEntry_Destroy(NamedCRLCacheEntryStr* entry = 0x68eb0370) Line: 1321, Byte Offsets: 0x78 C nss3.dll!FreeNamedEntries(PLHashEntry* he = 0x68ea94e0, int i = 2, void* arg = 0x2fbbed10) Line: 1372, Byte Offsets: 0xbc C 0x010c2b1c nss3.dll!ShutdownCRLCache(void) Line: 1426, Byte Offsets: 0x118 C nss3.dll!NSS_Shutdown(void) Line: 883, Byte Offsets: 0x64 C xul.dll!nsNSSComponent::ShutdownNSS(void) Line: 1750, Byte Offsets: 0x1c8 C++ xul.dll!nsNSSComponent::DoProfileBeforeChange(nsISupports* aSubject = 0x00000000) Line: 2494, Byte Offsets: 0xe0 C++ xul.dll!nsNSSComponent::Observe(nsISupports* aSubject = 0x00000000, const char* aTopic = 0xa500636e, const wchar_t* someData = 0x79734165) Line: 2167, Byte Offsets: 0x168 C++ xul.dll!nsObserverList::NotifyObservers(nsISupports* aSubject = 0x00000000, const char* aTopic = 0xa500636e, const wchar_t* someData = 0x79734165) Line: 128, Byte Offsets: 0xa8 C++ xul.dll!nsObserverService::NotifyObservers(nsISupports* aSubject = 0x00000000, const char* aTopic = 0xa500636e, const wchar_t* someData = 0x79734165) Line: 184, Byte Offsets: 0x10c C++ xul.dll!nsXREDirProvider::DoShutdown(void) Line: 878, Byte Offsets: 0x1c4 C++ xul.dll!ScopedXPCOMStartup::~ScopedXPCOMStartup(void) Line: 993, Byte Offsets: 0x48 C++ xul.dll!XRE_main(int argc = 2037596517, char** argv = 0x00000000, nsXREAppData* aAppData = 0xa500636e) Line: 3407, Byte Offsets: 0x2604 C++ 0x00012220 0x00012404 0x0001295c 0x03f67274
Guys, we need to know which NSS CVS tag you're using.
Assignee: nobody → alexei.volkov.bugs
Priority: -- → P1
Summary: crash [@PORT_ZFree_Util] on shutdown → crash freeing named CRL entry on shutdown
Target Milestone: --- → 3.12.4
This may have already been fixed on the CVS trunk. Can you try a test using NSS built from the CVS trunk? If it's not, then it must wait until the two guys who work on that code return, one from vacation and the other from sick leave.
Version: unspecified → 3.12.3
I am back from sick leave. However, the stack provided does not match NSS 3.12.3 . The NamedCRL code didn't get added until later. Please provide an actual NSS CVS tag.
The NamedCRL code was added on the trunk for 3.12.4, which hasn't RTM'ed yet. It was added over a weekend a few weeks ago. The last change was on 4/24. The CVS revisions for crl.c are 1.62 through 1.66 . Please indicate which one you have. Looking at the stack in comment 0, I cannot exactly tell which of the two SECITEM_ZfreeItem you are tripping on. It could be the one freeing entry->crl, or entry->canonicalizedNamed . Can you figure out which one of the two it is ? A debug build would help a lot if you don't already have one. There are a lot of assertions in the code to detect corruption problems in the hash table. Also, please print out the content of the entry structure when you get this crash. I just reviewed the code and found a minor problem, but it would only occur if the NSPR hash table calls fail. I don't think it will solve your problem. But I will attach a patch.
Version: 3.12.3 → trunk
Comment on attachment 376829 [details] [diff] [review] Don't destroy entries if they couldn't be removed from the hash table Julien, I have a question about this patch. It does what you said it does, but I wonder if that is what we want to do. Q: If an attempt to remove an entry from a hash table fails, is the correct interpretation that the entry is still in the table? Or is it that the entry was already not in the table? Thinking out loud here ... I seem to recall seeing some warnings about attempts to insert entries into the table that are already in the table. I wonder if perhaps the code tries to remove an entry as many times as it has tried to insert it, so that the first attempt to remove succeeds and the rest fail. Maybe we need reference counting on these entries...
Nelson, The old entry was obtained from the table with cert_FindCRLByGeneralName . See http://mxr.mozilla.org/security/source/security/nss/lib/certdb/crl.c#3170 . The table is locked. So, we must assume that the old entry is still in the table. I don't know of any good reason why the NSPR hash table call to remove the old entry would fail. I am just trying to do proper error checking. This is probably a more theoritical case than anything else. If we insert a new entry in the table with the same index, the old entry is replaced in the table. Unless somebody still has a pointer to that old entry and subsequently frees it, that old entry is leaked. The comment about the leak is at the time of the old entry removal failure, though the actual leak happens later when a new entry is inserted to replace it. In summary, yes, I think the patch is doing the right thing.
Attachment #376829 - Flags: review?(nelson) → review+
Comment on attachment 376829 [details] [diff] [review] Don't destroy entries if they couldn't be removed from the hash table I'm not wild about this, but I'd rather have it leak than crash.
I confirmed this crash. The CVS tag is NSS_3_12_4_FIPS1 according to bug 490864.
(In reply to comment #5) > Looking at the stack in comment 0, I cannot exactly tell which of the two > SECITEM_ZfreeItem you are tripping on. In my case, it was entry->crl. I think the length in PORT_ZFree is incredibly big (it is 2768266094 in comment #1). Is it an ordinary value?
entry->crl should be cleared with 0 when CERT_CacheCRL fails in addCRLToCache.
Comment on attachment 377095 [details] [diff] [review] Clear entry->crl address with 0 Hiroyuki, Good catch. The assignment should be of NULL, not 0, however. The SECItem length field should not be large. It probably indicated a corrupt/previously freed block. I will need to start running the browser with the current NSS code manually, since unfortunately our QA tests still have zero coverage for this code.
Attachment #377095 - Flags: review?(julien.pierre.boogz) → review+
Use NULL instead of 0. Thank you, Julien.
I don't SR Hg patches. Turn this into a CVS patch which works with bugzilla's diff context tool, and then I'll be happy to SR it.
Nelson, out of curiosity, what's not working for you with bugzilla's diff context tool?
I could not find bugzilla's diff context tool so I recreated CVS diff.
Comment on attachment 377270 [details] [diff] [review] CVS diff Oops! I am sorry. I checked sr+ by myself.
Attachment #377270 - Flags: superreview+ → superreview?(nelson)
Brad, What's not working is that : a) we can't tell what revision of the source the patch is made against b) we can't extended diffs . When clicking on diff, if typing a larger number of lines, it doesn't work. Hiroyuki, Thanks for attaching a CVS patch. Please note that we never "carry over" r+ or sr+ for NSPR or NSS patches . All new patches need new reviews. Right now, the NSS trunk only needs one review.
Attachment #377270 - Flags: review+ → review?(julien.pierre.boogz)
Attachment #377270 - Flags: review?(julien.pierre.boogz) → review+
(In reply to comment #20) > Hiroyuki, > > Thanks for attaching a CVS patch. Please note that we never "carry over" r+ or > sr+ for NSPR or NSS patches . All new patches need new reviews. Right now, the > NSS trunk only needs one review. Thank you for your advices. I will care about these rules next time.
In reply to comment 17 and comment 18, Go click on the "diff" link for that Hg patch in the bug above. That will take you to https://bugzilla.mozilla.org/attachment.cgi?id=377142&action=diff At the top of that page (and any diff page), you will see Context: (Patch / File / [3 ]) That is bugzilla's diff context tool. It allows you to see the patch with different amounts of context. Click on "File", or change "3" to some other number (I commonly use 32), and press enter. See what you get. Now, try it with the CVS patch, and see how it works with that. https://bugzilla.mozilla.org/attachment.cgi?id=377270&action=diff (Thanks for the CVS patch, Hiroyuki!) It works with the CVS patch, but not with the Hg patch. There is a bugzilla bug against bugzilla about this. But frankly, given that the CVS repository, and not the Hg repositoriry, is the upstream repository for NSS and NSPR, patches should be against the upstream repository, not against some downstream repository.
Comment on attachment 377270 [details] [diff] [review] CVS diff sr=nelson, thanks!
Attachment #377270 - Flags: superreview?(nelson) → superreview+
I checked in attachment 376829 [details] [diff] [review] and attachment 377270 [details] [diff] [review] to the trunk. Checking in crl.c; /cvsroot/mozilla/security/nss/lib/certdb/crl.c,v <-- crl.c new revision: 1.67; previous revision: 1.66 done
Status: ASSIGNED → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
(In reply to comment #22) > In reply to comment 17 and comment 18, > Go click on the "diff" link for that Hg patch in the bug above. > That will take you to > https://bugzilla.mozilla.org/attachment.cgi?id=377142&action=diff > At the top of that page (and any diff page), you will see > Context: (Patch / File / [3 ]) > That is bugzilla's diff context tool. It allows you to see the patch with > different amounts of context. Click on "File", or change "3" to some other > number (I commonly use 32), and press enter. See what you get. > > Now, try it with the CVS patch, and see how it works with that. > https://bugzilla.mozilla.org/attachment.cgi?id=377270&action=diff I did not aware this. You are very kind!
You need to log in before you can comment on or make changes to this bug.