Open Bug 1443020 Opened 2 years ago Updated 1 year ago
Crash on shutdown in NSS 3
.35 in sdb _Close()
We've incorporated an update of NSS (from 3.31.1 to 3.35) into the latest Pale Moon version (mainly to update our TLS 1.3 support) and we've started to receive reports of appcrashes on browser shutdown/restart. Running with a debugger attached, it pointed to NSS with memory signature e5e5e5e5 which would indicate a UAF situation if I understand the fill values correctly. Marking sensitive because suspected UAF. I'll include the call stack trace from VS2013 below. It seems to be easiest to trigger with Greasemonkey installed but the reason for that is unknown. Is there anything specific about 3.35 that we might have missed in our update causing this, or is this a pure NSS lib bug? The thread 0x2588 has exited with code 0 (0x0). Unhandled exception at 0x5C5320A8 (mozglue.dll) in palemoon.exe: 0xC0000005: Access violation reading location 0xE5E00000. > mozglue.dll!je_free(void * ptr) Line 6461 C ptr = 0xe5e5e5e5 softokn3.dll!sdb_Close(SDBStr * sdb) Line 1694 C softokn3.dll!sftkdb_CloseDB(SFTKDBHandleStr * handle) Line 1459 C softokn3.dll!sftk_freeDB(SFTKDBHandleStr * handle) Line 2388 C softokn3.dll!sftk_DBShutdown(SFTKSlotStr * slot) Line 2707 C softokn3.dll!SFTK_ShutdownSlot(SFTKSlotStr * slot) Line 2739 C softokn3.dll!SFTK_DestroySlotData(SFTKSlotStr * slot) Line 2753 C softokn3.dll!nscFreeAllSlots(int moduleIndex) Line 2946 C softokn3.dll!nsc_CommonFinalize(void * pReserved, int isFIPS) Line 3133 C softokn3.dll!NSC_Finalize(void * pReserved) Line 3228 C nss3.dll!SECMOD_UnloadModule(SECMODModuleStr * mod) Line 605 C nss3.dll!SECMOD_SlotDestroyModule(SECMODModuleStr * module, int fromSlot) Line 924 C nss3.dll!PK11_DestroySlot(PK11SlotInfoStr * slot) Line 460 C nss3.dll!PK11_FreeSlot(PK11SlotInfoStr * slot) Line 472 C nss3.dll!SECMOD_DestroyModule(SECMODModuleStr * module) Line 895 C nss3.dll!SECMOD_DestroyModuleListElement(SECMODModuleListStr * element) Line 942 C nss3.dll!SECMOD_DestroyModuleList(SECMODModuleListStr * list) Line 956 C nss3.dll!SECMOD_Shutdown(...) Line 67 C nss3.dll!nss_Shutdown() Line 1164 C nss3.dll!NSS_Shutdown() Line 1222 C xul.dll!nsNSSComponent::ShutdownNSS() Line 1194 C++ xul.dll!nsNSSComponent::Observe(nsISupports * aSubject, const char * aTopic, const wchar_t * someData) Line 1323 C++ xul.dll!nsObserverList::NotifyObservers(nsISupports * aSubject, const char * aTopic, const wchar_t * someData) Line 100 C++ xul.dll!nsObserverService::NotifyObservers(nsISupports * aSubject, const char * aTopic, const wchar_t * aSomeData) Line 333 C++ xul.dll!nsXREDirProvider::DoShutdown() Line 903 C++
Is it possible you're calling ShutdownNSS() or DoShutdown() twice?
Literally all I did was upgrade the library, actually from 3.32.1 to 3.35-rtm, with paired-up NSPRs, and this started happening. A downgrade resolved it again. No other surrounding code was changed. Having an old greasemonkey extension installed would quite reliably reproduce it (1.15) but it was not limited to having the extension installed, just much harder to reproduce. Of note I did have to remove the "inline" keywords from "static inline" declared functions because VS2013 does not like that -- however that should have no functional impact.
Some users apparently also crashed during startup (intermittently). I have not been able to reproduce that myself.
(In reply to Mark Straver from comment #2) > Literally all I did was upgrade the library, That doesn't answer the question though. It's possible you were calling it twice all along and not noticing ill effects, and in this version where we changed the database code maybe it starts to matter more. The presence of addons can affect the shutdown order and timing which might be why Greasemonkey can make this worse. This doesn't look exploitable: none of these objects can be influenced by web content.
Summary: Crash on shutdown in NSS 3.35 → Crash on shutdown in NSS 3.35 in sdb_Close()
Well if I was calling it twice all along, then that must be something that is present in Firefox's security manager code. This has not been touched in any way post-forking by us, that I am aware of. From the call stack, it looks like it's related to unloading security modules (perhaps FIPS?); is there anything specific that was changed in that area around the time these parts of NSS were rewritten? That might provide a pointer to home in on the root cause here.
Of course, if ShutdownNSS() is unsafe to be called more than once, it would probably benefit from a mutex lock of some sort. That might be something the lib could use anyway in this case to prevent similar scenarios.
I did some further testing, including putting a simple bool re-entrant guard on nsNSSComponent::ShutdownNSS(), which would have prevented any crash caused by calling it multiple times. The crashes did NOT go away, so I'm pretty sure it's not because of calling ShutdownNSS() or DoShutdown() more than once. As far as I can tell it's deeper in NSS. The crash trigger is a "profile-before-change" event on shutdown, which causes a call to ShutdownNSS() -> NSS_Shutdown -> SECMOD_Shutdown -> SECMOD_DestroyModuleList which is where things start to go awry. The list is populated, by the looks of it, but module->slots has been freed (I see the poison value e5 there). As a result PK11_FreeSlot crashes.
An update to this story. Our more recent work has been on a fork of mozilla-ESR52, with multiple applications building on it. Updating NSPR and NSS to 3.35 on that source tree caused different issues, specifically permissions database access problems, however, only in one of the three applications (ironically, the one closest to Firefox). Going by all information gathered so far, the most likely cause for these problems that are all database related is that with the rewrite of parts of NSS in 3.35, there is now confusion created for applications which sqlite driver should be called: one internal to NSS or one in use in the source tree for higher functions. If both are called, somehow, it would explain the double shutdown request and crash here, and issues accessing database data due to low-level concurrent calls. It would be very timing-dependent in that case which would explain the intermittent or irregular nature of these problems, and may be dodged so far by chance with Firefox due to a different VC compiler. Of note: we've now seen issues with both vs2013 and vs2015 with this library version.
You need to log in before you can comment on or make changes to this bug.