Closed
Bug 451024
Opened 16 years ago
Closed 16 years ago
certutil.exe crashes with Segmentation fault inside PR_Cleanup
Categories
(NSS :: Libraries, defect)
Tracking
(Not tracked)
RESOLVED
FIXED
3.12.2
People
(Reporter: maksr3, Assigned: wtc)
Details
Attachments
(1 file, 1 obsolete file)
1.01 KB,
patch
|
nelson
:
review+
|
Details | Diff | Splinter Review |
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.1) Gecko/2008081311 Minefield/3.0.1
Build Identifier: nss 3.12
May be I have a somewhat unusual build, but may be that is why the bug at least showed up. I've succeeded to build Firefox unofficially on Windows XP with gcc+mingw+buildtools. Firefox runs OK.
Then in the same build directories, inside mozilla/security/nss/cmd/certutil I've built certutil.exe with the command
/bin/make -C $(pwd) MAKE="/bin/make -j1" -j1 CC="gcc -g -mno-cygwin" SOURCE_MD_DIR=d:/Maksr/Produkti/Install/Mozilla/mozilla/dist DIST=d:/Maksr/Produkti/Install/Mozilla/mozilla/dist NSPR_INCLUDE_DIR=d:/Maksr/Produkti/Install/Mozilla/mozilla/dist/include/nspr NSPR_LIB_DIR=d:/Maksr/Produkti/Install/Mozilla/mozilla/dist/lib MOZILLA_CLIENT=1 NO_MDUPDATE=1 NSS_ENABLE_ECC=1 BUILD_TREE=d:/Maksr/Produkti/Install/Mozilla/mozilla BUILD_OPT=1 OPT_CODE_SIZE=1 NS_USE_GCC=1 NS_USE_NATIVE= OS_TARGET=WIN95
I add the .../mozilla/dist/bin to my PATH.
I copy the three files, cert8.db, key3.db, secmod.db from my Firefox profile directory to a temporary working directory and run the command certuil -L -d .
It lists a few certs, then it crashes with a Dialog box to send an error report to Microsoft.
I've tried to run it with gdb and I've found out, that at the end of the certutil_main function in nss/cmd/certutil/certutil.c there is a call to NSS_Shutdown, which unloads among others the library nssckbi.dll, but after that the call to PR_Cleanup wants to clean some thread and calls a destructor at address 0x63c0b96c that is no longer in mermory, because it has been unloaded with nssckbi.
Here is an excerpt from my session with gdb:
Breakpoint 9, certutil_main (argc=0, argv=0x22fc98, initialize=1)
at certutil.c:2894
2894 if ((initialized == PR_TRUE) && NSS_Shutdown() != SECSuccess) {
(gdb) info share
From To Syms Read Shared Object Library
0x7c901000 0x7c9afe88 Yes C:\WINNT\system32\ntdll.dll
0x7c801000 0x7c8f4bec Yes C:\WINNT\system32\kernel32.dll
0x77c11000 0x77c67d74 Yes C:\WINNT\system32\msvcrt.dll
0x643c1000 0x644943ec Yes D:\Maksr\Produkti\Install\Mozilla\mozilla\di
st\bin\nss3.dll
0x64f41000 0x64f727b0 Yes D:\Maksr\Produkti\Install\Mozilla\mozilla\di
st\bin\nspr4.dll
0x77dd1000 0x77e6ab38 Yes C:\WINNT\system32\advapi32.dll
0x77e71000 0x77f01464 Yes C:\WINNT\system32\rpcrt4.dll
0x77fe1000 0x77ff0884 Yes C:\WINNT\system32\secur32.dll
0x76b41000 0x76b6c8b4 Yes C:\WINNT\system32\winmm.dll
0x7e411000 0x7e49fde8 Yes C:\WINNT\system32\user32.dll
0x77f11000 0x77f568a0 Yes C:\WINNT\system32\gdi32.dll
0x71ad1000 0x71ad804c Yes C:\WINNT\system32\wsock32.dll
0x71ab1000 0x71ac6dc8 Yes C:\WINNT\system32\ws2_32.dll
0x71aa1000 0x71aa7324 Yes C:\WINNT\system32\ws2help.dll
0x622c1000 0x622c9110 Yes D:\Maksr\Produkti\Install\Mozilla\mozilla\di
st\bin\plds4.dll
0x6ce41000 0x6ce49140 Yes D:\Maksr\Produkti\Install\Mozilla\mozilla\di
st\bin\plc4.dll
0x68f81000 0x68f947cc Yes D:\Maksr\Produkti\Install\Mozilla\mozilla\di
st\bin\nssutil3.dll
0x6e1c1000 0x6e1e2718 Yes D:\Maksr\Produkti\Install\Mozilla\mozilla\di
st\bin\smime3.dll
0x64a81000 0x64aafe08 Yes D:\Maksr\Produkti\Install\Mozilla\mozilla\di
st\bin\softokn3.dll
0x61c01000 0x61c65f84 Yes D:\Maksr\Produkti\Install\Mozilla\mozilla\di
st\bin\sqlite3.dll
0x679c1000 0x679e2914 Yes D:\Maksr\Produkti\Install\Mozilla\mozilla\di
st\bin\nssdbm3.dll
0x696c1000 0x69707f04 Yes D:\Maksr\Produkti\Install\Mozilla\mozilla\di
st\bin\freebl3.dll
0x63c01000 0x63c4c030 Yes D:\Maksr\Produkti\Install\Mozilla\mozilla\di
st\bin\nssckbi.dll
(gdb) info symbol 0x63c0b96c
PR_Free in section .text
(gdb) next
2898 PR_Cleanup();
(gdb) info symbol 0x63c0b96c
No symbol matches 0x63c0b96c.
(gdb) next
Breakpoint 6, _PR_DestroyThreadPrivate (self=0x3d4790) at prtpd.c:245
245 {
(gdb) c
Continuing.
Program received signal SIGSEGV, Segmentation fault.
0x63c0b96c in ?? ()
(gdb)
(gdb) backtrace
#0 0x63c0b96c in ?? ()
#1 0x64f484ed in _PR_DestroyThreadPrivate (self=0x3d4790) at prtpd.c:265
#2 0x64f57fe2 in _PR_CleanupThread (thread=0x3d4790) at prcthr.c:62
#3 0x64f4d8ee in PR_Cleanup () at prinit.c:424
#4 0x00406f5d in certutil_main (argc=0, argv=0x22fc98, initialize=1)
at certutil.c:2898
#5 0x004095aa in main (argc=4, argv=0x3d2890) at certutil.c:2910
(gdb)
The most informing are two different answers to "info symbol 0x63c0b96c".
Maks Romih.
Reproducible: Sometimes
Steps to Reproduce:
1.
2.
3.
Comment 1•16 years ago
|
||
What version of NSPR are you using?
Where/how did you get it?
Assignee: nobody → wtc
Component: Tools → NSPR
Product: NSS → NSPR
QA Contact: tools → nspr
Version: unspecified → other
Assignee | ||
Comment 2•16 years ago
|
||
Maks,
PR_Free is the NSPR thread private data destructor installed by
this call:
http://mxr.mozilla.org/security/source/security/nss/lib/base/error.c#103
100 static PRStatus
101 error_once_function ( void)
102 {
103 return PR_NewThreadPrivateIndex(&error_stack_index, PR_Free);
104 }
I need to know the version of NSS you're using.
1. In mozilla/client.mk, what's the value of the NSS_CO_TAG variable?
2. In mozilla/security/nss/lib/nss/nss.h, what's the value of the
NSS_VERSION macro?
Reporter | ||
Comment 3•16 years ago
|
||
(In reply to comment #1)
> What version of NSPR are you using?
> Where/how did you get it?
>
I've got NSPR with the Firefox source tarball firefox-3.0.1-source.tar.bz2 that I downloaded on Jully 28th.
Properties/Version on my nspr4.dll shows 4.7.1.0.
Reporter | ||
Comment 4•16 years ago
|
||
>
> PR_Free is the NSPR thread private data destructor installed by
> this call:
> http://mxr.mozilla.org/security/source/security/nss/lib/base/error.c#103
>
> 100 static PRStatus
> 101 error_once_function ( void)
> 102 {
> 103 return PR_NewThreadPrivateIndex(&error_stack_index, PR_Free);
> 104 }
>
Yes, exactly. This function in my case gets called twice. Problematic is second call which is made after loading nssckbi.dll. Here is the call stack:
#0 PR_NewThreadPrivateIndex (newIndex=0x63c43040, dtor=0x63c0b96c <PR_Free>) at prtpd.c:131
#1 0x63c0b35a in error_once_function () at error.c:99
#2 0x64f4df74 in PR_CallOnce (once=0x63c43050, func=0x63c0b348 <error_once_function>) at prinit.c:808
#3 0x63c0b37a in error_get_my_stack () at error.c:120
#4 0x63c43050 in error_stack_index () from D:\Maksr\Produkti\Install\Mozilla\mozilla\dist\bin\nssckbi.dll
#5 0x63c0b348 in nss_arena_hash_free_entry () from D:\Maksr\Produkti\Install\Mozilla\mozilla\dist\bin\nssckbi.dll
#6 0x63c0b453 in nss_ClearErrorStack () at error.c:277
#7 0x63c0ac09 in NSSArena_Create () at arena.c:385
#8 0x63c07a7c in nssCKFWInstance_Create (pInitArgs=0x6445b70a, LockingState=MultiThreaded, mdInstance=0x63c10080, pError=0x22f954) at instance.c:217
#9 0x63c01efa in NSSCKFWC_Initialize (pFwInstance=0x63c43020, mdInstance=0x63c10080, pInitArgs=0x6445b70a) at wrap.c:205
#10 0x63c01195 in builtinsC_Initialize (pInitArgs=0x6445b70a) at d:/Maksr/Produkti/Install/Mozilla/mozilla/dist/public/nss/nssck.api:117
#11 0x643d54f7 in secmod_ModuleInit (mod=0xb38b08, alreadyLoaded=0x22f9c8) at pk11load.c:147
#12 0x643d58bf in SECMOD_LoadPKCS11Module (mod=0xb38b08) at pk11load.c:379
#13 0x643e4930 in SECMOD_AddModule (newModule=0xb38b08) at pk11util.c:487
#14 0x643e49b8 in SECMOD_AddNewModuleEx (moduleName=0x6445a244 "Root Certs", dllPath=0xb389e8 "./nssckbi.dll", defaultMechanismFlags=0, cipherEnableFlags=0, modparms=0x0, nssparms=0x0) at pk11util.c:586
#15 0x643e4a81 in SECMOD_AddNewModule (moduleName=0x6445a244 "Root Certs", dllPath=0xb389e8 "./nssckbi.dll", defaultMechanismFlags=0, cipherEnableFlags=0) at pk11util.c:626
#16 0x643c1ae6 in nss_Init (configdir=0x421e90 ".", certPrefix=0x417141 "", keyPrefix=0x417141 "", secmodName=0x418ab3 "secmod.db", updateDir=0x6445a00c "", updCertPrefix=0x6445a00c "", updKeyPrefix=0x6445a00c "", updateID=0x6445a00c "", updateName=0x6445a00c "", readOnly=1, noCertDB=0, noModDB=0, forceOpen=0, noRootInit=0, optimizeSpace=0, noSingleThreadedModules=0, allowAlreadyInitializedModules=0, dontFinalizeModules=0) at nssinit.c:388
#17 0x643c1ca1 in NSS_Initialize (configdir=0x421e90 ".", certPrefix=0x417141 "", keyPrefix=0x417141 "", secmodName=0x418ab3 "secmod.db", flags=1) at nssinit.c:661
#18 0x00406d9c in certutil_main (argc=4288833, argv=0x1, initialize=1) at certutil.c:2324
#19 0x004095aa in main (argc=4, argv=0x3d2980) at certutil.c:2910
In my nssckbi.dll the PR_Free is imported, so the address, that gets transferred in the second parameter to PR_NewThreadPrivateIndex is from the segment of nssckbi.dll. This causes segfault afterwards, when the NSS_Shutdown unloads the nssckbi.dll before PR_Cleanup tries to destroy the thread.
I will investigate now if the MSVC build has the same problem, however I'm not so at home wih WinDbg as I'm with Gdb. Maybe import mechanism between DLLs in MSVC works differently than in GCC.
> I need to know the version of NSS you're using.
> 1. In mozilla/client.mk, what's the value of the NSS_CO_TAG variable?
> 2. In mozilla/security/nss/lib/nss/nss.h, what's the value of the
> NSS_VERSION macro?
>
NSS_CO_TAG = FIREFOX_3_0_1_RELEASE
#define NSS_VERSION "3.12.0.3" _NSS_ECC_STRING _NSS_CUSTOMIZED
yeah, you're basically passing a thunk to the cleanup library. and then unloading the library. by the time you call pr_cleanup, the library containing the thunk is dead and the pointer isn't valid.
you can use nspr to lookup the real PR_Free function instead.
Assignee | ||
Comment 6•16 years ago
|
||
Maks,
Thanks for the NSS version info. I was worried that you're
using a very old version of NSS (used in Firefox 1.0.x), whose
nssckbi.dll has its own stub of PR_Free:
http://mxr.mozilla.org/aviarybranch/source/security/nss/lib/ckfw/nsprstub.c#317
If you unload that old nssckbi.dll, it would make sense that
its PR_Free stub is also gone from the address space.
You are using the latest version of NSS, whose nssckbi.dll uses
the real PR_Free exported from nspr4.dll. Except that in this
case it is a thunk in nssckbi.dll for PR_Free that is passed to
PR_NewThreadPrivateIndex.
The fix that timeless suggested in comment 5 should work, but is
ugly.
Can you modify GCC (or suggest to the GCC maintainers) so that
GCC uses the address of the real function rather than the thunk
when the address is passed to a function defined in another DLL?
We fixed a related memory leak (bug 431805) before. I believe
the same fix applied to nssckbi.dll would fix this crash. In
mozilla/security/nss/lib/ckfw/wrap.c, function NSSCKFWC_Finalize,
could you add this after the two NULL pointer checks?
nss_DestroyErrorStack();
There are two other ways to fix this crash:
1. Make lib/base a shared library. (We could just make it part
of nssutil3.dll.) Right now lib/base is a static library. So
anything that links with lib/base (nss3.dll and nssckbi.dll) has
its own copy of the NSS error stack, and each one needs to call
nss_DestroyErrorStack() before shutdown.
2. Remove the NSS error stack. I don't think anyone is using it.
Assignee: wtc → nobody
Status: UNCONFIRMED → NEW
Component: NSPR → Libraries
Ever confirmed: true
Product: NSPR → NSS
QA Contact: nspr → libraries
Version: other → unspecified
Assignee | ||
Updated•16 years ago
|
Status: NEW → ASSIGNED
Assignee | ||
Updated•16 years ago
|
Assignee: nobody → wtc
Status: ASSIGNED → NEW
Comment 7•16 years ago
|
||
Is this a consequence of the fix for bug 337887, which added
-mnop-fun-dllimport to the CFLAGS for GCC on win32?
That was a workaround for an unnecessary gcc incompatibility.
Maybe we can get the gcc folks to fix the cause of bug 337887 and
then back out that workaround?
Reporter | ||
Comment 8•16 years ago
|
||
(In reply to comment #6)
Wan-Teh, thank you for the solution!
> We fixed a related memory leak (bug 431805) before. I believe
> the same fix applied to nssckbi.dll would fix this crash. In
> mozilla/security/nss/lib/ckfw/wrap.c, function NSSCKFWC_Finalize,
> could you add this after the two NULL pointer checks?
>
> nss_DestroyErrorStack();
>
This really helps! No more crash.
I've also verified with gdb that the destructor PR_Free is now called only once from _PR_DestroyThreadPrivate.
Maks.
Reporter | ||
Comment 9•16 years ago
|
||
(In reply to comment #8)
> (In reply to comment #6)
>
> Wan-Teh, thank you for the solution!
>
> > We fixed a related memory leak (bug 431805) before. I believe
> > the same fix applied to nssckbi.dll would fix this crash. In
> > mozilla/security/nss/lib/ckfw/wrap.c, function NSSCKFWC_Finalize,
> > could you add this after the two NULL pointer checks?
> >
> > nss_DestroyErrorStack();
> >
>
> This really helps! No more crash.
>
I've tried certutil and it really works now. But there's still a problem with modutil, that still crashes in a similar manner. I think it would be better to put nss_DestroyErrorStack after the row
error = nssCKFWInstance_Destroy(*pFwInstance);
because, when linked into modutil, this row creates a new error stack.
Maks.
Assignee | ||
Comment 10•16 years ago
|
||
Maks,
You're right. The NSSArena_Destroy call at the end of
nssCKFWInstance_Destroy creates the error stack if it
doesn't exist.
We should put nss_DestroyErrorStack right before the
"return error;" statement at the end of NSSCKFWC_Finalize.
I am wondering why certutil doesn't crash when we put
nss_DestroyErrorStack before the line
error = nssCKFWInstance_Destroy(*pFwInstance);
nssCKFWInstance_Destroy should also create a new error stack
when linked into certutil, right? Can you find out what
happens in gdb?
Reporter | ||
Comment 11•16 years ago
|
||
(In reply to comment #10)
>
> We should put nss_DestroyErrorStack right before the
> "return error;" statement at the end of NSSCKFWC_Finalize.
>
Now I've put it there and both, certutil and modutil seem to work OK.
> I am wondering why certutil doesn't crash when we put
> nss_DestroyErrorStack before the line
>
> error = nssCKFWInstance_Destroy(*pFwInstance);
>
> nssCKFWInstance_Destroy should also create a new error stack
> when linked into certutil, right? Can you find out what
> happens in gdb?
Yeah, I was wondering also. The fact is in comment 8 and comment 9 I was mistaken. This bug was as elusive as dangling pointers can be (I marked it as "Reproducible: Sometimes" in comment 1).
When tracing with gdb I was misled, because in the function _PR_DestroyThreadPrivate the destructor really got called one time less than before, so I thought it was OK. But one call less was the result of my applying the patch from bug 431805, not the fix suggested by Wan-Teh.
Now, that I've put the nss_DestroyErrorStack just before return in NSSCKFWC_Finalize, the destructor PR_Free doesn't get called at all. Both error stacks are destroyed before PR_Cleanup, one from NSSCKFWC_Finalize (called indirectly from NSS_Shutdown) and one from NSS_Shutdown itself.
I hope this concludes the matter.
This was my first reported bug in this bugzilla and I'm really not handy enough to make an official patch. Wan-Teh, will you take care?
Maks.
Reporter | ||
Comment 12•16 years ago
|
||
(In reply to comment #11)
> NSSCKFWC_Finalize, the destructor PR_Free doesn't get called at all. Both
PR_Free of course do get called, but I mean not from _PR_DestroyThreadPrivate when PR_Cleanup destroys the threads.
Maks.
Assignee | ||
Comment 13•16 years ago
|
||
Maks, please test this patch, which is slightly different
from what you tested before.
Note that I call nss_DestroyErrorStack() right before
NSSCKFWC_Finalize returns, but only if NSSCKFWC_Finalize
returns successfully (CKR_OK). The reason is that
people may want to examine the error stack if
NSSCKFWC_Finalize returns an error.
Of course, people may ignore the NSSCKFWC_Finalize
failure and unload the nssckbi.dll module anyway,
so this may not be a good idea.
Attachment #335151 -
Flags: superreview?(nelson)
Attachment #335151 -
Flags: review?(maksr3)
Comment 14•16 years ago
|
||
Comment on attachment 335151 [details] [diff] [review]
Proposed patch
If I remove the very large comment from the patch, it boils down to:
> switch( error ) {
> PRInt32 remainingInstances;
> case CKR_OK:
> remainingInstances = PR_AtomicDecrement(&liveInstances);
> if (!remainingInstances) {
> nssArena_Shutdown();
> }
>+ nss_DestroyErrorStack();
> break;
> case CKR_CRYPTOKI_NOT_INITIALIZED:
> case CKR_FUNCTION_FAILED:
> case CKR_GENERAL_ERROR:
> case CKR_HOST_MEMORY:
> break;
> default:
> error = CKR_GENERAL_ERROR;
> break;
> }
>
> return error;
Seems like the crash will occur in the event that "error" is any value
other than CKR_OK. No?
Assignee | ||
Comment 15•16 years ago
|
||
Right. But that's my dilemma. By definition, the
error stack is used to provide more information about
an error. That's why I don't want to destroy the
error stack on error.
But I suspect that our users will ignore the return value
of NSSCKFWC_Finalize and unload the nssckbi.dll module
anyway. So I'll be happy to move the nss_DestroyErrorStack()
call before the "return error;" statement.
Comment 16•16 years ago
|
||
Well, my first preference would be to find a fix that avoids/prevents
passing an address from within nssckbi.dll to PR_NewThreadPrivateIndex.
Comment 17•16 years ago
|
||
Thinking out loud here.
Comment 16 wasn't meant to imply that I oppose the fix you proposed in
comment 14.
Maybe the new line you're proposing to add should be ifdef'ed for MinGW too.
I think that's entirely appropriate.
Or Maybe nssckbi should call NSPR's equivalent of dlsym to get the address
of PR_Free, and then pass that address to PR_NewThreadPrivateIndex, but only
for these MinGW builds.
Assignee | ||
Comment 18•16 years ago
|
||
The nss_DestroyErrorStack() call fixes the leak of the
primordial thread's error stack if PR_Cleanup is not
called, so it should not be ifdef'ed for MinGW. (This
leak is similar to the leak of bug 431805, and is caused
by lib/base being a static library and being included
in nss3.dll and nssckbi.dll, resulting in two error
stacks per thread.)
I don't want to spend too much time on this MinGW bug.
I'd rather be working on more important NSS bugs.
Comment 19•16 years ago
|
||
Well, if you attach the patch that I think you were suggesting at the end
of comment 15, I'm willing to r+ it.
Assignee | ||
Comment 20•16 years ago
|
||
Thanks, Nelson. Here is the patch that Maks tested.
Attachment #335151 -
Attachment is obsolete: true
Attachment #335228 -
Flags: review?(nelson)
Attachment #335151 -
Flags: superreview?(nelson)
Attachment #335151 -
Flags: review?(maksr3)
Comment 21•16 years ago
|
||
Comment on attachment 335228 [details] [diff] [review]
Proposed patch v2
r=nelson
Attachment #335228 -
Flags: review?(nelson) → review+
Assignee | ||
Comment 22•16 years ago
|
||
I checked in the patch on the NSS trunk (NSS 3.12.2).
Checking in wrap.c;
/cvsroot/mozilla/security/nss/lib/ckfw/wrap.c,v <-- wrap.c
new revision: 1.17; previous revision: 1.16
done
Status: NEW → RESOLVED
Closed: 16 years ago
Resolution: --- → FIXED
Target Milestone: --- → 3.12.2
You need to log in
before you can comment on or make changes to this bug.
Description
•