Closed
Bug 470500
Opened 15 years ago
Closed 14 years ago
Firefox 3.1b2 Crash Report [@ nssutil3.dll@0x34c0 ]
Categories
(NSS :: Libraries, defect, P2)
Tracking
(Not tracked)
RESOLVED
FIXED
3.12.4
People
(Reporter: chofmann, Assigned: nelson)
References
Details
(Keywords: regression, topcrash)
Crash Data
Attachments
(2 files)
5.10 KB, text/plain
11.00 KB, patch (rrelyea: review+)
This crash is currently ranked #9 on Firefox 3.1b2 -- appears to be a new regression
0 nssutil3.dll nssutil3.dll@0x34c0
1 nss3.dll nss3.dll@0x2551c
2 nss3.dll nss3.dll@0x2565a
3 nss3.dll nss3.dll@0x999a
4 xul.dll NS_InvokeByIndex_P xpcom/reflect/xptcall/src/md/win32/xptcinvoke.cpp:101
5 xul.dll XPCWrappedNative::CallMethod js/src/xpconnect/src/xpcwrappednative.cpp:2422
Not much to go on but comments indicate:
* Every time I close the program, the Crash Reporter appears, but there are no other indications of an issue.
* Updated Session Manager add-on and crashed on restart
* brought computer out of stand by
Any easy way to tell if this is in NSS or the calling code?
Flags: blocking1.9.1?
Assignee
Comment 1•15 years ago
This would all be much easier if Firefox's builds of NSS kept the symbols.
Depends on: 458553
Assignee
Comment 2•15 years ago
Chris, I suspect this is yet another crash due to using NSS while it is uninitialized (e.g. before it is initialized or after it has been shut down). This looks a LOT like the stack shown in Bug 465974. That bug was exacerbated by the fix for Bug 462806, which caused a lot of code to use NSS without initializing it. But since this bug claims to happen at shutdown, I'm thinking it's more likely to be related to Bug 427715 or Bug 450468.
Comment 3•15 years ago
Benjamin: is this maybe a straight dupe of bug 427715? There's not a lot to go on here to make this a blocker, especially since it's at shutdown. Please renominate if the numbers continue to climb, though.
Flags: blocking1.9.1? → blocking1.9.1-
Reporter
Comment 4•14 years ago
more comments from recent 3.5 b4 users. this has now moved up to #4 top crash. users seem to also be in low memory conditions when they hit this signature. many running facebook apps and viewing pages there. a few other sites also listed in the attachment. Maybe the shutdown problems are also under low memory...
Comment 5•14 years ago
I agree with Nelson that there should be debug symbols in your NSS builds. At the minimum, we need to know the names of the entry point(s) in nss3.dll and nssutil3.dll that are being suspected of causing a problem here.
julien: it'd be great if your team fixed NSS so that debug symbols are built by default.
Comment 7•14 years ago
Did something change? We have symbols on the 1.9.0 branch for Windows, but apparently not 1.9.1.
1.9.1 mozconfig: http://hg.mozilla.org/build/buildbot-configs/file/d943ec01e814/mozilla2/win32/mozilla-1.9.1/release/mozconfig (note it sets MOZ_DEBUG_SYMBOLS=1)
1.9.0 mozconfig: http://mxr.mozilla.org/mozilla/source/tools/tinderbox-configs/firefox/win32/mozconfig (I believe the tinderbox client script sets MOZ_DEBUG_SYMBOLS=1 here)
I can't tell that we're doing anything different from our end. Did something in NSS change to break this?
dunno, we haven't had symbols in nss for months now. it's really bad. i've been complaining and been ignored for a while.
Comment 9•14 years ago
I filed bug 468701 a while ago, but I thought it was only for Linux/Mac.
Comment 10•14 years ago
*shrug*, we've been w/o coverage on all platforms for a while.
Comment 11•14 years ago
timeless, Re: comment 6, I wasn't aware of this problem until now, and I don't think there is an NSS bug filed. Most NSS developers don't build the browser, only NSS standalone. I'm not sure that it is an NSS problem. Debug symbols are built with debug builds of NSS by default. I think if you set MOZ_DEBUG_SYMBOLS=1, you can get them for optimized builds as well, at least for Windows. I'm not sure about other platforms.
Assignee
Comment 12•14 years ago
I'm not sure this is an NSS bug AT ALL. Stepping back from issues about NSS symbols, I think there's a more fundamental question to be asked, which is: How & why did NS_InvokeByIndex_P call any NSS code at all?
NS_InvokeByIndex_P is a function that enables JavaScript code to call C++ methods on C++ objects. It finds the vTable for the object, and finds the entry point in that vTable using an index, then calls that vTable entry point. But NSS is not written in C++. It has no C++ classes and no vtables. So, it seems unlikely that NS_InvokeByIndex_P could have legitimately called any NSS function at all.
I suspect this is a case of a call through a "wild pointer". It's conceivable to me that someone has cobbled together some structures to look like a C++ object and C++ vTable, and has put the address of an NSS function in that table. If that was done, it was not done in NSS. It's not clear to me how that would work, since no NSS function expects a "this" pointer.
So, I think any further effort asking "why did NSS crash" will be fruitless. It's not surprising that NSS would crash if there was a wild call into it. A better question is: how/why does that wild call occur?
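A minimal C sketch of the kind of index-based dispatch comment 12 describes, and of what "structures cobbled together to look like a C++ object" could look like in plain C. All names here (FakeObject, Method, kVTable, invoke_by_index) are hypothetical and are not taken from XPCOM or NSS; the real NS_InvokeByIndex_P also has to marshal arbitrary argument lists, which this omits.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a C++ object: its first member points at a
 * table of method pointers, and every method takes the object itself
 * as an explicit "this" argument. */
typedef struct FakeObject FakeObject;
typedef int (*Method)(FakeObject *self, int arg);

struct FakeObject {
    const Method *vtable; /* plays the role of the C++ vtable pointer */
    int state;
};

static int add_state(FakeObject *self, int arg) { return self->state + arg; }
static int mul_state(FakeObject *self, int arg) { return self->state * arg; }

/* Slot 0 and slot 1 of the hand-built "vtable". */
static const Method kVTable[] = { add_state, mul_state };

/* Simplified analogue of invoke-by-index: fetch slot `index` from the
 * object's table and call it, passing the object as the first argument.
 * The caller is trusted to pass a valid index; nothing checks it. */
static int invoke_by_index(FakeObject *obj, size_t index, int arg)
{
    assert(obj != NULL && obj->vtable != NULL);
    return obj->vtable[index](obj, arg);
}
```

Every slot receives the object pointer as its first argument. An NSS entry point placed in such a table would receive that pointer in place of its real first parameter, which is why comment 12 doubts that such a call could have been legitimate.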
Assignee
Comment 13•14 years ago
In reply to comment 11: Julien, there are at least two bugs on file about the fact that, when NSS is built as part of Firefox, it is built without any symbols. Two such bugs are cited in previous comments in this bug. But let's let that issue be addressed and resolved in those bugs, not in this one.
Comment 14•14 years ago
NS_InvokeByIndex is mostly a placeholder. Without symbols for NSS, the debuggers have no clue what the optimizers have done to NSS or the callees and can't guess where the functions are in the call stack. I think because of the sheer size of an invokebyindex frame, it might be easier for the debuggers to find them. But really, until someone gets us symbols for nss, we can't do anything.
Assignee
Comment 15•14 years ago
Timeless, When I build NSS with NSS Makefiles on Windows, I get symbols. Mozilla builds NSS differently than the NSS team does. Mozilla builds override numerous NSS Makefile variables. Any differences between the NSS team's builds and Mozilla's builds of NSS are Mozilla's responsibility. It is likely that Mozilla's builds either
a) do not define MOZ_DEBUG_SYMBOLS, or
b) override the definition of OPTIMIZER and/or CFLAGS and/or NOMD_CFLAGS
Reporter
Comment 16•14 years ago
there are a handful of different signatures of a longer length than in comment 0
http://crash-stats.mozilla.com/report/index/010382ab-f863-4e76-802b-a1bb72090513
0 nssutil3.dll nssutil3.dll@0x34c0
1 nss3.dll nss3.dll@0x25be6
2 nss3.dll nss3.dll@0x25d24
3 nss3.dll nss3.dll@0x99c1
4 xul.dll XPCWrappedNative::CallMethod js/src/xpconnect/src/xpcwrappednative.cpp:2450
5 xul.dll XPC_WN_CallMethod js/src/xpconnect/src/xpcwrappednativejsops.cpp:1583
6 js3250.dll js_Invoke js/src/jsinterp.cpp:1365
7 js3250.dll js_Interpret js/src/jsinterp.cpp:5132
8 js3250.dll js_Invoke js/src/jsinterp.cpp:1373
9 js3250.dll js_fun_call js/src/jsfun.cpp:1688
10 js3250.dll js_Interpret js/src/jsinterp.cpp:5100
11 js3250.dll js_Invoke js/src/jsinterp.cpp:1373
12 xul.dll nsXPCWrappedJSClass::CallMethod js/src/xpconnect/src/xpcwrappedjsclass.cpp:1614
13 xul.dll nsXPCWrappedJS::CallMethod js/src/xpconnect/src/xpcwrappedjs.cpp:561
14 xul.dll PrepareAndDispatch xpcom/reflect/xptcall/src/md/win32/xptcstubs.cpp:114
15 xul.dll SharedStub xpcom/reflect/xptcall/src/md/win32/xptcstubs.cpp:141
16 xul.dll nsTimerImpl::Fire xpcom/threads/nsTimerImpl.cpp:465
17 xul.dll nsTimerEvent::Run xpcom/threads/nsTimerImpl.cpp:512
Comment 17•14 years ago
Nelson, Re: comment 13, I looked at every bug referenced in this one, and didn't find any filed against NSS that complains about debug symbols. The only other one that is about missing NSS debug symbols is bug 468701, but it's filed against product "Core", not against NSS. If there is an NSS bug for missing debug symbols, we should make it a blocker for this one.
Flags: wanted1.9.1?
Assignee
Comment 18•14 years ago
Julien, The two bugs cited above are bug 458553 and bug 468701. One of them already blocks this bug. One of them is probably a dup of the other. Neither is an NSS bug because this bug is not the fault of any NSS file. See comment 15.
Comment 19•14 years ago
(In reply to comment #14)
> But really, until someone gets us symbols for nss, we can't do anything.
I believe that the patch I just attached to bug 458553 (attachment 379056) should help in debugging these crashes. Is there any chance to get it landed for RC1, or is it too late for that?
Comment 20•14 years ago
Now that symbols are available at least for Windows (after the resolution of bug 458553), could nssutil3.dll@0x34c0 actually mean NSSRWLock_LockRead_Util? See e.g. http://crash-stats.mozilla.com/report/list?product=Firefox&version=Firefox%3A3.5b99&platform=windows&query_search=signature&query_type=exact&query=NSSRWLock_LockRead_Util&date=&range_value=1&range_unit=weeks&do_query=1&signature=NSSRWLock_LockRead_Util (from the preview of Firefox 3.5, crashes within the last week, Windows only)
Comment 21•14 years ago
Reminiscent of bug 427715, if so.
Comment 22•14 years ago
The lock is one that is created at NSS initialization time. If the call to NSSRWLock_LockRead crashes, that means the lock is not there. This is most likely because NSS has already been shut down, or not initialized yet. I recommend this bug be moved to PSM. It should not call HASH_Create without NSS being initialized.
Comment 23•14 years ago
Given those crash reports, probably a straight dupe of bug 427715 (which needs to be reopened, I guess)
Assignee
Comment 24•14 years ago
Here is a URL that fetches a table of crash reports that all bear the same "signature" as the one reported for this bug.
http://crash-stats.mozilla.com/report/list?product=Firefox&version=Firefox%3A3.6a1pre&query_search=signature&query_type=exact&query=&date=&range_value=4&range_unit=weeks&do_query=1&signature=nssutil3.dll%400x34c0
An examination of that table reveals that not all the stacks are alike. There are several very distinct stacks in that table. Consider these different and unique stacks:
http://crash-stats.mozilla.com/report/index/0ce72a70-0c85-4d43-b24e-f76382090602
http://crash-stats.mozilla.com/report/index/37751c81-d531-483f-af4f-1c4732090528
http://crash-stats.mozilla.com/report/index/a819f4df-3c4f-48fd-890a-b69802090524
http://crash-stats.mozilla.com/report/index/323e8b84-0a1c-42c6-b360-673f22090527
http://crash-stats.mozilla.com/report/index/05142ce2-0a5d-47a6-b120-f544b2090521
I wouldn't assume that they all have the same cause (although, they might). If indeed it is the case in each and all of these that NSS has been invoked while it is not initialized, then these 5 stacks show 5 different bugs that all need to be fixed. It's unfortunate that crash-stats lumps them all together.
Assignee
Comment 25•14 years ago
Bob, please review. In the past, the NSS team's position has been that all crashes that occur in NSS as a result of calling NSS while NSS is in an uninitialized state were, by definition, not the fault of NSS. We also argued that eliminating those crashes would, in most cases, merely delay the failure and/or conceal the true cause of the problem. But clearly the browser folks cannot rid themselves of this fault, and there are certain very common cases that we can easily detect and avoid. So, this patch attempts to do the very thing we have avoided doing, for at least some (by no means all) common cases. If we put this patch in, I predict that the browser will begin to experience all sorts of new failures, failures that formerly crashed. But in some of those cases, the failure will be in non-NSS code, and so mozilla won't blame NSS for those failures.
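A hedged C sketch of the failure mode from comment 22 and of the "detect and avoid" idea in comment 25. This is not the attached patch. It only illustrates a lock pointer that is non-NULL solely between initialization and shutdown, and an entry point that returns an error instead of dereferencing it. Every name below (RWLock, g_module_lock, library_init, library_shutdown, find_module, ERR_NOT_INITIALIZED) is a hypothetical stand-in, not an NSS identifier.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { int readers; } RWLock;          /* stand-in for a read/write lock */
typedef enum { StatusSuccess = 0, StatusFailure = -1 } Status;
enum { ERR_NOT_INITIALIZED = 1 };                /* hypothetical error code */

static RWLock  g_lock_storage;
static RWLock *g_module_lock = NULL;  /* NULL before init and after shutdown */
static int     g_last_error  = 0;

/* The lock only exists between these two calls. */
static void library_init(void)
{
    assert(g_module_lock == NULL);    /* catch double initialization */
    g_lock_storage.readers = 0;
    g_module_lock = &g_lock_storage;
}

static void library_shutdown(void)
{
    g_module_lock = NULL;
}

/* An entry point that needs the lock. Without the NULL check, calling it
 * outside the init/shutdown window would dereference a NULL pointer. */
static Status find_module(void)
{
    if (g_module_lock == NULL) {
        g_last_error = ERR_NOT_INITIALIZED;
        return StatusFailure;
    }
    g_module_lock->readers++;         /* "lock for reading" */
    /* ... look something up under the lock ... */
    g_module_lock->readers--;         /* "unlock" */
    return StatusSuccess;
}
```

The check turns a crash into an error return for this one entry point only. As comment 25 warns, the caller may then fail somewhere else, so the sketch narrows the symptom and leaves the ordering bug in the caller unfixed.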
Assignee
Updated•14 years ago
Priority: -- → P2
Target Milestone: --- → 3.12.4
Version: unspecified → 3.12.2
Comment 26•14 years ago
Comment on attachment 383205
Patch v1 for NSS Trunk (untested)
r+ rrelyea
These don't hurt. I'm almost tempted to Assert ("You haven't Initialized NSS yet!!!"), but the diplomatic return error is probably sufficient.
bob
Attachment #383205 -
Flags: review?(rrelyea) → review+
Comment 27•14 years ago
Nelson, Re: comment 25, just returning an error as in your patch is likely to move failures to some other place. However they are likely to still manifest themselves in usages of NSS. I'm not sure how much time we should spend on this. I don't see any alternative to the browser fixing those issues to resolve the problem. If we are going to spend time to return errors in optimized builds, I think we probably should assert for debug builds too. I also think there is a much more systematic way to go about this than trying to figure out the failure relatively "late" as in your patch, i.e. when a global structure is missing:
if (!NSS_Initialized()) {
    PORT_SetError(SEC_ERROR_NOT_INITIALIZED);
    PORT_Assert(0);
    return SECFailure; /* or whatever other error code is appropriate */
}
We can turn this into a macro that takes an argument for the correct value to return. And insert a call to this macro at the top of most entry points from libnss/libssl/libsmime, which require NSS to be initialized before calling them. There are a few exceptions, like the SSL session cache init functions, but we should be able to make the list and not have the macro for them. Of course, even those macros wouldn't completely take care of the problem, if there is a race condition - one thread shuts down NSS, while another thread was in the middle of executing an NSS function. But they still would probably go a long way.
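One possible shape for the macro proposed in comment 27, sketched in self-contained C. It is an illustration under stated assumptions, not NSS code: lib_is_initialized, lib_set_error, ERR_NOT_INITIALIZED, LIB_STRICT_CHECKS and REQUIRE_INITIALIZED are hypothetical stand-ins for the initialization check, PORT_SetError, SEC_ERROR_NOT_INITIALIZED, the debug-build switch and the macro itself.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { StatusSuccess = 0, StatusFailure = -1 } Status;
enum { ERR_NOT_INITIALIZED = 1 };          /* hypothetical error code */

static int g_initialized = 0;              /* set by init, cleared by shutdown */
static int g_last_error  = 0;

static int  lib_is_initialized(void) { return g_initialized; }
static void lib_set_error(int code)  { g_last_error = code; }

/* Assert only when strict checks are compiled in (the "debug build"
 * case from comment 27); otherwise compile to nothing. */
#ifdef LIB_STRICT_CHECKS
#define LIB_DEBUG_ASSERT(cond) assert(cond)
#else
#define LIB_DEBUG_ASSERT(cond) ((void)0)
#endif

/* Placed at the top of an entry point. `retval` is whatever that
 * entry point returns on failure, so one macro serves functions
 * with different return types. */
#define REQUIRE_INITIALIZED(retval)                 \
    do {                                            \
        if (!lib_is_initialized()) {                \
            lib_set_error(ERR_NOT_INITIALIZED);     \
            LIB_DEBUG_ASSERT(0);                    \
            return (retval);                        \
        }                                           \
    } while (0)

static Status entry_returning_status(void)
{
    REQUIRE_INITIALIZED(StatusFailure);
    return StatusSuccess;
}

static const char *entry_returning_pointer(void)
{
    REQUIRE_INITIALIZED(NULL);
    return "ok";
}
```

The do/while (0) wrapper lets the macro be used like a single statement. As comment 27 notes, a check at function entry does not close the race in which another thread shuts the library down while the call is already in progress.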
Comment 28•14 years ago
julien: what's the point, we already recognize this crash as meaning nss isn't initialized, it's been years of crashing like this, it hasn't gotten fixed :)
Comment 29•14 years ago
timeless, It can't be fixed by changes in NSS. The best NSS can do is give earlier warnings/errors/asserts when this erroneous situation occurs. NSS cannot implicitly initialize like NSPR.
Assignee
Comment 30•14 years ago
Checking in pk11auth.c; new revision: 1.10; previous revision: 1.9
Checking in pk11slot.c; new revision: 1.98; previous revision: 1.97
Checking in pk11util.c; new revision: 1.54; previous revision: 1.53
These changes won't fix ALL possible crashes that are due to using NSS while it is uninitialized, but they will detect and avoid the ones reported in this bug.
Status: ASSIGNED → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
Updated•14 years ago
Flags: wanted1.9.1?
Assignee
Updated•14 years ago
Flags: wanted1.9.1.x?
Updated•14 years ago
Flags: wanted1.9.1.x?
Updated•12 years ago
Crash Signature: [@ nssutil3.dll@0x34c0 ]