Closed
Bug 524256
Opened 15 years ago
Closed 15 years ago
Crash stacks not unwound past XPCWrappedNative::CallMethod
Categories
(Core :: XPCOM, defect, P2)
Core
XPCOM
Tracking
()
VERIFIED
FIXED
mozilla1.9.3a1
Tracking | Status | |
---|---|---|
status1.9.2 | --- | beta2-fixed |
status1.9.1 | --- | .8-fixed |
People
(Reporter: jruderman, Assigned: neil)
References
Details
(Keywords: verified1.9.1, Whiteboard: [crashkill])
Attachments
(2 files)
1.14 KB,
patch
|
ted
:
review+
|
Details | Diff | Splinter Review |
1.09 KB,
patch
|
benjamin
:
review+
samuel.sidler+old
:
approval1.9.1.8+
|
Details | Diff | Splinter Review |
Stacks for many signatures are getting cut off at XPCWrappedNative::CallMethod.
bp-8c98528d-a4d5-4ee5-a0cc-c26332091023
Bug 505059 -- [@ CSSLoaderImpl::LoadSheet...]
bp-d04a9331-4adc-4240-8435-bc3c52091022
Bug 524255 -- [@ _purecall | CallQueryInterface...]
CCing the heroes of bug 519616.
Seems more likely a Gecko bug, in that our assembly-coded NS_InvokeByIndex_P probably isn't providing all the debug information that compiler-generated code does.
Comment 2•15 years ago
|
||
(In reply to comment #1)
> Seems more likely a Gecko bug, in that our assembly-coded NS_InvokeByIndex_P
> probably isn't providing all the debug information that compiler-generated code
> does.
I know it's possible to emit CFI annotations on linux and presumably OS X (http://lkml.indiana.edu/hypermail/linux/kernel/0509.1/0118.html). Has anyone seen anything similar for MSVC?
Still, even without stack unwind information, we should be able to unwind the stack. Does anyone have a minidump of a crash that fails like these ones?
Comment 3•15 years ago
|
||
Assuming this is Windows-only: does NS_InvokeByIndex use a frame pointer? Breakpad (and most debuggers) assume that if debugging information is not present for a function, the function uses a frame pointer in EBP
http://mxr.mozilla.org/mozilla-central/source/xpcom/reflect/xptcall/src/md/win32/xptcinvoke.cpp#79
Component: Socorro → XPCOM
Product: Webtools → Core
QA Contact: socorro → xpcom
Comment 4•15 years ago
|
||
(In reply to comment #3)
> Assuming this is Windows-only: does NS_InvokeByIndex use a frame pointer?
> Breakpad (and most debuggers) assume that if debugging information is not
> present for a function, the function uses a frame pointer in EBP
Yes, it appears to use a frame pointer.
Comment 5•15 years ago
|
||
I wonder if declspec(naked) emits some sort of debugging information that isn't correct for the function.
Comment 6•15 years ago
|
||
Jeff: you can easily produce one using "Crash Me Now!" and grabbing the minidump out of the Crash Reports/pending/ directory before quitting the crash reporter:
Stacks from that look like:
http://crash-stats.mozilla.com/report/index/bp-73cd938b-98ae-4148-9757-4433b2090929
Comment 7•15 years ago
|
||
Running the stackwalker on the minidump from this crash:
http://crash-stats.mozilla.com/report/index/bd254b84-4ed3-4a37-a763-268702091026
gives me a much more useful. Perhaps breakpad was not properly updated?
Comment 8•15 years ago
|
||
The minidump is available at:
http://people.mozilla.org/~jmuizelaar/
Comment 9•15 years ago
|
||
AFAICT the new code is running in production, evidenced by the lack of 0x0 crashes in the topcrash list.
Comment 10•15 years ago
|
||
When I run minidump_stackwalk (from SVN tip) on your dump, I just get:
Thread 0 (crashed)
0 crashme.dll!nsCrasher::Crash() [nsCrasher.cpp : 21 + 0x0]
eip = 0x009311bd esp = 0x0013eae0 ebp = 0x0013eae8 ebx = 0x0013eda8
esi = 0x00000000 edi = 0x00000003 eax = 0x00000003 ecx = 0x038d1120
edx = 0x00951c00 efl = 0x00010246
Found by: given as instruction pointer in context
1 xul.dll!NS_InvokeByIndex_P [xptcinvoke.cpp:ef4565b6f268 : 101 + 0x2]
eip = 0x50691636 esp = 0x0013eae4 ebp = 0x0013eae8
Found by: call frame info
2 xul.dll!XPCWrappedNative::CallMethod(XPCCallContext &,XPCWrappedNative::Call
Mode) [xpcwrappednative.cpp:ef4565b6f268 : 2690 + 0x1a]
eip = 0x5052e7f1 esp = 0x0013eaf0 ebp = 0x0013eae8
Found by: call frame info with scanning
in the crashing thread, same as crash-stats. You can see that it resorts to stack scanning to even get frame 2, then it must just get lost.
Comment 11•15 years ago
|
||
(In reply to comment #10)
> When I run minidump_stackwalk (from SVN tip) on your dump, I just get:
> Thread 0 (crashed)
> 0 crashme.dll!nsCrasher::Crash() [nsCrasher.cpp : 21 + 0x0]
> eip = 0x009311bd esp = 0x0013eae0 ebp = 0x0013eae8 ebx = 0x0013eda8
> esi = 0x00000000 edi = 0x00000003 eax = 0x00000003 ecx = 0x038d1120
> edx = 0x00951c00 efl = 0x00010246
> Found by: given as instruction pointer in context
> 1 xul.dll!NS_InvokeByIndex_P [xptcinvoke.cpp:ef4565b6f268 : 101 + 0x2]
> eip = 0x50691636 esp = 0x0013eae4 ebp = 0x0013eae8
> Found by: call frame info
> 2 xul.dll!XPCWrappedNative::CallMethod(XPCCallContext
> &,XPCWrappedNative::Call
> Mode) [xpcwrappednative.cpp:ef4565b6f268 : 2690 + 0x1a]
> eip = 0x5052e7f1 esp = 0x0013eaf0 ebp = 0x0013eae8
> Found by: call frame info with scanning
>
> in the crashing thread, same as crash-stats. You can see that it resorts to
> stack scanning to even get frame 2, then it must just get lost.
The reason you run into the problem is because you have symbols for crashme.dll. I didn't have those so I was going from Crash() to CallMethod() and using the frame pointer from there.
Why does the crash server have symbols for crashme.dll and how do I get them? They don't seem to be listed in symbol index file.
Comment 12•15 years ago
|
||
Ah, interesting. I manually uploaded the crashme symbols after I built the extension. They're in the same directory as our OS symbols, which isn't available via HTTP. I copied it over to people if you want it:
http://people.mozilla.org/~tmielczarek/crashme.sym
Comment 13•15 years ago
|
||
Yeah, that is interesting. If I process it without the crashme symbols it gets the full stack. I guess the NS_InvokeByIndex_P frame is kind of a tar pit. Regardless, the crashes in comment 0 have the same exact problem exhibited by your minidump when we have symbols for the top of the stack, so we should figure out why this is. Here's the top 4 frames I get when I process without crashme symbols:
Thread 0 (crashed)
0 crashme.dll + 0x11bd
eip = 0x009311bd esp = 0x0013eae0 ebp = 0x0013eae8 ebx = 0x0013eda8
esi = 0x00000000 edi = 0x00000003 eax = 0x00000003 ecx = 0x038d1120
edx = 0x00951c00 efl = 0x00010246
Found by: given as instruction pointer in context
1 xul.dll!XPCWrappedNative::CallMethod(XPCCallContext &,XPCWrappedNative::Call
Mode) [xpcwrappednative.cpp:ef4565b6f268 : 2690 + 0x1a]
eip = 0x5052e7f1 esp = 0x0013eaf0 ebp = 0x0013ed78
Found by: previous frame's frame pointer
2 xul.dll!XPC_WN_CallMethod(JSContext *,JSObject *,unsigned int,int *,int *) [
xpcwrappednativejsops.cpp:ef4565b6f268 : 1756 + 0x11]
eip = 0x5052ad40 esp = 0x0013ed80 ebp = 0x035010b4
Found by: previous frame's frame pointer
3 mozjs.dll!js_Invoke [jsinterp.cpp:ef4565b6f268 : 1365 + 0x14]
eip = 0x623a0aca esp = 0x0013ee4c ebp = 0x0013eee8
Found by: call frame info
4 mozjs.dll!js_Interpret [jsops.cpp:ef4565b6f268 : 2291 + 0xd]
eip = 0x623986e1 esp = 0x0013eef0 ebp = 0x0013f0e4
Found by: previous frame's frame pointer
I poked through the debug info a little bit, and here's a brain dump. NS_InvokeByIndex_P looks like this:
FILE 338447 hg:hg.mozilla.org/mozilla-central:xpcom/reflect/xptcall/src/md/win32/xptcinvoke.cpp:ef4565b6f268
FUNC 27160f 2b 10 NS_InvokeByIndex_P
STACK WIN 4 27160f 2b 0 0 10 0 0 0 1 $T2 $esp .cbLocals + .cbSavedRegs + = $T0 .raSearchStart = $eip $T0 ^ = $esp $T0 4 + =
That program string says:
$T0 = .raSearchStart
$eip = *$T0
$esp = $T0 + 4
(Where .raSearchStart is http://code.google.com/p/google-breakpad/source/browse/trunk/src/processor/stackwalker_x86.cc#160, 0x13eae8 in this specific case)
The value of %eip comes out bogus here (0x13ed78), so we resort to stack scanning, which does find the next frame. However, it gives us %eip = 0x5052e7f1 (rva = 0x10e7f1), which is actually way past the end of XPC_WN_CallMethod (rva = 0x10aba0, size = 0x26a). Since the IP isn't actually contained in this function, we fail to find the FPO data for it here:
http://code.google.com/p/google-breakpad/source/browse/trunk/src/processor/basic_source_line_resolver.cc#427
Which means we have to resort to %ebp walking, but our value of %ebp is clearly not valid for this frame, so we just fail. I think this indicates that there's something failing in the debug info for NS_InvokeByIndex_P, since we should be able to find the next frame from there strictly from program string evaluation. I wouldn't be surprised to find that the __declspec(naked) is doing something weird.
Comment 14•15 years ago
|
||
Yeah, that program string is wrong for NS_InvokeByIndex: no reference to $ebp, it's assuming the function doesn't have a frame pointer because optimizations are on and it doesn't see the assembly. I have a patch which might work.
Comment 15•15 years ago
|
||
I don't think we should spend a lot of time on this, while it's clearly a bug, I don't think the resulting stacks are likely to help much with crash analysis. For example, the rest of the stack in this "crash me now" crash would just tell you that JavaScript was invoked from the event loop.
(But if bsmedberg's patch works, then great, let's just fix it.)
Reporter | ||
Comment 16•15 years ago
|
||
It could help in bug 505059 by telling us where we are in shutdown, or in bug 524255 by telling us if the _purecall is due to there being a destructor on the stack. And it would dispel the feeling that the stacks are "busted" and therefore ignorable.
Comment 17•15 years ago
|
||
Also, XPCWrappedNative is called from a lot of places (we do sometimes do have stacks that go C++ -> JS -> C++) and having that extra context can be useful.
Comment 18•15 years ago
|
||
#pragmas FTW
This basically suppresses the custom STACK operations for NS_InvokeByIndex so that the default frame-pointer operation takes over.
Assignee: nobody → benjamin
Status: NEW → ASSIGNED
Attachment #408618 -
Flags: review?(ted.mielczarek)
Comment 19•15 years ago
|
||
Comment on attachment 408618 [details] [diff] [review]
Use #pragmas to tell MSVC that NS_InvokeByIndex has a frame pointer, rev. 1
Nice!
Attachment #408618 -
Flags: review?(ted.mielczarek) → review+
Comment 20•15 years ago
|
||
Comment on attachment 408618 [details] [diff] [review]
Use #pragmas to tell MSVC that NS_InvokeByIndex has a frame pointer, rev. 1
We should land this on branches too. It's very simple, doesn't change any behavior, and will help us get better crash stacks immediately.
Attachment #408618 -
Flags: approval1.9.2+
Attachment #408618 -
Flags: approval1.9.1.5?
Comment 21•15 years ago
|
||
Landed in m-c:
http://hg.mozilla.org/mozilla-central/rev/15f4223cd977
Status: ASSIGNED → RESOLVED
Closed: 15 years ago
Resolution: --- → FIXED
Reporter | ||
Updated•15 years ago
|
Whiteboard: [crashkill]
Comment 22•15 years ago
|
||
Can someone verify using today's nightly that this in fact is fixed, before I start on the branches?
Comment 23•15 years ago
|
||
Today's nightly shows:
FUNC 2763cf 2b 10 NS_InvokeByIndex_P
STACK WIN 4 2763cf 2b 0 0 10 0 0 0 1 $T2 $esp .cbLocals + .cbSavedRegs + = $T0 .
raSearchStart = $eip $T0 ^ = $esp $T0 4 + =
Perhaps WPO or PGO (or both) are screwing you up here? I notice the optimize pragma takes a "g" for "global optimizations". Might be worth a shot.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Comment 24•15 years ago
|
||
(Also, I tried the obvious "process a minidump from crash me now produced with today's nightly" before looking at the symbol files.)
Comment 25•15 years ago
|
||
Other than "delete the NS_InvokeByIndex_P lines from the .sym file" I don't know what to do with this next.
Assignee: benjamin → nobody
Updated•15 years ago
|
Flags: blocking1.9.2+
Priority: -- → P2
Assignee | ||
Comment 26•15 years ago
|
||
Maybe allowing the compiler to generate the prologue and epilogue would help.
[I checked and the #pragma forces it even in opt builds.]
Attachment #409673 -
Flags: review?(ted.mielczarek)
Comment 27•15 years ago
|
||
Comment on attachment 409673 [details] [diff] [review]
Possible patch
I am not qualified to review this. Also, if you can make a try build with this, I can test to see if it fixes the problem.
Attachment #409673 -
Flags: review?(ted.mielczarek) → review?(benjamin)
Comment 28•15 years ago
|
||
Comment on attachment 408618 [details] [diff] [review]
Use #pragmas to tell MSVC that NS_InvokeByIndex has a frame pointer, rev. 1
Clearing approvals since this patch didn't fix our problem.
Attachment #408618 -
Flags: approval1.9.2+
Attachment #408618 -
Flags: approval1.9.1.5?
Comment 29•15 years ago
|
||
Comment on attachment 409673 [details] [diff] [review]
Possible patch
I'd be a little surprised if this actually worked, but I don't think it can hurt.
Attachment #409673 -
Flags: review?(benjamin) → review+
Comment 30•15 years ago
|
||
(In reply to comment #25)
> Other than "delete the NS_InvokeByIndex_P lines from the .sym file" I don't
> know what to do with this next.
Can we give that a try and see if that helps? This is blocking investigation of a couple of top crash bugs...
Comment 31•15 years ago
|
||
There's a patch here, we should see if this fixes the issue before trying other crazy things.
Assignee | ||
Comment 32•15 years ago
|
||
Comment on attachment 409673 [details] [diff] [review]
Possible patch
Pushed changeset c0d760c5165c to mozilla-central.
Updated•15 years ago
|
Assignee: nobody → neil
Comment 33•15 years ago
|
||
With today's nightly:
http://crash-stats.mozilla.com/report/index/4155267b-3ea7-4ec8-a3ed-ee14f2091105
Looks good, although we still get hung up at nsRunnable::Release for some reason. We can handle that separately though. This bug is blocking1.9.2+, so feel free to land on 1.9.2 at your leisure.
Status: REOPENED → RESOLVED
Closed: 15 years ago → 15 years ago
Resolution: --- → FIXED
Updated•15 years ago
|
Status: RESOLVED → VERIFIED
Assignee | ||
Comment 34•15 years ago
|
||
Comment on attachment 408618 [details] [diff] [review]
Use #pragmas to tell MSVC that NS_InvokeByIndex has a frame pointer, rev. 1
Pushed changeset 6801f86775b5 to releases/mozilla-1.9.2
Assignee | ||
Comment 35•15 years ago
|
||
Pushed changeset e3756eea61f0 to releases/mozilla-1.9.1
status1.9.2:
--- → final-fixed
Updated•15 years ago
|
Comment 36•15 years ago
|
||
Comment on attachment 409673 [details] [diff] [review]
Possible patch
We should take this patch on 1.9.1 to get better crash stacks in certain situations there. Very low risk, has been on trunk and 1.9.2 for a while now.
Attachment #409673 -
Flags: approval1.9.1.7?
Comment 37•15 years ago
|
||
(In reply to comment #35)
> Pushed changeset e3756eea61f0 to releases/mozilla-1.9.1
Er, this changeset doesn't exist on 1.9.1? (Also your patch didn't have approval1.9.1.x anyway.)
Assignee | ||
Comment 38•15 years ago
|
||
(In reply to comment #34)
>(From update of attachment 408618 [details] [diff] [review])
> Pushed changeset 6801f86775b5 to releases/mozilla-1.9.2
(In reply to comment #35)
> Pushed changeset e3756eea61f0 to releases/mozilla-1.9.1
Sorry, both of these were actually pushed to releases/mozilla-1.9.2; the bug history shows me setting the 1.9.2 status flag, and you correcting it ;-)
Attachment #409673 -
Flags: approval1.9.1.6?
Comment on attachment 409673 [details] [diff] [review]
Possible patch
Seems like we should take it on 1.9.1.6 if possible.
Updated•15 years ago
|
Attachment #409673 -
Flags: approval1.9.1.6?
Comment 40•15 years ago
|
||
(In reply to comment #36)
> We should take this patch on 1.9.1 to get better crash stacks in certain
> situations there. Very low risk, has been on trunk and 1.9.2 for a while now.
Has it been proved to work? There were several comments (e.g. comment 29) that expressed doubts. Given that it doesn't seem worth hauling 1.9.1.6 back to the station, we'll catch it in 1.9.1.7
Comment 41•15 years ago
|
||
Yes, it works, I verified in comment 33. (Sorry, perhaps that wasn't clear.)
Comment 42•15 years ago
|
||
Comment on attachment 409673 [details] [diff] [review]
Possible patch
Approved for 1.9.1.7. a=ss
Attachment #409673 -
Flags: approval1.9.1.7? → approval1.9.1.7+
Comment 43•15 years ago
|
||
Pushed both patches to 1.9.1:
http://hg.mozilla.org/releases/mozilla-1.9.1/rev/c92853c1c3f1
http://hg.mozilla.org/releases/mozilla-1.9.1/rev/ff4253bc919b
QA: this can be verified by using "crash me now", submitting a crash, and verifying that the resulting stack goes beyond XPCWrappedNative::CallMethod.
status1.9.1:
--- → .7-fixed
Comment 44•15 years ago
|
||
Verified fixed on 1.9.1 with Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.1.8pre) Gecko/20100128 Shiretoko/3.5.8pre (.NET CLR 3.5.30729) ID:20100128044603
http://crash-stats.mozilla.com/report/index/0adabf50-3d4b-42de-bb37-973812100128
Keywords: verified1.9.1
Target Milestone: --- → mozilla1.9.3a1
You need to log in
before you can comment on or make changes to this bug.
Description
•