Closed
Bug 536271
Opened 15 years ago
Closed 14 years ago
Include the page containing EIP in the minidump
Categories
(Toolkit :: Crash Reporting, defect)
Tracking
()
RESOLVED
FIXED
Tracking | Status | |
---|---|---|
blocking2.0 | --- | beta7+ |
People
(Reporter: dmandelin, Assigned: ted)
References
Details
Attachments
(3 files)
(Spun off from bug 530955 comment 22.) In order to debug on-trace (jitted code) crashes, we need the code at the point of the crash. As a first step, we'd like the page that contains EIP (at the time of the crash) included in the minidump. This is only needed for jitted code; if breakpad can tell whether the crash EIP is in jit code (looking at segments or something?), then it doesn't need to do this if the crash is in normal code. If it's cheap enough, getting an extra page on either side is probably a good idea too.
Assignee | ||
Comment 1•15 years ago
|
||
This will require some upstream changes. For the OS X / Linux cases it will probably actually be easier, since we're writing the minidump out manually. On Windows it will be a little trickier since we're using MinidumpWriteDump.
Assignee | ||
Comment 3•14 years ago
|
||
Carrying over blocking nomination from the dupe. Interestingly, the Breakpad minidump format documentation claims that a normal minidump already contains the 256 bytes around eip: http://code.google.com/p/google-breakpad/source/browse/trunk/src/google_breakpad/common/minidump_format.h#272 I'll see if I can verify that.
blocking2.0: --- → ?
Updated•14 years ago
|
blocking2.0: ? → beta7+
Assignee | ||
Comment 4•14 years ago
|
||
From minidump examination, it looks like this is the case on Windows. I'm not sure that it's present on OS X/Linux. If you can point me at a crash report in JIT coded where this would be useful, I can attempt to verify it. If this is WFM on Windows, would you still want to block on implementing it for OS X/Linux. (On Windows you can obviously pull the data out via a debugger, on OS X/Linux it's not as easy, but you could get it via the minidump_dump tool.)
Reporter | ||
Comment 5•14 years ago
|
||
(In reply to comment #4) > From minidump examination, it looks like this is the case on Windows. What does that mean, exactly? Does it show up in the disassembly view of the debugger? I thought I had tried a minidump and couldn't get code memory.
Assignee | ||
Comment 6•14 years ago
|
||
Yeah, I just opened a random minidump I had laying around in WinDBG, and I was able to open the disassembly view and see the 256 bytes around %eip there. (The surrounding memory was all ??s.)
Comment 7•14 years ago
|
||
Dave: did you mean this to be beta7 or betaN? Upstream changes make me nervous about time, and this isn't a recent regression ...
Updated•14 years ago
|
Assignee: nobody → ted.mielczarek
This is a pretty big deal for effectively diagnosing JIT-related crashes. I'm not sure what the upstream risk is, but we can control what we ship whether it's integrated in the google repository or not, I think.
Updated•14 years ago
|
blocking2.0: beta8+ → beta7+
Assignee | ||
Comment 9•14 years ago
|
||
I tested a random Win32 minidump I had laying around, and it does in fact include 256 bytes around EIP. However, dmandelin pointed me at a crash: https://crash-stats.mozilla.com/report/index/8974e974-9664-4843-a4dc-9badd2100910 This dump contains 256 bytes around EIP, but it's the EIP from the top of the stack where the dump was actually written, not the one from the exception record, which means it's a disassembly of the memory around KiFastSystemCallRet. Not very useful. It could be that the JIT throws off Microsoft's heuristic here. Perhaps it only attempts to include that memory if it's inside a loaded module? I've found some documentation on how to include extra memory regions on Windows, so I can try forcing it to be included.
Assignee | ||
Comment 10•14 years ago
|
||
So I started looking into patching this. As a first step, I started writing a unit test, of course. Attached is the test body. I ran this as part of the Google Windows client unittests, and it produced a minidump. Before writing the code to programmatically check the minidump memory contents, I opened it in WinDBG, typed ".ecxr" to get the exception record loaded, and opened the memory view, only to see the memory the test program had written in plain sight. It definitely looks to me like Windows is trying to do this correctly. It must be something that our JIT(s?) does differently from this test program that causes it not to work. dmandelin: can you take a look at the code snippet here and see what would differ from NanoJIT/(whatever JM uses)? Perhaps memory protection flags?
Assignee | ||
Comment 11•14 years ago
|
||
Here's the minidump it produced, if you'd like to see for yourself.
Assignee | ||
Comment 12•14 years ago
|
||
WTF and NanoJIT's allocators look identical to the one in my test, AFAICT: http://mxr.mozilla.org/mozilla-central/source/js/src/assembler/jit/ExecutableAllocatorWin.cpp#44 http://mxr.mozilla.org/mozilla-central/source/js/src/nanojit/avmplus.cpp#106 is there some other difference I'm missing here?
Assignee | ||
Comment 13•14 years ago
|
||
I really can't produce a patch here without some help figuring out why crashes in our JITted code produce different results than my testcase, since it's currently impossible for me to test and verify a fix.
Assignee | ||
Comment 14•14 years ago
|
||
(I am, however, working on Mac and Linux fixes.)
Assignee | ||
Comment 15•14 years ago
|
||
I have working patches for Linux and Mac. The Linux patch is up for upstream review: (should be quick, and there's nobody at Moz that knows this code better anyway) http://breakpad.appspot.com/194001/show I'll post the Mac patch in a little bit. I'm still stumped by the Windows behavior. Without being able to reproduce the current failure mode in a test, I don't have high confidence in being able to deliver a fix.
Assignee | ||
Comment 16•14 years ago
|
||
Linux patch reviewed and landed upstream: http://code.google.com/p/google-breakpad/source/detail?r=693 Mac patch up for review: http://breakpad.appspot.com/200001/show Once that gets reviewed I should be able to land them on mozilla-central. Any thoughts about the Win32 case are appreciated. I'm willing to try any theories you might entertain.
Assignee | ||
Comment 17•14 years ago
|
||
While I'm at it, I wrote a little tool that will dump the memory around the instruction pointer from the exception record for a minidump: http://hg.mozilla.org/users/tmielczarek_mozilla.com/get-minidump-instructions/ There's a pre-built 64-bit Linux binary here: http://people.mozilla.com/~tmielczarek/get-minidump-instructions
Comment 18•14 years ago
|
||
Shaver/Sayrer: please see comment 15; looks like we're set for Windows and Linux, but need help to resolve for Windows which I take to be important given comment 8
Comment 19•14 years ago
|
||
Can we get the Linux and OSX parts landed, please, and file a follow-up marked as beta8+ for Windows?
Whiteboard: [patches in hand, follow up for Windows]
Assignee | ||
Comment 20•14 years ago
|
||
I didn't get upstream review on the OS X patch yet, I'll review-ping. I'll file a Windows follow-up, but without assistance I don't know how to fix it. The Breakpad code just calls a Microsoft library to write minidumps, which appears to do the right thing in my testing. It clearly doesn't do the right thing for all of our crashes, but I'm at a loss to explain why that is.
Assignee | ||
Comment 21•14 years ago
|
||
Upstream OS X patch has review, I'll clean it up and land them both on m-c tomorrow.
Whiteboard: [patches in hand, follow up for Windows] → [patches in hand, follow up for Windows][ETA: 09/23]
Assignee | ||
Comment 22•14 years ago
|
||
Ok, landed upstream: http://code.google.com/p/google-breakpad/source/detail?r=699 I'm getting a m-c patch together now.
Assignee | ||
Comment 23•14 years ago
|
||
Assignee | ||
Comment 24•14 years ago
|
||
Comment on attachment 477973 [details] [diff] [review] Include the page containing EIP in the minidump on Linux and Mac. This is the combined patch merged to m-c. It builds and works on my local system, I'm pushing it to try (along with my other b7 blocker) to make sure it doesn't break things.
Attachment #477973 -
Flags: review+
Assignee | ||
Comment 25•14 years ago
|
||
I pushed this to try and it looks good. I don't have time to watch the tree right now, but if someone else wanted to land it that'd be fine. (I'm still waiting on try results for the patch on bug 598507 and was going to push them both.)
Assignee | ||
Comment 26•14 years ago
|
||
Pushed to m-c: http://hg.mozilla.org/mozilla-central/rev/8b84f9ba869c Will file the Win32 followup shortly.
Status: NEW → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
Whiteboard: [patches in hand, follow up for Windows][ETA: 09/23]
You need to log in
before you can comment on or make changes to this bug.
Description
•