1018360 - Publicly display thread state for crashing thread on Intel 64-bit processors

Reporter

Description

•

11 years ago

Intel 64-bit (i.e. amd64) CPUs have a quirk that sometimes prevents the "crash address" for illegal memory access crashes from being accurately reported: Their address space isn't actually 64 bits wide. So if you crash by trying to access an invalid address (one > 0x7fffffffffff), the crash address isn't stored or reported.

If you crash trying to access a valid address that you don't have permission to access, the Intel 64-bit CPU stores this value in the CR2 register, and it gets reported to Breakpad's crash handler by kernel-level code. But this doesn't happen if your "address" > 0x7fffffffffff. In this case the "crash address" gets reported as 0xffffffffffffffff (on Windows, see bug 974420) or 0 (on OS X, see bug 1002564 comment #14). Needless to say, this makes it much more difficult to figure out these crashes.

Short of Intel/AMD changing the design of their 64-bit CPUs, there is no cure for this problem. But the crash address will almost always have been stored in a user-level register just before the crash (and the crash will have been triggered by trying to access memory at the "address" in that register). So seeing the contents of all these registers (the "thread state") as of the crash will often allow us to guess/infer the crash address, even if it wasn't reported by kernel-level code.
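The rule described above can be written down as a small check. This is only a sketch: the function and constant names are made up for illustration and are not part of Breakpad or Socorro, and the threshold and placeholder values are taken directly from the description rather than from any CPU or OS documentation.

```python
# Upper bound taken from the description above: faults on addresses above
# this value are the ones whose crash address goes unreported.
USER_ADDRESS_MAX = 0x7FFFFFFFFFFF

# Values the description says get reported instead of the real address.
PLACEHOLDER_CRASH_ADDRESSES = {
    "Windows": 0xFFFFFFFFFFFFFFFF,  # see bug 974420
    "OS X": 0x0,                    # see bug 1002564 comment #14
}


def address_is_reportable(address: int) -> bool:
    """True if `address` lies in the range whose faults are reported."""
    return 0 <= address <= USER_ADDRESS_MAX


def may_be_placeholder(reported: int, platform: str) -> bool:
    """True if `reported` equals the placeholder value for `platform`.

    A match is only a hint: on OS X, 0 is also what a genuine null
    dereference reports. Unknown platforms never match.
    """
    return PLACEHOLDER_CRASH_ADDRESSES.get(platform) == reported
```

When `may_be_placeholder` returns True for a report, the registers of the crashing thread are the place to look for the real address.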

Steven Michaud [:smichaud] (Retired)

Reporter

Comment 1

•

11 years ago

Breakpad currently stores this information in minidumps, plus much more: It stores the thread state for every thread, and for each frame of each thread's stack. (The actual thread state is stored for the top frame.)

But this information isn't displayed by Socorro, at least publicly. Surely part of the reason is that it's very bulky. But apparently people also think that this information could be used to identify the person who submitted a given minidump.

However, the thread state (properly so called) for the crashing thread (for its top frame) isn't particularly bulky. And I find it hard to believe that having this information would make it any easier to match up a particular crash report with a particular user.

Any qualms about these issues are heavily outweighed by the fact that having this information is often extremely useful. It will often make the difference between whether or not a bug can be fixed. Bug 997908 is a good case.

Steven Michaud [:smichaud] (Retired)

Reporter

Updated

•

11 years ago

Severity: normal → major

Steven Michaud [:smichaud] (Retired)

Reporter

Comment 2

•

11 years ago

Up until very recently, minidump-stackwalk didn't report the full thread state for even the top frame of each thread's stack. But this has now been fixed. See bug 1002564 comment #39 and http://breakpad.appspot.com/7654002/.

Steven Michaud [:smichaud] (Retired)

Reporter

Comment 3

•

11 years ago

> But this has now been fixed

Only partly. See bug 1002564 comment #43.

Robert Kaiser

Comment 4

•

11 years ago

Benjamin, how would we best present this in Socorro so that it matches what devs would expect? Ted, does Breakpad expose what we need?

(not currently active) Ted Mielczarek

Comment 5

•

11 years ago

Kairo: per the preceding comment, we don't have this in the JSON output yet, but it's fairly easy to add.

Benjamin Smedberg

Comment 6

•

11 years ago

Currently in the "raw dump" tab we display the pipe-delimited output. We should switch to displaying the JSON output (prettified if necessary) without the restricted data. This is strictly more information, and would make it possible to show other data in the future without having to code custom display.
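As a rough sketch of what "prettified JSON without the restricted data" could look like in code: the thread does not say which keys are restricted, so `RESTRICTED_KEYS` and its single entry are placeholders, and the function names are hypothetical rather than Socorro's.

```python
import json

# Placeholder only: the thread does not name the restricted keys.
RESTRICTED_KEYS = frozenset({"restricted_example_key"})


def public_view(node, restricted=RESTRICTED_KEYS):
    """Recursively copy `node`, dropping any dict key listed in `restricted`."""
    if isinstance(node, dict):
        return {
            key: public_view(value, restricted)
            for key, value in node.items()
            if key not in restricted
        }
    if isinstance(node, list):
        return [public_view(item, restricted) for item in node]
    return node


def prettify(dump: dict, restricted=RESTRICTED_KEYS) -> str:
    """Render the public part of `dump` as indented JSON with sorted keys."""
    return json.dumps(public_view(dump, restricted), indent=2, sort_keys=True)
```

Filtering by key name at every nesting level means new fields show up in the display without custom code, which is the benefit the comment describes; anything that must stay private has to be listed explicitly.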

(not currently active) Ted Mielczarek

Comment 7

•

11 years ago

Bug 976077 covers using the JSON dump to build everything in report/index; this probably falls under the purview of that.

(not currently active) Ted Mielczarek

Updated

•

11 years ago

Depends on: 1026111

(not currently active) Ted Mielczarek

Updated

•

11 years ago

Depends on: 1030276

(not currently active) Ted Mielczarek

Comment 8

•

11 years ago

My patch in bug 1030276 will let you view this in the "Raw Dump" tab. That's probably sufficient for now.

Steven Michaud [:smichaud] (Retired)

Reporter

Comment 9

•

11 years ago

Ted: When/if you do a mockup, please let us know. Otherwise let us know when your patch has landed and we can see its results in production Socorro.

(not currently active) Ted Mielczarek

Comment 10

•

11 years ago

Here's what it looks like locally: http://people.mozilla.org/~tmielczarek/raw-json.png

Steven Michaud [:smichaud] (Retired)

Reporter

Comment 11

•

11 years ago

> http://people.mozilla.org/~tmielczarek/raw-json.png

That actually doesn't show all the registers. Could you make the window bigger, or scroll it down, before making another snapshot?

(not currently active) Ted Mielczarek

Comment 12

•

11 years ago

I'll just copy and paste what's there:

"registers": {
  "r10": "0x00007fff928d2c70",
  "r11": "0x00007f14226002cf",
  "r12": "0x00007f13fea509e8",
  "r13": "0xfffffffffffffffc",
  "r14": "0x0000000000000008",
  "r15": "0x00007fff928d2dc0",
  "r8": "0x00007fff928d2cb0",
  "r9": "0x00007f13febd73f8",
  "rax": "0x00007f14226002f1",
  "rbp": "0x00007fff928d2d20",
  "rbx": "0x00007fff928d2dd0",
  "rcx": "0x00007f1431c13ff0",
  "rdi": "0x0000000000000000",
  "rdx": "0xffffffffffffff51",
  "rip": "0x00007f14226002f1",
  "rsi": "0x00007f13febd73f8",
  "rsp": "0x00007fff928d2d00"
}
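To illustrate how a register dump like this one might be used to guess a crash address, here is a minimal sketch that picks out the registers holding values above 0x7fffffffffff, the threshold from the bug description. The function name is hypothetical and nothing here is Socorro or Breakpad code; the register values are copied from the dump in this comment.

```python
# Threshold from the bug description: faults on addresses above this value
# do not get their crash address reported.
USER_ADDRESS_MAX = 0x7FFFFFFFFFFF


def out_of_range_registers(registers: dict) -> dict:
    """Return, sorted by name, the registers whose value exceeds the threshold."""
    return {
        name: value
        for name, value in sorted(registers.items())
        if int(value, 16) > USER_ADDRESS_MAX
    }


# Register values as pasted in this comment.
registers = {
    "r10": "0x00007fff928d2c70", "r11": "0x00007f14226002cf",
    "r12": "0x00007f13fea509e8", "r13": "0xfffffffffffffffc",
    "r14": "0x0000000000000008", "r15": "0x00007fff928d2dc0",
    "r8": "0x00007fff928d2cb0", "r9": "0x00007f13febd73f8",
    "rax": "0x00007f14226002f1", "rbp": "0x00007fff928d2d20",
    "rbx": "0x00007fff928d2dd0", "rcx": "0x00007f1431c13ff0",
    "rdi": "0x0000000000000000", "rdx": "0xffffffffffffff51",
    "rip": "0x00007f14226002f1", "rsi": "0x00007f13febd73f8",
    "rsp": "0x00007fff928d2d00",
}

candidates = out_of_range_registers(registers)
```

For this dump the filter selects r13 and rdx. Both look like small negative integers rather than pointers, so a match is only a starting point: the faulting instruction at rip still has to be checked to see which register it actually dereferenced.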

Steven Michaud [:smichaud] (Retired)

Reporter

Comment 13

•

11 years ago

Thanks! Looks good to me.

Steven Michaud [:smichaud] (Retired)

Reporter

Comment 14

•

11 years ago

The fix for bug 1030276 has gotten into production Socorro, so I'd say this is now fixed. But before I mark it so, I'm going to play around with various kinds of crashes on various threads.

Steven Michaud [:smichaud] (Retired)

Reporter

Comment 15

•

11 years ago

I tested with various crash addresses, on the main thread and off, and didn't see any problems. In all cases the raw dump displayed the user-level registers for the crashing thread, including one that contained the crash address.

Status: NEW → RESOLVED

Closed: 11 years ago

Resolution: --- → FIXED

Steven Michaud [:smichaud] (Retired)

Reporter

Comment 16

•

11 years ago

I should mention that I tested only on OS X. But, as bug 1030276 only changed "cross-platform" code, I think this bug must also be fixed for crash reports on Windows and Linux.

Steven Michaud [:smichaud] (Retired)

Reporter

Comment 17

•

11 years ago

Thanks again, Ted! Now that we've started memory poisoning, there will be lots more cases where the "crash address" > 0x7fffffffffff (in 64-bit mode). So this fix will be crucial in detecting cases of memory poisoning. It will also make it possible to work around our current lack of a reviewed patch for bug 1002564.

(not currently active) Ted Mielczarek

Comment 18

•

11 years ago

You're welcome, I'm glad that was useful! It should work on any platform, since the underlying data structures are the same.

Steven Michaud [:smichaud] (Retired)

Reporter

Updated

•

10 years ago


Categories

(Socorro :: General, task)

Tracking

(Not tracked)

People

(Reporter: smichaud, Unassigned)


Security

(public)
