I'm about to post a revision of my patch. I made three changes:

1. I found a fix for the problem that gsvelto reported in comment 24 and comment 25. My mistake was to dereference a pointer [after the original object had been resized, and therefore moved in memory](https://searchfox.org/mozilla-central/source/toolkit/crashreporter/breakpad-client/mac/handler/dynamic_images.cc#423-444). I read `header->cpusubtype` after the second call to `ReadTaskMemory()`. By then `header` pointed at stale memory, so the values I got were always incorrect.

2. I noticed that Apple code which references `mach_header.cpusubtype` always ANDs it with `~CPU_SUBTYPE_MASK`. Apparently the top 8 bits are reserved for "feature flags", including `CPU_SUBTYPE_LIB64` (0x80000000). I actually saw one instance of this as I was debugging. There are lots of references to `mach_header.cpusubtype` in Breakpad code. I didn't fix all of them, only those relevant to the code used from XUL, which creates minidumps. Later I'll post another patch for the code used by utilities like `minidump_dump` and `minidump_stackwalk`. Since those only read minidumps, the changes there aren't as urgent.

3. Last and most importantly, I fixed how Breakpad code computes the `slide` for modules in the "dyld shared cache". To implement ASLR, all modules are "slid" by a random amount whenever they're loaded into memory. To know where a particular module (like XUL or the CoreFoundation framework) is loaded into memory, one needs to know both its "original" base address and the amount by which it has been slid. When `minidump_stackwalk` is trying to symbolicate a given address in memory, it needs to be able to correctly identify the module that address points into. The dyld shared cache is a single module into which commonly used system dylibs and frameworks are incorporated. `dyld` maps it into every process at load time. The component modules all have the same slide.
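To make the slide arithmetic concrete, here is a minimal sketch. It is not the Breakpad code: the `Module` struct, its fields, and the helper names are all hypothetical and only illustrate that a loaded address is the "original" base plus the slide, and that every module in the shared cache uses one common slide.

```cpp
#include <cstdint>

// Hypothetical, simplified module record; not a Breakpad or dyld type.
struct Module {
  uint64_t preferred_base;  // "original" (unslid) base address from the binary
  bool in_shared_cache;     // true if the module is part of the dyld shared cache
  uint64_t own_slide;       // per-module slide; only used outside the shared cache
};

// Slide that applies to a module: every shared-cache module uses the single
// cache-wide slide, any other module uses its own.
constexpr uint64_t EffectiveSlide(const Module& m, uint64_t shared_cache_slide) {
  return m.in_shared_cache ? shared_cache_slide : m.own_slide;
}

// Address at which the module actually starts in the running process.
constexpr uint64_t LoadedBase(const Module& m, uint64_t shared_cache_slide) {
  return m.preferred_base + EffectiveSlide(m, shared_cache_slide);
}

// Maps a runtime address back to the unslid address space of the binary,
// which is the form a symbolication step would need to look it up.
constexpr uint64_t ToUnslidAddress(const Module& m, uint64_t shared_cache_slide,
                                   uint64_t runtime_address) {
  return runtime_address - EffectiveSlide(m, shared_cache_slide);
}
```

With these made-up numbers, a shared-cache module whose unslid base is `0x7fff30000000` and a cache slide of `0x1a2b000` would be loaded at `0x7fff31a2b000`. Using the wrong slide shifts every address by the difference, which is how a frame ends up attributed to the wrong module or to none.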
For some time (and maybe since the very beginning), Breakpad code has computed the slide for these modules incorrectly, by assuming that it can use the same procedure it (correctly) uses for modules not in the dyld shared cache. Breakpad only uses this code to create minidumps for other processes (child processes like the content process). Still, the result is that almost all system calls in content process crash stacks are symbolicated either incorrectly or not at all. My patch fixes this problem. The "shared cache slide" (`sharedCacheSlide`) is stored in the `dyld_all_image_infos` structure. This field is only available on OS X 10.7 and later, but Firefox only supports OS X 10.9 and later, so we don't have to worry about this.
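For the `cpusubtype` masking in change 2, a minimal sketch of the idea follows. It is not the patch itself. The constant values mirror `CPU_SUBTYPE_MASK`, `CPU_SUBTYPE_LIB64`, and `CPU_SUBTYPE_X86_64_ALL` from Apple's `<mach/machine.h>`, but they are redefined under local names (and as unsigned values, which is a simplification) so the snippet builds on any platform. The helper names are hypothetical.

```cpp
#include <cstdint>

// Local copies of values from Apple's <mach/machine.h> (assumption: redefined
// here, as unsigned, so the sketch does not depend on the macOS SDK).
constexpr uint32_t kCpuSubtypeMask = 0xff000000u;   // CPU_SUBTYPE_MASK
constexpr uint32_t kCpuSubtypeLib64 = 0x80000000u;  // CPU_SUBTYPE_LIB64
constexpr uint32_t kCpuSubtypeX86_64All = 3u;       // CPU_SUBTYPE_X86_64_ALL

// Strips the feature flags in the top 8 bits, leaving the plain subtype.
constexpr uint32_t BaseCpuSubtype(uint32_t raw_cpusubtype) {
  return raw_cpusubtype & ~kCpuSubtypeMask;
}

// Reports whether the LIB64 feature flag is set in the raw field.
constexpr bool HasLib64Flag(uint32_t raw_cpusubtype) {
  return (raw_cpusubtype & kCpuSubtypeLib64) != 0;
}
```

A raw value of `0x80000003` compares unequal to `CPU_SUBTYPE_X86_64_ALL` as-is, but `BaseCpuSubtype(0x80000003)` yields 3, so a comparison made after masking matches.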
Bug 1371390 Comment 31 Edit History