Closed Bug 946877 Opened 7 years ago Closed 7 years ago
__declspec(noreturn) may be adversely affecting stackwalking of NS
_ABORT _OOM (Windows)
The Windows declaration of MOZ_REALLY_CRASH calls a MOZ_NoReturn function, which is __declspec(noreturn). I am experiencing issues with MOZ_CRASH not walking the stack correctly in crash reports, and I suspect this noreturn declaration may be part of the problem. In particular I'm seeing this behavior with NS_ABORT_OOM: many functions call NS_ABORT_OOM: the Windows LTO system "outlines" the three instructions: 027FBDC0 push dword ptr [edi+4] 027FBDC3 call NS_ABORT_OOM (2955F9Eh) $LN14: 027FBDC8 int 3 Each function which calls NS_ABORT_OOM jumps to this outlined hunk, which means we can't really tell which function actually made the call. I'm not sure whether this is a problem specific to NS_ABORT_OOM or whether it affects other functions like this... It's possible that we can rectify this just for NS_ABORT_OOM by building nsDebugImpl.cpp without PGO/LTO optimizations... I'm going to try that first. That will have the effect of hiding the "noreturn" clause from the optimizer, so that it assumes a normal return pattern and probably won't perform this outlining. I understand from blame that the reason we have __declspec(noreturn) is to suppress warnings/errors about functions which don't return/don't return a value. I'd also like to see what actually happens in practice if we remove the "noreturn" bits and force people to return dummy values.
This is the patch racing with 945042.
Attachment #8343247 - Flags: review?(ted)
Attachment #8343247 - Flags: review?(ted) → review+
https://hg.mozilla.org/integration/mozilla-inbound/rev/5118bd7f0954 removing the noreturn turns out to be really hard because of warnings, and I can't find a way to disable the side effects, so I'm going to hope this is sufficient and reopen later if it doesn't actually help.
Target Milestone: --- → mozilla28
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
That's not going to work until this reaches aurora, because nsDebugImpl.cpp is a unified source.
(In reply to Mike Hommey [:glandium] from comment #4) > That's not going to work until this reaches aurora, because nsDebugImpl.cpp > is a unified source. Note bug 945042 will fix this, and i want to have it in the tree before aurora.
Reopening to move this out of unified sources for now.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
https://hg.mozilla.org/integration/mozilla-inbound/rev/f15bd9ef9b7e Moving version flags to track this back to FF28 and hopefully to 27 as well, since NS_ABORT_OOM landed on 27.
Status: REOPENED → RESOLVED
Closed: 7 years ago → 7 years ago
Resolution: --- → FIXED
Summary: __declspec(noreturn) may be adversely affecting stackwalking at MOZ_CRASH (Windows) → __declspec(noreturn) may be adversely affecting stackwalking of NS_ABORT_OOM (Windows)
Comment on attachment 8343247 [details] [diff] [review] bug946877-DebugImpl-nopgo [Approval Request Comment] Bug caused by (feature/regressing bug #): bug 938794 made better annotations for OOM crashes in our data structures (strings etc) but the stacks are less than great because of this bug User impact if declined: Worse OOM crash analysis Testing completed (on m-c, etc.): landed and appears to be helping in crash-stats Risk to taking this patch (and alternatives if risky): very low, mainly performance but that's unlikely for this file String or IDL/UUID changes made by this patch: none There were two patches in this bug, the first landed for 28 but the second was for 29 only, so requesting approval for both branches to get everything in sync.
Benjamin, out of curiosity, do you know yet if this worked successfully?
Yes, based on this crash-stats query: https://crash-stats.mozilla.com/search/?release_channel=nightly&build_id=%3E20131211000000&signature=~NS_ABORT_OOM&_facets=signature&_columns=date&_columns=signature&_columns=product&_columns=version&_columns=build_id&_columns=platform I loaded a sample of crashes and the calling frame appears to be correct in each case without folding.
https://hg.mozilla.org/releases/mozilla-aurora/rev/5ba3867108cd https://hg.mozilla.org/releases/mozilla-beta/rev/a05c70a52b9b (the unified-sources patch wasn't necessary on beta because unified XPCOM didn't land there)
It looks like this caused significant perf improvements in Windows PGO builds on a variety of Talos benchmarks (tpaint, a11y, svg, tart, tp5): https://groups.google.com/d/topic/mozilla.dev.tree-management/ZkRWKK-2Gew/discussion
(In reply to comment #13) > It looks like this caused significant perf improvements in Windows PGO builds > on a variety of Talos benchmarks (tpaint, a11y, svg, tart, tp5): > https://groups.google.com/d/topic/mozilla.dev.tree-management/ZkRWKK-2Gew/discussion That... seems bogus. :-)
You need to log in before you can comment on or make changes to this bug.