Closed Bug 1451902 Opened 7 years ago Closed 3 years ago

Consider shipping release builds with frame pointers enabled on macOS

Categories

(Firefox Build System :: General, enhancement)

Desktop
macOS
enhancement

Tracking

(firefox99 fixed)

RESOLVED FIXED
99 Branch
Tracking Status
firefox99 --- fixed

People

(Reporter: mstange, Assigned: mstange)

References

(Blocks 1 open bug)

Details

(Whiteboard: [fxp])

Attachments

(1 file)

On macOS, our Nightlies are built with frame pointers enabled, ever since bug 764216. However, our Beta and Release builds are not. The only way that the Gecko Profiler is able to unwind native stacks on macOS is using frame pointers. This means that we only have working stackwalking support on Nightly, and not on release builds. If our users submit a profile that was taken on a Mac release build, that profile does not contain any native stacks; it only contains JS stacks and pseudo stacks but is otherwise rather flat and unhelpful. On other platforms, the situation is as follows: - On Windows 32 bit builds, we have chosen to take the hit and have enabled frame pointers on release. This was done in bug 1322735. - On Windows 64 bit builds, there is no way to enable frame pointers. Stackwalking in those builds makes use of unwinding tables. - On Linux, 32 bit and 64 bit, the situation is the same as on Mac: We have frame pointers on Nightly but not on Beta or Release. On Beta and Release, we use lul for unwinding, which consults unwinding tables in the binary. Lul does not currently support unwinding with unwinding tables on macOS. It could be extended to support it but it would be some work. The easiest way to get useful profiles from release users would be to enable frame pointers on Beta and Release builds on macOS. This might come with a small perf impact.
Whiteboard: [perf-tools]
Why did we disable frame pointers in the first place? To have an extra register available to the compiler? Bug 492688 seems to indicate possibly so. Assuming this is about performance, the decision to omit frame pointers was made when we used 32-bit builds pretty much everywhere. My understanding is the extra register doesn't buy you much on 64-bit processors because 64-bit processors have more registers available. I think having stack unwinding and a consistent build between Nightly and release is more important than <some> raw performance. Quantifying <some> is not something I'm comfortable answering. So if this is about performance, let's throw some builds through our perf tests so we can make a data-informed decision? ted: you live at the intersection of builds, crash handling, and are a general repository of knowledge. Anything you'd like to add?
Flags: needinfo?(ted)
(In reply to Markus Stange [:mstange] from comment #0) > The easiest way to get useful profiles from release users would be to enable > frame pointers on Beta and Release builds on macOS. This might come with a > small perf impact. On the Win32 side there were only a couple small regressions on PGO builds (bug 1322735, comment 12) and a 0.19% increase in the size of libxul. That might be a good baseline for evaluating what we deem as okay.
(In reply to Gregory Szorc [:gps] from comment #1) > Why did we disable frame pointers in the first place? To have an extra > register available to the compiler? Bug 492688 seems to indicate possibly so. That sounds right, but that decision was obviously made when we were shipping x86 binaries on Mac, not x86-64. I doubt we've ever revisited that decision in light of shipping on x86-64. The odd thing is that on x86-64 the ABI spec says that frame pointers should be omitted by default despite having plenty of registers because the ABI also mandates having unwind info (like .eh_frame). Despite that, it looks like clang on Mac doesn't actually default to omitting frame pointers. Because of the amount of available registers on x86-64 compared to x86 the perf hit should be much smaller. It would be entirely due to having the extra instructions for pushing and popping the frame pointer register in each function, and the slight codesize increase due to this. (In reply to Markus Stange [:mstange] from comment #0) > - On Linux, 32 bit and 64 bit, the situation is the same as on Mac: We have > frame pointers on Nightly but not on Beta or Release. On Beta and Release, > we use lul for unwinding, which consults unwinding tables in the binary. > > Lul does not currently support unwinding with unwinding tables on macOS. It > could be extended to support it but it would be some work. Is there any particular reason here other than nobody did the work? Mach-O binaries use the same eh_frame data as ELF binaries, so it doesn't seem like it'd be that hard to wire this up. Presumably it's just a matter of dealing with the platform-specific bits? I would generally defer this decision to someone with better knowledge of the CPU architecture like jrmuizel. It should be easy enough to flip the setting, push to try, and get talos data for comparison.
Flags: needinfo?(ted)
Version: Version 3 → 3 Branch

The conversation about LUL aside, I think we should just find out how much framepointers affect our performance. Stackwalking using framepointers is simpler, faster, and often more reliable than using more involved methods of unwinding. And LUL adds serious startup overhead.

(I've copied the comments about LUL into bug 1635812, and collapsed them in this bug.)

Let's do this for the arm64 builds now, and for x86_64 builds later.

Depends on: 1753107

I've done a tryserver comparison for x86_64 macOS, for Talos and browsertime.

Try push: No frame pointers (like release)
Try push: With frame pointers (like nightly)

Talos comparison
Browsertime comparison

The vast majority of performance tests and benchmarks are completely unaffected by frame pointers. There are two Talos tests which show a difference with "high" confidence:

  • "a11yr dhtml.html opt e10s fission stylo webrender-sw" reports a regression of 2.5%.
  • "perf_reftest_singletons opt e10s fission stylo webrender" reports an improvement; specifically the "bloom-basic.html" and "bloom-basic-2.html" subtests report an improvement of 20-25% with high confidence.

I am rather confused by the improvements, but with 10 test runs on each push and such a strong signal I'll have to concede that they're probably real.
The "a11yr dhtml.html" regression is probably real, too. I'm not very familiar with this tests but if this is the only regression, 2.5% on it seems like a small price to pay. (I didn't get profiles of this test due to bug 1753483.)

Assignee: nobody → mstange.moz
Status: NEW → ASSIGNED
Pushed by mstange@themasta.com: https://hg.mozilla.org/integration/autoland/rev/65c90081484d Enable framepointers for all macOS builds on all channels. r=glandium
Status: ASSIGNED → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → 99 Branch
Severity: normal → --
OS: Unspecified → macOS
Hardware: Unspecified → Desktop
Version: 3 Branch → unspecified
See Also: → 1635812
Whiteboard: [perf-tools] → [fxp]
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: