Closed Bug 1692531 Opened 4 years ago Closed 4 years ago

No system calls symbolicated in macOS ARM64 crash stacks on 86 branch and above

Categories

(Toolkit :: Crash Reporting, defect)

ARM64
macOS
defect

Tracking

RESOLVED FIXED
87 Branch
Tracking Status
firefox-esr78 --- unaffected
firefox86 + fixed
firefox87 + fixed

People

(Reporter: smichaud, Assigned: gsvelto)

References

(Regression)

Details

(Keywords: regression)

Attachments

(1 file)

This is a weird one.

As best I can tell, no system calls are getting symbolicated for macOS on ARM64 hardware on release channels other than "release", even for versions of macOS whose arm64e symbols I know have been uploaded to the symbol server (builds 20B29, 20C69, 20D64 and 20D74).

https://crash-stats.mozilla.org/search/?cpu_arch=arm64&release_channel=%21release&platform=Mac%20OS%20X&date=%3E%3D2021-02-05T17%3A19%3A00.000Z&date=%3C2021-02-12T17%3A19%3A00.000Z&_facets=signature&_facets=platform_version&_sort=-date&_columns=date&_columns=signature&_columns=product&_columns=version&_columns=build_id&_columns=platform#facet-platform_version

But this doesn't happen with builds on the release channel:

https://crash-stats.mozilla.org/search/?cpu_arch=arm64&release_channel=release&platform=Mac%20OS%20X&date=%3E%3D2021-02-05T17%3A19%3A00.000Z&date=%3C2021-02-12T17%3A19%3A00.000Z&_facets=signature&_facets=platform_version&_sort=-date&_columns=date&_columns=signature&_columns=product&_columns=version&_columns=build_id&_columns=platform#facet-platform_version
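The two searches above differ only in the release_channel parameter ("!release" versus "release"). As a minimal sketch, assuming nothing beyond the parameter names and values visible in those two URLs, the query string could be rebuilt like this. The function name crash_search_url is made up for illustration and is not part of any crash-stats tooling.

```python
from urllib.parse import quote, urlencode

SEARCH_URL = "https://crash-stats.mozilla.org/search/"


def crash_search_url(release_channel: str) -> str:
    """Rebuild the search URL used above for a given release_channel filter.

    Pass "release" for the release channel only, or "!release" for every
    channel except release. All other parameters are copied from the URLs
    quoted in this bug.
    """
    params = [
        ("cpu_arch", "arm64"),
        ("release_channel", release_channel),
        ("platform", "Mac OS X"),
        ("date", ">=2021-02-05T17:19:00.000Z"),
        ("date", "<2021-02-12T17:19:00.000Z"),
        ("_facets", "signature"),
        ("_facets", "platform_version"),
        ("_sort", "-date"),
        ("_columns", "date"),
        ("_columns", "signature"),
        ("_columns", "product"),
        ("_columns", "version"),
        ("_columns", "build_id"),
        ("_columns", "platform"),
    ]
    # quote (not quote_plus) so that spaces become %20, as in the original URLs.
    query = urlencode(params, quote_via=quote)
    return f"{SEARCH_URL}?{query}#facet-platform_version"
```

Calling crash_search_url("!release") and crash_search_url("release") gives the two searches being compared.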

It doesn't seem likely that this is a problem in crashreporter code, so maybe this bug should be moved elsewhere. Now I have an Apple Silicon Mac Mini, so I'll check later.

Do you have any insights into this, Sebastian?

Flags: needinfo?(aryx.bugmail)
Flags: needinfo?(aryx.bugmail) → needinfo?(gsvelto)
See Also: → 1661771

Are release builds signed differently than builds on other channels (like "beta" or "nightly")? I've discovered that Big Sur is much pickier about dealing with unsigned apps on ARM64 hardware than it is on AMD64 hardware.

https://apple.stackexchange.com/questions/408752/the-message-you-do-not-have-permission-to-open-the-application-is-shown-when-t

The only way I've been able to get tryserver builds to work is to sign them myself, using codesign -s "Steven Michaud" -f --deep Firefox\ Nightly.app. Interestingly, this isn't needed for running a non-release build in Rosetta, even on an Apple Silicon machine.

macOS signing was changed yesterday, which should affect 86.0b9 and 87.0a1 builds with build ID > 20210212000000. If that aligns with your observations, bhearsum can provide more details.

Yes, it seems to

Not really. This bug started earlier than that.

It may still have something to do with signing, though.

Ben, have there been other, recent changes to signing on macOS?

The record is a bit sparse, but on the nightly branch this bug seems to have started somewhere between these two nightlies:

Build id 20210121094402, which doesn't have the bug (bp-dc0ab288-7571-45c4-8ed8-b1ad00210121)
Build id 20210123215311, which does have the bug (bp-7c2dc33f-8f0b-4bca-a54f-f090c0210125)

There's some evidence that this bug has nothing to do with signing:

When I signed a recent tryserver build using codesign -s "Steven Michaud" -f --deep Firefox\ Nightly.app and then deliberately crashed it (on my Apple Silicon Mac), this bug still happened.

bp-50dd17e8-af0b-403e-bc7d-bbcd00210212
https://treeherder.mozilla.org/jobs?repo=try&revision=b23803b825597781295c8ab2ad2058d2c1f0439e
https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/U33VXDFfSWWI4nYBTITZtw/runs/0/artifacts/public/build/target.dmg
https://hg.mozilla.org/try/rev/56b6018758eb3433d4251cacb7132e2c43e3bc81

I discovered I could use kill -6 [pid] to send a SIGABRT to the firefox process, and that this would generate a crash report. Using this information, I found that this bug started (on the "nightly" branch) with the following nightly:

http://ftp.mozilla.org/pub/firefox/nightly/2021/01/2021-01-21-21-33-47-mozilla-central/firefox-86.0a1.en-US.mac.dmg
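The kill -6 trick mentioned above can also be scripted. This is a minimal sketch for POSIX systems, assuming only that signal 6 is SIGABRT (as it is on macOS and Linux). It aborts a throwaway child process rather than Firefox, and the helper names are hypothetical.

```python
import os
import signal
import subprocess
import sys


def send_sigabrt(pid: int) -> None:
    """Send SIGABRT to a process, like running `kill -6 <pid>` in a shell."""
    os.kill(pid, signal.SIGABRT)


def abort_child_and_get_status() -> int:
    """Start a sleeping child process, abort it, and return its exit status.

    On POSIX, subprocess reports death by signal N as a return code of -N.
    """
    child = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(60)"]
    )
    send_sigabrt(child.pid)
    return child.wait(timeout=10)
```

To reproduce the crash report described above, send_sigabrt would be given the pid of the running firefox process instead.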

So the regression range is:

http://ftp.mozilla.org/pub/firefox/nightly/2021/01/2021-01-21-09-44-02-mozilla-central/firefox-86.0a1.en-US.mac.dmg
http://ftp.mozilla.org/pub/firefox/nightly/2021/01/2021-01-21-21-33-47-mozilla-central/firefox-86.0a1.en-US.mac.dmg
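The directory names in these nightly URLs appear to carry the same timestamp as the build ID (the first one lines up with build ID 20210121094402 quoted earlier). Here is a small sketch for converting between the two, assuming that correspondence holds in general. The function names are made up for illustration.

```python
import re
from datetime import datetime

# Matches a nightly directory such as "/2021-01-21-09-44-02-mozilla-central/".
_NIGHTLY_DIR = re.compile(
    r"/(\d{4})-(\d{2})-(\d{2})-(\d{2})-(\d{2})-(\d{2})-mozilla-central/"
)


def build_id_from_nightly_url(url: str) -> str:
    """Extract a 14-digit build ID from a nightly download URL."""
    match = _NIGHTLY_DIR.search(url)
    if match is None:
        raise ValueError(f"no nightly timestamp found in {url!r}")
    return "".join(match.groups())


def build_time(build_id: str) -> datetime:
    """Parse a build ID of the form YYYYMMDDHHMMSS into a datetime."""
    return datetime.strptime(build_id, "%Y%m%d%H%M%S")
```

Because build IDs in this form sort chronologically as plain strings, comparing them directly is enough to place a crash report inside or outside the range.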

I've forgotten how to compute/display mozilla-central regression ranges. Someone please remind me.

In the meantime I'll try to refresh my own memory.

I can't help but think that bug 1679922 had something to do with it. Though I doubt it caused this bug directly.

Flags: needinfo?(nicolas.b.pierron)

Another possibility is bug 1684672. Now that I think of it, this seems more likely.

Flags: needinfo?(nicolas.b.pierron) → needinfo?(greenrecyclebin)
Flags: needinfo?(greenrecyclebin)

I'm going to try reversing the patch to bug 1684672, to see if this makes a difference.

Actually I won't be able to reverse it. But maybe I can fiddle with the minimum supported version.

Yes, bug 1684672 only made changes to docs. Sigh :-(

Bug 1684672 'only' updates docs, builds should be unaffected. Bug 1686920 says it removes dead code.

Regarding bug 1679922, see bug 1689807 where unsigned jit code caused application crashes.

hg bisect tells me that this bug was triggered (on mozilla-central) by the following commit:

changeset: 564037:d67976888e73
user: Gabriele Svelto <gsvelto@mozilla.com>
date: Thu Jan 21 08:06:43 2021 +0000
summary: Bug 1686920 - Remove out-of-tree breakpad patches that were used for dump_syms r=KrisWright

So this bug is apparently a problem in Crashreporter code.
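For reference, the idea behind the bisection is a plain binary search over an ordered list of revisions. The sketch below is not how hg bisect is implemented. It only illustrates the search, assuming the first entry is known good, the last is known bad, and the bug never disappears once introduced. The names first_bad and is_bad are hypothetical.

```python
from typing import Callable, Sequence


def first_bad(builds: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the first build for which is_bad is true.

    Assumes builds[0] is good, builds[-1] is bad, and that every build
    after the first bad one is also bad.
    """
    lo, hi = 0, len(builds) - 1  # lo is always good, hi is always bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(builds[mid]):
            hi = mid
        else:
            lo = mid
    return builds[hi]
```

In practice is_bad would mean "crash this build with SIGABRT and check whether the system frames in the report are symbolicated", which is a manual step here.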

I'm going to let it simmer on the back burner for a few days. Then if nobody else has claimed it, I'll assign it to myself and start working on it.

Regressed by: 1686920
Has Regression Range: --- → yes
Flags: needinfo?(bhearsum)
Summary: No system calls symbolicated in macOS ARM64 crash stacks on release channels other than "release" → No system calls symbolicated in macOS ARM64 crash stacks on 86 branch and above

While removing the out-of-tree Breakpad patches that were used by dump_syms I accidentally removed the fix for bug 1371390. I don't know why I did that given that even the patch name and comment stated it was a fix for minidump generation rather than symbol emission... I'll restore the patch.

Assignee: nobody → gsvelto
Flags: needinfo?(gsvelto)

Gabriele, could you request an uplift to mozilla-release? I think we'll need your patch for our 86 RC build, thanks!

Flags: needinfo?(gsvelto)
Pushed by gsvelto@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/bd60bfcd99c2 Reinstate the out-of-tree Breakpad patch fixing bug 1371390 r=KrisWright

Comment on attachment 9203024 [details]
Bug 1692531 - Reinstate the out-of-tree Breakpad patch fixing bug 1371390 r=KrisWright

Beta/Release Uplift Approval Request

  • User impact if declined: No impact on users, but the quality of the crash reports we receive is lowered
  • Is this code covered by automated tests?: Yes
  • Has the fix been verified in Nightly?: Yes
  • Needs manual test from QE?: No
  • If yes, steps to reproduce:
  • List of other uplifts needed: None
  • Risk to taking this patch: Low
  • Why is the change risky/not risky? (and alternatives if risky): This code had already been in use for a long time, we're just re-instating it
  • String changes made/needed: none
Flags: needinfo?(gsvelto)
Attachment #9203024 - Flags: approval-mozilla-beta?
Attachment #9203024 - Flags: approval-mozilla-beta? → approval-mozilla-release?
Status: NEW → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
Target Milestone: --- → 87 Branch

Comment on attachment 9203024 [details]
Bug 1692531 - Reinstate the out-of-tree Breakpad patch fixing bug 1371390 r=KrisWright

Taking as a ride-along in our RC2, thanks.

Attachment #9203024 - Flags: approval-mozilla-release? → approval-mozilla-release+