Closed Bug 1690604 Opened 4 years ago Closed 3 years ago

Crashes on all versions of macOS 11 reported as "macOS 10.16.0" on AMD cpu architecture

Categories

(Toolkit :: Crash Reporting, defect)

x86_64
macOS

RESOLVED FIXED
88 Branch
Tracking Status
firefox88 --- fixed

People

(Reporter: smichaud, Assigned: smichaud)

See Also: → 1616404

In bug 1675245 I didn't map 10.16 to 11. Crash reports don't contain the pretty OS version number. Should this be manually maintained in Socorro?

The biggest problem is that every version of macOS 11 is reported with the same version number, so we have to use build ids to tell them apart when searching through crash reports. Having to deal with two sets of numbers (10.16... and 11...) would also be a pain. But if we had "10.16.0.1", "10.16.1.0", "10.16.2.0" and so forth it'd be more manageable.

Where does the "10.16.0" come from? From the minidump? If so, then we'll need to change Crashreporter code. I've done that several times recently, and so am reasonably familiar with that codebase. If the "10.16.0" version numbers do come from the minidump, I'll assign this bug to myself, and try to deal with it sometime in the next few weeks. I expect I'll standardize on the 11... version numbers.

Assignee: nobody → smichaud

I've found a simple fix for this bug:

https://hg.mozilla.org/try/rev/043ff153ec3adef98f7a1f52cff8cf8135f82935
https://treeherder.mozilla.org/jobs?repo=try&revision=6b2afeccd1b067a91b79cd6215dbfc50317ce53b

But it's moderately risky, and not perfect. So I'm going to hold off seeking a review until FF 86 is released. That way it'll have a long time to bake on the trunk and on beta.

As the URL for this bug points out, the SYSTEM_VERSION_COMPAT environment variable can be used to control how the macOS version is reported to client programs on macOS Big Sur. If it's unset or set to 1, the OS reports a "compatible" system version, which it reads from /System/Library/CoreServices/SystemVersionCompat.plist. But if SYSTEM_VERSION_COMPAT is set to 0, or if the application from which the query takes place is built with the macOS 11 SDK, the OS reports the actual system version, which it reads from /System/Library/CoreServices/SystemVersion.plist.
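To compare the two files named above, something like the following Python sketch could be used. It is only an illustration, not part of the patch: the helper name read_product_version is made up here, and it assumes both plists carry a ProductVersion key. On a real system, what the first path returns may itself depend on the compatibility mode of the Python binary doing the reading.

```python
import plistlib
from pathlib import Path

# Paths taken from the comment above; they exist only on macOS 11 and later.
ACTUAL_PLIST = Path("/System/Library/CoreServices/SystemVersion.plist")
COMPAT_PLIST = Path("/System/Library/CoreServices/SystemVersionCompat.plist")


def read_product_version(plist_path):
    """Return the ProductVersion string from a plist, or None if the file is missing.

    Hypothetical helper for illustration; assumes a ProductVersion key.
    """
    try:
        with open(plist_path, "rb") as fh:
            return plistlib.load(fh).get("ProductVersion")
    except FileNotFoundError:
        return None


if __name__ == "__main__":
    print("actual:", read_product_version(ACTUAL_PLIST))
    print("compat:", read_product_version(COMPAT_PLIST))
```

On a machine without these files (anything other than macOS 11 or later) both lines print None.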

The problem with the "compatible" version of macOS Big Sur is that it's always reported as "10.16" or "10.16.0", regardless of which minor version is running (11.0, 11.0.1, 11.1, 11.2 or 11.2.1). So in order to distinguish between these versions, we need to see the actual macOS version number, and not the "compatible" one.
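The loss of information can be made concrete with a small Python sketch. The function names are hypothetical and the version strings are sample values, not real crash data. It just separates reported versions that carry the "compatible" 10.16 placeholder from those that don't:

```python
def is_compat_big_sur(platform_version):
    """True if a reported version is the Big Sur "compatible" placeholder 10.16[.x]."""
    parts = platform_version.strip().split(".")
    return parts[:2] == ["10", "16"]


def split_reports(platform_versions):
    """Separate placeholder versions (ambiguous) from all other reported versions."""
    ambiguous = [v for v in platform_versions if is_compat_big_sur(v)]
    precise = [v for v in platform_versions if not is_compat_big_sur(v)]
    return ambiguous, precise


if __name__ == "__main__":
    sample = ["10.16.0", "11.2.1", "10.15.7", "10.16"]  # made-up sample values
    print(split_reports(sample))
```

Every entry in the first list could be any macOS 11 release, which is why the build id is needed to tell them apart.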

Most code in the Mozilla tree doesn't need to make such fine-grained distinctions. So it'd be nice to arrange for only Crashreporter code to see the actual version number. But the SYSTEM_VERSION_COMPAT environment variable is read only once, very early in the application loading process (as /usr/lib/libSystem.dylib is initialized). So there's no point in changing its value once application code is running. And so the state of this environment variable (whether or not it's set, and what value it's set to) must be the same for all code in the firefox process and its child processes.
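The general point, that the variable has to be in the environment before a process starts and is then inherited by its children, can be illustrated with a short Python sketch. This is not how the patch works (the patch isn't shown in this thread). The helper name child_sees is hypothetical, and the child here is just another Python interpreter reporting what it finds in its environment.

```python
import os
import subprocess
import sys


def child_sees(value):
    """Launch a child with SYSTEM_VERSION_COMPAT set to value (or unset if None)
    and return what the child finds in its own environment."""
    env = dict(os.environ)
    env.pop("SYSTEM_VERSION_COMPAT", None)
    if value is not None:
        env["SYSTEM_VERSION_COMPAT"] = value
    result = subprocess.run(
        [sys.executable, "-c",
         "import os; print(os.environ.get('SYSTEM_VERSION_COMPAT', '<unset>'))"],
        env=env, capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    print(child_sees("0"))
    print(child_sees(None))
```

The environment is fixed at launch, so whatever the parent passes is what the child's startup code gets to read.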

As best I can tell, there's no code in the tree which needs to be changed if the actual version number is reported (as opposed to the "compatible" one). I'll list this code (what I've been able to find) in a later comment. But, as far as I know, macOS version checking can happen in JS code and in extensions. It may even be possible to check the OS version from HTML code. So my patch has the potential to change the behavior of "client" code, outside of the Mozilla tree. So this patch should bake for a while on trunk and beta, to see if any problems arise. I'd also very much appreciate someone telling me how to do OS version checks from JS code (and if possible from HTML code). Then I can check the behavior in other browsers like Chrome and Safari.

Even without this workaround, though, the actual macOS version number will be reported to applications that are built with the macOS 11 SDK. So client code will eventually need to adapt to seeing actual version numbers (and not just compatible ones), as these applications become more common. So at worst my patch will only force them to adapt a bit earlier.

I mentioned above that this patch isn't perfect. It only works if Firefox is run from the GUI (by double-clicking on it, or by using open from the command line), and not by invoking the firefox binary directly (from the command line or otherwise). But I don't think this is a serious problem. The vast majority of users will open Firefox from the GUI. Those who don't will presumably be sophisticated enough to know that they need to set SYSTEM_VERSION_COMPAT explicitly, like so:

    SYSTEM_VERSION_COMPAT=0 /Applications/Firefox.app/Contents/MacOS/firefox

Here's a list of the code I've been able to find in the Mozilla tree which does macOS version checking:

  1. All code changed by the patch for bug 1616404: https://hg.mozilla.org/mozilla-central/rev/37c75bf1c2b2

  2. Gfx version checking code: https://hg.mozilla.org/mozilla-central/annotate/147065aaa29152bf01bd0a6d1e5eb7089ff069e3/widget/cocoa/GfxInfo.mm#l37

  3. Sandbox version checking code: https://hg.mozilla.org/mozilla-central/annotate/147065aaa29152bf01bd0a6d1e5eb7089ff069e3/security/sandbox/mac/Sandbox.mm#l106

  4. Apple media decoder version checking code: https://hg.mozilla.org/mozilla-central/annotate/147065aaa29152bf01bd0a6d1e5eb7089ff069e3/dom/media/platforms/apple/AppleDecoderModule.cpp#l157

As I said in comment 4, none of it seems to need changing to accommodate my patch. __builtin_available() is a new Clang feature, and seems to be smart enough to work with either "macos 10.16" or "macos 11". I double-checked this by building the tree with my patch (with the macOS 11 SDK) on an Apple Silicon Mac Mini.

I found items 2 through 4 by searching on "10.16" in https://searchfox.org/ :-)

Here's a universal (ARM64 and AMD64) tryserver build to test with. Its Mozilla-specific symbols have been uploaded to the Mozilla symbol server (so its crash stacks should be fully symbolicated):

https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/CqhMqq6_QYqBYfVTvz2GFQ/runs/0/artifacts/public/build/target.dmg
https://treeherder.mozilla.org/jobs?repo=try&revision=3585d0659979fd858f4b23b1237c294916b7d637

You can make any of its processes crash by doing kill -6 [pid]. This sends a SIGABRT signal, and also generates a crash log.
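For anyone scripting this, here is a minimal Python sketch of the same signal delivery. It doesn't involve Firefox or the crash reporter at all: it starts a throwaway child process (an assumption made purely for illustration), sends it SIGABRT, and returns the child's exit status.

```python
import signal
import subprocess
import sys


def abort_child():
    """Start a throwaway child process, send it SIGABRT, and return its exit status."""
    child = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(60)"]
    )
    child.send_signal(signal.SIGABRT)  # the same signal that `kill -6 <pid>` sends
    return child.wait()


if __name__ == "__main__":
    print(abort_child())
```

On POSIX systems, subprocess reports a child killed by signal N as return code -N, so the printed value is the negative of SIGABRT's number.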

Testing with this build on an Apple Silicon machine is a pain. It's not signed, and it's not possible on these machines to just right-click and choose "Open" (though that still works fine on macOS 11 on an Intel machine). So you're going to have to sign it yourself, using some kind of Apple signing certificate. I used my own "developer id application" signing cert. And you may have to Option-drag it to make a copy, and sign that.

https://github.com/Homebrew/brew/issues/9082

    codesign -s "Your Name" -f --deep /Applications/Firefox\ Nightly.app

I've opened bug 1693422 to get ARM64 macOS tryserver builds signed.

(In reply to Steven Michaud [:smichaud] (Retired) from comment #6)

> Testing with this build on an Apple Silicon machine is a pain. It's not signed, and it's not possible on these machines to just right-click and choose "Open" (though that still works fine on macOS 11 on an Intel machine). So you're going to have to sign it yourself, using some kind of Apple signing certificate. I used my own "developer id application" signing cert. And you may have to Option-drag it to make a copy, and sign that.

It turns out things aren't quite this bad. Yes, there is a way to make a signed, universal tryserver build, and upload its symbols to the symbol server:

https://hg.mozilla.org/try/rev/46762f072a28b387c431efabd828162449d7e118

And here's the result of doing this:

https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/c9FrcO4SS9m6vwYCBdjZ3w/runs/0/artifacts/public/build/target.dmg

But there's still one more step you need to take after target.dmg is downloaded. Do the following to remove all its extended attributes, including its quarantine attribute:

    xattr -c target.dmg

Pushed by smichaud@pobox.com:
https://hg.mozilla.org/integration/autoland/rev/2a88e94f30ed
Get accurate system version info on all macOS builds. r=mstange
Status: NEW → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → 88 Branch

For a while now the 88 branch has been the release branch. Most amd64 crash reports now report the macOS 11 version correctly, but surprisingly many still don't. I've no idea why. It's hard to believe that so many people are running the firefox binary directly from the command line.

https://crash-stats.mozilla.org/search/?cpu_arch=amd64&version=88.0&platform=Mac%20OS%20X&date=%3E%3D2021-04-27T16%3A45%3A00.000Z&date=%3C2021-05-04T16%3A45%3A00.000Z&_facets=signature&_facets=platform_version&_sort=-date&_columns=date&_columns=signature&_columns=product&_columns=version&_columns=build_id&_columns=platform&_columns=platform_version#facet-platform_version
