Crashes [@ nsPNGEncoder::ConvertHostARGBRow ] on macOS, LLVM 21, on a third-party release (a Nix Package)
Categories
(Core :: Graphics: ImageLib, defect)
People
(Reporter: smichaud, Unassigned)
References
(Blocks 1 open bug)
These have been around for a while in small numbers, but they've increased rather dramatically on FF 144. Since they're macOS only, they're probably not a Mozilla bug. But some change on the 144 branch may have "encouraged" them. They're rare enough that we don't have any crashes in nightlies, or even betas. So it will be hard to tell exactly what that "encouragement" was.
This crash signature was also reported in bug 614144. But that bug is very old, and seems to have been fixed long ago.
I just noticed that almost all of these crashes are on the "default" release channel. On the trunk it'd mean local builds. I'm not sure what it means here. Maybe these crashes are a fluke.
bp-65141841-478f-47a9-9ba8-513240251021
Typical crash stack:
Crashing Thread (0), Name: MainThread
Frame Module Signature Source Trust
0 XUL nsPNGEncoder::ConvertHostARGBRow(unsigned char const*, unsigned char*, unsigned int, bool) /private/tmp/nix-build-firefox-unwrapped-144.0.drv-0/firefox-144.0/image/encoders/png/nsPNGEncoder.cpp:689 inlined
0 XUL nsPNGEncoder::AddImageFrame(unsigned char const*, unsigned int, unsigned int, unsigned int, unsigned int, unsigned int, nsTSubstring<char16_t> const&) /private/tmp/nix-build-firefox-unwrapped-144.0.drv-0/firefox-144.0/image/encoders/png/nsPNGEncoder.cpp:287 context
1 XUL nsPNGEncoder::InitFromData(unsigned char const*, unsigned int, unsigned int, unsigned int, unsigned int, unsigned int, nsTSubstring<char16_t> const&) /private/tmp/nix-build-firefox-unwrapped-144.0.drv-0/firefox-144.0/image/encoders/png/nsPNGEncoder.cpp:72 cfi
2 XUL mozilla::image::EncodeImageData(mozilla::gfx::DataSourceSurface*, mozilla::gfx::DataSourceSurface&::ScopedMap, nsTSubstring<char> const&, nsTSubstring<char16_t> const&, nsIInputStream**) /private/tmp/nix-build-firefox-unwrapped-144.0.drv-0/firefox-144.0/image/imgTools.cpp:442 cfi
3 XUL mozilla::image::EncodeImageData(mozilla::gfx::DataSourceSurface*, nsTSubstring<char> const&, nsTSubstring<char16_t> const&, nsIInputStream**) /private/tmp/nix-build-firefox-unwrapped-144.0.drv-0/firefox-144.0/image/imgTools.cpp:460 inlined
3 XUL mozilla::image::imgTools::EncodeScaledImage(imgIContainer*, nsTSubstring<char> const&, int, int, nsTSubstring<char16_t> const&, nsIInputStream**) /private/tmp/nix-build-firefox-unwrapped-144.0.drv-0/firefox-144.0/image/imgTools.cpp:527 cfi
4 XUL nsFaviconService::OptimizeIconSizes(mozilla::places::IconData&) /private/tmp/nix-build-firefox-unwrapped-144.0.drv-0/firefox-144.0/toolkit/components/places/nsFaviconService.cpp:711 cfi
5 XUL nsFaviconService::SetFaviconForPage(nsIURI*, nsIURI*, nsIURI*, long long, bool, JSContext*, mozilla::dom::Promise**) /private/tmp/nix-build-firefox-unwrapped-144.0.drv-0/firefox-144.0/toolkit/components/places/nsFaviconService.cpp:373 cfi
6 XUL _NS_InvokeByIndex cfi
7 XUL CallMethodHelper::Invoke() /private/tmp/nix-build-firefox-unwrapped-144.0.drv-0/firefox-144.0/js/xpconnect/src/XPCWrappedNative.cpp:1620 inlined
7 XUL CallMethodHelper::Call() /private/tmp/nix-build-firefox-unwrapped-144.0.drv-0/firefox-144.0/js/xpconnect/src/XPCWrappedNative.cpp:1174 inlined
7 XUL XPCWrappedNative::CallMethod(XPCCallContext&, XPCWrappedNative::CallMode) /private/tmp/nix-build-firefox-unwrapped-144.0.drv-0/firefox-144.0/js/xpconnect/src/XPCWrappedNative.cpp:1120 cfi
8 XUL XPC_WN_CallMethod(JSContext*, unsigned int, JS::Value*) /private/tmp/nix-build-firefox-unwrapped-144.0.drv-0/firefox-144.0/js/xpconnect/src/XPCWrappedNativeJSOps.cpp:966 cfi
9 XUL CallJSNative(JSContext*, bool (*)(JSContext*, unsigned int, JS::Value*), js::CallReason, JS::CallArgs const&) /private/tmp/nix-build-firefox-unwrapped-144.0.drv-0/firefox-144.0/js/src/vm/Interpreter.cpp:501 inlined
9 XUL js::InternalCallOrConstruct(JSContext*, JS::CallArgs const&, js::MaybeConstruct, js::CallReason) /private/tmp/nix-build-firefox-unwrapped-144.0.drv-0/firefox-144.0/js/src/vm/Interpreter.cpp:597 cfi
10 XUL InternalCall(JSContext*, js::AnyInvokeArgs const&, js::CallReason) /private/tmp/nix-build-firefox-unwrapped-144.0.drv-0/firefox-144.0/js/src/vm/Interpreter.cpp:664 inlined
10 XUL js::CallFromStack(JSContext*, JS::CallArgs const&, js::CallReason) /private/tmp/nix-build-firefox-unwrapped-144.0.drv-0/firefox-144.0/js/src/vm/Interpreter.cpp:669 inlined
10 XUL js::Interpret(JSContext*, js::RunState&) /private/tmp/nix-build-firefox-unwrapped-144.0.drv-0/firefox-144.0/js/src/vm/Interpreter.cpp:3287 cfi
11 XUL MaybeEnterInterpreterTrampoline(JSContext*, js::RunState&) /private/tmp/nix-build-firefox-unwrapped-144.0.drv-0/firefox-144.0/js/src/vm/Interpreter.cpp:395 inlined
11 XUL js::RunScript(JSContext*, js::RunState&) /private/tmp/nix-build-firefox-unwrapped-144.0.drv-0/firefox-144.0/js/src/vm/Interpreter.cpp:471 cfi
12 XUL js::InternalCallOrConstruct(JSContext*, JS::CallArgs const&, js::MaybeConstruct, js::CallReason) /private/tmp/nix-build-firefox-unwrapped-144.0.drv-0/firefox-144.0/js/src/vm/Interpreter.cpp:629 cfi
13 XUL InternalCall(JSContext*, js::AnyInvokeArgs const&, js::CallReason) /private/tmp/nix-build-firefox-unwrapped-144.0.drv-0/firefox-144.0/js/src/vm/Interpreter.cpp:664 inlined
13 XUL js::Call(JSContext*, JS::Handle<JS::Value>, JS::Handle<JS::Value>, js::AnyInvokeArgs const&, JS::MutableHandle<JS::Value>, js::CallReason) /private/tmp/nix-build-firefox-unwrapped-144.0.drv-0/firefox-144.0/js/src/vm/Interpreter.cpp:696 cfi
14 XUL JS::Call(JSContext*, JS::Handle<JS::Value>, JS::Handle<JS::Value>, JS::HandleValueArray const&, JS::MutableHandle<JS::Value>) /private/tmp/nix-build-firefox-unwrapped-144.0.drv-0/firefox-144.0/js/src/vm/CallAndConstruct.cpp:119 cfi
15 XUL mozilla::dom::MessageListener::ReceiveMessage(mozilla::dom::BindingCallContext&, JS::Handle<JS::Value>, mozilla::dom::ReceiveMessageArgument const&, JS::MutableHandle<JS::Value>, mozilla::ErrorResult&) s3:gecko-generated-sources-l1:d01fd731b0a2dd12291b621774d88619c1a60454bd2760958e3a1055b2740665edb6c236dbdb2666091892c4f2013157a68f155e2aa02d601f037ad66b6bad8f/dom/bindings/MessageManagerBinding.cpp::5756 cfi
16 XUL mozilla::dom::MessageListener::ReceiveMessage(mozilla::dom::ReceiveMessageArgument const&, JS::MutableHandle<JS::Value>, mozilla::ErrorResult&, char const*, mozilla::dom::CallbackObjectBase::ExceptionHandling, JS::Realm*) s3:gecko-generated-sources-l1:9b2aa2670f267a2aefbcacbab9744eb3c967b72481e28f8725f4ed272f5c42f1d186cbd276a0dd4a5ddf6a57cf2273b50a8620351bb9e78546b89866bf8ede5b/dist/include/mozilla/dom/MessageManagerBinding.h::579 inlined
16 XUL mozilla::dom::JSActor::CallReceiveMessage(JSContext*, mozilla::dom::JSActorMessageMeta const&, JS::Handle<JS::Value>, JS::MutableHandle<JS::Value>, mozilla::ErrorResult&) /private/tmp/nix-build-firefox-unwrapped-144.0.drv-0/firefox-144.0/dom/ipc/jsactor/JSActor.cpp:289 cfi
17 XUL mozilla::dom::JSActor::ReceiveMessage(JSContext*, mozilla::dom::JSActorMessageMeta const&, JS::Handle<JS::Value>, mozilla::ErrorResult&) /private/tmp/nix-build-firefox-unwrapped-144.0.drv-0/firefox-144.0/dom/ipc/jsactor/JSActor.cpp:305 cfi
18 XUL mozilla::dom::JSActorManager::ReceiveRawMessage(mozilla::dom::JSActorMessageMeta const&, std::__1::unique_ptr<mozilla::dom::ipc::StructuredCloneData, std::__1::default_delete<mozilla::dom::ipc::StructuredCloneData> >, std::__1::unique_ptr<mozilla::dom::ipc::StructuredCloneData, std::__1::default_delete<mozilla::dom::ipc::StructuredCloneData> >) /private/tmp/nix-build-firefox-unwrapped-144.0.drv-0/firefox-144.0/dom/ipc/jsactor/JSActorManager.cpp:226 cfi
19 XUL mozilla::dom::WindowGlobalParent::RecvRawMessage(mozilla::dom::JSActorMessageMeta const&, std::__1::unique_ptr<mozilla::dom::ClonedMessageData, std::__1::default_delete<mozilla::dom::ClonedMessageData> > const&, std::__1::unique_ptr<mozilla::dom::ClonedMessageData, std::__1::default_delete<mozilla::dom::ClonedMessageData> > const&) /private/tmp/nix-build-firefox-unwrapped-144.0.drv-0/firefox-144.0/dom/ipc/WindowGlobalParent.cpp:569 cfi
20 XUL mozilla::dom::PWindowGlobalParent::OnMessageReceived(IPC::Message const&) s3:gecko-generated-sources-l1:b9d0391237c1672b1bd690918a26ecd26e7b82540a738291ae5a277bef1c3a57a27b451b8e6af46f80428546cccae96c95f35728b7784600cfc385bd5e3d2a79/ipc/ipdl/PWindowGlobalParent.cpp::903 cfi
21 XUL mozilla::dom::PContentParent::OnMessageReceived(IPC::Message const&) s3:gecko-generated-sources-l1:a518c1c70c82c36194e2189e4f3f6fb94ebbdc9d2070858295f5b9877e2741175e32b75e881c198574d46d9b926a385bf539b8b099d6da3c6e07739dc33f87ee/ipc/ipdl/PContentParent.cpp::6412 cfi
22 XUL mozilla::ipc::MessageChannel::DispatchAsyncMessage(mozilla::ipc::ActorLifecycleProxy*, IPC::Message const&) /private/tmp/nix-build-firefox-unwrapped-144.0.drv-0/firefox-144.0/ipc/glue/MessageChannel.cpp:1797 cfi
23 XUL mozilla::ipc::MessageChannel::DispatchMessage(mozilla::ipc::ActorLifecycleProxy*, std::__1::unique_ptr<IPC::Message, std::__1::default_delete<IPC::Message> >) /private/tmp/nix-build-firefox-unwrapped-144.0.drv-0/firefox-144.0/ipc/glue/MessageChannel.cpp:1723 cfi
24 XUL mozilla::ipc::MessageChannel::RunMessage(mozilla::ipc::ActorLifecycleProxy*, mozilla::ipc::MessageChannel::MessageTask&) /private/tmp/nix-build-firefox-unwrapped-144.0.drv-0/firefox-144.0/ipc/glue/MessageChannel.cpp:1512 cfi
25 XUL mozilla::ipc::MessageChannel::MessageTask::Run() /private/tmp/nix-build-firefox-unwrapped-144.0.drv-0/firefox-144.0/ipc/glue/MessageChannel.cpp:1614 cfi
26 XUL mozilla::RunnableTask::Run() /private/tmp/nix-build-firefox-unwrapped-144.0.drv-0/firefox-144.0/xpcom/threads/TaskController.cpp:703 cfi
27 XUL mozilla::TaskController::RunTask(mozilla::Task*) /private/tmp/nix-build-firefox-unwrapped-144.0.drv-0/firefox-144.0/xpcom/threads/TaskController.cpp:228 inlined
27 XUL mozilla::TaskController::DoExecuteNextTaskOnlyMainThreadInternal(mozilla::detail::BaseAutoLock<mozilla::Mutex&> const&) /private/tmp/nix-build-firefox-unwrapped-144.0.drv-0/firefox-144.0/xpcom/threads/TaskController.cpp:1323 cfi
28 XUL mozilla::TaskController::ExecuteNextTaskOnlyMainThreadInternal(mozilla::detail::BaseAutoLock<mozilla::Mutex&> const&) /private/tmp/nix-build-firefox-unwrapped-144.0.drv-0/firefox-144.0/xpcom/threads/TaskController.cpp:1146 cfi
29 XUL mozilla::TaskController::ProcessPendingMTTask(bool) /private/tmp/nix-build-firefox-unwrapped-144.0.drv-0/firefox-144.0/xpcom/threads/TaskController.cpp:639 inlined
29 XUL mozilla::TaskController::TaskController()::$_0::operator()() const /private/tmp/nix-build-firefox-unwrapped-144.0.drv-0/firefox-144.0/xpcom/threads/TaskController.cpp:333 inlined
29 XUL mozilla::detail::RunnableFunction<mozilla::TaskController::TaskController()::$_0>::Run() /private/tmp/nix-build-firefox-unwrapped-144.0.drv-0/firefox-144.0/xpcom/threads/nsThreadUtils.h:549 cfi
30 XUL nsThread::ProcessNextEvent(bool, bool*) /private/tmp/nix-build-firefox-unwrapped-144.0.drv-0/firefox-144.0/xpcom/threads/nsThread.cpp:1157 cfi
31 XUL NS_ProcessPendingEvents(nsIThread*, unsigned int) /private/tmp/nix-build-firefox-unwrapped-144.0.drv-0/firefox-144.0/xpcom/threads/nsThreadUtils.cpp:427 cfi
32 XUL nsBaseAppShell::NativeEventCallback() /private/tmp/nix-build-firefox-unwrapped-144.0.drv-0/firefox-144.0/widget/nsBaseAppShell.cpp:87 cfi
33 XUL nsAppShell::ProcessGeckoEvents(void*) /private/tmp/nix-build-firefox-unwrapped-144.0.drv-0/firefox-144.0/widget/cocoa/nsAppShell.mm:534 cfi
34 CoreFoundation __CFRUNLOOP_IS_CALLING_OUT_TO_A_SOURCE0_PERFORM_FUNCTION__ cfi
35 CoreFoundation __CFRunLoopDoSource0 cfi
36 CoreFoundation __CFRunLoopDoSources0 cfi
37 CoreFoundation __CFRunLoopRun cfi
38 CoreFoundation CFRunLoopRunSpecific cfi
39 HIToolbox RunCurrentEventLoopInMode cfi
40 HIToolbox ReceiveNextEventCommon cfi
41 HIToolbox _BlockUntilNextEventMatchingListInModeWithFilter cfi
42 AppKit _DPSNextEvent cfi
43 AppKit -[NSApplication(NSEventRouting) _nextEventMatchingEventMask:untilDate:inMode:dequeue:] cfi
44 XUL -[GeckoNSApplication nextEventMatchingMask:untilDate:inMode:dequeue:] /private/tmp/nix-build-firefox-unwrapped-144.0.drv-0/firefox-144.0/widget/cocoa/nsAppShell.mm:189 cfi
45 AppKit -[NSApplication run] cfi
46 XUL -[GeckoNSApplication run] /private/tmp/nix-build-firefox-unwrapped-144.0.drv-0/firefox-144.0/widget/cocoa/nsAppShell.mm:173 cfi
47 XUL nsAppShell::Run() /private/tmp/nix-build-firefox-unwrapped-144.0.drv-0/firefox-144.0/widget/cocoa/nsAppShell.mm:864 cfi
48 XUL nsAppStartup::Run() /private/tmp/nix-build-firefox-unwrapped-144.0.drv-0/firefox-144.0/toolkit/components/startup/nsAppStartup.cpp:291 cfi
49 XUL XREMain::XRE_mainRun() /private/tmp/nix-build-firefox-unwrapped-144.0.drv-0/firefox-144.0/toolkit/xre/nsAppRunner.cpp:5922 cfi
50 XUL XREMain::XRE_main(int, char**, mozilla::BootstrapConfig const&) /private/tmp/nix-build-firefox-unwrapped-144.0.drv-0/firefox-144.0/toolkit/xre/nsAppRunner.cpp:6167 cfi
51 XUL XRE_main(int, char**, mozilla::BootstrapConfig const&) /private/tmp/nix-build-firefox-unwrapped-144.0.drv-0/firefox-144.0/toolkit/xre/nsAppRunner.cpp:6240 cfi
How to search for these crashes:
Comment 1 • 5 months ago (Reporter)
I searched on "nix-build-firefox-unwrapped" and found a hit that seems to indicate these crashes happen in a build extracted from a "Nix package".
Comment 2 • 5 months ago (Reporter)
There's a "Firefox Nix Package" at https://mynixos.com/nixpkgs/package/firefox. But it's version 142.0.1, and in any case I can't figure out how to "install" it (even after signing in).
Comment 3 • 5 months ago (Reporter)
I just checked the only third-party Firefox build I have access to (running on Ubuntu Linux), and it's on the "release" channel. So a third-party Firefox release running on the "default" channel is (probably) unorthodox, but maybe not unheard of.
Comment 4 • 5 months ago
It seems more like a time-based thing than a version-based thing, because there are a decent number of crashes on 143, and the crashes started right at the start of 144, when 143 was still being used significantly. Also, I checked ImageEncoder and nsPNGEncoder: no changes for 144.
Almost all of these are crashing the parent process.
And it looks like if a machine crashes once with this signature it is very likely to crash again with the same signature.
From looking at a random selection of crashes it seems like it is more common to have a stack like https://crash-stats.mozilla.org/report/index/c3684c19-2a51-4a4e-bac2-b7f9e0251017 where it goes through nsFaviconService::SetFaviconForPage and nsFaviconService::OptimizeIconSizes vs something like comment 0 where it starts at EncodingRunnable::Run.
nsFaviconService.h/cpp also wasn't touched for 144.
Comment 5 • 5 months ago (Reporter)
(In reply to Timothy Nikkel (:tnikkel) from comment #4)
From looking at a random selection of crashes it seems like it is more common to have a stack like https://crash-stats.mozilla.org/report/index/c3684c19-2a51-4a4e-bac2-b7f9e0251017 where it goes through nsFaviconService::SetFaviconForPage and nsFaviconService::OptimizeIconSizes vs something like comment 0 where it starts at EncodingRunnable::Run.
nsFaviconService.h/cpp also wasn't touched for 144.
You're right. And, oddly, I already knew this, from looking at the Proto Signature Facet from my search in comment #0. Brain fart, I guess. I've edited comment #0 to fix the problem.
Comment 6 • 5 months ago (Reporter)
Before we can make progress here, we need to get hold of whatever third-party Firefox distro(s) is/are experiencing the crashes. I'll keep trying, but it'd be good to hear from someone who knows more about third-party distros than I do, particularly "Nix Packages".
Comment 7 • 5 months ago (Reporter)
Search that's limited to the "default" channel. That seems to be what this bug is really about. As best I can tell, all these crashes are in "Nix Packages", and include versions 144.0, 143.0.4 and 140.4.0. (That last must be an esr branch version number.)
Comment 8 • 5 months ago
Maybe we can try to contact the Nix people since it seems specific to them?
Comment 9 • 5 months ago (Reporter)
(In reply to Timothy Nikkel (:tnikkel) from comment #8)
Maybe we can try to contact the Nix people since it seems specific to them?
Sounds good to me. I think it should probably be someone from Mozilla who does it. I think the "Nix people" are at https://nixos.org/.
I've set up a macOS 15.7.1 VM on which to play around with "Nix packages". I'll report here if I find anything interesting.
Comment 10 • 5 months ago
I filed an issue on their GitHub, which seemed to be the best way to try to establish contact: https://github.com/NixOS/nixpkgs/issues/454734
Comment 11 • 5 months ago (Reporter)
The Nix package manager is a tough nut to crack. But as best I can tell, all its official "packages" are source distros, and need to be built each time they're installed. I haven't yet managed to do that: There's a bug that hasn't yet been fixed in a Nix release.
So the builds whose crashes we're tracking here are presumably unofficial in some way.
Comment 12 • 5 months ago (Reporter)
(In reply to Steven Michaud [:smichaud] (Retired) from comment #11)
So the builds whose crashes we're tracking here are presumably unofficial in some way.
But all the recent "Nix package" crashes on FF 144 have the same build id (20251009125714) as Mozilla's own FF 144 builds.
Comment 13 • 5 months ago
(In reply to Steven Michaud [:smichaud] (Retired) from comment #12)
(In reply to Steven Michaud [:smichaud] (Retired) from comment #11)
So the builds whose crashes we're tracking here are presumably unofficial in some way.
But all the recent "Nix package" crashes on FF 144 have the same build id (20251009125714) as Mozilla's own FF 144 builds.
That might just be because anything that builds 144 gets that build id?
Comment 14 • 5 months ago
There are some steps to reproduce here: https://github.com/NixOS/nixpkgs/issues/453372#issuecomment-3440425174
Comment 15 • 4 months ago (Reporter)
I can reproduce these crashes, using an ARM64 Nix Package build of Firefox 144 made on macOS 26.0.1.
bp-143fd849-2927-422c-bafb-cb5720251024
bp-5231f0a3-1d8c-4a04-8c7e-c1fc70251024
Comment 16 • 4 months ago (Reporter)
This bug's crashes are definitely caused by a bug in LLVM 21. I'm able to trigger them in a local mozilla-central build made using clang and lld from LLVM 21. This happens with LLVM 21.1.2 as used by Nix, and also with the latest release (LLVM 21.1.4).
They don't happen with LLVM 20. Which should be good news to those working on bug 1923255.
Source and binary distros for LLVM are available here. But there are no macOS binaries for LLVM 21 -- only for LLVM 20. So I had to build them myself, which was a royal pain.
llvm.org's build instructions are incomplete and misleading, so I think it's worthwhile documenting how I built LLVM 21. The basic commands I used were as follows. I first installed Homebrew's cmake and ninja.
cmake -S llvm -B build -G Ninja -DLLVM_ENABLE_PROJECTS="clang;clang-tools-extra;lld" -DLLVM_ENABLE_RUNTIMES="libcxx;libcxxabi;libunwind;compiler-rt" -DCMAKE_BUILD_TYPE=Release -DLIBCXX_ENABLE_VENDOR_AVAILABILITY_ANNOTATIONS=ON
ninja -C build all
ninja -C build install
LLVM recommends building in two stages -- first using the Apple compiler and linker (clang and ld64); second using the LLVM compiler and linker (clang and lld) that you just installed. I did this. For the first stage I used an additional parameter (for example -DCMAKE_INSTALL_PREFIX=/staging/directory) to install into a staging directory. For the second, I temporarily put the staging directory at the start of the path: For example export PATH="/staging/directory/bin:$PATH". And I changed the install prefix to point to something more permanent -- for example -DCMAKE_INSTALL_PREFIX=/usr/local/llvm-project-21.1.4. For this you need (of course) to use sudo ninja -C build install.
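The two-stage procedure above can be sketched as a small script. This is a hedged outline, not the exact commands used in this report: the separate build-stage1 and build-stage2 directories and the run helper are assumptions added for illustration, and the prefixes are the example paths from the paragraph above. It prints each command by default and only executes them when RUN=1 is set.

```shell
#!/bin/sh
# Sketch of the two-stage LLVM build described above.
# Dry run by default: commands are printed, not executed. Set RUN=1 to run them.
# The prefixes are the example paths from the text; the build directory
# names are hypothetical.
STAGE1_PREFIX="${STAGE1_PREFIX:-/staging/directory}"
FINAL_PREFIX="${FINAL_PREFIX:-/usr/local/llvm-project-21.1.4}"

# Print a command, and execute it only when RUN=1.
run() {
  echo "+ $*"
  if [ "${RUN:-0}" = "1" ]; then "$@" || exit 1; fi
}

# Configure and build LLVM. $1 = build directory, $2 = install prefix.
configure_and_build() {
  run cmake -S llvm -B "$1" -G Ninja \
    "-DLLVM_ENABLE_PROJECTS=clang;clang-tools-extra;lld" \
    "-DLLVM_ENABLE_RUNTIMES=libcxx;libcxxabi;libunwind;compiler-rt" \
    -DCMAKE_BUILD_TYPE=Release \
    -DLIBCXX_ENABLE_VENDOR_AVAILABILITY_ANNOTATIONS=ON \
    "-DCMAKE_INSTALL_PREFIX=$2"
  run ninja -C "$1" all
}

# Stage 1: build with the Apple tools and install into the staging directory.
configure_and_build build-stage1 "$STAGE1_PREFIX"
run ninja -C build-stage1 install

# Stage 2: put the stage-1 tools first on PATH, rebuild, install for real.
PATH="$STAGE1_PREFIX/bin:$PATH"
export PATH
configure_and_build build-stage2 "$FINAL_PREFIX"
run sudo ninja -C build-stage2 install
```

Run it from the top of an llvm-project source tree. Reviewing the printed commands before setting RUN=1 is a cheap way to catch a wrong prefix before a build that takes hours.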
To get ./mach configure and ./mach build to use the "wrong" version of LLVM, I created new shortcuts for clang and lld in ~/.mozbuild/clang/bin.
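One way to create such shortcuts is with symlinks. The helper below is a sketch under assumptions: link_llvm_tools is a hypothetical name, and the default tool names (clang and lld) are simply the two tools mentioned above. The names your build actually looks up may differ.

```shell
# Hypothetical helper: make names in a toolchain bin directory point at a
# locally built LLVM. Existing entries with the same names are replaced.
link_llvm_tools() {
  # $1 = LLVM install prefix
  # $2 = directory that will hold the symlinks
  # remaining arguments = tool names (defaults to clang and lld)
  llvm_prefix=$1
  link_dir=$2
  shift 2
  [ "$#" -gt 0 ] || set -- clang lld
  mkdir -p "$link_dir" || return 1
  for tool in "$@"; do
    # Skip tools the LLVM install does not provide instead of
    # creating dangling links.
    if [ ! -x "$llvm_prefix/bin/$tool" ]; then
      echo "skipping $tool: not found in $llvm_prefix/bin" >&2
      continue
    fi
    ln -sf "$llvm_prefix/bin/$tool" "$link_dir/$tool" || return 1
  done
  return 0
}
```

For the setup described here, a call might look like link_llvm_tools /usr/local/llvm-project-21.1.4 "$HOME/.mozbuild/clang/bin". It is probably wise to back up that directory first, since the helper overwrites entries with the same names.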
./mach configure fails with both LLVM 21 and LLVM 20. I needed to specify ./mach configure --without-wasm-sandboxed-libraries. And for both I also needed to patch trunk code as follows:
diff --git a/third_party/zucchini/chromium/components/zucchini/suffix_array_unittest.cc b/third_party/zucchini/chromium/components/zucchini/suffix_array_unittest.cc
--- a/third_party/zucchini/chromium/components/zucchini/suffix_array_unittest.cc
+++ b/third_party/zucchini/chromium/components/zucchini/suffix_array_unittest.cc
@@ -22,7 +22,8 @@ using SLType = InducedSuffixSort::SLType
} // namespace
-using ustring = std::basic_string<unsigned char>;
+//using ustring = std::basic_string<unsigned char>;
+using ustring = std::vector<unsigned char>;
constexpr uint16_t kNumChar = 256;
I found valuable help here and here. This is where I learned that you need to use -DLIBCXX_ENABLE_VENDOR_AVAILABILITY_ANNOTATIONS=ON.
I'll be trying to find the bug in LLVM 21. It may take a while.
Comment 17 • 4 months ago
FYI glandium, seems to be a bug in llvm 21 in case that affects any of your plans.
Comment 18 • 4 months ago
Thanks for the investigation. We have now pinned LLVM 20 for aarch64-darwin in nixpkgs and are watching this issue.
Comment 19 • 4 months ago (Reporter)
I've found the LLVM commit that caused these crashes, and have commented there.
I've been trying to write a reduced testcase, so far without any success. Once I've managed it, I'll open an issue at the LLVM Project.
Comment 20 • 4 months ago (Reporter)
For those who are curious, here are two patches, each of which works around the LLVM 21 bug I found. These apply to the release/21.x branch.
diff --git a/llvm/lib/CodeGen/MachineCopyPropagation.cpp b/llvm/lib/CodeGen/MachineCopyPropagation.cpp
index 742de1101faa..f6a6c9ef9d19 100644
--- a/llvm/lib/CodeGen/MachineCopyPropagation.cpp
+++ b/llvm/lib/CodeGen/MachineCopyPropagation.cpp
@@ -942,7 +942,9 @@ void MachineCopyPropagation::ForwardCopyPropagateBlock(MachineBasicBlock &MBB) {
// are the same and are not referring to a reserved register). If so,
// delete it.
if (RegSrc == RegDef && !MRI->isReserved(RegSrc)) {
+#if (0)
MI.eraseFromParent();
+#endif
NumDeletes++;
Changed = true;
continue;
diff --git a/llvm/lib/Target/AArch64/AArch64InstrInfo.cpp b/llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
index 5420545cc3ce..65e4c0f11373 100644
--- a/llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
+++ b/llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
@@ -10057,6 +10057,7 @@ AArch64InstrInfo::isCopyInstrImpl(const MachineInstr &MI) const {
// AArch64::ORRWrs and AArch64::ORRXrs with WZR/XZR reg
// and zero immediate operands used as an alias for mov instruction.
+#if (0)
if (((MI.getOpcode() == AArch64::ORRWrs &&
MI.getOperand(1).getReg() == AArch64::WZR &&
MI.getOperand(3).getImm() == 0x0) ||
@@ -10069,6 +10070,7 @@ AArch64InstrInfo::isCopyInstrImpl(const MachineInstr &MI) const {
MI.findRegisterDefOperandIdx(getXRegFromWReg(MI.getOperand(0).getReg()),
/*TRI=*/nullptr) == -1))
return DestSourcePair{MI.getOperand(0), MI.getOperand(2)};
+#endif
if (MI.getOpcode() == AArch64::ORRXrs &&
MI.getOperand(1).getReg() == AArch64::XZR &&
And here's a debug logging patch that I've been using:
diff --git a/llvm/lib/CodeGen/MachineCopyPropagation.cpp b/llvm/lib/CodeGen/MachineCopyPropagation.cpp
index 742de1101faa..ca3d3c5dc433 100644
--- a/llvm/lib/CodeGen/MachineCopyPropagation.cpp
+++ b/llvm/lib/CodeGen/MachineCopyPropagation.cpp
@@ -450,10 +450,18 @@ public:
}
};
+// This debug logging uses https://github.com/steven-michaud/PySerialPortLogger.
+// Install it, then run Terminal and open three tabs. Then run serialportlogger
+// in its third tab.
+#define BUGZILLA_1995582_DEBUG_LOG 1
+
class MachineCopyPropagation {
const TargetRegisterInfo *TRI = nullptr;
const TargetInstrInfo *TII = nullptr;
const MachineRegisterInfo *MRI = nullptr;
+#ifdef BUGZILLA_1995582_DEBUG_LOG
+ const MachineFunction *MF_ = nullptr;
+#endif
// Return true if this is a copy instruction and false otherwise.
bool UseCopyInstr;
@@ -874,6 +882,31 @@ void MachineCopyPropagation::forwardUses(MachineInstr &MI) {
}
}
+#ifdef BUGZILLA_1995582_DEBUG_LOG
+#include <stdarg.h>
+#include <stdio.h>
+#include <fcntl.h>
+#include <termios.h>
+
+#define VIRTUAL_SERIAL_PORT "/dev/ttys003"
+bool g_virtual_serial_checked = false;
+int g_virtual_serial = -1;
+std::unique_ptr<raw_fd_ostream> g_virtual_serial_stream;
+
+static void maybe_initialize_tty()
+{
+ if (!g_virtual_serial_checked) {
+ g_virtual_serial_checked = true;
+ g_virtual_serial =
+ open(VIRTUAL_SERIAL_PORT, O_WRONLY | O_NONBLOCK | O_NOCTTY);
+ if (g_virtual_serial >= 0) {
+ g_virtual_serial_stream =
+ std::make_unique<raw_fd_ostream>(g_virtual_serial, false, true);
+ }
+ }
+}
+#endif
+
void MachineCopyPropagation::ForwardCopyPropagateBlock(MachineBasicBlock &MBB) {
LLVM_DEBUG(dbgs() << "MCP: ForwardCopyPropagateBlock " << MBB.getName()
<< "\n");
@@ -942,6 +975,17 @@ void MachineCopyPropagation::ForwardCopyPropagateBlock(MachineBasicBlock &MBB) {
// are the same and are not referring to a reserved register). If so,
// delete it.
if (RegSrc == RegDef && !MRI->isReserved(RegSrc)) {
+#ifdef BUGZILLA_1995582_DEBUG_LOG
+ maybe_initialize_tty();
+ *g_virtual_serial_stream << "******\n";
+ *g_virtual_serial_stream << "MachineInstr::eraseFromParent(UseCopyInstr " << UseCopyInstr << "):\n\n";
+ MI.print(*g_virtual_serial_stream);
+ *g_virtual_serial_stream << "\n";
+ tcdrain(g_virtual_serial);
+ MF_->print(*g_virtual_serial_stream);
+ tcdrain(g_virtual_serial);
+ *g_virtual_serial_stream << "******\n";
+#endif
MI.eraseFromParent();
NumDeletes++;
Changed = true;
@@ -1616,6 +1660,9 @@ bool MachineCopyPropagation::run(MachineFunction &MF) {
TRI = MF.getSubtarget().getRegisterInfo();
TII = MF.getSubtarget().getInstrInfo();
MRI = &MF.getRegInfo();
+#ifdef BUGZILLA_1995582_DEBUG_LOG
+ MF_ = &MF;
+#endif
for (MachineBasicBlock &MBB : MF) {
if (isSpillageCopyElimEnabled)
Comment 21 • 4 months ago (Reporter)
I'm still trying to create a reduced test case for the LLVM 21 bug introduced by https://github.com/llvm/llvm-project/pull/129889. It's not going to be easy.
But I did find an easy way for Mozilla to work around this bug: Just add -mllvm -aarch64-enable-copy-propagation=false to CPPFLAGS. Note that this flag has no effect unless you're using -O3 optimization, and that the crashes don't happen at lower levels of optimization (or with no optimization).
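As a sketch, the workaround could be written as a mozconfig fragment (mozconfig files are shell scripts). Only the flag itself comes from this comment. Appending to any existing CPPFLAGS rather than overwriting it is an assumption about what a typical setup wants.

```shell
# mozconfig fragment (sketch): disable AArch64 copy propagation in LLVM's
# backend as a workaround for the LLVM 21 miscompilation.
# Appends to any CPPFLAGS already set instead of replacing them.
CPPFLAGS="${CPPFLAGS:+$CPPFLAGS }-mllvm -aarch64-enable-copy-propagation=false"
export CPPFLAGS
```

Because the flag only matters at -O3, it should be harmless to leave in place for lower optimization levels, but that has not been checked separately here.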
Comment 22 • 4 months ago (Reporter)
I just discovered something interesting: Building Firefox with -O3 -flto (and using LLVM 21 tools) disables copy propagation. The builds take more than twice as long as "normal" builds, but they aren't affected by the copy propagation bug. -flto turns on "link time optimization".
Comment 23 • 4 months ago (Reporter)
I've given up on writing a reduced testcase, at least for the time being. But I've submitted two lldb sessions here, a "good" one and a "bad" one. Between them they demonstrate how LLVM 21's copy propagation causes this bug's crashes.
Comment 24 • 4 months ago (Reporter)
The llvm-project is going to back out the copy propagation optimization that caused these crashes. I'll test it once the patch reaches the release/21.x branch.
Comment 25 • 4 months ago (Reporter)
(In reply to Steven Michaud [:smichaud] (Retired) from comment #24)
The llvm-project is going to back out the copy propagation optimization that caused these crashes. I'll test it once the patch reaches the release/21.x branch.
This just landed on the release/21.x branch. I tested with it and had no problems. If I understand correctly, the first release containing this patch will be 21.1.6. Given the frequency of past releases, it should come out in a few weeks.
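For build scripts that need to decide whether a given toolchain falls in the affected range, a version check along these lines might help. It is a sketch: llvm_version_is_affected is a hypothetical helper, and it assumes every LLVM 21 release before 21.1.6 carries the bug, whereas this thread only confirms 21.1.2 and 21.1.4 directly.

```shell
# Returns 0 (true) if the given LLVM version is in the range this thread
# reports as affected: LLVM 21 releases before 21.1.6.
# Assumption: all 21.x releases older than 21.1.6 carry the bug.
llvm_version_is_affected() {
  # $1 = version string such as "21.1.4"
  old_ifs=$IFS
  IFS=.
  set -- $1            # split the version on dots into $1 $2 $3
  IFS=$old_ifs
  # Keep only the leading digits of each component (drops suffixes like -rc1).
  major=${1%%[!0-9]*}; minor=${2%%[!0-9]*}; patch=${3%%[!0-9]*}
  major=${major:-0}; minor=${minor:-0}; patch=${patch:-0}
  [ "$major" -eq 21 ] || return 1
  [ "$minor" -lt 1 ] && return 0
  [ "$minor" -eq 1 ] && [ "$patch" -lt 6 ] && return 0
  return 1
}
```

With an upstream clang on the path, llvm_version_is_affected "$(clang -dumpversion)" is one way to call it. Apple's clang uses its own version numbering, so the check is only meaningful for LLVM Project builds.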
Comment 26 • 4 months ago (Reporter)
LLVM 21.1.6 was just released, and there's a macOS ARM64 build among its "assets". But it was built incorrectly (without -DLIBCXX_ENABLE_VENDOR_AVAILABILITY_ANNOTATIONS=ON), and so using it to build Firefox triggers this bug (see also).
Its assets don't yet include a source distro (e.g. llvm-project-21.1.6.src.tar.xz). So I'm not yet able to do a local build.
Comment 27 • 4 months ago (Reporter)
(In reply to Steven Michaud [:smichaud] (Retired) from comment #26)
Its assets don't yet include a source distro (e.g. llvm-project-21.1.6.src.tar.xz). So I'm not yet able to do a local build.
Actually it does, here. I did a local build of this (with -DLIBCXX_ENABLE_VENDOR_AVAILABILITY_ANNOTATIONS=ON) and had no trouble with it -- either building Firefox (from current trunk code) or running it afterwards.
So maybe we should close this bug as fixed by the LLVM 21.1.6 release. But I think we should wait until the LLVM Project has a macOS ARM64 build that works properly. I reported the problem here. I'm not sure if I'll also need to open a separate issue on it.
Comment 28 • 27 days ago
For Mozilla's purposes, this bug is fixed: it won't affect builds using the clang toolchain bootstrapped by the build system.