Closed Bug 1744719 Opened 2 years ago Closed 1 year ago

Bug 1724777 introduced crashes in builds compiled with Mingw

Categories

(Core :: DOM: Core & HTML, defect)

Firefox 91
x86_64
Windows
defect

Tracking

()

RESOLVED WORKSFORME
Tracking Status
firefox-esr91 --- disabled
firefox95 --- disabled
firefox96 --- disabled
firefox97 --- disabled

People

(Reporter: pierov, Assigned: smaug)

References

(Blocks 1 open bug, Regression)

Details

(Keywords: csectype-uaf, regression, sec-other, Whiteboard: [doesn't affect Mozilla-built binaries])

Attachments

(2 files)

In some cases tabs crash when they use asmjs.
This happens only in win64 when compiled with mingw-clang and optimizations on (no crashes in win32, both with and without optimizations, and in win64 without optimizations).

The bug was first discovered in Tor Browser but it applies also to Firefox (tested with 6764bd4047c392e71018f8ee85394c1cf8564304, compiled with mingw).
The ticket reports also the URL to a test page, and to some other pages to use to test the issue, as well as my investigation on the crash.
I found it in CycleCollectedJSContext::PerformMicroTaskCheckPoint (runnable seems to be an invalid pointer).

Blocks: 1724777
Flags: needinfo?(bugs)

Oh, bug 1724777 is just an optimization, right? Which means we could think about backing this out to stop the crashing without any risks (and biting the performance hit)?

Blocks: mingw-clang

On the TBB build with the related patch reverted, I did not have the crash.
However, I did not test performances...

The performance issue happens in somewhat unusual cases.
But why would that patch cause crashes? Did it have a bug or does mingw have a bug? Or did the patch reveal some issue elsewhere?
Stack traces would be useful here, and a minimal testcase.

Flags: needinfo?(bugs)

Hi, sure.
I did not investigate the patch to understand what is the problem, yet.
However I found the crash in CycleCollectedJSContext::PerformMicroTaskCheckPoint, I tried to revert that patch as it was the latest time this task mechanism was modified, and all the problems disappeared.

As for the backtrace, is a textual one okay for you, or should I provide a dump in some specific way?
For testcases, please try to open this page: https://boring-easley-d4cbde.netlify.app/ (here you can find the source: https://github.com/mmso/tor-11-crash-repro/blob/main/src/Test.jsx#L30). Probably it can be turned into an automatic test.
Plus, some other real sites have the same problem (at least https://privatebin.info/, ProtonMail, and the donation page of OBS).

Please notice that from further tests, this patch causes some issues also to addons like uBlock Origin. But in this case there are not any (noticeable) crashes.

(In reply to Pier Angelo Vendrame from comment #4)

Please notice that from further tests, this patch causes some issues also to addons like uBlock Origin. But in this case there are not any (noticeable) crashes.

Blank popup panels and options pages can be a sign of a crashed extension process. There is currently no correctly functioning mechanism to restart a crashed extension process (bug 1355239).

Textual stack trace would be good. And a minimal testcase.

Here is the stack trace for https://boring-easley-d4cbde.netlify.app/:

```
xul!mozilla::CycleCollectedJSContext::PerformMicroTaskCheckPoint + 0x2b0   
xul!mozilla::CycleCollectedJSContext::LeaveMicroTask + 0x13   
xul!mozilla::nsAutoMicroTask::~nsAutoMicroTask + 0x1d   
xul!mozilla::EventListenerManager::HandleEventSubType + 0x1ee   
xul!mozilla::EventListenerManager::HandleEventInternal + 0x4e6   
xul!mozilla::EventListenerManager::HandleEvent + 0x8d   
xul!mozilla::EventTargetChainItem::HandleEvent + 0x29c   
xul!mozilla::EventTargetChainItem::HandleEventTargetChain + 0x485   
xul!mozilla::EventDispatcher::Dispatch + 0xfe5   
xul!mozilla::EventDispatcher::DispatchDOMEvent + 0xf7   
xul!mozilla::DOMEventTargetHelper::DispatchEvent + 0x45   
xul!mozilla::dom::EventTarget::DispatchEvent + 0x34   
xul!mozilla::dom::MessageEventRunnable::DispatchDOMEvent + 0x4b2   
xul!mozilla::dom::MessageEventRunnable::WorkerRun + 0x74   
xul!mozilla::dom::WorkerRunnable::Run + 0x2b9   
xul!nsThread::ProcessNextEvent + 0x569   
xul!NS_ProcessNextEvent + 0x4a   
xul!mozilla::dom::WorkerPrivate::DoRunLoop + 0x349   
xul!mozilla::dom::workerinternals::`anonymous namespace'::WorkerThreadPrimaryRunnable::Run + 0x4d4   
xul!nsThread::ProcessNextEvent + 0x569   
xul!NS_ProcessNextEvent + 0x4a   
xul!mozilla::ipc::MessagePumpForNonMainThreads::Run + 0x96   
xul!MessageLoop::AutoRunState::~AutoRunState + 0xf   
xul!MessageLoop::Run + 0x84   
xul!nsThread::ThreadFunc + 0x121   
nss3!_PR_NativeRunThread + 0x13a   
nss3!pr_root + 0xa   
ucrtbase!thread_start<unsigned int (__cdecl*)(void *),1> + 0x42   
KERNEL32!BaseThreadInitThunk + 0x14   
mozglue!mozilla::interceptor::FuncHook<mozilla::interceptor::WindowsDllInterceptor<mozilla::interceptor::VMSharingPolicyShared>,void (*)(int, void *, void *)>::operator() + 0xe   
mozglue!patched_BaseThreadInitThunk + 0x6e   
ntdll!RtlUserThreadStart + 0x21   
```

After Rob's comment I installed uBlock origin and launched Tor Browser on a debugger (as I already had the needed symbols), and I detected a crash (stack trace attached).

As regards the testcase, why is https://boring-easley-d4cbde.netlify.app/ not okay?
Is it because it has been created with react, or because it uses openpgp.js, or something else? Just to understand how I could improve it.

Tor Browser has patches on top of Firefox, but none in this part of code.
The patch for Bug 1724777  was the last one that modified `xpcom/base/CycleCollectedJSContext.cpp`. But I can build Firefox before any of our patches with and without the patch for Bug 1724777.

(ugh, sorry for the double attachment, I had an alert with error from S3 and I thought it was not added).
Attached file ublock_stack.txt
(double attachment - please see the previous one)

Does Tor have some patches on top of mozilla-esr91 ?
oh, nm, happens with FF too.

Hi again.
I noticed that runnable's raw pointer is always 0xe5e5e5e5e5e5e5e5 when I have the crash.

Also, the exact line where the crash is happening is if (runnable->Suppressed()) { (xpcom/base/CycleCollectedJSContext.cpp:632, in the revision of the file we are using).

I compiled with MOZ_COPY_PDBS=1, so I could get these information on the source code.

Please let me know if you need further information.

No longer blocks: 1724777
Keywords: regression
Regressed by: 1724777
Has Regression Range: --- → yes

Set release status flags based on info from the regressing bug 1724777

Summary: Bug 1724777 introduced crashes when compiling with Mingw → Bug 1724777 introduced crashes in builds compiled with Mingw
Whiteboard: [doesn't affect Mozilla-built binaries]

Olli, assign this to you for now as it looks you've been investigated this.

Assignee: nobody → bugs
Severity: -- → S2

Pier, do you think you could investigate this a bit. It is quite surprising that Mingw causes this kind of side effect.

Flags: needinfo?(pierov)

and is it documented somewhere, how to build FF the right way to reproduce this?

(In reply to Olli Pettay [:smaug] from comment #14)

and is it documented somewhere, how to build FF the right way to reproduce this?

I had though you repro-ed it with FF?

A MinGW build can be done locally on linux for windows - you have to download between 4 and 6 toolchains from taskcluster and use a mozconfig that looks like the below.

It's easier to send in a try run to get a mingw build though.

ac_add_options --enable-debug


ac_add_options --target=x86_64-w64-mingw32
ac_add_options --with-toolchain-prefix=x86_64-w64-mingw32-

ac_add_options --disable-warnings-as-errors
mk_add_options "export WIDL_TIME_OVERRIDE=0"

# This replicates Tor's configuration
ac_add_options --enable-proxy-bypass-protection

# These aren't supported on mingw at this time
ac_add_options --disable-webrtc # Bug 1393901
ac_add_options --disable-geckodriver # Bug 1489320
ac_add_options --disable-update-agent # Bug 1561797
ac_add_options --disable-default-browser-agent # WinToast does not build on mingw

# Find our toolchain
HOST_CC="/home/tom/Documents/moz/mozilla-unified-4/toolchain/clang/bin/clang"
HOST_CXX="/home/tom/Documents/moz/mozilla-unified-4/toolchain/clang/bin/clang++"
CC="/home/tom/Documents/moz/mozilla-unified-4/toolchain/clang/bin/x86_64-w64-mingw32-clang"
CXX="/home/tom/Documents/moz/mozilla-unified-4/toolchain/clang/bin/x86_64-w64-mingw32-clang++"
CXXFLAGS="-fms-extensions"
CFLAGS="$CFLAGS -fcrash-diagnostics-dir=${UPLOAD_PATH}"
CXXFLAGS="$CXXFLAGS -fcrash-diagnostics-dir=${UPLOAD_PATH}"

RUSTC="/home/tom/Documents/moz/mozilla-unified-4/toolchain/rustc/bin/rustc"

# For Stylo
BINDGEN_CFLAGS="-I/home/tom/Documents/moz/mozilla-unified-4/toolchain/clang/x86_64-w64-mingw32/include/c++/v1 -I/home/tom/Documents/moz/mozilla-unified-4/toolchain/clang/x86_64-w64-mingw32/include"

# We want to make sure we use binutils and other binaries in the tooltool
# package.
mk_add_options "export PATH=/home/tom/Documents/moz/mozilla-unified-4/toolchain/clang/bin:/home/tom/Documents/moz/mozilla-unified-4/toolchain/mingw32/bin:/home/tom/Documents/moz/mozilla-unified-4/toolchain/wine/bin:/home/tom/Documents/moz/mozilla-unified-4/toolchain/upx/bin:/home/tom/Documents/moz/mozilla-unified-4/toolchain/fxc2/bin:/home/tom/Documents/moz/mozilla-unified-4/toolchain/binutils/bin:$PATH"

LD_LIBRARY_PATH=${LD_LIBRARY_PATH:+$LD_LIBRARY_PATH:}/home/tom/Documents/moz/mozilla-unified-4/toolchain/mingw32/lib64:/home/tom/Documents/moz/mozilla-unified-4/toolchain/clang/lib
mk_add_options "export LD_LIBRARY_PATH=$LD_LIBRARY_PATH"

Olli, I did not build Firefox to test this, I used a build from treeherder.
I was about to ask Tom if he could answer you, but he was quicker (thanks! :)).

Flags: needinfo?(pierov)

(In reply to Tom Ritter [:tjr] (ni? for response to CVE/sec-approval/advisories/etc) from comment #15)

(In reply to Olli Pettay [:smaug] from comment #14)

and is it documented somewhere, how to build FF the right way to reproduce this?

I had though you repro-ed it with FF?

My comment about FF was related to the initial comment "applies also to Firefox"

(In reply to Olli Pettay [:smaug] from comment #17)

My comment about FF was related to the initial comment "applies also to Firefox"

Yes, I just tested again with last nightly build I could found, and the problem persists.

I followed these steps:

  1. I downloaded a Firefox build compiled with Mingw
  2. I opened the just downloaded nighlty build
  3. I went to https://boring-easley-d4cbde.netlify.app/
  4. I waited 1-2 seconds
  5. the crash happened

Please, let me know if you can reproduce the crash as well in this way.

Thank you.

Flags: needinfo?(bugs)

Hi Christoph,
Flagging you here to follow up if we want to set a different level for this MinGW issue. Thank you.

Flags: needinfo?(ckerschb)

Marking as sec-other as this doesn't affect Firefox as far as we can tell.

Flags: needinfo?(ckerschb)
Keywords: sec-highsec-other

Hi Olli, per comment 20, if this doesn't affect Firefox, should we also downgrade the severity, out of tracking?

(In reply to Hsin-Yi Tsai [:hsinyi] from comment #21)

Hi Olli, per comment 20, if this doesn't affect Firefox, should we also downgrade the severity, out of tracking?

Reviewed in the team meeting, moving to S3.

Severity: S2 → S3

I've just checked the proof of concept (https://boring-easley-d4cbde.netlify.app/) in a couple of recent mingw Bo builds from m-c treeherder, and the crash seems to be fixed there.
So, I think we can close this, not sure whether as fixed or worksforme.

Status: NEW → RESOLVED
Closed: 1 year ago
Resolution: --- → WORKSFORME
Flags: needinfo?(smaug)
Group: dom-core-security
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: