Closed Bug 1877744 Opened 1 year ago Closed 1 year ago

Upstream wpt tests showing frequent crashes in mozilla::dom::WebSocketImpl::AssertIsOnTargetThread

Categories

(Core :: Networking: WebSockets, defect, P1)

Tracking

()

RESOLVED FIXED
124 Branch
Tracking Status
firefox124 --- fixed

People

(Reporter: jgraham, Assigned: kershaw)

Details

(Whiteboard: [necko-triaged] [necko-priority-queue])

Attachments

(1 file)

In release builds, upstream wpt seems to be crashing a lot in websockets tests. This doesn't seem to happen when running the tests from mozilla-central, but does when running directly from a wpt checkout.

This seems to be why we have no recent stable build results in Interop 2023: eventually we get into a state where we're unable to load the main test page, so the job fails.

See https://community-tc.services.mozilla.com/tasks/FZV0RlUHR_CsaQBvRRqFpw/runs/0/logs/live/public/logs/live.log for an upstream log

Locally I reproduced this from a wpt checkout with a command like:

./wpt run --no-restart-on-unexpected --log-tbpl - --channel=stable --binary ../gecko/obj-x86_64-pc-linux-gnu/dist/bin/firefox firefox websockets/

where --binary points to a local build.

The stack looks like:

#0  mozilla::dom::WebSocketImpl::AssertIsOnTargetThread (this=0x7fffaf5a3240) at /home/jgraham/develop/gecko/dom/websocket/WebSocket.cpp:150
#1  mozilla::dom::WebSocketImpl::OnStart (this=0x7fffaf5a3240, aContext=<optimized out>) at /home/jgraham/develop/gecko/dom/websocket/WebSocket.cpp:768
#2  0x00007fffec176c4d in mozilla::net::WebSocketChannel::NotifyOnStart (this=0x7fffb0fd6a00) at /home/jgraham/develop/gecko/netwerk/protocol/websocket/WebSocketChannel.cpp:3023
#3  0x00007fffec192847 in mozilla::detail::RunnableMethodArguments<>::apply<mozilla::net::WebSocketChannel, void (mozilla::net::WebSocketChannel::*)()>(mozilla::net::WebSocketChannel*, void (mozilla::net::WebSocketChannel::*)())::{lambda((auto:1&&)...)#1}::operator()<>() const (this=<optimized out>)
    at /home/jgraham/develop/gecko/obj-x86_64-pc-linux-gnu/dist/include/nsThreadUtils.h:1164
#4  std::__invoke_impl<void, mozilla::detail::RunnableMethodArguments<>::apply<mozilla::net::WebSocketChannel, void (mozilla::net::WebSocketChannel::*)()>(mozilla::net::WebSocketChannel*, void (mozilla::net::WebSocketChannel::*)())::{lambda((auto:1&&)...)#1}>(std::__invoke_other, mozilla::detail::RunnableMethodArguments<>::apply<mozilla::net::WebSocketChannel, void (mozilla::net::WebSocketChannel::*)()>(mozilla::net::WebSocketChannel*, void (mozilla::net::WebSocketChannel::*)())::{lambda((auto:1&&)...)#1}&&) (__f=<optimized out>)
    at /home/jgraham/.mozbuild/sysroot-x86_64-linux-gnu/usr/lib/gcc/x86_64-linux-gnu/8/../../../../include/c++/8/bits/invoke.h:60
#5  std::__invoke<mozilla::detail::RunnableMethodArguments<>::apply<mozilla::net::WebSocketChannel, void (mozilla::net::WebSocketChannel::*)()>(mozilla::net::WebSocketChannel*, void (mozilla::net::WebSocketChannel::*)())::{lambda((auto:1&&)...)#1}>(mozilla::detail::RunnableMethodArguments<>::apply<mozilla::net::WebSocketChannel, void (mozilla::net::WebSocketChannel::*)()>(mozilla::net::WebSocketChannel*, void (mozilla::net::WebSocketChannel::*)())::{lambda((auto:1&&)...)#1}&&) (__fn=<optimized out>)
    at /home/jgraham/.mozbuild/sysroot-x86_64-linux-gnu/usr/lib/gcc/x86_64-linux-gnu/8/../../../../include/c++/8/bits/invoke.h:95
#6  std::__apply_impl<mozilla::detail::RunnableMethodArguments<>::apply<mozilla::net::WebSocketChannel, void (mozilla::net::WebSocketChannel::*)()>(mozilla::net::WebSocketChannel*, void (mozilla::net::WebSocketChannel::*)())::{lambda((auto:1&&)...)#1}, std::tuple<>&>(mozilla::detail::RunnableMethodArguments<>::apply<mozilla::net::WebSocketChannel, void (mozilla::net::WebSocketChannel::*)()>(mozilla::net::WebSocketChannel*, void (mozilla::net::WebSocketChannel::*)())::{lambda((auto:1&&)...)#1}&&, std::tuple<>&, std::integer_sequence<unsigned long>) (__t=<optimized out>, __f=<optimized out>)
    at /home/jgraham/.mozbuild/sysroot-x86_64-linux-gnu/usr/lib/gcc/x86_64-linux-gnu/8/../../../../include/c++/8/tuple:1678
#7  std::apply<mozilla::detail::RunnableMethodArguments<>::apply<mozilla::net::WebSocketChannel, void (mozilla::net::WebSocketChannel::*)()>(mozilla::net::WebSocketChannel*, void (mozilla::net::WebSocketChannel::*)())::{lambda((auto:1&&)...)#1}, std::tuple<>&>(mozilla::detail::RunnableMethodArguments<>::apply<mozilla::net::WebSocketChannel, void (mozilla::net::WebSocketChannel::*)()>(mozilla::net::WebSocketChannel*, void (mozilla::net::WebSocketChannel::*)())::{lambda((auto:1&&)...)#1}&&, std::tuple<>&) (__t=<optimized out>, __f=<optimized out>)
    at /home/jgraham/.mozbuild/sysroot-x86_64-linux-gnu/usr/lib/gcc/x86_64-linux-gnu/8/../../../../include/c++/8/tuple:1687
#8  mozilla::detail::RunnableMethodArguments<>::apply<mozilla::net::WebSocketChannel, void (mozilla::net::WebSocketChannel::*)()>(mozilla::net::WebSocketChannel*, void (mozilla::net::WebSocketChannel::*)()) (this=<optimized out>, o=<optimized out>, m=<optimized out>)
    at /home/jgraham/develop/gecko/obj-x86_64-pc-linux-gnu/dist/include/nsThreadUtils.h:1162
#9  mozilla::detail::RunnableMethodImpl<RefPtr<mozilla::net::WebSocketChannel> const, void (mozilla::net::WebSocketChannel::*)(), true, (mozilla::RunnableKind)0>::Run (this=<optimized out>) at /home/jgraham/develop/gecko/obj-x86_64-pc-linux-gnu/dist/include/nsThreadUtils.h:1213
#10 0x00007fffeb9a0e28 in mozilla::RunnableTask::Run (this=0x7fffaf09b780) at /home/jgraham/develop/gecko/xpcom/threads/TaskController.cpp:549
#11 0x00007fffeb99a48c in mozilla::TaskController::DoExecuteNextTaskOnlyMainThreadInternal (this=this@entry=0x7fffe5765900, aProofOfLock=...) at /home/jgraham/develop/gecko/xpcom/threads/TaskController.cpp:876
#12 0x00007fffeb999158 in mozilla::TaskController::ExecuteNextTaskOnlyMainThreadInternal (this=this@entry=0x7fffe5765900, aProofOfLock=...) at /home/jgraham/develop/gecko/xpcom/threads/TaskController.cpp:699
#13 0x00007fffeb9995d6 in mozilla::TaskController::ProcessPendingMTTask (this=0x7fffe5765900, aMayWait=true) at /home/jgraham/develop/gecko/xpcom/threads/TaskController.cpp:485
#14 0x00007fffeb9a49aa in mozilla::TaskController::TaskController()::$_1::operator()() const (this=<optimized out>) at /home/jgraham/develop/gecko/xpcom/threads/TaskController.cpp:214
#15 mozilla::detail::RunnableFunction<mozilla::TaskController::TaskController()::$_1>::Run() (this=<optimized out>) at /home/jgraham/develop/gecko/xpcom/threads/nsThreadUtils.h:548
#16 0x00007fffeb9b8de8 in nsThread::ProcessNextEvent (this=0x7ffff7791c80, aMayWait=true, aResult=0x7fffffffc55f) at /home/jgraham/develop/gecko/xpcom/threads/nsThread.cpp:1199
#17 0x00007fffeb9bf99e in NS_ProcessNextEvent (aThread=0x7ffff7a008e0 <_IO_stdfile_2_lock>, aThread@entry=0x7ffff7791c80, aMayWait=true) at /home/jgraham/develop/gecko/xpcom/threads/nsThreadUtils.cpp:480
#18 0x00007fffec3a417e in mozilla::ipc::MessagePump::Run (this=0x7fffe57261c0, aDelegate=0x7ffff773aa60) at /home/jgraham/develop/gecko/ipc/glue/MessagePump.cpp:107
#19 0x00007fffec322052 in MessageLoop::RunHandler (this=0x7ffff7a008e0 <_IO_stdfile_2_lock>) at /home/jgraham/develop/gecko/ipc/chromium/src/base/message_loop.cc:363
#20 MessageLoop::Run (this=0x7ffff7a008e0 <_IO_stdfile_2_lock>) at /home/jgraham/develop/gecko/ipc/chromium/src/base/message_loop.cc:345
#21 0x00007fffeff46ab9 in nsBaseAppShell::Run (this=0x7fffe5769b00) at /home/jgraham/develop/gecko/widget/nsBaseAppShell.cpp:148
#22 0x00007fffefff8c49 in nsAppShell::Run (this=0x7fffe5769b00) at /home/jgraham/develop/gecko/widget/gtk/nsAppShell.cpp:470
#23 0x00007ffff1763db5 in nsAppStartup::Run (this=0x7fffe5675ce0) at /home/jgraham/develop/gecko/toolkit/components/startup/nsAppStartup.cpp:296
#24 0x00007ffff18caac6 in XREMain::XRE_mainRun (this=this@entry=0x7fffffffc8b0) at /home/jgraham/develop/gecko/toolkit/xre/nsAppRunner.cpp:5709
#25 0x00007ffff18cc232 in XREMain::XRE_main (this=this@entry=0x7fffffffc8b0, argc=argc@entry=5, argv=argv@entry=0x7fffffffdbb8, aConfig=...) at /home/jgraham/develop/gecko/toolkit/xre/nsAppRunner.cpp:5918
#26 0x00007ffff18cce53 in XRE_main (argc=5, argv=0x7fffffffdbb8, aConfig=...) at /home/jgraham/develop/gecko/toolkit/xre/nsAppRunner.cpp:5974
#27 0x000055555558d60a in do_main (argc=5, argv=0x7fffffffdbb8, envp=0x7fffffffdbe8) at /home/jgraham/develop/gecko/browser/app/nsBrowserApp.cpp:227
#28 main (argc=5, argv=0x7fffffffdbb8, envp=0x7fffffffdbe8) at /home/jgraham/develop/gecko/browser/app/nsBrowserApp.cpp:445
Flags: needinfo?(valentin.gosu)

I can reproduce on macOS. It seems to be only worker tests that crash, for example /websockets/Close-1000-reason.any.worker.html?default

TEST-START | /websockets/Close-1000-reason.any.worker.html?default
IOError on command, setting status to CRASH
mozcrash No local symbols_path provided, only http symbols will be used.
mozcrash Copy/paste: /Users/simonpieters/.mozbuild/minidump-stackwalk/minidump-stackwalk --human --brief /var/folders/z8/gjz5ggw947jc4smp_9vxxlzc0000gn/T/tmpgxp8qiaf/minidumps/66E0A21D-2492-4665-B88C-0C028403ED01.dmp
PROCESS-CRASH | MOZ_RELEASE_ASSERT(NS_IsMainThread() == mIsMainThread) [@ XUL + 0x4b1a57c] | /websockets/Close-1000-reason.any.worker.html?default 
Process pid: 34187
Mozilla crash reason: MOZ_RELEASE_ASSERT(NS_IsMainThread() == mIsMainThread)
Crash dump filename: /var/folders/z8/gjz5ggw947jc4smp_9vxxlzc0000gn/T/tmpgxp8qiaf/minidumps/66E0A21D-2492-4665-B88C-0C028403ED01.dmp
Operating system: Mac OS X
                  14.1.1 23B81
CPU: arm64
     10 CPUs

Crash reason:  EXC_BAD_ACCESS / KERN_INVALID_ADDRESS
Crash address: 0x0
Mac Crash Info:

Process uptime: 2 seconds

Thread 0 MainThread (crashed)
 0  XUL + 0x4b1a57c
      x0 = 0x0000000100ac16fc     x1 = 0x000000016f425de4
      x2 = 0x000000016f425dd8     x3 = 0x000000016f425dc8
      x4 = 0x0000000125e0c220     x5 = 0x0000000118cdf568
      x6 = 0x0000000118cdf3e8     x7 = 0xfffa800000000000
      x8 = 0x0000000000000000     x9 = 0x00000000000009c8
     x10 = 0x000000011870b6cc    x11 = 0x0000000000000065
     x12 = 0x0000000000000000    x13 = 0x000000016f426080
     x14 = 0x000000016f426060    x15 = 0x00001a5c5e444b88
     x16 = 0x00000000000000fc    x17 = 0x0000000100ac1600
     x18 = 0x0000000000000000    x19 = 0x0000000127321a00
     x20 = 0x000000016f425f08    x21 = 0x000000016f425dc8
     x22 = 0x000000016f425df0    x23 = 0xfff9800000000000
     x24 = 0x000000016f425ef0    x25 = 0x0000000000000001
     x26 = 0x0000000110c34218    x27 = 0x0000000118cd45f8
     x28 = 0x0000000110c34218     fp = 0x000000016f425db0
      lr = 0x0000000115f5e3d0     sp = 0x000000016f425d60
      pc = 0x0000000115f5e57c
    Found by: given as instruction pointer in context
 1  XUL + 0x3d8e004
      sp = 0x000000016f425dc0     pc = 0x00000001151d2008
    Found by: previous frame's frame pointer
 2  XUL + 0xddf180
      sp = 0x000000016f425ed0     pc = 0x0000000112223184
    Found by: previous frame's frame pointer
 3  XUL + 0x1a37eac
      sp = 0x000000016f425f80     pc = 0x0000000112e7beb0
    Found by: previous frame's frame pointer
 4  XUL + 0x1a47e5c
      sp = 0x000000016f4261d0     pc = 0x0000000112e8be60
    Found by: previous frame's frame pointer
 5  XUL + 0x1a3b79c
      sp = 0x000000016f426680     pc = 0x0000000112e7f7a0
    Found by: previous frame's frame pointer
 6  XUL + 0x1af4c50
      sp = 0x000000016f4268e0     pc = 0x0000000112f38c54
    Found by: previous frame's frame pointer
 7  XUL + 0x1a37eac
      sp = 0x000000016f426a30     pc = 0x0000000112e7beb0
    Found by: previous frame's frame pointer
 8  XUL + 0x1a47e5c
      sp = 0x000000016f426c80     pc = 0x0000000112e8be60
    Found by: previous frame's frame pointer
 9  XUL + 0x1a3b79c
      sp = 0x000000016f427130     pc = 0x0000000112e7f7a0
    Found by: previous frame's frame pointer
10  XUL + 0x1af4c50
      sp = 0x000000016f427390     pc = 0x0000000112f38c54
    Found by: previous frame's frame pointer
11  XUL + 0x1a37eac
      sp = 0x000000016f4274e0     pc = 0x0000000112e7beb0
    Found by: previous frame's frame pointer
12  XUL + 0x1a47e5c
      sp = 0x000000016f427730     pc = 0x0000000112e8be60
    Found by: previous frame's frame pointer
13  XUL + 0x1a3b79c
      sp = 0x000000016f427be0     pc = 0x0000000112e7f7a0
    Found by: previous frame's frame pointer
14  XUL + 0xd8421c
      sp = 0x000000016f427e40     pc = 0x00000001121c8220
    Found by: previous frame's frame pointer
15  XUL + 0xf0c610
      sp = 0x000000016f428000     pc = 0x0000000112350614
    Found by: previous frame's frame pointer
16  XUL + 0xf0b3f4
      sp = 0x000000016f4283b0     pc = 0x000000011234f3f8
    Found by: previous frame's frame pointer
17  XUL + 0xf10d14
      sp = 0x000000016f4284d0     pc = 0x0000000112354d18
    Found by: previous frame's frame pointer
18  XUL + 0xf00458
      sp = 0x000000016f428ab0     pc = 0x000000011234445c
    Found by: previous frame's frame pointer
19  XUL + 0xf222d8
      sp = 0x000000016f428af0     pc = 0x00000001123662dc
    Found by: previous frame's frame pointer
20  XUL + 0x4b174f8
      sp = 0x000000016f428b20     pc = 0x0000000115f5b4fc
    Found by: previous frame's frame pointer
21  XUL + 0x4b17344
      sp = 0x000000016f428b70     pc = 0x0000000115f5b348
    Found by: previous frame's frame pointer
22  XUL + 0x33ce0a0
      sp = 0x000000016f428be0     pc = 0x00000001148120a4
    Found by: previous frame's frame pointer
23  XUL + 0x33de1b0
      sp = 0x000000016f428c10     pc = 0x00000001148221b4
    Found by: previous frame's frame pointer
24  XUL + 0x3c2f20
      sp = 0x000000016f428c20     pc = 0x0000000111806f24
    Found by: previous frame's frame pointer
25  XUL + 0x3bd29c
      sp = 0x000000016f428c40     pc = 0x00000001118012a0
    Found by: previous frame's frame pointer
26  XUL + 0x3c4fbc
      sp = 0x000000016f428d60     pc = 0x0000000111808fc0
    Found by: previous frame's frame pointer
27  XUL + 0x3cd0b0
      sp = 0x000000016f428d90     pc = 0x00000001118110b4
    Found by: previous frame's frame pointer
28  XUL + 0x3cb044
      sp = 0x000000016f429020     pc = 0x000000011180f048
    Found by: previous frame's frame pointer
29  XUL + 0x137a0b0
      sp = 0x000000016f429070     pc = 0x00000001127be0b4
    Found by: previous frame's frame pointer
30  XUL + 0x1399bcc
      sp = 0x000000016f4290c0     pc = 0x00000001127ddbd0
    Found by: previous frame's frame pointer
31  CoreFoundation + 0x7dcf8
      sp = 0x000000016f429130     pc = 0x0000000185c55cfc
    Found by: previous frame's frame pointer
32  CoreFoundation + 0x7dc8c
      sp = 0x000000016f429140     pc = 0x0000000185c55c90
    Found by: previous frame's frame pointer
33  CoreFoundation + 0x7d9fc
      sp = 0x000000016f429170     pc = 0x0000000185c55a00
    Found by: previous frame's frame pointer
34  CoreFoundation + 0x7c5ec
      sp = 0x000000016f4291e0     pc = 0x0000000185c545f0
    Found by: previous frame's frame pointer
35  CoreFoundation + 0x7bc58
      sp = 0x000000016f429f40     pc = 0x0000000185c53c5c
    Found by: previous frame's frame pointer
36  HIToolbox + 0x30444
      sp = 0x000000016f429fe0     pc = 0x00000001901d0448
    Found by: previous frame's frame pointer
37  HIToolbox + 0x30280
      sp = 0x000000016f42a030     pc = 0x00000001901d0284
    Found by: previous frame's frame pointer
38  HIToolbox + 0x2ffd8
      sp = 0x000000016f42a0b0     pc = 0x00000001901cffdc
    Found by: previous frame's frame pointer
39  AppKit + 0x39c50
      sp = 0x000000016f42a0c0     pc = 0x000000018942ec54
    Found by: previous frame's frame pointer
40  AppKit + 0x80feb8
      sp = 0x000000016f42a480     pc = 0x0000000189c04ebc
    Found by: previous frame's frame pointer
41  XUL + 0x1399364
      sp = 0x000000016f42a560     pc = 0x00000001127dd368
    Found by: previous frame's frame pointer
42  AppKit + 0x2d0fc
      sp = 0x000000016f42a5c0     pc = 0x0000000189422100
    Found by: previous frame's frame pointer
43  XUL + 0x4c12a40
      sp = 0x000000016f42a630     pc = 0x0000000116056a44
    Found by: previous frame's frame pointer
44  XUL + 0x139a1d4
      sp = 0x000000016f42a660     pc = 0x00000001127de1d8
    Found by: previous frame's frame pointer
45  XUL + 0x54356a8
      sp = 0x000000016f42a6c0     pc = 0x00000001168796ac
    Found by: previous frame's frame pointer
46  XUL + 0x19faaf0
      sp = 0x000000016f42a6f0     pc = 0x0000000112e3eaf4
    Found by: previous frame's frame pointer
47  XUL + 0x54c5c38
      sp = 0x000000016f42a860     pc = 0x0000000116909c3c
    Found by: previous frame's frame pointer
48  XUL + 0x54c6034
      sp = 0x000000016f42a8f0     pc = 0x000000011690a038
    Found by: previous frame's frame pointer
49  firefox + 0x6298
      sp = 0x000000016f42a9c0     pc = 0x00000001009da29c
    Found by: previous frame's frame pointer
50  dyld + 0x60dc
      sp = 0x000000016f42ae30     pc = 0x00000001857fd0e0
    Found by: previous frame's frame pointer
51  libnss3.dylib + 0x27ffc
      sp = 0x000000016f42af38     pc = 0x0000000100b68000
    Found by: stack scanning
52  libnss3.dylib + 0x2bffc
      sp = 0x000000016f42af48     pc = 0x0000000100b6c000
    Found by: stack scanning

TEST-UNEXPECTED-CRASH | /websockets/Close-1000-reason.any.worker.html?default | expected OK
TEST-INFO took 187ms

We added the release assertions in https://phabricator.services.mozilla.com/D186618 and also did a refactoring in bug 1868177.
Kershaw, could you take a look?

Flags: needinfo?(valentin.gosu)
Flags: needinfo?(rjesup)
Flags: needinfo?(kershaw)

This can be reproduced by running ./mach mochitest --disable-e10s dom/websocket/tests/test_websocket_sharedWorker.html.
I am surprised that WebSocketChannel::NotifyOnStart is supposed to be called on the main thread, but WebSocketImpl::OnStart is not.
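
For illustration only, the mismatch described above can be sketched outside Gecko with plain standard-library threads. This is not the landed patch: TargetThread, Listener, NotifyOnStart and RunDemo are hypothetical names standing in for the worker's event target, WebSocketImpl, WebSocketChannel::NotifyOnStart and a test driver. The sketch assumes the general shape of the fix is "dispatch to the target thread when the caller is not already on it".

```cpp
#include <cassert>
#include <condition_variable>
#include <cstdlib>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>

// Hypothetical stand-in for an event target: one thread draining a task queue.
class TargetThread {
 public:
  TargetThread() : mThread([this] { Loop(); }), mId(mThread.get_id()) {}
  ~TargetThread() { Shutdown(); }

  bool IsOnCurrentThread() const { return std::this_thread::get_id() == mId; }

  void Dispatch(std::function<void()> aTask) {
    {
      std::lock_guard<std::mutex> lock(mMutex);
      mTasks.push(std::move(aTask));
    }
    mCond.notify_one();
  }

  // Runs every task that is still queued, then joins the thread.
  void Shutdown() {
    if (!mThread.joinable()) return;
    {
      std::lock_guard<std::mutex> lock(mMutex);
      mDone = true;
    }
    mCond.notify_one();
    mThread.join();
  }

 private:
  void Loop() {
    std::unique_lock<std::mutex> lock(mMutex);
    for (;;) {
      mCond.wait(lock, [this] { return mDone || !mTasks.empty(); });
      if (mTasks.empty()) return;  // mDone is set and nothing is left to run
      std::function<void()> task = std::move(mTasks.front());
      mTasks.pop();
      lock.unlock();
      task();
      lock.lock();
    }
  }

  std::mutex mMutex;
  std::condition_variable mCond;
  std::queue<std::function<void()>> mTasks;
  bool mDone = false;
  std::thread mThread;  // declared after the queue state that Loop() uses
  std::thread::id mId;  // fixed at construction, read-only afterwards
};

// Hypothetical listener whose callback must run on its target thread.
class Listener {
 public:
  explicit Listener(const TargetThread& aTarget) : mTarget(aTarget) {}

  void OnStart() {
    // Release-style check: aborts even in optimized builds.
    if (!mTarget.IsOnCurrentThread()) std::abort();
    assert(!mStarted);  // debug-only: OnStart is expected to run once
    mStarted = true;
  }

  bool Started() const { return mStarted; }

 private:
  const TargetThread& mTarget;
  bool mStarted = false;
};

// Calls OnStart directly when already on the target thread, otherwise
// dispatches it there instead of running it on the caller's thread.
void NotifyOnStart(Listener& aListener, TargetThread& aTarget) {
  if (aTarget.IsOnCurrentThread()) {
    aListener.OnStart();
    return;
  }
  aTarget.Dispatch([&aListener] { aListener.OnStart(); });
}

bool RunDemo() {
  TargetThread target;
  Listener listener(target);
  NotifyOnStart(listener, target);  // the caller is not the target thread
  target.Shutdown();                // drains the queue, so OnStart has run
  return listener.Started();
}
```

Calling RunDemo() from main() returns true. Calling listener.OnStart() directly from the calling thread would hit the abort instead, which is the same kind of failure as the release assertion in the logs above.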

Severity: -- → S3
Flags: needinfo?(kershaw)
Priority: -- → P1
Whiteboard: [necko-triaged] [necko-priority-queue]
Assignee: nobody → kershaw
Status: NEW → ASSIGNED

Assuming this lands, is it something we can uplift?

Pushed by valentin.gosu@gmail.com: https://hg.mozilla.org/integration/autoland/rev/6ab0ac80ff45 Make sure WebSocketImpl::OnStart is called on target thread, r=necko-reviewers,valentin

Backed out for causing build bustages in WebSocket.cpp.

  • Backout link
  • Push with failures
  • Failure Log
  • Failure line: /builds/worker/checkouts/gecko/dom/websocket/WebSocket.cpp:773:46: error: ignoring return value of function declared with 'nodiscard' attribute [-Werror,-Wunused-result]
Flags: needinfo?(kershaw)
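
For readers unfamiliar with the failure line above, here is a minimal sketch of how a nodiscard return value turns into a build error under -Werror, and two common ways to handle it. Status, DispatchTask, StartListener and FireAndForget are hypothetical names. This does not reproduce the code at WebSocket.cpp:773 or the change made in the relanded patch.

```cpp
enum class Status { Ok, Failed };

// Hypothetical dispatch function. [[nodiscard]] asks the compiler to warn
// whenever a caller drops the returned Status.
[[nodiscard]] Status DispatchTask(bool aShouldSucceed) {
  return aShouldSucceed ? Status::Ok : Status::Failed;
}

bool StartListener(bool aShouldSucceed) {
  // Writing just `DispatchTask(aShouldSucceed);` here would trigger
  // -Wunused-result, which becomes a hard error under -Werror.

  // Option 1: check the result and propagate the failure.
  Status rv = DispatchTask(aShouldSucceed);
  if (rv != Status::Ok) {
    return false;
  }
  return true;
}

void FireAndForget() {
  // Option 2: discard explicitly when a failure really can be ignored.
  static_cast<void>(DispatchTask(true));
}
```

Checking the result is usually the safer choice for a dispatch call, since a failed dispatch means the callback never runs.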
Pushed by kjang@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/6d12fe831e8f Make sure WebSocketImpl::OnStart is called on target thread, r=necko-reviewers,valentin
Status: ASSIGNED → RESOLVED
Closed: 1 year ago
Resolution: --- → FIXED
Target Milestone: --- → 124 Branch
Flags: needinfo?(rjesup)