Closed Bug 1856003 Opened 2 years ago Closed 2 years ago

Intermittent linux wayland [tier 2] Late Beta application crashed [@ mozilla::Task::GetPriority] | automation.py when Gecko 120 switches to Late Beta on 2023-11-03

Categories

(Core :: XPCOM, defect)

defect

Tracking

()

RESOLVED WORKSFORME
Tracking Status
firefox-esr115 --- unaffected
firefox118 --- unaffected
firefox119 --- unaffected
firefox120 + fixed

People

(Reporter: NarcisB, Unassigned)

References

Details

(Keywords: csectype-uaf, sec-other)

[Tracking Requested - why for this release]:

Central-as-beta-simulation

How to run these simulations

Failure log
Log snippet:

[task 2023-09-29T12:41:28.254Z] 12:41:28     INFO - PROCESS-CRASH | application crashed [@ mozilla::Task::GetPriority] | automation.py 
[task 2023-09-29T12:41:28.254Z] 12:41:28     INFO - Process type: unknown
[task 2023-09-29T12:41:28.254Z] 12:41:28     INFO - Process pid: None
[task 2023-09-29T12:41:28.254Z] 12:41:28     INFO - Crash dump filename: /tmp/tmp3j6r2gue.mozrunner/minidumps/07cd44b3-60aa-2aba-791a-780a366d1145.dmp
[task 2023-09-29T12:41:28.254Z] 12:41:28     INFO - Operating system: Linux
[task 2023-09-29T12:41:28.254Z] 12:41:28     INFO -                   5.15.0-70-generic #77-Ubuntu SMP Tue Mar 21 14:02:37 UTC 2023
[task 2023-09-29T12:41:28.254Z] 12:41:28     INFO - CPU: amd64
[task 2023-09-29T12:41:28.256Z] 12:41:28     INFO -      family 6 model 85 stepping 7
[task 2023-09-29T12:41:28.256Z] 12:41:28     INFO -      2 CPUs
[task 2023-09-29T12:41:28.256Z] 12:41:28     INFO - Linux Ubuntu 22.04 - jammy (Ubuntu 22.04.2 LTS)
[task 2023-09-29T12:41:28.256Z] 12:41:28     INFO - 
[task 2023-09-29T12:41:28.256Z] 12:41:28     INFO - Crash reason:  SIGSEGV / SI_KERNEL
[task 2023-09-29T12:41:28.256Z] 12:41:28     INFO - Crash address: 0xe5e5e5e5e5e5e649
[task 2023-09-29T12:41:28.256Z] 12:41:28     INFO - Crashing instruction: `mov edx, dword [rcx + 0x64]`
[task 2023-09-29T12:41:28.256Z] 12:41:28     INFO - Memory accessed by instruction:
[task 2023-09-29T12:41:28.256Z] 12:41:28     INFO -   0. Address: 0xe5e5e5e5e5e5e649
[task 2023-09-29T12:41:28.256Z] 12:41:28     INFO -      Size: 4
[task 2023-09-29T12:41:28.256Z] 12:41:28     INFO - Process uptime: not available
[task 2023-09-29T12:41:28.256Z] 12:41:28     INFO - 
[task 2023-09-29T12:41:28.256Z] 12:41:28     INFO - Thread 12 Timer (crashed)
[task 2023-09-29T12:41:28.256Z] 12:41:28     INFO -  0  libxul.so!mozilla::Task::GetPriority() [TaskController.h:5642d4fa28cff48dedc96523cedec93fc88ca6e4 : 141]
[task 2023-09-29T12:41:28.256Z] 12:41:28     INFO -     Found by: inlining
[task 2023-09-29T12:41:28.256Z] 12:41:28     INFO -  1  libxul.so!mozilla::Task::PriorityCompare::operator()(RefPtr<mozilla::Task> const&, RefPtr<mozilla::Task> const&) const [TaskController.h:5642d4fa28cff48dedc96523cedec93fc88ca6e4 : 176]
[task 2023-09-29T12:41:28.256Z] 12:41:28     INFO -     Found by: inlining
[task 2023-09-29T12:41:28.256Z] 12:41:28     INFO -  2  libxul.so!std::_Rb_tree<RefPtr<mozilla::Task>, RefPtr<mozilla::Task>, std::_Identity<RefPtr<mozilla::Task> >, mozilla::Task::PriorityCompare, std::allocator<RefPtr<mozilla::Task> > >::_M_get_insert_unique_pos(RefPtr<mozilla::Task> const&) [stl_tree.h : 2061]
[task 2023-09-29T12:41:28.256Z] 12:41:28     INFO -     Found by: inlining
[task 2023-09-29T12:41:28.256Z] 12:41:28     INFO -  3  libxul.so!std::_Rb_tree<RefPtr<mozilla::Task>, RefPtr<mozilla::Task>, std::_Identity<RefPtr<mozilla::Task> >, mozilla::Task::PriorityCompare, std::allocator<RefPtr<mozilla::Task> > >::_M_insert_unique<RefPtr<mozilla::Task> >(RefPtr<mozilla::Task>&&) [stl_tree.h : 2114]
[task 2023-09-29T12:41:28.256Z] 12:41:28     INFO -     Found by: inlining
[task 2023-09-29T12:41:28.256Z] 12:41:28     INFO -  4  libxul.so!std::set<RefPtr<mozilla::Task>, mozilla::Task::PriorityCompare, std::allocator<RefPtr<mozilla::Task> > >::insert(RefPtr<mozilla::Task>&&) [stl_set.h : 521]
[task 2023-09-29T12:41:28.256Z] 12:41:28     INFO -     Found by: inlining
[task 2023-09-29T12:41:28.256Z] 12:41:28     INFO -  5  libxul.so!mozilla::TaskController::AddTask(already_AddRefed<mozilla::Task>&&) [TaskController.cpp:5642d4fa28cff48dedc96523cedec93fc88ca6e4 : 459]
[task 2023-09-29T12:41:28.256Z] 12:41:28     INFO -     Found by: inlining
[task 2023-09-29T12:41:28.256Z] 12:41:28     INFO -  6  libxul.so!mozilla::TaskController::DispatchRunnable(already_AddRefed<nsIRunnable>&&, unsigned int, mozilla::TaskManager*) [TaskController.cpp:5642d4fa28cff48dedc96523cedec93fc88ca6e4 : 598]
[task 2023-09-29T12:41:28.256Z] 12:41:28     INFO -     Found by: inlining
[task 2023-09-29T12:41:28.256Z] 12:41:28     INFO -  7  libxul.so!mozilla::detail::EventQueueInternal<(unsigned long)16>::PutEvent(already_AddRefed<nsIRunnable>&&, mozilla::EventQueuePriority, mozilla::detail::BaseAutoLock<mozilla::Mutex&> const&, mozilla::BaseTimeDuration<mozilla::TimeDurationValueCalculator>*) [EventQueue.cpp:5642d4fa28cff48dedc96523cedec93fc88ca6e4 : 55 + 0x11a]
[task 2023-09-29T12:41:28.257Z] 12:41:28     INFO -      rax = 0x0000000000000004    rdx = 0x0000000000000001
[task 2023-09-29T12:41:28.257Z] 12:41:28     INFO -      rcx = 0xe5e5e5e5e5e5e5e5    rbx = 0x00007fafc266de80
[task 2023-09-29T12:41:28.257Z] 12:41:28     INFO -      rsi = 0x00007fafc2c43910    rdi = 0x00007fafd6ab4c00
[task 2023-09-29T12:41:28.257Z] 12:41:28     INFO -      rbp = 0x0000000000000004    rsp = 0x00007fafc4478300
[task 2023-09-29T12:41:28.257Z] 12:41:28     INFO -       r8 = 0x00000000f2fffeff     r9 = 0x6d0167389265efcd
[task 2023-09-29T12:41:28.257Z] 12:41:28     INFO -      r10 = 0x0000000000000000    r11 = 0x0000000000000246
[task 2023-09-29T12:41:28.257Z] 12:41:28     INFO -      r12 = 0x00007fafc2c43910    r13 = 0x0000000000000000
[task 2023-09-29T12:41:28.257Z] 12:41:28     INFO -      r14 = 0x00007fafd6ab4c00    r15 = 0x00007fafd6ab4d60
[task 2023-09-29T12:41:28.257Z] 12:41:28     INFO -      rip = 0x00007fafce7ff2c4
[task 2023-09-29T12:41:28.257Z] 12:41:28     INFO -     Found by: given as instruction pointer in context
[task 2023-09-29T12:41:28.257Z] 12:41:28     INFO -     Found by: inlining
[task 2023-09-29T12:41:28.257Z] 12:41:28     INFO -  9  libxul.so!mozilla::ThreadEventQueue::PutEvent(already_AddRefed<nsIRunnable>&&, mozilla::EventQueuePriority) [ThreadEventQueue.cpp:5642d4fa28cff48dedc96523cedec93fc88ca6e4 : 73]
[task 2023-09-29T12:41:28.257Z] 12:41:28     INFO -     Found by: inlining
[task 2023-09-29T12:41:28.257Z] 12:41:28     INFO - 10  libxul.so!mozilla::ThreadEventTarget::Dispatch(already_AddRefed<nsIRunnable>, unsigned int) [ThreadEventTarget.cpp:5642d4fa28cff48dedc96523cedec93fc88ca6e4 : 92 + 0xb8]
[task 2023-09-29T12:41:28.257Z] 12:41:28     INFO -      rbx = 0x00007fafc26fc020    rbp = 0x0000000000000004
[task 2023-09-29T12:41:28.257Z] 12:41:28     INFO -      rsp = 0x00007fafc4478360    r12 = 0x00007fafc26fc000
[task 2023-09-29T12:41:28.257Z] 12:41:28     INFO -      r13 = 0x00007fafd6af3a00    r14 = 0x00007fafd6ae8f50
[task 2023-09-29T12:41:28.257Z] 12:41:28     INFO -      r15 = 0x00007fafd6ae8f80    rip = 0x00007fafcec5e430
[task 2023-09-29T12:41:28.258Z] 12:41:28     INFO -     Found by: call frame info
[task 2023-09-29T12:41:28.258Z] 12:41:28     INFO - 11  libxul.so!nsIEventTarget::Dispatch(nsIRunnable*, unsigned int) [nsIEventTarget.h: : 38]
[task 2023-09-29T12:41:28.258Z] 12:41:28     INFO -     Found by: inlining
[task 2023-09-29T12:41:28.258Z] 12:41:28     INFO - 12  libxul.so!TimerThread::PostTimerEvent(already_AddRefed<nsTimerImpl>) [TimerThread.cpp:5642d4fa28cff48dedc96523cedec93fc88ca6e4 : 1249]
[task 2023-09-29T12:41:28.258Z] 12:41:28     INFO -     Found by: inlining
[task 2023-09-29T12:41:28.258Z] 12:41:28     INFO - 13  libxul.so!TimerThread::Run() [TimerThread.cpp:5642d4fa28cff48dedc96523cedec93fc88ca6e4 : 778 + 0xfe]
[task 2023-09-29T12:41:28.258Z] 12:41:28     INFO -      rbx = 0x00007fafc26fc028    rbp = 0x00007fafcae1e058
[task 2023-09-29T12:41:28.258Z] 12:41:28     INFO -      rsp = 0x00007fafc44783b0    r12 = 0x00007fafc26fc020
[task 2023-09-29T12:41:28.258Z] 12:41:28     INFO -      r13 = 0x00007fafd6af3a00    r14 = 0x00007fafcae1e030
[task 2023-09-29T12:41:28.258Z] 12:41:28     INFO -      r15 = 0x00007fafc3607790    rip = 0x00007fafcf7e4386
[task 2023-09-29T12:41:28.258Z] 12:41:28     INFO -     Found by: call frame info
[task 2023-09-29T12:41:28.258Z] 12:41:28     INFO - 14  libxul.so!nsThread::ProcessNextEvent(bool, bool*) [nsThread.cpp:5642d4fa28cff48dedc96523cedec93fc88ca6e4 : 1192]
[task 2023-09-29T12:41:28.258Z] 12:41:28     INFO -     Found by: inlining
[task 2023-09-29T12:41:28.258Z] 12:41:28     INFO - 15  libxul.so!NS_ProcessNextEvent(nsIThread*, bool) [nsThreadUtils.cpp:5642d4fa28cff48dedc96523cedec93fc88ca6e4 : 480 + 0x3ec]
[task 2023-09-29T12:41:28.258Z] 12:41:28     INFO -      rbx = 0x00007fafc36041a0    rbp = 0x00007fafcae1ed00
[task 2023-09-29T12:41:28.259Z] 12:41:28     INFO -      rsp = 0x00007fafc44788f0    r12 = 0x00000135940a13b6
[task 2023-09-29T12:41:28.259Z] 12:41:28     INFO -      r13 = 0x00007fafc36041a0    r14 = 0x00000000ffffffff
[task 2023-09-29T12:41:28.259Z] 12:41:28     INFO -      r15 = 0x00007fafcae1e030    rip = 0x00007fafceba339e
[task 2023-09-29T12:41:28.259Z] 12:41:28     INFO -     Found by: call frame info
[task 2023-09-29T12:41:28.259Z] 12:41:28     INFO - 16  libxul.so!mozilla::ipc::MessagePumpForNonMainThreads::Run(base::MessagePump::Delegate*) [MessagePump.cpp:5642d4fa28cff48dedc96523cedec93fc88ca6e4 : 300 + 0x9]
[task 2023-09-29T12:41:28.259Z] 12:41:28     INFO -      rbx = 0x00007fafc3603200    rbp = 0x0000000000000005
[task 2023-09-29T12:41:28.259Z] 12:41:28     INFO -      rsp = 0x00007fafc4478a80    r12 = 0x00007fafc3603220
[task 2023-09-29T12:41:28.259Z] 12:41:28     INFO -      r13 = 0x00007fafc36041a0    r14 = 0x00007fafc4478b18
[task 2023-09-29T12:41:28.259Z] 12:41:28     INFO -      r15 = 0x00007fafc65fd160    rip = 0x00007fafce867bc9
[task 2023-09-29T12:41:28.259Z] 12:41:28     INFO -     Found by: call frame info
[task 2023-09-29T12:41:28.259Z] 12:41:28     INFO - 17  libxul.so!MessageLoop::RunInternal() [message_loop.cc:5642d4fa28cff48dedc96523cedec93fc88ca6e4 : 370]
[task 2023-09-29T12:41:28.259Z] 12:41:28     INFO -     Found by: inlining
[task 2023-09-29T12:41:28.259Z] 12:41:28     INFO - 18  libxul.so!MessageLoop::RunHandler() [message_loop.cc:5642d4fa28cff48dedc96523cedec93fc88ca6e4 : 363]
[task 2023-09-29T12:41:28.259Z] 12:41:28     INFO -     Found by: inlining
[task 2023-09-29T12:41:28.259Z] 12:41:28     INFO - 19  libxul.so!MessageLoop::Run() [message_loop.cc:5642d4fa28cff48dedc96523cedec93fc88ca6e4 : 345]
[task 2023-09-29T12:41:28.259Z] 12:41:28     INFO -     Found by: inlining
[task 2023-09-29T12:41:28.259Z] 12:41:28     INFO - 20  libxul.so!nsThread::ThreadFunc(void*) [nsThread.cpp:5642d4fa28cff48dedc96523cedec93fc88ca6e4 : 370 + 0x41]
[task 2023-09-29T12:41:28.260Z] 12:41:28     INFO -      rbx = 0x00007fafc65faea0    rbp = 0x0000000000000005
[task 2023-09-29T12:41:28.260Z] 12:41:28     INFO -      rsp = 0x00007fafc4478b00    r12 = 0x0000000000000000
[task 2023-09-29T12:41:28.260Z] 12:41:28     INFO -      r13 = 0x0000000000000002    r14 = 0x00007fafc4478b00
[task 2023-09-29T12:41:28.260Z] 12:41:28     INFO -      r15 = 0x00007fafd6a2f7c0    rip = 0x00007fafcf1e7a41
[task 2023-09-29T12:41:28.260Z] 12:41:28     INFO -     Found by: call frame info
[task 2023-09-29T12:41:28.260Z] 12:41:28     INFO - 21  libnspr4.so!_pt_root [ptthread.c:5642d4fa28cff48dedc96523cedec93fc88ca6e4 : 201 + 0x6]
[task 2023-09-29T12:41:28.260Z] 12:41:28     INFO -      rbx = 0x00007fafc65fd160    rbp = 0x0000000000000002
[task 2023-09-29T12:41:28.260Z] 12:41:28     INFO -      rsp = 0x00007fafc4478d00    r12 = 0x0000000000000000
[task 2023-09-29T12:41:28.260Z] 12:41:28     INFO -      r13 = 0x0000000000000002    r14 = 0x00007fafc4479640
[task 2023-09-29T12:41:28.260Z] 12:41:28     INFO -      r15 = 0x00007fafd6a1c880    rip = 0x00007fafd6c87bbd
[task 2023-09-29T12:41:28.260Z] 12:41:28     INFO -     Found by: call frame info
[task 2023-09-29T12:41:28.260Z] 12:41:28     INFO - 22  firefox-bin!set_alt_signal_stack_and_start(PthreadCreateParams*) [pthread_create_interposer.cpp:5642d4fa28cff48dedc96523cedec93fc88ca6e4 : 81 + 0x5]
[task 2023-09-29T12:41:28.260Z] 12:41:28     INFO -      rbx = 0x00007fafc65fd160    rbp = 0x0000000000000000
[task 2023-09-29T12:41:28.260Z] 12:41:28     INFO -      rsp = 0x00007fafc4478d50    r12 = 0x00007fafd6c87a30
[task 2023-09-29T12:41:28.260Z] 12:41:28     INFO -      r13 = 0x0000000000000002    r14 = 0x00007fafc681f000
[task 2023-09-29T12:41:28.260Z] 12:41:28     INFO -      r15 = 0x00007fafc4478d80    rip = 0x00005632bc220314
[task 2023-09-29T12:41:28.260Z] 12:41:28     INFO -     Found by: call frame info
[task 2023-09-29T12:41:28.260Z] 12:41:28     INFO - 23  libc.so.6!start_thread [pthread_create.c : 442 + 0x11]
[task 2023-09-29T12:41:28.261Z] 12:41:28     INFO -      rbx = 0x00007fafc4479640    rbp = 0x0000000000000000
[task 2023-09-29T12:41:28.262Z] 12:41:28     INFO -      rsp = 0x00007fafc4478e20    r12 = 0x00007fafc4479640
[task 2023-09-29T12:41:28.262Z] 12:41:28     INFO -      r13 = 0x0000000000000002    r14 = 0x00007fafd6d3d850
[task 2023-09-29T12:41:28.262Z] 12:41:28     INFO -      r15 = 0x00007ffde02e6050    rip = 0x00007fafd6d3db43
[task 2023-09-29T12:41:28.262Z] 12:41:28     INFO -     Found by: call frame info
[task 2023-09-29T12:41:28.262Z] 12:41:28     INFO - 24  libc.so.6!__clone3 + 0x2f
[task 2023-09-29T12:41:28.262Z] 12:41:28     INFO -      rbx = 0x00007ffde02e5ef0    rbp = 0x0000000000000000
[task 2023-09-29T12:41:28.262Z] 12:41:28     INFO -      rsp = 0x00007fafc4478ec0    r12 = 0x00007fafc4479640
[task 2023-09-29T12:41:28.262Z] 12:41:28     INFO -      r13 = 0x0000000000000002    r14 = 0x00007fafd6d3d850
[task 2023-09-29T12:41:28.262Z] 12:41:28     INFO -      r15 = 0x00007ffde02e6050    rip = 0x00007fafd6dcfa00
[task 2023-09-29T12:41:28.262Z] 12:41:28     INFO -     Found by: call frame info

3 crashes out of 20 makes this a regular intermittent or even maybe a one time deal.

Summary: Perma Tier2 LateBeta application crashed [@ mozilla::Task::GetPriority] | automation.py when Gecko 120 merges to Beta on 2023-10-23 → Intermittent linux wayland [tier 2] Late Beta application crashed [@ mozilla::Task::GetPriority] | automation.py when Gecko 120 switches to Late Beta on 2023-11-03

Following the example of bug 1855215 comment 1.

Group: core-security
Severity: -- → S3
Group: core-security → dom-core-security

I see TimerThread on the stack. Is that passing a deleted object around?

Flags: needinfo?(jlink)

I think I have been able to figure out a lot (but maybe not everything) about what has happened here and why.

Observations:

  • The direct cause of the crash is an attempted read from a bad memory location (0xe5e5e5e5e5e5e649) in Task::GetPriority().
  • The reason for this attempted memory access is a Task pointer that is 0xe5e5e5e5e5e5e5e5.(*)
  • 0xe5e5e5e5e5e5e5e5 is a fill pattern used by heap allocators to mark deleted memory.
  • This implies not that the Task object itself was deleted, but that wherever the pointer this Task object was "fetched" from was deleted.
  • The context in which we encounter this bad Task pointer is in a comparison while inserting into std::set<RefPtr<Task>> 'mMainThreadTasks'.

My first inclination would be to suspect that the newly insert()'ed task is the bad task however I think that is not the case. There are plenty of accesses through that pointer prior to the insertion, including immediately prior to insertion for example, so it seems like we must be coming across a Task pointer in a deleted RefPtr that is already in the set. Since std::set is going to be managing the creation and deletion of the RefPtr's that is uses internally, this would seem to imply that either std::set has a defect in how it manages those internal objects (possible but unlikely) or that we have a concurrency issue where the set is being modified at the same time on a different thread. In looking through the code, however, all references to mMainThreadTasks seem to be protected appropriately with locks of mGraphMutex.

The full output of the crash provides the next piece of this puzzle: TaskController is being deleted on another thread because that thread owns the TaskController and is shutting down. TaskController directly contains mMainThreadTasks so the destructor of TaskController will implicitly be deleting that member, even without any direct reference to it.

The code that is triggering this shutdown seems to be this which was just recently added. Interestingly that code is only supposed to execute if we're not running headless which is how I would have expected most tests to be running so maybe there's a problem there.

@emilio: Do you have any thoughts on any of this?

(*) The memory allocation is calculated by adding "0x64" to 0xe5e5e5e5e5e5e5e5. I think that this is accessing the mPriority member but, for me, mPriority is at offset 64 (decimal), not in hex, and I don't see any reason why it would be different on Linux. I think that perhaps the output here might be mistakenly claiming that this value is hex.

Flags: needinfo?(jlink) → needinfo?(emilio)

So the fact that TaskController is an std::unique_ptr is really odd, because that means that onexit handlers destroy it, which is probably not great: https://searchfox.org/mozilla-central/rev/1865e9fba4166ab2aa6c9d539913115723d9cc53/xpcom/threads/TaskController.cpp#31

The other issue is that these tests are crashing because we don't ship wayland yet past late beta, that's bug 1752398. So our test harness doesn't enable it separately and we fail initialization.

So I think there are two things that need to happen:

  • Fix TaskController shutdown.
  • Either ship wayland, or enable it on our wayland tests with MOZ_ENABLE_WAYLAND=1. Probably the later is easiest.

Andrew, can you look at the second bit? It should be a matter of setting that env variable on our testing environment. I can look at the first.

Depends on: wayland-stable
Flags: needinfo?(emilio) → needinfo?(ahal)

The task from the failure log in comment 0 does have MOZ_ENABLE_WAYLAND: "1" in its env. So not sure what's going on here :/

Flags: needinfo?(ahal) → needinfo?(emilio)

Hmm, bug 1856582 added more diagnostics here. Is this fixed? If not could we get an updated log?

Flags: needinfo?(emilio) → needinfo?(nbeleuzu)

Emilio, the test didn`t fail on todays beta-sim

Flags: needinfo?(nbeleuzu) → needinfo?(emilio)

Right, but we need a late-beta-sim, right?

Flags: needinfo?(emilio) → needinfo?(nbeleuzu)

Sorry, here is the late-beta

Flags: needinfo?(nbeleuzu) → needinfo?(emilio)

Okay, let's call this fixed since the underlying bug is fixed and the tasks pass now.

Status: NEW → RESOLVED
Closed: 2 years ago
Flags: needinfo?(emilio)
Resolution: --- → WORKSFORME

Bulk-unhiding security bugs fixed in Firefox 119-121 (Fall 2023). Use "moo-doctrine-subsidy" to filter

Group: dom-core-security
You need to log in before you can comment on or make changes to this bug.