Closed Bug 1637890 Opened 4 years ago Closed 4 years ago

Crash in [@ mozilla::MozPromise<T>::ThenValueBase::Dispatch]

Categories

(Core :: IPC, defect)

78 Branch
defect

Tracking

()

RESOLVED FIXED
mozilla78
Tracking Status
firefox-esr68 --- unaffected
firefox76 --- unaffected
firefox77 --- unaffected
firefox78 --- fixed

People

(Reporter: calixte, Assigned: jya)

References

(Blocks 1 open bug, Regression)

Details

(Keywords: crash, regression)

Crash Data

Attachments

(4 files, 1 obsolete file)

This bug is for crash report bp-e709c7dd-227e-406a-a289-d6ffd0200513.

Top 10 frames of crashing thread:

0 libxul.so mozilla::MozPromise<mozilla::ProfileBufferChunkManagerUpdate, mozilla::ipc::ResponseRejectReason, true>::ThenValueBase::Dispatch xpcom/threads/MozPromise.h:489
1 libxul.so mozilla::MozPromise<mozilla::ProfileBufferChunkManagerUpdate, mozilla::ipc::ResponseRejectReason, true>::DispatchAll xpcom/threads/MozPromise.h:1042
2 libxul.so void mozilla::MozPromise<mozilla::ProfileBufferChunkManagerUpdate, mozilla::ipc::ResponseRejectReason, true>::Private::Reject<mozilla::ipc::ResponseRejectReason> xpcom/threads/MozPromise.h:1144
3 libxul.so mozilla::ipc::MessageChannel::Clear /builds/worker/fetches/clang/include/c++/7.4.0/bits/std_function.h:706
4 libxul.so mozilla::ipc::MessageChannel::NotifyChannelClosed ipc/glue/MessageChannel.cpp:2750
5 libxul.so mozilla::ProfilerParentTracker::~ProfilerParentTracker ipc/glue/ProtocolUtils.cpp:617
6 libxul.so mozilla::ClearOnShutdown_Internal::PointerClearer<mozilla::UniquePtr<mozilla::ProfilerParentTracker, mozilla::DefaultDelete<mozilla::ProfilerParentTracker> > >::Shutdown xpcom/base/ClearOnShutdown.h:72
7 libxul.so mozilla::KillClearOnShutdown xpcom/base/ClearOnShutdown.cpp:53
8 libxul.so mozilla::ShutdownXPCOM xpcom/build/XPCOMInit.cpp:674
9 libxul.so ScopedXPCOMStartup::~ScopedXPCOMStartup toolkit/xre/nsAppRunner.cpp:1277

There is 1 crash in nightly 78 with buildid 20200513094918. In analyzing the backtrace, the regression may have been introduced by patch [1] to fix bug 1592488.
The moz_crash_reason is MOZ_DIAGNOSTIC_ASSERT(AbstractThread::GetCurrent() && AbstractThread::GetCurrent()->IsTailDispatcherAvailable()) (An AbstractThread must exist for the current thread with a tail dispatcher available).

[1] https://hg.mozilla.org/mozilla-central/rev?node=09f0c7ef9cd3

Flags: needinfo?(jyavenard)

The issue is with https://searchfox.org/mozilla-central/rev/9f074fab9bf905fad62e7cc32faf121195f4ba46/tools/profiler/gecko/ProfilerParent.cpp#558

it does:

 RefPtr<AwaitNextChunkManagerUpdatePromise> updatePromise =
      SendAwaitNextChunkManagerUpdate();
  updatePromise->Then(
      GetMainThreadSerialEventTarget(), __func__,

This dispatches to GetMainThreadSerialEventTarget() ;

The crash is on shutdown, what seems to be happening is that the Main thread's AbstractThread gets shutdown before the ProfilerParentTracker.

Both register with ClearOnShutdown:
AbstractThread: https://searchfox.org/mozilla-central/source/xpcom/threads/AbstractThread.cpp#231

ProfileParentTracker: https://searchfox.org/mozilla-central/rev/9f074fab9bf905fad62e7cc32faf121195f4ba46/tools/profiler/gecko/ProfilerParent.cpp#321

So order of shutdown is at play here.

Flags: needinfo?(jyavenard)
Assignee: nobody → jyavenard

Is it normal to still have channels up during xpcom shutdown?

Anyway, it looks like ~ProfilerParentTracker() is closing extant channels, and in there a channel is trying to reject a pending promise from a SendAwaitNextChunkManagerUpdate().

Why is MozPromise not happy about that?

(In reply to Gerald Squelart [:gerald] (he/him) from comment #2)

Is it normal to still have channels up during xpcom shutdown?

it's racy, we are shutting down. There could be pending transactions.

Anyway, it looks like ~ProfilerParentTracker() is closing extant channels, and in there a channel is trying to reject a pending promise from a SendAwaitNextChunkManagerUpdate().

Why is MozPromise not happy about that?

It's a MozPromise being dispatched via the direct task queue; which require the AbstractThread::MainThread to be running.

And it's been shutdown already ; normal dispatch would have just dropped silently the runnable , but with direct dispatch we assert.

Looking at the different phases of shutdown in ShutdownXPCOM, I see that our problem is in the very last one ShutdownFinal, which clears smart pointers including ProfilerParentTracker, but it happens after ShutdownThreads!
So I guess we'd want to close these channels before ShutdownThreads, so that promise rejections can be dispatched.
Or at least reject promises even without closing channels, which sounds scary for me to touch.

Any other ideas?

(In reply to Gerald Squelart [:gerald] (he/him) from comment #4)

Looking at the different phases of shutdown in ShutdownXPCOM, I see that our problem is in the very last one ShutdownFinal, which clears smart pointers including ProfilerParentTracker, but it happens after ShutdownThreads!
So I guess we'd want to close these channels before ShutdownThreads, so that promise rejections can be dispatched.
Or at least reject promises even without closing channels, which sounds scary for me to touch.

Any other ideas?

I don't know the expected lifetime of the ProfilerParentTracker ; but I'm guessing that as we're shutting-down we don't really care.

ClearOnShutdown can be given an optional argument on when it should be called; using ShutdownThread seems to be the right thing to do ;

2nd issue is that the main thread AbstractThread instance is getting shutdown too early anyway.

(In reply to Jean-Yves Avenard [:jya] from comment #5)

2nd issue is that the main thread AbstractThread instance is getting shutdown too early anyway.

This sounds like a bigger issue that could impact others!
If you think it should still be up at ShutdownFinal, this would fix my problem, and feels like a more generic solution.
Indeed it looks like dispatching from MT to MT is still allowed after ShutdownThreads: https://searchfox.org/mozilla-central/rev/4166c15e2a99a23a9b38ad62c9fdfe8e5448b354/xpcom/threads/ThreadEventTarget.cpp#116

Please let me know if you think I should still close the Profiler channels earlier (or at least reject any pending promise). I could file a separate bug for that alone, if you want to keep this one for AbstractThread.

We want the object to be deleted last; ordering in call to ClearOnShutdown() can't guarantee it.

Depends on D75497

When performing the last GC, remaining state watchers would dispatch direct tasks ; this was exposed with P3. Ensure this never happens.

Depends on D75498

Attachment #9149297 - Attachment description: Bug 1637890 - P3. Don't use ClearOnShutdown to destroy AbstractThread::MainThread(). r?froydnj → Bug 1637890 - P2. Don't use ClearOnShutdown to destroy AbstractThread::MainThread(). r?froydnj
Attachment #9149298 - Attachment description: Bug 1637890 - P4. Ensure an AbstractThread exists when dispatching a task. r?froydnj → Bug 1637890 - P3. Ensure an AbstractThread exists when dispatching a task. r?froydnj
Attachment #9149574 - Attachment description: Bug 1637890 - P5. Don't allow the tail dispatcher when threads have been shutdown. r?froydnj → Bug 1637890 - P4. Don't allow the tail dispatcher when threads have been shutdown. r?froydnj
Attachment #9149296 - Attachment description: Bug 1637890 - P2. Process events dispatched by destructors during shutdown. r?froydnj → Bug 1637890 - P5. Process events dispatched by destructors during shutdown. r?froydnj
Attachment #9149574 - Attachment is obsolete: true
Attachment #9149296 - Attachment description: Bug 1637890 - P5. Process events dispatched by destructors during shutdown. r?froydnj → Bug 1637890 - P4. Process events dispatched by destructors during shutdown. r?froydnj
Pushed by jyavenard@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/92f557e40298
P1. Process ClearOnShutdown listeners in LIFO order. r=froydnj
https://hg.mozilla.org/integration/autoland/rev/17ffe89e84fe
P2. Don't use ClearOnShutdown to destroy AbstractThread::MainThread(). r=froydnj
https://hg.mozilla.org/integration/autoland/rev/5e2da1586aca
P3. Ensure an AbstractThread exists when dispatching a task. r=froydnj
Pushed by jyavenard@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/ba935303434c
P4. Process events dispatched by destructors during shutdown. r=froydnj
Regressions: 1640009
Has Regression Range: --- → yes
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: