Crash in CheckCanChangeActiveContext when open devtools + Quantum DOM scheduler enabled

RESOLVED FIXED in Firefox 58

Status

()

defect
P3
critical
RESOLVED FIXED
2 years ago
2 years ago

People

(Reporter: kanru, Assigned: billm)

Tracking

({crash})

unspecified
mozilla58
Unspecified
Linux
Points:
---

Firefox Tracking Flags

(firefox-esr52 unaffected, firefox56 unaffected, firefox57 disabled, firefox58 fixed)

Details

(crash signature)

Attachments

(2 attachments, 2 obsolete attachments)

This bug was filed from the Socorro interface and is 
report bp-37b34a21-da38-4dee-af82-7c2940170830.
=============================================================

STR:

  1. Set about:config
     dom.ipc.scheduler = true
     dom.ipc.scheduler.chaoticScheduling = false
     dom.ipc.scheduler.preemption = false
     dom.ipc.scheduler.threadCount = 2
     dom.ipc.scheduler.useMultipleQueues = true

  2. Select a tab and open devtools

Stack:

  CheckCanChangeActiveContext
 	js/src/vm/Runtime.cpp:366
  JSRuntime::setActiveContext
 	js/src/vm/Runtime.cpp:373
  mozilla::CooperativeThreadPool::CooperativeThread::Yield
 	xpcom/threads/CooperativeThreadPool.cpp:259
  mozilla::detail::SchedulerEventQueue::GetEvent
 	xpcom/threads/Scheduler.cpp:278
  nsThread::ProcessNextEvent(bool, bool*)
  ...
  nsFrameMessageManager::ReceiveMessage
  mozilla::dom::TabChild::RecvAsyncMessage
  mozilla::dom::PBrowserChild::OnMessageReceived

The crashing thread is not main thread so I'm not sure why TabChild::RecvAsyncMessage appears in the stack.
Duplicate of this bug: 1395546
This is the #6 Windows topcrash in Nightly 20170901220209. Bug 1395546 also has the same signature.

bhackett, you modified this failing assertion:

> MOZ_RELEASE_ASSERT(group->ownerContext().context() == nullptr)

in bug 1341321. Any ideas?
Flags: needinfo?(bhackett1024)
The JS debugger is not compatible with cooperative multithreading (the debugger can create arbitrary links between different zone groups, so if there are multiple threads around then particular threads can't maintain exclusive access to the zone group they are operating on).  There are a couple of callbacks which the JS engine invokes when only one cooperative thread should be operating at a time (specified by SetSingleThreadedExecutionCallbacks); the JS shell uses these but AFAICT from dxr the browser does not.  If these callbacks are used to ensure that only a single thread is used when a debugger is running, then the yield which triggered this crash should not need to be performed.
Flags: needinfo?(bhackett1024) → needinfo?(wmccloskey)
Flags: needinfo?(wmccloskey)
This patch adds support for temporarily suspending and resuming the scheduler. This is needed for later patches.
Assignee: nobody → wmccloskey
Attachment #8910974 - Flags: review?(nfroyd)
This is needed for child process debugging since it can touch all zones.
Attachment #8910975 - Flags: review?(jimb)
I need some advice to proceed here. This patch sorta works, but I'm pretty sure it will result in lots of fuzz bugs. Here are some of the issues:

1. I noticed that the web console seems to call wrapDebuggeeValue before it has added any debuggee globals to a debugger. I saw that happening through this line:
http://searchfox.org/mozilla-central/rev/15ce5cb2db0c85abbabe39a962b0e697c9ef098f/devtools/server/actors/webconsole.js#456
I hacked around this problem by causing wrapDebuggeeValue to "join" a zone group if the Debugger isn't already part of one. But I'm worried this isn't really correct.

2. There are a number of functions like adoptDebuggeeValue and findAllGlobals that seem like they would allow you to skirt the restrictions I've added here, causing wrapDebuggeeValue to assert. Even if the Firefox front-end doesn't use them this way, I'm worried that the fuzzer will find a way to do so. And I'm not sure of the right way to restrict them. 

3. WebExtension debugging does in fact use findAllGlobals, even though it theoretically only debugs one zone group.
Attachment #8910984 - Flags: feedback?(jimb)
Comment on attachment 8910974 [details] [diff] [review]
Scheduler support for blockThreadedExecution

Review of attachment 8910974 [details] [diff] [review]:
-----------------------------------------------------------------

r=me; let me know if you have objections to the changes outlined below.

::: js/xpconnect/idl/xpccomponents.idl
@@ +126,5 @@
>  /**
> + * Interface for callback to be passed to Cu.blockThreadedExecution.
> + */
> +[scriptable, function, uuid(c3b85a5c-c328-47d4-aaaf-384c4ff9d77d)]
> +interface nsBlockThreadedExecutionCallback : nsISupports

Any reason we didn't call this nsIBlock...?  Do we not expect C++ implementors of this interface, but only JS functions?

::: js/xpconnect/src/XPCComponents.cpp
@@ +3411,5 @@
> +{
> +    // Take an owning reference to the callback.
> +    RefPtr<nsBlockThreadedExecutionCallback> callback(aCallback);
> +    Scheduler::BlockThreadedExecution(NS_NewRunnableFunction("BlockThreadedExecution",
> +                                                             [callback] { callback->Callback(); }));

FWIW, NS_NewRunnableMethod would be about the same thing, and then you wouldn't have to bother with the RefPtr, because NS_NewRunnableMethod would take the owning reference for you.

Actually, could we instead pass the callback unmodified all the way through, and let the scheduler (SchedulerImpl::Stop) handle the details of turning it into a runnable, though?  Having it be a runnable seems like we're letting details about how the scheduler does things internally leak out into the external API.  having the callback type in the external API would also make things a little clearer and less likely to be misused (people passing random runnables in or what have you).

::: xpcom/threads/Scheduler.cpp
@@ +520,5 @@
> +  MOZ_ASSERT(mNumSchedulerBlocks > 0);
> +
> +  MutexAutoLock lock(mLock);
> +  mShuttingDown = true;
> +  mShutdownCallback = aStoppedCallback;

We should assert that we don't already have a callback here?

@@ +534,5 @@
>    mShuttingDown = true;
> +
> +  // Delete the SchedulerImpl once shutdown is complete.
> +  mShutdownCallback = NS_NewRunnableFunction("SchedulerImpl::Shutdown",
> +                                             [] { Scheduler::sScheduler = nullptr; });

And here too.
Attachment #8910974 - Flags: review?(nfroyd) → review+
I pushed the previous patch to try and that revealed a number of additional issues that I'm having trouble getting a handle on.

There's more than enough risk already associated with Quantum DOM. I don't want to add any more, especially for something that I don't think is likely to have much effect. This patch makes us use blockThreadedExecution for all content process debugging.
Attachment #8910975 - Attachment is obsolete: true
Attachment #8910984 - Attachment is obsolete: true
Attachment #8910975 - Flags: review?(jimb)
Attachment #8910984 - Flags: feedback?(jimb)
Attachment #8911336 - Flags: review?(jimb)
Comment on attachment 8911336 [details] [diff] [review]
Use blockThreadedExecution for debugging

Review of attachment 8911336 [details] [diff] [review]:
-----------------------------------------------------------------

This is better than changing the Debugger API itself to do the synchronization.
Attachment #8911336 - Flags: review?(jimb) → review+
Priority: -- → P3
Pushed by wmccloskey@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/41d3c1e5dacc
Scheduler support for blockThreadedExecution (r=froydnj)
https://hg.mozilla.org/integration/mozilla-inbound/rev/d1666bcca1a5
Use blockThreadedExecution for debugging (r=jimb)
https://hg.mozilla.org/mozilla-central/rev/41d3c1e5dacc
https://hg.mozilla.org/mozilla-central/rev/d1666bcca1a5
Status: NEW → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla58
You need to log in before you can comment on or make changes to this bug.