Closed Bug 801304 Opened 12 years ago Closed 5 years ago

Nested event loops do not suspend XHR events

Categories

(DevTools :: Debugger, defect, P3)

defect

Tracking

(firefox66 fixed)

RESOLVED FIXED
Firefox 66
Tracking Status
firefox66 --- fixed

People

(Reporter: bugzilla, Assigned: bhackett1024)

References

(Blocks 1 open bug)

Details

(Whiteboard: [devtools-platform])

Attachments

(6 files, 5 obsolete files)

Attached file test case
1. Open the attached test case
2. Open the Debugger and Web Console panels
3. Refresh the page

Actual results:
1. The debugger breaks at line 12
2. "test failed" is printed to the console immediately after

Expected:
1. The debugger breaks at line 12
2. Nothing is printed to the console
3. Press the resume button
4. "test passed" is printed to the console

Here is a copy of the script in the attachment:

var result = "test failed";

onload = function(){
    var xhr = new XMLHttpRequest();
    xhr.open('GET', 'http://www.mozilla.org', true);
    xhr.onload = done;
    xhr.onerror = done;
    xhr.send();
    debugger; // line 12
    result = "test passed";
};

function done(){
    console.log(result);
}

The result is the same if you set a breakpoint in the debugger UI instead of using the debugger keyword.

When the debugger is not loaded, "test passed" is printed to the console as expected.

It fails in both Firebug and the built in debugger in Firefox 19 nightly and Firefox 16 release.
Thanks for filing, I remember Victor had noticed this as well. It looks like we need platform help to fix this. We currently use nsIDOMWindowUtils.suppressEventHandling and nsIDOMWindowUtils.suspendTimeouts, but apparently none of those affect XHR:

http://mxr.mozilla.org/mozilla-central/source/toolkit/devtools/debugger/server/dbg-browser-actors.js#465

The only somewhat related bug that I can find, is bug 409737. If this is true, Boris mentions in that bug that he knows what needs to be done.
This has nothing to do with bug 409737.

The basic problem is that when you spin the event loop the XHR events still get delivered. But worse yet, not delivering them would violate the XHR spec unless done _very_ carefully, because they would end up being delivered when the XHR object is in the wrong state.  There are existing bugs on this.
FILTER ON PUMPKINS.
Priority: -- → P3
See Also: → 1044074
Summary: Debugger breakpoint breaks run-to-completion when used with async XMLHttpRequest → Nested event loops do not suspend async XMLHttpRequests
Summary: Nested event loops do not suspend async XMLHttpRequests → Nested event loops do not suspend XHR events
Blocks: dbg-r2c
Blocks: 1039795
Blocks: 977972
(In reply to Not doing reviews right now from comment #2)
> This has nothing to do with bug 409737.
> 
> The basic problem is that when you spin the event loop the XHR events still
> get delivered. But worse yet, not delivering them would violate the XHR spec
> unless done _very_ carefully, because they would end up being delivered when
> the XHR object is in the wrong state.  There are existing bugs on this.

Hi Boris,

I suspect this bug might be causing bug 1158498. In that bug, we have an asynchronous XHR request, where the callback gets invoked while the debuggee is supposed to be paused.

Based on your comments above, it seems like it should be possible to fix this, although I don't know how hard it would be in practice. Can you give me an idea of what would need to be done?
Assignee: nobody → ejpbruel
Blocks: 1143449
Whiteboard: [devtools-platform]
Assignee: ejpbruel → nobody
Product: Firefox → DevTools
Here is another STR: https://dbg-r2c-xhr.glitch.me/

We saw this here:
1. go to https://firefox-dev.tools/debugger-examples/examples/todomvc/
2. add a todo
3. add a breakpoint at todo-view.js#34
3. reload

ER: getFile's XHR onload callback is suspended,
AR: it is run and we see `JavaScript warning: resource://devtools/server/actors/utils/event-loop.js, line 118: DebuggeeWouldRun: debuggee `https://alpine-ninja.glitch.me/client.js:14' would run`
Attached patch WIP (obsolete) — Splinter Review
Attempted fix for this bug.  I think this is correct, but this needs testing and its own tests.

The strategy here is to suppress events on the channels associated with XHR requests on threads where there is a nested event loop pushed by the debugger.  The network code already has a class, ChannelEventQueue, which is in place to deal with a similar problem where an XHR handler spins up its own nested event loop, in which case other events for the same XHR should not fire reentrantly.  This just needs some modification so that we can suppress events for *all* XHRs when inside the debugger's nested event loops.
Assignee: nobody → bhackett1024
I suppose I should clarify this a bit: 'suppress' here just pushes back the events until some later time when they can be processed (in the original order they were delivered) on the thread's main event loop.
Attached patch WIP (obsolete) — Splinter Review
Updated patch, with a new test.  This optimizes things and fixes a couple problems with the earlier patch:

- nsJSInspector is main thread only, and can't be accessed on worker threads.  I don't know what the debugging story currently is for workers, but AFAICT this means we can't enter nested pauses on debuggers, and so can't have run-to-completion problems when debugging them.  Accesses have been fixed to make sure we don't use nsJSInspector off the main thread.

- After the debugger server pauses, it can create new XHRs to, for example, load the URL where the debuggee is paused.  These XHRs need to be processed by the event loop when paused, so I added logic to allow activity on any ChannelEventQueues that were created after the debugger started pausing.  This works but it isn't very robust: if the debugger creates an XHR right before the pause, events on that XHR can't be processed while paused.  I'm going to keep looking to see if there is a better option here.
Attachment #9030064 - Attachment is obsolete: true
Attached patch patch (obsolete) — Splinter Review
Finished patch.  It turns out that both points in comment 9 are somewhat wrong:

- Nested pauses in workers *are* supported, but they just don't use nsJSInspector.  When a worker pauses it does so via WorkerPrivate::EnterDebuggerEventLoop, but since this doesn't spin up a complete event loop for the thread (it only processes messages associated with the debugger) there shouldn't be any run-to-completion problems with XHR requests the worker has made.

- The debugger does not create its own XHRs while paused, AFAICT.  It does create ChannelEventQueues via the network resources it fetches, however.  To allow these queues to operate while paused, and to remove the workaround in the last patch, this patch only suspends ChannelEventQueues whose owner is associated with an XHR.
Attachment #9030117 - Attachment is obsolete: true
This patch allows nsJSInspector to suspend activity in ChannelEventQueues when they are associated with an XHR and a nested event loop has been pushed onto the main thread's stack.
Attachment #9030153 - Flags: review?(jimb)
Before processing events in a ChannelEventQueue, check with the nsJSInspector to see if the queue should be suspended instead.  This also adds a method on the ChannelEventQueue to see if the queue is associated with an XHR, which nsJSInspector will need for its own testing.
Attachment #9030154 - Flags: review?(honzab.moz)
Add a test that main thread XHR handlers do not break run-to-completion when pausing in the debugger.
Attachment #9030155 - Flags: review?(jlaster)
This is a minor thing, but I think it is nice when the run-to-completion test is in the server unit directory versus the client directory. 

Jim's worker onmessage test is a good example:
https://phabricator.services.mozilla.com/D9221#change-qExsoApmzPsx
Brian, does your patch address this bug as well? 
https://bugzilla.mozilla.org/show_bug.cgi?id=977972
(In reply to Jason Laster [:jlast] from comment #15)
> Brian, does your patch address this bug as well? 
> https://bugzilla.mozilla.org/show_bug.cgi?id=977972

Yeah, it does, just commented in that bug.
Attachment #9030155 - Flags: review?(jlaster) → review?(jimb)
Comment on attachment 9030154 [details] [diff] [review]
Part 2 - Check if event queues should be suspended before processing their events.

Review of attachment 9030154 [details] [diff] [review]:
-----------------------------------------------------------------

::: netwerk/ipc/ChannelEventQueue.cpp
@@ +172,5 @@
> +  if (channel) {
> +    nsCOMPtr<nsILoadInfo> loadInfo = channel->GetLoadInfo();
> +    if (loadInfo) {
> +      nsContentPolicyType contentType = loadInfo->InternalContentPolicyType();
> +      return contentType == nsIContentPolicy::TYPE_INTERNAL_XMLHTTPREQUEST;

not sure how often this may get called, but some kind of caching may be good.  not a review requirement, tho.

::: netwerk/ipc/ChannelEventQueue.h
@@ +157,5 @@
>  };
>  
> +inline bool ChannelEventQueue::MaybeSuspendForJSInspector() {
> +  if (jsinspector::nsJSInspector::SuspendOnChannelEvent(this)) {
> +    SuspendInternal();

add a comment that SuspendOnChannelEvent takes ownership of the queue and resumes when the conditions in the inspector are met.
Attachment #9030154 - Flags: review?(honzab.moz) → review+
(In reply to Brian Hackett (:bhackett) from comment #9)

> - nsJSInspector is main thread only, and can't be accessed on worker
> threads.  I don't know what the debugging story currently is for workers

AFAIK nobody has worked on this since Eddy Bruel left.  He left a PDF
explaining the current state as of then (December 2016) which I can forward
to anybody interested in this.
I'd love to see the PDF
Attached file Worker Debugging.pdf
Eddy's worker debugging document.
Comment on attachment 9030153 [details] [diff] [review]
Part 1 - Allow nsJSInspector to suspend event queues.

Review of attachment 9030153 [details] [diff] [review]:
-----------------------------------------------------------------

This looks to me like it suspends all XHRs in all tabs, when we enter the debugger in any tab. I think the effects need to be limited to the XHRs owned by the debuggee. If I'm misunderstanding that, let me know.

::: devtools/platform/nsJSInspector.cpp
@@ +63,5 @@
>    MOZ_ASSERT(mLastRequestor.isNull());
>    mozilla::DropJSObjects(this);
>  }
>  
> +/* static */ Atomic<size_t> nsJSInspector::gTotalNestedLoopLevel;

This should have a #include <mozilla/Atomics.h>.

@@ +94,5 @@
>    }
>  
> +  if (mNestedLoopLevel == 0) {
> +    for (net::ChannelEventQueue* queue : mSuspendedQueues) {
> +      queue->Resume();

Ideally, when the program is paused at a breakpoint, and the user explicitly causes some sort of interaction (say, an XHR) from the console, that would run, even as other XHRs started by the interrupted code remain paused. But we don't do that for timers, and it seems like a lot of trouble for a corner case, so simply suspending them all until we're back at the top level like this seems fine. It's much worse for XHRs to break R2C, so this seems like the right simplification.

@@ +133,5 @@
>    return NS_OK;
>  }
>  
> +/* static */ bool nsJSInspector::SlowPathSuspendOnChannelEvent(
> +    net::ChannelEventQueue* aQueue) {

OMG, this really *is* what clang-format produces. My faith is shaken.
Flags: needinfo?(bhackett1024)
Comment on attachment 9030153 [details] [diff] [review]
Part 1 - Allow nsJSInspector to suspend event queues.

Review of attachment 9030153 [details] [diff] [review]:
-----------------------------------------------------------------

clearing review; see needinfo.
Attachment #9030153 - Flags: review?(jimb)
Comment on attachment 9030155 [details] [diff] [review]
Part 3 - Add test that XHR does not break run-to-completion when paused in the debugger.

Review of attachment 9030155 [details] [diff] [review]:
-----------------------------------------------------------------

These are really nice.
Attachment #9030155 - Flags: review?(jimb) → review+
Attached patch patchSplinter Review
Updated complete patch.  Jim is right in comment 22, we do need to avoid suspending XHRs for content that does not have event handling suppressed (it could be for an unrelated tab that just happens to be in the same content process).  This patch avoids touching nsJSInspector, and ties into the event suppression state in nsIDocument: XHRs for an nsIDocument should not be processed when (a) event handling is suppressed for it, and (b) it is not being used in a synchronous operation that breaks run-to-completion.  The latter case is for synchronous XHRs, which also suppress event handling and spin up a nested event loop, rather deliberately breaking run-to-completion in the process.
Attachment #9030151 - Attachment is obsolete: true
Flags: needinfo?(bhackett1024)
With this patch, nsIDocument keeps track of the event queues for XHRs which were suspended while the document had event handling suppressed, and which need to be resumed when handling stops being suppressed.
Attachment #9030968 - Flags: review?(bugs)
Sorry for the re-review, this patch has changed a fair amount.  The ChannelEventQueue now does all the checking itself to see if it should be suspended instead of processing events.
Attachment #9030153 - Attachment is obsolete: true
Attachment #9030154 - Attachment is obsolete: true
Attachment #9030969 - Flags: review?(honzab.moz)
Comment on attachment 9030968 [details] [diff] [review]
Part 1 - Keep track of suspended event queues in nsIDocument.

Review of attachment 9030968 [details] [diff] [review]:
-----------------------------------------------------------------

::: dom/base/nsDocument.cpp
@@ +8582,5 @@
> +  if (!EventHandlingSuppressed()) {
> +    for (net::ChannelEventQueue* queue : mSuspendedQueues) {
> +      queue->Resume();
> +    }
> +    mSuspendedQueues.Clear();

swap to a local array to iterate (re-entry protection)

::: dom/base/nsIDocument.h
@@ +4040,5 @@
>    uint32_t mEventsSuppressed;
>  
> +  // Any XHR ChannelEventQueues that were suspended on this document while
> +  // events were suppressed.
> +  nsTArray<RefPtr<mozilla::net::ChannelEventQueue>> mSuspendedQueues;

assert single thread access for this array
Comment on attachment 9030969 [details] [diff] [review]
Part 2 - Check if event queues should be suspended before processing their events.

Review of attachment 9030969 [details] [diff] [review]:
-----------------------------------------------------------------

r+ with comment addressed.

::: netwerk/ipc/ChannelEventQueue.cpp
@@ +179,5 @@
> +  // We only ever need to suppress events on the main thread, since this is
> +  // where content scripts can run.
> +  if (!NS_IsMainThread()) {
> +    return false;
> +  }

do the thread checking as the first thing (before you access mHasCheckedForXMLHttpRequest && !mForXMLHttpRequest)

@@ +205,5 @@
> +      nsCOMPtr<nsIDocument> document;
> +      loadInfo->GetLoadingDocument(getter_AddRefs(document));
> +      if (document &&
> +          document->EventHandlingSuppressed() &&
> +          !document->IsInSyncOperation()) {

this condition should be r? by smaug as well.

@@ +208,5 @@
> +          document->EventHandlingSuppressed() &&
> +          !document->IsInSyncOperation()) {
> +        document->AddSuspendedChannelEventQueue(this);
> +        SuspendInternal();
> +        return true;

instead of branch nesting it would be nicer to use early |return false| statemenets instead.  but it's just a personal preference.
Attachment #9030969 - Flags: review?(honzab.moz)
Attachment #9030969 - Flags: review?(bugs)
Attachment #9030969 - Flags: review+
Attachment #9030968 - Flags: review?(bugs) → review+
Comment on attachment 9030969 [details] [diff] [review]
Part 2 - Check if event queues should be suspended before processing their events.

>+      // Suspend the queue if the associated document has suppressed event
>+      // handling, *and* it is not in the middle of a synchronous operation
>+      // (particularly a synchronous XHR) that deliberately breaks
>+      // run-to-completion.

The comment isn't quite right. sync XHR isn't supposed to break run-to-completion. It is Gecko limitation that
it does. So perhaps clarify the comment a bit.
Attachment #9030969 - Flags: review?(bugs) → review+
Pushed by bhackett@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/4661014dd7ff
Part 1 - Keep track of suspended event queues in nsIDocument, r=smaug.
https://hg.mozilla.org/integration/mozilla-inbound/rev/2e68f8f42eb5
Part 2 - Check if event queues should be suspended before processing their events, r=mayhemer,smaug.
https://hg.mozilla.org/integration/mozilla-inbound/rev/ef37f622e40e
Part 3 - Add test that XHR does not break run-to-completion when paused in the debugger, r=jimb.
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: