Closed Bug 779707 Opened 7 years ago Closed 4 years ago

"ASSERTION: Not yet suspended!" with sync XHR, workers

Categories

(Core :: DOM: Workers, defect)

x86_64
macOS
defect
Not set

Tracking

()

RESOLVED WORKSFORME

People

(Reporter: jruderman, Unassigned)

References

(Blocks 1 open bug)

Details

(Keywords: assertion, testcase)

Attachments

(2 files)

Attached file testcase
###!!! ASSERTION: Not yet suspended!: 'mParentSuspended', file ../../../dom/workers/WorkerPrivate.cpp, line 2048

###!!! ASSERTION: Not yet suspended!: 'mSuspended', file ../../../dom/workers/WorkerPrivate.cpp, line 3099

###!!! ASSERTION: Mismatched busy count mods!: 'aIncrease || mBusyCount', file ../../../dom/workers/WorkerPrivate.cpp, line 2158
Attached file stack trace
Stuck a few printfs in the worker code and...

When all workers are suspended (from nsGlobalWindow::SuspendTimeouts), there are no Workers associated with the window, but, when the window resumes, the Worker (that is created after the sync XHR) has been registered with the window, and obviously is not aware of being suspended.

I thought sync XHR meant script after it would not be evaluated and so the worker shouldn't be created until the resume is finished. (And XHR queues the runnable that causes the window to resume immediately after it suspends everything, so worker code is definitely being run from the nsAutoSyncLoop block - http://mxr.mozilla.org/mozilla-central/source/content/base/src/nsXMLHttpRequest.cpp#2890)
Flags: needinfo?(bent.mozilla)
I have no idea what's going on here but it looks like this is a failure to properly synchronize in the main thread XHR implementation. There shouldn't be any way that the worker is created before the window is resumed.
Component: DOM: Workers → DOM
Flags: needinfo?(bent.mozilla)
Script isn't really "suspended" during a sync XHR on the main thread. When a sync XHR is running, we simply spin the event loop until the XHR is finished. This means that any runnables posted to the main thread will run just fine during the sync XHR, and if those runnables call out into JS, that JS will run.

The only thing we "suspend" during the sync XHR is that we prevent certain runnables from being posted to the event loop while the sync XHR is running. So for example UI events and setTimeouts don't post runnables to the main event loop. Right before we return from the sync-xhr code (hopefully after we stop the event-loop-spinning) we re-enable the UI and setTimeouts code so that it can start posting runnables to the event loop again.

However lots of other APIs don't change behavior during the sync-xhr, and so they gladly post runnables to the event loop, and so lots of stuff happens while the sync XHR is blocked.

Based on the stack trace and comment 2, what it sounds like is happening is something like this:

1. We create a worker, this posts some runnable to the event loop to do some initialization of said worker.
2. JS calls into the sync-xhr code.
3. We tell the timeouts code to stop firing timeouts.
4. We spin the event loop.
5. During the event loop spinning we initialize the worker.
6. The sync XHR finishes and so we stop spinning the event loop.
7. We tell the timeouts code to go again.
8. The timeouts-go code calls into the worker code. Which was thoroughly confused by finding that a worker has been initialized.

I'm *guessing* that part of the problem here is that the code in step 3 and step 7 to suspend/resume timeouts is the same as the code to suspend/resume timeouts that's used when you leave and return to a page. So it *seems* like this code is also attempting to suspend/resume workers.

The code to suspend/resume workers gets confused by the fact that a new worker was created between the suspend/resume.

However we shouldn't be suspending/resuming workers during a sync-XHR code at all. Only when you leave/re-enter a page.

Potentially we could suspend/resume the onmessage firing during the sync-XHR. That would be good for the same reasons that we suspend/resume UI-event-firing. But worker execution has no reason to stop.
Ah, ben points out that i misread the code a bit. And I see that I might have misread the sync-xhr code a bit. Updated theory to follow. Though again, mostly based on guesses.

1. JS calls into the sync-xhr code.
2. We tell the timeouts code to stop firing timeouts.
3. We spin the event loop.
4. The sync XHR finishes and so we stop spinning the event loop.
5. We post an event to resume timeouts.
6. We return from the sync-xhr call and get back into JS
7. We create a worker.
8. We run the resume-timeouts runnable which tells the timeouts code to go again.
9. The timeouts-go code calls into the worker code. Which was thoroughly confused by finding that a worker that isn't suspended.

Same problem though. The suspend/resume timeouts code that's called by sync-xhr should not suspend/resume workers. At most it should suspend/resume message events.
This no longer reproduces.
Status: NEW → RESOLVED
Closed: 4 years ago
Component: DOM → DOM: Workers
Flags: in-testsuite+
Resolution: --- → WORKSFORME
You need to log in before you can comment on or make changes to this bug.