Closed Bug 1361166 Opened 3 years ago Closed 3 years ago

worker close() is racy

Categories

(Core :: DOM: Workers, enhancement)

enhancement
Not set

Tracking

()

RESOLVED FIXED
mozilla55
Tracking Status
firefox55 --- fixed

People

(Reporter: bkelly, Assigned: bkelly)

References

Details

Attachments

(1 file, 2 obsolete files)

In this try build I'm consistently hitting failures in WorkerGlobalScope_close.htm WPT test:

https://treeherder.mozilla.org/#/jobs?repo=try&revision=ae0dd71fd3c3dac9f6248ce45a23ce6b436e74d0&selectedJob=95724545

This test does this:

async_test(function(t) {
  var worker = new Worker('./support/WorkerClose.js');
  worker.onmessage = t.step_func(function(e) {
    assert_equals(e.data, "ping");
    worker.onmessage = t.unreached_func("Unexpected message event");
    worker.postMessage("pong");
    setTimeout(t.step_func_done(), 100);
  });
  worker.postMessage("ping");
});

// WorkerClose.js script
onmessage = function(evt)
{
    postMessage(evt.data);
    self.close();
}

When the self.close() is called we essentially do:

1) WorkerPrivate::NotifyInternal(Closing)
2) Clear the worker event queue
3) Set the WorkerPrivate mStatus to Closing
4) Schedule a CloseRunnable to set the parent mParentStatus to Closing

Apparently my changes alter the timing just enough so that the main thread triggers a MessageEventRunnable between (3) and (4).  We have a window of time where the event queue has been cleared, but the CloseRunnable has not executed which therefore allows new event runnables to be scheduled.

Andrea, any thoughts on how to fix this?  Can we get rid of CloseRunnable() and just set mParentStatus synchronously?  Its already protected by a mutex.
Flags: needinfo?(amarchesini)
The simple synchronous update passes the tests I tried locally.  Lets see what try thinks:

https://treeherder.mozilla.org/#/jobs?repo=try&revision=1fd8aa22a33eaab2264cae8090a678f8faeda8b5

If this doesn't work maybe I could make CloseRunnable a sync loop runnable instead.  I'd also have to move it up to before the event queue clearing.  (I guess I should do that regardless, though.)
(In reply to Ben Kelly [not reviewing due to deadline][:bkelly] from comment #1)
> If this doesn't work maybe I could make CloseRunnable a sync loop runnable
> instead.  I'd also have to move it up to before the event queue clearing. 
> (I guess I should do that regardless, though.)

Probably goes without saying, but I don't want to do the sync loop.  Its more complicated and probably has a higher risk of deadlock.
Slightly better patch that calls Close() synchronously before clearing the event queue.  This should completely remove the race.

https://treeherder.mozilla.org/#/jobs?repo=try&revision=3d514903bd1fa5e487dccc7f06ecb212606250f2
Attachment #8863533 - Attachment is obsolete: true
Slight improvement to the patch to avoid locking mMutex twice in NotifyInternal().  We just call WorkerPrivateParent::Close() in the existing NotifyInternal() lock.

Andrea, this patches fixes the issue described in comment 0.  Essentially there is a race between when we clear the event queue on worker close and when we mark the parent status as closed (which prevents further dispatch).

Previous version of this patch is green on try, so flagging for review.

https://treeherder.mozilla.org/#/jobs?repo=try&revision=842d597fd00a080a934c791ff574bc44b4cb94f2
Attachment #8863538 - Attachment is obsolete: true
Flags: needinfo?(amarchesini)
Attachment #8863743 - Flags: review?(amarchesini)
Comment on attachment 8863743 [details] [diff] [review]
Update mParentStatus synchronously when closing worker thread. r=baku

Review of attachment 8863743 [details] [diff] [review]:
-----------------------------------------------------------------

lgtm
Attachment #8863743 - Flags: review?(amarchesini) → review+
Pushed by bkelly@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/a66ac62d1f0a
Update mParentStatus synchronously when closing worker thread. r=baku
https://hg.mozilla.org/mozilla-central/rev/a66ac62d1f0a
Status: ASSIGNED → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla55
You need to log in before you can comment on or make changes to this bug.