Closed
Bug 1318226
Opened 8 years ago
Closed 8 years ago
Tail dispatcher breaks the order of events
Categories
(Core :: XPCOM, defect)
Core
XPCOM
Tracking
()
RESOLVED
FIXED
mozilla55
Tracking | Status | |
---|---|---|
firefox55 | --- | fixed |
People
(Reporter: jwwang, Assigned: jwwang)
References
Details
Attachments
(2 files)
Without tail dispatching:
TaskQueue* tq1;
TaskQueue* tq1;
tq1->Dispatch(R1);
tq2->Dispatch(R2); // R2 will call tq1->Dispatch(R3)
R1 is guaranteed to happen before R3.
With tail dispatching:
We can't guarantee R1 happens before R3.
http://searchfox.org/mozilla-central/rev/efcb1644e49c36445e75d89b434fdf4c84832c84/dom/media/MediaDecoder.cpp#304
This causes bug 1316145 where mCompositorUpdatedEvent dispatches a task to the TaskQueue of MediaDecoderReader after it is shut down.
Assignee | ||
Comment 1•8 years ago
|
||
Hi Bobby,
Do you have any suggestion how to fix the event disorder introduced by tail dispatching?
Flags: needinfo?(bobbyholley)
Assignee | ||
Comment 2•8 years ago
|
||
Add a gtest to verify the bug:
https://hg.mozilla.org/try/rev/0e0de8b04f5a
and it fails miserably on Try:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=bb0da3a6fc22aacb9eb114b5078362bfadfe246b
Comment hidden (mozreview-request) |
Comment 4•8 years ago
|
||
Hm, nice find! That's a pretty tough one.
Fundamentally, the point of tail dispatching is to make asynchronous messages from one thread to another occur atomically. It's not possible to reconcile that with ordering assumptions where callers _depend_ on the order of interleaved events dispatched from a given thread.
We could at least detect this case though: i.e. make TaskDispatcher::AddTask MOZ_CRASH if the PerThreadTaskGroup it returns is not end-of-list. Callers could then either drain the tail dispatcher before continuing (ensuring the correct ordering), or pass some kind of opt-out token (indicating that they don't care about the ordering).
I don't have a sense of how common this pattern is, and therefore what the false-positive rate would be. Maybe worth a try though?
Flags: needinfo?(bobbyholley)
Assignee | ||
Comment 5•8 years ago
|
||
(In reply to Bobby Holley (:bholley) (busy with Stylo) from comment #4)
> Fundamentally, the point of tail dispatching is to make asynchronous
> messages from one thread to another occur atomically.
Why is the atomicity required? I think the purpose of tail dispatcher is to provide run-to-completion as we have in JS.
Assignee | ||
Comment 6•8 years ago
|
||
(In reply to Bobby Holley (:bholley) (busy with Stylo) from comment #4)
> I don't have a sense of how common this pattern is, and therefore what the
> false-positive rate would be. Maybe worth a try though?
We have:
a. MediaDecoder (main thread) -> MDSM::Shutdown (MDSM TaskQueue) -> MediaDecoderReader::Shutdown (Reader TaskQueue)
and also:
b. MediaDecoder (main thread) sends events to MediaDecoderReader (Reader TaskQueue) using MediaEventSource.
We need to guarantee events are sent to the Reader's TaskQueue before MediaDecoderReader::Shutdown shuts down its TaskQueue to avoid assertions caused by dispatch failures.
Comment 7•8 years ago
|
||
(In reply to JW Wang [:jwwang] [:jw_wang] from comment #5)
> (In reply to Bobby Holley (:bholley) (busy with Stylo) from comment #4)
> > Fundamentally, the point of tail dispatching is to make asynchronous
> > messages from one thread to another occur atomically.
> Why is the atomicity required? I think the purpose of tail dispatcher is to
> provide run-to-completion as we have in JS.
I don't think so. We introduced tail dispatching to support state mirroring. For state mirroring, we wanted to send a single atomic state update to other threads once a given task is complete. But to preserve ordering, that means that we also need to defer any dispatched tasks until after the state updates are applied. That's the reason we require tail dispatch for any thread that holds canonical mirrored state.
(In reply to JW Wang [:jwwang] [:jw_wang] from comment #6)
> (In reply to Bobby Holley (:bholley) (busy with Stylo) from comment #4)
> > I don't have a sense of how common this pattern is, and therefore what the
> > false-positive rate would be. Maybe worth a try though?
>
> We have:
> a. MediaDecoder (main thread) -> MDSM::Shutdown (MDSM TaskQueue) ->
> MediaDecoderReader::Shutdown (Reader TaskQueue)
>
> and also:
> b. MediaDecoder (main thread) sends events to MediaDecoderReader (Reader
> TaskQueue) using MediaEventSource.
>
> We need to guarantee events are sent to the Reader's TaskQueue before
> MediaDecoderReader::Shutdown shuts down its TaskQueue to avoid assertions
> caused by dispatch failures.
That's the failure mode from bug 1316145, right? That would be a non-false positive under my proposed solution from comment 4. I'm specifically wondering how many _other_ situations there are where we dispatch a task to A, then to B, and then to A again.
Assignee | ||
Comment 8•8 years ago
|
||
(In reply to Bobby Holley (:bholley) (busy with Stylo) from comment #7)
> I don't think so. We introduced tail dispatching to support state mirroring.
> For state mirroring, we wanted to send a single atomic state update to other
> threads once a given task is complete. But to preserve ordering, that means
> that we also need to defer any dispatched tasks until after the state
> updates are applied. That's the reason we require tail dispatch for any
> thread that holds canonical mirrored state.
There are 2 categories of tasks in discussion:
1. state change tasks (used by state mirroring): for which we must favor atomicity over ordering.
2. regular tasks: for which we might favor ordering over atomicity.
There are some proposed solutions to fix the ordering of regular tasks:
1. comment 4 (this breaks atomicity).
2. implicitly append a new PerThreadTaskGroup if the returned PerThreadTaskGroup is not the last element. (this breaks atomicity).
3. provide a function to drain the tail dispatcher for functions (ex: MediaDecoder::Shutdown()) that are sensitive to TaskQueue shutdown. (atomicity is broken when the caller agrees).
Comment 9•8 years ago
|
||
Sorry I've been slow to respond here - I was meaning to get back to this before the holidays but didn't. I'll try to pick this discussion up again once I'm back. :-)
Assignee | ||
Comment 10•8 years ago
|
||
No worries! Please take your time and enjoy the holidays. :)
Assignee | ||
Comment 11•8 years ago
|
||
CC Bevis who is making changes to AbstractThread and TaskQueue.
Comment 12•8 years ago
|
||
(In reply to JW Wang [:jwwang] [:jw_wang] from comment #8)
> (In reply to Bobby Holley (:bholley) (busy with Stylo) from comment #7)
> > I don't think so. We introduced tail dispatching to support state mirroring.
> > For state mirroring, we wanted to send a single atomic state update to other
> > threads once a given task is complete. But to preserve ordering, that means
> > that we also need to defer any dispatched tasks until after the state
> > updates are applied. That's the reason we require tail dispatch for any
> > thread that holds canonical mirrored state.
>
> There are 2 categories of tasks in discussion:
> 1. state change tasks (used by state mirroring): for which we must favor
> atomicity over ordering.
> 2. regular tasks: for which we might favor ordering over atomicity.
The key point here is that the atomicity of regular tasks exists _because_ of ordering - specifically to provide a consistent ordering between state changes and tasks that might observe them.
>
> There are some proposed solutions to fix the ordering of regular tasks:
> 1. comment 4 (this breaks atomicity).
> 2. implicitly append a new PerThreadTaskGroup if the returned
> PerThreadTaskGroup is not the last element. (this breaks atomicity).
> 3. provide a function to drain the tail dispatcher for functions (ex:
> MediaDecoder::Shutdown()) that are sensitive to TaskQueue shutdown.
> (atomicity is broken when the caller agrees).
Can you explain the difference between 1 and 3? They sound the same to me (and I still think that we should pursue the solution in comment 4).
Assignee | ||
Comment 13•8 years ago
|
||
(In reply to Bobby Holley (PTO) (busy with Stylo) from comment #12)
> Can you explain the difference between 1 and 3? They sound the same to me
> (and I still think that we should pursue the solution in comment 4).
No difference. 1 includes 3 after my second thought.
Assignee | ||
Comment 14•8 years ago
|
||
(In reply to Bobby Holley (PTO) (busy with Stylo) from comment #12)
> The key point here is that the atomicity of regular tasks exists _because_
> of ordering - specifically to provide a consistent ordering between state
> changes and tasks that might observe them.
I don't quite get it. Can you explain how ordering implies atomicity for regular tasks?
Below is my understanding about the atomicity of regular tasks:
Say we want to dispatch 3 regular tasks to thread1. By atomicity I mean to have a task dispatched to thread1 and the Run() function will call task1->Run(), task2->Run(), and task3->Run() in sequence. It is impossible for other tasks to run in between these 3 tasks.
By breaking atomicity of regular tasks, I mean we might have taskA dispatched to call task1->Run() and taskB dispatched to call task2->Run() and task3->Run() in sequence. It is possible for other tasks to run in between task1 and task2.
Comment 15•8 years ago
|
||
(In reply to JW Wang [:jwwang] [:jw_wang] from comment #14)
> (In reply to Bobby Holley (PTO) (busy with Stylo) from comment #12)
> > The key point here is that the atomicity of regular tasks exists _because_
> > of ordering - specifically to provide a consistent ordering between state
> > changes and tasks that might observe them.
>
> I don't quite get it. Can you explain how ordering implies atomicity for
> regular tasks?
I'm probably just being a bit sloppy with my terminology. I was specifically talking about ordering of the tasks relative to state changes (the regular tasks need to come after all the state changes, because that's the only well-defined ordering). This requires that the regular tasks be tail-dispatched, but I realize now that they don't necessarily need to be atomic (see below).
> Below is my understanding about the atomicity of regular tasks:
>
> Say we want to dispatch 3 regular tasks to thread1. By atomicity I mean to
> have a task dispatched to thread1 and the Run() function will call
> task1->Run(), task2->Run(), and task3->Run() in sequence. It is impossible
> for other tasks to run in between these 3 tasks.
>
> By breaking atomicity of regular tasks, I mean we might have taskA
> dispatched to call task1->Run() and taskB dispatched to call task2->Run()
> and task3->Run() in sequence. It is possible for other tasks to run in
> between task1 and task2.
These definitions make sense to me.
Thinking about this some more, I think you're right that the regular tasks don't need to be atomic, the just need to be delayed. I _think_ we can implicitly append additional PerThreadTaskGroups for regular tasks, _assuming_ that we make sure to put all state updates (including ones that occur after additional PerThreadTaskGroups are appended) continue to live in the first PerThreadTaskGroup.
Comment hidden (mozreview-request) |
Comment hidden (mozreview-request) |
Assignee | ||
Updated•8 years ago
|
Attachment #8845206 -
Flags: review?(bobbyholley)
Attachment #8812640 -
Flags: review?(bobbyholley)
Comment 18•8 years ago
|
||
mozreview-review |
Comment on attachment 8812640 [details]
Bug 1318226. P2 - add gtest TestTaskQueue to test the order of regular tasks.
https://reviewboard.mozilla.org/r/94284/#review126040
Sorry for the delay on this one.
Attachment #8812640 -
Flags: review?(bobbyholley) → review+
Comment 19•8 years ago
|
||
mozreview-review |
Comment on attachment 8845206 [details]
Bug 1318226. P1 - preserve the order of regular tasks.
https://reviewboard.mozilla.org/r/118400/#review126042
::: xpcom/threads/TaskDispatcher.h:127
(Diff revision 1)
> - PerThreadTaskGroup& group = EnsureTaskGroup(aThread);
> + // To preserve the event order, we need to append a new group if the last
> + // group is not targeted for |aThread|.
Probably worth explaining the issue here in a bit more detail, and/or referencing the discussion in the bug and/or the testcase that demonstrates this issue.
Attachment #8845206 -
Flags: review?(bobbyholley) → review+
Comment hidden (mozreview-request) |
Comment hidden (mozreview-request) |
Comment 23•8 years ago
|
||
Pushed by jwwang@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/cd225b588582
P1 - preserve the order of regular tasks. r=bholley
https://hg.mozilla.org/integration/autoland/rev/1bfd5d58b3b1
P2 - add gtest TestTaskQueue to test the order of regular tasks. r=bholley
Comment 24•8 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/cd225b588582
https://hg.mozilla.org/mozilla-central/rev/1bfd5d58b3b1
Status: NEW → RESOLVED
Closed: 8 years ago
status-firefox55:
--- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla55
You need to log in
before you can comment on or make changes to this bug.
Description
•