Closed Bug 1300659 Opened 8 years ago Closed 8 years ago

prevent setTimeout/setInterval from flooding the main thread event queue

Tracking

()

Status:

RESOLVED FIXED

Milestone:

mozilla52

Tracking Flags:

Tracking

Status

firefox52

---

fixed

People

(Reporter: bkelly, Assigned: bkelly)

References

(Depends on 1 open bug, Blocks 2 open bugs, Regressed 1 open bug)

Details

Attachments

(7 files, 35 obsolete files)

P1 Expose a main thread TaskQueue on nsGlobalWindow. r=bz 8 years ago Ben Kelly [:bkelly, not reviewing] 9.13 KB, patch		Details \| Diff \| Splinter Review
P2 Make sure window TaskQueues are drained before allowing browser to shutdown. r=bz 8 years ago Ben Kelly [:bkelly, not reviewing] 2.45 KB, patch		Details \| Diff \| Splinter Review
P3 Don't create new window MainThreadTaskQueue objects after xpcom-shutdown. r=bz 8 years ago Ben Kelly [:bkelly, not reviewing] 3.59 KB, patch		Details \| Diff \| Splinter Review
P4 Make setTimeout() and setInterval() use the MainThreadTaskQueue. r=bz 8 years ago Ben Kelly [:bkelly, not reviewing] 2.60 KB, patch		Details \| Diff \| Splinter Review
P1 Add a TaskQueue wrapper that automatically shuts down when appropriate. r=froydnj 8 years ago Ben Kelly [:bkelly, not reviewing] 8.62 KB, patch		Details \| Diff \| Splinter Review
P2 Expose a main thread TaskQueue on nsGlobalWindow. r=bz 8 years ago Ben Kelly [:bkelly, not reviewing] 6.33 KB, patch		Details \| Diff \| Splinter Review
P3 Make setTimeout() and setInterval() use the MainThreadTaskQueue. r=bz 8 years ago Ben Kelly [:bkelly, not reviewing] 5.34 KB, patch		Details \| Diff \| Splinter Review
P3 Make setTimeout() and setInterval() use the MainThreadTaskQueue. r=bz 8 years ago Ben Kelly [:bkelly, not reviewing] 6.23 KB, patch		Details \| Diff \| Splinter Review
P1 Add a TaskQueue wrapper that automatically shuts down when appropriate. r=froydnj 8 years ago Ben Kelly [:bkelly, not reviewing] 10.75 KB, patch	froydnj : feedback+	Details \| Diff \| Splinter Review
P2 Expose a main thread TaskQueue on nsGlobalWindow. r=bz 8 years ago Ben Kelly [:bkelly, not reviewing] 7.10 KB, patch	smaug : review+	Details \| Diff \| Splinter Review
P3 Make setTimeout() and setInterval() use the MainThreadTaskQueue. r=bz 8 years ago Ben Kelly [:bkelly, not reviewing] 8.00 KB, patch		Details \| Diff \| Splinter Review
P4 Attach the worker TaskQueue to the window TaskQueue if present. r=baku 8 years ago Ben Kelly [:bkelly, not reviewing] 1.64 KB, patch	baku : review+	Details \| Diff \| Splinter Review
P5 Suspend a window if its main thread TaskQueue falls to far behind. r=smaug 8 years ago Ben Kelly [:bkelly, not reviewing] 3.90 KB, patch		Details \| Diff \| Splinter Review
CPU/Mem of setTimeout bomb with patches P1 to P4 (no back pressure) 8 years ago Ben Kelly [:bkelly, not reviewing] 19.54 KB, image/png		Details
CPU/Mem of setTimeout bomb with patches P1 to P5 (with back pressure) 8 years ago Ben Kelly [:bkelly, not reviewing] 17.43 KB, image/png		Details
P1 Add a TaskQueue wrapper that automatically shuts down when appropriate. r=froydnj 8 years ago Ben Kelly [:bkelly, not reviewing] 11.67 KB, patch		Details \| Diff \| Splinter Review
P1 Add a TaskQueue wrapper that automatically shuts down when appropriate. r=froydnj 8 years ago Ben Kelly [:bkelly, not reviewing] 11.73 KB, patch	froydnj : review+	Details \| Diff \| Splinter Review
P2 Expose a main thread TaskQueue on nsGlobalWindow. r=smaug 8 years ago Ben Kelly [:bkelly, not reviewing] 6.93 KB, patch	bkelly : review+	Details \| Diff \| Splinter Review
P3 Make setTimeout() and setInterval() use the MainThreadTaskQueue. r=smaug 8 years ago Ben Kelly [:bkelly, not reviewing] 7.88 KB, patch		Details \| Diff \| Splinter Review
P4 Attach the worker TaskQueue to the window TaskQueue if present. r=baku 8 years ago Ben Kelly [:bkelly, not reviewing] 1.67 KB, patch	bkelly : review+	Details \| Diff \| Splinter Review
P5 Suspend a window if its main thread TaskQueue falls to far behind. r=smaug 8 years ago Ben Kelly [:bkelly, not reviewing] 3.71 KB, patch		Details \| Diff \| Splinter Review
P2 Expose a TaskQueue on that TabGroup. r=smaug 8 years ago Ben Kelly [:bkelly, not reviewing] 5.52 KB, patch		Details \| Diff \| Splinter Review
P3 Make setTimeout() and setInterval() use the TabGroup TaskQueue. r=smaug 8 years ago Ben Kelly [:bkelly, not reviewing] 7.88 KB, patch		Details \| Diff \| Splinter Review
P4 Attach the worker TaskQueue to the window TaskQueue if present. r=baku 8 years ago Ben Kelly [:bkelly, not reviewing] 2.44 KB, patch	bkelly : review+	Details \| Diff \| Splinter Review
P5 Suspend a window if its main thread TaskQueue falls to far behind. r=smaug 8 years ago Ben Kelly [:bkelly, not reviewing] 3.50 KB, patch		Details \| Diff \| Splinter Review
P2 Expose a TaskQueue on that TabGroup. r=smaug 8 years ago Ben Kelly [:bkelly, not reviewing] 8.43 KB, patch		Details \| Diff \| Splinter Review
P3 Make setTimeout() and setInterval() use the TabGroup TaskQueue. r=smaug 8 years ago Ben Kelly [:bkelly, not reviewing] 7.84 KB, patch		Details \| Diff \| Splinter Review
P4 Attach the worker TaskQueue to the window TaskQueue if present. r=baku 8 years ago Ben Kelly [:bkelly, not reviewing] 2.32 KB, patch	bkelly : review+	Details \| Diff \| Splinter Review
P1 Add the ThrottledEventQueue class. r=froydnj 8 years ago Ben Kelly [:bkelly, not reviewing] 18.06 KB, patch	froydnj : review+	Details \| Diff \| Splinter Review
P2 Expose a ThrottledEventQueue on TabGroup and nsPIDOMWindow. r=smaug 8 years ago Ben Kelly [:bkelly, not reviewing] 7.69 KB, patch	smaug : review-	Details \| Diff \| Splinter Review
P3 Make setTimeout() and setInterval() use the TabGroup ThrottledEventQueue. r=smaug 8 years ago Ben Kelly [:bkelly, not reviewing] 10.71 KB, patch	smaug : review+	Details \| Diff \| Splinter Review
P4 Use ThrottledEventQueue in workers. r=baku 8 years ago Ben Kelly [:bkelly, not reviewing] 7.62 KB, patch	baku : review+	Details \| Diff \| Splinter Review
P5 Suspend a window if its main thread TaskQueue falls to far behind. r=smaug 8 years ago Ben Kelly [:bkelly, not reviewing] 3.72 KB, patch	smaug : review+	Details \| Diff \| Splinter Review
P4 Use ThrottledEventQueue in workers. r=baku 8 years ago Ben Kelly [:bkelly, not reviewing] 8.09 KB, patch	bkelly : review+	Details \| Diff \| Splinter Review
P2 Expose a ThrottledEventQueue on TabGroup and nsPIDOMWindow. r=smaug 8 years ago Ben Kelly [:bkelly, not reviewing] 8.69 KB, patch		Details \| Diff \| Splinter Review
P2 Expose a ThrottledEventQueue on TabGroup and nsPIDOMWindow. r=smaug 8 years ago Ben Kelly [:bkelly, not reviewing] 8.66 KB, patch	bkelly : review+	Details \| Diff \| Splinter Review
P4 Use ThrottledEventQueue in workers. r=baku 8 years ago Ben Kelly [:bkelly, not reviewing] 7.74 KB, patch	bkelly : review+	Details \| Diff \| Splinter Review
P1 Add the ThrottledEventQueue class. r=froydnj 8 years ago Ben Kelly [:bkelly, not reviewing] 18.72 KB, patch	bkelly : review+	Details \| Diff \| Splinter Review
P2 Expose a ThrottledEventQueue on TabGroup and nsPIDOMWindow. r=smaug 8 years ago Ben Kelly [:bkelly, not reviewing] 8.66 KB, patch	bkelly : review+	Details \| Diff \| Splinter Review
P3 Make setTimeout() and setInterval() use the TabGroup ThrottledEventQueue. r=smaug 8 years ago Ben Kelly [:bkelly, not reviewing] 9.38 KB, patch	bkelly : review+	Details \| Diff \| Splinter Review
P4 Use ThrottledEventQueue in workers. r=baku 8 years ago Ben Kelly [:bkelly, not reviewing] 7.74 KB, patch	bkelly : review+	Details \| Diff \| Splinter Review
P5 Suspend a window if its main thread TaskQueue falls to far behind. r=smaug 8 years ago Ben Kelly [:bkelly, not reviewing] 4.02 KB, patch	bkelly : review+	Details \| Diff \| Splinter Review

Ben Kelly [:bkelly, not reviewing]

Assignee

Description

•

8 years ago

Bug 1300118 adapts the xpcom TaskQueue class a bit to be used to prevent main thread runnable flooding.  This bug will use TaskQueue in nsGlobalWindow to throttle runnables from setTimeout() and setInterval().

For example, this bug will keep this script from killing main thread responsiveness:

  function boom() { setTimeout(boom, 0); setTimeout(boom, 0); }
  boom();

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 1

•

8 years ago

Attached patch P1 Expose a main thread TaskQueue on nsGlobalWindow. r=bz (obsolete) — Details — Splinter Review

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 2

•

8 years ago

Attached patch P2 Make sure window TaskQueues are drained before allowing browser to shutdown. r=bz (obsolete) — Details — Splinter Review

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 3

•

8 years ago

Attached patch P3 Don't create new window MainThreadTaskQueue objects after xpcom-shutdown. r=bz (obsolete) — Details — Splinter Review

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 4

•

8 years ago

Attached patch P4 Make setTimeout() and setInterval() use the MainThreadTaskQueue. r=bz (obsolete) — Details — Splinter Review

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 5

•

8 years ago

I believe there is still a fair bit of orange with these patches.

Also, I'd like to add code to freeze timers if we detect that the MainThreadTaskQueue is falling behind.  In the worker patches I block the worker thread if I see a backlog > 5000 runnables.  Here I plan to use the existing "frozen" timers mechanism.  I'll detect when to resume timers by inserting a runnable in the task queue so I know when the backlog has flushed.

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 6

•

8 years ago

https://treeherder.mozilla.org/#/jobs?repo=try&revision=aa9487c16e92

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 7

•

8 years ago

https://treeherder.mozilla.org/#/jobs?repo=try&revision=3fbcbf3ceec0

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 8

•

8 years ago

Lets just limit this to linux x64 until its more green:

https://treeherder.mozilla.org/#/jobs?repo=try&revision=df894db43221

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 9

•

8 years ago

Trying a different approach to managing the shutdown of the task queue.  Also, this shares a task queue between connected parent/child windows (but not window.opener yet).

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 10

•

8 years ago

https://treeherder.mozilla.org/#/jobs?repo=try&revision=9285ec2fcdb1

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 11

•

8 years ago

https://treeherder.mozilla.org/#/jobs?repo=try&revision=9c5fe7f14d23

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 12

•

8 years ago

Attached patch P1 Add a TaskQueue wrapper that automatically shuts down when appropriate. r=froydnj (obsolete) — Details — Splinter Review

My new shutdown approach is to make a TaskQueue wrapper that just DTRT.  It internally handles shutdown when the TaskQueue is no longer referenced or when the browser is shutting down.  For browser shutdown new Dispatch calls are issues to the original base target.

Attachment #8788302 - Attachment is obsolete: true

Attachment #8788303 - Attachment is obsolete: true

Attachment #8788304 - Attachment is obsolete: true

Attachment #8788305 - Attachment is obsolete: true

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 13

•

8 years ago

Attached patch P2 Expose a main thread TaskQueue on nsGlobalWindow. r=bz (obsolete) — Details — Splinter Review

This exposes the auto-closing TaskQueue on the outer window.  A TaskQueue is shared with all child windows and between openers.

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 14

•

8 years ago

Attached patch P3 Make setTimeout() and setInterval() use the MainThreadTaskQueue. r=bz (obsolete) — Details — Splinter Review

Updated patch to make setTimeout() and setInterval() use the task queue.

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 15

•

8 years ago

These patches pass all our content tests.  There are maybe 5 browser-chrome tests that fail currently.  Trying to run them locally on an unmodified tree tends to trigger the same failures, though.  I think they are just racy.  We would need to fix them before proceeding with this, though.

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 16

•

8 years ago

My inability to run these tests actually had more to do with my xvfb args.  Apparently the tests are designed to run on 1600x1200.  I was using 640x480.

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 17

•

8 years ago

Attached patch P3 Make setTimeout() and setInterval() use the MainThreadTaskQueue. r=bz (obsolete) — Details — Splinter Review

It seems there is some timeout message at the end of some browser chrome tests that races with the window closing.  If the window closes before the timer runs then the test hangs.  My changes made this happen more frequently because I no longer let more than one timer run synchronously in the same runnable.

This updates the patch to allow up to 10 timers to execute in the same runnable.  It lets me pass at least one of these browser chrome tests locally:

https://treeherder.mozilla.org/#/jobs?repo=try&revision=a668bbeee477

I'll see if I can dig out which timer specifically was causing the problem.

Attachment #8790265 - Attachment is obsolete: true

Ben Kelly [:bkelly, not reviewing]

Assignee

Updated

•

8 years ago

Blocks: QuantumDOM

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 18

•

8 years ago

It seems devtools wants to check to see that certain UX components have been selected and that these selections can happen via timeouts.  So throttling timeouts makes these tests fail due to missing their carefully tuned races.

As a way around this, lets try going back to my earlier idea of exempting chrome windows from throttling.  This would make both the browser chrome harness and devtools work the same as they do today while throttling content script.

https://treeherder.mozilla.org/#/jobs?repo=try&revision=9e77137a8cba

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 19

•

8 years ago

Exempt chrome windows from the "single timeout per runnable" limitation like I intended:

https://treeherder.mozilla.org/#/jobs?repo=try&revision=24441e5e1ee9

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 20

•

8 years ago

Attached patch P1 Add a TaskQueue wrapper that automatically shuts down when appropriate. r=froydnj (obsolete) — Details — Splinter Review

One of the difficulties with using a TaskQueue for main thread things like windows is being able to shutdown the TaskQueue cleanly.  In a worker thread we can AwaitShutdownAndIdle(), but we can't do that on the main thread.

This helper class tries to solve the problem.  It wraps the TaskQueue as an nsIEventTarget and provides a self-closing semantic.  If the target goes out of scope then we shutdown the TaskQueue asynchronously.  If xpcom-shutdown comes, then we shutdown the TaskQueue and spin the event loop until its drained.  This lets us largely pass the nsIEventTarget wrapper around trusting that it will DTRT when we need to stop using it.

I don't love the name or that this is separate from TaskQueue itself, but I think its a good step towards figuring out what we need here.  I didn't want to just put this in DOM since I already have two different places I want to use.  I'm just using it in the window case first.

Once things stabilize I figure we could revisit the name and possibly consider rolling it into TaskQueue itself.  That is much more work, though, since we need to negotiate with the other consumers of TaskQueue.  I'd like to prove this out first.

Also note, I would have liked to implement TaskQueue as an interface here, but there is no TaskQueue interface yet.  Another enhancement we could make would be adding an nsITaskQueue or similar abstract class.  We probably can't do anything like that until Bobby comes back from PTO, though.

Anyway, I'd love to proceed for now and then incrementally improve this class if possible.  Thanks!

Attachment #8790263 - Attachment is obsolete: true

Attachment #8790729 - Flags: review?(nfroyd)

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 21

•

8 years ago

https://treeherder.mozilla.org/#/jobs?repo=try&revision=bf80aa4a1391

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 22

•

8 years ago

Attached patch P2 Expose a main thread TaskQueue on nsGlobalWindow. r=bz (obsolete) — Details — Splinter Review

Attachment #8790264 - Attachment is obsolete: true

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 23

•

8 years ago

Attached patch P3 Make setTimeout() and setInterval() use the MainThreadTaskQueue. r=bz (obsolete) — Details — Splinter Review

https://treeherder.mozilla.org/#/jobs?repo=try&revision=21f9f0db162f

Attachment #8790444 - Attachment is obsolete: true

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 24

•

8 years ago

Attached patch P4 Attach the worker TaskQueue to the window TaskQueue if present. r=baku (obsolete) — Details — Splinter Review

Ben Kelly [:bkelly, not reviewing]

Assignee

Updated

•

8 years ago

Attachment #8790952 - Flags: review?(amarchesini)

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 25

•

8 years ago

Try is green.  I intend to flag for review once I can coordinate the review request with bz.

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 26

•

8 years ago

Comment on attachment 8790932 [details] [diff] [review]
P2 Expose a main thread TaskQueue on nsGlobalWindow. r=bz

Here is my review comment so I don't lose it:

This patch creates a single TaskQueue for each constellation of windows.  The TaskQueue is kept on the outer window.  The existence of a TaskQueue is not guaranteed, so we provide a GetMainThreadTaskQueue() getter that can return nullptr.

The task queue can be nullptr under these circumstances:

1) Its a chrome window.  We avoid throttling chrome windows partially in order to prioritize chrome work and partially to avoid the resulting browser-chrome/devtools orange it produces.
2) The window has not yet had a docshell set.
3) The window in a new constellation has a docshell set during xpcom shutdown.

Note that we do not explicitly clear the TaskQueue when the docshell is cleared.  Its unclear if all runnables have executed by that point so we maintain the TaskQueue to preserve the ordering of runnables trailing in after de-activation.

The shutdown of the queue is handled auto-magically by the SelfCloseTaskQueueTarget wrapper class.  See P1 for the gory details.  Basically it shuts down when the target receives its last Release() or when xpcom shutdown starts.  All runnables are flushed to the main thread before the task queue is destroyed in either case.

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 27

•

8 years ago

Comment on attachment 8790933 [details] [diff] [review]
P3 Make setTimeout() and setInterval() use the MainThreadTaskQueue. r=bz

Review of attachment 8790933 [details] [diff] [review]:
-----------------------------------------------------------------

Here is my review comment:

This patch does two things:

1) It makes the nsTimeout nsITimers fire their runnables at the window's main thread TaskQueue if its present.
2) It limits the number of timeouts we will run in a single runnable before we yield back the main thread.

Both of these are important.  Without (1) we can swamp the main thread event loop with runnables.  Without (2) we will end up blocking the main thread in a single runnable for extremely long periods when there are excessive timers in play.

We exempt chrome windows from (2) just like we exempt chrome windows from the main thread TaskQueue throttling mechanism.

Note that the limit on timeouts per runnable was a bit tricky because we can end up re-ordering nsITimer callbacks relative to our window nsTimeout list somehow.  I believe this can happen during suspend and resume.  We handle this case by always ensuring we fire at least the timer that the callback was created before.  This avoids the situation where we skip a timeout, but then the next timer runnable does not fire because the other timers were cancelled or cleared.

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 28

•

8 years ago

Comment on attachment 8790932 [details] [diff] [review]
P2 Expose a main thread TaskQueue on nsGlobalWindow. r=bz

Olli, Boris won't be able to review this for a while.  Do you mind looking at the patches?

The description for this patch is in comment 26.

Attachment #8790932 - Flags: review?(bugs)

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 29

•

8 years ago

Comment on attachment 8790933 [details] [diff] [review]
P3 Make setTimeout() and setInterval() use the MainThreadTaskQueue. r=bz

Description for this patch is in comment 27.

Attachment #8790933 - Flags: review?(bugs)

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 30

•

8 years ago

Attached patch P5 Suspend a window if its main thread TaskQueue falls to far behind. r=smaug (obsolete) — Details — Splinter Review

Suspend the window when we detect that the main thread TaskQueue is backing up.  I don't have an automated test that triggers this yet, but I verified locally that it works as expected.

I also took a couple screenshots of the process cpu and memory to show the benefits of applying this kind of backpressure.

Attachment #8791052 - Flags: review?(bugs)

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 31

•

8 years ago

Attached image CPU/Mem of setTimeout bomb with patches P1 to P4 (no back pressure) — Details

This shows that memory just sky rockets with runnable throttling, but no back pressure to the page.

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 32

•

8 years ago

Attached image CPU/Mem of setTimeout bomb with patches P1 to P5 (with back pressure) — Details

This shows the CPU and memory of the process when we add in the P5 patch applying back pressure.  Suspending the window when the TaskQueue is backed up has a big impact.

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 33

•

8 years ago

New try with P4 and P5.  Also rebased on latest m-c:

https://treeherder.mozilla.org/#/jobs?repo=try&revision=60aa651bb72f

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 34

•

8 years ago

I'm starting to think maybe we should just rename:

  SelfClosingTaskQueueTarget

To something more like:

  TaskQueueTarget

And just use it everywhere; both workers and windows.  I don't really want to expose AbstractThread or TaskQueue APIs further in the code than I have to.

Nathan Froyd [:froydnj]

Comment 35

•

8 years ago

Comment on attachment 8790729 [details] [diff] [review]
P1 Add a TaskQueue wrapper that automatically shuts down when appropriate. r=froydnj

Review of attachment 8790729 [details] [diff] [review]:
-----------------------------------------------------------------

I think this is reasonable.  I have a few questions, though.

::: xpcom/threads/SelfClosingTaskQueueTarget.cpp
@@ +96,5 @@
> +      new TaskQueue(target.forget(), false /* tail dispatch */);
> +
> +    RefPtr<Inner> ref = new Inner(aBaseTarget, taskQueue);
> +
> +    nsresult rv = obs->AddObserver(ref, kShutdownTopic, false /* weak ref */);

According to nsIObserverService.idl, passing false here means that we're holding a strong ref, so the comment should be fixed.

@@ +114,5 @@
> +    MOZ_ASSERT(!strcmp(aTopic, kShutdownTopic));
> +
> +    MaybeStartShutdown();
> +
> +    while (!mTaskQueue->IsEmpty()) {

How are we sure that all threads are done dispatching to the task queue at this point?  (i.e. how are we guaranteed that we're not going to race deciding that this is empty, and then somebody is going to dispatch an event to it behind our back?)

@@ +124,5 @@
> +
> +  void
> +  MaybeStartShutdown()
> +  {
> +    // Any thread

Do you think it's worth explaining here why two threads can't race to this function?  (One thread would have to be destroying the outer object, and the main thread would have to be going through xpcom-shutdown.  But if the main thread is calling this function at xpcom-shutdown, then we can't be destroying the outer object, because the inner object is still holding a strong ref to it.  Right?)

@@ +169,5 @@
> +  nsresult
> +  DispatchFromScript(nsIRunnable* aEvent, uint32_t aFlags)
> +  {
> +    // Any thread
> +    nsIEventTarget* target = mShutdownStarted ? mBaseTarget : mTaskQueueTarget;

Factor this computation out into a helper method, because you use it in a number of places?

Oh, you can't use NS_FORWARD_NSIEVENTTARGET because it doesn't work correctly with move-only classes.  Boo.

Attachment #8790729 - Flags: review?(nfroyd) → feedback+

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 36

•

8 years ago

(In reply to Nathan Froyd [:froydnj] from comment #35)
> > +    nsresult rv = obs->AddObserver(ref, kShutdownTopic, false /* weak ref */);
> 
> According to nsIObserverService.idl, passing false here means that we're
> holding a strong ref, so the comment should be fixed.

Well, I was trying to indicate the argument was aWeakRef boolean.  Clearly it would be nice if this took an enum. :-\

> > +    MaybeStartShutdown();
> > +
> > +    while (!mTaskQueue->IsEmpty()) {
> 
> How are we sure that all threads are done dispatching to the task queue at
> this point?  (i.e. how are we guaranteed that we're not going to race
> deciding that this is empty, and then somebody is going to dispatch an event
> to it behind our back?)

Because MaybeStartShutdown() sets the Atomic<bool> mShutdownStarted.  After that we force all Dispatch() routines to the base target instead of the task queue.  I can add a comment.

> > +  void
> > +  MaybeStartShutdown()
> > +  {
> > +    // Any thread
> 
> Do you think it's worth explaining here why two threads can't race to this
> function?  (One thread would have to be destroying the outer object, and the
> main thread would have to be going through xpcom-shutdown.  But if the main
> thread is calling this function at xpcom-shutdown, then we can't be
> destroying the outer object, because the inner object is still holding a
> strong ref to it.  Right?)

Well, the first thing this method does is check-and-set the Atomic<bool> mShutdownStarted.  That was my anti-race mitigation.  Maybe I need a mutex around mShutdownStarted instead.  It does seem two threads could get passed the conditional before either one sets the atomic.

I'll add the mutex.

> > +    nsIEventTarget* target = mShutdownStarted ? mBaseTarget : mTaskQueueTarget;
> 
> Factor this computation out into a helper method, because you use it in a
> number of places?

Will do.  That will make it easier to use the mutex as well.

Nathan Froyd [:froydnj]

Comment 37

•

8 years ago

(In reply to Ben Kelly [:bkelly] from comment #36)
> (In reply to Nathan Froyd [:froydnj] from comment #35)
> > > +    nsresult rv = obs->AddObserver(ref, kShutdownTopic, false /* weak ref */);
> > 
> > According to nsIObserverService.idl, passing false here means that we're
> > holding a strong ref, so the comment should be fixed.
> 
> Well, I was trying to indicate the argument was aWeakRef boolean.  Clearly
> it would be nice if this took an enum. :-\

Story of my life.

> > > +    MaybeStartShutdown();
> > > +
> > > +    while (!mTaskQueue->IsEmpty()) {
> > 
> > How are we sure that all threads are done dispatching to the task queue at
> > this point?  (i.e. how are we guaranteed that we're not going to race
> > deciding that this is empty, and then somebody is going to dispatch an event
> > to it behind our back?)
> 
> Because MaybeStartShutdown() sets the Atomic<bool> mShutdownStarted.  After
> that we force all Dispatch() routines to the base target instead of the task
> queue.  I can add a comment.

Ah, that's a good explanation.  Please do add the comment.

> > > +  void
> > > +  MaybeStartShutdown()
> > > +  {
> > > +    // Any thread
> > 
> > Do you think it's worth explaining here why two threads can't race to this
> > function?  (One thread would have to be destroying the outer object, and the
> > main thread would have to be going through xpcom-shutdown.  But if the main
> > thread is calling this function at xpcom-shutdown, then we can't be
> > destroying the outer object, because the inner object is still holding a
> > strong ref to it.  Right?)
> 
> Well, the first thing this method does is check-and-set the Atomic<bool>
> mShutdownStarted.  That was my anti-race mitigation.  Maybe I need a mutex
> around mShutdownStarted instead.  It does seem two threads could get passed
> the conditional before either one sets the atomic.
> 
> I'll add the mutex.

If you were going to check-and-set, what you want is Atomic::compareExchange:

http://dxr.mozilla.org/mozilla-central/source/mfbt/Atomics.h#575

Using that should ensure only one thread gets into this method, and then you don't have to bother with the mutex.

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 38

•

8 years ago

(In reply to Nathan Froyd [:froydnj] from comment #37)
> If you were going to check-and-set, what you want is Atomic::compareExchange:
> 
> http://dxr.mozilla.org/mozilla-central/source/mfbt/Atomics.h#575

Perfect!  I forgot we had this.

Nathan Froyd [:froydnj]

Comment 39

•

8 years ago

(In reply to Ben Kelly [:bkelly] from comment #36)
> > > +    MaybeStartShutdown();
> > > +
> > > +    while (!mTaskQueue->IsEmpty()) {
> > 
> > How are we sure that all threads are done dispatching to the task queue at
> > this point?  (i.e. how are we guaranteed that we're not going to race
> > deciding that this is empty, and then somebody is going to dispatch an event
> > to it behind our back?)
> 
> Because MaybeStartShutdown() sets the Atomic<bool> mShutdownStarted.  After
> that we force all Dispatch() routines to the base target instead of the task
> queue.  I can add a comment.

Thinking about this some more, what if you have:

SomeThread: checks mShutdownStarted in Dispatch*, finds it false
SomeThread: preempted
MainThread: calls MaybeStartShutdown
MainThread: spins task queue event loop
MainThread: finishes loop
SomeThread: dispatches to the task queue

Just setting the boolean isn't enough; you have to ensure that all threads have somehow finished anything they might have been doing in Dispatch.

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 40

•

8 years ago

(In reply to Nathan Froyd [:froydnj] from comment #39)
> (In reply to Ben Kelly [:bkelly] from comment #36)
> > > > +    MaybeStartShutdown();
> > > > +
> > > > +    while (!mTaskQueue->IsEmpty()) {
> > > 
> > > How are we sure that all threads are done dispatching to the task queue at
> > > this point?  (i.e. how are we guaranteed that we're not going to race
> > > deciding that this is empty, and then somebody is going to dispatch an event
> > > to it behind our back?)
> > 
> > Because MaybeStartShutdown() sets the Atomic<bool> mShutdownStarted.  After
> > that we force all Dispatch() routines to the base target instead of the task
> > queue.  I can add a comment.
> 
> Thinking about this some more, what if you have:
> 
> SomeThread: checks mShutdownStarted in Dispatch*, finds it false
> SomeThread: preempted
> MainThread: calls MaybeStartShutdown
> MainThread: spins task queue event loop
> MainThread: finishes loop
> SomeThread: dispatches to the task queue
> 
> Just setting the boolean isn't enough; you have to ensure that all threads
> have somehow finished anything they might have been doing in Dispatch.

Well, the worst case here is that the TaskQueue dispatch returns an error.  This doesn't seem too unreasonable if a thread is trying to use this thing during xpcom shutdown from a non-main-thread.

If you'd like, I could check the Dispatch() call for failure and then retry with the base target if we have entered shutdown state.

  nsIEventTarget*
  InternalTarget() const
  {
    // This can be called by any thread.  Consult our Atomic<bool> shutdown
    // flag.  If we have not been shutdown then return the task queue.  Otherwise
    // return the base target.
    return mShutdownStarted ? mBaseTarget : mTaskQueueTarget;
  }

  nsresult
  Dispatch(already_AddRefed<nsIRunnable> aEvent, uint32_t aFlags)
  {
    // Any thread
    nsresult rv = InternalTarget()->Dispatch(Move(aEvent), aFlags);
    if (NS_FAILED(rv) && mShutdownStarted) {
      rv = InternalTarget()->Dispatch(Move(aEvent), aFlags)
    }
    return rv;
  }

Or I can just wrap all of Dispatch() in a mutex.

Flags: needinfo?(nfroyd)

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 41

•

8 years ago

Attached patch P1 Add a TaskQueue wrapper that automatically shuts down when appropriate. r=froydnj (obsolete) — Details — Splinter Review

Addressed review feedback.  I implemented the "retry if we fail dispatch and are shutting down" idea from the previous comment.

Attachment #8790729 - Attachment is obsolete: true

Flags: needinfo?(nfroyd)

Attachment #8791369 - Flags: review?(nfroyd)

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 42

•

8 years ago

Attached patch P1 Add a TaskQueue wrapper that automatically shuts down when appropriate. r=froydnj (obsolete) — Details — Splinter Review

Now with more weak ref bare boolean commenting.

Attachment #8791369 - Attachment is obsolete: true

Attachment #8791369 - Flags: review?(nfroyd)

Attachment #8791375 - Flags: review?(nfroyd)

Andrea Marchesini [:baku]

Comment 43

•

8 years ago

Comment on attachment 8790952 [details] [diff] [review]
P4 Attach the worker TaskQueue to the window TaskQueue if present. r=baku

Review of attachment 8790952 [details] [diff] [review]:
-----------------------------------------------------------------

::: dom/workers/WorkerPrivate.cpp
@@ +3957,4 @@
>    }
>  
>    // TODO: If we have a window, try to use its MainThreadTaskQueue as the
>    //       target for our sub-queue.

remove this TODO.

Attachment #8790952 - Flags: review?(amarchesini) → review+

Nathan Froyd [:froydnj]

Comment 44

•

8 years ago

Comment on attachment 8791375 [details] [diff] [review]
P1 Add a TaskQueue wrapper that automatically shuts down when appropriate. r=froydnj

Review of attachment 8791375 [details] [diff] [review]:
-----------------------------------------------------------------

r=me with the below two changes.  Please double-check my reasoning on the Dispatch comment.

::: xpcom/threads/SelfClosingTaskQueueTarget.cpp
@@ +14,5 @@
> +using mozilla::services::GetObserverService;
> +
> +namespace {
> +
> +static const char* kShutdownTopic = "xpcom-shutdown";

Please make this static const char kShutdownTopic[].

@@ +80,5 @@
> +  InternalTarget() const
> +  {
> +    // This can be called by any thread.  Consult our Atomic<bool> shutdown
> +    // flag.  If we have not been shutdown then return the task queue.  Otherwise
> +    // return the base target.

Can you reverse the order of the last two sentences to correspond to the order of the condition?

@@ +195,5 @@
> +    nsresult rv = InternalTarget()->Dispatch(Move(aEvent), aFlags);
> +    // Between selecting the target in InternalTarget() and calling
> +    // Dispatch() the TaskQueue might get shutdown.  If that happens,
> +    // then try again.  This time we will get the base target which
> +    // should succeed.

Is, uh, that really guaranteed?  Looking at TaskQueue::DispatchLocked, I see a lot of different failure conditions.  Though I guess we won't hit the AbortIfFlushing condition, and dispatch to the underlying event target really ought to be infallible.  So maybe we'll only fail if we're shutting down.

Attachment #8791375 - Flags: review?(nfroyd) → review+

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 45

•

8 years ago

> > +    nsresult rv = InternalTarget()->Dispatch(Move(aEvent), aFlags);
> > +    // Between selecting the target in InternalTarget() and calling
> > +    // Dispatch() the TaskQueue might get shutdown.  If that happens,
> > +    // then try again.  This time we will get the base target which
> > +    // should succeed.
> 
> Is, uh, that really guaranteed?  Looking at TaskQueue::DispatchLocked, I see
> a lot of different failure conditions.  Though I guess we won't hit the
> AbortIfFlushing condition, and dispatch to the underlying event target
> really ought to be infallible.  So maybe we'll only fail if we're shutting
> down.

Well, AbortIfFlushing is dead code. Nothing I could find sets mIsFlushing.  (Have I mentioned I don't love some of the stuff in AbstractThread/TaskQueue land?)

That just leaves shutdown and real dispatch failure.  I think propagating the real dispatch failure back is reasonable.  Double-attempting it should not have any impact other than a perf penalty during this rare error condition.

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 46

•

8 years ago

Comment on attachment 8791052 [details] [diff] [review]
P5 Suspend a window if its main thread TaskQueue falls to far behind. r=smaug

I think nsGlobalWindow::SuspendTimeouts() is probably buggy at the moment.  I'm going to write a separate bug to fix that first.

Attachment #8791052 - Flags: review?(bugs)

Ben Kelly [:bkelly, not reviewing]

Assignee

Updated

•

8 years ago

Depends on: 1303167

Olli Pettay [:smaug][bugs@pettay.fi]

Comment 47

•

8 years ago

Comment on attachment 8790932 [details] [diff] [review]
P2 Expose a main thread TaskQueue on nsGlobalWindow. r=bz


>+
>+  // Attempt to use the task queue already present in our constellation.
We're about to change the naming. You mean here 'tab group'.
(document group is then the similar-origin-group of related globals)

>+  // Maybe create a TaskQueue for dispatching runnables to the main thread
>+  // from this window.  This TaskQueue will throttle these runnables and
>+  // prevent the window content from significantly impacting main thread
>+  // responsiveness.  Certain situations will result in a nullptr task
>+  // queue even after calling this method.
>+  void
>+  MaybeCreateMainThreadTaskQueue();
Why the task queue is called MainThreadTaskQueue? That sounds like something bound to the main thread, but here we can have actually
several task queues, one per tab group.
Perhaps call it GetTabGroupTaskQueue? Though, I'm not quite happy with that either, since only some tasks are supposed to be queued here.
GetThrottlableTabGroupTaskQueue is quite horrible, but is closer to what the thing is about, right? I'd prefer something like that.

r+ if the name of the task queue is changed to something which hints what it is about.

Attachment #8790932 - Flags: review?(bugs) → review+

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 48

•

8 years ago

(In reply to Olli Pettay [:smaug] from comment #47)
> >+  void
> >+  MaybeCreateMainThreadTaskQueue();
> Why the task queue is called MainThreadTaskQueue? That sounds like something
> bound to the main thread, but here we can have actually
> several task queues, one per tab group.

Well, a TaskQueue can target any thread or nsIEventTarget.  So I was trying to indicate which thread the runnables for the TaskQueue would eventually run on.  While this may seem implicit here, its an important distinction for worker threads.  And I was trying to use uniform naming throughout.

I guess I was thinking the window would have a "task queue that goes to the main thread", and its somewhat abstracted which task queue that is.  The window picks the right one when its activated.  It may create its own or use the tab group's TaskQueue.  The caller of the code should not care, though.  It should just use the TaskQueue associated with the window regardless of the source.

Does any of that make sense?

Flags: needinfo?(bugs)

Olli Pettay [:smaug][bugs@pettay.fi]

Comment 49

•

8 years ago

Well, not really :) Window always lives on the main thread, so why would you ever get non-main-thread event queue. So, I would rather indicate in the name how it is supposed to be called.

Flags: needinfo?(bugs)

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 50

•

8 years ago

I guess if the name is the most controversial thing here I'll call it a win.

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 51

•

8 years ago

Comment on attachment 8790933 [details] [diff] [review]
P3 Make setTimeout() and setInterval() use the MainThreadTaskQueue. r=bz

Dropping review here until I fix bug 1303167.  Simplifying our timeout code should make this review easier.

Attachment #8790933 - Flags: review?(bugs)

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 52

•

8 years ago

Attached patch P2 Expose a main thread TaskQueue on nsGlobalWindow. r=smaug (obsolete) — Details — Splinter Review

Rebase.  I still need to address action items.

Attachment #8790932 - Attachment is obsolete: true

Attachment #8804279 - Flags: review+

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 53

•

8 years ago

Attached patch P3 Make setTimeout() and setInterval() use the MainThreadTaskQueue. r=smaug (obsolete) — Details — Splinter Review

Attachment #8790933 - Attachment is obsolete: true

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 54

•

8 years ago

Attached patch P4 Attach the worker TaskQueue to the window TaskQueue if present. r=baku (obsolete) — Details — Splinter Review

Rebase

Attachment #8790952 - Attachment is obsolete: true

Attachment #8804287 - Flags: review+

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 55

•

8 years ago

Attached patch P5 Suspend a window if its main thread TaskQueue falls to far behind. r=smaug (obsolete) — Details — Splinter Review

Update to use the new API from bug 1303167.

Attachment #8791052 - Attachment is obsolete: true

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 56

•

8 years ago

https://treeherder.mozilla.org/#/jobs?repo=try&revision=799f52ac4374f28326ee44cc77ecd7b8da5adf2a

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 57

•

8 years ago

Looks like a new test failure to fix in W(1):

TEST-UNEXPECTED-TIMEOUT | /media-source/mediasource-seek-beyond-duration.html | Test seeking beyond media duration. - Test timed out

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 58

•

8 years ago

Attached patch P2 Expose a TaskQueue on that TabGroup. r=smaug (obsolete) — Details — Splinter Review

Attachment #8804279 - Attachment is obsolete: true

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 59

•

8 years ago

Attached patch P3 Make setTimeout() and setInterval() use the TabGroup TaskQueue. r=smaug (obsolete) — Details — Splinter Review

Attachment #8804284 - Attachment is obsolete: true

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 60

•

8 years ago

Attached patch P4 Attach the worker TaskQueue to the window TaskQueue if present. r=baku (obsolete) — Details — Splinter Review

Attachment #8804287 - Attachment is obsolete: true

Attachment #8805589 - Flags: review+

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 61

•

8 years ago

Attached patch P5 Suspend a window if its main thread TaskQueue falls to far behind. r=smaug (obsolete) — Details — Splinter Review

Attachment #8804288 - Attachment is obsolete: true

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 62

•

8 years ago

Rewrote the patches slightly to keep the TaskQueue on the TabGroup.

https://treeherder.mozilla.org/#/jobs?repo=try&revision=6f42908cd89a7e4fe6b13aa33a44d4492a331284

Ben Kelly [:bkelly, not reviewing]

Assignee

•

8 years ago

Attached patch P4 Attach the worker TaskQueue to the window TaskQueue if present. r=baku (obsolete) — Details — Splinter Review

https://treeherder.mozilla.org/#/jobs?repo=try&revision=819353f929d26854ab9cce606227c927af10f460&selectedJob=30117035

Still have to figure out why

  dom/tests/mochitest/bugs/test_bug479143.html

Is hanging in non-e10s.  At least it reproduces locally.

Attachment #8805589 - Attachment is obsolete: true

Attachment #8805697 - Flags: review+

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 66

•

8 years ago

The issue with the failure in comment 65 is:

1) A setTimeout() fires running js code from within the TaskQueue context
2) The js code calls showModalDialog() which spins the event loop
3) The showModalDialog() uses a setTimeout() to close itself
4) The nested setTimeout() never gets to run because the TaskQueue is still stuck on the runnable from (1).

I think the cleanest way to solve this is to make the nsITimer notify to the TaskQueue as my patches do now, but then to issue another runnable to the main thread to execute js code.  This means we have at most 2 runnables in the main thread event queue at once instead of just 1 runnable.  I don't love it, but it should be ok.

I have a hacked up version which passes the test with this locally.  I need to clean it up and do another full try run, though.

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 67

•

8 years ago

Actually, we could probably use a microtask or "tail dispatch" instead of a normal runnable dispatch here.  That would avoid any additional delay in running the timer.

Olli Pettay [:smaug][bugs@pettay.fi]

Comment 68

•

8 years ago

Does that matter too much? But anyhow, please don't add new microtask usage right now. I have patches to make Promises to use microtasks but still working on to fix our devtools and browsers tests.
(only 1 failing devtool test left, browser_responsiveui_window_close.js, written by you! ;), and dozens of failing browser tests)

Hmm, now that I think. Running setTimeout code during microtask checkpoint would be quite surprising, and against the spec.

Olli Pettay [:smaug][bugs@pettay.fi]

Updated

•

8 years ago

Blocks: 1313864

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 69

•

8 years ago

Attached patch P1 Add the ThrottledEventQueue class. r=froydnj (obsolete) — Details — Splinter Review

So, after working on this bug for a while and talking to :billm I finally decided we need something slightly different from TaskQueue.

TaskQueue was designed for feeding a thread pool.  As such it wants to guarantee that no two events are running on the thread pool at once.  To that end it dispatches the next event only after all processing for the current event is done.

That works for thread pools, but is not quite right for a mechanism that dispatches to a single thread which may be its current thread.  In that world we need to handle cases where the event might spin the event loop.  If we only dispatch the next runnable after processing the current event then spinning the event loop from an event stalls the TaskQueue.

This class, in contrast, dispatches its next runnable *before* processing its current event.  This solves the event loop problem, but makes it not suitable for thread pool targets.

Writing a new class also allowed us to more easily work around some other idiosyncrasies in TaskQueue.  For example, this class is a simple nsIEventTarget which ensures it will work with the majority of the gecko tree.

Most of this patch is the same as the previous SelfClosingTaskQueueTarget class I previously had reviewed here.  I simply swapped out the TaskQueue for an nsEventQueue, a mutex, and a runnable.  (I believe the nsEventQueue is likely better optimized than TaskQueue's std::queue as well.)

Attachment #8791375 - Attachment is obsolete: true

Attachment #8806780 - Flags: review?(nfroyd)

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 70

•

8 years ago

Attached patch P2 Expose a ThrottledEventQueue on TabGroup and nsPIDOMWindow. r=smaug (obsolete) — Details — Splinter Review

This patch exposes the ThrottledEventQueue on the TabGroup and nsPIDOMWindow objects.

A TabGroup always has a ThrottledEventQueue if we're not in shutdown, but we only expose it to windows that are both non-chrome and have an outer window assigned (because we need it to get the TabGroup).

Attachment #8805695 - Attachment is obsolete: true

Attachment #8806784 - Flags: review?(bugs)

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 71

•

8 years ago

Attached patch P3 Make setTimeout() and setInterval() use the TabGroup ThrottledEventQueue. r=smaug (obsolete) — Details — Splinter Review

I'm still running a few final tests before flagging this patch for review.

Attachment #8805696 - Attachment is obsolete: true

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 72

•

8 years ago

Attached patch P4 Use ThrottledEventQueue in workers. r=baku (obsolete) — Details — Splinter Review

This is a rewrite of the previous P4 patch that made workers feed the window's TaskQueue.

This patch makes workers now use ThrottledEventQueue instead of TaskQueue.  It also feeds into the window TaskQueue as we did before.

Attachment #8805697 - Attachment is obsolete: true

Attachment #8806787 - Flags: review?(amarchesini)

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 73

•

8 years ago

Attached patch P5 Suspend a window if its main thread TaskQueue falls to far behind. r=smaug (obsolete) — Details — Splinter Review

Again, still doing some tests before flagging for review on this one.

https://treeherder.mozilla.org/#/jobs?repo=try&revision=f72d86a13e47d80d9ce718bf308ecaa0ed5f6380

Attachment #8805590 - Attachment is obsolete: true

Andrea Marchesini [:baku]

Comment 74

•

8 years ago

Comment on attachment 8806787 [details] [diff] [review]
P4 Use ThrottledEventQueue in workers. r=baku

Review of attachment 8806787 [details] [diff] [review]:
-----------------------------------------------------------------

::: dom/workers/WorkerPrivate.cpp
@@ +4002,5 @@
> +  // Throttle events to the main thread using a ThrottledEventQueue specific to
> +  // this worker thread.  This may return nullptr during shutdown.
> +  mMainThreadThrottledEventQueue = ThrottledEventQueue::Create(target);
> +
> +  // If we were able to creat the throttled event queue, then use it for

to create

@@ +4624,5 @@
>      // If the worker thread is spamming the main thread faster than it can
>      // process the work, then pause the worker thread until the MT catches
>      // up.
> +    if (mMainThreadThrottledEventQueue &&
> +        mMainThreadThrottledEventQueue->Length() > 5000) {

Maybe this should be a #define MAX_THROTTLEDEVENTQUEUE_LENGTH

Attachment #8806787 - Flags: review?(amarchesini) → review+

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 75

•

8 years ago

Comment on attachment 8806780 [details] [diff] [review]
P1 Add the ThrottledEventQueue class. r=froydnj

Need to sort out some try failures.

Attachment #8806780 - Flags: review?(nfroyd)

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 76

•

8 years ago

Attached patch P4 Use ThrottledEventQueue in workers. r=baku (obsolete) — Details — Splinter Review

Minor tweak to fix building on windows where the compiler is more strict about ternary types.

Also, I had to make child worker threads inherit their parent worker's ThrottledEventQueue.  Unlike TaskQueue we must currently create ThrottledEventQueue on the main thread.

Attachment #8806787 - Attachment is obsolete: true

Attachment #8806831 - Flags: review+

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 77

•

8 years ago

https://treeherder.mozilla.org/#/jobs?repo=try&revision=3b2df7bc0370f9211935755413df978d98304b8d

Olli Pettay [:smaug][bugs@pettay.fi]

Comment 78

•

8 years ago

Comment on attachment 8806784 [details] [diff] [review]
P2 Expose a ThrottledEventQueue on TabGroup and nsPIDOMWindow. r=smaug

> TabGroup::TabGroup()
>-{}
>+{
>+  nsCOMPtr<nsIThread> mainThread;
>+  NS_GetMainThread(getter_AddRefs(mainThread));
>+  MOZ_DIAGNOSTIC_ASSERT(mainThread);
>+
>+  // This may return nullptr during xpcom shutdown.  This is ok as we
>+  // do not guarantee a task queue will be present.
>+  mThrottledEventQueue = ThrottledEventQueue::Create(mainThread);
>+}
I don't think we should create mThrottledEventQueue for chrome tab group, since then
accessing the queue from window returns null, but accessing queue from TabGroup is non-null

Attachment #8806784 - Flags: review?(bugs) → review-

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 79

•

8 years ago

Attached patch P2 Expose a ThrottledEventQueue on TabGroup and nsPIDOMWindow. r=smaug (obsolete) — Details — Splinter Review

This adds an argument to the TabGroup constructor so we can avoid creating the ThrottledEventQueue for the chrome TabGroup.

https://treeherder.mozilla.org/#/jobs?repo=try&revision=cc8529e302395177acb25432946825492d999d33

Attachment #8806784 - Attachment is obsolete: true

Attachment #8806844 - Flags: review?(bugs)

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 80

•

8 years ago

Comment on attachment 8806844 [details] [diff] [review]
P2 Expose a ThrottledEventQueue on TabGroup and nsPIDOMWindow. r=smaug

Sorry, too quick to flag here.

Attachment #8806844 - Flags: review?(bugs)

Olli Pettay [:smaug][bugs@pettay.fi]

Comment 81

•

8 years ago

>+  MOZ_ASSERT(!sChromeTabGroup);
Clearly this assertion may fire when non-chrome tab groups are created. So, remove the assertion.

With that, r+ for the patch fixing that issue.

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 82

•

8 years ago

Attached patch P2 Expose a ThrottledEventQueue on TabGroup and nsPIDOMWindow. r=smaug (obsolete) — Details — Splinter Review

Removed the spurious assert.  r+ per previous comment.

https://treeherder.mozilla.org/#/jobs?repo=try&revision=ab7a125690134d5a7a7218c10a7e176d88cf04e6

Attachment #8806844 - Attachment is obsolete: true

Attachment #8806847 - Flags: review+

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 83

•

8 years ago

Attached patch P4 Use ThrottledEventQueue in workers. r=baku (obsolete) — Details — Splinter Review

Another worker tweak.  We don't want to shutdown the ThrottledEventQueue explicitly since it might be shared with child workers.  Instead just rely on the automatic shutdown mechanism.

https://treeherder.mozilla.org/#/jobs?repo=try&revision=4a092dee584b24fd4654240857aac0d67ceb0762

Attachment #8806831 - Attachment is obsolete: true

Attachment #8806900 - Flags: review+

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 84

•

8 years ago

Comment on attachment 8806780 [details] [diff] [review]
P1 Add the ThrottledEventQueue class. r=froydnj

Ok, I didn't need to change anything here.  Please see patch description in comment 69.

Attachment #8806780 - Flags: review?(nfroyd)

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 85

•

8 years ago

Comment on attachment 8806785 [details] [diff] [review]
P3 Make setTimeout() and setInterval() use the TabGroup ThrottledEventQueue. r=smaug

Review of attachment 8806785 [details] [diff] [review]:
-----------------------------------------------------------------

This patch modifies our timer code to dispatch runnables using the window's ThrottledEventQueue.

In addition, we also try to avoid running multiple callbacks in a single runnable.  Chrome windows, however, will still batch similar to how chrome windows bypass the throttling mechanism.

Attachment #8806785 - Flags: review?(bugs)

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 86

•

8 years ago

Comment on attachment 8806788 [details] [diff] [review]
P5 Suspend a window if its main thread TaskQueue falls to far behind. r=smaug

Add a mechanism to apply back pressure to a window.  If we detect the ThrottledEventQueue is falling behind simply Suspend() the window.  We place a runnable at the end of the queue that will Resume() the window (when the queue is presumably clear).

Right now I trigger this from the timer code. As more subsystems start using the ThrottledEventQueue() they should also check for the back pressure condition.

This back pressure mechanism is important to avoid memory problems.  Throttling events by itself keeps the browser responsive, but moves the cpu problem to a memory problem by buffering events in the queue.  The back pressure is then a way to solve the memory problem by not filling the queue any more after it hits a threshold.

Attachment #8806788 - Flags: review?(bugs)

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 87

•

8 years ago

Try build from comment 83 is looking pretty green!  Hoping I can land this before it bit rots.

Olli Pettay [:smaug][bugs@pettay.fi]

Comment 88

•

8 years ago

Comment on attachment 8806788 [details] [diff] [review]
P5 Suspend a window if its main thread TaskQueue falls to far behind. r=smaug

It feels wrong to suspend and yet expect the event queue to be processed.
I must be missing something here.
I would have expected us to throttle timeout handling in case the queue gets too long. Though, that wouldn't help with postMessage messages coming from workers. In that case worker should be throttled.
But looks like this doesn't anyhow deal with postMessages... so, I'm lost here.


Either I'm missing something here, or this adds some new odd state where we're suspended but yet process stuff.

Attachment #8806788 - Flags: review?(bugs) → review-

Olli Pettay [:smaug][bugs@pettay.fi]

Comment 89

•

8 years ago

Comment on attachment 8806788 [details] [diff] [review]
P5 Suspend a window if its main thread TaskQueue falls to far behind. r=smaug

ok, after the explanation on irc, this make more sense.

Please add some comment that this back pressure should apply only in worst case scenarios.

Attachment #8806788 - Flags: review- → review+

Olli Pettay [:smaug][bugs@pettay.fi]

Comment 90

•

8 years ago

Comment on attachment 8806785 [details] [diff] [review]
P3 Make setTimeout() and setInterval() use the TabGroup ThrottledEventQueue. r=smaug

>+  // Run a limited number of timers at once to avoid janking the main
>+  // thread.  Any timers beyond this will get picked up on the next
>+  // timer runnable which should fire immediately.  Chrome windows are
>+  // exempty from this limit.
>+  static const uint32_t kMaxSequentialTimeouts = 1;
I don't understand what this stuff has to do with this bug, but fine.

>   // The timeout list is kept in deadline order. Discover the latest timeout
>   // whose deadline has expired. On some platforms, native timeout events fire
>   // "early", but we handled that above by setting deadline to aTimeout->mWhen
>   // if the timer fired early.  So we can stop walking if we get to timeouts
>   // whose mWhen is greater than deadline, since once that happens we know
>   // nothing past that point is expired.
>   last_expired_timeout = nullptr;
>+  uint32_t count = 0;
>+  bool foundTarget = false;
>   for (Timeout* timeout = mTimeouts.getFirst();
>        timeout && timeout->mWhen <= deadline;
>        timeout = timeout->getNext()) {
>     if (timeout->mFiringDepth == 0) {
>       // Mark any timeouts that are on the list to be fired with the
>       // firing depth so that we can reentrantly run timeouts
>       timeout->mFiringDepth = firingDepth;
>       last_expired_timeout = timeout;
>+
>+      // Note that we have seen the timer this runnable was dispatched
>+      // to fire.
>+      if (timeout == aTimeout) {
>+        foundTarget = true;
>+      }
>+
>+      // Apply throttling to the number of timers we will fire in a
>+      // single runnable before yielding the main thread.
>+      //
>+      // While we would like to throttle immediately after finding our
>+      // limited number of timers, we MUST fire our target timer.  We
>+      // cannot rely on another timer's runnable firing because that
>+      // timer might get cleared as a result of this runnable.
>+      //
>+      // Note, the fact that our target timeout is not always the first
>+      // timer we encounter suggests we have other bugs.  We should
>+      // receive timer callbacks in the same order we store our mTimers
>+      // list.  It seems they can come slightly out of order after
>+      // suspend/resume, however.
>+      count += 1;
>+      if (foundTarget && count >= kMaxSequentialTimeouts && !IsChromeWindow()) {
>+        break;
>+      }
So this doesn't guarantee at most kMaxSequentialTimeouts timeouts, if timeout == aTimeout isn't true for the first timeout.
Per IRC, remove kMaxSequentialTimeouts and just rely on 'timeout == aTimeout' and document that one or more, but usually just one timer ends up running.

Attachment #8806785 - Flags: review?(bugs) → review+

Nathan Froyd [:froydnj]

Comment 91

•

8 years ago

Comment on attachment 8806780 [details] [diff] [review]
P1 Add the ThrottledEventQueue class. r=froydnj

Review of attachment 8806780 [details] [diff] [review]:
-----------------------------------------------------------------

This looks good, I just have a couple of comments on the documentation (which is great!).  r=me with comments fixed up.

::: xpcom/threads/ThrottledEventQueue.cpp
@@ +17,5 @@
> +using mozilla::services::GetObserverService;
> +
> +namespace {
> +
> +static const char* kShutdownTopic = "xpcom-shutdown";

Nit: please make this |static const char kShutdownTopic[]|

@@ +158,5 @@
> +    Unused << event->Run();
> +    --mExecutionDepth;
> +
> +    // If shutdown was started and the queue is now empty we can now
> +    // finalize the shutdown.  This is performed separate at the end

"...performed separately", perhaps?

@@ +174,5 @@
> +    MOZ_ASSERT(IsEmpty());
> +
> +    nsCOMPtr<nsIObserverService> obs = GetObserverService();
> +
> +    // This may cause immediate destruction of `this`.

That can't be the case, because the NewRunnableMethod executing ShutdownComplete is already holding a strong ref to `this`, right?

@@ +266,5 @@
> +  AwaitIdle() const
> +  {
> +    // Any thread, except the main thread or our base target.  Blocking the
> +    // main thread is forbidden and block the base target is guaranteed to
> +    // produce a deadlock.

Does this mean to say "...blocking the base target..."?  Not sure if it makes sense otherwise.

::: xpcom/threads/ThrottledEventQueue.h
@@ +3,5 @@
> +/* This Source Code Form is subject to the terms of the Mozilla Public
> + * License, v. 2.0. If a copy of the MPL was not distributed with this
> + * file, You can obtain one at http://mozilla.org/MPL/2.0/. */
> +
> +#ifndef mozilla_ThrottledEventQueue_h

I've been asking people to put a one-line comment here so DXR etc. will display a little information about files.  Maybe:

// nsIEventTarget wrapper for throttling event dispatch.

?

@@ +13,5 @@
> +
> +// A ThrottledEventQueue is an event target that can be used to throttle
> +// events being dispatched to another base target.  It maintains its
> +// own queue of events and only dispatches one at a time to the wrapped
> +// target.  This can be used to avoid flooding the base target.

Thank you for this comment.

It's not immediately obvious from the implementation, or from this comment, how this works: after all, a normal thread maintains its own queue of events and only runs ("dispatches") one at a time.  I think something needs to be added about how Length/AwaitIdle can be used to throttle events coming into the queue?

@@ +53,5 @@
> +  Create(nsIEventTarget* aBaseTarget);
> +
> +  // Determine if there are any events pending in the queue.
> +  bool
> +  IsEmpty() const;

Nit: I think we only use this style with the return type of a separate line for definitions, or if the complete declaration is too long to fit on one line.  Can you fix the declarations up to be on one line?

@@ +72,5 @@
> +  // In order to determine when shutdown completes you can either dispatch
> +  // a runnable before calling MaybeStartShutdown() or you can block
> +  // until the queue is empty using AwaitIdle().
> +  void
> +  MaybeStartShutdown();

Is there value in making this a public function, or could it be protected/private and made visible via `friend` declarations instead?

Attachment #8806780 - Flags: review?(nfroyd) → review+

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 92

•

8 years ago

Thanks for the review!

(In reply to Nathan Froyd [:froydnj] from comment #91)
> @@ +72,5 @@
> > +  // In order to determine when shutdown completes you can either dispatch
> > +  // a runnable before calling MaybeStartShutdown() or you can block
> > +  // until the queue is empty using AwaitIdle().
> > +  void
> > +  MaybeStartShutdown();
> 
> Is there value in making this a public function, or could it be
> protected/private and made visible via `friend` declarations instead?

I was using it in my workers patch, but stopped doing so.  I can make it private for now and change the comments to indicate that people should depend on the self-closing behavior.

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 93

•

8 years ago

I mean, I think explicitly shutting down is a valid use case, but maybe we don't need it right now.

Nathan Froyd [:froydnj]

Comment 94

•

8 years ago

(In reply to Ben Kelly [:bkelly] from comment #93)
> I mean, I think explicitly shutting down is a valid use case, but maybe we
> don't need it right now.

I think that could be a valid use case, too, but I would prefer that we make the API non-visible right now and let the person who needs it make a case for why it should be public later on.

Nathan Froyd [:froydnj]

Comment 95

•

8 years ago

Should we MOZ_COUNT_CTOR/DTOR the Inner class, to ensure the observer service isn't holding a bunch of these things alive?

Flags: needinfo?(bkelly)

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 96

•

8 years ago

(In reply to Nathan Froyd [:froydnj] from comment #95)
> Should we MOZ_COUNT_CTOR/DTOR the Inner class, to ensure the observer
> service isn't holding a bunch of these things alive?

NS_IMPL_ISUPPORTS uses ref-count logging which is supposedly preferred over ctor/dtor logging.

Flags: needinfo?(bkelly)

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 97

•

8 years ago

Attached patch P1 Add the ThrottledEventQueue class. r=froydnj — Details — Splinter Review

Attachment #8806780 - Attachment is obsolete: true

Attachment #8808279 - Flags: review+

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 98

•

8 years ago

Attached patch P2 Expose a ThrottledEventQueue on TabGroup and nsPIDOMWindow. r=smaug — Details — Splinter Review

Attachment #8806847 - Attachment is obsolete: true

Attachment #8808280 - Flags: review+

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 99

•

8 years ago

Attached patch P3 Make setTimeout() and setInterval() use the TabGroup ThrottledEventQueue. r=smaug — Details — Splinter Review

Attachment #8806785 - Attachment is obsolete: true

Attachment #8808281 - Flags: review+

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 100

•

8 years ago

Attached patch P4 Use ThrottledEventQueue in workers. r=baku — Details — Splinter Review

Attachment #8806900 - Attachment is obsolete: true

Attachment #8808282 - Flags: review+

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 101

•

8 years ago

Attached patch P5 Suspend a window if its main thread TaskQueue falls to far behind. r=smaug — Details — Splinter Review

One last try to make sure I didn't break something while addressing review feedback:

https://treeherder.mozilla.org/#/jobs?repo=try&revision=7ba29c73421cc2e05439bb8edd4badbf0fc28aa9

Attachment #8806788 - Attachment is obsolete: true

Attachment #8808284 - Flags: review+

Bill McCloskey [inactive unless it's an emergency] (:billm)

Comment 102

•

8 years ago

I don't want to hold this up, but is there a reason the queue is on the TabGroup and not the DocGroup?

Flags: needinfo?(bkelly)

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 103

•

8 years ago

(In reply to Bill McCloskey (:billm) from comment #102)
> I don't want to hold this up, but is there a reason the queue is on the
> TabGroup and not the DocGroup?

Its the safest starting point given all documents in the TabGroup have sync access to each other in one form or another.  In the future we can create DocGroup specific queues that feed into the TabGroup queue.  Timers could then use those DocGroup queues.  Or we could even have window specific timer queues.

I guess I view this as a starting point with the expectation that we will continue to refine it.

Flags: needinfo?(bkelly)

Pulsebot

Comment 104

•

8 years ago

Pushed by bkelly@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/537dea427441
P1 Add the ThrottledEventQueue class. r=froydnj
https://hg.mozilla.org/integration/mozilla-inbound/rev/a7b4c0d350ef
P2 Expose a ThrottledEventQueue on TabGroup and nsPIDOMWindow. r=smaug
https://hg.mozilla.org/integration/mozilla-inbound/rev/00efae73e570
P3 Make setTimeout() and setInterval() use the TabGroup ThrottledEventQueue. r=smaug
https://hg.mozilla.org/integration/mozilla-inbound/rev/2ff33f3c59d5
P4 Use ThrottledEventQueue in workers. r=baku
https://hg.mozilla.org/integration/mozilla-inbound/rev/24d97dc514d4
P5 Suspend a window if its main thread TaskQueue falls to far behind. r=smaug

Ben Kelly [:bkelly, not reviewing]

Assignee

Updated

•

8 years ago

Blocks: 1315803

Phil Ringnalda (:philor)

Updated

•

8 years ago

Depends on: 1315867

Alison Shiue

Updated

•

8 years ago

Depends on: 1315895

Wes Kocher (:KWierso) (Not reading bugmail; email directly if needed)

Comment 105

•

8 years ago

bugherder

https://hg.mozilla.org/mozilla-central/rev/537dea427441
https://hg.mozilla.org/mozilla-central/rev/a7b4c0d350ef
https://hg.mozilla.org/mozilla-central/rev/00efae73e570
https://hg.mozilla.org/mozilla-central/rev/2ff33f3c59d5
https://hg.mozilla.org/mozilla-central/rev/24d97dc514d4

Status: ASSIGNED → RESOLVED

Closed: 8 years ago

status-firefox52: --- → fixed

Resolution: --- → FIXED

Target Milestone: --- → mozilla52

Loic

Updated

•

8 years ago

Depends on: 1318084

Ben Kelly [:bkelly, not reviewing]

Assignee

Updated

•

8 years ago

Blocks: 1320235

Emilio Cobos Álvarez (:emilio)

Updated

•

8 years ago

Blocks: 1322328

Alice0775 White