Last Comment Bug 715376 - implement per-page event queue, to allow timeout grouping/prioritization/etc
: implement per-page event queue, to allow timeout grouping/prioritization/etc
Status: NEW
[Snappy:P1][leave open]
:
Product: Core
Classification: Components
Component: DOM: Core & HTML (show other bugs)
: unspecified
: x86 Windows 7
: -- normal with 27 votes (vote)
: ---
Assigned To: Nobody; OK to take it and work on it
:
Mentors:
Depends on: 847991 827990 827993 829191 892511 1003096
Blocks: 715378 820411 1266353
  Show dependency treegraph
 
Reported: 2012-01-04 16:14 PST by (dormant account)
Modified: 2016-08-18 12:14 PDT (History)
78 users (show)
khuey: needinfo? (overholt)
ryanvm: in‑testsuite+
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---


Attachments
evidence that we fire too many timers (5.54 KB, image/png)
2012-02-03 03:01 PST, (dormant account)
no flags Details
patch (16.54 KB, patch)
2012-05-02 08:59 PDT, Nathan Froyd [:froydnj]
no flags Details | Diff | Splinter Review
RunTimeout refactoring (16.40 KB, patch)
2012-05-02 09:00 PDT, Nathan Froyd [:froydnj]
no flags Details | Diff | Splinter Review
queue timeouts for later processing (17.58 KB, patch)
2012-05-04 10:29 PDT, Nathan Froyd [:froydnj]
no flags Details | Diff | Splinter Review
per-window event queues (5.72 KB, patch)
2012-05-04 10:31 PDT, Nathan Froyd [:froydnj]
no flags Details | Diff | Splinter Review
per-page event queues and some event dispatching (17.58 KB, patch)
2012-05-16 12:34 PDT, Nathan Froyd [:froydnj]
no flags Details | Diff | Splinter Review
per-page event queues and some event dispatching (28.54 KB, patch)
2012-05-16 12:37 PDT, Nathan Froyd [:froydnj]
no flags Details | Diff | Splinter Review
WIP patch, alternate event dispatching only (10.57 KB, patch)
2012-05-23 13:09 PDT, Nathan Froyd [:froydnj]
bugs: feedback+
bzbarsky: feedback+
Details | Diff | Splinter Review
WIP patch, alternate event dispatching only (11.94 KB, patch)
2012-06-05 07:04 PDT, Nathan Froyd [:froydnj]
no flags Details | Diff | Splinter Review
WIP rollup patch (33.03 KB, patch)
2012-06-21 13:01 PDT, Nathan Froyd [:froydnj]
no flags Details | Diff | Splinter Review
WIP rollup patch (40.53 KB, patch)
2012-06-28 14:34 PDT, Nathan Froyd [:froydnj]
no flags Details | Diff | Splinter Review
stack frames (13.08 KB, text/plain)
2012-06-28 14:51 PDT, Nathan Froyd [:froydnj]
no flags Details
WIP rollup patch (61.66 KB, patch)
2012-07-25 09:18 PDT, Nathan Froyd [:froydnj]
no flags Details | Diff | Splinter Review
part 1 - per-document event queue infrastructure (15.75 KB, patch)
2012-07-31 13:26 PDT, Nathan Froyd [:froydnj]
no flags Details | Diff | Splinter Review
part 2 - redirect events to the per-document event queue (39.14 KB, patch)
2012-07-31 13:28 PDT, Nathan Froyd [:froydnj]
no flags Details | Diff | Splinter Review
part 3 - make tests aware of the per-document event queue (6.38 KB, patch)
2012-07-31 13:37 PDT, Nathan Froyd [:froydnj]
jmaher: review+
justin.lebar+bug: review+
Details | Diff | Splinter Review
part 1 - redirect events to the nearest document's event queue (49.54 KB, patch)
2012-08-09 11:35 PDT, Nathan Froyd [:froydnj]
bzbarsky: feedback+
Details | Diff | Splinter Review
current rollup patch (4.13 KB, patch)
2012-10-31 07:37 PDT, Nathan Froyd [:froydnj]
no flags Details | Diff | Splinter Review
current rollup patch (73.62 KB, patch)
2012-12-05 08:00 PST, Nathan Froyd [:froydnj]
no flags Details | Diff | Splinter Review
part 1 - redirect events through the nearest document's task queue (61.68 KB, patch)
2012-12-05 10:48 PST, Nathan Froyd [:froydnj]
no flags Details | Diff | Splinter Review
part 2 - make nsDocument manage its own event queue rather than deferring to the current thread (10.03 KB, patch)
2012-12-05 10:50 PST, Nathan Froyd [:froydnj]
bzbarsky: review+
Details | Diff | Splinter Review
part 3 - post timers to the correct nsIEventTarget (7.13 KB, patch)
2012-12-05 10:51 PST, Nathan Froyd [:froydnj]
bzbarsky: review+
Details | Diff | Splinter Review
part 4 - update MockFilePicker.init() to take a window for event dispatching purposes (16.10 KB, patch)
2012-12-05 10:52 PST, Nathan Froyd [:froydnj]
no flags Details | Diff | Splinter Review
part 5 - fix executeSoon to be window-aware (1.95 KB, patch)
2012-12-05 10:53 PST, Nathan Froyd [:froydnj]
jmaher: review+
Details | Diff | Splinter Review
part 6 - make iframes and window.open'd windows share event targets with parents (1.97 KB, patch)
2012-12-05 10:58 PST, Nathan Froyd [:froydnj]
bzbarsky: review+
Details | Diff | Splinter Review
part 7 - fixup tests to call executeSoon/setTimeout on the correct windows (20.63 KB, patch)
2012-12-05 10:59 PST, Nathan Froyd [:froydnj]
no flags Details | Diff | Splinter Review
part 4 - update MockFilePicker.init() to take a window for event dispatching purposes (16.11 KB, patch)
2012-12-05 11:03 PST, Nathan Froyd [:froydnj]
jmaher: review+
Details | Diff | Splinter Review
redirect events through the nearest document's task queue (60.41 KB, patch)
2012-12-10 08:06 PST, Nathan Froyd [:froydnj]
bzbarsky: review+
Details | Diff | Splinter Review
part 1 - redirect events through the nearest document's task queue (64.95 KB, patch)
2013-07-11 09:55 PDT, Nathan Froyd [:froydnj]
no flags Details | Diff | Splinter Review
part 2 - make nsDocument manage its own event queue rather than deferring to the current thread (10.03 KB, patch)
2013-07-11 09:56 PDT, Nathan Froyd [:froydnj]
no flags Details | Diff | Splinter Review
part 3 - post timers to the correct nsIEventTarget (6.45 KB, patch)
2013-07-11 09:57 PDT, Nathan Froyd [:froydnj]
no flags Details | Diff | Splinter Review
part 6 - make iframes and window.open'd windows share event targets with parents (2.96 KB, patch)
2013-07-11 09:58 PDT, Nathan Froyd [:froydnj]
no flags Details | Diff | Splinter Review
part 7 - fixup tests to call executeSoon/setTimeout on the correct windows (34.23 KB, patch)
2013-07-11 10:01 PDT, Nathan Froyd [:froydnj]
no flags Details | Diff | Splinter Review
part 1 - redirect all events through the nearest document (72.64 KB, patch)
2014-03-14 11:08 PDT, Nathan Froyd [:froydnj]
no flags Details | Diff | Splinter Review
part 2 - post timers to nsIEventTargets associated with documents (7.58 KB, patch)
2014-03-14 11:08 PDT, Nathan Froyd [:froydnj]
no flags Details | Diff | Splinter Review
part 3 - make nsDocument manage its own event queue rather than deferring to the current thread (7.84 KB, patch)
2014-03-14 11:08 PDT, Nathan Froyd [:froydnj]
no flags Details | Diff | Splinter Review
part 4 - make iframes and window.open'd windows share event targets with parents (4.48 KB, patch)
2014-03-14 11:08 PDT, Nathan Froyd [:froydnj]
no flags Details | Diff | Splinter Review
part 5 - fix executeSoon to be window-aware (2.27 KB, patch)
2014-03-14 11:09 PDT, Nathan Froyd [:froydnj]
no flags Details | Diff | Splinter Review
part 6 - fixup tests to call executeSoon/setTimeout on the correct window(s) (40.93 KB, patch)
2014-03-14 11:12 PDT, Nathan Froyd [:froydnj]
no flags Details | Diff | Splinter Review
add envvar switch for turning off event queues (1.65 KB, patch)
2014-03-14 11:13 PDT, Nathan Froyd [:froydnj]
no flags Details | Diff | Splinter Review
rollup patch that applies to m-c as of now (85.42 KB, patch)
2014-04-28 09:00 PDT, Nathan Froyd [:froydnj]
no flags Details | Diff | Splinter Review

Description (dormant account) 2012-01-04 16:14:40 PST
We currently throttle setTimeouts in background tabs. It would be good to actually combine them to 
a) reduce wakeups
b) do some setTimeout queue management to avoid interrupting the user. ie run 30ms worth of setTimeouts, then sleep a second, then run some more.
Comment 1 Olli Pettay [:smaug] 2012-01-05 07:10:52 PST
(In reply to Taras Glek (:taras) from comment #0)
> We currently throttle setTimeouts in background tabs. It would be good to
> actually combine them to 
> a) reduce wakeups
This sounds good.


> b) do some setTimeout queue management to avoid interrupting the user. ie
> run 30ms worth of setTimeouts, then sleep a second, then run some more.
Wouldn't this cause browser to be possibly unresponsive once a second, if there are several
background tabs.
Comment 2 (dormant account) 2012-01-05 16:25:13 PST
i didn't mean sleep(1). I mean if you batch up timeouts and you exceed your allocated timeslot for timeouts, don't process any more timeouts for another second.
Comment 3 (dormant account) 2012-01-10 16:39:50 PST
Batching timeouts is nicer for CPU-run-to-completetion and it should make it easier to reschedule batches of timeouts when thresholds are crossed.
Comment 4 Johnny Stenback (:jst, jst@mozilla.com) 2012-02-02 12:03:46 PST
John will be looking into this, reassigning. There's probably a few different ways we can do this, I'd like to see us stop using an XPCOM timer per DOM timeout and use either just one, or one per window, or something like that.
Comment 5 (dormant account) 2012-02-03 03:01:35 PST
Created attachment 594109 [details]
evidence that we fire too many timers

This doesn't distinguish between foreground/background, but I think DOM_TIMERS_RECENTLY_SET is a pretty good indicator that we need to be more clever about timers.
Comment 6 Johnny Stenback (:jst, jst@mozilla.com) 2012-03-01 11:44:08 PST
John is starting to investigate here, looking into scheduling JS timeouts off of the refresh driver, and looking into better scheduling of timeouts in general (i.e. Windows media mode for >16ms timer resolution when necessary), and also slowing down background tabs by spreading out slowed down timeouts in background tabs to cause as consistent as possible "load" as opposed to firing all background timers at once (per window or what not).
Comment 7 Daniel Cater 2012-03-09 19:47:37 PST
(In reply to Taras Glek (:taras) from comment #5)
> Created attachment 594109 [details]
> evidence that we fire too many timers
> 
> This doesn't distinguish between foreground/background, but I think
> DOM_TIMERS_RECENTLY_SET is a pretty good indicator that we need to be more
> clever about timers.

What does each axis represent?
Comment 8 (dormant account) 2012-04-18 11:38:42 PDT
Morphing this bug "slightly" based on discussion with bz. Per-page event queues will let us deal with XHR, timeouts, and other queued up events properly. This will also help with prioritizing active tabs, decaying old ones, etc.
Comment 9 Nathan Froyd [:froydnj] 2012-05-02 08:59:40 PDT
Created attachment 620342 [details] [diff] [review]
patch

Here's a patch which:

1) Moves the actual handling of timeouts to its own global, not per-window, queue;
2) Runs those queued handlers at appropriate times.

Despite trying to understand all the corner cases, I'm quite sure I've missed several.  However, the browser does run and can process setTimeouts (various news-y webpages, JS sort animations, etc.), so things aren't too awful.  I have not run the testsuite.

There's no prioritization/throttling of foreground/background tabs and the like yet.

I did write this on top of the RunTimeout refactoring in bug 748464; I'll attach that patch here for reference.  If that patch doesn't get approved, it's easy enough to fix.
Comment 10 Nathan Froyd [:froydnj] 2012-05-02 09:00:33 PDT
Created attachment 620343 [details] [diff] [review]
RunTimeout refactoring

The aforementioned bug 748464 patch, placed here for completeness.
Comment 11 Kyle Huey [:khuey] (Exited; not receiving bugmail, email if necessary) 2012-05-02 09:33:27 PDT
Comment on attachment 620342 [details] [diff] [review]
patch

Review of attachment 620342 [details] [diff] [review]:
-----------------------------------------------------------------

This patch doesn't really match the summary of the bug, but here are a few comments anyways.

::: dom/base/nsGlobalWindow.cpp
@@ +878,5 @@
> +  nsTArray<nsRefPtr<nsGlobalWindow> > mWindows;
> +  static GlobalTimeoutQueue *sInstance;
> +};
> +
> +GlobalTimeoutQueue* GlobalTimeoutQueue::sInstance;

This needs to be initialized to null, right?  Also, GlobalTimeoutQueue is leaked (which will trigger our leak detection code since it has a member nsTArray).

@@ +908,5 @@
> +{
> +  NS_ASSERTION(NS_IsMainThread(), "wrong thread");
> +
> +  nsCOMPtr<nsIRunnable> runner = new GlobalTimeoutQueueRunner(this);
> +  NS_DispatchToCurrentThread(runner);

Why do we need a separate runnable?  Why not just make the GlobalTimeoutQueue inherit from nsIRunnable?

@@ +915,5 @@
> +bool
> +GlobalTimeoutQueue::IsTrackingWindow(nsGlobalWindow *aWindow)
> +{
> +  return mWindows.Contains(aWindow);
> +}

O(n) behavior here worries me a bit.
Comment 12 Nathan Froyd [:froydnj] 2012-05-02 09:56:36 PDT
Thanks for the comments.

(In reply to Kyle Huey [:khuey] (khuey@mozilla.com) from comment #11)
> This patch doesn't really match the summary of the bug, but here are a few
> comments anyways.

What do you think would match the summary of the bug?  I implemented what had been discussed elsewhere, though I might have gotten wires crossed 'twixt talk and implementation.

> ::: dom/base/nsGlobalWindow.cpp
> @@ +878,5 @@
> > +  nsTArray<nsRefPtr<nsGlobalWindow> > mWindows;
> > +  static GlobalTimeoutQueue *sInstance;
> > +};
> > +
> > +GlobalTimeoutQueue* GlobalTimeoutQueue::sInstance;
> 
> This needs to be initialized to null, right?  Also, GlobalTimeoutQueue is
> leaked (which will trigger our leak detection code since it has a member
> nsTArray).

I guess for those crazy platforms where null != 0, yes.  The leakiness is something that should be fixed (at least in debug builds).

> @@ +908,5 @@
> > +{
> > +  NS_ASSERTION(NS_IsMainThread(), "wrong thread");
> > +
> > +  nsCOMPtr<nsIRunnable> runner = new GlobalTimeoutQueueRunner(this);
> > +  NS_DispatchToCurrentThread(runner);
> 
> Why do we need a separate runnable?  Why not just make the
> GlobalTimeoutQueue inherit from nsIRunnable?

No reason, just separating out bits.
Comment 13 Kyle Huey [:khuey] (Exited; not receiving bugmail, email if necessary) 2012-05-02 10:00:46 PDT
(In reply to Nathan Froyd (:froydnj) from comment #12)
> Thanks for the comments.
> 
> (In reply to Kyle Huey [:khuey] (khuey@mozilla.com) from comment #11)
> > This patch doesn't really match the summary of the bug, but here are a few
> > comments anyways.
> 
> What do you think would match the summary of the bug?  I implemented what
> had been discussed elsewhere, though I might have gotten wires crossed
> 'twixt talk and implementation.

Well the summary of the bug talks about per-page event queues ...

> > ::: dom/base/nsGlobalWindow.cpp
> > @@ +878,5 @@
> > > +  nsTArray<nsRefPtr<nsGlobalWindow> > mWindows;
> > > +  static GlobalTimeoutQueue *sInstance;
> > > +};
> > > +
> > > +GlobalTimeoutQueue* GlobalTimeoutQueue::sInstance;
> > 
> > This needs to be initialized to null, right?  Also, GlobalTimeoutQueue is
> > leaked (which will trigger our leak detection code since it has a member
> > nsTArray).
> 
> I guess for those crazy platforms where null != 0, yes.  The leakiness is
> something that should be fixed (at least in debug builds).

Oh, is static data initialized to 0?  I didn't know that

> > @@ +908,5 @@
> > > +{
> > > +  NS_ASSERTION(NS_IsMainThread(), "wrong thread");
> > > +
> > > +  nsCOMPtr<nsIRunnable> runner = new GlobalTimeoutQueueRunner(this);
> > > +  NS_DispatchToCurrentThread(runner);
> > 
> > Why do we need a separate runnable?  Why not just make the
> > GlobalTimeoutQueue inherit from nsIRunnable?
> 
> No reason, just separating out bits.

Well it seems unnecessary to allocate a GlobalTimeoutQueueRunner every time ;-)
Comment 14 Nathan Froyd [:froydnj] 2012-05-04 10:29:36 PDT
Created attachment 621092 [details] [diff] [review]
queue timeouts for later processing

Here's a version of the timeout bits with most of khuey's comments addressed.  Don't yet free() GlobalTimeoutQueue::sInstance, but that's on the radar.
Comment 15 Nathan Froyd [:froydnj] 2012-05-04 10:31:56 PDT
Created attachment 621095 [details] [diff] [review]
per-window event queues

I think this sort of thing is what khuey was asking about on IRC.  This is just the infrastructure, all the modifications of NS_DispatchTo* are yet to be made...and I imagine there are a lot of them.

Compile-tested and basic smoketests pass (with calling DispatchEvent in nsGlobalWindow.cpp only).
Comment 16 Nathan Froyd [:froydnj] 2012-05-04 14:10:47 PDT
(In reply to Nathan Froyd (:froydnj) from comment #15)
> I think this sort of thing is what khuey was asking about on IRC.  This is
> just the infrastructure, all the modifications of NS_DispatchTo* are yet to
> be made...and I imagine there are a lot of them.

Preemptively raising issues: is it going to cause problems that we're now not invoking nsIThreadObserver.OnProcessNextEvent for any events dispatched in this fashion?  Obviously the set of such events is rather small right now, but it is going to get bigger...
Comment 17 Nathan Froyd [:froydnj] 2012-05-09 09:22:40 PDT
Also, it seems that this may not be the best place for this functionality to live long term (since various content and other dom bits need to tie into this), but I do not know of a better place to put it.  Suggestions welcome.
Comment 18 Nathan Froyd [:froydnj] 2012-05-16 12:34:26 PDT
Created attachment 624489 [details] [diff] [review]
per-page event queues and some event dispatching
Comment 19 Nathan Froyd [:froydnj] 2012-05-16 12:37:44 PDT
Created attachment 624492 [details] [diff] [review]
per-page event queues and some event dispatching
Comment 20 Nathan Froyd [:froydnj] 2012-05-16 13:50:00 PDT
After talking with khuey, the basic plan is reasonable, but nsGlobalWindow should be modified to implement nsIEventTarget, so that events can easily be dispatched to windows rather than the current thread or wherever.

It is a open issue whether events should go to inner or outer windows.  Inner seems like the easy choice for now, but careful decisions may have to be made later.
Comment 21 Kyle Huey [:khuey] (Exited; not receiving bugmail, email if necessary) 2012-05-16 13:52:41 PDT
nsGlobalWindow needs to have some object that implements nsIEventTarget that it can hand out.  nsGlobalWindow itself can't implement nsIEventTarget because the latter needs to be threadsafe, and the former isn't.
Comment 22 Nathan Froyd [:froydnj] 2012-05-23 13:09:31 PDT
Created attachment 626571 [details] [diff] [review]
WIP patch, alternate event dispatching only

This patch is strictly a WIP that continues to keep the browser running while making changes.  The idea is to let docshells provide an nsIEventTarget: the main thread for now.  Then various bits can redirect to a docshell's nsIEventTarget instead of NS_DispatchTo{Main,Current}Thread.  Eventually, docshells will provide the associated window's nsIEventTarget and everything will Just Work.  (Modulo some event-ordering issues, but hopefully those will be minor...)

Known areas that haven't been converted yet:

- image loading/decoding;
- possibly other loading bits;
- various events fired during layout, associated gfx bits.

I *think* the loading bits are reasonably easy, but the events fired from layout are sometimes within deep call stacks where propagating the target down would be expensive and/or the frames don't need to hold nsIEventTarget references.  Ideas welcome.

Asking for feedback for general "yes, this is approximately the right direction, but you should do X, Y, and/or Z" or "this is awful, you need to rethink your approach."
Comment 23 Boris Zbarsky [:bz] 2012-05-24 12:53:59 PDT
Comment on attachment 626571 [details] [diff] [review]
WIP patch, alternate event dispatching only

This seems like a totally reasonable approach.

It's worth looking carefully at HTML5 to see whether its task sources are per-document or per-browsing-context or per-browsing-context-group or whatnot so we can align with that.  But having a getter on the document in all cases seems fine; it's just a matter of what it ends up returning long-term.

If we're seriously trying to align with the HTML5 task source business, we should think about whether we want separate event targets per task source or some sort of filtering.  Conceptually, separate event targets makes more sense to me: the actual impl can be the same object that does filtering, if desired.

You don't need to do all that nsIInterfaceRequestor jazz.  Just:

  target = do_GetInterface(docshell);

It's even null-safe.

You also don't need to QI the mainThread: nsIThread inherits from nsIEventTarget.

In the binding manager code, you can probably just drop the event on the floor if !mDocument.  Possibly as a preliminary to this patch.
Comment 24 Olli Pettay [:smaug] 2012-06-04 01:41:03 PDT
Comment on attachment 626571 [details] [diff] [review]
WIP patch, alternate event dispatching only

>+   * Return the target for this document's events.
>+   */
>+  nsIEventTarget *GetEventTarget();
Nit, 
nsIEventTarget* GetEventTarget()
Though, I wonder if this should be already_AddRefed


>+nsIDocument::GetEventTarget()
>+{
>+  // We replaced a bunch of NS_DispatchToCurrentThread calls that were
>+  // hitting the main thread by inspection.  Let's make sure that
>+  // assumption still holds.
>+  NS_ASSERTION(NS_IsMainThread(), "wrong thread!");
>+
>+  nsCOMPtr<nsIEventTarget> target;
>+  nsCOMPtr<nsPIDOMWindow> window = GetWindow();
No need for nsCOMPtr<>
nsPIDOMWindow* is just fine


>+  if (window) {
>+    // FIXME should just ask the window directly
>+    nsIDocShell* docShell = window->GetDocShell();
>+    if (docShell) {
>+      nsCOMPtr<nsIInterfaceRequestor> requestor(do_QueryInterface(docShell));
>+      if (NS_SUCCEEDED(requestor->GetInterface(NS_GET_IID(nsIEventTarget),
>+                                               getter_AddRefs(target)))) {
>+        return target;
>+      }
>+    }
>+  }
though, perhaps you could just
nsCOMPtr<nsIDocShell> ds = do_QueryReferent(mDocumentContainer);
nsCOMPtr<nsIInterfaceRequestor> requestor = do_QueryInterface(ds);
nsCOMPtr<nsIEventTarget> target = do_GetInterface(requestor);
if (!target) {
  target = do_GetMainThread();
}
return target; 


> nsIDocument::DeleteAllProperties()
> {
>@@ -4241,7 +4270,7 @@ nsDocument::EndLoad()
>   if (!mSynchronousDOMContentLoaded) {
>     nsRefPtr<nsIRunnable> ev =
>       NS_NewRunnableMethod(this, &nsDocument::DispatchContentLoadedEvents);
>-    NS_DispatchToCurrentThread(ev);
>+    GetEventTarget()->Dispatch(ev, NS_DISPATCH_NORMAL);

Hmm, should document have method for runnable dispatching.
Something like 
nsresult nsIDocument::DispatchRunnable(nsIRunnable* aRunnable)
{
  nsCOMPtr<nsIEventTarget> target = GetEventTarget();
  return target ? target->Dispatch(ev, NS_DISPATCH_NORMAL) : NS_ERROR_FAILURE;
}

>@@ -8572,6 +8602,7 @@ void
> nsIDocument::ExitFullScreen(bool aRunAsync)
> {
>   if (aRunAsync) {
>+    // XXX would be better if we could GetEventTarget here.
Why we can't?


> nsresult nsAsyncDOMEvent::PostDOMEvent()
> {
>+  // FIXME: these get posted on the main thread; we should make some effort
>+  // to post to any associated window's nsIEventTarget.
>   return NS_DispatchToCurrentThread(this);
> }

return mEventNode ? mEventNode->OwnerDoc()->DispatchRunnable(this) : NS_ERROR_FAILURE;



>+++ b/docshell/base/nsDocShell.cpp
>@@ -937,6 +937,11 @@ NS_IMETHODIMP nsDocShell::GetInterface(const nsIID & aIID, void **aSink)
>              NS_SUCCEEDED(EnsureScriptEnvironment())) {
>         return mScriptGlobal->QueryInterface(aIID, aSink);
>     }
>+    else if (aIID.Equals(NS_GET_IID(nsIEventTarget))) {
>+        // FIXME let this QueryInterface mScriptGlobal.
>+        nsCOMPtr<nsIThread> mainThread = do_GetMainThread();
>+        return mainThread->QueryInterface(aIID, aSink);
So we'll want a new class for the nsIEventTarget. nsDocShell could own it, but the class
could be thread safe.
(I don't recall now if we already made docshell non-thread-safe, but that has been the plan -
at least if we want to cycle collect it)
Comment 25 Nathan Froyd [:froydnj] 2012-06-05 06:58:43 PDT
(In reply to Olli Pettay [:smaug] from comment #24)
> >@@ -4241,7 +4270,7 @@ nsDocument::EndLoad()
> >   if (!mSynchronousDOMContentLoaded) {
> >     nsRefPtr<nsIRunnable> ev =
> >       NS_NewRunnableMethod(this, &nsDocument::DispatchContentLoadedEvents);
> >-    NS_DispatchToCurrentThread(ev);
> >+    GetEventTarget()->Dispatch(ev, NS_DISPATCH_NORMAL);
> 
> Hmm, should document have method for runnable dispatching.

Yes, it should.  Makes dealing with failure modes during shutdown easier.

> >@@ -8572,6 +8602,7 @@ void
> > nsIDocument::ExitFullScreen(bool aRunAsync)
> > {
> >   if (aRunAsync) {
> >+    // XXX would be better if we could GetEventTarget here.
> Why we can't?

ExitFullScreen is static.
Comment 26 Nathan Froyd [:froydnj] 2012-06-05 07:04:50 PDT
Created attachment 630154 [details] [diff] [review]
WIP patch, alternate event dispatching only

New patch, with some feedback comments addressed, and a few more places adjusted.

One thing I'm starting to wonder about is how to ensure that all the necessary places (or at least a sufficient number of them) are redirected through the window's nsIEventTarget.  The ones so far have been trial-and-error with GDB breakpoints on nsThread::Dispatch, but that's not scalable and not likely to catch all the weird cases that might occur.

Also, any sort of network prioritization will likely involve some surgery to Necko's APIs, which seems like a scary prospect.
Comment 27 Olli Pettay [:smaug] 2012-06-05 07:26:43 PDT
Search for all the instances of NS_DispatchToCurrentThread / NS_DispatchToMainThread
in content/, dom/, layout/, editor/, docshell/ ?
Comment 28 Nathan Froyd [:froydnj] 2012-06-05 07:32:55 PDT
(In reply to Olli Pettay [:smaug] from comment #27)
> Search for all the instances of NS_DispatchToCurrentThread /
> NS_DispatchToMainThread
> in content/, dom/, layout/, editor/, docshell/ ?

That's how I started, actually, but the GDB approach seemed better.  IIRC, there weren't a ton of references either.  Will go back with grep at some point; I'm sure all sorts of fun will start happening once these events aren't actually dispatched to the main thread anyway.
Comment 29 Olli Pettay [:smaug] 2012-06-05 07:37:04 PDT
Yes, once we start to change the order of the runnables, odd things may happen.
Comment 30 Nathan Froyd [:froydnj] 2012-06-21 13:01:01 PDT
Created attachment 635426 [details] [diff] [review]
WIP rollup patch

I've made time to work on this bug this week; here's what I have so far.  It almost passes mochitests:

https://tbpl.mozilla.org/?tree=Try&rev=8741629960d9

and all the other test suites too (ignore the mochitest failures, of course):

https://tbpl.mozilla.org/?tree=Try&rev=e67484eee7cd

so I feel pretty good about it.

This version of the patch uses a single event queue for all windows.  Once that's fully debugged, the next big change is to split things up to a per-window event queue.

One sad thing about the patch as it stands is that the event queue redispatches all the events contained therein back to the main thread.  The comment in GlobalEventQueue explains why we need to do this.  I haven't thought hard about what could be done to get away from the need to redispatch, but I have a feeling it'd be fairly difficult.  (You can immediately see the different re-dispatching makes; compare the try push https://tbpl.mozilla.org/?tree=Try&rev=f538ef90fe7e with the first one above.)

All things that need to be directed into the per-window event queue might not be quite yet; I've been working from the pragmatic principle of "if it makes tests work, redirect" but it'd be better to have a good reason why things are given to a thread versus the per-window event queue.
Comment 31 Nathan Froyd [:froydnj] 2012-06-22 13:40:15 PDT
A good bit of the M-oth test failures come from chrome://mochitests/content/chrome/content/base/test/chrome/test_title.xul and failures to fire DOMTitleChange.  These failures only happen when we push events back into the thread's event queue.  I haven't groveled through all the event handling infrastructure, but the failure mode is something like:

- window's event queue gets some events
- event to fire DOMTitleChange gets dispatched, but the document does *not* have a window at this point, so it gets dispatched to the main thread instead, behind the window's event queue
- window's event queue gets to run shoves all the events back onto the main thread
- all those events, which presumably the DOMTitleChange event depended on, are now executing *after* it, with predictably bad results

I think the best bet might be to associate the queue with individual documents, rather than with windows.  That means the queue is always around for the document and avoids inner/outer window issues with queues; anything the window dispatches should be able to go to its document...maybe?  I haven't tried implementing this shuffling around, I'll do that next week.  And of course, there are unknown issues with the remaining M-oth failures that I haven't looked at yet and which are a bit more scattered...
Comment 32 Karl Tomlinson (:karlt) 2012-06-24 16:16:18 PDT
Would it be practical to rethink GlobalEventQueue to run only one event before the "MainThread" loop runs, again?
Comment 33 Nathan Froyd [:froydnj] 2012-06-25 08:46:02 PDT
(In reply to Karl Tomlinson (:karlt) from comment #32)
> Would it be practical to rethink GlobalEventQueue to run only one event
> before the "MainThread" loop runs, again?

So that you'd do something like:

- GlobalEventQueue::Run dispatches one event (A)
- main thread runs one event (B)
- GlobalEventQueue::Run dispatches one event (C)
- main thread runs one event (D)
- ...etc.

?  I don't think that'd necessarily solve the problem described in comment 31; say B and D were already queue'd up when A gets dispatched, so A now executes after D, but D depends on A running first.
Comment 34 Karl Tomlinson (:karlt) 2012-06-25 14:39:22 PDT
Yes, you are right that it doesn't solve the problem is comment 31.
Intrinsic to having separate event queues is the need to have dependent events dispatched to the same queue.

It was an idea to avoid the need for the redispatch in comment 30.  In nested event loops, even if one event won't complete, the others will run.  That is the status quo, but, thinking about it more, events for the same window should not be processed during a nested event loop - the idea is to pause until the nested event loop returns.  However, using the stack to store state just can't (and doesn't) work when there are multiple nested event loops because they need to unravel in reverse order.

I also wondered whether running all queued events for one window in sequence is really what is wanted.  Perhaps round-robin scheduling could be fairer, but I confess I haven't studied how this works.
Comment 35 Nathan Froyd [:froydnj] 2012-06-26 06:49:00 PDT
(In reply to Karl Tomlinson (:karlt) from comment #34)
> It was an idea to avoid the need for the redispatch in comment 30.  In
> nested event loops, even if one event won't complete, the others will run. 
> That is the status quo, but, thinking about it more, events for the same
> window should not be processed during a nested event loop - the idea is to
> pause until the nested event loop returns.  However, using the stack to
> store state just can't (and doesn't) work when there are multiple nested
> event loops because they need to unravel in reverse order.

One possibility for solving nested event loops with per-window event queues is for each thread to maintain a stack of event queues, where the thread's event queue is always present in the stack.  Next event processing would only look at the event queue on the top of the stack, which would generally be the thread's own event queue.  When GlobalEventQueue::Run executes, it would push its event queue onto the stack; nested event loops running several callers down from GlobalEventQueue::Run would then be grabbing events from the window's event queue, rather than the thread's event queue.  And then everybody would (mostly) be happy.

I chatted with bsmedberg briefly on IRC the other day about dealing with nested event loops; perhaps he has comments about how this should all work.

> I also wondered whether running all queued events for one window in sequence
> is really what is wanted.  Perhaps round-robin scheduling could be fairer,
> but I confess I haven't studied how this works.

I don't know what the right solution for scheduling things is either.  The rough priorities are chrome > content, foreground tabs > background tabs, but whether you should dispatch all in one go or round-robin sources from a given priority class or something else entirely is still something to be worked out.
Comment 36 Nathan Froyd [:froydnj] 2012-06-28 14:34:23 PDT
Created attachment 637685 [details] [diff] [review]
WIP rollup patch
Comment 37 Nathan Froyd [:froydnj] 2012-06-28 14:51:37 PDT
Created attachment 637694 [details]
stack frames

Two stack frames: the first is when we call into nsDOMWindowUtils::DispatchToEventTarget from SimpleTest.executeSoon; the second is when we're calling ContentScriptErrorReporter.  We should be going through NS_ScriptErrorReporter and discovering the onerror handler that SimpleTest sets up.
Comment 38 Nathan Froyd [:froydnj] 2012-07-17 07:05:55 PDT
Just a quick update on how this is going: tests with one global queue seem to work OK, except for perma-oranging bug 623242.  I've filed a patch there to fix the intermittent orange.  There are a few Linux debug failures remaining and toolkit/content/tests/chrome/test_mousecapture.xul seems to very frequently orange on Mac, e.g.:

https://tbpl.mozilla.org/?tree=Try&rev=8a701b8bb187

Switching to per-page queues works beautifully, except that it causes massive leaks:

https://tbpl.mozilla.org/?tree=Try&rev=c1bc0acdac18

I'm now working on debugging those.  Fixing those may fix some other intermittent oranges along the way, which would be great.
Comment 39 Nathan Froyd [:froydnj] 2012-07-25 09:18:22 PDT
Created attachment 645790 [details] [diff] [review]
WIP rollup patch

New patch with per-document queues that mostly doesn't leak, but causes some crashes in the cycle collector and perma-oranges some different tests.
Comment 40 Nathan Froyd [:froydnj] 2012-07-31 13:26:39 PDT
Created attachment 647657 [details] [diff] [review]
part 1 - per-document event queue infrastructure

In advance of getting all the orange tests fixed (just one or two more left!), I thought I'd start asking for reviews to try and parallelize things.

This is part 1 of the series, wherein we implement a per-document event queue.  Nothing's redirected there yet, except for timeouts as a proof-of-concept.
Comment 41 Nathan Froyd [:froydnj] 2012-07-31 13:28:57 PDT
Created attachment 647658 [details] [diff] [review]
part 2 - redirect events to the per-document event queue

The very large and very boring part 2, wherein we redirect events to flow through the nearest document's queue.

The browser should be functional at this point, but tests need to be tweaked a little bit.  That's part 3, forthcoming.
Comment 42 Nathan Froyd [:froydnj] 2012-07-31 13:37:12 PDT
Created attachment 647661 [details] [diff] [review]
part 3 - make tests aware of the per-document event queue

mochitests's executeSoon sticks events into the main thread's event queue.  That doesn't work well with the model that the document queues up events and then redispatches them to the event loop.  Consider the following, with an event E that's going to be dispatched via executeSoon:

1. Something that E depends on is dispatched to the document's queue, call it D.  The main thread's event queue now looks like:

  stuff...document's event queue (containing D)

2. executeSoon(E).  The main thread now has:

  stuff...document's event queue (containing D)...E

3. The document's event queue is emptied and all the events contained therein are dispatched back to the main thread.  We then have:

  ...E...events from the document's event queue...D...

and we are in trouble, because E was depending on D, but E will now execute before D.

The solution is to make it possible for JS to dispatch into the document's event loop, which this patch does.  Need test review on the mochitest bits and DOM review on the dom/ bits.
Comment 43 Josh Matthews [:jdm] (away until 9/3) 2012-07-31 13:44:02 PDT
Comment on attachment 647661 [details] [diff] [review]
part 3 - make tests aware of the per-document event queue

>+    winutils.dispatchToEventTarget({
>       run: function() {
>         func();
>       }
>-    }, Ci.nsIThread.DISPATCH_NORMAL);
>+    });

Since nsIRunnable has the function attribute, you should be able to just pass func as the parameter to dispatchToEventTarget.
Comment 44 Joel Maher ( :jmaher) 2012-08-01 02:08:04 PDT
Comment on attachment 647661 [details] [diff] [review]
part 3 - make tests aware of the per-document event queue

Review of attachment 647661 [details] [diff] [review]:
-----------------------------------------------------------------

From the mochitest perspective this is great.

::: testing/mochitest/tests/SimpleTest/SimpleTest.js
@@ +661,5 @@
> +            // XXX can't use SpecialPowers, as this gets used for
> +            // mochitests and mochitest-chrome, and SpecialPowers
> +            // doesn't work in the latter.
> +            var winutils = window.QueryInterface(Components.interfaces.nsIInterfaceRequestor).
> +                               getInterface(Components.interfaces.nsIDOMWindowUtils);

we should be able to get rid of the enablePrivilege call and instead of defining winutils, we can do something like this:
 var winutils = SpecialPowers.getDOMWindowUtils(window);

All of this code is wrapped in a try/catch, and falls back to a setTimeout on failure.  We really should clean this up as much as possible.
Comment 45 Nathan Froyd [:froydnj] 2012-08-01 09:44:21 PDT
(In reply to Joel Maher (:jmaher) from comment #44)
> ::: testing/mochitest/tests/SimpleTest/SimpleTest.js
> @@ +661,5 @@
> > +            // XXX can't use SpecialPowers, as this gets used for
> > +            // mochitests and mochitest-chrome, and SpecialPowers
> > +            // doesn't work in the latter.
> > +            var winutils = window.QueryInterface(Components.interfaces.nsIInterfaceRequestor).
> > +                               getInterface(Components.interfaces.nsIDOMWindowUtils);
> 
> we should be able to get rid of the enablePrivilege call and instead of
> defining winutils, we can do something like this:
>  var winutils = SpecialPowers.getDOMWindowUtils(window);
> 
> All of this code is wrapped in a try/catch, and falls back to a setTimeout
> on failure.  We really should clean this up as much as possible.

I tried doing that initially, I think, hence the XXX comment.  ISTR that SpecialPowers .getDOMWindowUtils doesn't work universally in all the places SimpleTest.js gets used.  I forget exactly what clued me in to this, but suffice to say there was much weeping and gnashing of teeth before figuring this out.  Perhaps the comment just before this block:

    // Once SpecialPowers is available in chrome mochitests, we can replace the
    // body of this function with a call to SpecialPowers.executeSoon().

I assumed the presence of the comment suggested I wasn't making things any worse by the QueryInterface/getInterface dance.

I will attempt to post try run evidence of the not-workingness of SpecialPowers in this case, but try and I are not getting along at the moment.
Comment 46 Justin Lebar (not reading bugmail) 2012-08-01 11:30:28 PDT
> Don't yet free() GlobalTimeoutQueue::sInstance, but that's on the radar.

This is probably fixed now, but FYI, you can use Static{Auto,Ref}Ptr<T> sInstance and then ClearOnShutdown(&sInstance) to get the behavior you want here.
Comment 47 Justin Lebar (not reading bugmail) 2012-08-01 11:39:33 PDT
Comment on attachment 647661 [details] [diff] [review]
part 3 - make tests aware of the per-document event queue

This looks pretty reasonable to me modulo an existential question at the end, although I haven't looked at parts 1 and 2.

>diff --git a/content/events/src/nsDOMEventTargetHelper.h b/content/events/src/nsDOMEventTargetHelper.h
>index 3ba3121..68e6794 100644
>--- a/content/events/src/nsDOMEventTargetHelper.h
>+++ b/content/events/src/nsDOMEventTargetHelper.h
>@@ -14,16 +14,18 @@
> #include "nsPIDOMWindow.h"
> #include "nsIScriptGlobalObject.h"
> #include "nsEventListenerManager.h"
> #include "nsIScriptContext.h"
> #include "nsWrapperCache.h"
> #include "mozilla/ErrorResult.h"
> #include "mozilla/Attributes.h"
> 
>+class nsIRunnable;
>+
> class nsDOMEventListenerWrapper MOZ_FINAL : public nsIDOMEventListener
> {
> public:
>   nsDOMEventListenerWrapper(nsIDOMEventListener* aListener)
>   : mListener(aListener) {}
> 
>   NS_DECL_CYCLE_COLLECTING_ISUPPORTS
>   NS_DECL_CYCLE_COLLECTION_CLASS(nsDOMEventListenerWrapper)

Don't need this hunk, right?

>diff --git a/dom/base/nsDOMWindowUtils.cpp b/dom/base/nsDOMWindowUtils.cpp
>index 616a694..0f2f861 100644
>--- a/dom/base/nsDOMWindowUtils.cpp
>+++ b/dom/base/nsDOMWindowUtils.cpp
>@@ -1714,16 +1714,35 @@ nsDOMWindowUtils::SendContentCommandEvent(const nsAString& aType,
> NS_IMETHODIMP
>+nsDOMWindowUtils::DispatchToEventTarget(nsIRunnable *aRunnable)
>+{

You need a UniversalXPConnectPrivilege check here (as elsewhere in the file),
because AIUI content can call this if it tries hard enough.

>+  nsCOMPtr<nsPIDOMWindow> window = do_QueryReferent(mWindow);
>+  NS_ENSURE_TRUE(window, NS_ERROR_FAILURE);
>+
>+  nsCOMPtr<nsIDOMDocument> domdoc;
>+  nsresult rv = window->GetDocument(getter_AddRefs(domdoc));
>+  if (NS_FAILED(rv)) {
>+    return rv;
>+  }
>+
>+  nsCOMPtr<nsIDocument> doc = do_QueryInterface(domdoc);
>+  if (!doc) {
>+    return NS_ERROR_FAILURE;
>+  }
>+  return doc->DispatchToEventTarget(aRunnable);
>+}

Maybe it makes more sense in the context of the other two patches, but
"DispatchToEventTarget" doesn't make much sense to me.  What you're really
doing is dispatching to a private event queue, right?  The fact that event
targets are involved seems only tangentially related.

>diff --git a/dom/interfaces/base/nsIDOMWindowUtils.idl b/dom/interfaces/base/nsIDOMWindowUtils.idl
>index 1401a9b..a4dd5ad 100644
>--- a/dom/interfaces/base/nsIDOMWindowUtils.idl
>+++ b/dom/interfaces/base/nsIDOMWindowUtils.idl
> [scriptable, uuid(b276a71e-21ee-4be6-894d-4039523e1ad8)]
> interface nsIDOMWindowUtils : nsISupports {

Change the UUID.


It's also not clear to me why SimpleTest.executeSoon doesn't just do
setTimeout(0).  What is this code supposed to do that's different from
setTimeout(0)?
Comment 48 Nathan Froyd [:froydnj] 2012-08-01 11:55:03 PDT
(In reply to Justin Lebar [:jlebar] from comment #46)
> > Don't yet free() GlobalTimeoutQueue::sInstance, but that's on the radar.
> 
> This is probably fixed now, but FYI, you can use Static{Auto,Ref}Ptr<T>
> sInstance and then ClearOnShutdown(&sInstance) to get the behavior you want
> here.

It is fixed, since the queues are now owned by nsIDocument and go away appropriately upon destruction.  But good to know in any event!

(In reply to Justin Lebar [:jlebar] from comment #47)
> Comment on attachment 647661 [details] [diff] [review]
> part 3 - make tests aware of the per-document event queue
> 
> This looks pretty reasonable to me modulo an existential question at the
> end, although I haven't looked at parts 1 and 2.
> 
> >diff --git a/content/events/src/nsDOMEventTargetHelper.h b/content/events/src/nsDOMEventTargetHelper.h
> >+class nsIRunnable;
> 
> Don't need this hunk, right?

I think it is needed so the declaration of DispatchToEventTarget is valid.  I certainly do remember needing it at one point, but I collapsed my patch queue quite a bit to produce these reviewable patches; it's possible it's no longer required.  I will double-check.

> >diff --git a/dom/base/nsDOMWindowUtils.cpp b/dom/base/nsDOMWindowUtils.cpp
>
> You need a UniversalXPConnectPrivilege check here (as elsewhere in the file),
> because AIUI content can call this if it tries hard enough.

Good to know, thanks.

> >+  nsCOMPtr<nsPIDOMWindow> window = do_QueryReferent(mWindow);
> >+  NS_ENSURE_TRUE(window, NS_ERROR_FAILURE);
> >+
> >+  nsCOMPtr<nsIDOMDocument> domdoc;
> >+  nsresult rv = window->GetDocument(getter_AddRefs(domdoc));
> >+  if (NS_FAILED(rv)) {
> >+    return rv;
> >+  }
> >+
> >+  nsCOMPtr<nsIDocument> doc = do_QueryInterface(domdoc);
> >+  if (!doc) {
> >+    return NS_ERROR_FAILURE;
> >+  }
> >+  return doc->DispatchToEventTarget(aRunnable);
> >+}
> 
> Maybe it makes more sense in the context of the other two patches, but
> "DispatchToEventTarget" doesn't make much sense to me.  What you're really
> doing is dispatching to a private event queue, right?  The fact that event
> targets are involved seems only tangentially related.

The idea was that I'd move everything to dispatching to an event target, which would be the main thread initially.  Then we'd swap out the main thread for a per-document queue.  That was more-or-less the way the patch series was debugged, though it's not the way it was packaged up for review.

Naming suggestions welcome; I was going for something sufficiently different from the DOM event bits that it would be obvious which sort of events were being dealt with.  If a DispatchEvent overload or similar would be preferred, I'll happily s/// the patch.

> >diff --git a/dom/interfaces/base/nsIDOMWindowUtils.idl b/dom/interfaces/base/nsIDOMWindowUtils.idl
>
> Change the UUID.

Doh, of course.

> It's also not clear to me why SimpleTest.executeSoon doesn't just do
> setTimeout(0).  What is this code supposed to do that's different from
> setTimeout(0)?

I do not know the mochitest history here, but inserting into an event queue has the advantage of being (more?) deterministic.  After fixing some intermittent oranges along the way, I support any and all alternatives to setTimeout.
Comment 49 Justin Lebar (not reading bugmail) 2012-08-01 12:05:22 PDT
> I do not know the mochitest history here, but inserting into an event queue has the advantage of 
> being (more?) deterministic.  After fixing some intermittent oranges along the way, I support any 
> and all alternatives to setTimeout.

But did you fix any intermittent oranges caused by setTimeout(0) and fixed by SimpleTest.executeSoon()?

You're now the expert on how setTimeout works; I was hoping you could tell me what's the difference between setTimeout(0) and executeSoon (under your patches).  :)

I guess one difference would be that executeSoon() ought to execute soon even in background tabs, while we will throttle setTimeout's.  But it seems like, for the purposes of our test suite, we could just disable background throttling and get the same effect.

> If a DispatchEvent overload or similar would be preferred, I'll happily s/// the patch.

DispatchToPrivateQueue, DispatchToOwnEventQueue, or DispatchToDocumentEventQueue, maybe?
Comment 50 Nathan Froyd [:froydnj] 2012-08-01 13:21:49 PDT
(In reply to Justin Lebar [:jlebar] from comment #49)
> > I do not know the mochitest history here, but inserting into an event queue has the advantage of 
> > being (more?) deterministic.  After fixing some intermittent oranges along the way, I support any 
> > and all alternatives to setTimeout.
> 
> But did you fix any intermittent oranges caused by setTimeout(0) and fixed
> by SimpleTest.executeSoon()?

No.  The oranges were using executeSoon already, but they were assuming too much about when the handler would execute relative to what they were trying to watch for.  So that doesn't argue for explicit insertion into the event queue versus implicit insertion via setTimeout.

> You're now the expert on how setTimeout works; I was hoping you could tell
> me what's the difference between setTimeout(0) and executeSoon (under your
> patches).  :)

Methinks you overestimate my competency. :)

I assume that executeSoon was introduced partially to prevent people from dealing with oranges by raising the timeout value.  I don't have any good practical insight currently as to why it's preferable to go through event queues if possible inside executeSoon itself.  Should try and see what breaks.

> > If a DispatchEvent overload or similar would be preferred, I'll happily s/// the patch.
> 
> DispatchToPrivateQueue, DispatchToOwnEventQueue, or
> DispatchToDocumentEventQueue, maybe?

I'll go with DispatchToDocumentEventQueue for +1 review point, Alex.
Comment 51 Justin Lebar (not reading bugmail) 2012-08-01 13:23:39 PDT
Comment on attachment 647661 [details] [diff] [review]
part 3 - make tests aware of the per-document event queue

r=me, but would you mind testing whether we can avoid this rigamarole by making executeSoon a synonym for setTimeout(0)?  I'd certainly prefer that.
Comment 52 Nathan Froyd [:froydnj] 2012-08-03 14:53:34 PDT
So here's an interesting test case that I spent a while chasing down.  From:

http://mxr.mozilla.org/mozilla-central/source/content/events/test/test_bug493251.html?force=1

The part that's interesting is:

http://mxr.mozilla.org/mozilla-central/source/content/events/test/test_bug493251.html?force=1#138

which I'll reproduce here, with a bit more context:

  function doTest() {
    win.document.getElementsByTagName("input")[0].focus();
    win.addEventListener("keydown",
                         function(e) { dumpEvent(e); ++keyDown; }, true);
    win.addEventListener("keypress",
                         function(e) { dumpEvent(e); ++keyPress; }, true);
    win.addEventListener("keyup",
                         function(e) { dumpEvent(e); ++keyUp; }, true);
    win.addEventListener("mousedown",
                         function(e) { dumpEvent(e); ++mouseDown; }, true);
    win.addEventListener("mouseup",
                         function(e) { dumpEvent(e); ++mouseUp; }, true);
    win.addEventListener("click",
                         function(e) { dumpEvent(e); ++mouseClick; }, true);

    ok(true, "doTest #1...");
    dispatchKeyEvent("keydown");
    dispatchKeyEvent("keypress");
    dispatchKeyEvent("keyup");
    is(keyDown, 1, "Wrong number events (1)");
    is(keyPress, 1, "Wrong number events (2)");
    is(keyUp, 1, "Wrong number events (3)");

    ok(true, "doTest #2...");
    suppressEventHandling(true);
    dispatchKeyEvent("keydown");
    dispatchKeyEvent("keypress");
    dispatchKeyEvent("keyup");
    is(keyDown, 1, "Wrong number events (4)");
    is(keyPress, 1, "Wrong number events (5)");
    is(keyUp, 1, "Wrong number events (6)");
    suppressEventHandling(false);
    is(keyDown, 1, "Wrong number events (7)");
    is(keyPress, 1, "Wrong number events (8)");
    is(keyUp, 1, "Wrong number events (9)");

    setTimeout(continueTest1, 0);
    }

  function continueTest1() {
    ok(true, "continueTest1...");
    dispatchKeyEvent("keydown");
    suppressEventHandling(true);
    dispatchKeyEvent("keypress");
    dispatchKeyEvent("keyup");
    is(keyDown, 2, "Wrong number events (10)");
    is(keyPress, 1, "Wrong number events (11)");
    is(keyUp, 1, "Wrong number events (12)");
    suppressEventHandling(false);
    setTimeout(continueTest2, 0);
  }

  function continueTest2() {
    ok(true, "continueTest2 #1...");
    is(keyDown, 2, "Wrong number events (13)");
    is(keyPress, 2, "Wrong number events (14)");
    is(keyUp, 2, "Wrong number events (15)");

    dispatchMouseEvent("mousedown", 5, 5, 0, 1, 0);
    dispatchMouseEvent("mouseup", 5, 5, 0, 1, 0);
    is(mouseDown, 1, "Wrong number events (16)");
    is(mouseUp, 1, "Wrong number events (17)");
    is(mouseClick, 1, "Wrong number events (18)");

    ok(true, "continueTest2 #2...");
    suppressEventHandling(true);
    dispatchMouseEvent("mousedown", 5, 5, 0, 1, 0);
    dispatchMouseEvent("mouseup", 5, 5, 0, 1, 0);
    suppressEventHandling(false);
    is(mouseDown, 1, "Wrong number events (19)");
    is(mouseUp, 1, "Wrong number events (20)");
    is(mouseClick, 1, "Wrong number events (21)");

    setTimeout(continueTest3, 0);
  }

What this test is attempting to confirm is that we hold the keypress and keyup events until after call suppressEventHandling(false), at which point they are dispatched to the window, and thus to the event handlers set up in doTest.  This test has intermittent failures currently (bug 565245), but with these patches, it would fairly consistent fail *just* the (14) and (15) tests, mostly on Mac, but occasionally on Windows.

I do not have an explanation for why we have intermittent failures in bug 565245; those are subtle and mysterious.  But these new failures have a reasonably simple explanation.

Notice that we're setting up the event handlers (and all the event handling suppression, event dispatches, and so forth), on |win|, but we're actually calling setTimeout on |window|.  Under this patch series, that means that we have a race between |win| and |window| about who gets to deposit their events into the event queue first: is |win| going to properly unblock event handling before |window|'s timeout fired (the Right Thing) or vice versa, which is responsible for test (14) and (15) failures cited above?

I don't know whether this could cause web compat issues (suppressEventHandling and the code behind it is chrome-only), but it is nicely illustrative of the ordering issues that come up with per-document/window event queues.
Comment 53 Justin Lebar (not reading bugmail) 2012-08-03 15:52:49 PDT
> I don't know whether this could cause web compat issues

Yeah, I wonder if you'll need an event queue per "zone" as bhackett defines it in bug 718121.  I think a zone is roughly "all the documents which transitively can touch each other" (so A and B are in the same zone if there exists a path of documents x1, x2, ..., xn with A=x1, B=xn, such that x1 can touch x2 can touch x3, ... can touch xn).  But I also recall Brian telling me that wasn't quite right.
Comment 54 Nathan Froyd [:froydnj] 2012-08-06 07:52:30 PDT
(In reply to Justin Lebar [:jlebar] from comment #53)
> > I don't know whether this could cause web compat issues
> 
> Yeah, I wonder if you'll need an event queue per "zone" as bhackett defines
> it in bug 718121.  I think a zone is roughly "all the documents which
> transitively can touch each other" (so A and B are in the same zone if there
> exists a path of documents x1, x2, ..., xn with A=x1, B=xn, such that x1 can
> touch x2 can touch x3, ... can touch xn).  But I also recall Brian telling
> me that wasn't quite right.

I'm guessing this is roughly what the HTML5 spec calls "unit of related similar-origin browsing contexts"?

http://www.whatwg.org/specs/web-apps/current-work/#unit-of-related-similar-origin-browsing-contexts

If so, it might be better to restructure how the patch series works.
Comment 55 Nathan Froyd [:froydnj] 2012-08-08 10:46:51 PDT
(In reply to Justin Lebar [:jlebar] from comment #53)
> > I don't know whether this could cause web compat issues
> 
> Yeah, I wonder if you'll need an event queue per "zone" as bhackett defines
> it in bug 718121.

Can this roughly be accomplished via the existing docshell and/or parent/child document mechanism, or does it require something more sophisticated?
Comment 56 Justin Lebar (not reading bugmail) 2012-08-08 10:57:45 PDT
> Can this roughly be accomplished via the existing docshell and/or parent/child document mechanism, 
> or does it require something more sophisticated?

A zone can span parent/child boundaries (think window.open).  I'm not sure what docshell mechanism you're talking about...
Comment 57 Boris Zbarsky [:bz] 2012-08-08 11:12:03 PDT
I would love us to introduce explicit tracking of the "unit of related similar-origin browsing contexts"....
Comment 58 Nathan Froyd [:froydnj] 2012-08-09 09:04:33 PDT
(In reply to Justin Lebar [:jlebar] from comment #56)
> > Can this roughly be accomplished via the existing docshell and/or parent/child document mechanism, 
> > or does it require something more sophisticated?
> 
> A zone can span parent/child boundaries (think window.open).  I'm not sure
> what docshell mechanism you're talking about...

docshells can have children...but that's an admittedly superficial reading of docshell code, it might be something totally different than what's being dealt with here.

(In reply to Boris Zbarsky (:bz) [In and out Aug 1 - 10, out Aug 11-20] from comment #57)
> I would love us to introduce explicit tracking of the "unit of related
> similar-origin browsing contexts"....

Beyond spec checkmarks, what's the advantage of doing this?  Does it make life easier for some things?
Comment 59 Justin Lebar (not reading bugmail) 2012-08-09 09:09:22 PDT
> docshells can have children...

These mirror exactly the document parent/child tree.
Comment 60 Boris Zbarsky [:bz] 2012-08-09 10:26:15 PDT
> Does it make life easier for some things?

Well, it would make it easier to implement parts of the spec phrased in term of that concept if we had one place that maintained that concept instead of having every consumer have to figure it out.
Comment 61 Nathan Froyd [:froydnj] 2012-08-09 11:35:17 PDT
Created attachment 650628 [details] [diff] [review]
part 1 - redirect events to the nearest document's event queue

There appears to be agreement that we need something more sophisticated than a simple per-document event queue.  While I work out what that might involve, how does this patch look for a stepping stone?  We'll hide dispatching to threads behind nsIDocument::{GetEventTarget,DispatchToEventTarget} and then at some future point, those methods can grow to be more complex.

Justin, rest assured that your objection to |DispatchToEventTarget| as a function name has been registered. :)

bz, if you happen to look at this, that'd be great; please bounce it to somebody else if you run out of or simply don't have the time to look at it.
Comment 62 Boris Zbarsky [:bz] 2012-08-09 11:59:51 PDT
Comment on attachment 650628 [details] [diff] [review]
part 1 - redirect events to the nearest document's event queue

This seems reasonable in general.  One thing worth thinking about is assigning the terminology with the spec so that people trying to implement spec bits have an easier time figuring out how to do it.  So maybe call the event-posting method QueueTask and have it take an enum for the task source to use (either now or eventually; I'm not sure we really need separate task sources quite yet) or something along those lines?

(For the record the spec terminology is "queue a task" for posting an event, "task queue" for an event queue, "task source" for a source of tasks that need to come in queuing order; there can be multiple task sources for a single task queue.)
Comment 63 Justin Lebar (not reading bugmail) 2012-08-10 13:07:25 PDT
Comment on attachment 650628 [details] [diff] [review]
part 1 - redirect events to the nearest document's event queue

I presume bz's f+ is sufficient for you, Nathan?  It's certainly sufficient for me!
Comment 64 :Ms2ger (⌚ UTC+1/+2) 2012-08-11 01:45:33 PDT
Comment on attachment 647661 [details] [diff] [review]
part 3 - make tests aware of the per-document event queue

Review of attachment 647661 [details] [diff] [review]:
-----------------------------------------------------------------

::: dom/base/nsDOMWindowUtils.cpp
@@ +1729,5 @@
> +  if (NS_FAILED(rv)) {
> +    return rv;
> +  }
> +
> +  nsCOMPtr<nsIDocument> doc = do_QueryInterface(domdoc);

Just use window->GetDoc()
Comment 65 Nathan Froyd [:froydnj] 2012-10-31 07:37:32 PDT
Created attachment 677005 [details] [diff] [review]
current rollup patch

Fixed a number of intermittent/perma-oranges with this patch and got wise to dispatching timer expirations to the document's event target.  Still chasing some mochitest-3 leaks and other intermittent oranges, including a few related to scrolling and link clicking.  Also need to add some semblance of HTML5 spec conformance.
Comment 66 Gian-Carlo Pascutto [:gcp] 2012-12-05 02:17:01 PST
Nathan, the file you uploaded is just the diffstat, not the actual patch.
Comment 67 Nathan Froyd [:froydnj] 2012-12-05 08:00:26 PST
Created attachment 688792 [details] [diff] [review]
current rollup patch

Whoops, this patch should be much more representative.  Try run doesn't look too bad either:

https://tbpl.mozilla.org/?tree=Try&rev=79021f81ee25
Comment 68 Nathan Froyd [:froydnj] 2012-12-05 10:48:22 PST
Created attachment 688863 [details] [diff] [review]
part 1 - redirect events through the nearest document's task queue

Everything DOM-related needs to go through the document's task queue, rather
than being thrown at the main thread.  This patch is just setting up all the
redirecting; future patches will put in a more interesting event target.
Comment 69 Nathan Froyd [:froydnj] 2012-12-05 10:50:50 PST
Created attachment 688864 [details] [diff] [review]
part 2 - make nsDocument manage its own event queue rather than deferring to the current thread

This patch adds a place for a document to keep track of its own events.

The checks for non-nullptr event targets in some places are to facilitate
easy debugging: one can switch to the old everything-on-the-main-thread
style by not initializing the event target in nsIDocument::nsIDocument.
Comment 70 Nathan Froyd [:froydnj] 2012-12-05 10:51:33 PST
Created attachment 688865 [details] [diff] [review]
part 3 - post timers to the correct nsIEventTarget

Timers need to flow through the document's task queue as well.
Comment 71 Nathan Froyd [:froydnj] 2012-12-05 10:52:52 PST
Created attachment 688867 [details] [diff] [review]
part 4 - update MockFilePicker.init() to take a window for event dispatching purposes

MockFilePicker needs to be spinning the correct window's event loop.  This
patch makes that happen.
Comment 72 Nathan Froyd [:froydnj] 2012-12-05 10:53:23 PST
Created attachment 688869 [details] [diff] [review]
part 5 - fix executeSoon to be window-aware

Don't use SpecialPowers to implement executeSoon for normal mochitests;
SpecialPowers doesn't have a good way of accessing the window/document
and injecting events into the document's task queue.

Similarly, browser-chrome mochitests need to be able to pass the window
into executeSoon so we'll spin the event loop of the correct window.
Comment 73 Nathan Froyd [:froydnj] 2012-12-05 10:58:10 PST
Created attachment 688870 [details] [diff] [review]
part 6 - make iframes and window.open'd windows share event targets with parents

Windows created via window.open need to share a task queue with their opener.
Similarly, iframes need to share a task queue with their parent.  This is the
minimum amount of work required to make that happen.

This explicitly does not deal with any of the spec language around groups of
windows related by domains.  Mostly because I didn't want to spend time
ferretting out all the places when document URL setting, etc. could happen.
And implementing correct sharing (and when necessary, divorcing) behavior
doesn't make any difference as far as security goes.  So I think it can be
deferred to a followup bug.
Comment 74 Nathan Froyd [:froydnj] 2012-12-05 10:59:39 PST
Created attachment 688871 [details] [diff] [review]
part 7 - fixup tests to call executeSoon/setTimeout on the correct windows

Now that setTimeout doesn't spin the global event loop, but only the
per-window one, we need to fix some tests, even disabling some if it's
not clear how to make cross-window synchronization work properly.
Comment 75 Nathan Froyd [:froydnj] 2012-12-05 11:03:02 PST
Created attachment 688874 [details] [diff] [review]
part 4 - update MockFilePicker.init() to take a window for event dispatching purposes

MockFilePicker needs to be spinning the correct window's event loop.  This
patch makes that happen.
Comment 76 Joel Maher ( :jmaher) 2012-12-05 11:18:26 PST
Comment on attachment 688874 [details] [diff] [review]
part 4 - update MockFilePicker.init() to take a window for event dispatching purposes

Review of attachment 688874 [details] [diff] [review]:
-----------------------------------------------------------------

lgtm
Comment 77 Joel Maher ( :jmaher) 2012-12-05 11:21:04 PST
Comment on attachment 688869 [details] [diff] [review]
part 5 - fix executeSoon to be window-aware

Review of attachment 688869 [details] [diff] [review]:
-----------------------------------------------------------------

::: testing/mochitest/tests/SimpleTest/SimpleTest.js
@@ -650,5 @@
>   */
>  SimpleTest.executeSoon = function(aFunc) {
> -    if ("SpecialPowers" in window) {
> -        return SpecialPowers.executeSoon(aFunc, window);
> -    }

should this take a window argument as well?  Seems odd to just delete this when we are adding a window argument to the browser-test.js version.
Comment 78 Nathan Froyd [:froydnj] 2012-12-06 10:54:40 PST
Ehsan asked me whether I had done performance testing.  I pushed the series to try yesterday:

http://perf.snarkfest.net/compare-talos/index.html?oldRevs=5d986fddd6f7&newRev=e32e4ad69f19&submit=true

Dromaeo gets a little worse (noise?).  ts_paint and tpaint look absolutely awful on Mac 10.6; tpaint on Linux64 is a little worrisome as well.
Comment 79 Nathan Froyd [:froydnj] 2012-12-06 11:01:27 PST
(In reply to Joel Maher (:jmaher) from comment #77)
> ::: testing/mochitest/tests/SimpleTest/SimpleTest.js
> @@ -650,5 @@
> >   */
> >  SimpleTest.executeSoon = function(aFunc) {
> > -    if ("SpecialPowers" in window) {
> > -        return SpecialPowers.executeSoon(aFunc, window);
> > -    }
> 
> should this take a window argument as well?  Seems odd to just delete this
> when we are adding a window argument to the browser-test.js version.

I agree that it's a bit odd.  The discrepancy here is because for mochitest, all windows opened by the test will share an event queue, so we don't need to worry about which window is setting the timeout.  (We do, however, need to get it into said event queue, rather than dispatching it to the main thread, which is why we're not using SpecialPowers for doing so anymore.)  Whereas for browser-test.js, we're usually opening different chrome windows, which don't share an event queue (and which we may also want to change; I could see addons getting bit hard by this bug), and so we need to be concerned about what window we're waiting on.

You can see this in patch 7: all the tweaked tests are browser-chrome tests.  The only one that's a vanilla mochitest probably doesn't need the tweaks I applied to it anymore.  I should go fix that.
Comment 80 Boris Zbarsky [:bz] 2012-12-07 21:51:38 PST
Nathan, I'm going to try to do the reviews here early next week.
Comment 81 Nathan Froyd [:froydnj] 2012-12-10 08:06:22 PST
Created attachment 690394 [details] [diff] [review]
redirect events through the nearest document's task queue

Now without over-eager renaming, causing things to not compile.
Comment 82 Boris Zbarsky [:bz] 2012-12-12 18:50:13 PST
Comment on attachment 688864 [details] [diff] [review]
part 2 - make nsDocument manage its own event queue rather than deferring to the current thread

>+++ b/content/base/src/nsDocument.cpp
>+class nsIDocument::EventQueue : public nsRunnable
>+  mozilla::Mutex mMutex;
>+  nsEventQueue mEventQueue;
>+  bool mEventsAreDoomed;

Please document which members or functions are protected by mMutex (looks like mEventsAreDoomed and the "we are empty" state of mEventQueue?) and what mEventsAreDoomed means.

>+  static const bool sRequeueEventsToMainThread = true;

Hmm.  Is the plan to get rid of this at some point?

> nsIDocument::~nsIDocument()
>+  // Avoid a race in the code:
>+  //   queue->Drain();
>+  //   mEventQueue = nullptr;
>+  // where an event could be added between the call to |Drain| and
>+  // releasing the queue.

Please make it clear that this could happen if one of the events being drained enqueued further events, not because of any sort of threading thing?

>@@ -1762,16 +1816,22 @@ NS_IMPL_CYCLE_COLLECTION_UNLINK_BEGIN(nsDocument)
>+    nsIDocument::EventQueue* queue
>+      = static_cast<nsIDocument::EventQueue*>(tmp->mEventQueue.get());
>+    queue->Drain();
>+    tmp->mEventQueue = nullptr;

Why doesn't this need to swap into a stack-local the way the destructor does?  Seems odd to me.  If, as I suspect, it does, maybe factor out this code into a private helper method?

>+nsIDocument::SetTaskQueue(nsIEventTarget* aTarget)
>+  if (mEventQueue) {
>+    mEventQueue = aTarget;

I don't understand this null-check.  Is the point that mEventQueue is only null if we've been unlinked, and in that case we don't want to pick up a new event target?  If so, please document that!

Should SetTaskQueue take an nsIDocument::EventQueue*, perhaps, to make it clear why the casts in unlink/~nsIDocument are ok?

>@@ -6793,16 +6873,20 @@ nsDocument::Destroy()
>+  if (mEventQueue) {
>+    static_cast<nsIDocument::EventQueue*>(mEventQueue.get())->Shutdown();

So after this we stop accepting events in mEventQueue, and forward them to the main-thread event queue instead... unless someone calls SetTaskQueue, right?  Is that purposeful?  That is, can a document end up with a resurrected event queue after Destroy()?

>+nsIDocument::EventQueue::Run()

So this will run to completion...  That's fine while all it does is redispatch to the main thread event queue, but later on we'll need to do something to prevent one document event queue starving others, right?

r=me with the above nits
Comment 83 Boris Zbarsky [:bz] 2012-12-12 18:50:43 PST
Comment on attachment 688865 [details] [diff] [review]
part 3 - post timers to the correct nsIEventTarget

r=me
Comment 84 Boris Zbarsky [:bz] 2012-12-12 18:51:05 PST
Comment on attachment 688870 [details] [diff] [review]
part 6 - make iframes and window.open'd windows share event targets with parents

r=me, though note that this means all tabs in a given Firefox window share the same event queue.  Is that the plan?

Also, you may need to set the document event queue on external resource documents.
Comment 85 Nathan Froyd [:froydnj] 2012-12-13 07:46:29 PST
Hm, part 3 is going to have to get shuffled around thanks to the recent refresh driver changes.
Comment 86 Nathan Froyd [:froydnj] 2012-12-13 08:45:46 PST
Thanks for the reviews.

(In reply to Boris Zbarsky (:bz) from comment #82)
> >+++ b/content/base/src/nsDocument.cpp
> >+class nsIDocument::EventQueue : public nsRunnable
> >+  mozilla::Mutex mMutex;
> >+  nsEventQueue mEventQueue;
> >+  bool mEventsAreDoomed;
> 
> Please document which members or functions are protected by mMutex (looks
> like mEventsAreDoomed and the "we are empty" state of mEventQueue?) and what
> mEventsAreDoomed means.

Will do.

> >+  static const bool sRequeueEventsToMainThread = true;
> 
> Hmm.  Is the plan to get rid of this at some point?

I would like to get rid of it.  But that seems like a rather large task, see e.g. the discussion in bug 820411 about this sort of thing.  I think it is useful to have there as an explanation of what we aspire to and why we can't do it that way, but I'm happy to take it out.

> > nsIDocument::~nsIDocument()
> >+  // Avoid a race in the code:
> >+  //   queue->Drain();
> >+  //   mEventQueue = nullptr;
> >+  // where an event could be added between the call to |Drain| and
> >+  // releasing the queue.
> 
> Please make it clear that this could happen if one of the events being
> drained enqueued further events, not because of any sort of threading thing?

OK.

> >@@ -1762,16 +1816,22 @@ NS_IMPL_CYCLE_COLLECTION_UNLINK_BEGIN(nsDocument)
> >+    nsIDocument::EventQueue* queue
> >+      = static_cast<nsIDocument::EventQueue*>(tmp->mEventQueue.get());
> >+    queue->Drain();
> >+    tmp->mEventQueue = nullptr;
> 
> Why doesn't this need to swap into a stack-local the way the destructor
> does?  Seems odd to me.  If, as I suspect, it does, maybe factor out this
> code into a private helper method?

Good point.

> >+nsIDocument::SetTaskQueue(nsIEventTarget* aTarget)
> >+  if (mEventQueue) {
> >+    mEventQueue = aTarget;
> 
> I don't understand this null-check.  Is the point that mEventQueue is only
> null if we've been unlinked, and in that case we don't want to pick up a new
> event target?  If so, please document that!

The null-check here was to make my life easier when debugging this; with a #define, the initialization of mEventQueue in the constructor could be commented out and I could compare the everything-happens-on-the-main-thread behavior with the individual-document behavior.  But I suppose it can help in the case when mEventQueue has already gone away, too.  Will add a comment.

> Should SetTaskQueue take an nsIDocument::EventQueue*, perhaps, to make it
> clear why the casts in unlink/~nsIDocument are ok?

I was trying to keep the concrete EventQueue class out of nsIDocument to avoid having to pull in a bunch of additional headers.  I agree the static_casts are ugly, but I'm not sure there's a better way to go about it.

> >@@ -6793,16 +6873,20 @@ nsDocument::Destroy()
> >+  if (mEventQueue) {
> >+    static_cast<nsIDocument::EventQueue*>(mEventQueue.get())->Shutdown();
> 
> So after this we stop accepting events in mEventQueue, and forward them to
> the main-thread event queue instead... unless someone calls SetTaskQueue,
> right?  Is that purposeful?  That is, can a document end up with a
> resurrected event queue after Destroy()?

That shouldn't happen.  I will add a check in SetTaskQueue to DTRT.

> >+nsIDocument::EventQueue::Run()
> 
> So this will run to completion...  That's fine while all it does is
> redispatch to the main thread event queue, but later on we'll need to do
> something to prevent one document event queue starving others, right?

Yes, we'll need to do something here.

(In reply to Boris Zbarsky (:bz) from comment #84)
> Comment on attachment 688870 [details] [diff] [review]
> part 6 - make iframes and window.open'd windows share event targets with
> parents
> 
> r=me, though note that this means all tabs in a given Firefox window share
> the same event queue.  Is that the plan?

Oh!  Hm, no, that's not the plan.  Should fix that--if that's the case.  I don't recall seeing any sharing (via debugging and such) when opening unrelated tabs, though my memory could be faulty and you know this code better than I.

> Also, you may need to set the document event queue on external resource
> documents.

Can you explain this a little further?  I do not follow what you're saying here.
Comment 87 Boris Zbarsky [:bz] 2012-12-13 10:15:10 PST
> I think it is useful to have there as an explanation of what we aspire to and why we
> can't do it that way

Yeah, that's fine.

> I was trying to keep the concrete EventQueue class out of nsIDocument 

You don't need the concrete class there to declare a pointer arg; a forward-declaration is enough, as long as the concrete class is visible in the scope of the implementation of SetTaskQueue, right?

> Should fix that--if that's the case. 

Worth checking what documents go through SetSubDocumentFor.  I would have expected the <browser>s of a Firefox <tabbrowser> to go through there too.

> Can you explain this a little further?  

SVG allows loading extra SVG documents to be used as sources of paint servers (filters, clips, etc).  These are stored in a hashtable in nsDocument, and can have runnables happening in them (most obviously, SMIL stuff).  See nsExternalResourceMap::PendingLoad::SetupViewer for where those documents are created; presumably you want to set their event target to that of mDisplayDocument.

One other question on this bit, though.  It looks like we can reset the event target of a document multiple times (e.g. the whole SetSubDocumentFor thing can be called multiple times on the same document in some cases).  Should setting the event target enumerate subdocuments and external resources?  Or should it be impossible to reset the event target more than once?
Comment 88 Boris Zbarsky [:bz] 2012-12-14 22:11:53 PST
Comment on attachment 690394 [details] [diff] [review]
redirect events through the nearest document's task queue

Why is the interface requestor bit on document needed?  Might be worth documenting, at least.  But it seems to be unused...

>+++ b/content/events/src/nsDOMEventTargetHelper.cpp
>+    return NS_DispatchToMainThread(aRunnable, NS_DISPATCH_NORMAL);

Drop the second arg?

>+++ b/content/html/content/src/nsGenericHTMLElement.cpp
>+    rv = aDocument->DispatchToTaskQueue(event);

aDocument can be null there (though arguably we shouldn't do autofocus stuff if it is).  Can you null-check, and file a bug on Mounir about hoisting that null-check higher up?

In the global window code, I'd be a little worried about mDoc being null in various places.

Same thing in nsPresContext notification methods, since those might fire after it's been torn down, I think.

In the various frame code, the return value of GetContent can just be assigned to an nsIContent*, I would think.

With those nits, r=me
Comment 89 Nathan Froyd [:froydnj] 2013-01-04 09:52:43 PST
Comment on attachment 688869 [details] [diff] [review]
part 5 - fix executeSoon to be window-aware

Asking for re-review of this patch based on the explanation in comment 79.  Testing has confirmed that normal mochitests don't need any tweaking; all the issues are in browser-chrome mochitests.
Comment 90 Joel Maher ( :jmaher) 2013-01-04 10:18:56 PST
Comment on attachment 688869 [details] [diff] [review]
part 5 - fix executeSoon to be window-aware

Review of attachment 688869 [details] [diff] [review]:
-----------------------------------------------------------------

this still makes sense.
Comment 91 Nathan Froyd [:froydnj] 2013-01-04 11:56:07 PST
(In reply to Boris Zbarsky (:bz) from comment #87)
> > I was trying to keep the concrete EventQueue class out of nsIDocument 
> 
> You don't need the concrete class there to declare a pointer arg; a
> forward-declaration is enough, as long as the concrete class is visible in
> the scope of the implementation of SetTaskQueue, right?

Mmm, I suppose so.  In that case, it'd be nice to just have nsRefPtr<EventQueue> for the event queue member, but that definitely needs the concrete definition available. =/

> > Should fix that--if that's the case. 
> 
> Worth checking what documents go through SetSubDocumentFor.  I would have
> expected the <browser>s of a Firefox <tabbrowser> to go through there too.

Yeah, you are correct.  So we can have content documents with chrome parents?  That seems like asking for trouble.  For task queue sharing purposes, can I check for a chrome doc parent in SetSubDocumentFor and not do the sharing if so, or is there some other bit of code I should be looking at?

> > Can you explain this a little further?  
> 
> SVG allows loading extra SVG documents to be used as sources of paint
> servers (filters, clips, etc).  These are stored in a hashtable in
> nsDocument, and can have runnables happening in them (most obviously, SMIL
> stuff).  See nsExternalResourceMap::PendingLoad::SetupViewer for where those
> documents are created; presumably you want to set their event target to that
> of mDisplayDocument.

Thanks for the explanation.  I think I have seen some intermittent failures related to svg bits; maybe sharing task queues there will help squash them.

> One other question on this bit, though.  It looks like we can reset the
> event target of a document multiple times (e.g. the whole SetSubDocumentFor
> thing can be called multiple times on the same document in some cases). 
> Should setting the event target enumerate subdocuments and external
> resources?  Or should it be impossible to reset the event target more than
> once?

I don't think we want the latter.  The former is probably correct, but I'm not totally clear on how subdocuments actually come about in the Real World.
Comment 92 Boris Zbarsky [:bz] 2013-01-04 12:03:16 PST
> So we can have content documents with chrome parents? 

Yes, absolutely.  That's how the Firefox UI works!

> can I check for a chrome doc parent in SetSubDocumentFor and not do the sharing if so

Hmm.  You can probably check for a different docshell type, which is what normally indicates boundaries like that, I guess...

> I'm not totally clear on how subdocuments actually come about in the Real World.

<frame> and <iframe>.
Comment 93 Nathan Froyd [:froydnj] 2013-01-24 07:22:02 PST
Committed the MockFilePicker bits:

https://hg.mozilla.org/integration/mozilla-inbound/rev/5703c2676a6e
Comment 94 Ryan VanderMeulen [:RyanVM] 2013-01-24 18:12:00 PST
https://hg.mozilla.org/mozilla-central/rev/5703c2676a6e
Comment 95 Nathan Froyd [:froydnj] 2013-01-29 08:05:50 PST
An update on where things stand: the lifetime bits for event queues have been totally reworked, after I realized that something like:

1. Document A opens window with Document B.
2. Event queues are duly shared between documents A and B.
3. Window containing document B is closed.
4. Document B shuts down its event queue (and so, implicitly, does document A).
5. Document A now dispatches all its events to the main thread rather than its event queue.

is not what we want.  Actually handling lifetime issues means that many more tests are actually running with event queues...which also means many more broken tests to fix. :(  On the bright side, fixing this bit of bad behavior also fixed many of the leaks that would crop up in the tests.

Also, there are some bad interactions with any tests that touch the message manager.  Many tests assume that (async) messages sent through the message manager and asynchronous document events are implicitly ordered.  With the document events going through a separate queue, this assumption is no longer true.  I've attempted to place the asynchronous sends on the sending document's event queue, but that does not appear to help; it seems what's really necessary is to place the *receives* on the receiving document's event queue.  But I haven't figured out how to retrofit that capability on to the message manager.  In talking with smaug, it's not even obvious to me that we can even get a handle on the receiving document at that point.
Comment 96 Nathan Froyd [:froydnj] 2013-07-11 09:55:45 PDT
Created attachment 774059 [details] [diff] [review]
part 1 - redirect events through the nearest document's task queue

This patch has been rebased to current-ish HEAD, though it is uncompiled/untested.
Comment 97 Nathan Froyd [:froydnj] 2013-07-11 09:56:50 PDT
Created attachment 774060 [details] [diff] [review]
part 2 - make nsDocument manage its own event queue rather than deferring to the current thread

This patch has been rebased to current HEAD, although it is uncompiled/untested.

IIRC, this version is also much more correct WRT thread safety and destruction than the
previous version.
Comment 98 Nathan Froyd [:froydnj] 2013-07-11 09:57:30 PDT
Created attachment 774062 [details] [diff] [review]
part 3 - post timers to the correct nsIEventTarget

This patch has been rebased to current HEAD, although it is uncompiled/untested.

There may be other timers lurking that need this same treatment.
Comment 99 Nathan Froyd [:froydnj] 2013-07-11 09:58:11 PDT
Created attachment 774063 [details] [diff] [review]
part 6 - make iframes and window.open'd windows share event targets with parents

This patch has been rebased to current HEAD, although it is uncompiled/untested.
Comment 100 Nathan Froyd [:froydnj] 2013-07-11 10:01:07 PDT
Created attachment 774065 [details] [diff] [review]
part 7 - fixup tests to call executeSoon/setTimeout on the correct windows

This patch has been rebased to current HEAD, although it is uncompiled/untested.

The hacks for the private browsing tests are related to bug 847991; current tests in-tree
seem to be conflicted about when to execute their onload/DOMContentLoaded handlers when
using private browsing.

This is by no means an exhaustive fixing of all affected tests.  Lots of tests (devtools
tests particularly) rely on all windows going through the same event loop and I have not
fixed all of them.  If somebody else picks this up, have fun tracking those down.
Comment 101 Olli Pettay [:smaug] 2013-09-17 18:00:53 PDT
This may need quite significant changes still, since the event queue processing needs to interact with
appshell. Otherwise we starve OS-level events - in other words, don't paint or accept any user input.
Comment 102 Nathan Froyd [:froydnj] 2014-03-14 11:08:04 PDT
Created attachment 8391366 [details] [diff] [review]
part 1 - redirect all events through the nearest document

khuey was asking about these patches, so I'm re-posting the current state of
them as from my tree.  They are sort-of-green on try:

https://tbpl.mozilla.org/?tree=Try&rev=71da72a8f33a

I haven't been able to reproduce the m5 crashes locally, and dealing with the
bc oranges is a real rathole, given the amount of code those tests cover.
Comment 103 Nathan Froyd [:froydnj] 2014-03-14 11:08:22 PDT
Created attachment 8391368 [details] [diff] [review]
part 2 - post timers to nsIEventTargets associated with documents
Comment 104 Nathan Froyd [:froydnj] 2014-03-14 11:08:38 PDT
Created attachment 8391370 [details] [diff] [review]
part 3 - make nsDocument manage its own event queue rather than deferring to the current thread
Comment 105 Nathan Froyd [:froydnj] 2014-03-14 11:08:53 PDT
Created attachment 8391371 [details] [diff] [review]
part 4 - make iframes and window.open'd windows share event targets with parents
Comment 106 Nathan Froyd [:froydnj] 2014-03-14 11:09:10 PDT
Created attachment 8391372 [details] [diff] [review]
part 5 - fix executeSoon to be window-aware

Don't use SpecialPowers to implement executeSoon for normal mochitests;
SpecialPowers doesn't have a good way of accessing the window/document
and injecting events into the document's task queue.

Similarly, browser-chrome mochitests need to be able to pass the window
into executeSoon so we'll spin the event loop of the correct window.
Comment 107 Nathan Froyd [:froydnj] 2014-03-14 11:12:55 PDT
Created attachment 8391375 [details] [diff] [review]
part 6 - fixup tests to call executeSoon/setTimeout on the correct window(s)

Most of these are pretty straightforward: if you're doing something to window2
in code executing for window1, you need to executeSoon/setTimeout in *window2*
to see those changes.

All the assumptions of asynchronous execution order are not yet fixed by this
patch.  I suspect that some of the bc oranges left are because the tests care
about the arrival order of asynchronous messages as posted through the message
manager.  And there may be timeouts that aren't quite right; some tests seem to
like using a timer module that wraps nsITimer, and expecting the timeout
handlers to follow some ordering relative to window timeouts and modifications.
Comment 108 Nathan Froyd [:froydnj] 2014-03-14 11:13:53 PDT
Created attachment 8391376 [details] [diff] [review]
add envvar switch for turning off event queues

This is a debugging aid, so one doesn't need two different trees to compare
what happens with and without event queues.  Just set MOZ_DISABLE_QUEUES=1
prior to executing your test and you can watch what happens in a "normal"
browser.
Comment 109 Nathan Froyd [:froydnj] 2014-04-28 08:48:46 PDT
I've been looking at these orange off-and-on for a couple of weeks and I think some of the remaining oranges can be explained by people expecting sendAsyncMessage() to arrive in sending order on the necessary handlers.  See for instance something like:

TEST-UNEXPECTED-FAIL | chrome://mochitests/content/browser/browser/components/sessionstore/test/browser_480148.js | Test #2: 'visible' tabs restored first - Got 0 1 2 3 4 5 6 7 8 9 10 11 12, expected 12 7 8 9 10 11 0 1 2 3 4 5 6

...and all the other failures after it in that particular test.

What session restore does:

- Collect all the information for the tabs to be restored;
- Compute an order in which the tabs are to be restored:

http://hg.mozilla.org/mozilla-central/file/86dce59c3d16/browser/components/sessionstore/src/SessionStore.jsm#l2528

- Attempts to ensure we don't have async messages pending:

http://hg.mozilla.org/mozilla-central/file/86dce59c3d16/browser/components/sessionstore/src/SessionStore.jsm#l2571

- Tell the content script side of things that it's time to restore the data:

http://hg.mozilla.org/mozilla-central/file/86dce59c3d16/browser/components/sessionstore/src/SessionStore.jsm#l2613

- Once the content script gets that, it sends a message back:

http://hg.mozilla.org/mozilla-central/file/86dce59c3d16/browser/components/sessionstore/content/content-sessionStore.js#l127

- ...so we can fire DOM events at listeners:

http://hg.mozilla.org/mozilla-central/file/86dce59c3d16/browser/components/sessionstore/src/SessionStore.jsm#l598

This test is checking that the DOM events arrive in a specific order, namely, the order in which we wanted to restore the tabs originally.  This order is more-or-less guaranteed by the single event queue we maintain on the main thread:

- Fire async message at tab A;
- Tab A's message manager posts handler runnable to the main thread;
- Fire async message at tab B;
- Tab B's message manager posts handler runnable to the main thread;
- ... and so on.

But with this patch series, we dispatch those async message handler runnables to the associated document's event queue.  So the ordering we get is not guaranteed:

- Fire async message at tab A;
- Tab A's message manager posts handler runnable to the document's task queue;
- Document A's task queue is now pending on the main thread;
- Fire async message at tab B;
- Tab B's message manager posts handler runnable to the document's task queue;
- Document B's task queue may have *already* been pending on the main thread prior to document A's task queue.
- The handler runnable therefore gets inserted into the pending task queue, whose runnables will all execute prior to the runnables in document A's task queue.

So in this case, the async message for tab B is going to arrive before the async message for tab A.  Consistently, in fact, judging by those tests and how reproducible the oranges are.  This behavior would normally be OK, since tab A and tab B should be independent, but if you have something outside of them that expects things to arrive in a certain order...

Also worth noting that you really do want async messages posted to the document's task queue.  If you don't, other things fail in mysterious ways, usually because DOM operations haven't taken place when event handlers expect them to.  The current state of the patchset is the right way to do things, but I don't know what to do about issues like the above.
Comment 110 Nathan Froyd [:froydnj] 2014-04-28 09:00:25 PDT
Created attachment 8413795 [details] [diff] [review]
rollup patch that applies to m-c as of now

Tim was wondering if I had a patch for m-c.  This patch compiles locally for me (Linux 64 opt+debug) with today's (2014-04-28) mozilla-central.  It is slightly different from the concatenation of all the previous posted patches, but not terribly so--rebasing fixups and a few more test fixes is all.
Comment 111 Tim Taubert [:ttaubert] 2014-04-28 10:12:43 PDT
The whole feature implemented by bug 480148 would be rendered pointless with this patch. It seems like the tabs' event queues are iterated in creation order which means ordering tabs before calling .sendAsyncMessage() for every one of them doesn't change the order in which they're processed.

This is surely not a big loss in comparison to the whole feature but I don't see a way around this currently...
Comment 112 Tim Taubert [:ttaubert] 2014-04-28 10:16:48 PDT
With restore_on_demand=true being the default nowadays I'm not sure the feature is that relevant anymore.
Comment 113 Dão Gottwald [:dao] 2014-04-28 12:28:53 PDT
(In reply to Tim Taubert [:ttaubert] from comment #112)
> With restore_on_demand=true being the default nowadays I'm not sure the
> feature is that relevant anymore.

I don't think it is. We should get rid of that code.
Comment 114 Tim Taubert [:ttaubert] 2014-04-28 12:53:04 PDT
(In reply to Dão Gottwald [:dao] from comment #113)
> I don't think it is. We should get rid of that code.

I tried but that unfortunately breaks other tests. I'll try harder tomorrow.
Comment 115 Tim Taubert [:ttaubert] 2014-04-29 04:48:57 PDT
Filed bug 1003096 that will remove the test and the feature.
Comment 116 David.P 2015-02-03 01:19:23 PST Comment hidden (me-too)
Comment 117 Kyle Huey [:khuey] (Exited; not receiving bugmail, email if necessary) 2015-08-20 14:50:48 PDT
I own figuring out how to move forward here.

Note You need to log in before you can comment on or make changes to this bug.