Closed Bug 1009628 Opened 5 years ago Closed 5 years ago

Need mozAfterRemotePaint event for remote iframes

Categories

(Core :: Graphics: Layers, defect)

x86
All
defect
Not set

Tracking

()

RESOLVED FIXED
mozilla34

People

(Reporter: BenWa, Assigned: handyman)

References

(Depends on 3 open bugs, Blocks 1 open bug)

Details

Attachments

(3 files, 32 obsolete files)

5.45 KB, patch
mconley
: review+
Details | Diff | Splinter Review
26.26 KB, patch
mconley
: review+
Details | Diff | Splinter Review
23.82 KB, patch
mconley
: review+
Details | Diff | Splinter Review
Right now when we make large changes to remote iframes from a parent process we will send an update to the compositor and include a RefLayer that the child process must populate out of sync. Since the content process can take much longer to populate the RefLayer and there's no syncronization this leads to problems like bug 986782 and issues with the keyboard on b2g. This can be hacked around manually but since it's a commun problem the platform should have a nice event listener to deal with this.

I'm suggesting that we add something like mozAfterRemotePaint callback (better name?) to iframes to allow manual synchronization in the parent process. For example:

hide(tabIFrame); // Make sure we still paint tabIFrame
tabIFrame.addEventListener("MozAfterRemotePaint", function() {
  // The compositor now has the RefLayer populated
  show(tabIFrame);
});

Detailed plan:
* Add the event listener for remote iframe
* Send an IPC message to the content process's main thread
* When the IPC message is received send a message to the compositor
* Compositor sends a message back to the content process
* Content process sends a response to the parent

NOTE: Last two steps could be replaced by a direct message from the compositor to the content process but it might be more tricky.

Bug 979949 has an example how to go from the content process to the compositor which could be useful has an example here.
Assignee: nobody → davidp99
I'm posting a backend solution to this -- we are talking about how to go forward integrating this into the frontend.
I've added a patch that is the beginning of the front-end integration.  The browser clears the old content immediately so this doesn't work quite correctly yet but it has the timing right...
Status: NEW → ASSIGNED
Attached patch frontend (obsolete) — Splinter Review
Updated front-end patch, seems to work but I dont know the implications...
Attachment #8426754 - Attachment is obsolete: true
Comment on attachment 8427151 [details] [diff] [review]
frontend

Review of attachment 8427151 [details] [diff] [review]:
-----------------------------------------------------------------

::: toolkit/content/widgets/tabbox.xml
@@ +661,5 @@
> +          // TODO: Figure out a decent way to keep it from clearing the old
> +          // contents so that they stay on screen until we reveal the new contents.
> +          // I wonder if this would be safe if we replace setting opacity with the
> +          // "this.setAttribute("selectedIndex", val);" call below?
> +          if(typeof(gBrowser) !== 'undefined' && gBrowser.getBrowserAtIndex(val) !== 'undefined')   {

I think I like the general idea, which as I understand it, is to make the tab invisible until MozAfterRemotePaint is fired.

Just a few suggestions:

Because this lies within toolkit, making assumptions about (or even checking) the app using this (like with the presence of gBrowser), is something we try hard to avoid.

I'm going to try to make this deck switching function asynchronous, and fire a "beforeshowing" event to poll listeners for Promises to wait before completing the switch. Then I'll alter tabbrowser.xml to listen for that event, and return a Promise that resolves when the remote browser fires the MozAfterRemotePaint. It's a rube-goldberg, but I think it's also the least bad idea off the top of my dome.
(In reply to Benoit Girard (:BenWa) from comment #0)
> I'm suggesting that we add something like mozAfterRemotePaint callback
> (better name?) to iframes to allow manual synchronization in the parent
> process. For example:
> 
> hide(tabIFrame); // Make sure we still paint tabIFrame
> tabIFrame.addEventListener("MozAfterRemotePaint", function() {
>   // The compositor now has the RefLayer populated
>   show(tabIFrame);
> });

I was wondering if instead of using Yet Another Event, we shouldn't use requestAnimationFrame here. It fits the description of requestAnimationFrame quite nicely, and it gets around the typical problems of event listeners missing the event.

Of course, it would mean requestAnimationFrame would need to be implemented (with remote support) for the tabIFrame, while the API is spec'ed to be used from the window object, not from elements.
So I've got the workings of a PoC patch that puts most of the weight into tabbrowser.xml, and leaves tabbox.xml _mostly_ alone. I want to test it, but it looks like the backend patch isn't applying cleanly now.

When I manually de-bitrot the patch, I hit the following build error:

5:56.48 ../../dist/include/nsAutoPtr.h:924:27: error: implicit instantiation of undefined template 'nsISupports::COMTypeInfo<mozilla::dom::TabChildBase, void>'
 5:56.48     if (NS_FAILED(aHelper(NS_GET_TEMPLATE_IID(T), &newRawPtr))) {
 5:56.48                           ^
 5:56.48 ../../dist/include/nsID.h:139:36: note: expanded from macro 'NS_GET_TEMPLATE_IID'
 5:56.48 #define NS_GET_TEMPLATE_IID(T) (T::template COMTypeInfo<T, void>::kIID)
 5:56.48                                    ^
 5:56.49 ../../dist/include/nsError.h:186:68: note: expanded from macro 'NS_FAILED'
 5:56.49 #define NS_FAILED(_nsresult)    ((bool)MOZ_UNLIKELY(NS_FAILED_impl(_nsresult)))
 5:56.49                                                                    ^
 5:56.49 ../../dist/include/mozilla/Likely.h:17:48: note: expanded from macro 'MOZ_UNLIKELY'
 5:56.49 #  define MOZ_UNLIKELY(x) (__builtin_expect(!!(x), 0))
 5:56.49                                                ^
 5:56.49 /Users/mikeconley/Projects/mozilla-central/gfx/layers/ipc/CompositorChild.cpp:265:31: note: in instantiation of member function 'nsRefPtr<mozilla::dom::TabChildBase>::nsRefPtr' requested here
 5:56.49   nsRefPtr<dom::TabChildBase> tabChildBase(do_QueryReferent(mWeakTabChild));
 5:56.49                               ^
 5:56.49 ../../dist/include/nsISupportsBase.h:40:3: note: template is declared here
 5:56.49   NS_DECLARE_STATIC_IID_ACCESSOR(NS_ISUPPORTS_IID)
 5:56.49   ^
 5:56.49 ../../dist/include/nsID.h:121:10: note: expanded from macro 'NS_DECLARE_STATIC_IID_ACCESSOR'
 5:56.49   struct COMTypeInfo;
 5:56.49          ^
 5:56.49 In file included from /Users/mikeconley/Projects/mozilla-central/obj-x86_64-apple-darwin12.5.0/gfx/layers/Unified_cpp_gfx_layers3.cpp:2:
 5:56.49 In file included from /Users/mikeconley/Projects/mozilla-central/gfx/layers/composite/ColorLayerComposite.cpp:6:
 5:56.49 In file included from /Users/mikeconley/Projects/mozilla-central/gfx/layers/composite/ColorLayerComposite.h:9:
 5:56.49 In file included from /Users/mikeconley/Projects/mozilla-central/gfx/layers/Layers.h:14:
 5:56.49 In file included from ../../dist/include/gfxContext.h:15:
 5:56.49 In file included from ../../dist/include/gfxPattern.h:16:
 5:56.49 ../../dist/include/nsAutoPtr.h:924:27: error: incomplete definition of type 'nsISupports::COMTypeInfo<mozilla::dom::TabChildBase, void>'
 5:56.49     if (NS_FAILED(aHelper(NS_GET_TEMPLATE_IID(T), &newRawPtr))) {
 5:56.49                           ^
 5:56.49 ../../dist/include/nsID.h:139:65: note: expanded from macro 'NS_GET_TEMPLATE_IID'
 5:56.49 #define NS_GET_TEMPLATE_IID(T) (T::template COMTypeInfo<T, void>::kIID)
 5:56.49                                                                 ^
 5:56.49 ../../dist/include/nsError.h:186:68: note: expanded from macro 'NS_FAILED'
 5:56.49 #define NS_FAILED(_nsresult)    ((bool)MOZ_UNLIKELY(NS_FAILED_impl(_nsresult)))
 5:56.50                                                                    ^
 5:56.50 ../../dist/include/mozilla/Likely.h:17:48: note: expanded from macro 'MOZ_UNLIKELY'
 5:56.50 #  define MOZ_UNLIKELY(x) (__builtin_expect(!!(x), 0))
 5:56.50                                                ^
 5:56.50 2 errors generated.

Any idea what is going on, handyman?
Flags: needinfo?(davidp99)
(In reply to Mike Conley (:mconley) from comment #6)
> So I've got the workings of a PoC patch that puts most of the weight into
> tabbrowser.xml, and leaves tabbox.xml _mostly_ alone. I want to test it, but
> it looks like the backend patch isn't applying cleanly now.
> 
> When I manually de-bitrot the patch, I hit the following build error:
> 
<snip>
> 
> Any idea what is going on, handyman?

Umm... I have no clue.  Something seems to have broken in the XPCOM code but I can't tell what.  I'm posting an updated patch that circumvents the issue.  You should be able to proceed with it.
Attachment #8426615 - Attachment is obsolete: true
Flags: needinfo?(davidp99)
So, just a quick update with this - I've got a patch that successfully waits for the MozAfterRemotePaint event before flipping the deck.

One challenge I had with this, was that I didn't realize that we do activation and deactivation of docshells for the tabs the are switched to and from. That also needs to be asyncified if this approach is going to work.
(In reply to :Felipe Gomes from comment #5)
> (In reply to Benoit Girard (:BenWa) from comment #0)
> > I'm suggesting that we add something like mozAfterRemotePaint callback
> > (better name?) to iframes to allow manual synchronization in the parent
> > process. For example:
> > 
> > hide(tabIFrame); // Make sure we still paint tabIFrame
> > tabIFrame.addEventListener("MozAfterRemotePaint", function() {
> >   // The compositor now has the RefLayer populated
> >   show(tabIFrame);
> > });
> 
> I was wondering if instead of using Yet Another Event, we shouldn't use
> requestAnimationFrame here. It fits the description of requestAnimationFrame
> quite nicely, and it gets around the typical problems of event listeners
> missing the event.
> 
> Of course, it would mean requestAnimationFrame would need to be implemented
> (with remote support) for the tabIFrame, while the API is spec'ed to be used
> from the window object, not from elements.

That's an interesting idea. IMO having requestAnimationFrame per element is the right thing to do.
Attached patch WIP front-end patch 1 (obsolete) — Splinter Review
Attachment #8427151 - Attachment is obsolete: true
Comment on attachment 8427962 [details] [diff] [review]
WIP front-end patch 1

Review of attachment 8427962 [details] [diff] [review]:
-----------------------------------------------------------------

::: browser/base/content/tabbrowser.xml
@@ +3280,5 @@
> +          const kRemotePaintMaxWait = 200; // ms
> +
> +          let deferred = Promise.defer();
> +          let toBrowser = this.getBrowserAtIndex(event.detail);
> +          toBrowser.setAttribute("type", "content-primary");

I had to move these from updateCurrentBrowser, otherwise MozAfterRemotePaint fired immediately on the next tick. Not sure if that was the right call there.

::: toolkit/content/widgets/tabbox.xml
@@ +679,5 @@
>              var event = document.createEvent("Events");
>              event.initEvent("select", true, true);
>              this.dispatchEvent(event);
>            }
> +          this.Promise.all(promises).then(() => {

So one nit right off the bat is that we might want to put the timeout handling inside tabbox so that consumers really don't have to worry about it - if the Promises don't resolve within some threshold, just do the flip and bail.
Comment on attachment 8427962 [details] [diff] [review]
WIP front-end patch 1

Curious to get some feedback on my approach here.

Neil, I went with Promises because I felt it'd be nice if the consumer of the event didn't have to call into tabpanel in order to "finish the job", so to speak. They just have to resolve their promise and they're done.
Attachment #8427962 - Flags: feedback?(ttaubert)
Attachment #8427962 - Flags: feedback?(enndeakin)
Comment on attachment 8427962 [details] [diff] [review]
WIP front-end patch 1

>+
>+            this.mPanelContainer.addEventListener("beforeselect", this);

Note that since you added the <handler> you shouldn't need this. Same with the "select" one.


>+
>+          var beforeSelect = document.createEvent("CustomEvent");
>+          beforeSelect.initCustomEvent("beforeselect", true, true, val);
>+          beforeSelect.promises = [];
>+          this.dispatchEvent(beforeSelect);

That said, if you're not going to use the cancel event technique, it might be better to just call some function that one can set on the tabpanel that returns a promise rather than firing an event here.
 
>+      <property name="Promise" readonly="true">
>+        <getter><![CDATA[
>+          if (!this._promise) {
>+            this._promise = Components.utils.import("resource://gre/modules/Promise.jsm", {}).Promise;
>+          }
>+          return this._promise;
>+        ]]></getter>
>+      </property>

I really dislike importing this just for this one case. Can't we just use DOM promises instead?
Thanks Neil! I'll try to address this feedback soonish.
Attachment #8427962 - Flags: feedback?(enndeakin)
Comment on attachment 8427962 [details] [diff] [review]
WIP front-end patch 1

Review of attachment 8427962 [details] [diff] [review]:
-----------------------------------------------------------------

So IIUIC, this patch wants to solve the problem of the random memory flash when switching tabs?

This looks like it would regress Talos tests related to tab switching but also tab switch speed in the real world. Having a full event queue and some GC/CC in between could really slow this down. Additionally, I thought there were lots of tests that rely on setting the selected tab and updating the currently shown browser immediately, so shouldn't that break lots of tests? And add-ons?

::: browser/base/content/tabbrowser.xml
@@ -1020,5 @@
>                  forwardButtonContainer.removeAttribute("switchingtabs");
>                });
>              }
> -            
> -            dump("****************!!!!!!!!!!!!!!!!! MozAfterRemotePaint INITIATING !!!!!!!!!!!!!!!************\n");

That patch isn't based off of the tip, is it?
Attachment #8427962 - Flags: feedback?(ttaubert)
(In reply to Tim Taubert [:ttaubert] from comment #16)
> Comment on attachment 8427962 [details] [diff] [review]
> WIP front-end patch 1
> 
> Review of attachment 8427962 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> So IIUIC, this patch wants to solve the problem of the random memory flash
> when switching tabs?
> 
> This looks like it would regress Talos tests related to tab switching but
> also tab switch speed in the real world. Having a full event queue and some
> GC/CC in between could really slow this down. Additionally, I thought there
> were lots of tests that rely on setting the selected tab and updating the
> currently shown browser immediately, so shouldn't that break lots of tests?
> And add-ons?

While it's true that CC/GC could slow down the tab switch, from my understanding, the same things could also slow down the async IPC communication going on to actually show the new browser.

Note that my patch doesn't actually change the behaviour of tabbrowser such that the actual selected browser is changed in an async way - rather, it's only the _displaying_ of the selected browser that is being async-ified, at the level of the deck in the tabbox.

So chrome code and add-ons that switch to a tab and immediately try to manipulate that tab's browser _should_ get that browser. It's just not guaranteed that the browser has been flipped to in the deck, so it might not actually be displayed just yet.

Does that alleviate your concerns any, or are there still more?

> 
> ::: browser/base/content/tabbrowser.xml
> @@ -1020,5 @@
> >                  forwardButtonContainer.removeAttribute("switchingtabs");
> >                });
> >              }
> > -            
> > -            dump("****************!!!!!!!!!!!!!!!!! MozAfterRemotePaint INITIATING !!!!!!!!!!!!!!!************\n");
> 
> That patch isn't based off of the tip, is it?

Nope - it's based on handyman's WIP backend patch (attachment 8427871 [details] [diff] [review]).
Er, meant to needinfo ttaubert there about whether or not my defense of this approach was any good. :)
Flags: needinfo?(ttaubert)
Attached patch WIP front-end patch 1 (obsolete) — Splinter Review
De-bitrotted.
Attachment #8427962 - Attachment is obsolete: true
(In reply to Mike Conley (:mconley) from comment #17)
> While it's true that CC/GC could slow down the tab switch, from my
> understanding, the same things could also slow down the async IPC
> communication going on to actually show the new browser.

Right, I think we could give it a try and see whether there are incoming reports of tab switching feeling slow. It is true that there is no difference for e10s builds but for today's Firefox there really might be a difference. We're doing everything sync now and afaik GC can't interfere with that.

> Note that my patch doesn't actually change the behaviour of tabbrowser such
> that the actual selected browser is changed in an async way - rather, it's
> only the _displaying_ of the selected browser that is being async-ified, at
> the level of the deck in the tabbox.

That indeed alleviates my concerns and should work.
Flags: needinfo?(ttaubert)
Attached patch WIP front-end patch 2 (obsolete) — Splinter Review
Using DOM Promises now. I haven't tried switching to Neil's approach to have a function be assignable to tabpanels that can return Promises, but I'll try that next.

A few bugs on this patch so far:

1) I'm still seeing periodic flashes when switching tabs, even though we're waiting for the ready-to-paint event. It's intermittent though. Either this is because we're hitting the timeout, or we're not actually ready to paint when the event is fired. Not sure, need to investigate.

2) Closing tabs before the currently selected tab causes the currently selected tab to flash. This is because the tabpanels deck is index based, so removing child nodes before the currently selected one immediately causes the deck to try to paint the child _after_ the selected tab. It looks like this code deals with that: http://hg.mozilla.org/mozilla-central/file/9dc0ffca10f4/browser/base/content/tabbrowser.xml#l2100

but since we're waiting for the event now, we get a flash.

3) Just closing tabs in general seems to cause flashes of white here and there.
Attachment #8437035 - Attachment is obsolete: true
Attachment #8437837 - Attachment is patch: true
Hrm, I'm also seeing a case where the selected tab _never_ shows up, which is worrying. No reliable STR just yet, but I see it sometimes when I open a series of tabs in rapid succession via middle-clicking on a bunch of links.
For the (2) issue in comment 22, I'm planning on seeing if I can alter nsDeckFrame.cpp so that if an unselected child with an index less than the currently selected child is removed, we decrement the current index appropriately, so that hacks like http://hg.mozilla.org/mozilla-central/file/9dc0ffca10f4/browser/base/content/tabbrowser.xml#l2100 are not necessary.
Started to piece this together - I think this is all foul-ups on the front-end.

Specifically - I didn't know that tabbrowser keeps an array of browsers that do not map directly to the "array" of childnodes under the tabpanels. This means that when new browsers are created (by opening new tabs, for example), a new child is _appended_ to the tabpanel, instead of inserted at the index that the associated tab exists at in the tabstrip.

So my usage of getBrowserAtIndex in tabbrowser.xml from the tabpanels event detail is no good - the value that we're getting is the childNode index, not the index for the tabbrowser collection of browsers.

The secondary flickering that I (and BenWa at some point) saw seems to be related to us setting docshell inactive on the old tab after we've switched to the new one. If we stop making that docshell inactivate, the flicker goes away. That might be good follow-up fodder.
(In reply to Mike Conley (:mconley) from comment #25)
> So my usage of getBrowserAtIndex in tabbrowser.xml from the tabpanels event
> detail is no good - the value that we're getting is the childNode index, not
> the index for the tabbrowser collection of browsers.

Yes, you should use tab.linkedBrowser to go from the tab to the browser. You can also use tabs.getRelatedElement to go from tab to panel or tabpanels.getRelatedElement to go from panel to tab.
Attachment #8437034 - Attachment is obsolete: true
Assignee: davidp99 → mconley
Attachment #8437837 - Attachment is obsolete: true
Hey handyman, the front-end for this is coming together. Is the back-end together enough that we should get it reviewed and landed?
Flags: needinfo?(davidp99)
Comment on attachment 8438719 [details] [diff] [review]
Asynchronous tab switching for tabbrowser that waits on MozAfterRemotePaint to fire from a remote browser

Ok, I think this front-end business is starting to come together. It's pretty tricky async-ifying code that fully expects to be sync.

This _seems_ to work. I'm a bit worried that Promise lets the event loop breathe between "then's", so there might be a gap between the fromBrowser having its docshell deactivated and the deck switching to the new browser. Open to ideas on how to address that.

All feedback welcome.
Attachment #8438719 - Flags: feedback?(ttaubert)
Attachment #8438719 - Flags: feedback?(enndeakin)
The back end should be in good shape for review.  I haven't tested the behavior on Linux tho...
Flags: needinfo?(davidp99)
Comment on attachment 8438717 [details] [diff] [review]
"Need mozAfterRemotePaint event for remote iframes" []

Seems to work OK on Linux once I force-enable OMTC.

Let's get someone to take a look at this - I'll try BenWa first.
Attachment #8438717 - Flags: review?(bgirard)
OS: Mac OS X → All
Comment on attachment 8438717 [details] [diff] [review]
"Need mozAfterRemotePaint event for remote iframes" []

Review of attachment 8438717 [details] [diff] [review]:
-----------------------------------------------------------------

- A few trailing whitespace in the patch, bugzilla should point them out for you
- The style for if should be |if (...) {|

I think there's a simpler way to do this that I hadn't realized previously:
In the content process we should have to wait until we send the transaction. I believe once it's sent then any transaction sent from the root process will get the proper ordering. This means we can eliminate all the compositor side changes. Sorry for not noticing this earlier.

This patch is on the right track but let's make sure we get it right so we don't have to revisit this.

::: dom/events/EventListenerManager.cpp
@@ +288,5 @@
> +    // This message can only be heard on the chrome browser object.  This
> +    // *should* be an error if done on another window (but we settle for
> +    // an assert).
> +    MOZ_ASSERT(mIsMainThreadELM && mTarget, 
> +        "MozAfterRemotePaint can only be requested from chrome UI!");

Shouldn't this check for privileged content? This isn't my area of expertize so we will need to find a good reviewer. for this

::: dom/ipc/PBrowser.ipdl
@@ +535,5 @@
> +     * the graphics objects are ready to display.  
> +     * @see PCompositor
> +     * @see RemotePaintIsReady
> +     */
> +    async NotifyAfterRemotePaint();

This message name implies that it sending a notification that the remote paint occurred but it's really asking to be notified. Perhaps RequestNotifyAfterRemotePaint.

::: gfx/layers/ipc/CompositorChild.cpp
@@ +288,5 @@
> +{
> +  MOZ_ASSERT(tabChild, "NULL TabChild not allowed in "
> +      "CompositorChild::NotifyAfterRemotePaint");
> +  // XPCOM weirdness requires the cast
> +  mWeakTabChild = do_GetWeakReference( (dom::TabChildBase *)tabChild );

This means we can only have one pending notify and that previous pending notification will be clobbered silently? I think we need a better solution

::: gfx/layers/ipc/CompositorParent.cpp
@@ +1077,5 @@
>  
>    NS_INLINE_DECL_THREADSAFE_REFCOUNTING(CrossProcessCompositorParent)
>  public:
>    CrossProcessCompositorParent(Transport* aTransport, ProcessId aOtherProcess)
> +    : mTransport(aTransport), mNotifyAfterRemotePaint(false),

This should keep using the style before:
 : a(a)
 , b(b)
 , c(c)

@@ +1153,5 @@
> +  // If true, we should send a RemotePaintIsReady message when graphics are
> +  // available on GPU.
> +  bool mNotifyAfterRemotePaint;
> +  // If true, graphics objects are available on GPU.
> +  bool mGraphicsAreReady;

I don't think this variable is required at all and all that is associated with it.
Attachment #8438717 - Flags: review?(bgirard) → review-
I'm working on updating the patch.  Some notes on the review:

(In reply to Benoit Girard (:BenWa) from comment #34)
> - A few trailing whitespace in the patch, bugzilla should point them out for
> you
> - The style for if should be |if (...) {|

White-space cleanup.
Done.

> I think there's a simpler way to do this that I hadn't realized previously:
> In the content process we should have to wait until we send the transaction.
> I believe once it's sent then any transaction sent from the root process
> will get the proper ordering. This means we can eliminate all the compositor
> side changes. Sorry for not noticing this earlier.

I don't get the idea here.  I think my confusion is on "the transaction"s.  It might help if you explain this in terms of the objects and messages (ie TabParent/Child,CompositorParent/Child and the NotifyAfterRemotePaint/RemoteGraphicsAreReady messages).

> 
> ::: dom/events/EventListenerManager.cpp
> @@ +288,5 @@
> > +    // This message can only be heard on the chrome browser object.  This
> > +    // *should* be an error if done on another window (but we settle for
> > +    // an assert).
> > +    MOZ_ASSERT(mIsMainThreadELM && mTarget, 
> > +        "MozAfterRemotePaint can only be requested from chrome UI!");
> 
> Shouldn't this check for privileged content? This isn't my area of expertize
> so we will need to find a good reviewer. for this

The EventListenerManager would just need to be able to identify that the frame requesting the event is privileged and get the nsFrameLoader/TabParent for the frame.  Not being an expert on this either, I don't know of a way to do either of these.



> ::: dom/ipc/PBrowser.ipdl
> @@ +535,5 @@
> > +     * the graphics objects are ready to display.  
> > +     * @see PCompositor
> > +     * @see RemotePaintIsReady
> > +     */
> > +    async NotifyAfterRemotePaint();
> 
> This message name implies that it sending a notification that the remote
> paint occurred but it's really asking to be notified. Perhaps
> RequestNotifyAfterRemotePaint.

Done.
 
> ::: gfx/layers/ipc/CompositorChild.cpp
> @@ +288,5 @@
> > +{
> > +  MOZ_ASSERT(tabChild, "NULL TabChild not allowed in "
> > +      "CompositorChild::NotifyAfterRemotePaint");
> > +  // XPCOM weirdness requires the cast
> > +  mWeakTabChild = do_GetWeakReference( (dom::TabChildBase *)tabChild );
> 
> This means we can only have one pending notify and that previous pending
> notification will be clobbered silently? I think we need a better solution

This could be easily done by just making a queue of WeakTabChilds and draining it when the paint happens but I've gone a bit further since both billm and I had the same objection:

I'm in the process of changing it so that the Compositor responds with the layerID of the TabChild that spawned the request.  The response is then sent to TabChild::GetFrom(layerID) (assuming it still exists) instead of the TabChild weak-reference.  To do this, I have to change the sTabChildren map in TabChild from holding raw pointers to holding weak references (so that I dont send the message to a dead TabChild).

That doesn't address the complaint about the patch tho.  I still need to maintain a list of layerIDs (instead of weak-refs to TabChilds).  However, by using IDs, I should be able to match the layerIDs with the layer being rendered so that the solution is more accurate, meaning that while "Draining the list" notifies all listeners of the event regardless of which Tab was actually rendered, with this setup, I can notify only the tab whose layers actually did render.  An argument against doing this would be that layers that don't finish painting would never receive the event.  With the current patch, the event is sent to the last tab that issued a request whenever any tab is painted (this works for bug 986782) -- it already doesn't respond to a request if another request comes in quickly.  There are still a few extraneous issues that could invalidate this idea (like connecting TabChild IDs to LayerTransaction IDs).


> ::: gfx/layers/ipc/CompositorParent.cpp
> @@ +1077,5 @@
> >  
> >    NS_INLINE_DECL_THREADSAFE_REFCOUNTING(CrossProcessCompositorParent)
> >  public:
> >    CrossProcessCompositorParent(Transport* aTransport, ProcessId aOtherProcess)
> > +    : mTransport(aTransport), mNotifyAfterRemotePaint(false),
> 
> This should keep using the style before:
>  : a(a)
>  , b(b)
>  , c(c)

Done.

> 
> @@ +1153,5 @@
> > +  // If true, we should send a RemotePaintIsReady message when graphics are
> > +  // available on GPU.
> > +  bool mNotifyAfterRemotePaint;
> > +  // If true, graphics objects are available on GPU.
> > +  bool mGraphicsAreReady;
> 
> I don't think this variable is required at all and all that is associated
> with it.

I assume this will be true if the conjecture above that the Compositor code can be replaced is valid.
Setting needinfo so this doesn't fall off the radar.
Flags: needinfo?(bgirard)
(In reply to David Parks [:handyman] from comment #35)
> I'm working on updating the patch.  Some notes on the review:
> 
> (In reply to Benoit Girard (:BenWa) from comment #34)
> > - A few trailing whitespace in the patch, bugzilla should point them out for
> > you
> > - The style for if should be |if (...) {|
> 
> White-space cleanup.
> Done.
> 
> > I think there's a simpler way to do this that I hadn't realized previously:
> > In the content process we should have to wait until we send the transaction.
> > I believe once it's sent then any transaction sent from the root process
> > will get the proper ordering. This means we can eliminate all the compositor
> > side changes. Sorry for not noticing this earlier.
> 
> I don't get the idea here.  I think my confusion is on "the transaction"s. 
> It might help if you explain this in terms of the objects and messages (ie
> TabParent/Child,CompositorParent/Child and the
> NotifyAfterRemotePaint/RemoteGraphicsAreReady messages).
> 

When the content process calls SendUpdateNoSwap/SendUpdate. Actually since this can be call more then once per paint the best place is |ClientLayerManager::EndTransaction| in the case where !mRepeatTransaction which is the final transaction for that paint. When that occurs we will call SendUpdate[NoSwap] which means that the transaction is pending in the compositor and that the remote process may now send its transaction that will be ordered after the transaction that we just sent.

> > 
> > ::: dom/events/EventListenerManager.cpp
> > @@ +288,5 @@
> > > +    // This message can only be heard on the chrome browser object.  This
> > > +    // *should* be an error if done on another window (but we settle for
> > > +    // an assert).
> > > +    MOZ_ASSERT(mIsMainThreadELM && mTarget, 
> > > +        "MozAfterRemotePaint can only be requested from chrome UI!");
> > 
> > Shouldn't this check for privileged content? This isn't my area of expertize
> > so we will need to find a good reviewer. for this
> 
> The EventListenerManager would just need to be able to identify that the
> frame requesting the event is privileged and get the nsFrameLoader/TabParent
> for the frame.  Not being an expert on this either, I don't know of a way to
> do either of these.
> 

I don't know who to ask for this really but ask around in IRC or check the blame log for these types of events.

> 
> 
> > ::: dom/ipc/PBrowser.ipdl
> > @@ +535,5 @@
> > > +     * the graphics objects are ready to display.  
> > > +     * @see PCompositor
> > > +     * @see RemotePaintIsReady
> > > +     */
> > > +    async NotifyAfterRemotePaint();
> > 
> > This message name implies that it sending a notification that the remote
> > paint occurred but it's really asking to be notified. Perhaps
> > RequestNotifyAfterRemotePaint.
> 
> Done.
>  
> > ::: gfx/layers/ipc/CompositorChild.cpp
> > @@ +288,5 @@
> > > +{
> > > +  MOZ_ASSERT(tabChild, "NULL TabChild not allowed in "
> > > +      "CompositorChild::NotifyAfterRemotePaint");
> > > +  // XPCOM weirdness requires the cast
> > > +  mWeakTabChild = do_GetWeakReference( (dom::TabChildBase *)tabChild );
> > 
> > This means we can only have one pending notify and that previous pending
> > notification will be clobbered silently? I think we need a better solution
> 
> This could be easily done by just making a queue of WeakTabChilds and
> draining it when the paint happens but I've gone a bit further since both
> billm and I had the same objection:
> 
> I'm in the process of changing it so that the Compositor responds with the
> layerID of the TabChild that spawned the request.  The response is then sent
> to TabChild::GetFrom(layerID) (assuming it still exists) instead of the
> TabChild weak-reference.  To do this, I have to change the sTabChildren map
> in TabChild from holding raw pointers to holding weak references (so that I
> dont send the message to a dead TabChild).

I believe if you follow my suggestion you wont have to worry about any of that since the compositor will process messages in order even if they are sent from different processes. I do need to verify this with :bent since it's a critical property for my new design to work properly.

> 
> That doesn't address the complaint about the patch tho.  I still need to
> maintain a list of layerIDs (instead of weak-refs to TabChilds).  However,
> by using IDs, I should be able to match the layerIDs with the layer being
> rendered so that the solution is more accurate, meaning that while "Draining
> the list" notifies all listeners of the event regardless of which Tab was
> actually rendered, with this setup, I can notify only the tab whose layers
> actually did render.  An argument against doing this would be that layers
> that don't finish painting would never receive the event.  With the current
> patch, the event is sent to the last tab that issued a request whenever any
> tab is painted (this works for bug 986782) -- it already doesn't respond to
> a request if another request comes in quickly.  There are still a few
> extraneous issues that could invalidate this idea (like connecting TabChild
> IDs to LayerTransaction IDs).

Keep in mind that we need to know that the LayerTransaction has been received in the queue, we don't need to wait for a composite after that transaction to have occurred or even wait for the transaction to have been executed.
Flags: needinfo?(bgirard)
Bent: I have an IPDL question. We have a ProcessLink and a ThreadLink connected to the Compositor thread in the system process on b2g. After a child process has called SendUpdate[NoSwap] to the compositor thread via a ProcessLink, if we notify the system' process' main thread somehow (IPC) and then it makes another SendUpdate[NoSwap] via the ThreadLink are there any ordering guarantees in which order the two SendUpdate[NoSwap] will be processed? Can we assume that the second message, which was definitely sent after, will be queued and processed after the first SendUpdate?
Comment on attachment 8438719 [details] [diff] [review]
Asynchronous tab switching for tabbrowser that waits on MozAfterRemotePaint to fire from a remote browser

Review of attachment 8438719 [details] [diff] [review]:
-----------------------------------------------------------------

::: browser/base/content/tabbrowser.xml
@@ +3303,5 @@
> +          let toBrowser = this.getBrowserForTab(tab);
> +          toBrowser.setAttribute("type", "content-primary");
> +          toBrowser.docShellIsActive = true;
> +
> +          let promise = Promise.resolve();

Why the default promise? If shouldWait=false we problably shouldn't push a new promise?

@@ +3318,5 @@
> +            let panelContainer = this.mPanelContainer;
> +            promise = new Promise((aResolve, aReject) => {
> +
> +              toBrowser.addEventListener("MozAfterRemotePaint", function onRemotePaint() {
> +                toBrowser.removeEventListener("MozAfterRemotePaint", onRemotePaint);

If I'm fast enough I could probably close the tab before that event makes it here?

@@ +3335,5 @@
> +
> +            let fromBrowser = this.mCurrentBrowser;
> +            promise.then(() => {
> +              fromBrowser.setAttribute("type", "content-targetable");
> +              fromBrowser.docShellIsActive = false;

When switching tabs quickly, and having the above promise rejected, we'll never set the type and activity state of the old browser. We probably also need to ensure that the old browser isn't suddenly the selected browser again.

@@ +3336,5 @@
> +            let fromBrowser = this.mCurrentBrowser;
> +            promise.then(() => {
> +              fromBrowser.setAttribute("type", "content-targetable");
> +              fromBrowser.docShellIsActive = false;
> +            })

Nit: missing semicolon.

::: toolkit/content/widgets/tabbox.xml
@@ +672,5 @@
> +          if (Array.isArray(beforeSelect.promises) && beforeSelect.promises.length > 0) {
> +            promises = beforeSelect.promises;
> +          } else {
> +            promises.push(Promise.resolve());
> +          }

You can remove this branch, Promise.all([]) works.

@@ +693,5 @@
> +
> +          this._selectedIndex = val;
> +
> +          Promise.race([allEventPromises, timeoutPromise]).then(() => {
> +            this.setAttribute("selectedIndex", val);

So how does this affect other <tabbox> users? Thunderbird etc. probably don't have any use for this behavior and even a single tick might break stuff for them.
Attachment #8438719 - Flags: feedback?(ttaubert)
Comment on attachment 8438719 [details] [diff] [review]
Asynchronous tab switching for tabbrowser that waits on MozAfterRemotePaint to fire from a remote browser

Ok, cool, thanks ttaubert.

I'll start attacking those issues. I'll also see about having Firefox use its own asyncified tabpanels implementation so as to not disturb other consumers of the original.
Attachment #8438719 - Flags: feedback?(enndeakin)
Also, re-assigning to handyman. I'll still be working on the front-end patch, but he's doing most of the heavy-lifting in the background, so he should get the credit. :)
Assignee: mconley → davidp99
de-bitrotted.
Attachment #8438717 - Attachment is obsolete: true
Attachment #8452505 - Attachment is patch: true
Forked from the tabbox bindings, with some small fix-its. Still a WIP.
Attachment #8438719 - Attachment is obsolete: true
Assignee: davidp99 → mconley
Attachment #8452507 - Attachment is obsolete: true
Attachment #8453355 - Attachment is obsolete: true
Attachment #8453784 - Attachment is obsolete: true
Comment on attachment 8453797 [details] [diff] [review]
Asynchronous tab switching for tabbrowser that waits on MozAfterRemotePaint to fire from a remote browser

(In reply to Tim Taubert [:ttaubert] (away July 7th-18th) from comment #40)
> Comment on attachment 8438719 [details] [diff] [review]
> Asynchronous tab switching for tabbrowser that waits on MozAfterRemotePaint
> to fire from a remote browser
> 
> Review of attachment 8438719 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: browser/base/content/tabbrowser.xml
> @@ +3303,5 @@
> > +          let toBrowser = this.getBrowserForTab(tab);
> > +          toBrowser.setAttribute("type", "content-primary");
> > +          toBrowser.docShellIsActive = true;
> > +
> > +          let promise = Promise.resolve();
> 
> Why the default promise? If shouldWait=false we problably shouldn't push a
> new promise?
> 

Good point. Fixed.

> @@ +3318,5 @@
> > +            let panelContainer = this.mPanelContainer;
> > +            promise = new Promise((aResolve, aReject) => {
> > +
> > +              toBrowser.addEventListener("MozAfterRemotePaint", function onRemotePaint() {
> > +                toBrowser.removeEventListener("MozAfterRemotePaint", onRemotePaint);
> 
> If I'm fast enough I could probably close the tab before that event makes it
> here?
> 

With the latest code, we'll hit the timeout in that case, and cancel the tabswitch. Is that sufficient?

> @@ +3335,5 @@
> > +
> > +            let fromBrowser = this.mCurrentBrowser;
> > +            promise.then(() => {
> > +              fromBrowser.setAttribute("type", "content-targetable");
> > +              fromBrowser.docShellIsActive = false;
> 
> When switching tabs quickly, and having the above promise rejected, we'll
> never set the type and activity state of the old browser. We probably also
> need to ensure that the old browser isn't suddenly the selected browser
> again.
> 

Ah, good eye. Fixed.

> @@ +3336,5 @@
> > +            let fromBrowser = this.mCurrentBrowser;
> > +            promise.then(() => {
> > +              fromBrowser.setAttribute("type", "content-targetable");
> > +              fromBrowser.docShellIsActive = false;
> > +            })
> 
> Nit: missing semicolon.
> 

Fixed.

> ::: toolkit/content/widgets/tabbox.xml
> @@ +672,5 @@
> > +          if (Array.isArray(beforeSelect.promises) && beforeSelect.promises.length > 0) {
> > +            promises = beforeSelect.promises;
> > +          } else {
> > +            promises.push(Promise.resolve());
> > +          }
> 
> You can remove this branch, Promise.all([]) works.
> 

Refactored completely.

> @@ +693,5 @@
> > +
> > +          this._selectedIndex = val;
> > +
> > +          Promise.race([allEventPromises, timeoutPromise]).then(() => {
> > +            this.setAttribute("selectedIndex", val);
> 
> So how does this affect other <tabbox> users? Thunderbird etc. probably
> don't have any use for this behavior and even a single tick might break
> stuff for them.

Forked to our own tabbrowser-tabpanels binding. Maybe I should add this thing to tabbox.xml as a new extension of tabpanels instead, but for now, I've got it in tabbrowser.xml.

ttaubert: So I've refactored this thing completely, and plugged the holes you mentioned. What are your thoughts on this approach?

Enn: I had to patch nsDeckFrame.cpp because previously, when removing a tab before the current tab, we had to workaround the fact that the index of the deck would get offset, and we'd be selecting the child (if one existed) immediately after the one we had started on. The old workaround simply set the selectedPanel to the original panel immediately after removing the child - but that doesn't work if selecting the panel is asynchronous.

My patch decrements the mIndex in nsDeckFrame if the removed child had an index less than the currently selected child index. I then dispatch a runnable to set the attribute. Thoughts?
Attachment #8453797 - Flags: feedback?(ttaubert)
Attachment #8453797 - Flags: feedback?(enndeakin)
Comment on attachment 8453797 [details] [diff] [review]
Asynchronous tab switching for tabbrowser that waits on MozAfterRemotePaint to fire from a remote browser

The deck changes look ok.
Attachment #8453797 - Flags: feedback?(enndeakin) → feedback+
Comment on attachment 8453797 [details] [diff] [review]
Asynchronous tab switching for tabbrowser that waits on MozAfterRemotePaint to fire from a remote browser

Turns out ttaubert is on PTO, and we're trying to wrap up M1 sooner than when he gets back.

Jared, would you mind checking out whether or not this is a good idea?
Attachment #8453797 - Flags: feedback?(ttaubert) → feedback?(jaws)
Comment on attachment 8453797 [details] [diff] [review]
Asynchronous tab switching for tabbrowser that waits on MozAfterRemotePaint to fire from a remote browser

And I just realized jaws is on PTO too! How about Gijs?
Attachment #8453797 - Flags: feedback?(jaws) → feedback?(gijskruitbosch+bugs)
Hey dveditz - a security-related question for you here.

This patch aimes to make our tab switching asynchronous for e10s. When a user clicks on a tab, we change all of the UI immediately - the URL bar, the selected tab, etc. The content, however, will not switch over until the content process is "ready" (we know this when it sends the MozAfterRemotePaint event). We wait for that event before switching the tabpanel to show the newly selected tab. Until then, we show the current selected tab.

As a guard, if we've waited longer than 200ms and we still haven't heard the MozAfterRemotePaint event, we switch the tabpanel anyways to show the blank content.

MattN tells me that you might be a little concerned from a security perspective about that short window whereby the URL bar (and various security UI) may not agree with the tabpanel being shown. Is that something we need to discuss?
Flags: needinfo?(dveditz)
Hey Dao,

See comment 53 for a description of what we're trying to do here.

So, one of my concerns is breaking the world in terms of built-in assumptions about how tabs work synchronously - not just for our own code, but for add-ons and automated tests.

How catastrophic do you think that kind of breakage would be with what I'm proposing in my front-end patch?

As a follow-up, MattN and I were discussing the possibility of making it so that programmatically setting the selected tab worked synchronously, but that a user selecting a tab either by mouse or keyboard is asynchronous. That'd allow us to (hopefully) sidestep some assumption breakage, but it would mean that under certain circumstances, the browser might show a flash of white, black, or (in the worst case) random-looking video memory while we wait for the content process to get ready to paint.

Thoughts on all of this? What kind of rats nest are we getting ourselves into here?
Flags: needinfo?(dao)
Comment on attachment 8453797 [details] [diff] [review]
Asynchronous tab switching for tabbrowser that waits on MozAfterRemotePaint to fire from a remote browser

Review of attachment 8453797 [details] [diff] [review]:
-----------------------------------------------------------------

How does this jive with selectedTab/selectedPanel/selectedBrowser/currentURI and such? I'm not familiar enough with e10s to know for sure which information is kept where (e.g. things like SSLStatus, document.domain (which is content-settable!) and currentURI all intersect, of course). I'd be worried that if one part updated and the other part was async, that you'd get different (conflicting) information depending on which path you took to get the information, which would be a bit of a footgun.

I do know that lots of tests expect setting selectedTab to be synchronous, and this effectively makes it async (ie if you tried to click something on the page right after, that wouldn't work), so that's one thing I'd expect to break...

All that being said, these are issues with the general "make this async" bit, not with the approach, so f+, I guess? In a certain sense, it feels to me like I don't know e10s well enough to give a full on review or something like that.

::: layout/xul/nsDeckFrame.cpp
@@ +166,5 @@
> +    // by 1.
> +    //
> +    // We attempt to keep the same child displayed by automatically
> +    // updating our internal notion of the current index.
> +    int32_t removedIndex = mFrames.IndexOf(aOldFrame);

Is there any possibility aOldFrame is not in mFrames? Then if mIndex is 0, this will crash in debug builds. Is that intentional? I'm thinking a silent fail if removedIndex == -1 might be more appropriate... but this is layout, so maybe I'm just seeing spooks.
Attachment #8453797 - Flags: feedback?(gijskruitbosch+bugs) → feedback+
(In reply to Mike Conley (:mconley) from comment #53)
> As a guard, if we've waited longer than 200ms and we still haven't heard the
> MozAfterRemotePaint event, we switch the tabpanel anyways to show the blank
> content.
> 
> MattN tells me that you might be a little concerned from a security
> perspective about that short window whereby the URL bar (and various
> security UI) may not agree with the tabpanel being shown. Is that something
> we need to discuss?

We have had a fair number of security bugs where an attacker can show their own content when the navigation bar is showing "MyBank" or "Google". There have been a lot of techniques such as exploiting timing bugs, redirecting to a page that returns an error, history navigations, opening a new tab/window, etc. The end bad state is there is page content the user can interact with that they might believe is the claimed site -- essentially a perfect phishing attempt.

I'm not concerned that content and URL might not match for a period less than human reaction time. If you have a guard and blank the content after 200 or even 500ms that's fine. Just don't get stuck waiting showing the old content.

That said, I've seen pages take a lot longer than that to reach onload (slow-to-load sub-resources?). When does this event fire, after the first paint?
Flags: needinfo?(dveditz)
(In reply to Daniel Veditz [:dveditz] from comment #56)
> That said, I've seen pages take a lot longer than that to reach onload
> (slow-to-load sub-resources?). When does this event fire, after the first
> paint?

I'm going to redirect this question to the guy who developed the back-end to this patch. Handyman, would you mind filling dveditz in?
Flags: needinfo?(davidp99)
(In reply to Mike Conley (:mconley) from comment #57)
> (In reply to Daniel Veditz [:dveditz] from comment #56)
> > That said, I've seen pages take a lot longer than that to reach onload
> > (slow-to-load sub-resources?). When does this event fire, after the first
> > paint?
> 
> I'm going to redirect this question to the guy who developed the back-end to
> this patch. Handyman, would you mind filling dveditz in?

While the MozAfterRemotePaint event fires much later than page-loading, I believe that the front-end timeout should still save us from the situation dveditz suggests here.  The MozAfterRemotePaint message fires after the graphics objects for the page are "ready to be composited" -- ie once they exist in the Compositor (and accelerated objects on the graphics card) as a collection of not-yet-composited images.  This is earlier than any painting of the tab (ie before MozAfterPaint).  As BenWa suggested, that event name is a WIP as it is a bit misleading.

The background for this is bug 986782.  That bug talks about a quick flash of junk onscreen when switching tabs but it also concludes that we should be delaying tab switches when the graphics aren't painted yet (modulo a 200ms time-out), similar to other browsers.  MozAfterRemotePaint is then like MozAfterPaint except much sooner -- you wouldn't want to wait for the entire page to paint before you allow tab switching but switching to the 'totally blank' page looks crummy.  But we should not wait longer than about 200ms before switching (even to a blank page... even if loading takes a forever...).  The "flash of junk" goes away as a by-product.  I don't know the tab code well at all but it looks to me like mconley's patch kicks in where the old code did its synchronous tab switch and that he completes the tab-switch in 200ms with or without the MozAfterRemotePaint message so the URL bar shouldn't be out-of-sync for longer than that.
Flags: needinfo?(davidp99)
Does comment 58 clear things up?
Flags: needinfo?(dveditz)
Sounds good, shouldn't be a problem.
Flags: needinfo?(dveditz)
So I'll note that even with these patches applied, I'm noticing periodic white flashes when switching tabs, at least on OS X.

Hopefully this video helps describe what I'm seeing:

https://www.youtube.com/watch?v=3AmOWJsRWxA

The left browser is using remote tabs, the right one is not.
BenWa, you and I noticed these flashes when I first demo's my front-end patch a while back, and I remember you saying that you might know what was going on. Can you describe what you think might be happening here?
Flags: needinfo?(bgirard)
I just had a look with mconley. We saw a flash of white but it sounded like it was just that we were hitting the timeout. Looks like we just need to tweak our timeout and what we do when we hit that timeout.
Flags: needinfo?(bgirard)
Attachment #8455554 - Attachment is obsolete: true
Attachment #8453797 - Attachment is obsolete: true
Attachment #8455595 - Attachment is obsolete: true
So one thing I've observed having toyed with this front-end patch for a bit, is that it's broken focus and blur events for content in e10s tabs.

That's causing an orange on mochitest-bc with e10s on: https://tbpl.mozilla.org/php/getParsedLog.php?id=43786937&tree=Try

I'm looking into what we can do about this.
Attachment #8455606 - Attachment is obsolete: true
Comment on attachment 8456920 [details] [diff] [review]
Asynchronous tab switching for tabbrowser that waits on MozAfterRemotePaint to fire from a remote browser. r=?

Dao, I think you're probably the most qualified reviewer for the tabbrowser changes. What are your thoughts on it?

And Enn, the nsDeckFrame changes are for you. :D

Here's a successful try push: https://tbpl.mozilla.org/?tree=Try&rev=f4510c7fe16c

(The non-unified build error for Linux Debug is something that needs to be fixed in the backend patch).
Attachment #8456920 - Flags: review?(enndeakin)
Attachment #8456920 - Flags: review?(dao)
Comment on attachment 8456920 [details] [diff] [review]
Asynchronous tab switching for tabbrowser that waits on MozAfterRemotePaint to fire from a remote browser. r=?

Hey felipe, do you have the cycles to look at this?
Attachment #8456920 - Flags: review?(dao) → review?(felipc)
Comment on attachment 8456920 [details] [diff] [review]
Asynchronous tab switching for tabbrowser that waits on MozAfterRemotePaint to fire from a remote browser. r=?

I'll review this (might take till next week, though)
Attachment #8456920 - Flags: review?(felipc) → review?(dao)
Flags: needinfo?(dao)
Comment on attachment 8456920 [details] [diff] [review]
Asynchronous tab switching for tabbrowser that waits on MozAfterRemotePaint to fire from a remote browser. r=?

>+    MOZ_ASSERT(removedIndex >= 0,
>+               "A deck child was removed that was not in mFrames.");
>+    if (removedIndex < mIndex) {
>+      mIndex--;
>+      MOZ_ASSERT(mIndex >= 0,
>+                 "Tried to display a deck child indexed at a negative number.");

This second assertion is redundant since we know that removedIndex is 0 or more and mIndex is greater than that.
Attachment #8456920 - Flags: review?(enndeakin) → review+
BenWa, can you review the new version of the patch?  All of the new changes are just from your earlier review.

This does not include the previously-suggested behavior where we replace the content-thread <-> compositor-thread communication with an assumption that the compositor will process future paint requests from the main/chrome thread only after it processes updated shadow layers from the content thread.  bent says that queue (temporal) order isn't maintained between messages arriving on different IPC queues.
The TabChild reference is also still needed since we couldn't get rid of the content <-> compositor communication.
Attachment #8452505 - Attachment is obsolete: true
Attachment #8458291 - Flags: review?(bgirard)
There should also be a test for the deck frame index check. You can add this to toolkit/content/tests/chrome/test_deck.xul
Attachment #8456920 - Attachment is obsolete: true
Attachment #8456920 - Flags: review?(dao)
Comment on attachment 8456920 [details] [diff] [review]
Asynchronous tab switching for tabbrowser that waits on MozAfterRemotePaint to fire from a remote browser. r=?

bzexport mistakenly decided to obsolete this. Lovely.
Attachment #8456920 - Attachment is obsolete: false
Attachment #8456920 - Flags: review?(dao)
Comment on attachment 8458779 [details] [diff] [review]
Tests for XUL deck changes. r=?

Tests, as requested.
Attachment #8458779 - Flags: review?(enndeakin)
Comment on attachment 8458291 [details] [diff] [review]
Need MozAfterRemotePaint Event for remote iframes (backend)

Review of attachment 8458291 [details] [diff] [review]:
-----------------------------------------------------------------

Few instances of trailing whitespace, check the patch in splinter.

Looks good, the patch is almost there. r- because I'd like to have another look with the mGraphicsReady corrections.

::: dom/ipc/PBrowser.ipdl
@@ +384,5 @@
> +    /**
> +     * Child informs the parent that the graphics objects are ready for
> +     * compositing.  This usually means that the graphics objects (textures
> +     * and the like) are available on the GPU.  This is used for tab content.
> +     * Note that the parent/child relationship is flipped when compared with

Maybe adjust this comment to be more explicit about the real guarantee we are aiming for:
This is sent when all pending changes have been sent to the compositor and are ready to be shown on the next composite.

We shouldn't make any guarantee about the state of graphics object and anything being on the GPU as we could be using a software compositor for example.

::: dom/ipc/TabParent.cpp
@@ +2159,5 @@
> +  NS_ENSURE_TRUE(target, true);
> +
> +  nsCOMPtr<nsIDOMEvent> event;
> +  NS_NewDOMNotifyPaintEvent(getter_AddRefs(event), target, nullptr /* nsPresContext */, 
> +                            nullptr /* WidgetEvent */, NS_AFTER_REMOTE_PAINT, 

trailing whitespace

::: gfx/layers/ipc/CompositorChild.h
@@ +137,5 @@
>    // compositor context in another process.
>    static CompositorChild* sCompositor;
>  
> +  // Weakly hold the TabChild that made a request to be alerted when graphics
> +  // objects are ready on GPU for this layer.

to be alerted when the transaction has been received.

::: gfx/layers/ipc/CompositorParent.cpp
@@ +1140,5 @@
>    virtual bool RecvStopFrameTimeRecording(const uint32_t& aStartIndex, InfallibleTArray<float>* intervals) MOZ_OVERRIDE  { return true; }
>  
> +  /**
> +   * Means that we should send a message when the compositor is done
> +   * posting graphics for the adjacent CompositorChild's content to the GPU.

adjacent? I think just saying CompositorChild implies the corresponding one in the context of the CompositorParent.

Again lets do away with the GPU reference and instead talk about the transaction:

Means that we should send a message when the compositor has received the transaction.

@@ +1189,5 @@
> +  // If true, we should send a RemotePaintIsReady message when graphics are
> +  // available on GPU.
> +  bool mNotifyAfterRemotePaint;
> +  // If true, graphics objects are available on GPU.
> +  bool mGraphicsAreReady;

The above should also mention transaction instead of the GPU.

@@ +1285,5 @@
>  
> +bool
> +CrossProcessCompositorParent::RecvRequestNotifyAfterRemotePaint()
> +{
> +  if(mGraphicsAreReady)  {

style should be:
if (...) {
And below as well.

Is this condition ever false? mGraphicsAreReady is set to false only while executing a transaction which should happen on the same thread as this. This should never get to run while we're in a shadow layers update. Might be worth checking because if that's the case then I'm misunderstanding this.

I think what should happen here is you need to remove mGraphicsAreReady and just use the logic of mNotifyAfterRemotePaint. So when we receive RecvRequestNotifyAfterRemotePaint we wait for the next transaction to be received and we fire the event.

@@ +1290,5 @@
> +    unused << SendRemotePaintIsReady();
> +    return true;
> +  }
> +  mNotifyAfterRemotePaint = true;
> +  return true;  // success

No need for this comment.
Attachment #8458291 - Flags: review?(bgirard) → review-
Comment on attachment 8458291 [details] [diff] [review]
Need MozAfterRemotePaint Event for remote iframes (backend)

Request review for files in content/* and dom/*
Attachment #8458291 - Flags: review- → review?(bugs)
Comment on attachment 8458291 [details] [diff] [review]
Need MozAfterRemotePaint Event for remote iframes (backend)


> }
> 
>+bool nsFrameLoader::SendRequestNotifyAfterRemotePaint()  {
{ to its own line when defining methods


>+  if (mRemoteBrowser) {
>+    return mRemoteBrowser->SendRequestNotifyAfterRemotePaint();
>+  }
>+
>+  // Requested MozAfterRemotePaint in a non-e10s implementation.  We simply
>+  // respond immediately in that case.
>+  // Since painting will occur from the same (ie only) process, we don’t have to wait
some odd characters after don and before t. Shouldn't it be don't

>+  // on an async process to paint to present in a single transaction so this
>+  // should be fine.
>+  nsCOMPtr<nsIDocShell> docshell = GetExistingDocShell();
>+  MOZ_ASSERT(docshell);
No need for this assert (and it is even wrong.)

>+  nsCOMPtr<nsPIDOMWindow> window = do_GetInterface(docshell);
>+  MOZ_ASSERT(window);
Make this if check.

And because do_GetInterface is null safe, I think you could just
nsCOMPtr<nsPIDOMWindow> window = do_GetInterface(GetExistingDocShell());
if (!window) {
  return;
}


>+  nsCOMPtr<EventTarget> chromeEventHandler =
>+      do_QueryInterface(window->GetChromeEventHandler());
>+
>+  nsCOMPtr<nsIDOMEvent> event;
>+  NS_NewDOMNotifyPaintEvent(getter_AddRefs(event), chromeEventHandler, nullptr /* nsPresContext */,
>+                            nullptr /* WidgetEvent */, NS_AFTER_REMOTE_PAINT,
>+                            nullptr /* invalidate request list */);
>+  MOZ_ASSERT(event);
>+  event->SetOwner(chromeEventHandler);
>+  bool dummy;
>+  chromeEventHandler->DispatchEvent(event, &dummy);
>+  return true;
Do we really want this to happen synchronously. If the not-in-same process is async, better to have the
in-process async too.
Also, since you don't actually add any special data to the event, why not just use Event interface.
The target for the event should be window->GetParentTarget()


>+  } else if (aType == NS_AFTER_REMOTE_PAINT) {
>+    // This message can only be heard on the chrome browser object.  This
>+    // *should* be an error if done on another window (but we settle for
>+    // an assert).
>+    MOZ_ASSERT(mIsMainThreadELM && mTarget,
>+        "MozAfterRemotePaint can only be requested from chrome UI!");
>+    nsCOMPtr<nsIFrameLoaderOwner> flOwner = do_QueryInterface(mTarget);
>+    MOZ_ASSERT(flOwner, "target was not an nsIFrameLoaderOwner");
>+    nsRefPtr<nsFrameLoader> fl = flOwner->GetFrameLoader();
>+    MOZ_ASSERT(fl, "No FrameLoader for FrameLoaderOwner");
>+    unused << fl->SendRequestNotifyAfterRemotePaint();
I don't understand the setup. If in non-remote case one adds a listener, event is dispatched immediately but never afterwards
unless someone adds a new listener.


>+NON_IDL_EVENT(MozAfterRemotePaint,
>+              NS_AFTER_REMOTE_PAINT,
>+              EventNameType_None,
>+              NS_EVENT)
This isn't needed if simple Event interface is used.
(and this is wrong anyway, since the event struct you're using in the patch isn't NS_EVENT)
So, just remove this

>+TabChild::RecvRequestNotifyAfterRemotePaint()
>+{
>+  // Get the CompositorChild instance for this content thread.
>+  CompositorChild *compositor = CompositorChild::Get();
nit, * goes with the type. Composito* compositor.

>+  MOZ_ASSERT(compositor, "TabChild::RecvRequestNotifyAfterRemotePaint thread has no "
>+    "CompositorChild");
I don't know what guarantees compositor is non-null.


>+TabParent::RecvRemotePaintIsReady()
>+{
>+  MOZ_ASSERT(mFrameElement);
>+  nsCOMPtr<mozilla::dom::EventTarget> target = do_QueryInterface(mFrameElement);
>+  NS_ENSURE_TRUE(target, true);
mFrameElement is an EventTarget, so no need to QI, but
perhaps null check mFrameElement.


>+CompositorChild::RequestNotifyAfterRemotePaint(TabChild *tabChild)
>+{
>+  MOZ_ASSERT(tabChild, "NULL TabChild not allowed in "
>+      "CompositorChild::RequestNotifyAfterRemotePaint");
>+  // XPCOM weirdness requires the cast
>+  mWeakTabChild = do_GetWeakReference( (dom::TabChildBase *)tabChild );
C++ casting please


>diff --git a/widget/BasicEvents.h b/widget/BasicEvents.h
>--- a/widget/BasicEvents.h
>+++ b/widget/BasicEvents.h
>@@ -322,16 +322,17 @@ enum nsEventStructType
> #define NS_RATECHANGE          (NS_MEDIA_EVENT_START+17)
> #define NS_DURATIONCHANGE      (NS_MEDIA_EVENT_START+18)
> #define NS_VOLUMECHANGE        (NS_MEDIA_EVENT_START+19)
> #define NS_NEED_KEY            (NS_MEDIA_EVENT_START+20)
> 
> // paint notification events
> #define NS_NOTIFYPAINT_START    3400
> #define NS_AFTERPAINT           (NS_NOTIFYPAINT_START)
>+#define NS_AFTER_REMOTE_PAINT   (NS_NOTIFYPAINT_START+1)
Shouldn't need this


So if the event is dispatched usually just once per tab or so, some other name would be good.
MozAfterPaint is dispatched all the time, MozAfterRemotePaint sounds too much like it.
Attachment #8458291 - Flags: review?(bugs) → review-
Also, I don't quite understand the setup.
(In reply to Benoit Girard (:BenWa) from comment #79)
> @@ +1285,5 @@
> >  
> > +bool
> > +CrossProcessCompositorParent::RecvRequestNotifyAfterRemotePaint()
> > +{
> > +  if(mGraphicsAreReady)  {
> 
> Is this condition ever false? mGraphicsAreReady is set to false only while
> executing a transaction which should happen on the same thread as this. This
> should never get to run while we're in a shadow layers update. Might be
> worth checking because if that's the case then I'm misunderstanding this.
>
> I think what should happen here is you need to remove mGraphicsAreReady and
> just use the logic of mNotifyAfterRemotePaint. So when we receive
> RecvRequestNotifyAfterRemotePaint we wait for the next transaction to be
> received and we fire the event.

I believe you are correct.  I've removed it.

Everything else was pretty straightforward.  I'll update the patch when I've taken care of smaug's concerns.
(In reply to Olli Pettay [:smaug] from comment #81)
> Comment on attachment 8458291 [details] [diff] [review]
> Need MozAfterRemotePaint Event for remote iframes (backend)
> 
> 

> >+  // on an async process to paint to present in a single transaction so this
> >+  // should be fine.
> >+  nsCOMPtr<nsIDocShell> docshell = GetExistingDocShell();
> >+  MOZ_ASSERT(docshell);
> No need for this assert (and it is even wrong.)
> 
> >+  nsCOMPtr<nsPIDOMWindow> window = do_GetInterface(docshell);
> >+  MOZ_ASSERT(window);
> Make this if check.
> 
> And because do_GetInterface is null safe, I think you could just
> nsCOMPtr<nsPIDOMWindow> window = do_GetInterface(GetExistingDocShell());
> if (!window) {
>   return;
> }

I've done what you suggested but my concern is that the behavior is unpredictable -- if, somehow, we reach this function and the docShell for the window isn't available (or some of the other casting fails) then we can't respond to the MozAfterRemotePaint request -- we leave it hanging.  I'm not suggesting that I know that the assertion was legitimate in the code -- just that, if its not, I've potentially got a huge problem that the return doesn't really solve -- it just obscures it.  (This is also true for the other asserts you mentioned, which I have also replaced.)  Maybe the messaging part needs to be more robust to responding in the face of failure under the covers?  Or maybe the event needs a stronger definition of what it promises (ie maybe it shouldn't promise to provide a response)?


> >+  nsCOMPtr<EventTarget> chromeEventHandler =
> >+      do_QueryInterface(window->GetChromeEventHandler());
> >+
> >+  nsCOMPtr<nsIDOMEvent> event;
> >+  NS_NewDOMNotifyPaintEvent(getter_AddRefs(event), chromeEventHandler, nullptr /* nsPresContext */,
> >+                            nullptr /* WidgetEvent */, NS_AFTER_REMOTE_PAINT,
> >+                            nullptr /* invalidate request list */);
> >+  MOZ_ASSERT(event);
> >+  event->SetOwner(chromeEventHandler);
> >+  bool dummy;
> >+  chromeEventHandler->DispatchEvent(event, &dummy);
> >+  return true;
> Do we really want this to happen synchronously. If the not-in-same process
> is async, better to have the
> in-process async too.

OK.  Done.

> Also, since you don't actually add any special data to the event, why not
> just use Event interface.
> The target for the event should be window->GetParentTarget()

I can do that if you prefer.  How would I use the Event interface to do this?  (I don't know anything about Event dispatch -- this code just mirrors the NS_AFTERPAINT mechanism).

> 
> >+  } else if (aType == NS_AFTER_REMOTE_PAINT) {
> >+    // This message can only be heard on the chrome browser object.  This
> >+    // *should* be an error if done on another window (but we settle for
> >+    // an assert).
> >+    MOZ_ASSERT(mIsMainThreadELM && mTarget,
> >+        "MozAfterRemotePaint can only be requested from chrome UI!");
> >+    nsCOMPtr<nsIFrameLoaderOwner> flOwner = do_QueryInterface(mTarget);
> >+    MOZ_ASSERT(flOwner, "target was not an nsIFrameLoaderOwner");
> >+    nsRefPtr<nsFrameLoader> fl = flOwner->GetFrameLoader();
> >+    MOZ_ASSERT(fl, "No FrameLoader for FrameLoaderOwner");
> >+    unused << fl->SendRequestNotifyAfterRemotePaint();
> I don't understand the setup. If in non-remote case one adds a listener,
> event is dispatched immediately but never afterwards
> unless someone adds a new listener.

Yes, that's all.  The remote and non-remote cases share this property (remote does not dispatch "immediately" tho).

> 
> >+NON_IDL_EVENT(MozAfterRemotePaint,
> >+              NS_AFTER_REMOTE_PAINT,
> >+              EventNameType_None,
> >+              NS_EVENT)
> This isn't needed if simple Event interface is used.
> (and this is wrong anyway, since the event struct you're using in the patch
> isn't NS_EVENT)
> So, just remove this

This is probably because I have no idea how to use the Event interface but this (and the code you mention below) are needed for the javascript that uses MozAfterRemotePaint to run and for the EventListenerManager.cpp code to compile.


> So if the event is dispatched usually just once per tab or so, some other
> name would be good.
> MozAfterPaint is dispatched all the time, MozAfterRemotePaint sounds too
> much like it.

I can do that.  Any suggestions?  MozPaintDataIsReady?  MozRemotePaintIsReady?


Most of that was pretty basic.  We also spoke on IRC.  You had an objection to either nsFrameLoader::SendRequestNotifyAfterRemotePaint or the NS_AFTER_REMOTE_PAINT block in EventListenerManager::AddEventListenerInternal.  You mentioned that there was an implied redraw request that would break something.  I'm still totally unable to detect any of that but I've been looking and want to record your pointers for posterity.  From the IRC log:


> smaug
>   handyman: event listeners aren't for one-only cases
>   unless you explicitly remove the event listener
> smaug
>   (or if you know that the event can never happen more than once )
> handyman
>   smaug: I see.  So what is the right thing to do in that case?
> smaug
>   handyman: as I said here
> smaug
>   you add listener
> smaug
>   then you explicitly request a repaint or such
> smaug
>   and child repaints, and notifies the parent
> smaug
>   and then parent dispatches event to indicate that the request has been done
> handyman
>   smaug: I think I'm  missing your point.  Above you said "should it be something like: addEventListener("MozRemoteRepaintRequestDone"), parent->child requestPaint(), child repaints, child->parent repainted(), dispatch("MozRemoteRepaintRequestDone");".
> handyman
>   Isnt that the same thing?  Because that's what the patch does.
> smaug
>   handyman: the patch doesn't do that
> handyman
>   Although the 'dispatch' is in C++
> smaug
>   oh, I mean dispatch would be in C++ too
> smaug
>   handyman: in your patch you request repaint when event listener is added
> smaug
>   that is really odd
> smaug
>   what if someone added event listener to the parent node of iframe ?
> smaug
>   and you'd be forced to remove the event listener in order to get a new repaint
> smaug
>   since if you add same event listener twice, it is registered only when it is added first time
> smaug
>   the second addEventListener call is no-op
> handyman
>   smaug: The repaint is requested by JS.  It shouldnt be requested by the user attaching the addEventListener().  You are right about the multiple calls to addEventListener tho.
> smaug
>   in your patch you do some magic when AddEventListener is called
> smaug
>   SendRequestNotifyAfterRemotePaint
> handyman
>   smaug: Maybe I goofed something but it wasn't my intent to do a repaint on AddEventListener.  Let me check...
> handyman
>   That function shouldnt request a repaint
> handyman
>   ...shouldnt...
> smaug
>   but it does in the patch ;)
> handyman
>   smaug: OK.  That definitely shouldn't happen but I'm not seeing it.  Can you point me to the code?
> smaug
>   handyman: in @@ -278,16 +280,27 @@ EventListenerManager::AddEventListenerIn
> smaug
>   unused << fl->SendRequestNotifyAfterRemotePaint();
> handyman
>   smaug: OK.  I guess I need to see if I can figure out what makes that request a repaint.  Theres a lot about ipdl Im not too clear on yet.
> smaug
>   so you could add requestRepaint() to nsIFrameLoader
> smaug
>   and TabParent would dispatch the event once it gets the message from child that the requested repaint has been done
> handyman
>   smaug: I dont actually want to request a repaint -- the front-end (tabbox) already handles that.  I just need to remove the implicit repaint that you referenced in the IPDL.
Comment on attachment 8458779 [details] [diff] [review]
Tests for XUL deck changes. r=?

>+<deck id="deck3" selectedIndex="1">
>+  <button id="d3b1" label="Remove me"/>
>+  <button id="d3b2" label="Keep me selected"/>
>+</deck>
>+<deck id="deck3" selectedIndex="1">
>+  <button id="d3b1" label="Remove me"/>
>+  <button id="d3b2" label="Keep me selected"/>
>+</deck>

This appears twice.


>+
>+  for (let i = 0; i < 3; ++i) {
>+    deck.lastChild.remove();
>+    is(deck.selectedIndex, expectedIndex,
>+       "Should have the deck element at index " + expectedIndex + " selected");
>+  }

Can you also expand the last test here to remove all of the children? That way, we can test what happens when the last child is current and is removed, plus when all children are removed.


>+}
>+
>+SimpleTest.waitForFocus(run_tests, window);

This test doesn't require focus.
Attachment #8458779 - Flags: review?(enndeakin) → review+
Assignee: mconley → davidp99
Gentle review ping?
Flags: needinfo?(dao)
Comment on attachment 8456920 [details] [diff] [review]
Asynchronous tab switching for tabbrowser that waits on MozAfterRemotePaint to fire from a remote browser. r=?

Hey Tim, do you have any time to look at this again?
Attachment #8456920 - Flags: review?(dao) → review?(ttaubert)
Comment on attachment 8456920 [details] [diff] [review]
Asynchronous tab switching for tabbrowser that waits on MozAfterRemotePaint to fire from a remote browser. r=?

>+              if (!gMultiProcessBrowser) {
>+                this.adjustFocus(this.mCurrentTab, oldTab);
>               }

nit: remove braces for consistency with surrounding code

>+      <method name="adjustFocus">

Prefix this with an underscore, since we don't want this to be part of the official tabbrowser API.

>+          if (!window.fullScreen) {
>+            gURLBar.focus();
>+            return;
>+          } else if (isTabEmpty(this.mCurrentTab)) {

no else after return

>+      <method name="prepareForTabSwitch">

prefix with underscore

>+            let attemptTabSwitch = (aResolve, aReject) => {
>+              let tab = panels.getRelatedElement(panels.selectedPanel);
>+              let currentBrowser = this.getBrowserForTab(tab);

Is this different from this.selectedBrowser? How so?

>+      <method name="_deactivateTab">
>+        <parameter name="tab"/>
>+        <body><![CDATA[
>+          let browser = this.getBrowserForTab(tab);
>+          if (browser && this.mCurrentBrowser != browser) {

Please document why browser can be null here and use selectedBrowser instead of mCurrentBrowser.

>+      <method name="finalizeTabSwitch">

>+      <method name="cancelTabSwitch">

prefix with underscore

Overall this approach looks sane if it doesn't break loads of tests, assuming browser chrome tests are even green enough with e10s enabled to act as a benchmark here.
Attachment #8456920 - Flags: review?(ttaubert) → feedback+
Attachment #8456920 - Attachment is obsolete: true
Attachment #8458779 - Attachment is obsolete: true
Assignee: davidp99 → mconley
Comment on attachment 8461744 [details] [diff] [review]
Tests for XUL deck changes. r=Enn.

Thanks Enn, comments addressed.
Attachment #8461744 - Flags: review+
So bzexport likes to set assignees on bugs. That's so lame.
Assignee: mconley → davidp99
Comment on attachment 8461747 [details] [diff] [review]
Asynchronous tab switching for tabbrowser that waits on MozAfterRemotePaint to fire from a remote browser. r=?

(In reply to Dão Gottwald [:dao] from comment #88)
> Comment on attachment 8456920 [details] [diff] [review]
> Asynchronous tab switching for tabbrowser that waits on MozAfterRemotePaint
> to fire from a remote browser. r=?
> 
> >+              if (!gMultiProcessBrowser) {
> >+                this.adjustFocus(this.mCurrentTab, oldTab);
> >               }
> 
> nit: remove braces for consistency with surrounding code

Done.

> 
> >+      <method name="adjustFocus">
> 
> Prefix this with an underscore, since we don't want this to be part of the
> official tabbrowser API.
> 

Done.


> >+          if (!window.fullScreen) {
> >+            gURLBar.focus();
> >+            return;
> >+          } else if (isTabEmpty(this.mCurrentTab)) {
> 
> no else after return

Fixed.

> 
> >+      <method name="prepareForTabSwitch">
> 
> prefix with underscore
> 

Done.

> >+            let attemptTabSwitch = (aResolve, aReject) => {
> >+              let tab = panels.getRelatedElement(panels.selectedPanel);
> >+              let currentBrowser = this.getBrowserForTab(tab);
> 
> Is this different from this.selectedBrowser? How so?
> 

Good call - it's not, since updateCurrentBrowser has already been called by the time either the timeout and MozAfterRemotePaint promises need to resolve or reject.

> >+      <method name="_deactivateTab">
> >+        <parameter name="tab"/>
> >+        <body><![CDATA[
> >+          let browser = this.getBrowserForTab(tab);
> >+          if (browser && this.mCurrentBrowser != browser) {
> 
> Please document why browser can be null here and use selectedBrowser instead
> of mCurrentBrowser.

Done.

> 
> >+      <method name="finalizeTabSwitch">
> 
> >+      <method name="cancelTabSwitch">
> 
> prefix with underscore

Done.

> 
> Overall this approach looks sane if it doesn't break loads of tests,

Phew. :) I'll push to try again just to be sure with these latest changes.

> assuming browser chrome tests are even green enough with e10s enabled to act
> as a benchmark here.

Luckily, we're running mochitest-bc with e10s on tbpl. Some tests are disabled for e10s, but we're working on clearing that up.
Attachment #8461747 - Flags: review?(dao)
Flags: needinfo?(dao)
Attachment #8461744 - Attachment description: Tests for XUL deck changes → Tests for XUL deck changes. r=Enn.
Work on BenWa and smaug's suggestions.
Attachment #8458291 - Attachment is obsolete: true
Comment on attachment 8461747 [details] [diff] [review]
Asynchronous tab switching for tabbrowser that waits on MozAfterRemotePaint to fire from a remote browser. r=?

>+      <method name="_adjustFocus">

Can you make this clearer by renaming it e.g. to _adjustFocusAfterTabSwitch?

>+      <method name="_deactivateTab">

And this to _deactivateContent or so?

>+      <method name="_prepareForTabSwitch">
>+        <parameter name="toTab"/>
>+        <parameter name="fromTab"/>
>+        <body><![CDATA[
>+          const kTabSwitchTimeout = 300;
>+          let toBrowser = this.getBrowserForTab(toTab);
>+          let fromBrowser = fromTab ? this.getBrowserForTab(fromTab)
>+                                    : null;
>+
>+          // First, we need to activate the docShell on the tab we're switching
>+          // to - otherwise, we won't get the MozAfterRemotePaint event that we're
>+          // waiting for.
>+          toBrowser.setAttribute("type", "content-primary");
>+          toBrowser.docShellIsActive = true;

>+      <method name="_deactivateTab">
>+        <parameter name="tab"/>
>+        <body><![CDATA[
>+          // It's unlikely, yet possible, that while we were waiting
>+          // to deactivate this tab, that something closed it and wiped
>+          // out the browser. For example, during a tab switch, while waiting
>+          // for the MozAfterRemotePaint event to fire, something closes the
>+          // original tab that the user had selected. If that's the case, then
>+          // there's nothing to deactivate.
>+          let browser = this.getBrowserForTab(tab);
>+          if (browser && this.selectedBrowser != browser) {
>+            browser.setAttribute("type", "content-targetable");
>+            browser.docShellIsActive = false;
>+          }
>+        ]]></body>
>+      </method>
>+
>+      <method name="_finalizeTabSwitch">
>+        <parameter name="toTab"/>
>+        <parameter name="fromTab"/>
>+        <body><![CDATA[
>+          this._adjustFocus(toTab, fromTab);
>+          this._deactivateTab(fromTab);

It seems like between _prepareForTabSwitch and _finalizeTabSwitch, both browsers would have type="content-primary". That's not good, only one browser at a time should have that.
Attachment #8461747 - Flags: review?(dao) → review-
Attachment #8461744 - Attachment is obsolete: true
Attachment #8461747 - Attachment is obsolete: true
Comment on attachment 8461744 [details] [diff] [review]
Tests for XUL deck changes. r=Enn.

grrr, bzexport...
Attachment #8461744 - Attachment is obsolete: false
Attachment #8462695 - Attachment is obsolete: true
Comment on attachment 8462704 [details] [diff] [review]
Asynchronous tab switching for tabbrowser that waits on MozAfterRemotePaint to fire from a remote browser. r=?

Naming fixed.

Good eye on the content-primary - I've made it so that we only set the type attribute to content-primary on a browser once the tab switch to it has been finalized.
Attachment #8462704 - Flags: review?(dao)
Attached patch afterpaint.patch (obsolete) — Splinter Review
Update patch with fixes for reviews.
Attachment #8461797 - Attachment is obsolete: true
Updating comments on smaug's review:

(In reply to David Parks [:handyman] from comment #84)
> (In reply to Olli Pettay [:smaug] from comment #81)
> > Comment on attachment 8458291 [details] [diff] [review]
> > Need MozAfterRemotePaint Event for remote iframes (backend)
> > 

> > Also, since you don't actually add any special data to the event, why not
> > just use Event interface.
> > The target for the event should be window->GetParentTarget()

I'm still not perfectly clear on the Event interface but I've got a basic understanding now.  I think the Event interface is actually not an option because of a constraint that BenWa and I imposed early on... that the cost of the async calls reporting the MozAfterRemotePaint from the compositor shouldn't be paid unless the event was requested.   For that reason, we need to catch the AddEventListener call so that we know we should send a MozAfterRemotePaint.  As far as I know, this requires the EventListenerManager.  Let me know if I've got that wrong.


> > >+  } else if (aType == NS_AFTER_REMOTE_PAINT) {
> > >+    // This message can only be heard on the chrome browser object.  This
> > >+    // *should* be an error if done on another window (but we settle for
> > >+    // an assert).
> > >+    MOZ_ASSERT(mIsMainThreadELM && mTarget,
> > >+        "MozAfterRemotePaint can only be requested from chrome UI!");
> > >+    nsCOMPtr<nsIFrameLoaderOwner> flOwner = do_QueryInterface(mTarget);
> > >+    MOZ_ASSERT(flOwner, "target was not an nsIFrameLoaderOwner");
> > >+    nsRefPtr<nsFrameLoader> fl = flOwner->GetFrameLoader();
> > >+    MOZ_ASSERT(fl, "No FrameLoader for FrameLoaderOwner");
> > >+    unused << fl->SendRequestNotifyAfterRemotePaint();
> > I don't understand the setup. If in non-remote case one adds a listener,
> > event is dispatched immediately but never afterwards
> > unless someone adds a new listener.
> 
> Yes, that's all.  The remote and non-remote cases share this property
> (remote does not dispatch "immediately" tho).

I think I understand you better now.  This is another repercussion of the desire not to pay for the process communication if its not requested.  So, it is in fact by design that the message is sent only once when this happens.

> > >+NON_IDL_EVENT(MozAfterRemotePaint,
> > >+              NS_AFTER_REMOTE_PAINT,
> > >+              EventNameType_None,
> > >+              NS_EVENT)
> > This isn't needed if simple Event interface is used.
> > (and this is wrong anyway, since the event struct you're using in the patch
> > isn't NS_EVENT)
> > So, just remove this

Assuming that this stuff is in fact needed because we need to know about AddEventListener, this would still be 'wrong' because NS_EVENT is wrong.  However, its modeled from the MozAfterPaint event above it... also a paint event.  Its certainly possible both are technically wrong.

> You had an objection
> to either nsFrameLoader::SendRequestNotifyAfterRemotePaint or the
> NS_AFTER_REMOTE_PAINT block in
> EventListenerManager::AddEventListenerInternal.

Going over this stuff in detail, I think I've figured out the confusion.  I bet the code you were referencing was actually this stuff that sends the event:

>  NS_NewDOMNotifyPaintEvent(getter_AddRefs(event), target, nullptr,
>                            nullptr, NS_AFTER_REMOTE_PAINT,
>                            nullptr);

This function can be used to send repaint requests but its also used, probably incorrectly, by things like MozAfterPaint.  It doesn't actually request a repaint (nothing is invalidated anyway).  (I tried using the Event class directly but I couldn't get it to do anything.)
Attachment #8463188 - Flags: review?(bugs)
(In reply to David Parks [:handyman] from comment #103)
> > > Also, since you don't actually add any special data to the event, why not
> > > just use Event interface.
> > > The target for the event should be window->GetParentTarget()
> 
> I'm still not perfectly clear on the Event interface but I've got a basic
> understanding now.  I think the Event interface is actually not an option
> because of a constraint that BenWa and I imposed early on... that the cost
> of the async calls reporting the MozAfterRemotePaint from the compositor
> shouldn't be paid unless the event was requested.

That has nothing to do with the event interface type. You don't use any of the properties
of the NotifyPaintEvent interface, only properties of Event, so you could stick with Event.

> 
> I think I understand you better now.  This is another repercussion of the
> desire not to pay for the process communication if its not requested.  So,
> it is in fact by design that the message is sent only once when this happens.

It makes no sense. You have this special side-effect of addEventListener only when
it is called on a xul:browser/html:iframe. What if addEventListener is called on an ancestor object?

I thought we agreed on different design on IRC.
Attachment #8463188 - Flags: review?(bugs) → review-
Attached patch afterpaint.patch (obsolete) — Splinter Review
Updated patch based on smaug's review.  His suggestion was to use a separate, special function to request the MozAfterRemotePaint message (instead of piggybacking on the AddEventListener call to do this).  I've added sendRequestNotifyAfterRemotePaint to nsIFrameLoader.idl to do this.  This required a small update to Mike's patch (so both updates are needed to test this).
Attachment #8463188 - Attachment is obsolete: true
Attachment #8463719 - Flags: review?(bugs)
Minor change from mconley's previous patch -- simply added the call to sendRequestNotifyAfterRemotePaint:

> toBrowser.QueryInterface(Ci.nsIFrameLoaderOwner).frameLoader.sendRequestNotifyAfterRemotePaint();
Attachment #8462704 - Attachment is obsolete: true
Attachment #8462704 - Flags: review?(dao)
Comment on attachment 8463719 [details] [diff] [review]
afterpaint.patch

>+++ b/content/base/public/nsIFrameLoader.idl
When you change an idl interface, you're supposed to update uuid.



>@@ -200,16 +200,23 @@ interface nsIFrameLoader : nsISupports
>    * @see nsIDOMWindowUtils sendKeyEvent.
>    */
>   void sendCrossProcessKeyEvent(in AString aType,
>                                 in long aKeyCode,
>                                 in long aCharCode,
>                                 in long aModifiers,
>                                 [optional] in boolean aPreventDefault);
> 
>+  /**
>+   * Request that the next time a remote layer transaction has been
>+   * received by the Compositor, a MozAfterRemoteFrame event be sent
>+   * to the window.
>+   */
>+  void sendRequestNotifyAfterRemotePaint();
I would call this requestNotifyAfterRemotePaint();

>+  // If true, we are not remote-compositing but the window has requested to be told
>+  // when compositing is ready to begin.  So we will send such a response when painting
>+  // starts.
>+  bool mSendFakeRemotePaint;

You don't use this for anything

>+nsPIDOMWindow::SendAfterRemotePaint()
>+{
>+  nsCOMPtr<nsPIDOMWindow> window = do_GetInterface(mDocShell);
Odd. You're in a Window's method, and QI from mDocShell to it.



>+  if (!window) {
>+    NS_WARNING("Unable to find chrome window to send NS_AFTER_REMOTE_PAINT\n");
>+    return;
>+  }
>+
>+  nsCOMPtr<EventTarget> target = do_QueryInterface(window->GetParentTarget());
>+  if (!target) {
>+    return;
>+  }
You don't use target ever

>+
>+  nsContentUtils::DispatchTrustedEvent(window->GetDoc(),
>+                                       window->GetParentTarget(),
nsContentUtils::DispatchTrustedEvent(GetExtantDoc(), GetParentTarget(),...

>+CompositorChild::RecvRemotePaintIsReady()
>+{
>+  // Used on the content thread, this bounces the message to the
>+  // TabParent (via the TabChild) if the notification was previously requested.
>+  // XPCOM gives a soup of compiler errors when trying to do_QueryReference
>+  // so I'm using static_cast<>
>+  MOZ_LAYERS_LOG(("[RemoteGfx] CompositorChild received RemotePaintIsReady"));
>+  nsRefPtr<nsISupports> iTabChildBase(do_QueryReferent(mWeakTabChild));
>+  if( !iTabChildBase )  {
should be
if (!iTabChildBase) {



>+    MOZ_LAYERS_LOG(("[RemoteGfx] Note: TabChild was released before RemotePaintIsReady. "
>+        "MozAfterRemotePaint will not be sent to listener."));
>+    return true;
>+  }
>+  TabChildBase *tabChildBase = static_cast<TabChildBase *>(iTabChildBase.get());
* goes with the type, not variable name. So
TabChildBase* tabChildBase = static_cast<TabChildBase*>(iTabChildBase.get());


>+  MOZ_ASSERT(tabChildBase);
This is useless, since you had if (!iTabChildBase) { already


>+CompositorChild::RequestNotifyAfterRemotePaint(TabChild *tabChild)
TabChild* aTabChild


>+{
>+  MOZ_ASSERT(tabChild, "NULL TabChild not allowed in "
>+      "CompositorChild::RequestNotifyAfterRemotePaint");
Odd indentation

>+  // XPCOM weirdness requires the cast.
It is not XPCOM weirdness, but just plain C++ weirdness ;)

>+  mWeakTabChild = do_GetWeakReference( static_cast<dom::TabChildBase *>(tabChild) );
mWeakTabChild = do_GetWeakReference(static_cast<dom::TabChildBase*>(tabChild));


>+  // Sets the condition that we send an NS_AFTER_REMOTE_PAINT message just before the next
>+  // composite.  Used in non-e10s implementations.
>+  virtual void SetRequestNotifyAfterRemotePaint() MOZ_OVERRIDE {
{ goes to its own line
Attachment #8463719 - Flags: review?(bugs) → review+
Attachment #8463720 - Attachment is obsolete: true
Comment on attachment 8464041 [details] [diff] [review]
Asynchronous tab switching for tabbrowser that waits on MozAfterRemotePaint to fire from a remote browser. r=?

Brought in handyman's changes.
Attachment #8464041 - Flags: review?(dao)
Hey Dao - is it possible to get a review on this in the next day or so? This is blocking an e10s M1 bug. If you don't have the cycles, just let me know, ok?

Thanks!
Flags: needinfo?(dao)
(In reply to Mike Conley (:mconley) from comment #110)
> Hey Dao - is it possible to get a review on this in the next day or so?

yep
Flags: needinfo?(dao)
Comment on attachment 8464041 [details] [diff] [review]
Asynchronous tab switching for tabbrowser that waits on MozAfterRemotePaint to fire from a remote browser. r=?

_cancelTabSwitch calling _deactivateContent and type="content-targetable" being set there is somewhat confusing, as that attribute shouldn't have any other value prior to that call.
Attachment #8464041 - Flags: review?(dao) → review+
(In reply to Dão Gottwald [:dao] from comment #112)
> Comment on attachment 8464041 [details] [diff] [review]
> Asynchronous tab switching for tabbrowser that waits on MozAfterRemotePaint
> to fire from a remote browser. r=?
> 
> _cancelTabSwitch calling _deactivateContent and type="content-targetable"
> being set there is somewhat confusing, as that attribute shouldn't have any
> other value prior to that call.

Good point. The reason for setting content-targetable is because we might be finalizing a tab switch, and setting the type of the browser we switched from.

What would you prefer I do? Should I only set content-targetable if the browser has content-primary in _deactivateContent?
Flags: needinfo?(dao)
How about you move the setAttribute call from _deactivateContent to _finalizeTabSwitch?
Flags: needinfo?(dao)
(In reply to Dão Gottwald [:dao] from comment #114)
> How about you move the setAttribute call from _deactivateContent to
> _finalizeTabSwitch?

Ah, yes, of course. Much better, thanks. :)
Implemented issues from smaug's review.
Unfortunately, I had to change the non-e10s code path a bit so I'm humbly requesting a quick re-review of that code.

A random note: I see that mconley has boosted his timeout from 200ms to 300ms... and I also note that, despite that, sometime in the last few days (week?), my macbook pro went from mostly switching tabs *before* timing out to mostly switching *by* timing out.  The experience isn't unpleasant but its worth knowing as time-to-switch-tabs is a perpetual concern.
Attachment #8463719 - Attachment is obsolete: true
Attachment #8465044 - Flags: review?(bugs)
Renaming the function call as requested by smaug.
Attachment #8464041 - Attachment is obsolete: true
Comment on attachment 8465044 [details] [diff] [review]
Need MozAfterRemotePaint Event for remote iframes (backend)

>exporti
>+nsPIDOMWindow::SendAfterRemotePaintIfRequested()
>+{
>+  if (!mSendAfterRemotePaint) {
>+    return;
>+  }
>+
>+  mSendAfterRemotePaint = false;
>+
>+  nsContentUtils::DispatchTrustedEvent(GetExtantDoc(),
>+                                       GetParentTarget(),
>+                                       NS_LITERAL_STRING("MozAfterRemotePaint"),
>+                                       false, false);
Hmm, so nsContentUtils::DispatchChromeEvent would work here, but not in the other case you want to dispatch the event...


>+TabParent::RecvRemotePaintIsReady()
>+{
>+  nsCOMPtr<mozilla::dom::EventTarget> target = do_QueryInterface(mFrameElement);
>+  if (!target) {
>+    NS_WARNING("Could not locate target for MozAfterRemotePaint message.");
>+    return true;
>+  }
>+
>+  nsContentUtils::DispatchTrustedEvent(mFrameElement->OwnerDoc(),
>+                                       target,
>+                                       NS_LITERAL_STRING("MozAfterRemotePaint"),
>+                                       false, false);
...we don't want this. This would leak MozAfterRemotePaint to non-chrome code in b2g at least.

So, 


nsCOMPtr<nsIDOMEvent> event;
NS_NewDOMEvent(getter_AddRefs(event), <your target, either the Window object or mFrameElement>, nullptr, nullptr);
event->InitEvent(NS_LITERAL_STRING("MozAfterRemotePaint"), false, false);
event->SetTrusted(true);
event->GetInternalNSEvent()->mFlags.mOnlyChromeDispatch = true;
bool dummy;
<your_target>->DispatchEvent(event, &dummy);




>+  TabChildBase* tabChildBase = static_cast<TabChildBase*>(iTabChildBase.get());
>+  TabChild *tabChild = static_cast<TabChild *>(tabChildBase);
* goes with the type

>+void
>+CompositorChild::RequestNotifyAfterRemotePaint(TabChild* tabChild)
aTabChild

>+   * Request that the parent tell us when graphics are ready on GPU.
>+   * When we get that message, we bounce it to the TabParent via
>+   * the TabChild
>+   * @param tabChild The object to bounce the note to.  Non-NULL.
>+   * @TODO: The treatment of the TabChild is unsafe
What does this TODO comment mean?


>+   */
>+  void RequestNotifyAfterRemotePaint(TabChild *tabChild);
TabChild* aTabChild


Those fixes, r+
Attachment #8465044 - Flags: review?(bugs) → review+
handyman: hey - are we ready to land this thing?
Flags: needinfo?(davidp99)
Flags: needinfo?(davidp99)
We are not ready...

In implementing smaug's review, I found that we are blocking sending the repaint request to the compositor until after the MozAfterRemotePaint timeout... in most cases.  We are blocking, so it doesn't matter what the timeout duration is (300ms or 3000ms... it'll just sit and wait that amount of time before switching tabs).  In those cases, the redraw request goes thru afterward.  I'm not sure if this is a race condition or if the successful times are a fluke based on something else.  Regardless, it would appear that this is not entirely our doing...

I've been able to confirm that this is the behavior when applying the patch to mozilla-central revision 196361 (Jul 28) but that the behavior is correct when applied to revision 196157 (Jul 25).  I will need more time to determine the actual cause and a resolution.
The issue I was seeing ended up being due to a race between the redraw request and the mechanism that hooked up the MozAfterRemotePaint listener.  It was not introduced by integrating changes -- its pretty random (although things did generally get slower in the last week).

I've moved the "redraw request" (activating the docShell) into the paintPromise.  Promise.race() isn't completely spec'ed but, assuming it is guaranteed to run all promises (regardless of whether one of them completes while it is starting the rest -- anything else would be weird), this change is fine.  The change means the tab switch rarely times-out (joy!).

mconley, what's our next move?
Attachment #8465045 - Attachment is obsolete: true
Flags: needinfo?(mconley)
Yep, that change makes sense to me. So, as I understand it, the problem was that the MozAfterRemotePaint was occurring faster than we could attach our listener.

I looked at the interdiff between your patch and my last patch[1], and your changes look spot on. I don't think we need to do another round of review.

I'm going to test these two patches, post my results, and then I'm going to go to bed. I'm on a holiday on Monday, so maybe someone else in #e10s can guide you on landing these?

Great job hunting this down. :)

[1]: https://bugzilla.mozilla.org/attachment.cgi?oldid=8465045&action=interdiff&newid=8466572&headers=1
Flags: needinfo?(mconley)
Looks good from testing. I think we're cleared to land!
Comment 112 doesn't seem to addressed yet.
Ah, yes - I never uploaded that patch. I'll apply handyman's changes and re-upload.
Attachment #8466572 - Attachment is obsolete: true
Keywords: checkin-needed
Attachment #8466658 - Attachment description: Asynchronous tab switching for tabbrowser that waits on MozAfterRemotePaint to fire from a remote browser → Asynchronous tab switching for tabbrowser that waits on MozAfterRemotePaint to fire from a remote browser. (r+'d by dao)
Attachment #8466658 - Flags: review+
Attachment #8466570 - Attachment description: Need MozAfterRemotePaint Event for remote iframes (backend) → Need MozAfterRemotePaint Event for remote iframes (backend) (r+'d by smaug)
Attachment #8466570 - Flags: review+
I think this is causing a perma-orange on fx-team:

https://tbpl.mozilla.org/php/getParsedLog.php?id=45262912&tree=Fx-Team

Once the retrigger confirms, I'll back out and see what's going on.
Backed out.

remote:   https://hg.mozilla.org/integration/fx-team/rev/604268a63c4f
remote:   https://hg.mozilla.org/integration/fx-team/rev/eb6854a81836
remote:   https://hg.mozilla.org/integration/fx-team/rev/135d84fc684b

Dug into this a little, and it looks like in the test that's failing here (http://hg.mozilla.org/mozilla-central/file/71497ed2e0db/dom/events/test/test_bug457672.html), we're not getting and focus or blur events. The reason seems to be that the window that's being opened errors out when attempting to switch tabs to and from it with the following error:

"toBrowser.QueryInterface(...).frameLoader.sendRequestNotifyAfterRemotePaint is not a function"

Argh, that's supposed to be requestNotifyAfterRemotePaint, not sendRequestNotifyAfterRemotePaint.

I'm going to correct that, push to try, and on green, re-push.
Whiteboard: [fixed-in-fx-team]
(In reply to Mike Conley from comment #6)
> nsRefPtr<dom::TabChildBase> tabChildBase(do_QueryReferent(mWeakTabChild));

As you subsequently worked out, TabChildBase isn't an interface and you therefore can't call XPCOM functions on it. Until I landed bug 514280 however, you could happily write this and the compiler would silently reinterpret_cast (not even static_cast) the pointer for you.
Re: comment 134, that's very good to know, thanks.  I knew that the IID was related to the XPCOM implementation but that explains the relationship to the class hierarchy.  I'm trying to hold all of these tiny XPCOM steps in my head but each new one makes me feel less like I understand the 'right' model for using it (For example: why would TabChildBase, or any interface subclass, *not* get an IID?  Is it not automatic for a reason?).  However, this, at least, reassures that the ugly static_cast is in fact not a bug-in-waiting.  As the comments suggested, that had made me a bit uneasy.
https://hg.mozilla.org/mozilla-central/rev/736c8625bcf7
https://hg.mozilla.org/mozilla-central/rev/7c813e99202d
https://hg.mozilla.org/mozilla-central/rev/06b6aa5c734d
Status: ASSIGNED → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
Whiteboard: [fixed-in-fx-team]
Target Milestone: --- → mozilla34
See Also: → 1049548
Depends on: 1049551
Depends on: 1049548
See Also: 1049548
Depends on: 1050447
Depends on: 1059036
Depends on: 1067095
Depends on: 1077758
Depends on: 1081156
Depends on: 1108555
I filed bug 1132449 on issues that came up with the deck changes.
Depends on: 1102017
You need to log in before you can comment on or make changes to this bug.