Closed Bug 709492 Opened 8 years ago Closed 8 years ago

Suppress draw events and browser size changes during navigation

Categories

(Firefox for Android :: General, defect, P1)

ARM
Android
defect

Tracking

()

VERIFIED FIXED
Firefox 12
Tracking Status
firefox11 --- verified
firefox12 --- verified
fennec 11+ ---

People

(Reporter: pcwalton, Assigned: pcwalton)

References

Details

(Keywords: dev-doc-complete)

Attachments

(4 files, 23 obsolete files)

1.41 KB, patch
bzbarsky
: review+
Details | Diff | Splinter Review
5.08 KB, patch
kats
: review+
Details | Diff | Splinter Review
5.90 KB, patch
kats
: review+
Details | Diff | Splinter Review
4.31 KB, patch
Details | Diff | Splinter Review
A fair bit of our jumpiness during navigation is due to altering the browser size while the page is loading. I experimented with just hiding the page from pagehide of the previous page to DOMContentLoaded of the next page and that made all the jumpiness go away. However, this is suboptimal because it's too blunt: the previous page (the one the user is navigating from) is actually visible on the screen for some time after the "pagehide" event, and the next page (the one the user is navigating to) can be usefully rendered anytime after we reach the <body> tag. (We can't render before then because the <meta viewport> might not have been parsed yet.)

My current thinking is that we should just suppress draw events at the widget level (thus ensuring that the previous page remains on the screen) and avoid altering the size of the <browser> from pagehide until there's at least one element under <body> in the new page.

Matt and Mark, I imagine you dealt with these issues in XUL Fennec -- do you have any insights?
Duplicate of this bug: 709491
(In reply to Patrick Walton (:pcwalton) from comment #0)

> Matt and Mark, I imagine you dealt with these issues in XUL Fennec -- do you
> have any insights?

Dealing with the "viewport" while a new page was loading was one of the biggest pains in XUL Fennec, and I don't know that it was satisfactorily handled. One thing we did was work under the assumption that no page could ever be smaller than the screen, even extending the length of the viewport if the page's height didn't extend to the bottom of the screen.

Could we use that assumption here? We should never really see a "jumping" viewport if we assume the viewport can't be smaller than the screen. That would allow the page to start filling in as it loads too.
There seem to be two main issues here:

(1) <meta viewport> gets applied too early or late. To reproduce, go from some site without <meta viewport> (news.ycombinator.com for example) *when fully zoomed out* to about:fennec and vice versa. You will see jumpiness. Commenting out the <meta viewport> handling code fixes this issue.

(2) Clicking on a link when zoomed in causes the page to zoom out too early. To reproduce this, zoom in to news.ycombinator.com and click on a comments link. You will see the viewport move around a bit before the new page loads. This doesn't have to do with <meta viewport>.

I don't believe there is one solution that will solve both problems. We need to tackle them individually.
(In reply to Mark Finkle (:mfinkle) from comment #2)
> Could we use that assumption here? We should never really see a "jumping"
> viewport if we assume the viewport can't be smaller than the screen. That
> would allow the page to start filling in as it loads too.

Well, one of the problems here is that the viewport starts to change before the new page appears on the screen. Thus the metrics of the new page (the one being navigated to) get applied to the previous page for a little bit, which is bizarre -- the user sees a jump to the top of the screen, for example.

I think that, unless we can get more precise events that say exactly when the new page is going to be rendered, we'll have to display a blank screen (or cached content of the page being navigated from) during the time between pagehide and pageshow. Pagehide is fired before the page is actually hidden, unfortunately, so we can't just delay viewport updates until we receive a pagehide.
Another issue that just occurred to me: It's possible for bits of the old page to be buffered in the texture before texture upload occurs. We should have a flag in the metadata that, when set, causes the screen to be blanked out.
Progress so far: Hiding the page on pagehide works for clicking on links, but it's too late for the back button: pagehide is sometimes fired too late when going back. However, nsIWebProgressListener.onStateChange seems to work fine: it seems to always be fired early enough for us to make the browser invisible in time for the viewport to be set.

I now have a patch that makes navigation flash-free by setting the opacity of the browser to 0 during page transitions. However, there are a few issues:

(1) At the moment I have <meta viewport> support disabled in my patch so I can focus on issue #2 above. When it's reenabled, I'll need to make sure that the viewport is parsed before making the browser visible.

(2) I'm not at all confident that this doesn't badly break the stop button. Should be a simple fix though (just make the browser visible again when we detect a stop).

(3) Setting the opacity of the browser to 0 is a gross hack. It also causes checkerboards to flash for a bit, because the viewport is still being set even though the browser isn't visible. Instead, I think this should be something that's communicated up to Java. GeckoSoftwareLayerClient should be aware that a page transition is taking place and the normal rendering logic can be short-circuited. (Actually, we could even make widget aware of this so that it can suppress draws when we aren't going to display them anyway, which might improve our page load time.)

Finally, I'm not sure whether a white page is what we should be displaying during early page load. It should in theory be possible to instead just show the cached content of the previous page while the new page is loading. This would avoid ugly white screens. The downside here is that the user might try clicking on links on the previous page, which would be ignored. On the other hand, I'm not sure users really expect clicks to work during navigation anyway. They seem inherently unreliable.

Opinions here?
Note to self: We should be using the chromeEventHandler on the docshell instead of letting pagehide and pageshow events bubble up to the outer window and/or chrome. We miss a lot of pagehide and pageshow events otherwise.
Depends on: 710096
Depends on: 711333
Priority: -- → P1
Part 1 adds the ability for the metadata provider to tell the compositor to stop drawing.
Assignee: nobody → pwalton
Status: NEW → ASSIGNED
WIP patch part 2 adds expose events, which allows JS to tell Java to start drawing again.
WIP patch part 3 (this is the most incomplete one) makes us hide and expose in the right place. This should work, but it comments out a bunch of stuff (like <meta viewport>). It's also gated on the dependency in bug 711333 to avoid being race-free.
Here are some new WIPs. The first one adds an observer to fire when the document is actually shown. I'm not actually sure if this is needed.
Attachment #582451 - Attachment is obsolete: true
Attachment #582452 - Attachment is obsolete: true
Attachment #582453 - Attachment is obsolete: true
WIP patch part 2 version 2 implements draw suppression.
WIP patch part 3 version 2 implements expose events.
WIP patch part 4 version 2 fires hide and expose events as necessary. This is the part that is the most "WIP". There is still a race in the commented-out code in GeckoSoftwareLayerClient. Uncommenting it causes flashes on page transitions. We should have a solution for this before landing.
Patch part 1, version 3 adds an event to the nsIObserverService that fires when a new page is shown. We need this so that we know when to start drawing again. (We suppress drawing from the time a document starts loading to the time the new page is shown.)
Attachment #582498 - Attachment is obsolete: true
Attachment #583004 - Flags: review?(bzbarsky)
Patch part 2 adds the ability for browser.js to suppress drawing.
Attachment #582499 - Attachment is obsolete: true
Attachment #583006 - Flags: review?(bugmail.mozilla)
Patch part 3, version 3 adds the ability for Java to tell browser.js that the page is safe to show again.
Attachment #582500 - Attachment is obsolete: true
Attachment #583012 - Flags: review?(bugmail.mozilla)
Patch part 4 hides the browser from the time a link is clicked to the time the new page is shown.
Attachment #582501 - Attachment is obsolete: true
Attachment #583014 - Flags: review?(bugmail.mozilla)
Depends on: 708746
Patch part 3, version 4 fixes a bug whereby expose events would be fired for subdocuments as well as the main document. This is fixed with a check in the document-shown event handler.
Attachment #583012 - Attachment is obsolete: true
Attachment #583012 - Flags: review?(bugmail.mozilla)
Attachment #583073 - Flags: review?(bugmail.mozilla)
Attachment #583014 - Attachment is obsolete: true
Attachment #583014 - Flags: review?(bugmail.mozilla)
Comment on attachment 583073 [details] [diff] [review]
Proposed patch, part 4, version 4: Hide and expose the visible area as necessary.

Accidentally uploaded the patch under the wrong name. Fixing.
Attachment #583073 - Attachment description: Proposed patch, part 3, version 4: Implement expose events. → Proposed patch, part 4, version 4: Implement expose events.
Attachment #583073 - Attachment description: Proposed patch, part 4, version 4: Implement expose events. → Proposed patch, part 4, version 4: Hide and expose the visible area as necessary.
Attachment #583012 - Attachment is obsolete: false
Attachment #583012 - Flags: review?(bugmail.mozilla)
These patches broke the loading of the first page. Patch part 5 fixes it. (Sometimes the viewport is wrong for a bit, though; I believe this is an issue with how session store saves and/or restores viewport metrics.)
Attachment #583077 - Flags: review?(bugmail.mozilla)
Attachment #583077 - Attachment description: Proposed patch, part 4, version 5: Fix initial page load. → Proposed patch, part 5: Fix initial page load.
Comment on attachment 583006 [details] [diff] [review]
Proposed patch, part 2, version 3: Implement draw suppression.

>diff --git a/mobile/android/chrome/content/browser.js b/mobile/android/chrome/content/browser.js
>+var MetadataProvider = {
>+  getDrawMetadata: function getDrawMetadata() {
>+    return BrowserApp.getDrawMetadata();
>+  },
>+
>+  drawingAllowed: function drawingAllowed() {
>+    return !BrowserApp.selectedTab.suppressDrawing;
>+  }
>+};
>+

There's no suppressDrawing field in this patch. You should probably add the field and just leave it false in this patch. I guess leaving it undefined has the same effect but is less good.

r+ with above change
Attachment #583006 - Flags: review?(bugmail.mozilla) → review+
Comment on attachment 583012 [details] [diff] [review]
Proposed patch, part 3, version 3: Implement expose events.

>diff --git a/widget/src/android/nsWindow.cpp b/widget/src/android/nsWindow.cpp
>     bool shouldDraw = true;
>     if (metadataProvider) {
>         metadataProvider->DrawingAllowed(&shouldDraw);
>     }
>     if (!shouldDraw) {
>         return;
>     }
> 
>+    // If this is a DRAW event (not an EXPOSE event), ask the metadata provider whether we should
>+    // draw, and bail out if it tells us not to.

Looks like a bad merge here.. this comment looks like it should be above the previous block of code, and there should be a check to do what the comment says is happening...
Attachment #583012 - Flags: review?(bugmail.mozilla) → review-
Comment on attachment 583073 [details] [diff] [review]
Proposed patch, part 4, version 4: Hide and expose the visible area as necessary.

>diff --git a/mobile/android/chrome/content/browser.js b/mobile/android/chrome/content/browser.js
>+    } else if (aTopic == "document-shown") {
>+      let tab = this.selectedTab;
>+      if (tab.browser.contentDocument != aSubject) {
>+        return;
>+      }
>+
>+      tab.shown = true;

This "shown" variable seems to never be used. Take it out? (I haven't looked at patch part 5 yet, so maybe it's used there. If so, adding this variable should be moved to that patch)

>+      ViewportHandler.updateMetadata(tab);
>+      tab.sendExposeEvent();

[snip]

>+  sendExposeEvent: function() {
>+    // Now that the document is actually on the screen, send an expose event.
>+    this._timer = Cc["@mozilla.org/timer;1"].createInstance(Ci.nsITimer);
>+    sendMessageToJava({
>+      gecko: {
>+        type: "Viewport:Expose"
>+      }
>+    });
>+  },

[snip]

>+        if ("Viewport:Expose".equals(event)) {
>+            GeckoAppShell.sendEventToGecko(new GeckoEvent("Viewport:Expose", ""));
>+            GeckoAppShell.sendEventToGecko(new GeckoEvent(GeckoEvent.EXPOSE));

[snip]

>+    } else if (aTopic == "Viewport:Expose") {
>+      this.expose();

[snip]

>+  expose: function() {
>+    this.selectedTab.suppressDrawing = false;
>+  },

I don't understand the point of doing this Viewport:Expose event round-trip. Can't you just set tab.suppressDrawing = false directly in the document-shown handler, and then not have to send a Viewport:Expose back from Java to JS? You can keep the GeckoEvent.EXPOSE sent to the nsWindow as-is. It's also kind of confusing to have two Viewport:Expose events.

>   get defaultBrowserWidth() {
>     delete this.defaultBrowserWidth;
>     let width = Services.prefs.getIntPref("browser.viewport.desktopWidth");
>     return this.defaultBrowserWidth = width;
>-   }
>+   },

Might as well fix the indent here while you're touching this code.

>   sendViewportUpdate: function() {
>     if (BrowserApp.selectedTab != this)
>       return;
>     sendMessageToJava({
>       gecko: {
>-        type: "Viewport:Update",
>+        type: "Viewport:UpdateLater",
>         viewport: JSON.stringify(this.viewport)
>       }

I'm worried about the implications of changing all the Update events to UpdateLater. What happens if there's no draw right after an UpdateLater? The code in GSLC will set the mUpdateViewportOnEndDraw flag but the viewport metrics won't actually get updated, so the JS and Java viewports will be out of sync. Or if Java sends a Viewport:Change event to JS and clobbers the JS viewport in between receiving the UpdateLater and the next draw?
Attachment #583073 - Flags: review?(bugmail.mozilla) → review-
Attachment #583077 - Flags: review?(bugmail.mozilla) → review+
(In reply to Kartikaya Gupta (:kats) from comment #24)
> I'm worried about the implications of changing all the Update events to
> UpdateLater. What happens if there's no draw right after an UpdateLater? The
> code in GSLC will set the mUpdateViewportOnEndDraw flag but the viewport
> metrics won't actually get updated, so the JS and Java viewports will be out
> of sync. Or if Java sends a Viewport:Change event to JS and clobbers the JS
> viewport in between receiving the UpdateLater and the next draw?

I feel as though Viewport:Update was always fundamentally broken: you have to draw when changing the viewport, or the tile will be in the right place with the wrong content. It has to be paired with a draw. Maybe the right solution here is to just have browser.js able to initiate a draw.
> I feel as though Viewport:Update was always fundamentally broken: you have
> to draw when changing the viewport, or the tile will be in the right place
> with the wrong content. It has to be paired with a draw. Maybe the right
> solution here is to just have browser.js able to initiate a draw.

Agreed. If we can initate a draw from browser.js and update the viewport on the endDrawing with that, that would be much closer to the right solution, I think.
Patch part 3 version 4 fixes the comment.
Attachment #583012 - Attachment is obsolete: true
Attachment #583230 - Flags: review?(bugmail.mozilla)
Patch part 4 version 5 removes the use of Viewport:UpdateLater as well as the expose round-trip.
Attachment #583073 - Attachment is obsolete: true
Attachment #583233 - Flags: review?(bugmail.mozilla)
Comment on attachment 583230 [details] [diff] [review]
Proposed patch, part 3, version 4: Implement expose events.

>diff --git a/widget/src/android/nsWindow.cpp b/widget/src/android/nsWindow.cpp
>     /*
>-     * Check to see whether browser.js wants us to draw. This will be false during page
>-     * transitions, in which case we immediately bail out.
>+     * If this is a DRAW event (not an EXPOSE event), check to see whether browser.js wants us to
>+     * draw. This will be false during page transitions, in which case we immediately bail out.
>      */
>     nsCOMPtr<nsIAndroidDrawMetadataProvider> metadataProvider =
>         AndroidBridge::Bridge()->GetDrawMetadataProvider();
> 
>     bool shouldDraw = true;
>     if (metadataProvider) {
>         metadataProvider->DrawingAllowed(&shouldDraw);
>     }
>     if (!shouldDraw) {
>         return;
>     }

Shouldn't this code be checking ae->Type() == AndroidGeckoEvent::DRAW (as per the comment)?
Comment on attachment 583233 [details] [diff] [review]
Proposed patch, part 4, version 5: Hide and expose the visible area as necessary.

>diff --git a/mobile/android/base/gfx/GeckoSoftwareLayerClient.java b/mobile/android/base/gfx/GeckoSoftwareLayerClient.java
>     public void handleMessage(String event, JSONObject message) {
>-        if ("Viewport:Update".equals(event)) {
>-            beginTransaction(mTileLayer);
>-            try {
>-                updateViewport(message.getString("viewport"), false);
>-            } catch (JSONException e) {
>-                Log.e(LOGTAG, "Unable to update viewport", e);
>-            } finally {
>-                endTransaction(mTileLayer);
>-            }
>+        if ("Viewport:Expose".equals(event)) {
>+            GeckoAppShell.sendEventToGecko(new GeckoEvent("Viewport:Expose", ""));

This event does nothing.

>+            GeckoAppShell.sendEventToGecko(new GeckoEvent(GeckoEvent.EXPOSE));
>+        } else if ("Viewport:UpdateAndDraw".equals(event)) {
>+            mUpdateViewportOnEndDraw = true;
>+
>+            // Redraw everything.
>+            Rect rect = new Rect(0, 0, mBufferSize.width, mBufferSize.height);
>+            GeckoAppShell.sendEventToGecko(new GeckoEvent(GeckoEvent.DRAW, rect));

So I think there's still a possible race condition here with the following sequence of events:
1) Animation starts in PZC
2) PZC changes the viewport, and a Viewport:Update event is sent to gecko and put in gecko's queue.
3) Gecko or user javascript changes the viewport in gecko
4) browser.js sends Viewport:UpdateAndDraw because of step 3. That comes in here, sets mUpdateViewportOnEndDraw and sends GeckoEvent.DRAW which gets queued in gecko
5) The Viewport:Update event from (2) gets processed, clobbering the new viewport set in (3)
6) The DRAW from (4) gets processed, and draws the old position (not the new viewport set in (2)).

Thoughts?

>+    } else if (aTopic == "document-shown") {
>+      let tab = this.selectedTab;
>+      if (tab.browser.contentDocument != aSubject) {
>+        return;
>+      }
>+
>+      ViewportHandler.updateMetadata(tab);
>+      tab.sendExposeEvent();
>+      tab.suppressDrawing = false;

The suppressDrawing should probably be before the sendExposeEvent() even though since it's a timer it won't make much difference.
Patch part 3 version 5 implements expose events.
Attachment #583230 - Attachment is obsolete: true
Attachment #583230 - Flags: review?(bugmail.mozilla)
Attachment #583307 - Flags: review?(bugmail.mozilla)
(In reply to Kartikaya Gupta (:kats) from comment #29)
> Comment on attachment 583230 [details] [diff] [review]
> Proposed patch, part 3, version 4: Implement expose events.
> 
> >diff --git a/widget/src/android/nsWindow.cpp b/widget/src/android/nsWindow.cpp
> >     /*
> >-     * Check to see whether browser.js wants us to draw. This will be false during page
> >-     * transitions, in which case we immediately bail out.
> >+     * If this is a DRAW event (not an EXPOSE event), check to see whether browser.js wants us to
> >+     * draw. This will be false during page transitions, in which case we immediately bail out.
> >      */
> >     nsCOMPtr<nsIAndroidDrawMetadataProvider> metadataProvider =
> >         AndroidBridge::Bridge()->GetDrawMetadataProvider();
> > 
> >     bool shouldDraw = true;
> >     if (metadataProvider) {
> >         metadataProvider->DrawingAllowed(&shouldDraw);
> >     }
> >     if (!shouldDraw) {
> >         return;
> >     }
> 
> Shouldn't this code be checking ae->Type() == AndroidGeckoEvent::DRAW (as
> per the comment)?

Oops, yes. Fixed.
(In reply to Kartikaya Gupta (:kats) from comment #30)
> So I think there's still a possible race condition here with the following
> sequence of events:
> 1) Animation starts in PZC
> 2) PZC changes the viewport, and a Viewport:Update event is sent to gecko
> and put in gecko's queue.
> 3) Gecko or user javascript changes the viewport in gecko
> 4) browser.js sends Viewport:UpdateAndDraw because of step 3. That comes in
> here, sets mUpdateViewportOnEndDraw and sends GeckoEvent.DRAW which gets
> queued in gecko
> 5) The Viewport:Update event from (2) gets processed, clobbering the new
> viewport set in (3)
> 6) The DRAW from (4) gets processed, and draws the old position (not the new
> viewport set in (2)).
> 
> Thoughts?

I'm not sure I understand. Viewport:Update doesn't exist anymore after this patch; do you mean Viewport:Change? Assuming you do, the DRAW event should always query the current viewport via the nsIAndroidDrawMetadataProvider, so it should be correct with respect to the contents. I guess you might want animation to override scroll changes set by content JavaScript, but that seems like a separate bug needing separate heuristics.
Attachment #583307 - Flags: review?(bugmail.mozilla) → review+
(In reply to Patrick Walton (:pcwalton) from comment #33)
> I'm not sure I understand. Viewport:Update doesn't exist anymore after this
> patch; do you mean Viewport:Change?

Sorry, yes, I meant Viewport:Change.

> Assuming you do, the DRAW event should
> always query the current viewport via the nsIAndroidDrawMetadataProvider, so
> it should be correct with respect to the contents. I guess you might want
> animation to override scroll changes set by content JavaScript, but that
> seems like a separate bug needing separate heuristics.

We don't want animation to override scrolll changes set by content Javascript. However, I believe that will become possible because of the changes in this patch. The DRAW event will use the "current" viewport, but my argument is that the "current" viewport will no longer be what the content Javascript set it to, because it is possible for a pending Viewport:Change event to sneak in and override it in between. I'll see if I can create a test page that reliably reproduces this problem.
Ok, so after much trying I wasn't able to produce the test case I wanted. For each scenario I tried to produce there was a different mitigating factor that prevented the bug from manifesting, so I guess with the current code the scenario I'm thinking of can't actually happen. Will r+ but don't forget to take out this line:

>+            GeckoAppShell.sendEventToGecko(new GeckoEvent("Viewport:Expose", ""));

before landing.
Attachment #583233 - Flags: review?(bugmail.mozilla) → review+
Comment on attachment 583004 [details] [diff] [review]
Proposed patch, part 1, version 3: Add an observer for page show.

Is there a reason that you're not using mozilla::services::GetObserverService()?  The code to get the observer service in this patch is not safe....

Also, generally running JS under show() is probably not safe.  Can you do that part async?
Attachment #583004 - Flags: review?(bzbarsky) → review-
bz, how about this one?
Attachment #583004 - Attachment is obsolete: true
Attachment #583967 - Flags: review?
Comment on attachment 583967 [details] [diff] [review]
Proposed patch, part 1, version 4: Add an observer for page show.

This leaks, no?
Oh right, thanks. Here's a new one.
Attachment #583967 - Attachment is obsolete: true
Attachment #583967 - Flags: review?
Attachment #584008 - Flags: review?
Attachment #584008 - Flags: review? → review?(bzbarsky)
Duplicate of this bug: 713861
I'm still trying to understand the overall idea here.  Why are there any draw events at all before show()?
(In reply to Boris Zbarsky (:bz) from comment #41)
> I'm still trying to understand the overall idea here.  Why are there any
> draw events at all before show()?

I had assumed that there could be draw events that occur from the time a page starts loading to show() when, for example, an animation is going on on the page being navigated from. Is that not the case?
Here's the bigger picture: In an environment with asynchronous scrolling like Native Fennec, the front end has to know about the dimensions and <meta viewport> information of the page at all times. browser.js informs Java when it sees (at least) (a) a page changing size; (b) <meta viewport> tag is added to the DOM; (c) a new window is created. However, during navigation a DOM is being created in the background while the previous page is still being shown, and we receive events for that new DOM, which causes strange visual effects during navigation.

Thus the simplest solution seemed to be (a) make sure all page size and <meta viewport> information is always communicated via extra data attached to DRAW events; (b) make sure that DRAW events are suppressed during page transitions until the next page is shown. These patches implement this.
Ah, ok.  So you're actually trying to suppress draw events for the page being navigated _from_.

When does this suppression start?  What happens if the page being navigated to is something that won't render in the browser but will be downloaded or will render with a helper app instead?
(In reply to Boris Zbarsky (:bz) from comment #44)
> Ah, ok.  So you're actually trying to suppress draw events for the page
> being navigated _from_.
> 
> When does this suppression start?  What happens if the page being navigated
> to is something that won't render in the browser but will be downloaded or
> will render with a helper app instead?

It starts when our implementation of nsIWebProgressListener's onStateChange method is called with the STATE_START state flag. You're right that it will probably break those two cases. It'd probably be better to instead suppress draws from the time the new DOM is created to the time it's shown; do you have an event we can listen to for that?
Perhaps the DOMWindowCreated event that was added in bug 567357? Going to try that.
What you really want is inner-window creation.  And you probably want to limit it to toplevel browsers, not subframes?  And yeah, DOMWindowCreated is what you want there.

Another option, if your browser.js is in-process with the content, is to expose a getter that indicates whether the document is actually showing yet or not, and have all the code that handles the <meta viewport> check that state, but it sounds like you've got the other approach pretty much implemented already...

With the approach in your last patch, you don't really need the weakref on the document, do you?  Since you're not holding a ref to the runnable itself, there's no reference cycle.
Here's a new version that doesn't use the weak reference. I suspected it wasn't necessary, but I wanted to be sure.
Attachment #584008 - Attachment is obsolete: true
Attachment #584008 - Flags: review?(bzbarsky)
Attachment #584689 - Flags: review?(bzbarsky)
Comment on attachment 584689 [details] [diff] [review]
Proposed patch, part 1, version 6: Add an observer for page show.

Just make nsDocumentShownDispatcher inherit from nsRunnable, and have its Run() method do the work of dispatching the observer service notification.

And the NS_DispatchToMainThread should happen in Show(); the current code is passing 0-refcount XPCOM objects around , which is generally a bad idea.

Also, you don't need to explicitly pass DISPATCH_NORMAL.  That's the default value.
Attachment #584689 - Flags: review?(bzbarsky) → review-
Patch part 1, version 7 addresses review comments.
Attachment #584689 - Attachment is obsolete: true
Attachment #584710 - Flags: review?(bzbarsky)
Patch part 4, version 6 avoids relying on STATE_START, as discussed.
Attachment #583233 - Attachment is obsolete: true
Attachment #584711 - Flags: review?(bugmail.mozilla)
Comment on attachment 584710 [details] [diff] [review]
Proposed patch, part 1, version 7: Add an observer for page show.

The constructor argument should just be nsIDocument*.  r=me with that.
Attachment #584710 - Flags: review?(bzbarsky) → review+
Comment on attachment 584711 [details] [diff] [review]
Proposed patch, part 4, version 6: Hide and expose the visible area as necessary.

I'd like the handler for "document-shown" to be moved to the Tab object, like "http-on-modify-request" is done.

Tab.observe() is impl'd to only handle a single topic, but that can be fixed to use an "if" or "switch"
Looking over the patches, I wanted to question the initial premise of this bug. The bug summary mentions suppressing draw events and browser size changes. I thought we only wanted to stop browser size changes, and the animations that go along with them.

Previously, in XUL Fennec, we _wanted_ to display page content as the page loaded to help with perceived page load speed: waiting to the end of the load to draw any content makes us seem to load the page slower.
(In reply to Mark Finkle (:mfinkle) from comment #54)
> Looking over the patches, I wanted to question the initial premise of this
> bug. The bug summary mentions suppressing draw events and browser size
> changes. I thought we only wanted to stop browser size changes, and the
> animations that go along with them.
> 
> Previously, in XUL Fennec, we _wanted_ to display page content as the page
> loaded to help with perceived page load speed: waiting to the end of the
> load to draw any content makes us seem to load the page slower.

In earlier versions of these patches, we suppressed draws until a significant amount of the page being navigated to had loaded. But not anymore: with the current patch set we stop suppressing draw events after Show() has fired. And Gecko is hard-wired at the engine level not to draw the page being navigated to before Show() is called.

So, fortunately, these patches don't affect when we start drawing during navigation anymore. There should be no effect on perceived page load speed. They only get rid of the viewport jumpiness.
Comment on attachment 584711 [details] [diff] [review]
Proposed patch, part 4, version 6: Hide and expose the visible area as necessary.

Is a DOMWindowCreated always going to be followed by a document-shown event? Are there any scenarios/error conditions in which the page load might be aborted without sending the document-shown event that would leave us with drawing incorrectly supressed? I can't think of anything off the top of my head but I also don't know that code. Based on bz's comments above I'll assume there is no such scenario, so r=me.
Attachment #584711 - Flags: review?(bugmail.mozilla) → review+
Duplicate of this bug: 709778
Duplicate of this bug: 711743
Samsung Nexus S (Android 4.0.3)
Samsung Galaxy SII (Android 2.3.4)
20111230040819
http://hg.mozilla.org/mozilla-central/rev/9fdaea5d67e2
Status: RESOLVED → VERIFIED
Attachment #583006 - Flags: approval-mozilla-aurora?
Attachment #583077 - Flags: approval-mozilla-aurora?
Backed out due to breaking external URLs when the browser is closed:

http://hg.mozilla.org/integration/mozilla-inbound/rev/b3113b7753c1
Status: VERIFIED → REOPENED
Resolution: FIXED → ---
Part 6 of this patch fixes the external URI loading. There were actually two bugs here: one was that document-shown is fired *before* DOMWindowCreated for the first URI loaded, and one is that the screen size wasn't being communicated to Gecko for the first draw event. Fixing these two allow initial URIs to load as before.
Attachment #585100 - Flags: review?(bugmail.mozilla)
Patch part 4, version 7 makes the change requested by Mark above, integrates the fix in bug 714230, and uses ViewportHandler.resetMetadata() as patch part 5 used to do. Figured I might as well clean this up before this relands.
Attachment #583077 - Attachment is obsolete: true
Attachment #584711 - Attachment is obsolete: true
Attachment #583077 - Flags: approval-mozilla-aurora?
Attachment #585111 - Flags: review?(bugmail.mozilla)
Attachment #585100 - Attachment description: Proposed patch, part 6: Fix external URI loading. → Proposed patch, part 5: Fix external URI loading.
Attachment #585100 - Flags: review?(bugmail.mozilla) → review+
Attachment #585111 - Flags: review?(bugmail.mozilla) → review+
Comment on attachment 585111 [details] [diff] [review]
Proposed patch, part 4, version 7: Hide and expose the visible area as necessary.

>+  sendExposeEvent: function() {
>+    // Now that the document is actually on the screen, send an expose event.
>+    this._timer = Cc["@mozilla.org/timer;1"].createInstance(Ci.nsITimer);
>+    sendMessageToJava({
>+      gecko: {
>+        type: "Viewport:Expose"
>+      }
>+    });
>+  },

Looking over this patch again I realized that the _timer is never used, and shouldn't be created.
So I did some testing, and bz's suggested approach seems much cleaner and more reliable. The presentation shell has a PaintingSuppressed member that corresponds to what we want. All we have to do is forward that to widget and suppress drawing when that flag is on. I'll have patches up in a bit.
Here's part 1 an alternative approach, suggested by bz. It seems much simpler and more reliable. Instead of reimplementing painting suppression, it uses the painting suppression flag (IsPaintingSuppressed()) in the presentation shell.
Attachment #584710 - Attachment is obsolete: true
Attachment #585534 - Flags: review?
Patch part 2, version 4 is pretty much the same as version 3, but it renames drawingAllowed() to paintingSuppressed() to match the presentation shell better.
Attachment #583006 - Attachment is obsolete: true
Attachment #583006 - Flags: approval-mozilla-aurora?
Attachment #585536 - Flags: review?(bugmail.mozilla)
Patch part 3 changes Update to UpdateAndDraw, as part 4 did before. This is basically just a subset of part 4.

This is all that's needed in my testing to eliminate the jumpiness.
Attachment #583307 - Attachment is obsolete: true
Attachment #585100 - Attachment is obsolete: true
Attachment #585111 - Attachment is obsolete: true
Attachment #585537 - Flags: review?(bugmail.mozilla)
Attachment #585534 - Flags: review? → review?(bzbarsky)
Attachment #585536 - Flags: review?(bugmail.mozilla) → review+
Comment on attachment 585537 [details] [diff] [review]
Proposed patch, part 3, version 8: Change Update to UpdateAndDraw.

LGTM. This is much simpler than the previous approach.
Attachment #585537 - Flags: review?(bugmail.mozilla) → review+
Comment on attachment 585534 [details] [diff] [review]
Proposed patch, part 1, version 8: Allow JS to determine whether painting is suppressed.

presShell can end up null, in general.  For your use case it probably doesn't matter what the behavior is in that case, since toplevel content should never hit it; throwing would make sense to me.

r=me with that fixed.
Attachment #585534 - Flags: review?(bzbarsky) → review+
I need this patch too to avoid breaking the first page. It's identical to patch part 5 minus the draw suppression stuff, so I'll carry forward r+ since I'd like to get this relanded.
Adding dev-doc-needed because this changes nsIDOMWindowUtils.
Keywords: dev-doc-needed
Status: REOPENED → ASSIGNED
Comment on attachment 585534 [details] [diff] [review]
Proposed patch, part 1, version 8: Allow JS to determine whether painting is suppressed.

[Approval Request Comment]
Regression caused by (bug #): None
User impact if declined: Jumpiness during nearly all page transitions.
Testing completed (on m-c, etc.): Baking now.
Risk to taking this patch (and alternatives if risky): There could be instances in which painting doesn't happen if the presentation shell doesn't invalidate the right region after turning off painting suppression.
Attachment #585534 - Flags: approval-mozilla-aurora?
Attachment #585536 - Flags: approval-mozilla-aurora?
Attachment #585537 - Flags: approval-mozilla-aurora?
Attachment #585659 - Flags: approval-mozilla-aurora?
Comment on attachment 585534 [details] [diff] [review]
Proposed patch, part 1, version 8: Allow JS to determine whether painting is suppressed.

[Triage Comment]
Mobile only and a top MTD request - approved for Aurora.
Attachment #585534 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Attachment #585536 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Attachment #585537 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Attachment #585659 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
tracking-fennec: --- → 11+
Verified this following the steps from comment #3 and the issues are not reproducible anymore. 

Aurora: Mozilla /5.0 (Android;Linux armv7l;rv:11.0a2) Gecko/20120124 Firefox/11.0a2 Fennec/11.0a2
Nightly: Mozilla /5.0 (Android;Linux armv7l;rv:12.0a1) Gecko/20120124 Firefox/12.0a1 Fennec/12.0a1

Device: LG Optimus 2X (Android 2.2)
Status: RESOLVED → VERIFIED
Blocks: 706309
You need to log in before you can comment on or make changes to this bug.