Closed Bug 912806 Opened 11 years ago Closed 11 years ago

Regression in zooming behaviour on Fennec

Categories

(Firefox for Android Graveyard :: Toolbar, defect)

26 Branch
All
Android
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED
Firefox 26

People

(Reporter: kats, Assigned: botond)

References

Details

(Keywords: regression, reproducible)

Attachments

(2 files, 9 obsolete files)

26.83 KB, patch
botond
: review+
Details | Diff | Splinter Review
1.75 KB, patch
tnikkel
: review+
Details | Diff | Splinter Review
Bug 907754 regressed the zooming behaviour in Fennec. Now when zooming in, the screen will often go blank and it will be a second or two (depending on device) while everything is repainted. The expected behaviour (which used to happen prior to bug 907754) is that upon zooming the old content is rendered at the new resolution (possibly blurry) while the new content is painted, and then the content re-renders sharply once the paint is done.

Associated with this regression is a Talos tcheck2 regression and a robopan improvement, neither of which are really expected.

Based on some investigation (see bug 907754 comment 8 onwards) I believe this happens because on Fennec we set the resolution on the root presshell, which has a scrollId of NULL_SCROLL_ID. Previously the mResolution for this layer would be the desired resolution, but as of bug 907754 it became 1.0. For layers inside the content presShell, the mResolution would always be (and still is) 1.0.

I believe this is another side-effect of not having fixed bug 732971. I am currently verifying to see if my patches (which mostly resolve that bug) fix this issue as well.

In the meantime, I have an ifdef patch to fix the regression. I have verified that this fixes the visible behavioural regression locally, and have pushed it to try to see if it fixes the talos regressions too. Try run is at https://tbpl.mozilla.org/?tree=Try&rev=661a2c4c7e9d
Attachment #799894 - Attachment description: 3-talos-7029f9e → Ifdef the code to revert Fennec behaviour
I have another patch that also fixes the visible behaviour issues but is more risky because it might leave other things unfixed. It is smaller and cleaner though. I have pushed this one to try as well: https://tbpl.mozilla.org/?tree=Try&rev=ad93e399e74d
(In reply to Timothy Nikkel (:tn) from bug 907754 comment #10)
> Not necessarily directly related I would think at first glance.
> RecordFrameMetrics is about scroll layers, resolution is different from that.

While that's true, there is a relationship here. What was happening before (on Fennec) is:

Which presshell?  | Which layer (scrollid) | Set Resolution  | mFrameMetrics.mResolution
------------------+------------------------+-----------------+--------------------------
Root              | Root (NULL_SCROLL_ID)  | (desired scale) | (desired scale)
Content           | Root (ROOT_SCROLL_ID)  | 1.0             | 1.0
Content           | Sub-layer (2..inf)     | 1.0             | 1.0

And with my change what is happening is:

Which presshell?  | Which layer (scrollid) | Set Resolution  | mFrameMetrics.mResolution
------------------+------------------------+-----------------+--------------------------
Root              | Root (NULL_SCROLL_ID)  | (desired scale) | 1.0
Content           | Root (ROOT_SCROLL_ID)  | 1.0             | 1.0
Content           | Sub-layer (2..inf)     | 1.0             | 1.0

But if we keep my change and fix bug 732971, we should end up with:

Which presshell?  | Which layer (scrollid) | Set Resolution  | mFrameMetrics.mResolution
------------------+------------------------+-----------------+--------------------------
Root              | Root (NULL_SCROLL_ID)  | 1.0             | 1.0
Content           | Root (ROOT_SCROLL_ID)  | (desired scale) | (desired scale)
Content           | Sub-layer (2..inf)     | 1.0             | 1.0
Ah, okay, I see what you are saying now.
Well. Those try runs were green but my trychooser syntax excluded talos which is what I really wanted to test for. Doh.

Try pushes for talos:
https://tbpl.mozilla.org/?tree=Try&rev=42b230b87960
https://tbpl.mozilla.org/?tree=Try&rev=f0828128d615
(In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #0)
> I believe this is another side-effect of not having fixed bug 732971. I am
> currently verifying to see if my patches (which mostly resolve that bug) fix
> this issue as well.

For the record, I think my patches on that bug have bitrotted or something because now the behaviour I see with those no longer matches what I used to see, and appears quite broken. I'll have to spend some time to update those patches before they're useful again.
tracking-fennec: --- → ?
My work on bug 898478 is revealing that we sometimes want FrameMetrics::mResolution to be cumulative (i.e., include the resolutions of parent layers), and sometimes not. (Before the changes in bug 907754, it was something in-between, which covered up the distinction in common cases.)

Kats and I discussed this and we propose adding a new field to FrameMetrics called mCumulativeResolution, which would store the cumulative resolution, while mResolution would always be non-cumulative (like now). Use sites of the current mResolution should be examined to see whether the cumulative or non-cumulative resolution is desired, and ajusted accordingly.

Unlike the calculation of mResolution, the calculation of mCumulativeResolution would not have to treat ROOT_SCROLL_ID specially. So, depending on whether or not the use sites relevant to this regression care about the cumulative or non-cumulative resolution, our proposed change might solve this regression.
Attachment #800446 - Flags: feedback?(bugmail.mozilla) → feedback+
(In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #4)
> Well. Those try runs were green but my trychooser syntax excluded talos
> which is what I really wanted to test for. Doh.
> 
> Try pushes for talos:
> https://tbpl.mozilla.org/?tree=Try&rev=42b230b87960
> https://tbpl.mozilla.org/?tree=Try&rev=f0828128d615

Interestingly, these try runs show rather large improvements for both tcheck2 and rp on Android 2.2, and tcheck2 on Android 4.0. The rp measures on Android 4.0 are still regressed, though (perhaps even more than they already are). That's pretty strange, considering the ifdef patch should be identical to reverting the original patch.
Comment on attachment 800446 [details] [diff] [review]
[Cumulative resolution] Part 1 - Introduce FrameMetrics::mCumulativeResolution and calculate it in RecordFrameMetrics()

Review of attachment 800446 [details] [diff] [review]:
-----------------------------------------------------------------

::: gfx/layers/FrameMetrics.h
@@ +220,5 @@
>    // we walk the layer tree, the resulting transform isn't stored here. Thus the
>    // resolution of parent layers is opaque to this metric.
>    LayoutDeviceToLayerScale mResolution;
>  
> +  // The cumulative resolution, along both axes, that the current frame has

The ScaleFactor classes just have one axis, btw. Presumably you copied this from the mResolution comment which should be updated as well.
Updated patch introducing FrameMetrics::mCumulativeResolution to address kats' comment #9, and to add some missing bits like adjusting serialization and operator== to account for the new field.
Attachment #800446 - Attachment is obsolete: true
Modified patch to make FrameMetrics::mResolution and FrameMetrics::mCumulativeResolution appropriate strongly-typed scale factors. To this end, a new coordinate type LocalLayerPixel was introduced which is like LayerPixel but without the resolutions of parent layers (there is also a LocalScreenPixel which has a similar relationship to ScreenPixel).

The patch also updates the use sites of mResolution to be mCumulativeResolution when appropriate. (I originally intended to do that in a separate patch, but since mResolution is now a different type than it was before, there is no sense in that as the intermediate code wouldn't compile).

Kats, in addition to looking at the places I replaced mResolution with mCumulativeResolution (which can be seen in the patch), it would be reassuring if you could also look at the places where I kept it as mResolution, notably:

http://mxr.mozilla.org/mozilla-central/source/gfx/layers/client/ClientTiledThebesLayer.cpp#89
http://mxr.mozilla.org/mozilla-central/source/gfx/layers/composite/ThebesLayerComposite.cpp#187
http://mxr.mozilla.org/mozilla-central/source/gfx/layers/composite/ThebesLayerComposite.cpp#221
Attachment #800805 - Attachment is obsolete: true
Attachment #800974 - Flags: review?(bugmail.mozilla)
Attachment #800974 - Attachment description: Store both cumulative and non-cumulative resolutions in FrameMetrics and use whichever is appropriatebug912806.patch → Store both cumulative and non-cumulative resolutions in FrameMetrics and use whichever is appropriate
Comment on attachment 800974 [details] [diff] [review]
Store both cumulative and non-cumulative resolutions in FrameMetrics and use whichever is appropriate

Review of attachment 800974 [details] [diff] [review]:
-----------------------------------------------------------------

I was thinking about this some more just now and I realized that the LocalLayer terminology is very confusing to understand. I came up some better type names that I think will make the code more readable:

struct ParentLayerPixel {};

typedef gfx::ScaleFactor<LayoutDevicePixel, ParentLayerPixel> LayoutDeviceToParentLayerScale;
typedef gfx::ScaleFactor<ParentLayerPixel, LayerPixel> ParentLayerToLayerScale;

typedef gfx::ScaleFactor<ParentLayerPixel, ScreenPixel> ParentLayerToScreenScale;

ParentLayerToLayerScale mResolution;
LayoutDeviceToLayerScale mCumulativeResolution;

LayoutDeviceToParentLayerScale GetParentResolution() {
    return mCumulativeResolution / mResolution;
}

ViewTransform::mScale should end up a ParentLayerToScreenScale, which makes sense to me (it's the scale that needs to be applied on top of whatever the parent layer has done, in order to get to the desired screen scale). In the terms of my blog post, in a FrameMetrics for layer L, LayerPixel would be the layer pixels for layer L and ParentLayerPixel would be the pixels for layer M (the parent layer) and the scales are set up so that LayoutDeviceToParentLayerScale * ParentLayerToLayerScale == LayoutDeviceToLayerScale.

I also looked through the rest of the patch and have some additional comments below.

::: gfx/layers/composite/AsyncCompositionManager.cpp
@@ +578,5 @@
>      geckoScroll = scrollOffsetLayerPixels;
>    }
>  
>    LayerPoint translation = (userScroll / zoomAdjust) - geckoScroll;
> +  treeTransform = gfx3DMatrix(ViewTransform(-translation, LayoutDeviceToLocalScreenScale(metrics.mResolution.scale)));

This bit is wrong. Note that the call to syncViewportInfo can (and will) modify userZoom on Fennec, so the value of userZoom down here is not the same as it was up where it was assigned.

::: gfx/layers/ipc/AsyncPanZoomController.cpp
@@ +1113,5 @@
> +
> +  // The scale from local screen -> screen coordinates is always the same as
> +  // the scale from local layer -> layer coordinates.
> +  LocalScreenToScreenScale parentResolution = LocalScreenToScreenScale(mFrameMetrics.GetParentResolution().scale);
> +  auto localZoom = mFrameMetrics.mZoom / parentResolution;

I would avoid using "auto". Although it's easy to write, it makes the code harder to read, particularly for long-lived variables (which isn't the case here, but I think it still applies).

With the new types I suggested this code block would basically pass "mFrameMetrics.mZoom / mLastContentPaintMetrics.mDevPixelsPerCSSPixel / mFrameMetrics.GetParentResolution()" to the ViewTransform, which should be in the expected ParentLayerToScreenScale type.

@@ +1351,5 @@
>  }
>  
>  void AsyncPanZoomController::SetZoomAndResolution(const CSSToScreenScale& aZoom) {
>    mMonitor.AssertCurrentThreadIn();
> +  auto parentResolution = mFrameMetrics.GetParentResolution();

Ditto. With the new types I suggested this would be ParentLayerToLayerScale which is pretty straightforward I think.

@@ +1358,5 @@
>    // what's currently visible on the screen. This is effectively turning
>    // the async zoom amount into the gecko zoom amount.
> +  mFrameMetrics.mCumulativeResolution = aZoom / mFrameMetrics.mDevPixelsPerCSSPixel * ScreenToLayerScale(1);
> +  // The parent resolution will not have changed.
> +  mFrameMetrics.mResolution = mFrameMetrics.mCumulativeResolution / parentResolution;

I don't know if this is still true but on older mobile hardware divisions used to be a lot more expensive than multiplications. (It's also cognitively easier to process a multiplication for me). In a previous life I would have written this to calculate mResolution first and then multiply in parentResolution to get mCumulativeResolution. Take that for what you will :)
Attachment #800974 - Flags: review?(bugmail.mozilla) → review-
(In reply to Botond Ballo [:botond] from comment #11)
> Kats, in addition to looking at the places I replaced mResolution with
> mCumulativeResolution (which can be seen in the patch), it would be
> reassuring if you could also look at the places where I kept it as
> mResolution, notably:
> 
> http://mxr.mozilla.org/mozilla-central/source/gfx/layers/client/
> ClientTiledThebesLayer.cpp#89
> http://mxr.mozilla.org/mozilla-central/source/gfx/layers/composite/
> ThebesLayerComposite.cpp#187
> http://mxr.mozilla.org/mozilla-central/source/gfx/layers/composite/
> ThebesLayerComposite.cpp#221

I believe these are correct - all of them iterate through FrameMetrics in an ancestor chain and multiply together the mResolution values so it's effectively recomputing mCumulativeResolution. We could in theory replace the loops with mCumulativeResolution but I don't want to make that change in this bug, so leave those bits as-is is good.
Assignee: bugmail.mozilla → botond
Updated patch to address kats' review comments.

I can't actually test this right now (although I did check that it compiles), so I'm posting it for f+ for now.
Attachment #800974 - Attachment is obsolete: true
Attachment #801854 - Flags: feedback?(bugmail.mozilla)
Updated patch to remove some leftover bits from the older version.
Attachment #801854 - Attachment is obsolete: true
Attachment #801854 - Flags: feedback?(bugmail.mozilla)
Attachment #801859 - Flags: feedback?(bugmail.mozilla)
Blocks: 898478
Attachment #801859 - Flags: feedback?(bugmail.mozilla) → feedback+
Comment on attachment 801859 [details] [diff] [review]
Store both cumulative and non-cumulative resolutions in FrameMetrics and use whichever is appropriate

Review of attachment 801859 [details] [diff] [review]:
-----------------------------------------------------------------

There's a call to utils->SetResolution in TabChild::ProcessUpdateFrame that needs to be updated as well. Currently the argument being passed in is aFrameMetrics.mZoom / aFrameMetrics.mDevPixelsPerCSSPixel * ScreenToLayerScale(1) but it should be updated to aFrameMetrics.mZoom / aFrameMetrics.mDevPixelsPerCSSPixel / aFrameMetrics.GetParentResolution().
Updated patch to address comment #18, and tested on B2G - looks good. Marking for review.
Attachment #801859 - Attachment is obsolete: true
Attachment #802385 - Flags: review?(bugmail.mozilla)
FYI I pushed this patch to try: https://tbpl.mozilla.org/?tree=Try&rev=e25c7532ed09 since I'm a little concerned about unexpected side-effects on Fennec.
Comment on attachment 802385 [details] [diff] [review]
Store both cumulative and non-cumulative resolutions in FrameMetrics and use whichever is appropriate

Review of attachment 802385 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good to me, but I would hold off on landing it until the try run comes back green. I will also download the Fennec build from the try run and check it to see if it fixes the original visual regression this bug was filed for (hopefully it fixes the Talos regressions too but I'm less concerned about that).

::: gfx/layers/FrameMetrics.h
@@ +21,5 @@
> +struct ParentLayerPixel {};
> +
> +typedef gfx::ScaleFactor<LayoutDevicePixel, ParentLayerPixel> LayoutDeviceToParentLayerScale;
> +typedef gfx::ScaleFactor<ParentLayerPixel, LayerPixel> ParentLayerToLayerScale;
> +

nit: remove blank line
Attachment #802385 - Flags: review?(bugmail.mozilla) → review+
Updated patch to fix nit. Carrying r+.
Attachment #802385 - Attachment is obsolete: true
Attachment #802426 - Flags: review+
The try build I pushed at comment 20 never finished the android builds due to all the problems with try yesterday. I had another try push at https://tbpl.mozilla.org/?tree=Try&rev=058cede885cd which included this patch along with a couple of others of mine, and when I test that on Fennec the behaviour is even worse than it was before. I suspect this patch does in fact regress fennec further. I am building this locally to see if I can debug it.
Attached patch Additional Fennec patch (obsolete) — Splinter Review
Applying this on top of botond's patch makes Fennec behave well (and fixes the original visual regression this bug was filed for). Try run with both patches at https://tbpl.mozilla.org/?tree=Try&rev=ec2ddcd801f4
Attachment #799894 - Attachment is obsolete: true
Attachment #799895 - Attachment is obsolete: true
Attachment #803097 - Flags: review?(tnikkel)
Comment on attachment 803097 [details] [diff] [review]
Additional Fennec patch

Hmm, don't we want to do this only for the root content document? And maybe even the root scroll frame only?
This works fine too but I don't really know if it's any safer in terms of other side effects.
Attachment #803097 - Attachment is obsolete: true
Attachment #803097 - Flags: review?(tnikkel)
Attachment #803285 - Flags: review?(tnikkel)
Attachment #803285 - Flags: review?(tnikkel) → review+
The try run from comment 24 shows a large regression in tcheck2. An earlier try run at https://tbpl.mozilla.org/?tree=Try&rev=87c967b3fd90 also has a really big number for tcheck2. However I think it's more important to land this to fix the visual regression, and then go about fixing tcheck2 afterwards. I suspect that the new numbers might just be the new normal because tcheck2 uses some of these values to compute its score. (Would have to double-check that).
https://hg.mozilla.org/mozilla-central/rev/1002b028b6c4
https://hg.mozilla.org/mozilla-central/rev/ce483735e803
Status: NEW → RESOLVED
Closed: 11 years ago
Flags: in-testsuite+
Resolution: --- → FIXED
Target Milestone: --- → Firefox 26
See Also: → 915387
tracking-fennec: ? → ---
Product: Firefox for Android → Firefox for Android Graveyard
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: