Closed Bug 1039818 Opened 6 years ago Closed 5 years ago

[B2G][Browser] Disney layout does not fit the screen

Categories

(Core :: Layout, defect)

ARM
Gonk (Firefox OS)
defect
Not set

Tracking

()

RESOLVED FIXED
mozilla40
Tracking Status
firefox40 --- fixed
b2g-v1.3 --- affected
b2g-v1.4 --- affected
b2g-v2.0 --- affected
b2g-v2.1 --- affected
b2g-v2.2 --- affected
b2g-master --- affected

People

(Reporter: psiphantong, Assigned: botond)

References

Details

(Whiteboard: [273MB-Flame-Support] [2.0-exploratory])

Attachments

(5 files, 1 obsolete file)

Attached file di.txt
Description:
The Disney layout does not fit the screen 

Setup Steps:
1) Flame device is set to 273mb

Repro Steps:
1) Update a Flame device to BuildID: 20140714000202
2) Go to Browser
3) Go to disney.com


Actual:
layout does not fit the screen 

Expected:
layout fits the screen 

Environmental Variables:
Device: Flame 2.0
BuildID: 20140716000201
Gaia: 5f8b1b8a2da9e3b531eee817a669f57fa4d9b9c6
Gecko: 913827496f65
Version: 32.0a2 (2.0) 
Firmware Version: v122
User Agent: Mozilla/5.0 (Mobile; rv:32.0) Gecko/32.0 Firefox/32.0


Repro frequency: 100%
See attached: screenshot, logcat,
Attached image 2014-07-16-15-33-30.png
This issue also reproduces on the Flame 2.1 (273mb), Buri 2.1, Flame 2.0(512mb), Buri 2.0, Flame 1.4(273mb), Buri 1.4, and the Buri 1.3. layout does not fit the screen 


Flame 2.1 (273mb)

Environmental Variables:
Device: Flame Master
Build ID: 20140716040207
Gaia: d29773d2a011825fd77d1c0915a96eb0911417b6
Gecko: 691ffea49efb
Version: 33.0a1 (Master)
Firmware Version: v122
User Agent: Mozilla/5.0 (Mobile; rv:33.0) Gecko/33.0 Firefox/33.0

Buri 2.1

Environmental Variables:
Device: Buri Master
Build ID: 20140716040207
Gaia: d29773d2a011825fd77d1c0915a96eb0911417b6
Gecko: 691ffea49efb
Version: 33.0a1 (Master)
Firmware Version: v1.2device.cfg
User Agent: Mozilla/5.0 (Mobile; rv:33.0) Gecko/33.0 Firefox/33.0

Flame 2.0(512mb)

Environmental Variables:
Device: Flame 2.0
BuildID: 20140716000201
Gaia: 5f8b1b8a2da9e3b531eee817a669f57fa4d9b9c6
Gecko: 913827496f65
Version: 32.0a2 (2.0) 
Firmware Version: v122
User Agent: Mozilla/5.0 (Mobile; rv:32.0) Gecko/32.0 Firefox/32.0

Buri 2.0

Environmental Variables:
Device: Buri 2.0
Build ID: 20140716000201
Gaia: 5f8b1b8a2da9e3b531eee817a669f57fa4d9b9c6
Gecko: 913827496f65
Version: 32.0a2 (2.0)
Firmware Version: v1.2device.cfg
User Agent: Mozilla/5.0 (Mobile; rv:32.0) Gecko/32.0 Firefox/32.0

Flame 1.4(273MB)

Environmental Variables:
Device: Flame 1.4
Build ID: 20140716000202
Gaia: 393d72937727ad20e82b2ff7b13e3d7ff077a9f0
Gecko: 932c37978d37
Version: 30.0 (1.4)
Firmware Version: v122
User Agent: Mozilla/5.0 (Mobile; rv:30.0) Gecko/30.0 Firefox/30.0

Buri 1.4

v1.4 Environmental Variables:
Device: Buri v1.4 MOZ ril
BuildID: 20140716000202
Gaia: 393d72937727ad20e82b2ff7b13e3d7ff077a9f0
Gecko: 932c37978d37
Version: 30.0
Firmware Version: v1.2-device.cfg
User Agent: Mozilla/5.0 (Mobile; rv:30.0) Gecko/30.0 Firefox/30.0

Buri 1.3

Environmental Variables:
Device: Buri 1.3
Build ID: 20140716024003
Gaia: 23f55be856cef53c6604a6fe4aeb09061afbc897
Gecko: b9087513a198
Version: 28.0 (1.3)
Firmware Version: v1.2device.cfg
User Agent: Mozilla/5.0 (Mobile; rv:28.0) Gecko/28.0 Firefox/28.0
QA Whiteboard: [QAnalyst-Triage?]
Flags: needinfo?(ktucker)
Component: Gaia::Browser → Layout
Product: Firefox OS → Core
QA Whiteboard: [QAnalyst-Triage?] → [QAnalyst-Triage+]
Flags: needinfo?(ktucker)
Doesn't reproduce in Simulator (2.0 or 1.4), or in Firefox for Android. (The site layout fits the screen there.)

I can confirm on Flame (w/ updated version of the stock Firefox OS, which is 1.3-based).

Also: this seems to be just an issue with (re)-painting. In particular:
 (1) At least 3 times when I turned the screen off (or let it switch off automatically), the problem was fixed when I turned the screen on again. (However, this didn't reliably work, and switching tabs / apps does not fix it, so I don't know how much to pay attention to this).

 (2) For the purposes of event-handling, the layout seems to be correct. At least, I can swipe left & right on the full top ~third of the screen (including the white area), and I can see the swipe being handled in the miniature header. (which has a sliding banner that changes if you swipe sideways) And if I tap the extreme-top-right of the white area (where the miniature search icon should be), it activates a miniature search box.

(Vertical scroll events only seem to be recognized in the upper-left corner of the page, though.)
QA Whiteboard: [QAnalyst-Triage+] → [QAnalyst-Triage+][lead-review+]
This is still an issue, FWIW -- I just triggered this again, using a Flame w/ a B2G 2.1 version (from a few weeks ago), on my first & third attempt. (My second attempt didn't hit the bug; it's not 100% reproducible.)

digitarald reported the same issue happening at http://www.rtve.es/infantil/ , too, and bug 1059483 seems to be the same as well.  We should probably take a crack at this, since it's affecting multiple sites and we don't really know what's going on.

botond / kats: this seems to be a coordinate-system-scale / viewport-sizing issue, so potentially up your alley -- perhaps one of you could take a look at this?
Flags: needinfo?(botond)
We have an active relation with RTVE for Firefox OS and Firefox Marketplace. They reported the bug and are looking for a solution on their side. As this affects 2.0+ which can't be patched, any workarounds that they could apply would help!
Do we know whether this affects 2.2 or 3.0?
digiterald reported (over IRC, a few days ago) that this affects 2.2. Not sure about 3.0.
I can repro on 2.2 but not on 3.0. Can we get a fix window?
Flags: needinfo?(bugmail.mozilla)
My bad, I do see it in 3.0 as well. Just not as consistently.
Attached file APZ/TabChild logging
Here's APZ/TabChild logging for this issue. The relevant APZ is aaa85800 and timestamp 08:42:40.904 is where the APZ picks up the bad zoom value from layout. Not yet sure what the root cause here is though.
I think it's just a bug in the code at [1]. The APZ has a metrics with:
[cb=(x=0.000000, y=0.000000, w=480.000000, h=743.000000)]
v=(x=0.000000, y=0.000000, w=320.000000, h=495.333344)]
[z=(ld=1.500 r=1.000 cr=1 z=1.5 er=1)]

And then it gets a NotifyLayersUpdated call with:
[cb=(x=0.000000, y=0.000000, w=480.000000, h=809.000000)]
[v=(x=0.000000, y=0.000000, w=320.000000, h=539.333313)]
[z=(ld=1.500 r=0.327 cr=0.327 z=0.49 er=1)]

In this call to NotifyLayersUpdated, the "viewportUpdated" bool remains false, the code at [2] is run, and the APZ keeps the z=1.5 that it had before. Then, the APZ gets another NotifyLayersUpdated call with the exact same metrics again. This time, however, the viewportUpdated bool becomes true, because the composition bounds were updated in the previous call. Then we go through [3] and the APZ updates to z=0.49.

So I'm pretty sure this code is wrong, but I'm not sure what it's supposed to be.

[1] http://mxr.mozilla.org/mozilla-central/source/gfx/layers/apz/src/AsyncPanZoomController.cpp?rev=5c982f0795de#2789
[2] http://mxr.mozilla.org/mozilla-central/source/gfx/layers/apz/src/AsyncPanZoomController.cpp?rev=5c982f0795de#2850
[3] http://mxr.mozilla.org/mozilla-central/source/gfx/layers/apz/src/AsyncPanZoomController.cpp?rev=5c982f0795de#2855
I guess the other question is why the layout-side resolution is 0.327 in the NLU call I mentioned in comment 11. Doesn't see to be any good reason for that.
(In reply to Harald Kirschner :digitarald from comment #5)
> We have an active relation with RTVE for Firefox OS and Firefox Marketplace.
> They reported the bug and are looking for a solution on their side. As this
> affects 2.0+ which can't be patched, any workarounds that they could apply
> would help!

If the RTVE issue is the same as the disney one, then the only workaround I can think of is to move the meta-viewport tag to as early in the document as possible. It should make the race condition less likely to trigger.
(In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #12)
> I guess the other question is why the layout-side resolution is 0.327 in the
> NLU call I mentioned in comment 11. Doesn't see to be any good reason for
> that.

0.327 = 320 / 980, so I guess TabChild is performing a computation based on kDefaultViewportSize?
Flags: needinfo?(botond)
Hmm, the layout zoom of 0.327 does not appear to be coming from TabChild.
(In reply to Botond Ballo [:botond] from comment #15)
> Hmm, the layout zoom of 0.327 does not appear to be coming from TabChild.

Correction, it does. I was looking in the wrong place.

The first time this zoom is introduced into the system is when TabChild::HandlePossibleViewportChange calls APZCCallbackHelper::UpdateRootFrame, which then sets it as the pres shell resolution.

In this HPVC call, the viewport width is calculated as 980, so I guess the meta-viewport tag hasn't been processed yet. The question is, when it gets processed, why isn't the zoom updated correctly?
OK, here's what seems to be happening:

  - Initially, TabChild calculates and sets a pres shell resolution of 0.327 
    based on a default viewport width of 980. That percolates to APZ via NLU.

  - APZ dispatches a repaint request with the 0.327 resolution.

  - Content is busy processing the page content (including the meta-vieport
    tag), so this repaint request isn't processed for a while.

  - TabChild picks up the meta-viewport tag and sets the resolution to 1.
    This also makes it to APZ via NLU, overwriting APZ's previously set 0.327.

  - TabChild finally processes the repaint request with the 0.327 resolution,
    and sets in the pres shell.

  - The next NLU brings to APZ the freshly set 0.327 resolution, overwriting
    APZ's previously set 1.

  - Nothing else changes the resolution, so the 0.327 sticks.

So basically, we have an instance of an old value incorrectly clobbering a new one. I wonder if we should use sequence numbers to prevent this sort of thing, the way we do for scroll offset updates.
(In reply to Botond Ballo [:botond] from comment #17)
> So basically, we have an instance of an old value incorrectly clobbering a
> new one. I wonder if we should use sequence numbers to prevent this sort of
> thing, the way we do for scroll offset updates.

I think we can do something simpler (and more uplift-friendly):

It seems to me that the bad player in the above sequence of events is APZCCallbackHelper::UpdateRootFrame setting the stale 0.327 resolution, when the resolution stored in layout is 1.

The purpose of the SetResolutionAndScaleTo call in UpdateRootFrame is to apply an *async zoom* to the resolution, not to overwrite the base resolution with a different value. I think that, insofar as the base resolution (metrics.GetPresShellResolution()) in the repaint request differs from the one stored in Layout, we should discard the repaint request as old.
Attached file MozReview Request: bz://1039818/botond (obsolete) —
/r/6671 - Bug 1039818 - Do not allow an older APZ repaint request to clobber a newer pres shell resolution in Layout. r=kats

Pull down this commit:

hg pull -r f66be5a62fb5ec0f537ce3b15eecc16de5e8dc9f https://reviewboard-hg.mozilla.org/gecko/
Attachment #8588786 - Flags: review?(bugmail.mozilla)
Assignee: nobody → botond
This patch is simple enough that it can be uplifted (with trivial changes) as far back as we're prepared to take it. I'll prepare uplift patches once we've agreed on the approach.
(In reply to Botond Ballo [:botond] from comment #20)
> https://treeherder.mozilla.org/#/jobs?repo=try&revision=74dc3dc919e4

04-06 23:45:22.066 F/MOZ_Assert( 810): Assertion failure: nsContentUtils::IsCallerChrome(), at ../../../gecko/dom/base/nsDOMWindowUtils.cpp:537

Now that is a surprise...

Does this mean that in the situation exercised by the failing test, the corresponding SetResolutionAndScaleTo call [1] has been erroring out [2], and we've been ignoring this error? :O

[1] http://mxr.mozilla.org/mozilla-central/source/gfx/layers/apz/util/APZCCallbackHelper.cpp?rev=600015044b60#212
[2] http://mxr.mozilla.org/mozilla-central/source/dom/base/nsDOMWindowUtils.cpp?rev=e60e056a230c#517
(In reply to Botond Ballo [:botond] from comment #22)
> Does this mean that in the situation exercised by the failing test, the
> corresponding SetResolutionAndScaleTo call [1] has been erroring out [2],
> and we've been ignoring this error? :O

Huh, it would appear so. And we likely hit this problem in production too. We should fix this.
(Also, I'm not convinced the approach you're taking in your patch is safe - I need to think about it more)
Ok, so I thought about it some and haven't come up with any cases where this breaks, but I still think this carries non-trivial amount of risk. We can land it on master and let it bake but I don't want to uplift it unless we have an actual 2.2 blocker.
Attachment #8588786 - Flags: review?(bugmail.mozilla) → review+
Comment on attachment 8588786 [details]
MozReview Request: bz://1039818/botond

https://reviewboard.mozilla.org/r/6669/#review5579

r+ but we should fix the IsCallerChrome issue in a separate bug before landing (otherwise this will get backed out right away).
Depends on: 1152479
Based on bug 1152479 comment 9, I'm going to patch up GetResolution only here, and deal with other nsIDOMWindowUtils functions in that bug; no need to block this bug on that one - with regards to those other functions, things will be precisely as broken as they have been before.
No longer depends on: 1152479
This is basically the original fix, plus some plumbing to get the nsIPresShell to UpdateRootFrame.

Kats: UpdateRootFrame taking both an nsIDOMWindowUtils and an nsIPresShell is temporary. I will remove the nsIDOMWindowUtils in bug 1152479, when I make the same fix to SetResolutionAndScaleTo, SetScrollPositionClampingScrollPortSize, and SetDisplayPortMarginsForElement, that I'm doing to GetResolution here; I just wanted to make the minimum necessary changes in this bug.
Attachment #8588786 - Attachment is obsolete: true
Attachment #8591153 - Flags: review?(bugmail.mozilla)
Attachment #8591153 - Flags: feedback?(ehsan)
Attachment #8591153 - Flags: review?(bugmail.mozilla) → review+
Attachment #8591150 - Flags: review?(ehsan) → review+
Attachment #8591153 - Flags: feedback?(ehsan) → feedback+
https://hg.mozilla.org/mozilla-central/rev/dfdb7109538d
https://hg.mozilla.org/mozilla-central/rev/6816cc58e19b
Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla40
As per comment 25, not requesting uplift unless this bug becomes marked 2.2+.
RTVE fixed it by moving their viewport meta tag up.
You need to log in before you can comment on or make changes to this bug.