Closed Bug 1134459 Opened 9 years ago Closed 9 years ago

Intermittent layout/reftests/scrolling/image-1.html | image comparison (==), max difference: 255, number of differing pixels: 93280 with vsync refresh driver on b2g emulator

Categories

(Core :: Layout, defect)

ARM
Gonk (Firefox OS)
defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla39
Tracking Status
firefox37 --- wontfix
firefox38 --- wontfix
firefox39 --- fixed
b2g-v2.2 --- fixed
b2g-master --- fixed

People

(Reporter: mchang, Assigned: mchang)

References

Details

Attachments

(4 files, 1 obsolete file)

+++ This bug was initially created as a clone of Bug #1133786 +++

https://bugzilla.mozilla.org/show_bug.cgi?id=1133786#c26

This intermittent occurs with the vsync refresh driver.

From https://treeherder.mozilla.org/#/jobs?repo=try&revision=ce095320d96e - Also is unrelated to tiling / black boxes and the other reftest failures with tiling.

Find and fix the issue.
Attached file reftest log
Hmm, when we fail, the reftest-content.js logs a warning that:

INFO -  REFTEST INFO | [CONTENT] MozInvalidateEvent didn't invalidate

Right before the failure. This warning doesn't occur when the tests pass.
Still digging, but something odd is occurring. During Element::SetScrollTop, we sometimes don't actually set the scrolltop to 1, which is what this test requires. Instead we set it to 0.

I've tracked it down to the scrolled overflow rect being a different size [1] during the failing test. In good test runs, the scrollable frame's overflow rect should be:

GetScrolledRect overflow (0, 0) (w: 47040, h: 48300), size: (w: 47040, h: 18000)

In bad runs, it is:

GetScrolledRect overflow (0, 0) (w: 47040, h: 18000), size: (w: 47040, h: 18000)

Because the height is different, when we calculate the scroll range[3], we don't scroll at all. What's interesting is that above[1], there is a time when the rect is inaccurate which may be occurring here.

[1] https://dxr.mozilla.org/mozilla-central/source/layout/generic/nsIFrame.h?from=GetScrollableOverflowRect&case=true#
[2] https://dxr.mozilla.org/mozilla-central/source/layout/generic/nsGfxScrollFrame.cpp#3250
Two interesting things here. 

1) The <block> after the <HTMLScroll> has a scroll and visual overflow area:

Block(div)(1)@44aa9330 {0,0,47040,18000} vis-overflow=0,0,47040,48300 scr-overflow=0,0,47040,48300 [state=0000108008d00200] [content=44a1a420] [sc=44aa9148:-moz-scrolled-content]<

2) The image png size is correct I think:

  ImageFrame(img)(3)@44aa98f0 {0,300,18000,48000} [state=0000008000200000] [content=4460cc00] [sc=44aa9540^44aa4db0^44aa3628^44aa3300] [src=]

The whole browser size should be 800X1000 pixels, which is 48000 x 60000 app unit pixels.
This version is quite odd in two regards:

1) The image is really small: 1440X1440 app units.

 ImageFrame(img)(3)@44a9f8f0 {0,300,1440,1440} [state=0000008000200000] [content=446088e0] [sc=44a9f540^44a9adb0^44a99628^44a99300] [src=]


2) The block div is not detected as a scrollable overflow area.

Block(div)(1)@44a9f330 {0,0,47040,18000} [state=0000108008d00200] [content=44a43600] [sc=44a9f148:-moz-scrolled-content]<
The summary version of the bug is that we set element.scrollTop and then re-read the scrollTop back in this test. The test just has 2 divs, 1 with a long 5px yellow bar and another with a png. On some iterations of the test, the image in the frame tree is only 1440 app pixels and is determined to not overflow into the block div above it from comment 3. This causes the element.scrollTop to fail since there is no scrollable overflow region, so the scroll goes to (0, 0) at [1]. Since the scroll fails, the test fails. 

Interestingly, the frame tree error occurs regardless of the number of times the refresh driver ticks, but there is usually the same refresh driver tick order in both the passing and failing case. I don't think this has to do anything with the actual refresh driver tick sequence versus when the refresh driver occurs causes something else in layout to shift.

@Matt - Any thoughts on what else I could check to see why this image frame has shrunk dimensions? Do we decode images on another thread?

Thanks!

[1] https://dxr.mozilla.org/mozilla-central/source/layout/generic/nsGfxScrollFrame.cpp#2079
Flags: needinfo?(matt.woodrow)
I think we do now, Seth would know more.
Flags: needinfo?(matt.woodrow) → needinfo?(seth)
See Also: → 1019840
(In reply to Mason Chang [:mchang] from comment #5)
> @Matt - Any thoughts on what else I could check to see why this image frame
> has shrunk dimensions? Do we decode images on another thread?

Yes, we always decode images on another thread.

Neither the page load event nor the image's load event should fire until the image's size is available. I glanced at the test in question, and it seems like it does not wait for either of those events. (Unless I missed it somewhere.) That may well be the entire problem here.
Flags: needinfo?(seth)
After talking with Seth, it looks like this is a problem with the test. It should wait until the page load event fires before starting the script.
Change the reftest scrolling infrastructure to not start a scroll until the onload event fires.

Tested locally and it seems to be working, but the timing was always very finicky so we might just have gotten lucky. Will test on try before pushing as well.
Attachment #8568125 - Flags: review?(seth)
Ensuring that we wait for a page load until scrolling seems to have done the trick:

https://treeherder.mozilla.org/#/jobs?repo=try&revision=ceb3fe8e8261
Comment on attachment 8568125 [details] [diff] [review]
Wait for pageload event before starting scrolls in reftests

Review of attachment 8568125 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good!

Sorry for the slow review on this one.

::: layout/reftests/scrolling/scrolling.js
@@ +18,5 @@
>    }
>  }
>  
> +// bug 1134459, images are decoded off main thread
> +// so it's not safe to scroll until the page actually loads

Nit: this was true even before images were decoded off-main-thread, because they were always decoded asynchronously. This test just worked due to fortunate timing in the past.

We're not really waiting for images to decode here, just for the load event. (It's quite likely that they *haven't* been decoded when the load event fires.) I'd say the right comment here is something closer to "Wait for the load event so we know all images have loaded."
Attachment #8568125 - Flags: review?(seth) → review+
Carrying r+. Updated with feedback from comment 12.
Attachment #8568125 - Attachment is obsolete: true
Attachment #8571406 - Flags: review+
https://hg.mozilla.org/mozilla-central/rev/5456f25a2ee1
Status: ASSIGNED → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla39
Comment on attachment 8571406 [details] [diff] [review]
Wait for pageload event before starting scrolls in reftests

NOTE: Please see https://wiki.mozilla.org/Release_Management/B2G_Landing to better understand the B2G approval process and landings.

[Approval Request Comment]
Bug caused by (feature/regressing bug #): Silk refresh driver, bug 1123762. It was causing an intermittent that was actually a problem with the test. The test was reading a value too early.
User impact if declined: We won't be able to enable the vsync refresh driver from Silk, the last part of Silk.
Testing completed: Manual testing, try
Risk to taking this patch (and alternatives if risky): Low - This fixes a problem in the reftest that was invalid.
String or UUID changes made by this patch:
Attachment #8571406 - Flags: approval-mozilla-b2g37?
Forgot the last part of comment 16:

String or UUID changes made by this patch: None
Comment on attachment 8571406 [details] [diff] [review]
Wait for pageload event before starting scrolls in reftests

low risk fix helping with intermittent test failure, approving to land.
Attachment #8571406 - Flags: approval-mozilla-b2g37? → approval-mozilla-b2g37+
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: