Closed Bug 1299908 Opened 3 years ago Closed 3 years ago

While scrolling in washingtonpost.com article, the page gets filled with cascades

Categories

(Core :: Graphics, defect)

49 Branch
defect
Not set

Tracking

()

VERIFIED FIXED
mozilla51
Tracking Status
firefox49 blocking verified
firefox50 --- verified
firefox51 blocking verified

People

(Reporter: Abe_LV, Assigned: mattwoodrow)

References

()

Details

(Keywords: regression)

Attachments

(3 files)

Attached image cascades.PNG
Affected Version   : 49.0b9
Unaffected Version : 48.0b8
Affected platforms : Windows XP

[Steps to Reproduce]
1. Download and install beta 49.0b9 from https://treeherder.mozilla.org/#/jobs?repo=mozilla-beta&selectedJob=1566414
2. Go to https://www.washingtonpost.com/ and open top article
3. Scroll down and up

[Actual result]
Cascades appear in the right and left side of the page.

[Expected Result]
Page should display the actual content, not cascades.

Regression Window: I’ll follow up with a regression range shortly

--
Version 	49.0b9
Build ID 	20160831113322
Update Channel 	beta
User Agent 	Mozilla/5.0 (Windows NT 5.1; rv:49.0) Gecko/20100101 Firefox/49.0
See Also: → 1294161
(In reply to Abe - QA from comment #0)
> Affected Version   : 49.0b9
> Unaffected Version : 48.0b8

Based on our IRC discussion I assume you meant "49.0b8" here, rather than "48.0b8"
Version: Trunk → 49 Branch
[Tracking Requested - why for this release]: Seems like a regression in beta, we should figure out what caused it and if it's ok to ship (looks pretty bad from the screenshot)
(In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #1)
> (In reply to Abe - QA from comment #0)
> > Affected Version   : 49.0b9
> > Unaffected Version : 48.0b8
> 
> Based on our IRC discussion I assume you meant "49.0b8" here, rather than
> "48.0b8"

Yes, correct. 49.0b8 is working fine but not 49.0b9.
This is non-e10s issue. 
It is also reproducible in windows-10 and Ubuntu 16.04. But it is clearly visible and worse in XP.

Link to screen capture: https://testing-1.tinytake.com/sf/OTQ4Nzc3XzM5ODEwMjg

Here is the regression window:
Last good revision: 4f90fc4e60631bb9220a856a8d87e05f6ff36f40
First bad revision: 90a4db6c929deff452cf420e819b6a7325af328d
Pushlog: https://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?fromchange=4f90fc
4e60631bb9220a856a8d87e05f6ff36f40&tochange=90a4db6c929deff452cf420e819b6a7325af
328d
Flags: needinfo?(bugmail)
OS: Windows XP → All
Hardware: x86 → All
(In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #2)
> [Tracking Requested - why for this release]: Seems like a regression in
> beta, we should figure out what caused it and if it's ok to ship (looks
> pretty bad from the screenshot)

I think this should be a ship blocker. We need to fix this one way or other, even if it means backing out a whole chain of patches.
Another instance of this bug was reported at https://twitter.com/johndrinkwater/status/771319123414581251 .
Seen in Firefox Developer (2016-08-31 & 2016-09-01) and some tweets that include a video of the behaviour on the first 49b https://twitter.com/johndrinkwater/status/771321501014523905 the latest has less corruption in page, but things like the tab bar has horizontal corruption now.
Seen on twitter.com and community.letsencrypt.org
Almost certainly regression from bug 1296793 which was uplifted to beta recently as well. Matt, can you look into this/back out the regressing patch?

ni? to :lizzard as well as FYI for potential ship blocker.
Blocks: 1296793
Flags: needinfo?(matt.woodrow)
Flags: needinfo?(lhenry)
Flags: needinfo?(bugmail)
I think this may be from bug 1294161, which needs to be backed out.
Flags: needinfo?(lhenry)
(In reply to Liz Henry (:lizzard) (needinfo? me) from comment #8)
> I think this may be from bug 1294161

I think that's rather unlikely. I think kats is right in comment 7 and it was caused by bug 1296793.
I agree. We should probably back this out of the branches and I'll fix it on Nightly.
Flags: needinfo?(matt.woodrow)
I've backed out bug 1296793 from Aurora and Beta. Let's get some QA verification on the CI builds once they finish.

https://hg.mozilla.org/releases/mozilla-aurora/rev/ff1e5019f0c9
https://hg.mozilla.org/releases/mozilla-beta/rev/7579892529ee
Flags: needinfo?(andrei.vaida)
Flags: in-testsuite?
Duplicate of this bug: 1300096
I triggered new nightlies on Aurora to hopefully help with incoming dupes. Should be ready in a few hours.
(In reply to Ryan VanderMeulen [:RyanVM] from comment #13)
> I triggered new nightlies on Aurora to hopefully help with incoming dupes.
> Should be ready in a few hours.

As a local user who hit this on Aurora, I can confirm that after these retriggered nightlies Ubuntu 15.10 is no longer exhibiting the issue in my duped-over bug.
Marking this fixed from the backouts for 49 and 50.  
We should verify the fix in 49 on Monday (or earlier - from the beta 10 builds)

Matt, I'm leaving 51 as affected for now to give you a chance to fix - but if not can you please back out for 51 before it goes to aurora? I marked it as blocker for 51.  Thanks.
Flags: needinfo?(matt.woodrow)
Now, it is fixed in this build (comment 11).
We have tested it on Windows XP, Ubuntu 16.04, windows 8 x64 and Mac 10.11 and
the cascading issue is not reproducible anymore in beta and aurora.

---
Version 	49.0b10
Build ID 	20160902073149
Update Channel 	beta
User Agent 	Mozilla/5.0 (Windows NT 5.1; rv:49.0) Gecko/20100101 Firefox/49.0
Abe: Best to leave the status as blocking since that's for mozilla-central (currently 51) and we didn't land a backout for 51. In general, please don't change "blocking" or tracking flags (except to mark them tracking:? if you want relman to have a look at them)
Although we could not reproduce the initial issue from washingtonpost, we did though ran into the behavior reported reported in https://twitter.com/johndrinkwater/status/771321501014523905 on Firefox 49 beta 9, only on Ubuntu 16.04 32-bit. 
Through our extensive testing across platforms Mac OS X 10.9.5, Windows 7 x64, Windows 10 x64 and Ubuntu 16.04 x86 targeting Firefox 49 beta 10, we only encountered issues on Ubuntu and we've logged a separate bug on that (bug 1300540). I'll go ahead and mark it verified on 49 and deal with the new bug separately since I am not entirely sure that it's related to this bug.
Flags: needinfo?(andrei.vaida)
I have a patch for this, just working on a reduced reftest.
Flags: needinfo?(matt.woodrow)
Assignee: nobody → matt.woodrow
Whenever we used the 'real' AGR for a FORCE_ACTIVE item we would clobber the topLeft variable and the normal flattening path wasn't fixing it.
Attachment #8788716 - Flags: review?(mstange)
This shows the bug really clearly when loaded, but doesn't actually catch the bug in the reftest harness.

Things you need:

* A FORCE_ACTIVE layer (the 3d transform).
* Text on top of the active layer to trigger flattening (the word 'Hello').
* A scrollable background, so that flattening picks the scrolled frame as the flattening AGR (the 5k high div).
* The FORCE_ACTIVE to have a different AGR to the flattened one (position:fixed on the transform).
* Some flattened content that has the same AGR as the FORCE_ACTIVE one (the position:fixed blue square).

Normally, in this situation we'd expect the position:fixed blue square to invalidate on scroll, since it's been flattened into the layer containing the scrolled content.
The value of topLeft changes on scroll (since it's the position of the scrolled frame), and DLBI detects that the blue square has moved relative to this, and invalidates.
With this bug, we accidentally clobber topLeft to be (0,0) (the viewport frame position) when processing the 3d transform and we still have that value when processing the blue square. Since this is now constant between scroll positions we no longer trigger the invalidation, and the blue square scrolls with the content. We still invalidate the newly scrolled in area though, which draws the blue square in its correct position, as well as having it scrolled down.

The reftest doesn't work because the reftest harness only takes a snapshot of the invalidated area, which is only the newly scrolled in area (and is correct) and it doesn't notice the broken content that has been blurred down the page. I'm also worried about scrollbar hiding breaking this test on some platforms.
(In reply to Matt Woodrow (:mattwoodrow) from comment #21)
> The reftest doesn't work because the reftest harness only takes a snapshot
> of the invalidated area, which is only the newly scrolled in area (and is
> correct) and it doesn't notice the broken content that has been blurred down
> the page.

Can you use reftest-snapshot-all?

> I'm also worried about scrollbar hiding breaking this test on some
> platforms.

If you use overflow:hidden and tickle scrollTop a little, we still create an AGR for the scrolled contents. We have lots of tests that rely on that.
(In reply to Markus Stange [:mstange] from comment #22)

> Can you use reftest-snapshot-all?

That doesn't seem to help. It's possible that I diagnosed it wrong.


> If you use overflow:hidden and tickle scrollTop a little, we still create an
> AGR for the scrolled contents. We have lots of tests that rely on that.

Making the body overflow:hidden breaks the test, I haven't looked into why yet.
Attachment #8788716 - Flags: review?(mstange) → review+
Pushed by mwoodrow@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/23530934541a
Make sure we set the correct AGR to reference frame offset when flattening layers. r=mstange
https://hg.mozilla.org/mozilla-central/rev/23530934541a
Status: NEW → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla51
Do we want to consider uplifting this and bug 1296793 to Aurora now?
Flags: needinfo?(matt.woodrow)
Comment on attachment 8788716 [details] [diff] [review]
Reset the topLeft variable each time

Approval Request Comment
[Feature/regressing bug #]: Bug 1296793, assuming we uplift that as well
[User impact if declined]: Incorrect invalidation on some pages when we don't have hardware acceleration.
[Describe test coverage new/current, TreeHerder]: Manually test.
[Risks and why]: Very low risk.
[String/UUID change made/needed]: None
Flags: needinfo?(matt.woodrow)
Attachment #8788716 - Flags: approval-mozilla-aurora?
Comment on attachment 8788716 [details] [diff] [review]
Reset the topLeft variable each time

Gecko 50 is on Beta now.
Attachment #8788716 - Flags: approval-mozilla-aurora? → approval-mozilla-beta?
Comment on attachment 8788716 [details] [diff] [review]
Reset the topLeft variable each time

Fixes a recent regression, Beta50+
Attachment #8788716 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
Flags: qe-verify+
Verified as fixed on windows XP, windows 10, Ubuntu 16.04 and OS X 10.11.
There is no more cascading issue with this build.

--  
Version 	50.0b2
Build ID 	20160922172259
Update Channel 	beta
User Agent 	Mozilla/5.0 (Windows NT 5.1; rv:50.0) Gecko/20100101 Firefox/50.0
I've reproduced the initial issue only on Ubuntu 14.04 x64 using Firefox 49.0b9. 

I've tested on Windows 10 x64, Ubuntu 14.04 x64 and mac OSX 10.11 using Firefox 51.0a2 (buildID:  	20160926004000) and I didn't see the issue anymore. I'll mark this verified on 51. 

Also, I'll mark this verified on 50 based on the above comment.
Status: RESOLVED → VERIFIED
Flags: qe-verify+
You need to log in before you can comment on or make changes to this bug.