web extension popup does not repaint

VERIFIED FIXED in Firefox -esr60

Status

defect
P1
normal
VERIFIED FIXED
2 years ago
10 months ago

People

(Reporter: bed-boy66, Assigned: rhunt)

Tracking

({regression})

59 Branch
mozilla62
All
Windows
Dependency tree / graph

Firefox Tracking Flags

(firefox-esr52 unaffected, firefox-esr60 verified, firefox59 wontfix, firefox60 wontfix, firefox61 verified, firefox62 verified)

Details

Attachments

(9 attachments)

Reporter

Description

2 years ago
User Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/62.0.3202.94 Safari/537.36

Steps to reproduce:

OS: Windows 10
Firefox Nightly: 59.0a1 (2017-11-26) (64-bit)
WebExtension: in attachment

1 [review]. Add Extension from attachment
2 [review]. Open https://develope8.planfix.ru/?action=task&id=10316916 (login: qwe, password: qwe)
3. Open extension
4. Click at last row "Админ. работа"
5. Scroll content


Actual results:

popup does not repaint


Expected results:

popup should properly repaint
Reporter

Comment 1

2 years ago
Posted image ff-problem.gif
Reporter

Comment 2

2 years ago
Comment on attachment 8932055 [details]
ff-problem.gif

problem
Reporter

Comment 3

2 years ago
Comment on attachment 8932055 [details]
ff-problem.gif

problem

Updated

2 years ago
Flags: needinfo?(amckay)

Comment 4

2 years ago
Following your steps it works well for me on OS X. 

We've seen issues around having APZ turned on, could you go to about:config and check the value of apz.allow_zooming please?
Flags: needinfo?(amckay) → needinfo?(bed-boy66)
Reporter

Comment 5

2 years ago
apz.allow_zooming is turned off
Flags: needinfo?(bed-boy66)

Updated

2 years ago
Flags: needinfo?(mconca)
Another suspect from current nightlies are retained display lists. They can be toggled by layout.display-list.retain
Reporter

Comment 7

2 years ago
Without changes..
I can absolutely reproduce this issue on my Windows 10 machine using 59 nightly.  It looks just like the attached GIF.
Flags: needinfo?(mconca)

Updated

2 years ago
Priority: -- → P1

Comment 10

a year ago
We aren't sure if this is happening at our level, or lower down, could you suggest anyone to help us debug why it might be happening?
Flags: needinfo?(milan)
If you set layers.mlgpu.enabled to false in about:config, and restart Firefox, is the problem still there?

Comment 12

a year ago
I am also able to recreate this bug on Windows 10 (after testing it on Linux and macOS). None of the workarounds suggested which alter configuration properties in about:config have done the trick. It also has troubles in the mainline Firefox, and not only Nightly.

Comment 13

a year ago
Possible duplicate of https://bugzilla.mozilla.org/show_bug.cgi?id=1418677, which has been moved to P2.
@milan Changing layers.mlgpu.enabled will not affect the bug on Win 10 with Firefox 57.0.4 and 58
Flags: needinfo?(milan)
See Also: → 1440667
Stephen, can you shed some light here?
Flags: needinfo?(spohl.mozilla.bugs)
Unfortunately, I have nothing to add here. :kmag?
Flags: needinfo?(spohl.mozilla.bugs) → needinfo?(kmaglione+bmo)
Bas, in the context of bug 1356317, anything that you can add here?
Flags: needinfo?(bas)
(In reply to Milan Sreckovic [:milan] from comment #17)
> Bas, in the context of bug 1356317, anything that you can add here?

This is very weird, the only explanation here is two overlapping layers, since it's not like it's not just clearing the white background, there is one layer actually moving. I don't see how bug 1356317 is related, but in any case, as far as I can tell this ought to be a layout issue.
Flags: needinfo?(bas)
kmag is a loop via 1416505, so removing NI.
Flags: needinfo?(kmaglione+bmo)
See Also: → 1416505
Flags: needinfo?(mconca)
(In reply to Mike Conca [:mconca] (Denver, CO, USA UTC-6) from comment #8)
> I can absolutely reproduce this issue on my Windows 10 machine using 59
> nightly.  It looks just like the attached GIF.

I have been unable to reproduce this on 60 nightly under Windows 10 and a clean profile. However, the website in the original STR may not be returning enough data to cause a scrollbar to reappear, so I am unsure.  I also tried the add-on mentioned in Comment 9 and it seems to be working fine.

Asking both add-ons if they can please try to reproduce using the latest nightly and respond as to whether this is still an issue.
Flags: needinfo?(mconca)
Flags: needinfo?(camillebarneaud)
Flags: needinfo?(bed-boy66)
Still here with Firefox 60.0a1 (2018-03-08) (64 bits), Win 10 & Wappalyzer 5.4.8.
Easy to reproduce on http://www.lemonde.fr/ for example to have enough content to be able to scroll down.

It is maybe only some placebo effect but I feel it is better than before.
Flags: needinfo?(camillebarneaud)
Added a needinfo assigned to Krupa, so that we can get QA to verify if it is still reproducible in a recent Nightly build.
Flags: needinfo?(kraj)
I wait eagerly an official confirmation! :)
With Firefox Nightly 61.0a1 (2018-03-22) (64 bits) I can still reproduce on my part.
Do not hesitate to tell me if I can help in any other way.

Comment 24

a year ago
Posted image cropped fields.jpg
I have tested on Windows 10 64Bit, with Firefox 59.0a1 (20171126220311) and Firefox 61.0a1 (20180322220118)and what is attached is what I get from every test, I tried to meddle around with the extension and the website but the list still won't populate. If it is populated for you, can you please let me know what you did to populate it(i don't speak or read Russian) and attached a gif of your reproduction of the bug? Try to reproduce the issue with a clean profile as well pls.
As for testing with Wappalyzer i was unable to reproduce the issue, the extensions pop-up is not cropped on my system.
Flags: needinfo?(kraj) → needinfo?(camillebarneaud)
Posted image Release 59.0.1.gif
The problem is not that the pop-up is cropped but that it does not repaint correctly.
Gif n°1 : release behavior
Flags: needinfo?(camillebarneaud)
Posted image Nightly 240318.gif
Gif n°2: Release

Please try to do the same with something like http://screentogif.com/ & add back kraj@mozilla.com in needinfo: why did you remove him ?
Flags: needinfo?(marius.santa)
(Kraj's team is already represented by Marius. Only one NI is necessary for QA.)

Comment 28

a year ago
Posted image repaint results.gif
I retested on Windows 10 64Bit Firefox 61.0a1 (20180325220121) and was able to reproduce the glitchy repaint.
I have attached a gif, with the same webext installed on Chrome(left) and Firefox(right).
Flags: needinfo?(marius.santa)

Comment 29

a year ago
I was only able to reproduce the build on Windows OS, not reproducible on MacOS.
Reproduced in Firefox: Release 59.0.2 (20180323154952), Beta 60.0b7 (20180326164103) and Nightly 61.0a1 (20180326100107), setting the appropriate flags and marking as new.
Status: UNCONFIRMED → NEW
Ever confirmed: true
OS: Unspecified → Windows
Hardware: Unspecified → All
(In reply to marius.santa from comment #28)
> Created attachment 8962277 [details]
> repaint results.gif
> 
> I retested on Windows 10 64Bit Firefox 61.0a1 (20180325220121) and was able
> to reproduce the glitchy repaint.
> I have attached a gif, with the same webext installed on Chrome(left) and
> Firefox(right).

I can't see anything glitchy in that gif, certainly not anything like attachment 8932055 [details]
While looking into the other P2s I noticed that Bug 1418677 has a small test extension attached that has been created specifically to trigger the same rendering issues on the extension popups that we are investigating here, the test extension is attached in Bug 1418677 comment 19.

I briefly tried it locally and I've been able to trigger it on a recent Nightly build on windows (to be fair it seems that many of the issues reported in the comments of Bug 1418677 are the same issue tracked by this one (I'm not 100% sure about the one reported in the issue summary, and so I didn't closed Bug 1418677 as a duplicate of this one yet, but I added to the "See Also" list)
Flags: needinfo?(bed-boy66)
See Also: → 1418677
@mixedpuppy Maybe with a little help you can see it :)
Markus, we're stuck here a bit; anything you could add, perhaps based on comment 31?
Flags: needinfo?(mstange)
Maybe this is caused by bug 1414033?

If turning off mlgpu and commenting out the optimization at https://searchfox.org/mozilla-central/rev/78dbe34925f04975f16cb9a5d4938be714d41897/gfx/layers/composite/ContainerLayerComposite.cpp#263 fixes it, then bug 1414033 is most likely the cause. I may be able to try this later today.
Flags: needinfo?(mstange)
I can reproduce glitches when scrolling in the Wappalyzer on http://www.lemonde.fr/ . It looks like a buffer rotation bug, as if we're failing to clear the area that wraps around during scrolling.

I cannot reproduce the glitches in the attached planfix-timer extension, because as mentioned in comment 20, something has changed so that there isn't enough content in the panel anymore to cause scrolling.

Judging by the attached screen recordings, I think the two types of glitches could be caused by two different bugs. The planfix-timer glitch could be bug 1414033 (can't verify) and the Wappalyzer glitch could be bug in buffer rotation.
Assignee

Updated

a year ago
Flags: needinfo?(rhunt)
note bug 1416505 comment 17, not sure if that would have any effect here.
For anyone who can reproduce this, it might be worth disabling transparency in the popup browsers and trying again. I imagine that would simplify the rendering process considerably, and could easily fix some of these glitches.

Those popups were originally made transparent so that they could take advantage of the (usually dialog) background color of the panels they were loaded into, but bug 1449933 removed that possibility by making the popups white when the browser had a transparent background, rather than just making the browsers opaque.
Assignee

Comment 38

a year ago
This problem is indeed with rotated buffer, but also with FLB.

What's happening is that when scrolling, the content (excluding a background) of the scrollable div is put into a painted layer. The display items do not cover the whole layer, but an opaque background color is found so the layer is marked as opaque [1].

Then when scrolling, the buffer will rotate and previously painted content will be in the region that is scrolled into view. The content client will then tell FLB to paint into a complex region of just the display items. FLB will then paint the forced background color, but only in the region of the display items [2].

This will cause the artifacts seen above as previous content will be visible if they aren't repainted over by the display items.

This is not a problem with tiling as tiling will never rotate existing content into view, and when allocating a texture client from the pool it will be cleared and painted over.

[1] https://searchfox.org/mozilla-central/rev/795575287259a370d00518098472eaa5b362bfa7/layout/painting/FrameLayerBuilder.cpp#3439
[2] https://searchfox.org/mozilla-central/rev/795575287259a370d00518098472eaa5b362bfa7/layout/painting/FrameLayerBuilder.cpp#6516
Flags: needinfo?(rhunt)
Does FLB not invalidate the part that scrolled offscreen? The previous content that shows up unexpectedly should be neither in the valid nor in the visible region of the layer.
Assignee

Comment 40

a year ago
I forgot to add that commenting out these lines [1] to verify that this is the problem, worked.

(In reply to Markus Stange [:mstange] from comment #39)
> Does FLB not invalidate the part that scrolled offscreen? The previous
> content that shows up unexpectedly should be neither in the valid nor in the
> visible region of the layer.

The part that is scrolled offscreen will be invalidated, but that doesn't mean we'll clear it from the rotated buffer. For transparent layers, we clear the area of the rotated buffer that was scrolled into view before painting [2]. For opaque layers, we assume that everything will be painted over and skip that work.

The problem is that here we have a layer without display items covering the full visible region, marked opaque, so we don't do any clearing before painting. And FLB will attempt to paint the forced background color before painting, but it clips it to the visible region so that doesn't fix this issue. [3]

We could change content client to always clear the region of the buffer that was scrolled into view before painting, but that seems wasteful in the case we actually have opaque content without a forced background color.

[1] https://searchfox.org/mozilla-central/rev/795575287259a370d00518098472eaa5b362bfa7/layout/painting/FrameLayerBuilder.cpp#3438-3443

[2] https://searchfox.org/mozilla-central/rev/ce9ff94ffed34dc17ec0bfa406156d489eaa8ee1/gfx/layers/client/ContentClient.cpp#395
[3] https://searchfox.org/mozilla-central/rev/795575287259a370d00518098472eaa5b362bfa7/layout/painting/FrameLayerBuilder.cpp#6513
Assignee

Comment 41

a year ago
I don't exactly understand how we determine to use a forced background color, but when this is the case, could we expand the visible region to be the bounds of the whole layer?

That way FLB will be able to fill in the whole region with the forced background color.
FLB shouldn't need to clear this area because it's not part of the valid or the visible region of the layer. It's the ContentClient's job to make sure that pixels outside of union(visible region, valid region) do not reach the screen.
Assignee

Comment 43

a year ago
Okay, so for this case we have a painted layer like:

> ClientPaintedLayer (0x1ef0e696c00) [transform=[ 1 0; 0 1; 0 18; ]] [effective-transform=[ 1 0; 0 1; 0 18; ]] [visible=< (x=30, y=77, w=519, h=57); (x=29, y=134, w=520, h=22); (x=30, y=156, w=519, h=218); (x=305, y=390, w=180, h=8); >] [opaqueContent] [valid=< (x=29, y=77, w=520, h=297); (x=305, y=390, w=180, h=8); >]

And because it's marked as [opaqueContent], ContentClient will not clear the 'scrolled into' region of the rotated buffer before it's painted because it assumes that it will be overwritten completely with opaque content. 

But in this case, the visible region doesn't fill the whole 'scrolled into' region, so we do need to clear the part of it outside the visible region, and mark it as invalid so FLB will fill it with the forced background color.

If that makes sense, I can write a patch for that in ContentClient.
Assignee

Comment 44

a year ago
I've got a patch that does the above and it seems to work well. I'll test more and put it up for review.
Assignee: nobody → rhunt
(In reply to Ryan Hunt [:rhunt] from comment #43)
> But in this case, the visible region doesn't fill the whole 'scrolled into'
> region, so we do need to clear the part of it outside the visible region,
> and mark it as invalid so FLB will fill it with the forced background color.

It should be marked as invalid, yes. But it doesn't necessarily have to become part of the "region to paint" - it's not really necessary for FLB to fill this part of the buffer with the color. There may just be holes in the valid region of the layer after painting, and those "valid region holes" are allowed to contain garbage. Parts of the buffer that are not within the valid region of the layer should be prevented from reaching the screen at compositing time: For example, instead of telling the Compositor to draw a quad with the entire buffer, the layer may have to be drawn as a series of smaller quads that each are completely within the layer's valid region.

Actually, "it should be marked as invalid" might even be the wrong way to put it. These areas were never part of the valid region to begin with, in layer space. They only happened to have been part of the valid region in rotated buffer space, but it's the job of the rotated buffer to make that distinction.
Assignee

Comment 46

a year ago
(In reply to Markus Stange [:mstange] from comment #45)
> (In reply to Ryan Hunt [:rhunt] from comment #43)
> > But in this case, the visible region doesn't fill the whole 'scrolled into'
> > region, so we do need to clear the part of it outside the visible region,
> > and mark it as invalid so FLB will fill it with the forced background color.
> 
> It should be marked as invalid, yes. But it doesn't necessarily have to
> become part of the "region to paint" - it's not really necessary for FLB to
> fill this part of the buffer with the color. There may just be holes in the
> valid region of the layer after painting, and those "valid region holes" are
> allowed to contain garbage. Parts of the buffer that are not within the
> valid region of the layer should be prevented from reaching the screen at
> compositing time: For example, instead of telling the Compositor to draw a
> quad with the entire buffer, the layer may have to be drawn as a series of
> smaller quads that each are completely within the layer's valid region.
> 
> Actually, "it should be marked as invalid" might even be the wrong way to
> put it. These areas were never part of the valid region to begin with, in
> layer space. They only happened to have been part of the valid region in
> rotated buffer space, but it's the job of the rotated buffer to make that
> distinction.

Interesting, does our compositor actually only paint the valid region of the
painted layer? If so then I think the problem might be in the compositor, as
content client is painting correctly within the visible region.

I've noticed that the popup only uses a D3D11 compositor even if we're using
MLGPU so there could be a bug there. I'll keep digging a bit more.

I did write up a patch that makes sure we validate the whole region that was
scrolled into of the rotated buffer. It passes try, but expands how much we
paint in this situation. And if this is supposed to be handled by the
compositor then this isn't the right fix. [1]

[1] https://treeherder.mozilla.org/#/jobs?repo=try&revision=6bb17c98661b9e8fcc3658a681ddd881de93e081
(In reply to Ryan Hunt [:rhunt] from comment #46)
> Interesting, does our compositor actually only paint the valid region of the
> painted layer? If so then I think the problem might be in the compositor, as
> content client is painting correctly within the visible region.

The Compositor API itself is on a somewhat lower level; the caller of Compositor::DrawGeometry needs to handle this. For example, here's what our tiled content client does:
https://searchfox.org/mozilla-central/rev/c0d81882c7941c4ff13a50603e37095cdab0d1ea/gfx/layers/composite/TiledContentHost.cpp#584-586
https://searchfox.org/mozilla-central/rev/c0d81882c7941c4ff13a50603e37095cdab0d1ea/gfx/layers/composite/TiledContentHost.cpp#517-530
Assignee

Comment 48

a year ago
Okay I think I know what's going on here.

The ContentHost will expand the region it paints from the visible region to the visible bounds if aLayer->MayResample() is true [1]. ContentClient will validate the visible bounds if it sees that aLayer->MayResample() is true as well [2].

The problem here is that MayResample() checks to see if its ancestor layers have CONTENT_MAY_CHANGE_TRANSFORM, and that can be different between the content and compositor side. For example, if the ancestor layers for a ref layer that contains a painted layer are marked as CONTENT_MAY_CHANGE_TRANSFORM.

The client layer will have no idea about the container layers for the ref layer it will be under, so it thinks it won't be resampled and doesn't validate the visible bounds. The shadow layer thinks that it will be resampled because of its ancestors and so it will paint the whole visible bounds. And we get this artifact.

That being said, I'm not sure why an ancestor layer in the chrome layer tree is getting marked with CONTENT_MAY_CHANGE_TRANSFORM, so I'm not sure if that's a bug or not. I'm not sure what the web extension pop up is doing that could trigger this.

It seems like in this case we either need to inform content that it may be resampled somehow, or ignore ancestor layers above a ref layer when computing MayResample().

[1] https://searchfox.org/mozilla-central/rev/c0d81882c7941c4ff13a50603e37095cdab0d1ea/gfx/layers/composite/ContentHost.cpp#77
[2] https://searchfox.org/mozilla-central/rev/c0d81882c7941c4ff13a50603e37095cdab0d1ea/gfx/layers/client/ContentClient.cpp#483
Assignee

Comment 49

a year ago
Now that I think some more about this, anding the visible bounds with the valid region in the ContentHost might get around this edge case as well. I don't have time to test that though as I am on PTO.
Whoa, that's subtle. The layer probably has CONTENT_MAY_CHANGE_TRANSFORM because of this will-change: transform that I added to work around bug 1414033 :(

https://searchfox.org/mozilla-central/rev/b28b94dc81d60c6d9164315adbd4a5073526d372/toolkit/content/xul.css#436

So I suppose we haven't hit this bug before because we've never used remote iframes inside elements whose transform was animated.
Comment hidden (mozreview-request)
Comment on attachment 8973846 [details]
Bug 1420865 - Don't check for CONTENT_MAY_CHANGE_TRANSFORM across layer trees.

https://reviewboard.mozilla.org/r/242216/#review248054

Nice.
Attachment #8973846 - Flags: review?(mstange) → review+
Assignee

Comment 53

a year ago
https://treeherder.mozilla.org/#/jobs?repo=try&revision=090e631397f155c01ca177da6e5bf33fa2be7395

Looks green to me. There is one weird R1 failure, but it didn't occur again in two retriggers so I'm ignoring it.

Comment 54

a year ago
Pushed by rhunt@eqrion.net:
https://hg.mozilla.org/integration/mozilla-inbound/rev/0a1ffd5411da
Don't check for CONTENT_MAY_CHANGE_TRANSFORM across layer trees. r=mstange

Comment 55

a year ago
bugherder
https://hg.mozilla.org/mozilla-central/rev/0a1ffd5411da
Status: NEW → RESOLVED
Last Resolved: a year ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla62

Comment 56

a year ago
Posted image repain working.gif
I retested on Windows 10 64Bit Firefox 61.0a1 (20180325220121) and was able to reproduce the glitchy repaint.
Retested and verified in Firefox 62.0a1 (20180513220424).

Updated

a year ago
Status: RESOLVED → VERIFIED

Updated

a year ago
Blocks: 1462857
Duplicate of this bug: 1462857

Updated

a year ago
No longer blocks: 1462857
Please request Beta approval on this when you get a chance.
Flags: needinfo?(rhunt)
Assignee

Comment 59

a year ago
Comment on attachment 8973846 [details]
Bug 1420865 - Don't check for CONTENT_MAY_CHANGE_TRANSFORM across layer trees.

Approval Request Comment
[Feature/Bug causing the regression]: A mismatch in checking for a property between processes causes rendering corruption in specific circumstances with web extensions

[User impact if declined]: They may see rendering corruption when scrolling a web extension window

[Is this code covered by automated tests?]: No

[Has the fix been verified in Nightly?]: Yes

[Needs manual test from QE? If yes, steps to reproduce]: Probably not, the fix has been verified in nightly and I can't think how it wouldn't work on beta

[List of other uplifts needed for the feature/fix]: None

[Is the change risky?]: No

[Why is the change risky/not risky?]: The fix has been verified in nightly and is very conceptually simple. No other regressions have been reported from the fix

[String changes made/needed]: None
Flags: needinfo?(rhunt)
Attachment #8973846 - Flags: approval-mozilla-beta?
Comment on attachment 8973846 [details]
Bug 1420865 - Don't check for CONTENT_MAY_CHANGE_TRANSFORM across layer trees.

Fix for OOP WebExtension popups, especially useful since we're expanding OOP to OSX on 61. Approved for 61.0b8.
Attachment #8973846 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
Setting esr60 to wontfix.  Feel free to reset status flag and request uplift if you disagree.

Updated

11 months ago
Product: Toolkit → WebExtensions

Comment 63

11 months ago
Would it be possible to uplift this to esr60? A couple of my extensions are affected that use the popup, such as https://addons.mozilla.org/en-US/firefox/addon/search_by_image/, and it is reproducible on all Windows devices I've tested with.
Seems safe enough. Ryan, can you please request ESR60 approval on this when you get a chance?
Flags: needinfo?(rhunt)
Assignee

Comment 65

11 months ago
Comment on attachment 8973846 [details]
Bug 1420865 - Don't check for CONTENT_MAY_CHANGE_TRANSFORM across layer trees.

[Approval Request Comment]
If this is not a sec:{high,crit} bug, please state case for ESR consideration: This affects multiple extensions, is reproducible across multiple Windows devices, and has a simple low risk fix.

User impact if declined: Users will see graphics corruption on some web extensions

Fix Landed on Version: 61

Risk to taking this patch (and alternatives if risky): Low, it's a conservative fix and has shipped in 61

String or UUID changes made by this patch: None

See https://wiki.mozilla.org/Release_Management/ESR_Landing_Process for more info.
Flags: needinfo?(rhunt)
Attachment #8973846 - Flags: approval-mozilla-esr60?
Comment on attachment 8973846 [details]
Bug 1420865 - Don't check for CONTENT_MAY_CHANGE_TRANSFORM across layer trees.

Approved for ESR 60.2.
Attachment #8973846 - Flags: approval-mozilla-esr60? → approval-mozilla-esr60+
Flags: qe-verify+

Comment 68

10 months ago
Verified that the issue is no longer reproducible on the latest Release 61 build(20180704003137) on Win10 x64, Ubuntu 16.04 and MacOS 10.13
Duplicate of this bug: 1426317

Comment 70

10 months ago
Verified that the issue is no longer reproducible on build 60.1.1esr(BuildID 20180802180640) from https://tools.taskcluster.net/index/gecko.v2.mozilla-esr60.latest.firefox on Win10 x64, Ubuntu 16.04 and MacOS 10.13.
Flags: qe-verify+
You need to log in before you can comment on or make changes to this bug.