Closed Bug 1481466 Opened 2 years ago Closed 2 years ago

Themes are incorrectly displayed in preview mode

Categories

(Firefox :: Toolbars and Customization, defect, P1)

defect

Tracking

()

VERIFIED FIXED
Firefox 63
Tracking Status
firefox-esr52 --- unaffected
firefox-esr60 --- unaffected
firefox61 --- unaffected
firefox62 --- unaffected
firefox63 --- verified

People

(Reporter: tbabos, Assigned: Paolo)

References

Details

(Keywords: regression)

Attachments

(4 files, 1 obsolete file)

Attached image Gif. of the issue
[Affected versions]:
Latest Nightly 63.0a1(2018-08-07)

[Affected Platforms]:
Windows 10 x64
Windows 7 x64
Mac OS 10.13

[Steps to reproduce]: 
1. Launch Firefox
2. Enter Customize mode
3. Click on Themes to view the pop-up
4. Hover over the themes to preview them

[Expected result]:
The themes are correctly displayed in the preview mode.

[Actual result]:
Half of the screen is cut during the preview mode of every theme.

[Notes]:
It can be reproduced only once in a new profile and will not persist after quitting preview mode. 

[Regression-range]:
Last good revision: 8580f0233058dd0c9ed4353f8dcb366eb5a624de
First bad revision: 2e8624a457a61ad282e163741c7b1fad91df4434
Pushlog:
https://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?fromchange=8580f0233058dd0c9ed4353f8dcb366eb5a624de&tochange=2e8624a457a61ad282e163741c7b1fad91df4434
Keywords: regression
ni?ing paolo, since this looks like fallout from bug 1461793.

Setting to P1, since this is a pretty brutal visual glitch.
Blocks: 1461793
Flags: needinfo?(paolo.mozmail)
Priority: -- → P1
I'll look into this.
Assignee: nobody → paolo.mozmail
Status: NEW → ASSIGNED
Flags: needinfo?(paolo.mozmail)
This looks like a race condition somewhere in layout code. It's definitely intermittent, and can be reproduced multiple times in the same profile. I tried to comment out the code that applies the theme here:

https://dxr.mozilla.org/mozilla-central/rev/4e56a2f51ad739ca52046723448f3129a58f1666/toolkit/modules/LightweightThemeConsumer.jsm#178-213

While the issue seemed to be gone at first, it happened again when hovering one of the theme buttons, even without the lightweight theme changing. So, it doesn't look related to the changes applied by the theme manager, and it may some sort of timing issue.

Flipping the checkbox on any CSS property in the Inspector fixed the glitch, probably by forcing layout again.

Backing out bug 1461793 fixes the issue permanently. That bug landed with a workaround for another timing issue, the empty "data:text/css,", so we may be in a similar situation here.

I don't think I have the expertise to debug this further, I was thinking of Neil but I see he's away now. Mike, can you recommend someone whose help we can ask for? Worst case we can back out the bug for now, but at some point we should probably have the underlying issue investigated.
Flags: needinfo?(mconley)
If it does have the same root cause that led to the "data:text/css," fix, it may be easier for debugging to remove that line (https://searchfox.org/mozilla-central/rev/ca869724246f4230b272ed1c8b9944596e80d920/toolkit/content/widgets/popup.xml#135) and then debug the test_arrowpanel.xul failure.
Curious, does removing the "data:text/css," resource make this issue go away? I assume not, but worth checking.
(In reply to Brian Grinstead [:bgrins] from comment #5)
> Curious, does removing the "data:text/css," resource make this issue go
> away? I assume not, but worth checking.

Well, actually that was a good suggestion! I didn't think about trying, but removing the <resources> fixes the issue.

Maybe it is just the "data" scheme. I haven't tried linking to an empty CSS file, but even if that works, as far as workarounds go, I'm not sure whether to do that or just disable the failing portion of the arrowpanel test.

If there was someone available to investigate bug 1470880 we may also be able to fix the issue properly.
> Maybe it is just the "data" scheme. I haven't tried linking to an empty CSS file, but even if that works, as far as workarounds go, I'm not sure whether to do that or just disable the failing portion of the arrowpanel test.

Worth testing that - IMO switching to an empty CSS file would be better than backing out Bug 1461793. Of course, the best case would be fixing Bug 1470880.
Unfortunately, that doesn't work. Also moving <resources> to the base binding doesn't help.

Note that if we can't fix bug 1470880, the alternative I suggest is to disable the failing portion of the arrowpanel test, rather than backing out bug 1461793.
(In reply to :Paolo Amadini from comment #8)
> Note that if we can't fix bug 1470880, the alternative I suggest is to
> disable the failing portion of the arrowpanel test, rather than backing out
> bug 1461793.

That seems like a reasonable start to me, if we can target a small part of the test to skip. I'm assuming the failing portion isn't that important to the browser window, since it doesn't seem to manifest in any mochitest-browser tests. But we'd want to confirm exactly what's failing and if that has an affect on the chrome.
I'll try to pick apart what's happening here tomorrow.
I've reduced this about as far as I can, but I'm at the point where I think I'm going to need some insight from some of our layout engineers.

STR for macOS:

1. Apply attachment 9002081 [details] and build. This disables a bunch of the lightweight theme backend, but keeps the behaviour inside browser-compacttheme.js, which toggles the enabled state of a StyleSheet in the browser.xul document (compacttheme.css).

2. Start up the browser, and open up Customize Mode
3. Click on "Themes" at the bottom, and move the mouse straight up to either "Dark" or "Light".
4. Wait a second - about the time it would take for the tooltip to appear.
5. The bug should appear. If this doesn't happen, retry from Step 2. Make sure to take the most direct route possible with the mouse.

Timing is key - see video: https://www.screencast.com/t/ARKErVAGyy

ER:

A tooltip to appear.

AR:

Layout goes a bit funky. The XUL "customization-container" box, which has flex="1" (but also display: flex), shrinks itself down for some reason.


The compacttheme.css file is key here - removing the (empty!) -moz-lwtheme rules makes the bug go away. Also, remember, this is a regression from bug 1461793... so likely this has something to do with tooltips?
Flags: needinfo?(mconley) → needinfo?(emilio)
So far I haven't made a _lot_ of progress on this (sorry, got sidetracked!).

In any case, with a bit of logging, I can see that the only style change that is present in the 'bug' version and in the non-bug version is the following:

<menupopup id="BMB_bookmarksPopup" class="cui-widget-panel cui-widget-panelview..." placespopup="true" context="placesContext" openInTabs="children" onmouseup="BookmarksEventHandler.onMouseUp(event);" oncommand="BookmarksEventHandler.onCommand(event);" onclick="BookmarksEventHandler.onClick(event, ..." onpopupshowing="BookmarkingUI.onPopupShowing(event); ..." tooltip="bhTooltip" popupsinherittooltip="true" flip="both" side="top" position="bottomcenter topright" style="pointer-events: none;"> (0x7f0f639fe4c0)
 > Style changed (reset_only: true)
 > damage: GeckoRestyleDamage(nsChangeHint(8388614))

(The rest are the expected repaints from hovering over the 'Dark' menu item).

I'll keep looking at this, but does that menupopup ring a bell to anyone? It does seem related to the bug that regressed this.
So, I have good news and bad news. :)

Good news:

 * I'm pretty sure this is not a style system bug.
 * I sort of understand the problem here.

Bad news:

 * The problem lies in XUL layout code, and is triggered by a very specific set of conditions (node insertions, removals, and this margin reflow I'll mention below).
 * And thus we're not ultra-likely to fix it anytime soon.

So, to summarize... The only difference between the 'good' (expected results) and the 'bad' (actual results) is a style change on the bookmarks popup (above are the different rules). Given the style change, the side="top" attribute was added to the popup.

This is a margin change which makes that some of the other elements (including the one that ends up mis-sized) get reflowed, and the bug lies there on reflow.

The rest of style changes are expected (from hovering the theme item) and don't cause any layout. The style change that matters is attached here. There are other content insertions and removals which do cause reflow, and are the trigger of the bug along with the margin change.

Now, some questions:

 * How is it that you can reproduce it with empty :-moz-lwtheme rules, but not with other rules?

We optimize stylesheet insertions and removals, so you need to specify a selector that the invalidation code doesn't understand in order to trigger a full restyle and presumably make the next driver tick slow enough that the timing matches. That's why emptying the stylesheet / commenting out the stylesheet enabled change "fixes" it.

Re the :-moz-lwtheme selectors, that was a red herring. I can reproduce the bug just with:

  :root { }
  :root { }

Or with:

  * { }
  * { }

for that matter.

(I haven't managed to reproduce the bug with just one rule, which is _really_ weird, but I'm pretty sure at this point that that's not the root cause of the issue).

 * How can it be fixed / worked around?

I have no idea of where this attribute setting happens, but we probably need to ensure it happens at a time where other insertions and removals don't trigger this bug.

Presumably dholbert would be the right person to look at the actual layout side of this. The issue is that the XUL box frame is reflowing the flex container (#customization-container) with indefinite width and height, so the flex container gets the height of the sum of its children, and then nothing goes along to reflow it with the right size.

What's going on is that some code calls GetXULBoxAscent on the parent of #customization-container. That ends up entering in RefreshSizeCache:

  https://searchfox.org/mozilla-central/rev/ef8b3886cb173d5534b954b6fb7eb2d94a9473d0/layout/generic/nsFrame.cpp#10130

Which ends up reflowing #customization-container with unconstrained width and height:

  https://searchfox.org/mozilla-central/rev/ef8b3886cb173d5534b954b6fb7eb2d94a9473d0/layout/generic/nsFrame.cpp#10181

But then nothing actually tells us to reflow again, so we end up with the bogus height. So at this point I'm moderately sure this is a XUL layout bug.

Sorry I cannot provide a platform-side fix.
Flags: needinfo?(emilio)
Not the prettiest, but this fixes the issue here. Note that I can only reproduce with your reduction here, but assuming the same style change is the problematic one, this should work around it.

Does this fix the bug for you Mike / Paolo?
Attachment #9002168 - Flags: feedback?(paolo.mozmail)
Attachment #9002168 - Flags: feedback?(mconley)
(In reply to Emilio Cobos Álvarez (:emilio) from comment #14)
> I have no idea of where this attribute setting happens, but we probably need
> to ensure it happens at a time where other insertions and removals don't
> trigger this bug.

adjustArrowPosition from the arrowpanel binding is responsible for this:

https://searchfox.org/mozilla-central/rev/ef8b3886cb173d5534b954b6fb7eb2d94a9473d0/toolkit/content/widgets/popup.xml#182,185,199,202

It gets called from a popuppositioned event:

https://searchfox.org/mozilla-central/rev/ef8b3886cb173d5534b954b6fb7eb2d94a9473d0/toolkit/content/widgets/popup.xml#261
(In reply to Emilio Cobos Álvarez (:emilio) from comment #15)
> Does this fix the bug for you Mike / Paolo?

This fixes it for me. Since there's an underlying platform bug that we're unlikely to be able to fix anyways, I'm fine with landing this workaround, with a comment, although there is still an open question below.

(In reply to Dão Gottwald [::dao] from comment #16)
> It gets called from a popuppositioned event:

I'm wondering why we raise this event for the bookmarks popup, which is never shown? The JavaScript stack trace doesn't show anything before that event. Not sure if related, but if I place the Bookmarks Menu Button in the toolbar, the bug doesn't occur after restarting.

Also note that for me at least, to reproduce the bug I don't have to wait for the time required for the tooltip to appear.

(In reply to Emilio Cobos Álvarez (:emilio) from comment #14)
> [...] XUL box frame [...] flex container [...] I'm moderately sure this is a XUL layout bug.

Quoting what Brian said in a numerous other occasions, this is a good reminder of why we're doing the work of removing XUL. :-)
Comment on attachment 9002168 [details] [diff] [review]
Avoid the style change to workaround XUL reflow bug.

Review of attachment 9002168 [details] [diff] [review]:
-----------------------------------------------------------------

I'm happy with this if it sidesteps the XUL layout bug. Yep, another textbook example of us getting burned by ol' mysterious XUL layout.
Attachment #9002168 - Flags: feedback?(mconley) → feedback+
Attachment #9002168 - Flags: feedback?(paolo.mozmail) → review+
Pushed by paolo.mozmail@amadzone.org:
https://hg.mozilla.org/integration/mozilla-inbound/rev/c0f5b64fe563
Avoid a style change to work around a XUL reflow bug. r=paolo
I'm having a hard time understanding your workaround. Care to explain? side="top" should already be set by default:

https://searchfox.org/mozilla-central/rev/ef8b3886cb173d5534b954b6fb7eb2d94a9473d0/toolkit/content/widgets/popup.xml#138
Flags: needinfo?(paolo.mozmail)
Flags: needinfo?(emilio)
Attached file Stack of the change.
Ugh, XBL, never surprised... Just out of curiosity I recorded the attribute change. That attribute change is the problematic one (the one where we sync the <content> attributes back to the bound element), as can be seen from this stack.

That's breaking invariants, fwiw... We're setting the attribute without notifying, but the attribute changes style of something we're constructing frames for already.

That's very much unsound in general. For example, what if we had a rule that said:

  menupopup[side="top"] { -moz-biding: url(not-popup.xml) }

We'd never end up loading that binding, until we randomly recomputed its style for other reasons.

Anyway, we just happen to notice the style and layout change caused by that attribute set sometime later while recomputing style of the <menupopup> for an unrelated reason.

The workaround still avoids this.
Flags: needinfo?(emilio)
Thanks Emilio!
Flags: needinfo?(paolo.mozmail)
https://hg.mozilla.org/mozilla-central/rev/c0f5b64fe563
Status: ASSIGNED → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED
Target Milestone: --- → Firefox 63
I have reproduced this bug with Nightly 63.0a1 (2018-08-07) with Windows 10, 64 Bit!
This bug's fix is verified with latest Nightly!

Build ID 	20180827100129
User Agent 	Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:63.0) Gecko/20100101 Firefox/63.0
QA Whiteboard: [bugday-20180829]
Attachment #9002081 - Attachment is obsolete: true
Ba(In reply to Mohammad Maruf Rahman from comment #24)
> I have reproduced this bug with Nightly 63.0a1 (2018-08-07) with Windows 10,
> 64 Bit!
> This bug's fix is verified with latest Nightly!
> 
> Build ID 	20180827100129
> User Agent 	Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:63.0)
> Gecko/20100101 Firefox/63.0

Awesome, thanks!
Status: RESOLVED → VERIFIED
You need to log in before you can comment on or make changes to this bug.