stylo: hover state style doesn't refresh correctly

VERIFIED FIXED in Firefox 57

Status

()

defect
P1
normal
VERIFIED FIXED
2 years ago
2 years ago

People

(Reporter: jkt, Assigned: heycam)

Tracking

(Blocks 1 bug)

Trunk
mozilla57
x86_64
Linux
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(firefox-esr52 unaffected, firefox55 unaffected, firefox56 wontfix, firefox57 fixed)

Details

Attachments

(3 attachments, 1 obsolete attachment)

Posted image Selection_719.png
STR

1. Go to: https://github.com/w3c/web-platform-tests/
2. Hover over the files area

Expected

Only current rows to hightlight

Actual

Sometimes rows or cells stay styled as if hovered (see screenshot attachment)

- layout.css.servo.enabled = true
- Linux Ubuntu 16.04
- 56.0a1 (2017-07-17) (64-bit) en-gb Nightly
One thing that may have compounded this issue is all my CPU cores were maxed out compiling Firefox.
It doesn't seem to be related to a busy CPU. I can reproduce this on the page reliably.

It could either that we don't restyle correctly, or we don't generate the correct hint.
I haven't been able to repro so far, and I've tried quite a few times :/
I was able to repro just one time since this was reported. I tried a full style flush and nothing changes, so this is not a style invalidation issue per se...

We could either be generating incorrect changehints, or this is some other paint invalidation issue that happens to be wallpapered in current Gecko because it restyles more frames... Unless there's strong evidence, I'd think it's probably the former.
Priority: -- → P3
I can reproduce this occasionally, when I madly move the mouse over the table.  In the devtools, it shows that a cell that has the mistaken highlighting matches the "table.files tr.navigation-focus td" rule, while the one next to it in the same row doesn't.  And the DOM inspector shows that class="navigation-focus" is not on the <tr>.  So it seems like a restyling bug rather than a change hint / invalidation bug.
If I set STYLO_THREADS=1 this is very easy to reproduce, basically every new cell I hover over leaves some incorrectly highlighted cells.
Disabling the style sharing cache makes the problem go away.
I'll take a deeper look.  (And try to learn a bit more about the style sharing cache in the process.)
Assignee: nobody → cam
Status: NEW → ASSIGNED
Priority: P3 → P2
(In reply to Cameron McCormack (:heycam) from comment #5)
> I can reproduce this occasionally, when I madly move the mouse over the
> table.  In the devtools, it shows that a cell that has the mistaken
> highlighting matches the "table.files tr.navigation-focus td" rule, while
> the one next to it in the same row doesn't.  And the DOM inspector shows
> that class="navigation-focus" is not on the <tr>.  So it seems like a
> restyling bug rather than a change hint / invalidation bug.

Huh, nice catch. FWIW, we're supposed to reject stuff with different classes in here:

  http://searchfox.org/mozilla-central/rev/09c065976fd4f18d4ad764d7cb4bbc684bf56714/servo/components/style/sharing/mod.rs#667

But will leave it to your investigation to see what's going on. Thanks for taking a closer look!
In the test, just before the final toggling of class names, we have this:

  <div id=first class=red>
    <span></span>
  </div>
  <div id=second>
    <span></span>
  </div>

and at this point, the two <div>s share ComputedValues, since at the time they were first styled, neither had any class on them.  Toggling the classes change it to this:

  <div id=first>
    <span></span>
  </div>
  <div id=second class=red>
    <span></span>
  </div>

and with the propagation of invalidations, we only restyle the two <span>s.  When styling the second <span>, we choose the first <span> as a candidate.  They have different parents, but the parents have the same ComputedValues, so we think we think it's safe to share their styles.  But the differing class="" on the parents really should prevent that.

Can we somehow efficiently mark the <div>s as no longer being suitable as a parent for style sharing purposes?
Yeah, so this is tricky.

Read the IRC chat between Cameron and I in:

 http://logs.glob.uno/?c=mozilla%23servo&s=31+Jul+2017&e=31+Jul+2017#c722855

We should also look into why we're not managing to share styles with the parallel traversal (at least in the reduced test-case), looks like we should be able to.

But basically, this is a cousin sharing bug because we assume that if the parents have the same style two children may not have

But anyway, the best options so far I think are:

 * Check the WAS_RESTYLED flag on the parent -> Implies disabling cousin sharing for the initial style case.
 * Add a WAS_INITIALLY_STYLED flag -> Implies having to also clear restyle state in the frame constructor.

I think the second option is nicer, because it'd allow us to remove [1]. However it may need more work during frame construction, which is already hot...

Bobby, Boris, any opinion on the two approaches?

[1]: http://searchfox.org/mozilla-central/rev/09c065976fd4f18d4ad764d7cb4bbc684bf56714/layout/base/ServoRestyleManager.cpp#565
Flags: needinfo?(bzbarsky)
Flags: needinfo?(bobbyholley)
(In reply to Emilio Cobos Álvarez [:emilio] from comment #12)
> But basically, this is a cousin sharing bug because we assume that if the
> parents have the same style two children may not have

Err, forgot to finish typing this.

The thing is that the style sharing cache assumes that if the parents have the same style, they also match the same (non-revalidation) selectors on the page, and this doesn't hold when you have a restyle that affect only the children (like the test-case).
I'm not sure I understand the two proposals from comment 12.  First of all, what are we actually proposing to check on what elements on what point?  Second, why do those checks fix the problem?
Flags: needinfo?(bzbarsky) → needinfo?(emilio+bugs)
(In reply to Boris Zbarsky [:bz] from comment #14)
> I'm not sure I understand the two proposals from comment 12.  First of all,
> what are we actually proposing to check on what elements on what point? 
> Second, why do those checks fix the problem?

So, (1) would meant to check that if the parents differ, both parents contains the WAS_RESTYLED flag, to ensure that we've tried to match the revalidation selectors, and we've tried to insert it into the cache. That has the drawback of effectively disabling cousin sharing for initial style, since that flag isn't set there (since we don't clean it up).

(2) would mean to add another flag (and do the relevant cleanup in the frame constructor), and check for those.
Flags: needinfo?(emilio+bugs)
> both parents contains the WAS_RESTYLED flag

Ah, and since that's per-traversal it really does give us the invariant we need here, right?

Disabling cousing sharing for initial style seems pretty bad, though.

> (2) would mean to add another flag 

With what semantics?
(In reply to Boris Zbarsky [:bz] from comment #16)
> > both parents contains the WAS_RESTYLED flag
> 
> Ah, and since that's per-traversal it really does give us the invariant we
> need here, right?

Right.

> Disabling cousing sharing for initial style seems pretty bad, though.

Yeah...

> > (2) would mean to add another flag 
> 
> With what semantics?

Pretty much the same, just need to clear it from the frame constructor in order to properly clean it up when requesting the style there.
> Pretty much the same

In the "per-traversal" sense?  But not in the "only set for non-initial traversals" sense?

Why do we need to involve the frame constructor?  I'd think the traversal could handle this flag itself; I feel like I'm missing something.
> We should also look into why we're not managing to share styles with the parallel traversal

I filed bug 1385982 on this bit.
(In reply to Boris Zbarsky [:bz] from comment #18)
> > Pretty much the same
> 
> In the "per-traversal" sense?  But not in the "only set for non-initial
> traversals" sense?

Right, in the per-traversal sense.

> Why do we need to involve the frame constructor?  I'd think the traversal
> could handle this flag itself; I feel like I'm missing something.

So we can set that flag, but it needs to get reset somewhere. I suppose we could do that on the next traversal, assuming we only use it for this, given that if the traversal never arrives at the element we wouldn't care about it.

But it feels slightly saner to guarantee that it's only set during the first traversal, and can't get set indefinitely. For that we'd need something to clear it up, which would be the frame constructor whenever it processes those elements.
> So we can set that flag, but it needs to get reset somewhere.

I still feel like I'm missing something.  Why can't we just have a second flag on RestyleData?  Or some other data structure that hangs off ElementData and we clear when we finish the initial traversal?  This would be done by the traversal machinery, just like the RestyleData clearing, right?  Or is that not cleared by traversal?  I still feel like I'm missing something here....

The other problem I'm having is that if we _could_ share the styles of the two kids (e.g. they both got the "red" class) we'd still not do that with either of the proposed solutions, right?  I'm not sure what we can do about that yet.  Still trying to update my restyle cache mental model to non-toplevel traversals and the resulting violations of what I thought were invariants.
(In reply to Boris Zbarsky [:bz] from comment #21)
> > So we can set that flag, but it needs to get reset somewhere.
> 
> I still feel like I'm missing something.  Why can't we just have a second
> flag on RestyleData?  Or some other data structure that hangs off
> ElementData and we clear when we finish the initial traversal?  This would
> be done by the traversal machinery, just like the RestyleData clearing,
> right?  Or is that not cleared by traversal?  I still feel like I'm missing
> something here....

So the WAS_RESTYLED flag is not cleared by the traversal itself, but by the post-traversal, as we peek up change-hints.

The point is that the initial traversal doesn't have a corresponding post-traversal (well, the frame construction is something like that, that's why I think clearing that stuff there would be nice, to also remove the special-case of posting reconstruct hints and having to clear all the restyle state from the subtree before-hand).

> The other problem I'm having is that if we _could_ share the styles of the
> two kids (e.g. they both got the "red" class) we'd still not do that with
> either of the proposed solutions, right?

Right
(In reply to Emilio Cobos Álvarez [:emilio] from comment #13)
> The thing is that the style sharing cache assumes that if the parents have
> the same style, they also match the same (non-revalidation) selectors on the
> page, and this doesn't hold when you have a restyle that affect only the
> children (like the test-case).

In what way does it make this assumption? It seems to me that it doesn't matter so much what the parents match per se, but rather that the parents should be indistinguishable as far as parent/ancestor selectors are concerned (when matching the children). Unless that's what you meant?

Anyway, just to make sure I understand correctly: the style sharing algorithm assumes that any ancestors that shared styles must also have identical class+id+state+preshints+revalidationValues. But the invalidation algorithm avoids avoids restyling the parent on class/state changes if the style of the parent would be unaffected.

It seems like we really want a really specific bit: one that says "we cleverly avoided restyling this element with our invalidator magic", and have the style sharing cache check that bit when making assumptions about parents with pointer-equal styles. This may be what emilio meant with (2).

This additional state doesn't seem to me like it would pose any additional post-traversal headaches, since we already have various other bits that need to be cleared. I'm going to try to improve the setup for that stuff a bit with the other bugs that I'm working on, but if we have to clear some bits already it doesn't hurt to add another.
Flags: needinfo?(bobbyholley)
(In reply to Bobby Holley (:bholley) (busy with Stylo) from comment #23)
> (In reply to Emilio Cobos Álvarez [:emilio] from comment #13)
> > The thing is that the style sharing cache assumes that if the parents have
> > the same style, they also match the same (non-revalidation) selectors on the
> > page, and this doesn't hold when you have a restyle that affect only the
> > children (like the test-case).
> 
> In what way does it make this assumption? It seems to me that it doesn't
> matter so much what the parents match per se, but rather that the parents
> should be indistinguishable as far as parent/ancestor selectors are
> concerned (when matching the children). Unless that's what you meant?
> 
> Anyway, just to make sure I understand correctly: the style sharing
> algorithm assumes that any ancestors that shared styles must also have
> identical class+id+state+preshints+revalidationValues. But the invalidation
> algorithm avoids avoids restyling the parent on class/state changes if the
> style of the parent would be unaffected.

This is right.

> It seems like we really want a really specific bit: one that says "we
> cleverly avoided restyling this element with our invalidator magic", and
> have the style sharing cache check that bit when making assumptions about
> parents with pointer-equal styles. This may be what emilio meant with (2).

Kind of, but not really. The problem with this (and on why we need the more general-purpose bit) is that this would only mark one element. We would need to also mark every other element in the parent chain between the element when the invalidation starts... Maybe that's fine too.

> This additional state doesn't seem to me like it would pose any additional
> post-traversal headaches, since we already have various other bits that need
> to be cleared. I'm going to try to improve the setup for that stuff a bit
> with the other bugs that I'm working on, but if we have to clear some bits
> already it doesn't hurt to add another.

Yeah, this may be another option... Has the drawback of introducing hacks (and extra bits) on the invalidation code itself to fullfill the purpose of the style sharing cache though, which I'd rather avoid.

What I propose with (2) would mostly be equivalent, assuming we move part of the restyle-state clearing into the frame constructor, and wouldn't have this issue afaict.
(In reply to Emilio Cobos Álvarez [:emilio] from comment #24)
> (In reply to Bobby Holley (:bholley) (busy with Stylo) from comment #23)
> > (In reply to Emilio Cobos Álvarez [:emilio] from comment #13)
> > > The thing is that the style sharing cache assumes that if the parents have
> > > the same style, they also match the same (non-revalidation) selectors on the
> > > page, and this doesn't hold when you have a restyle that affect only the
> > > children (like the test-case).
> > 
> > In what way does it make this assumption? It seems to me that it doesn't
> > matter so much what the parents match per se, but rather that the parents
> > should be indistinguishable as far as parent/ancestor selectors are
> > concerned (when matching the children). Unless that's what you meant?
> > 
> > Anyway, just to make sure I understand correctly: the style sharing
> > algorithm assumes that any ancestors that shared styles must also have
> > identical class+id+state+preshints+revalidationValues. But the invalidation
> > algorithm avoids avoids restyling the parent on class/state changes if the
> > style of the parent would be unaffected.
> 
> This is right.
> 
> > It seems like we really want a really specific bit: one that says "we
> > cleverly avoided restyling this element with our invalidator magic", and
> > have the style sharing cache check that bit when making assumptions about
> > parents with pointer-equal styles. This may be what emilio meant with (2).
> 
> Kind of, but not really. The problem with this (and on why we need the more
> general-purpose bit) is that this would only mark one element. We would need
> to also mark every other element in the parent chain between the element
> when the invalidation starts... Maybe that's fine too.

You mean between the element and where the invalidation starts and the nearest descendant that actually gets restyled? This is what I was imagining, but I don't know the invalidator setup so I don't exactly know how hard it is.

> 
> > This additional state doesn't seem to me like it would pose any additional
> > post-traversal headaches, since we already have various other bits that need
> > to be cleared. I'm going to try to improve the setup for that stuff a bit
> > with the other bugs that I'm working on, but if we have to clear some bits
> > already it doesn't hurt to add another.
> 
> Yeah, this may be another option... Has the drawback of introducing hacks
> (and extra bits) on the invalidation code itself to fullfill the purpose of
> the style sharing cache though, which I'd rather avoid.
> 
> What I propose with (2) would mostly be equivalent, assuming we move part of
> the restyle-state clearing into the frame constructor, and wouldn't have
> this issue afaict.

I don't follow this. It seems like we need to clear all the restyle state somewhere. We're currently doing it in the Post-Traversal, which means it doesn't really cost us any complexity to add another bit. How/why would we move that work to the frame constructor?
(In reply to Bobby Holley (:bholley) (busy with Stylo) from comment #25)
> You mean between the element and where the invalidation starts and the
> nearest descendant that actually gets restyled? This is what I was
> imagining, but I don't know the invalidator setup so I don't exactly know
> how hard it is.

Not too hard. Slightly harder to verify its soundness when new stuff is inserted to the document in the same restyle I suppose...

> > What I propose with (2) would mostly be equivalent, assuming we move part of
> > the restyle-state clearing into the frame constructor, and wouldn't have
> > this issue afaict.
> 
> I don't follow this. It seems like we need to clear all the restyle state
> somewhere. We're currently doing it in the Post-Traversal, which means it
> doesn't really cost us any complexity to add another bit. How/why would we
> move that work to the frame constructor?

Right, I'm not proposing to move _all_ clearing to the frame constructor, just make the frame constructor _also_ clear it, which is pretty much addressing this: http://searchfox.org/mozilla-central/rev/bbc1c59e460a27b20929b56489e2e55438de81fa/layout/base/ServoRestyleManager.cpp#593
Priority: P2 → --
So I'm still thinking this through, but I'd like us to seriously consider the revalidation selector approach and not dismiss it out of hand.

We'd only need to do this for class selectors that are not in the rightmost complex selector.  And the bloom filter would filter things out in common cases, I'd think.

One issue with that approach is that it's not very robust.  It would fix this for class changes, but not, say, state changes, or various other things that we check right now before deciding to share style...  Maybe that's the best argument against it.
Priority: -- → P1
I think I really want to land something here for now. I'm going to send the more minimal patch that fixes the bug so taking a decision on the fix and reverting it when we implement it is easy...
(In reply to Emilio Cobos Álvarez [:emilio] from comment #29)
> Created attachment 8893419 [details]
> Bug 1381821: Prevent cousin sharing if we haven't restyle the parents.

s/restyle/restyled on the commit message of course. And the reftest is going to land too.
Comment on attachment 8893419 [details]
Bug 1381821: Prevent cousin sharing if we haven't restyled the parents.

https://reviewboard.mozilla.org/r/164526/#review169816

::: servo/components/style/sharing/checks.rs:43
(Diff revision 2)
> +    let second_data = second.borrow_data().unwrap();
> +
> +    // FIXME(emilio): This check effectively disables cousin style sharing
> +    // on the initial style.
> +    //
> +    // It's kind of unfortunate, but probably not too bad. An option to avoid

Why would it not be too bad?  Do we have any evidence for this?

Put another way, doesn't this disable cousin sharing on initial pageload on a bunch of pages?

I really don't see why this is ok.  If we're going to have any prayer of stylo being anywhere close to Gecko's memory usage, we really need to have working cousin sharing.
Attachment #8893419 - Flags: review?(bzbarsky) → review-
Comment on attachment 8893419 [details]
Bug 1381821: Prevent cousin sharing if we haven't restyled the parents.

https://reviewboard.mozilla.org/r/164526/#review169816

> Why would it not be too bad?  Do we have any evidence for this?
> 
> Put another way, doesn't this disable cousin sharing on initial pageload on a bunch of pages?
> 
> I really don't see why this is ok.  If we're going to have any prayer of stylo being anywhere close to Gecko's memory usage, we really need to have working cousin sharing.

I guess if this is a temporary measure because this is blocking dogfooding I could live with this.  But it really better be very very temporary, with a P1 bug filed to fix this properly.
Comment on attachment 8893419 [details]
Bug 1381821: Prevent cousin sharing if we haven't restyled the parents.

https://reviewboard.mozilla.org/r/164526/#review169816

> I guess if this is a temporary measure because this is blocking dogfooding I could live with this.  But it really better be very very temporary, with a P1 bug filed to fix this properly.

I was thinking on the full restyles we do when a stylesheet is added to the document... But that's fair I guess.
Comment on attachment 8893433 [details]
Bug 1381821: reftest.

https://reviewboard.mozilla.org/r/164532/#review169826

::: layout/reftests/bugs/1381821.html:9
(Diff revision 1)
> +.red > span { color: red; }
> +</style>
> +<div id="first"><span></span></div>
> +<div id="second"><span>This text should be green.</span></div>
> +<script>
> +document.body.offsetTop;

Using something like "second.offsetTop" has a better chance of not being optimized out in the future.

Or better yet, getComputedStyle(second.firstChild).color, getComputedStyle(first.firstChild).color, since those are the things we're really trying to flush out, right?
Attachment #8893433 - Flags: review?(bzbarsky) → review+
Comment on attachment 8893419 [details]
Bug 1381821: Prevent cousin sharing if we haven't restyled the parents.

https://reviewboard.mozilla.org/r/164526/#review169816

> I was thinking on the full restyles we do when a stylesheet is added to the document... But that's fair I guess.

The thing is, when we add a sheet to the document in the common case (sheet in <head>, etc) we have nothing to restyle because we have not inited the presshell yet.  For pages that are coded properly (don't start throwing in sheets after a bunch of content has loaded) we don't end up doing full-page restyles.

Per IRC discussion, I'm ok with this landing for the moment to unbreak nighlies, with the understanding that we need to fix this properly ASAP and that all memory measurements will continue to be useless until we do...
Comment on attachment 8893419 [details]
Bug 1381821: Prevent cousin sharing if we haven't restyled the parents.

https://reviewboard.mozilla.org/r/164526/#review169816

> The thing is, when we add a sheet to the document in the common case (sheet in <head>, etc) we have nothing to restyle because we have not inited the presshell yet.  For pages that are coded properly (don't start throwing in sheets after a bunch of content has loaded) we don't end up doing full-page restyles.
> 
> Per IRC discussion, I'm ok with this landing for the moment to unbreak nighlies, with the understanding that we need to fix this properly ASAP and that all memory measurements will continue to be useless until we do...

Oh, and please take out the "probably not too bad" bit and replace it with a comment that describes the actual situation (it's bad), with a link to the followup bug.
Comment on attachment 8893419 [details]
Bug 1381821: Prevent cousin sharing if we haven't restyled the parents.

https://reviewboard.mozilla.org/r/164526/#review169834
Attachment #8893419 - Flags: review- → review+
See Also: → 1387116
Duplicate of this bug: 1387040
Duplicate of this bug: 1387224
Attachment #8893419 - Attachment is obsolete: true
Duplicate of this bug: 1387321
(In reply to Emilio Cobos Álvarez [:emilio] from comment #43)
> Servo bits at https://github.com/servo/servo/pull/17968

Verified fixed in Nightly 57 x64 20170804100354 @ Debian Testing.
Can no longer reproduce comment 0.
Will mark this as verified when the reftest landed.
No longer blocks: 1385948
Duplicate of this bug: 1385948
https://hg.mozilla.org/mozilla-central/rev/11c744cde428
Status: ASSIGNED → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla57
Verified in comment 46
Status: RESOLVED → VERIFIED
Version: unspecified → Trunk
status-firefox56 = affected
Blocking bug 1381147 because we'd like to uplift this fix before starting the Stylo experiment on Beta 56.
(In reply to Chris Peterson [:cpeterson] from comment #51)
> Blocking bug 1381147 because we'd like to uplift this fix before starting the Stylo experiment on Beta 56.

This seems forgotten. (for comparison, beta 56 got a wontfix in bug 1385831 comment 18).
Thanks. I don't think we will uplift this to Beta 56. We've decided to run a smaller experiment on Beta than originally planned, so we won't be uplifting as many fixes.
You need to log in before you can comment on or make changes to this bug.