Closed Bug 1371955 Opened 2 years ago Closed 2 years ago

stylo: Figure out why we reflow the whole document when restyling the HTML spec when we shouldn't.

Categories

(Core :: CSS Parsing and Computation, enhancement, P1)

enhancement

Tracking

()

RESOLVED FIXED
mozilla56
Tracking Status
firefox56 --- fixed

People

(Reporter: emilio, Assigned: emilio)

References

(Blocks 1 open bug)

Details

Attachments

(5 files, 1 obsolete file)

See the numbers in bug 1368240 comment 35. When downloading the HTML spec and adding the following script:

<script>
document.onclick = function() {
  document.documentElement.offsetTop;
  var start = Date.now();
  document.documentElement.classList.toggle('bad');
  document.documentElement.offsetTop;
  alert(Date.now() - start);
}
</script>

Stylo takes roughly double than Gecko. See the profile linked there, there's just 1+ seconds of reflow that aren't there on stock Gecko.

The only rule .bad in the page is:

  .bad, .bad *:not(.X\58X) { color: gray; border-color: gray; background: transparent; }

That shouldn't make us reflow afaict.

I haven't debugged further.
See Also: → 1368240
See Also: → 1371975
So after looking a bit more closely, this isn't that we reflow the whole document... But we reflow quite a few extra bits. In particular, we reframe about every <li> and <table> on the page, and quite a few <span>s. Also, we reframe a few <pre>s due to ::first-line, but that's not what causes most of the time to be spent there.
Note that this is logging only the normal change hint stored on the element, so it can't be due to the bullet frame / anon box setup afaict.
Attached file testcase
The good part is that it repros with the following, which will probably make it way more easy to debug :).

Need to sleep now, though.
Attachment #8876453 - Attachment mime type: text/plain → text/html
So I see us queuing up a bunch of reflows targeted at <li> elements, from what we think is a style change on them.

The reason that happens is that in nsStyleList::CalcDifference we find a difference in mCounterStyle.

And the reason _that_ happens is that we're comparing resolved to unresolved counter styles!  mCounterStyle is resolved, the type is eCounterStyle, and its mStyle is NS_STYLE_LIST_STYLE_DISC.  But aNewData.mCounterStyle is unresolved (type is eUnresolvedAtom) and the atom is for the string "disc", which would end up resolving to precisely NS_STYLE_LIST_STYLE_DISC.

Looks like we don't resolve counter styles until nsStyleList::FinishStyle which happens after nsStyleList::CalcDifference?
We could make those counter style resolutions happen off a SequentialTask, if we need.  But I think we should be able to just compare unresolved to resolved counter styles, based on the name.  Xidorn?
Flags: needinfo?(xidorn+moz)
Well, I debugged a bit more. The <li> reflows are because the comparison we do in nsStyleList::CalcStyleDifference are shallow, and each time we restyle a <li> element Servo the value of mCounterStyle::mRaw seems to change.

Xidorn, any chance you could investigate why? I didn't follow the counter-style work closely.

Also, the way we convert from Servo's CounterStyleOrNone::Symbols(..) to Gecko's representation is terribly inefficient, but that's another matter.
Err, I should stop writing comments and sending "Submit my comment" when there are collisions...
I really really won't investigate the <table> and <span> bits for today... I promise I'm going to sleep :)
(In reply to Cameron McCormack (:heycam) from comment #5)
> We could make those counter style resolutions happen off a SequentialTask,
> if we need.  But I think we should be able to just compare unresolved to
> resolved counter styles, based on the name.  Xidorn?

Yeah, at least for the case that it matters (when they haven't actually changed) it seems doable to look it up.
(In reply to Emilio Cobos Álvarez [:emilio] from comment #1)
> So after looking a bit more closely, this isn't that we reflow the whole
> document... But we reflow quite a few extra bits. In particular, we reframe
> about every <li> and <table> on the page, and quite a few <span>s. Also, we
> reframe a few <pre>s due to ::first-line, but that's not what causes most of
> the time to be spent there.

The first "reframe" should be just "reflow" here, btw...
(In reply to Cameron McCormack (:heycam) from comment #5)
> We could make those counter style resolutions happen off a SequentialTask,
> if we need.  But I think we should be able to just compare unresolved to
> resolved counter styles, based on the name.  Xidorn?

There are cases that this would cause problems:

Firstly, if the counter style rule changes, we really need to reflow the affected elements. This probably doesn't happen a lot in real world (since we are the only implementation so far) but I bet we have test for that which I would prefer we don't regress.

Secondly, when the value of list-style-type refers to a non-existing counter style, it gets resolved to decimal, in which case we aren't really able to get the original specified name from the resolved counter style. When that happen, we would still end up reflowing the elements. This may not be very common either, but it could happen based on bug 1027647 and its duplications after we implemented that spec.

Based on these reasons, I would probably prefer we do the resolution before CalcStyleDifference rather than compaing unresolved to resolved counter styles. We always do that in Gecko when computing the value of list-style-type, and in most of time, it is just a single hashtable lookup on a mostly static table with a very limited number of result. So it shouldn't be an additional cost compare with Gecko, nor should it be expensive.

Another idea is that, we do the hashtable lookup in parallel traversal, and keep the value unresolved if the lookup fails. That way, in most of time we would only need to resolve values after our first restyle. And further restyling would simply just get the right pointer we want during parallel traversal.

There is another related optimization here that, we currently trigger reflow whenever list-style-type changes. However, if list-style-position is "outside", we really only need to update overflow (and do repaint). This is probably worth a different bug, though.

(In reply to Emilio Cobos Álvarez [:emilio] from comment #6)
> Also, the way we convert from Servo's CounterStyleOrNone::Symbols(..) to
> Gecko's representation is terribly inefficient, but that's another matter.

I think this is always the case for Gecko as well... And you would always have this problem when you have a list to convert, no?

One problem is that when we restyle the element, we get a new anonymous counter style, which triggers another reflow. We can probably make CalcDifference to compare the internal data of anonymous counter style rather than just the pointer. But at this moment I don't think it's worth going deeper on this stuff.
Flags: needinfo?(xidorn+moz)
I filed bug 1371976 on the counterstyle thing.

Just manually commenting out that check drops the reflow time by 1/3, but there's still a bunch of stuff happening.

For some <span>s I'm seeing a padding change (in nsStylePadding) from 0 for all paddings to either (288, 0, 115, 0) or (115, 0, 115, 0), with those numbers in app units.

Also some <span>s I'm seeing a filter (in nsStyleEffects) change, from none to NS_STYLE_FILTER_GRAYSCALE.

And also for some <table>s I'm seeing mUnicodeBidi change (in nsStyleTextReset), from NS_STYLE_UNICODE_BIDI_NORMAL to NS_STYLE_UNICODE_BIDI_ISOLATE, which is what triggers the reflow for the tables.

This last is confusing to me.  I checked, and I see 96 <table>s on this page.  All of them have "unicode-bidi: isolate"; it comes from a rule in our UA stylesheet (html.css sets that style on <table>).  So where is the "old" struct with mUnicodeBidi = 0 coming from?

OK, so I went back to look at <span> bits and the page does in fact have a bunch of "filter: grayscale(100%)" and "filter: grayscale(50%)" etc on some spans.  And there are rules for "padding: 0.2em 0" and "padding-top: 0.5em" and whatnot.  Those could be ending up as my "288" and "115" respectively.  Certinly 288/115 is quite close to 2.5, and I think would correspond to a font-size of 9.6px or so.

Something seems fishy here.
Attached file Test-case for the <span> thing. (obsolete) —
Curiously enough, this only happens if the span has "display: table", otherwise it doesn't happen...
Attachment #8876468 - Attachment is obsolete: true
We're doing also a bunch of unneeded visited stuff... I'm debugging why.
(In reply to Emilio Cobos Álvarez [:emilio] from comment #16)
> We're doing also a bunch of unneeded visited stuff... I'm debugging why.

This one is a false alarm, we're logging debug!("Cascade primary for {:?}, visited: {:?}", self, cascade_visited); before bailing out if there are no rules.
Ok, I know what's going on....
Assignee: nobody → emilio+bugs
Attachment #8876469 - Attachment mime type: text/plain → text/html
Nice catch!

Two comments:

1)  We should be able to write a test for this using the reflow counter stuff, right?

2)  Why are we doing this "get the style context" thing at all, given that we have the right ComputedValues in old_values anyway in compute_style_difference?  Seems like we could diff those directly...  Specifically, seems like we could make nsStyleContext::CalcStyleDifferenceInternal a static function that gets two arguments, templated on both their types.  Then we could directly diff two FakeStyleContexts based on the two ComputedValues, and not have to do this "grovel about for a style context" thing at all.  Is there a reason we're not doing that?
(In reply to Boris Zbarsky [:bz] (if a patch has no decent message, automatic r-) from comment #20)
> Nice catch!
> 
> Two comments:
> 
> 1)  We should be able to write a test for this using the reflow counter
> stuff, right?

Right, will do

> 2)  Why are we doing this "get the style context" thing at all, given that
> we have the right ComputedValues in old_values anyway in
> compute_style_difference?  Seems like we could diff those directly... 
> Specifically, seems like we could make
> nsStyleContext::CalcStyleDifferenceInternal a static function that gets two
> arguments, templated on both their types.  Then we could directly diff two
> FakeStyleContexts based on the two ComputedValues, and not have to do this
> "grovel about for a style context" thing at all.  Is there a reason we're
> not doing that?

So the main reason for it is to avoid appending change hints for stuff that doesn't get used. Before cam implemented the "stop the cascade" thing, we also avoided diffing unused structs. Now we diff all of them, so not sure how much perf benefit it yields in practice, I guess changing stuff that isn't used shouldn't be that common...

Probably we need to implement a ComputedValues vs. ComputedValues thing for :visited, ::-first-letter and ::-first-child, I guess, so probably once that's implemented we can see if the avoidance of posting change hints is worth it.
> we need to implement a ComputedValues vs. ComputedValues thing for :visited, ::-first-letter and ::-first-child, I guess

s/child/line/ I assume.  I'll add such a thing in bug 1372318 and then we'll see.
(In reply to Emilio Cobos Álvarez [:emilio] from comment #21)
> > 2)  Why are we doing this "get the style context" thing at all, given that
> > we have the right ComputedValues in old_values anyway in
> > compute_style_difference?  Seems like we could diff those directly... 
> > Specifically, seems like we could make
> > nsStyleContext::CalcStyleDifferenceInternal a static function that gets two
> > arguments, templated on both their types.  Then we could directly diff two
> > FakeStyleContexts based on the two ComputedValues, and not have to do this
> > "grovel about for a style context" thing at all.  Is there a reason we're
> > not doing that?

Let's follow up about this in bug 1372318.


(In reply to Xidorn Quan [:xidorn] UTC+10 from comment #11)
> Based on these reasons, I would probably prefer we do the resolution before
> CalcStyleDifference rather than compaing unresolved to resolved counter
> styles. We always do that in Gecko when computing the value of
> list-style-type, and in most of time, it is just a single hashtable lookup
> on a mostly static table with a very limited number of result. So it
> shouldn't be an additional cost compare with Gecko, nor should it be
> expensive.

This sounds good to me.
Comment on attachment 8876477 [details]
Bug 1371955: Diff against the correct style, not the table wrapper style.

https://reviewboard.mozilla.org/r/147830/#review152690
Attachment #8876477 - Flags: review?(cam) → review+
Comment on attachment 8876533 [details]
Bug 1371955: Test.

https://reviewboard.mozilla.org/r/147858/#review152692

Maybe add a sub-test for display:table on a ::before, too?

::: layout/style/test/test_restyle_table_wrapper.html:4
(Diff revision 1)
> +<!doctype html>
> +<meta charset="utf-8">
> +<title>
> +  Test for bug 1371955: We don't incorrectly thing that a table wrapper style

*think
Attachment #8876533 - Flags: review?(cam) → review+
Comment on attachment 8876534 [details]
Bug 1371955: Keep the test manifest sorted and clarify a comment in test_stylesheet_additions.html.

https://reviewboard.mozilla.org/r/147860/#review152698
Attachment #8876534 - Flags: review?(cam) → review+
Pushed by ecoal95@gmail.com:
https://hg.mozilla.org/integration/autoland/rev/8e206f279b44
Diff against the correct style, not the table wrapper style. r=heycam
https://hg.mozilla.org/integration/autoland/rev/92451ad1ada7
Test. r=heycam
https://hg.mozilla.org/integration/autoland/rev/d31cb2f2e020
Keep the test manifest sorted and clarify a comment in test_stylesheet_additions.html. r=heycam
Priority: -- → P1
You need to log in before you can comment on or make changes to this bug.