Poor style calculation performance for attribute selectors compared to class selectors
Categories
(Core :: CSS Parsing and Computation, defect)
Tracking
()
Tracking | Status | |
---|---|---|
firefox89 | --- | fixed |
People
(Reporter: nlawson, Assigned: emilio)
References
()
Details
Attachments
(3 files)
User Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.114 Safari/537.36
Steps to reproduce:
- Go to http://bl.ocks.org/nolanlawson/raw/40ad8bfdf8e50705a5ab35599bccf826/?mode=attributes&rules=1000
- Click "Force style recalc"
Actual results:
The duration of style/layout/render as measured by the benchmark is ~190ms on my MacBook Pro running Firefox 87.0.
Expected results:
In Safari 14.0.3, the same benchmark measures ~15ms on the same machine.
Below is a description of the benchmark:
This page is a small benchmark that demonstrates that attribute selectors are much slower than class selectors in terms of the style/layout/render step, when using the following patterns:
Attributes:
[data-foo] div { }
Classes:
.foo div { }
To represent a non-trivial page, the benchmark constructs a DOM tree with ~3,000 div elements. It also adds 1,000 (non-matching) CSS rules with the given pattern to a <style> tag.
In the upper-right-hand corner, the page reports the number of milliseconds it took to run the style/layout/render process (measured using rAF+postMessage). Clicking "Force style recalc" also forces a recalc and re-measures.
The benchmark allows you to vary the number of rules, and also to use class selectors rather than attribute selectors. E.g. here is 100 rules using class selectors: http://bl.ocks.org/nolanlawson/raw/40ad8bfdf8e50705a5ab35599bccf826/?mode=classes&rules=100
This benchmark demonstrates that Firefox has a steep performance cliff for attribute selectors, but not class selectors:
Attributes:
- 100 rules: ~25ms
- 1000 rules: ~190ms
- 10000 rules: ~1800ms
Classes:
- 100 rules: ~5ms
- 1000 rules: ~7ms
- 10000 rules: ~7ms
Whereas Safari has much less of a performance cliff for attribute selectors:
- 100 rules: ~10ms
- 1000 rules: ~20ms
- 10000 rules: ~72ms
Note that in the above, I'm measuring the performance when clicking "Force style recalc." The initial style/layout/render costs may be different.
Also note that I filed the same bug on Chrome, as their performance is worse than Firefox's: https://bugs.chromium.org/p/chromium/issues/detail?id=1196474 .
Attached is an index.html file containing the same benchmark, but note that the query params matter when loading the page.
Assignee | ||
Comment 1•3 years ago
|
||
So this is for the same reason as the Chrome bug: We don't add attribute names to the bloom filter we use to fast-reject selectors. I guess if sites depend on this, then we should do this, but it seems a bit unfortunate to do more work for all elements.
Also, we should probably still keep prioritizing classes / ids over attribute names when collecting hashes, or something.
Assignee | ||
Comment 2•3 years ago
|
||
Safari does this. This reduces the runtime in the example linked from
comment 0 quite a lot (40ms on a local opt build, from ~130ms on a
release nightly build).
I added a pref because there's a slight chance of performance
regressions on pages that do not use attribute selectors, as we're now
doing more unconditional work per element (adding the attributes to the
bloom filter). But the trade-off should be worth it, I think.
Updated•3 years ago
|
Assignee | ||
Comment 3•3 years ago
|
||
On my local build (no LTO / PGO / etc) the time goes down from 220ms to 40ms, so the time should be quite a bit lower on a release build :)
Reporter | ||
Comment 4•3 years ago
|
||
Thank you very much for looking into this! It will be interesting to see if there are tradeoffs, and if the WebKit folks found any problems with that.
For the record, we ran into this at Salesforce and were considering switching from attribute selectors to class selectors due to the performance issues in Firefox and Chrome. But if Firefox and Chrome had equivalent performance to Safari (and it looks like your patch does this!), then that would save us a ton of time rewriting our code. So again, thanks for your help! :)
Assignee | ||
Comment 5•3 years ago
|
||
I need to use these from Rust to insert mapped attribute names in the
bloom filter, and this would save me handrolling even more bindings :)
Pushed by ealvarez@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/590b23237958 Make nsMappedAttributes more similar to AttrArray. r=smaug https://hg.mozilla.org/integration/autoland/rev/3ddc924e8fa5 Add attribute names to the bloom filter. r=boris
Comment 7•3 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/590b23237958
https://hg.mozilla.org/mozilla-central/rev/3ddc924e8fa5
Reporter | ||
Comment 8•3 years ago
|
||
It's amazing to see such a fast turnaround time for this bug! Firefox engineers always amaze me. Thank you! :)
Looking at the implementation, though, it seems this will only apply to attribute names, not values, correct? E.g. it would apply to:
[data-foo] div { }
but not:
[data-foo="bar"] div { }
My benchmark can also test attribute values (e.g. this url), and it seems that attribute values show a high perf cost for all three browser engines (times are in ms):
Number of CSS rules | Chrome 89 | Firefox 87 | Safari 14 |
---|---|---|---|
100 | 117 | 29 | 26 |
1000 | 1002 | 209 | 364 |
10000 | 9899 | 2243 | 3478 |
(Note Firefox actually wins at 1,000 and 10,000 rules. :) )
Is there a large perf tradeoff with adding attribute values to the Bloom filter as well?
Assignee | ||
Comment 9•3 years ago
|
||
(In reply to Nolan Lawson from comment #8)
(Note Firefox actually wins at 1,000 and 10,000 rules. :) )
Yeah we have parallel style recalc and that really shows for big recalcs :)
Is there a large perf tradeoff with adding attribute values to the Bloom filter as well?
Yeah attribute values are probably not worth it, because there's a lot of attributes that aren't stored as interned strings (as attribute values are not always commonly repeated), and thus we don't have the precomputed hash around and would need to spend time hashing stuff for every recalc. I could look in whether it's feasible to intern all attr values, but I suspect the answer will be no. Also attr values can be pretty huge compared to attribute names.
Finally, the bloom filter optimization could only help with exact matches (so [foo=bar]
kind of selectors), and only when matched case-sensitively. So off hand it seems to me it's probably not going to be worth it.
Perhaps with some sort of artificial string limit on the maximum attribute length we intern or put in the filter? I'm on my phone now but I'll try to file a bug to investigate tomorrow... Seems a tad more complex though.
Assignee | ||
Comment 10•3 years ago
|
||
Note that we store the attribute name even for [data-foo="bar"] kind of selectors btw, so this bug also helps to avoid work on elements without data-foo ancestors. But yeah avoiding the matching when you have different attribute values is more complex as per the above.
Reporter | ||
Comment 11•3 years ago
|
||
Interesting, thanks for the response! I hadn't realized that this patch would affect [data-foo="bar"]
selectors as well, at least insofar as it can check for the presence or absence of data-foo
in the ancestor chain. So this patch should also improve my attribute values benchmark (causing Firefox to pull even further ahead 🙂).
It also seems like this patch would be a big enough perf win on [data-foo="bar"]
selectors in most cases – the exception would be cases where the data-foo
attribute is applied widely throughout the DOM while varying the attribute value… but this doesn't seem like a common use case.
Would this patch also have an impact on attribute value selectors like [data-foo^="bar"]
, [data-foo*="bar"]
, and [data-foo$="bar"]
? If it improves [data-foo="bar"]
by fast-pathing the check of whether the ancestor chain includes data-foo
, it seems like it should improve these as well.
Assignee | ||
Comment 12•3 years ago
|
||
Yes, it should, but in the "attribute values" mode of your benchmark, effectively ~all nodes have ancestors with data-color
attributes so perf should be unchanged, afaict.
Reporter | ||
Comment 13•3 years ago
|
||
My mistake, yes, you're right; that's how I wrote the benchmark. A more realistic version would not look like that.
Reporter | ||
Comment 14•3 years ago
|
||
Yep, you're exactly right. In a more realistic benchmark (IMO) where most DOM nodes don't have an ancestor with data-foo
, the results for [data-foo="bar"]
look similar to the results for [data-foo]
:
Number of CSS rules | Chrome 89 | Firefox 87 | Safari 14 |
---|---|---|---|
100 | 116 | 29 | 11 |
1000 | 1020 | 205 | 17 |
10000 | 9655 | 2210 | 59 |
I imagine Firefox with the patch would look more like Safari in this benchmark. 🙂
In that case, this patch is great for our use case (and probably most websites using this pattern, I imagine). Thanks again for your diligence and hard work!
Assignee | ||
Comment 15•3 years ago
|
||
Yeah, I get 7 / 20 / 128 on Nightly on this machine, with Safari scoring 14 / 20 / 75.
Reporter | ||
Comment 16•3 years ago
|
||
Awesome, thank you!
Description
•