Closed Bug 1704551 Opened 2 months ago Closed 2 months ago

Poor style calculation performance for attribute selectors compared to class selectors

Categories

(Core :: CSS Parsing and Computation, defect)

Firefox 87
defect

Tracking

()

RESOLVED FIXED
89 Branch
Tracking Status
firefox89 --- fixed

People

(Reporter: nlawson, Assigned: emilio)

References

()

Details

Attachments

(3 files)

Attached file index.html

User Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.114 Safari/537.36

Steps to reproduce:

  1. Go to http://bl.ocks.org/nolanlawson/raw/40ad8bfdf8e50705a5ab35599bccf826/?mode=attributes&rules=1000
  2. Click "Force style recalc"

Actual results:

The duration of style/layout/render as measured by the benchmark is ~190ms on my MacBook Pro running Firefox 87.0.

Expected results:

In Safari 14.0.3, the same benchmark measures ~15ms on the same machine.

Below is a description of the benchmark:

This page is a small benchmark that demonstrates that attribute selectors are much slower than class selectors in terms of the style/layout/render step, when using the following patterns:

Attributes:

[data-foo] div { }

Classes:

.foo div { }

To represent a non-trivial page, the benchmark constructs a DOM tree with ~3,000 div elements. It also adds 1,000 (non-matching) CSS rules with the given pattern to a <style> tag.

In the upper-right-hand corner, the page reports the number of milliseconds it took to run the style/layout/render process (measured using rAF+postMessage). Clicking "Force style recalc" also forces a recalc and re-measures.

The benchmark allows you to vary the number of rules, and also to use class selectors rather than attribute selectors. E.g. here is 100 rules using class selectors: http://bl.ocks.org/nolanlawson/raw/40ad8bfdf8e50705a5ab35599bccf826/?mode=classes&rules=100

This benchmark demonstrates that Firefox has a steep performance cliff for attribute selectors, but not class selectors:

Attributes:

  • 100 rules: ~25ms
  • 1000 rules: ~190ms
  • 10000 rules: ~1800ms

Classes:

  • 100 rules: ~5ms
  • 1000 rules: ~7ms
  • 10000 rules: ~7ms

Whereas Safari has much less of a performance cliff for attribute selectors:

  • 100 rules: ~10ms
  • 1000 rules: ~20ms
  • 10000 rules: ~72ms

Note that in the above, I'm measuring the performance when clicking "Force style recalc." The initial style/layout/render costs may be different.

Also note that I filed the same bug on Chrome, as their performance is worse than Firefox's: https://bugs.chromium.org/p/chromium/issues/detail?id=1196474 .

Attached is an index.html file containing the same benchmark, but note that the query params matter when loading the page.

So this is for the same reason as the Chrome bug: We don't add attribute names to the bloom filter we use to fast-reject selectors. I guess if sites depend on this, then we should do this, but it seems a bit unfortunate to do more work for all elements.

Also, we should probably still keep prioritizing classes / ids over attribute names when collecting hashes, or something.

Status: UNCONFIRMED → NEW
Component: Untriaged → CSS Parsing and Computation
Ever confirmed: true
Product: Firefox → Core

Safari does this. This reduces the runtime in the example linked from
comment 0 quite a lot (40ms on a local opt build, from ~130ms on a
release nightly build).

I added a pref because there's a slight chance of performance
regressions on pages that do not use attribute selectors, as we're now
doing more unconditional work per element (adding the attributes to the
bloom filter). But the trade-off should be worth it, I think.

Assignee: nobody → emilio
Status: NEW → ASSIGNED

On my local build (no LTO / PGO / etc) the time goes down from 220ms to 40ms, so the time should be quite a bit lower on a release build :)

Thank you very much for looking into this! It will be interesting to see if there are tradeoffs, and if the WebKit folks found any problems with that.

For the record, we ran into this at Salesforce and were considering switching from attribute selectors to class selectors due to the performance issues in Firefox and Chrome. But if Firefox and Chrome had equivalent performance to Safari (and it looks like your patch does this!), then that would save us a ton of time rewriting our code. So again, thanks for your help! :)

I need to use these from Rust to insert mapped attribute names in the
bloom filter, and this would save me handrolling even more bindings :)

Pushed by ealvarez@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/590b23237958
Make nsMappedAttributes more similar to AttrArray. r=smaug
https://hg.mozilla.org/integration/autoland/rev/3ddc924e8fa5
Add attribute names to the bloom filter. r=boris
Status: ASSIGNED → RESOLVED
Closed: 2 months ago
Resolution: --- → FIXED
Target Milestone: --- → 89 Branch

It's amazing to see such a fast turnaround time for this bug! Firefox engineers always amaze me. Thank you! :)

Looking at the implementation, though, it seems this will only apply to attribute names, not values, correct? E.g. it would apply to:

[data-foo] div { }

but not:

[data-foo="bar"] div { }

My benchmark can also test attribute values (e.g. this url), and it seems that attribute values show a high perf cost for all three browser engines (times are in ms):

Number of CSS rules Chrome 89 Firefox 87 Safari 14
100 117 29 26
1000 1002 209 364
10000 9899 2243 3478

(Note Firefox actually wins at 1,000 and 10,000 rules. :) )

Is there a large perf tradeoff with adding attribute values to the Bloom filter as well?

(In reply to Nolan Lawson from comment #8)

(Note Firefox actually wins at 1,000 and 10,000 rules. :) )

Yeah we have parallel style recalc and that really shows for big recalcs :)

Is there a large perf tradeoff with adding attribute values to the Bloom filter as well?

Yeah attribute values are probably not worth it, because there's a lot of attributes that aren't stored as interned strings (as attribute values are not always commonly repeated), and thus we don't have the precomputed hash around and would need to spend time hashing stuff for every recalc. I could look in whether it's feasible to intern all attr values, but I suspect the answer will be no. Also attr values can be pretty huge compared to attribute names.

Finally, the bloom filter optimization could only help with exact matches (so [foo=bar] kind of selectors), and only when matched case-sensitively. So off hand it seems to me it's probably not going to be worth it.

Perhaps with some sort of artificial string limit on the maximum attribute length we intern or put in the filter? I'm on my phone now but I'll try to file a bug to investigate tomorrow... Seems a tad more complex though.

Note that we store the attribute name even for [data-foo="bar"] kind of selectors btw, so this bug also helps to avoid work on elements without data-foo ancestors. But yeah avoiding the matching when you have different attribute values is more complex as per the above.

Interesting, thanks for the response! I hadn't realized that this patch would affect [data-foo="bar"] selectors as well, at least insofar as it can check for the presence or absence of data-foo in the ancestor chain. So this patch should also improve my attribute values benchmark (causing Firefox to pull even further ahead 🙂).

It also seems like this patch would be a big enough perf win on [data-foo="bar"] selectors in most cases – the exception would be cases where the data-foo attribute is applied widely throughout the DOM while varying the attribute value… but this doesn't seem like a common use case.

Would this patch also have an impact on attribute value selectors like [data-foo^="bar"], [data-foo*="bar"], and [data-foo$="bar"]? If it improves [data-foo="bar"] by fast-pathing the check of whether the ancestor chain includes data-foo, it seems like it should improve these as well.

Yes, it should, but in the "attribute values" mode of your benchmark, effectively ~all nodes have ancestors with data-color attributes so perf should be unchanged, afaict.

My mistake, yes, you're right; that's how I wrote the benchmark. A more realistic version would not look like that.

Yep, you're exactly right. In a more realistic benchmark (IMO) where most DOM nodes don't have an ancestor with data-foo, the results for [data-foo="bar"] look similar to the results for [data-foo]:

Number of CSS rules Chrome 89 Firefox 87 Safari 14
100 116 29 11
1000 1020 205 17
10000 9655 2210 59

I imagine Firefox with the patch would look more like Safari in this benchmark. 🙂

In that case, this patch is great for our use case (and probably most websites using this pattern, I imagine). Thanks again for your diligence and hard work!

Yeah, I get 7 / 20 / 128 on Nightly on this machine, with Safari scoring 14 / 20 / 75.

Awesome, thank you!

You need to log in before you can comment on or make changes to this bug.