stylo: store simple selectors and combinators inline

RESOLVED FIXED

Status

()

Core
CSS Parsing and Computation
P1
normal
RESOLVED FIXED
9 months ago
9 months ago

People

(Reporter: bholley, Assigned: bholley)

Tracking

(Blocks: 2 bugs)

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(4 attachments, 1 obsolete attachment)

(Assignee)

Description

9 months ago
The current representation is bad for caching. By storing combinators within SimpleSelector and representing the entire selector as a single contiguous array, we can eliminate several levels of cache misses.

This almost doubles the performance of single-threaded uncached stylo on the testcase in [1], putting it at almost 2x Gecko's performance.

[1] https://github.com/heycam/style-perf-tests/pull/1
(Assignee)

Comment 1

9 months ago
This may have a small negative impact on parsing performance, but I think it's worth it.
(Assignee)

Comment 3

9 months ago
Created attachment 8859819 [details] [diff] [review]
Store selectors and combinators inline in a single sequence. v1

This improves cache locality and reduces allocations during parsing.

MozReview-Commit-ID: EQtaQD822so
Attachment #8859819 - Flags: review?(emilio+bugs)
(Assignee)

Comment 4

9 months ago
CCing some people who might be interested to see Servo's style system getting even faster. All the testing here is done without parallelism and without the style sharing cache, so the rest is just gravy. :-)
Looks good at a glance (not a detailed review), just one nit-pick, sorry :)

The SimpleSelector and ComplexSelector types originally matched spec concepts with the same names, but they don’t anymore. So I think we should not use these names. I suggest removing ComplexSelector and use ArcSlice<…> directly instead, and renaming SimpleSelector to something like SelectorComponent or Component. (If disambiguation is needed it could be used as selectors::Component.)
(Assignee)

Comment 6

9 months ago
Doesn't ComplexSelector store the same information now as before, just in a different format? I always thought that 'compound' was the sequence of simple selectors and complex was the sequence of sequences with combinators.
Comment on attachment 8859819 [details] [diff] [review]
Store selectors and combinators inline in a single sequence. v1

Review of attachment 8859819 [details] [diff] [review]:
-----------------------------------------------------------------

::: servo/components/selectors/parser.rs
@@ +268,5 @@
>  
> +/// A ComplexSelectors stores a sequence of simple selectors and combinators. The
> +/// iterator classes allow callers to iterate at either the raw sequence level or
> +/// at the level of sequences of simple selectors separated by combinators. Most
> +/// callers want the higher-level iterator.

It would be nice to state the order the selectors are stored in (from right to left).

But actually, is there a particular reason it needs to be that way? Seems like using iterators in the reverse order in iter() and iter_raw() should not be a big deal performance-wise, and would avoid the reversing of the slice while parsing.

@@ +402,5 @@
> +impl Combinator {
> +    /// Returns true if this combinator is a child or descendant combinator.
> +    pub fn is_ancestor(&self) -> bool {
> +        matches!(*self, Combinator::Child | Combinator::Descendant)
> +    }

Let's add is_sibling too and replace the `matches!` in the other places, for symmetry?

@@ +406,5 @@
> +    }
> +}
> +
> +/// A CSS simple selector or combinator. We store both in the same enum for
> +/// optimal packing and cache performance see [1].

missing a comma before "see".

@@ +417,3 @@
>  #[derive(Eq, PartialEq, Clone, Hash)]
>  pub enum SimpleSelector<Impl: SelectorImpl> {
> +    Combinator(Combinator),

The "problem" with this vs making an ArcSlice<(SimpleSelector, Option<Combinator>)>, is that now a combinator takes so much space as the biggest of the selectors, which is at least three words.

I don't think it's a huge deal (it's likely that we waste more memory just using different allocations as we were doing now, and that the extra dummy combinator is a waste when there aren't any), but can we add tests in the servo part of this so the selector size doesn't go over the roof?

@@ +436,5 @@
>      AttrSubstringNeverMatch(AttrSelector<Impl>, Impl::AttrValue),  // empty value
>      AttrSuffixNeverMatch(AttrSelector<Impl>, Impl::AttrValue),  // empty value
>  
>      // Pseudo-classes
> +    Negation(Vec<ComplexSelector<Impl>>),

While you're here, we could make this Box<[ComplexSelector<Impl>]> to save a word here and in :-moz-any. Not necessary to do it here, but perhaps worth filing a followup bug?

@@ +581,5 @@
>  
>  impl<Impl: SelectorImpl> ToCss for ComplexSelector<Impl> {
>      fn to_css<W>(&self, dest: &mut W) -> fmt::Result where W: fmt::Write {
> +       for item in self.iter_raw().rev() {
> +           item.to_css(dest)?;

This is a cute simplification, actually :)

@@ +1172,5 @@
>          }
>      }
>      let mut empty = true;
> +    if !parse_type_selector(parser, input, &mut sequence)? {
> +        match parser.default_namespace() {

nit: This can use if let now.
Attachment #8859819 - Flags: review?(emilio+bugs) → review+
(Assignee)

Comment 8

9 months ago
(In reply to Emilio Cobos Álvarez [:emilio] from comment #7)
> Comment on attachment 8859819 [details] [diff] [review]
> Store selectors and combinators inline in a single sequence. v1
> 
> Review of attachment 8859819 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: servo/components/selectors/parser.rs
> @@ +268,5 @@
> >  
> > +/// A ComplexSelectors stores a sequence of simple selectors and combinators. The
> > +/// iterator classes allow callers to iterate at either the raw sequence level or
> > +/// at the level of sequences of simple selectors separated by combinators. Most
> > +/// callers want the higher-level iterator.
> 
> It would be nice to state the order the selectors are stored in (from right
> to left).
> 
> But actually, is there a particular reason it needs to be that way? Seems
> like using iterators in the reverse order in iter() and iter_raw() should
> not be a big deal performance-wise, and would avoid the reversing of the
> slice while parsing.

Nice idea! I've gone ahead and done that. It seems to improve parsing performance slightly (though it's hard to tell), and has no negative impact on matching performance.

> 
> @@ +402,5 @@
> > +impl Combinator {
> > +    /// Returns true if this combinator is a child or descendant combinator.
> > +    pub fn is_ancestor(&self) -> bool {
> > +        matches!(*self, Combinator::Child | Combinator::Descendant)
> > +    }
> 
> Let's add is_sibling too and replace the `matches!` in the other places, for
> symmetry?

Done.

> 
> @@ +406,5 @@
> > +    }
> > +}
> > +
> > +/// A CSS simple selector or combinator. We store both in the same enum for
> > +/// optimal packing and cache performance see [1].
> 
> missing a comma before "see".

Fixed.

> 
> @@ +417,3 @@
> >  #[derive(Eq, PartialEq, Clone, Hash)]
> >  pub enum SimpleSelector<Impl: SelectorImpl> {
> > +    Combinator(Combinator),
> 
> The "problem" with this vs making an ArcSlice<(SimpleSelector,
> Option<Combinator>)>, is that now a combinator takes so much space as the
> biggest of the selectors, which is at least three words.
> 
> I don't think it's a huge deal (it's likely that we waste more memory just
> using different allocations as we were doing now, and that the extra dummy
> combinator is a waste when there aren't any), but can we add tests in the
> servo part of this so the selector size doesn't go over the roof?

Well, it's a tradeoff as you note, and it depends entirely on the makeup of the selector. In the situation where we have a very long chain of single-SimpleSelectors separated by combinators, the tuple will be more memory-efficient. But in situations where the chain is short (and the wasted last combinator has a larger share of the total space) or when there are several simple selectors in each compound selector (making the internal wasted combinators more significant), the current approach will be more efficient.

Given that it depends so much on the specific selector, I don't think the usual size_of tests will be useful here. What will be useful is overall memory reporting for stylo, which njn is working on. So I'm going to defer to that.

> 
> @@ +436,5 @@
> >      AttrSubstringNeverMatch(AttrSelector<Impl>, Impl::AttrValue),  // empty value
> >      AttrSuffixNeverMatch(AttrSelector<Impl>, Impl::AttrValue),  // empty value
> >  
> >      // Pseudo-classes
> > +    Negation(Vec<ComplexSelector<Impl>>),
> 
> While you're here, we could make this Box<[ComplexSelector<Impl>]> to save a
> word here and in :-moz-any. Not necessary to do it here, but perhaps worth
> filing a followup bug?

I just went ahead and wrote a patch to do it, stamped r=you.

> 
> @@ +581,5 @@
> >  
> >  impl<Impl: SelectorImpl> ToCss for ComplexSelector<Impl> {
> >      fn to_css<W>(&self, dest: &mut W) -> fmt::Result where W: fmt::Write {
> > +       for item in self.iter_raw().rev() {
> > +           item.to_css(dest)?;
> 
> This is a cute simplification, actually :)

:-)

> 
> @@ +1172,5 @@
> >          }
> >      }
> >      let mut empty = true;
> > +    if !parse_type_selector(parser, input, &mut sequence)? {
> > +        match parser.default_namespace() {
> 
> nit: This can use if let now.

Fixed.
(Assignee)

Comment 9

9 months ago
Created attachment 8860152 [details] [diff] [review]
Store selectors and combinators inline in a single sequence. r=emilio

This improves cache locality and reduces allocations during parsing.
Attachment #8860152 - Flags: review+
(Assignee)

Updated

9 months ago
Attachment #8859819 - Attachment is obsolete: true
(Assignee)

Comment 10

9 months ago
Created attachment 8860153 [details] [diff] [review]
Part 2 - Use Boxed slices for Empty and MozAny. r=emilio

MozReview-Commit-ID: CmVK3u0vYn0
Attachment #8860153 - Flags: review+
(Assignee)

Comment 11

9 months ago
Created attachment 8860154 [details] [diff] [review]
Part 3 - Rename SimpleSelector to Component. r=SimonSapin

MozReview-Commit-ID: JfaZpHSkG8h
Attachment #8860154 - Flags: review+
(Assignee)

Comment 12

9 months ago
Created attachment 8860155 [details] [diff] [review]
Part 4 - Use a SmallVec during selector parsing. v1

We wanted to do this anyway, and this avoids needing to remove the smallvec
dependency from Selectors and then add it back again.

MozReview-Commit-ID: HW4kDxzLACp
Attachment #8860155 - Flags: review?(emilio+bugs)
(Assignee)

Comment 13

9 months ago
I've done the SimpleSelector rename. Holding off on ComplexSelector until comment 6 is answered, we can deal with it as a followup if need be.

I'm bundling Part 4 in with this patch to avoid the churn of removing and re-adding the smallvec dep. Given that it's a small change, I'm going to start the landing process, but flagging emilio for retroactive stamp.
Comment on attachment 8860155 [details] [diff] [review]
Part 4 - Use a SmallVec during selector parsing. v1

Review of attachment 8860155 [details] [diff] [review]:
-----------------------------------------------------------------

Looks fine, thanks!
Attachment #8860155 - Flags: review?(emilio+bugs) → review+
(In reply to Bobby Holley (:bholley) (busy with Stylo) from comment #6)
> Doesn't ComplexSelector store the same information now as before, just in a
> different format? I always thought that 'compound' was the sequence of
> simple selectors and complex was the sequence of sequences with combinators.

Looking at https://drafts.csswg.org/selectors/#structure , you’re right. The name ComplexSelector is fine, I’ll leave it up to you to decide whether a wrapper type is useful v.s. using ArcSlice<…> directly.
(Assignee)

Comment 17

9 months ago
(In reply to Simon Sapin (:SimonSapin) from comment #15)
> (In reply to Bobby Holley (:bholley) (busy with Stylo) from comment #6)
> > Doesn't ComplexSelector store the same information now as before, just in a
> > different format? I always thought that 'compound' was the sequence of
> > simple selectors and complex was the sequence of sequences with combinators.
> 
> Looking at https://drafts.csswg.org/selectors/#structure , you’re right. The
> name ComplexSelector is fine, I’ll leave it up to you to decide whether a
> wrapper type is useful v.s. using ArcSlice<…> directly.

I think the wrapper type is helpful because it gives us a place to impl all the iter, iter_raw, iter_raw_rev methods. So I'll keep as-is. :-)
(Assignee)

Comment 18

9 months ago
This just hit autoland.
Status: NEW → RESOLVED
Last Resolved: 9 months ago
Priority: -- → P1
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.