Closed
Bug 1357304
Opened 8 years ago
Closed 8 years ago
stylo: Store bloom filter hashes inline in the selector
Categories
(Core :: CSS Parsing and Computation, defect, P1)
Core
CSS Parsing and Computation
Tracking
()
RESOLVED
FIXED
People
(Reporter: bholley, Assigned: bholley)
References
(Blocks 1 open bug)
Details
Attachments
(3 files)
2.16 KB,
patch
|
emilio
:
review+
|
Details | Diff | Splinter Review |
42.35 KB,
patch
|
emilio
:
review+
|
Details | Diff | Splinter Review |
8.35 KB,
patch
|
emilio
:
review+
|
Details | Diff | Splinter Review |
I have patches for this. This improves stylo's sequential+uncached performance on [1] and [2] by about 4x, making stylo clock in at about 25% faster than Gecko when all is said and done.
Caching and parallelism will give additional wins here, but for the next few days I'm just focusing on raw selector matching performance. Emilio and I worked out some nice plans on Saturday, so there's more to come in that area.
[1] http://searchfox.org/mozilla-central/source/testing/talos/talos/tests/perf-reftest/bloom-basic.html
[2] http://searchfox.org/mozilla-central/source/testing/talos/talos/tests/perf-reftest/bloom-basic-ref.html
Assignee | ||
Comment 1•8 years ago
|
||
Assignee | ||
Comment 2•8 years ago
|
||
MozReview-Commit-ID: Lm25v2K21v5
Attachment #8859024 -
Flags: review?(emilio+bugs)
Assignee | ||
Comment 3•8 years ago
|
||
MozReview-Commit-ID: DxG6USsPIkh
Attachment #8859025 -
Flags: review?(emilio+bugs)
Assignee | ||
Comment 4•8 years ago
|
||
MozReview-Commit-ID: F07JkdduLaI
Attachment #8859026 -
Flags: review?(emilio+bugs)
Updated•8 years ago
|
Attachment #8859024 -
Flags: review?(emilio+bugs) → review+
Comment 5•8 years ago
|
||
Comment on attachment 8859025 [details] [diff] [review]
Part 2 - Introduce SelectorInner and use it for top-level matching. v1
Review of attachment 8859025 [details] [diff] [review]:
-----------------------------------------------------------------
::: servo/components/selectors/parser.rs
@@ +121,5 @@
> }
> }
>
> #[derive(PartialEq, Eq, Hash, Clone)]
> +pub struct SelectorInner<Impl: SelectorImpl> {
docs, please state the intention of why this type should exist, and all that :)
Attachment #8859025 -
Flags: review?(emilio+bugs) → review+
Comment 6•8 years ago
|
||
Comment on attachment 8859026 [details] [diff] [review]
Part 3 - Store bloom filter hashes inline. v1
Review of attachment 8859026 [details] [diff] [review]:
-----------------------------------------------------------------
Just curious, have you made any memory usage measurements?
::: servo/components/selectors/parser.rs
@@ +135,5 @@
> + let mut hashes = [0; NUM_ANCESTOR_HASHES];
> + {
> + // Compute ancestor hashes for the bloom filter. We precompute these
> + // and store them inline to optimize cache performance during selector
> + // matching. This matters a lot.
Put this in the member definition as a doc comment?
@@ +267,5 @@
> +
> +impl<'a, Impl: SelectorImpl> Iterator for AncestorIterator<'a, Impl> {
> + type Item = &'a Arc<ComplexSelector<Impl>>;
> + fn next(&mut self) -> Option<Self::Item> {
> + while self.curr.is_some() {
use |while let Some(selector) = self.cur.take()| instead.
Attachment #8859026 -
Flags: review?(emilio+bugs) → review+
Assignee | ||
Comment 7•8 years ago
|
||
(In reply to Emilio Cobos Álvarez [:emilio] from comment #6)
> Comment on attachment 8859026 [details] [diff] [review]
> Part 3 - Store bloom filter hashes inline. v1
>
> Review of attachment 8859026 [details] [diff] [review]:
> -----------------------------------------------------------------
>
> Just curious, have you made any memory usage measurements?
I have not. I think this is probably worth it in any case.
Assignee | ||
Comment 8•8 years ago
|
||
Comment 9•8 years ago
|
||
So the key here is to not have to pointer-chase down the selector chain to find the relevant hashes, plus not having to pointer-chase into the atoms and whatnot, right?
Assignee | ||
Comment 10•8 years ago
|
||
Correct - we don't have to examine the combinators at all.
It's worth noting that there wasn't all _that_ much pointer-chasing in the path we were hitting in this testcase. The selector in question was "span > div", which means that the steps we were doing before was:
* Deference the pointer on the Selector to get the ComplexSelector corresponding to |div| and everything to the left.
* Match on the combinator, and determine that the next selector corresponded to an ancestor.
* Dereference the |next| pointer to get the ComplexSelector corresponding to |span|.
* Dereference the |compound_selector| Vec<SimpleSelector> on the ComplexSelector to get the first SimpleSelector.
* Match on the SimpleSelector to determine it to be a LocalName.
* Dereference both the name and the lower_name.
* Check those two hashes against the bloom filter.
So ignoring the matches, we eliminated 5 dereferences, though the last two were always the same atom across all the selectors and thus likely in L1. So let's call it 3, which amounted to the difference between 1200ms and 270ms. Cache locality matters.
Comment 11•8 years ago
|
||
Right. So on that testcase, we end up doing about 50e6 bloom filter tests, I think. 930ms/50e6 is about 19ns if I'm counting my 0s right. I can totally believe that 5 derefs could take that long, unless they're all in L1...
We should remeasure bug 1348935 once this lands.
Blocks: 1348935
Assignee | ||
Comment 12•8 years ago
|
||
(In reply to Boris Zbarsky [:bz] (still a bit busy) (if a patch has no decent message, automatic r-) from comment #11)
> Right. So on that testcase, we end up doing about 50e6 bloom filter tests,
> I think. 930ms/50e6 is about 19ns if I'm counting my 0s right. I can
> totally believe that 5 derefs could take that long, unless they're all in
> L1...
>
> We should remeasure bug 1348935 once this lands.
I'm planning on doing more refactoring to improve cache locality in selector matching in general, so I'll remeasure that testcase once I've optimized the heck out of all my microbenchmarks. :-)
Assignee | ||
Updated•8 years ago
|
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
You need to log in
before you can comment on or make changes to this bug.
Description
•