Closed Bug 1375323 Opened 7 years ago Closed 7 years ago

stylo: various bloom filter optimizations

Categories

(Core :: CSS Parsing and Computation, enhancement)

enhancement
Not set
normal

Tracking

()

RESOLVED FIXED

People

(Reporter: bholley, Assigned: bholley)

References

(Blocks 1 open bug)

Details

Attachments

(5 files)

There are various dumb things we're doing with the bloom filter right now. The worst is that we recompute the hashes for popping rather than just storing them.

I've got some patches.
This is important because we're about to start storing a parallel list
of pushed hashes, and the current behavior here will cause mismatches
there.

We take the opportunity to bump the SmallVec size to 16, since each
entry is only a word and we really want to avoid heap-allocating. And
then we switch to drain(), because of
https://github.com/rust-lang/rust/issues/42763
Attachment #8880203 - Flags: review?(emilio+bugs)
The current code is mostly bechmarking Rust's default hash routines for
integers, which is not interesting to us. We add generating function
that jumps around "enough" and also add benchmarks for clear().

It's hard to read too much into the numbers because we can't reliably
simulate L1/L2 cache behavior with cargo bench, but the results show
about 40ns for clear() about 2ns for insert, and about 1.5ns for
contains/remove. This informs our thinking in the next patch.
Attachment #8880206 - Flags: review?(emilio+bugs)
Attachment #8880203 - Flags: review?(emilio+bugs) → review+
Comment on attachment 8880204 [details] [diff] [review]
Part 2 - Switch the bloom element list to a SmallVec and track pushed hashes. v1

Review of attachment 8880204 [details] [diff] [review]:
-----------------------------------------------------------------

::: servo/components/style/bloom.rs
@@ +47,5 @@
>      filter: Box<BloomFilter>,
>  
> +    /// The stack of elements that this bloom filter contains, along with the
> +    /// number of hashes pushed for each element.
> +    elements: SmallVec<[(SendElement<E>, usize); 16]>,

Please make this tuple a struct with named fields instead, I'd expect it to be clearer when using it below, and the named fields probably don't add much noise.

@@ +50,5 @@
> +    /// number of hashes pushed for each element.
> +    elements: SmallVec<[(SendElement<E>, usize); 16]>,
> +
> +    /// Stack of hashes that have been pushed onto this filter.
> +    pushed_hashes: SmallVec<[u32; 64]>,

nit: Ideally combining both stacks in a single type would be neat, but probably it's just overkill.

@@ +127,5 @@
>          }
>  
> +        for _ in 0..num_hashes {
> +            let hash = self.pushed_hashes.pop().unwrap();
> +            debug_assert!(expected_hashes.pop().unwrap() == hash);

nit: no big deal, but debug_assert_eq!.
Attachment #8880204 - Flags: review?(emilio+bugs) → review+
Comment on attachment 8880205 [details] [diff] [review]
Part 3 - Make selectors benchmark tests work again. v1

Review of attachment 8880205 [details] [diff] [review]:
-----------------------------------------------------------------

::: servo/components/selectors/lib.rs
@@ +2,5 @@
>   * License, v. 2.0. If a copy of the MPL was not distributed with this
>   * file, You can obtain one at http://mozilla.org/MPL/2.0/. */
>  
> +// Make |cargo bench| work.
> +#![cfg_attr(test, feature(test))]

We run these tests on stable on Servo CI I think, so you still need to keep the feature-gate, and turn them on there instead only on nightly.
Attachment #8880205 - Flags: review?(emilio+bugs) → review+
Comment on attachment 8880206 [details] [diff] [review]
Part 4 - Improve bloom microbenchmarks. v1

Review of attachment 8880206 [details] [diff] [review]:
-----------------------------------------------------------------

::: servo/components/selectors/bloom.rs
@@ +234,5 @@
> +    struct HashGenerator(u32);
> +
> +    impl HashGenerator {
> +        fn next(&mut self) -> u32 {
> +            self.0 += (65) + (65 << super::KEY_SIZE); // Jump to the next cache line.

If you could clarify a bit more, specifying that we generate a hash that it's distributed in a way so that it causes (L1, I guess?) cache misses, it'd be nice (I was wondering how did you intend an addition to cause cache misses when I first read it :P).
Attachment #8880206 - Flags: review?(emilio+bugs) → review+
Comment on attachment 8880207 [details] [diff] [review]
Part 5 - Be smarter when clearing the bloom filter. v1

Review of attachment 8880207 [details] [diff] [review]:
-----------------------------------------------------------------

::: servo/components/selectors/bloom.rs
@@ +114,5 @@
> +    pub fn is_zeroed(&self) -> bool {
> +        self.counters.iter().all(|x| *x == 0)
> +    }
> +
> +    #[cfg(not(debug_assertions))]

I think this is an anti-pattern... But not so much as compiling it out, I guess... Maybe just keep the function also in release builds but name it something scary, like slow_is_zeroed? I don't have a strong preference. Maybe check with Simon.

::: servo/components/style/bloom.rs
@@ +155,5 @@
> +        // costing about 25 times more than remove_hash().
> +        //
> +        // One subtly to note is that remove_hash() will not touch the value
> +        // if the filter overflowed. However, overflow can only occur if we
> +        // get 255 collisions on the same hash value, and 25 < 255.

Maybe make 25 a named constant, and move this comment into a doc comment for that constant?
Attachment #8880207 - Flags: review?(emilio+bugs) → review+
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: