Closed Bug 1371949 Opened 7 years ago Closed 7 years ago

stylo: Pack bloom filter hashes better and save a word on Rule

Categories

(Core :: CSS Parsing and Computation, enhancement, P1)

enhancement

Tracking

()

RESOLVED FIXED

People

(Reporter: bholley, Assigned: bholley)

References

(Blocks 1 open bug)

Details

Attachments

(4 files)

Julian had this great observation in bug 1369066 comment 15. We should be able to save another word on Rule here.
_this_ finally closes the gap (and then some) between sequential stylo and Chrome, on both configurations of bloom-basic that I've been testing. Both register around 160ms, and parallel stylo is now in the mid-thirties. :-)
Currently all Gecko and Servo do the KEY_SHIFT thing, but I think they're all
just cribbing from webkit, which just shifts by bareword 16 without explanation.

MozReview-Commit-ID: CqNi7r9e5s0
Attachment #8876440 - Flags: review?(emilio+bugs)
MozReview-Commit-ID: KbtKQzLmwVO
Attachment #8876441 - Flags: review?(emilio+bugs)
MozReview-Commit-ID: AKNTZZqke1O
Attachment #8876442 - Flags: review?(emilio+bugs)
MozReview-Commit-ID: 7B1i1g0HLTj
Attachment #8876443 - Flags: review?(simon.sapin)
Attachment #8876443 - Flags: review?(emilio+bugs)
Comment on attachment 8876440 [details] [diff] [review]
Part 1 - Shift by KEY_SIZE instead of something larger. v1

Review of attachment 8876440 [details] [diff] [review]:
-----------------------------------------------------------------

::: servo/components/selectors/bloom.rs
@@ +15,2 @@
>  const ARRAY_SIZE: usize = 1 << KEY_SIZE;
>  const KEY_MASK: u32 = (1 << KEY_SIZE) - 1;

This may impact the distribution of the hashes I guess... But assuming that hasn't made a huge impact in the amount of selectors we reject, ok.
Attachment #8876440 - Flags: review?(emilio+bugs) → review+
Attachment #8876441 - Flags: review?(emilio+bugs) → review+
Comment on attachment 8876442 [details] [diff] [review]
Part 3 - Make source_order u32 and shrink Rule. v1

Review of attachment 8876442 [details] [diff] [review]:
-----------------------------------------------------------------

r=me
Attachment #8876442 - Flags: review?(emilio+bugs) → review+
Comment on attachment 8876443 [details] [diff] [review]
Part 4 - Use even fewer bits for source order and shrink ApplicableDeclarationsBlock by another word. v1

Review of attachment 8876443 [details] [diff] [review]:
-----------------------------------------------------------------

r=me, assuming Simon is fine with it.

::: servo/components/style/stylist.rs
@@ +1525,5 @@
>                                             level: CascadeLevel)
>                                             -> ApplicableDeclarationBlock {
>          ApplicableDeclarationBlock {
>              source: StyleSource::Style(self.style_rule.clone()),
> +            source_and_level: SourceOrderAndCascadeLevel::new(self.source_order, level),

Please comment on the rule member about the higher bits being unused and truncated.

@@ +1556,5 @@
> +const CASCADE_LEVEL_SHIFT: usize = 24;
> +
> +/// Stores the source order of a block and the cascade level it belongs to.
> +#[derive(Copy, Clone, Eq, PartialEq)]
> +struct SourceOrderAndCascadeLevel(u32);

stylist.rs is already big enough, perhaps it's worth moving this to its own file? (Maybe with the whole ApplicableDeclarationBlock stuff?)

Also, that will make it a bit less error prone, currently this whole file has access to the inner field of the struct.

@@ +1573,5 @@
> +    fn level(&self) -> CascadeLevel {
> +        unsafe {
> +            // Transmute rather than shifting so that we're sure the compiler
> +            // emits a simple byte-aligned load.
> +            let as_bytes: [u8; 4] = mem::transmute(self.0);

This is nasty... let's at least debug_assert! that it's in range.

@@ +1582,5 @@
> +}
> +
> +impl Debug for SourceOrderAndCascadeLevel {
> +    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
> +        write!(f, "SourceOrderAndCascadeLevel {{ order: {:?}, level: {:?} }}",

I think Formatter has convenience methods for this[1], so you could do:

f.debug_struct("SourceOrderAndCascadeLevel")
    .field("order", self.order())
    .field("level", self.level())
    .finish()

which may be a bit nicer.

[1]: https://doc.rust-lang.org/std/fmt/struct.Formatter.html#method.debug_struct
Attachment #8876443 - Flags: review?(emilio+bugs) → review+
> but I think they're all just cribbing from webkit

Gecko is not, fwiw.  I mean, it is initially cribbed from WebKit, but it was all double-checked for sanity.

So in the case of Gecko, the Bloom filter is a template on the KeySize.  We need to produce 2*KeySize bits from the 32-bit hash.  The simplest way to do that is to always shift by 16 and static_assert that KeySize <= 16, which is what Gecko does.

We _could_ shift by KeySize (and still assert that KeySize <= 16).  Assuming the hash is a good well-distributed one, this should not matter.  How well-distributed our hashes are in practice... who knows.

Stylo's filter is not parametrized on KEY_SIZE; it's just hardcoded.  So the above considerations don't really apply.
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Attachment #8876443 - Flags: review?(simon.sapin) → review+
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: