Closed Bug 1398322 Opened 7 years ago Closed 7 years ago

stylo: PropertyDeclarationBlock storage is inefficient

Categories

(Core :: CSS Parsing and Computation, enhancement, P3)

enhancement

Tracking

()

RESOLVED FIXED
Tracking Status
firefox57 --- fixed

People

(Reporter: bholley, Assigned: mbrubeck)

References

Details

On gmail, bz measures 5.5MBs of PDBs, which is a lot. I think we can shrink it.

The definition is at [1]. The main inefficiency is that we store an Importance alongside every PropertyDeclaration, costing a word where we only need a bit. We should instead store this bit out-of-band, and use minimal storage for doing so.

My idea is as follows. Adjustments / counter-proposals welcome:
* We introduce a type called SmallBitVec, that just wraps a usize.
* If the rightmost bit of the usize is unset, then we treat it as inline storage.
* In the inline storage case, the rightmost bit is a sentinel from which len() can be derived.
* If the rightmost bit is set, then the usize is treated as heap pointer (modulo the rightmost bit, which needs to be masked off) to a buffer that is capacity + len + buffer.

The type would expose indexed access to the bits, as well as methods for computing whether all of the bits are set or none of the bits are set. This would allow us to eliminate important_count, since that's all the consumer really wants to know. This means that the inline size of PDB will not increase, which is a nice bonus.

[1] http://searchfox.org/mozilla-central/rev/44c693914255638d74bcf1ec3b0bcd448dfab2fd/servo/components/style/properties/declaration_block.rs#72
(In reply to Bobby Holley (:bholley) (busy with Stylo) from comment #0)
> * In the inline storage case, the rightmost bit is a sentinel from which
> len() can be derived.
> * If the rightmost bit is set, then the usize is treated as heap pointer
> (modulo the rightmost bit, which needs to be masked off) to a buffer that is
> capacity + len + buffer.

I realize that I'm overloading "rightmost". In the first case (the sentinel), it refers to "the lowest order bit that is set". In the second case (the discriminant), it refers to "the bit of order zero".
Matt is going to take this. Thanks Matt!
Flags: needinfo?(mbrubeck)
Priority: -- → P3
Flags: needinfo?(mbrubeck)
I have slightly different idea that we introduce a new type called VecWithBit which contains a Vec and a union of usize and pointer, which serves as bitvec with length identical to the capacity of the Vec, so that we don't need any extra bits for checking whether the bitvec is inlined as well as the length of bitvec, because we can simply derive them from Vec's length.
(In reply to Xidorn Quan [:xidorn] UTC+10 from comment #3)
> I have slightly different idea that we introduce a new type called
> VecWithBit which contains a Vec and a union of usize and pointer, which
> serves as bitvec with length identical to the capacity of the Vec, so that
> we don't need any extra bits for checking whether the bitvec is inlined as
> well as the length of bitvec, because we can simply derive them from Vec's
> length.

That works too, at the cost of being slightly less reusable.

If we made it reusable, we could replace the BitVec in [1] and save some heap-allocations, which would be nice.

[1] http://searchfox.org/mozilla-central/rev/44c693914255638d74bcf1ec3b0bcd448dfab2fd/servo/components/style/sharing/mod.rs#126
(I think eliminating the BitVec for revalidation_match_results would be a nice improvement, so I'm inclined to prefer the approach in comment 0 unless we discover other downsides)
I have started implementing the SmallBitVec type at https://github.com/mbrubeck/smallbitvec
(In reply to Matt Brubeck (:mbrubeck) from comment #6)
> I have started implementing the SmallBitVec type at
> https://github.com/mbrubeck/smallbitvec

Nice! Looks like it's missing a Drop impl though?
Also might be worth adding some cargo bench microbenchmarks to compare perf against BitVec.
Also: probably worth using an unchecked get from the iterator, so that we don't have to do the somewhat-complicated len() computation on each call to next()?
(In reply to Matt Brubeck (:mbrubeck) from comment #7)
> https://github.com/servo/servo/pull/18431

This looks great on a quick skim. May be worth pushing through gecko try before landing.
Added the Drop impl, get_unchecked method, and some microbenchmarks.  Also made some tweaks to improve performance on some of the benchmarks.  Currently, SmallBitVec is about 20% slower than BitVec in benchmarks that test the `set` method, and about 200% slower in benchmarks that iterate over the vector.
(In reply to Matt Brubeck (:mbrubeck) from comment #12)
> Added the Drop impl, get_unchecked method, and some microbenchmarks.  Also
> made some tweaks to improve performance on some of the benchmarks. 
> Currently, SmallBitVec is about 20% slower than BitVec in benchmarks that
> test the `set` method, and about 200% slower in benchmarks that iterate over
> the vector.

I noticed you added a commit to iterator performance by fiddling with inlining. How much did it help? In practice, we probably only really care about the non-spilled case, so we could make that logic inline and make the spill-handling #[inline(never)] / #[cold].

Also, what about eq performance? I noticed that SmallBitVec compares via iterators, whereas BitVec compares the words. That potentially makes BitVec 64x faster on eq, which probably matters when comparing the bitvecs for revalidation selectors. Should be straightforward to add a similar eq impl?
The inlining changes improved iteration speed about 30% in both spilled and non-spilled cases.

I'll push a fix for eq performance shortly.
Thanks Matt!
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.