1398658 - stylo: Refactor some stuff in the style sharing cache

Assignee

Description

•

7 years ago

I spent the weekend working on a rule-node-based reuse scheme for ComputedValues (bug 1397976), which is analogous to how Gecko shares style contexts via the style context tree. The results there are encouraging, demonstrating a 1% topline improvement on AWSY for 4 threads per heycam's measurements. However, my measurements on the HTML spec show that we still do significantly worse than Gecko, even for STYLO_THREADS=1. From what I can tell, this is the result of eviction in the style sharing cache. We can store a maximum of 31 entries in the cache, after which point we start evicting elements. Since my scheme in bug 1397976 is based on top of the style sharing cache, this places a limit on how much reuse we can get. Gecko does have a limit of 10 [1], but it's per-parent. This isn't directly comparable with Stylo's scheme (31 shared for the whole level), and it's easy to construct scenarios where each approach shares better than the other. That said, on the HTML spec, Gecko's approach seems to be working significantly better. Luckily, I think we can avoid weighing the tradeoffs, and get the best of both worlds. Specifically, my plan is to replace the single LRUCache with a HashMap of LRUCaches, keyed off of parent. In terms of both sharing and lookup performance, this should perform no worse than either Stylo or Gecko, since we basically just trade a hashtable lookup for a reduction in elements to examine, which in turn allows us to cache more elements without increasing search time. I hacked this up today and measured it, and the results are encouraging. Here are the style context counts for the HTML spec with the various approaches: Gecko: 27k Stylo (1 thread, 3 threads): 46k, 84k Stylo + bug 1397976 (1 thread, 3 threads); 40k, 72k Stylo + bug 1397976 + this bug (1 thread, 3 threads); 31k, 61.5k So that's a quite significant improvement. The obvious downside to my proposed 2-tiered setup is that it's heavier-weight, and the extra allocation and memory traffic could bite us if we're not careful. However, I think we can make it work, especially since we already store the cache in TLS and reuse it across traversals. My proposal is the following: * Instead of just an LRUCache, the cache becomes a HashMap<*const ComputedValues, CacheEntry> * CacheEntry is an enum that is either an Element of a Box<LRUCache>. This would just be two words, so the hashmap would be small and efficient to manage. * The Element variant is an optimization for the case where we have only one entry in the cache. We can avoid storing an entire StyleSharingCandidate here, and just promote the Element to a Box<LRUCache> when somebody does a lookup for the given key, since anybody performing that lookup has a good chance of inserting a second element afterward. * To avoid a lot of allocation traffic on the LRUCache buffers, we collect them when clearing the cache, and store them in a vec for reuse, which can also persist across traversals. We may want to set a maximum number of free LRUCaches to persist across traversals in order to keep overhead down. On the whole, this scheme would increase static leaked data somewhat, but not too much. Instead of 8k per thread for the LRUCache, we'd have a multiplier for the number of LRUCaches we decided to persist (4 might be a reasonable number). Emilio, WDYT? [1] http://searchfox.org/mozilla-central/rev/70cfd6ceecacbe779456654b596bbee4f2b8890b/layout/style/GeckoStyleContext.cpp#325

Part 1 - Shorten the naming around the style sharing cache. v1 7 years ago Bobby Holley (:bholley) 9.31 KB, patch	emilio : review+	Details \| Diff \| Splinter Review
Part 2 - Fix an awful bug in LRUCache::touch(). v1 7 years ago Bobby Holley (:bholley) 1020 bytes, patch	emilio : review+	Details \| Diff \| Splinter Review
Part 3 - Encapsulate the sharing cache backend better. v1 7 years ago Bobby Holley (:bholley) 18.88 KB, patch	emilio : review+	Details \| Diff \| Splinter Review
Part 4 - Avoid memmoving ValidationData more than necessary. v1 7 years ago Bobby Holley (:bholley) 3.79 KB, patch	emilio : review+	Details \| Diff \| Splinter Review