Closed Bug 1487553 Opened 2 years ago Closed 2 years ago
Sparse Bit Set use a more compact representation
We use gfxSparseBitSet to record the character coverage of each gfxFontEntry (font face), and of gfxFontFamily (the union of the coverage of all the family's faces), as well as the lists of codepoints from unicode-range descriptors. This can add up to quite a lot of memory (easily several hundred KB), when coverage data for potentially hundreds of faces has been loaded. It looks like we could make gfxSparseBitSet more compact (while maintaining essentially the same lookup performance, which is important as testing font coverage is a hot code path in text layout). The current implementation keeps an array of pointers to the Block records that hold a bit-map of the supported characters in each 256-char Unicode block. So (assuming a 64-bit system) that means we have a minimum overhead of 8 bytes per non-empty block, in addition to the 32-byte Block itself (plus the allocation overhead for each Block being allocated separately). Note that the theoretical maximum number of Blocks that could be needed is 4352 (if a font were to include at least one codepoint for each 256-character block in Unicode). What I propose, therefore, is to store all the (non-empty) Block records needed for a given gfxSparseBitSet in a single contiguous nsTArray (of Blocks, not of Block pointers), to reduce the overhead of allocating each Block individually; and then to refer to them using an array of 16-bit indexes instead of an array of 64-bit pointers. As an example of the savings, if I check about:memory after loading the Google Fonts home page (fonts.google.com) in a fresh browser, it reports 215,184 bytes used for font-charmaps in the content process. With the change proposed here, this is reduced to 169,024 bytes. On font-heavy sites, the figures can go substantially higher than this (note that although the Google Fonts site loads a number of fonts, it loads only small font subsets rather than resources with large character sets).
Pushed to try to make sure it doesn't break anything: https://treeherder.mozilla.org/#/jobs?repo=try&revision=684343312f374ac05a6866aa0829096725e6b0c5.
Comment on attachment 9005561 [details] [diff] [review] Use a more compact representation for gfxSparseBitSet So the key change here is that we replace the array of UniquePtrs to blocks with an array of 16-bit block indexes, so it occupies 75% less RAM (assuming a 64-bit build; 50% less in the case of 32-bit). To make this possible, we allocate all the Block records in one contiguous array instead of as separate objects. This means we'll be moving somewhat more stuff around when that array grows and needs reallocation; but that's offset by the fact that we're doing fewer individual allocations. (Coincidentally (or not!) this is very similar to the structure I plan to use when moving charmaps into shared memory for the Fission font-management work.)
Attachment #9005561 - Flags: review?(lsalzman)
Attachment #9005561 - Flags: review?(lsalzman) → review+
Looking at the numbers from bug 1487146 (which are in a 64-bit build), that situation has 94 bitsets each with 34,824 bytes of pointers-to-blocks storage (looks like that's "all the blocks"). On average they seem to have 400 blocks allocated per bitset. For purposes of the calculations below I'll assume they're all at 400 blocks. With the old setup, that uses 3,465,216 + 1,204,096 = 4,669,312 bytes once malloc slop is included With the new setup each bitset will have an array of 400 blocks. 400 * 32 (block size) = 12,800, which jemalloc will round up to 16384 bytes, unfortunately. There will also be an array of indices, which will be 8712 bytes, I think, which gets rounded up to 12,288. Adding up and multiplying by 94, we get 94 * (12288 + 16384) = 2,695,168 bytes, which is definitely a significant win!
Pushed by email@example.com: https://hg.mozilla.org/integration/mozilla-inbound/rev/5a98a0c9c578 Use a more compact representation for gfxSparseBitSet. r=lsalzman
You need to log in before you can comment on or make changes to this bug.