There is no evacuation of dead entries in cairo_hash_tables if the size of the table never changes, so, when entries are regularly added and removed, eventually the hash table has no free entries. Further, free (unused) entries are preferred over dead entries for use when inserting new live entries, so dead entries quickly fill the table. When there are no free entries to terminate a search, checking that a key is not in the table requires probing every entry in the table, calling cairo_hash_keys_equal_func_t on each live entry, so the lookup is O(n).
Created attachment 336403 [details] [diff] [review] dead entry management 1. To avoid the number of dead entries growing as fast as possible, _cairo_hash_table_lookup_internal is modified to return the first available (dead or free) entry rather than the first never-used entry. 2. _cairo_hash_table_insert can use key_unique = TRUE when calling _cairo_hash_table_lookup_internal as the rules don't permit inserting an entry with an existing key. This saves on having to search through dead entries (as the first dead can be used), and means that the cairo_hash_keys_equal_func_t need not be called on live entries. Modification 2 actually makes modification 1 unnecessary, but 1 makes the code correct. Either of the above modifications significantly slows the filling of the table with dead entries, but tables still become full eventually, so 3. Ensure that at least 25% of the entries in the table are free (search terminations). This is achieved by reshuffling the table when the number of free entries becomes small. One iteration of reshuffling doesn't necessarily clear out all dead entries, but does clear out most of them, so the number of reshuffles remains small. The number of add and remove cycles between reshuffles should be O(n), where n is the table size. With attachment 335322 [details] [diff] [review] and attachment 336396 [details] [diff] [review], this patch reduces the proportion of CPU_CLK_UNHALTED events for the Firefox process occurring in _cairo_hash_table_lookup_internal from 4.8% to 0.17%, in _cairo_scaled_glyph_keys_equal from 0.9% to 0.06%, and in _cairo_xlib_hook_keys_equal from 0.08% to 0.002%
I've opened a bug against cairo to get Keith Packard who wrote the cache code to review this. https://bugs.freedesktop.org/show_bug.cgi?id=17399
jimb suggested an optimization: What if a probe notes the first dead entry they encounter, and swaps the found entry with that? When hash_table->keys_equal returns true if _cairo_hash_table_lookup_internal, it could easily swap entry with first available. That's probably another sensible optimization, but it still wouldn't solve the problem re looking for a entry that is not there.
Try server shows ~5% Talos Tp improvement with this patch.
Karl, can you hop in to #cairo and try to grab keithp/behdad/etc? Some comments when I mentioned the patch: is10:20 < ickle> vlad_: I'd like to see the chain-coalescing part as a separate patch (just because I think that's more useful, but I don't have anything number upon which to base my opinion on... ;) 10:23 < keithp> vlad_: splitting it out so that you fixed one bug at a time would be helpful 10:23 < keithp> vlad_: the first (obvious) bug is to have lookup return the first dead entry along the chain 10:24 < keithp> with that in place, I'm not sure I see the need to explicitly remove dead entries at other times though
I don't know what timezone keithp is in, but i never seem to get a response on irc, so I'll try more communication through bugs.freedesktop. There are now 3 bugs on freedesktop for the 3 parts of comment 1. http://bugs.freedesktop.org/show_bug.cgi?id=18166 http://bugs.freedesktop.org/show_bug.cgi?id=18165 http://bugs.freedesktop.org/show_bug.cgi?id=17399
So if this doesn't get any motion on the cairo side in the next week or so, we should just take the patch in our tree, IMO.
Can someone send a summary to cairo list?
Chris pushed part 2 (from comment 1) upstream (thank you): http://gitweb.freedesktop.org/?p=cairo;a=commitdiff;h=d15fb9344bf86dd52cda0b43d3dfc49397fd84ec This should provide a bit over half the savings mentioned here (and makes part 1 unnecessary). Insertions into full tables should be much (O(n)->O(1)) faster and entries should be positioned optimally more often so that lookups for existing entries are faster. This is consistent with a 3% improvement seen with this patch on the tryserver. The lookup of an non-existing entry in a full table is still O(n) and so the tryserver still measures a 2% gain to be made from part 3, the patch in http://bugs.freedesktop.org/show_bug.cgi?id=17399 It seems this affects East Asian languages mostly, probably because the large number of glyphs used from a single font exceeds the capacity of the scaled glyph caches. (Other changes to cairo-hash that have landed upstream may change these numbers slightly: http://gitweb.freedesktop.org/?p=cairo;a=commitdiff;h=cebc84f367a81eedebf7ab0b6b082691923c3ef7 )
Another part has been pushed to master in commit aaa10fbf125a80e5379172b8564384a945728cba Author: Andrea Canciani <firstname.lastname@example.org> Date: Tue Aug 2 10:50:00 2011 +0200 hash: Improve handling of dead entries When there are no free entries to terminate a search, checking that a key is not in the table requires probing every entry in the table, i.e. it degenerates in an O(n) operation. Rehashing when the number of free entries is less than 25% makes the expected lookup time O(1). The hash-table micro benchmark become 4-6x faster. Fixes https://bugs.freedesktop.org/show_bug.cgi?id=17399 As stated in the bug report, the committed patch does not do in-place rehashing (as the patch attached to this bug did), but reuses the existing rehash function. This is sufficient to guarantee O(1) average access time, but might result in different performance (worse because of less locality, better because of simpler rehashing). Please open another report in cairo bug tracker if the hash performance is found to be unsatisfactory and/or worse than with in-place rehashing (in cairo benchmarks it was essentially the same, but I guess firefox has a more extensive benchmark suite).
Thanks very much Andrea. That should solve the issues I saw. We can close this Mozilla bug when mozilla-central's cairo is updated.