Closed Bug 1747037 Opened 2 years ago Closed 2 years ago

7.32% Images (Windows) regression on Sat December 18 2021

Categories

(Core :: CSS Parsing and Computation, defect)


Tracking


RESOLVED FIXED
97 Branch
Tracking Status
firefox-esr91 --- unaffected
firefox95 --- unaffected
firefox96 --- unaffected
firefox97 + fixed

People

(Reporter: davehunt, Assigned: emilio)

References

(Blocks 1 open bug, Regression)

Details

(Keywords: perf, perf-alert, regression)

Attachments

(2 files)

Perfherder has detected an awsy performance regression from push b66f40aa15426fbe92264c3fe08c6d4c4cb518be. As you are the author of one of the patches included in that push, we need your help to address this regression.

Regressions:

Ratio Test Platform Options Absolute values (old vs new)
7% Images windows10-64-2004-shippable-qr tp6 7,786,924.32 -> 8,356,889.83

Details of the alert can be found in the alert summary, including links to graphs and comparisons for each of the affected tests. Please follow our guide to handling regression bugs and let us know your plans within 3 business days, or the offending patch(es) will be backed out in accordance with our regression policy.

For more information on performance sheriffing please see our FAQ.

Flags: needinfo?(emilio)

Is this Windows only somehow?

Flags: needinfo?(dave.hunt)

See bug 1744102 comment 16. Looking at the results for macOS and Linux, there may be a similar regression, but there's too much noise in the data to be sure. The Windows regression wasn't picked up by our detection algorithm; it was only spotted on manual inspection, by request. It seems these results became a lot noisier in May.

Flags: needinfo?(dave.hunt)

Set release status flags based on info from the regressing bug 1744102

Given that this showed up on Linux the last time we tried to land hashbrown (bug 1633410), I would suspect it's platform-independent and just a question of noise.

I think this should be relatively straightforward to debug. The subtest analysis [1] shows this happening on fresh start, so we should be able to just start the browser (with and without the patch) and reproduce the effect in about:memory, without needing to load the pageset or anything. Interestingly, "fresh start + 30s" seems pretty stable, but regular fresh start appears to go from stable to bimodal with this patch.

Once we've confirmed the effect locally, we can narrow it down by hashtable instance. I'm kinda surprised the effect is so large, so I wonder if it's an across-the-board change or just some specific large table that we should maybe tweak in some other way.

[1] https://treeherder.mozilla.org/perfherder/comparesubtest?framework=4&originalProject=autoland&originalSignature=3773908&newProject=autoland&newSignature=3773908&originalRevision=b9258df6e7810a879ec130f25d0741e015b9c267&newRevision=b66f40aa15426fbe92264c3fe08c6d4c4cb518be&page=1
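
For the "narrow it down by hashtable instance" step, here is a minimal sketch of the kind of per-map logging that produces output like the lines quoted further down. The helper is illustrative, not the actual Gecko memory reporter, and the size estimate ignores per-bucket control bytes and allocator rounding:

    use std::collections::HashMap;
    use std::mem::size_of;

    // Illustrative only: print a map's len, capacity, and a rough
    // "capacity * entry size" estimate, similar to the log lines below.
    fn log_map_size<K, V>(map: &HashMap<K, V>) {
        let entry_size = size_of::<K>() + size_of::<V>();
        println!(
            "HashMap<{}, {}>: len = {}, cap = {}, estimate = {}",
            std::any::type_name::<K>(),
            std::any::type_name::<V>(),
            map.len(),
            map.capacity(),
            map.capacity() * entry_size
        );
    }

    fn main() {
        let mut m: HashMap<u64, u64> = HashMap::new();
        for i in 0..451u64 {
            m.insert(i, i);
        }
        // With hashbrown's 7/8 load factor this reports cap = 896 for 451
        // entries, i.e. roughly half the reserved slots are unused.
        log_map_size(&m);
    }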

Assignee: nobody → emilio

So this is measuring only "explicit" (memory-reported) memory, right? Out of curiosity, do we have the heap-unclassified measurement as well?

Given this is reported memory, something like this should help figure it out. From a quick look it does seem like hashbrown reserves memory more aggressively. E.g., before, what is presumably the same hashmap:

HashMap<style::gecko_string_cache::Atom, smallvec::SmallVec<[style::stylist::Rule; 1]>>: len = 431, cap = 466, s = 36864, estimate = 33552
HashMap<style::gecko_string_cache::Atom, smallvec::SmallVec<[style::stylist::Rule; 1]>>: len = 451, cap = 466, s = 36864, estimate = 33552
HashMap<style::values::AtomIdent, smallvec::SmallVec<[style::stylist::Rule; 1]>>: len = 57, cap = 59, s = 8192, estimate = 4248
HashMap<style::values::AtomIdent, smallvec::SmallVec<[style::stylist::Rule; 1]>>: len = 46, cap = 59, s = 8192, estimate = 4248

After:

HashMap<style::gecko_string_cache::Atom, smallvec::SmallVec<[style::stylist::Rule; 1]>>: len = 431, cap = 448, s = 36864, estimate = 32256
HashMap<style::gecko_string_cache::Atom, smallvec::SmallVec<[style::stylist::Rule; 1]>>: len = 451, cap = 896, s = 69632, estimate = 64512
HashMap<style::values::AtomIdent, smallvec::SmallVec<[style::stylist::Rule; 1]>>: len = 57, cap = 112, s = 12288, estimate = 8064
HashMap<style::values::AtomIdent, smallvec::SmallVec<[style::stylist::Rule; 1]>>: len = 46, cap = 56, s = 8192, estimate = 4032
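
(For scale, derived directly from the numbers above: the estimate column is cap × 72 bytes per entry, so the second map's capacity going from 466 to 896 reserves roughly 31 KB more, and the AtomIdent map's jump from 59 to 112 adds another ~4 KB.)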

Yeah, so hashbrown effectively doubles the capacity here, while hashglobe only reserved len * 1.1 slots (rounded up to a power of two):

    /// A hash map's "capacity" is the number of elements it can hold without
    /// being resized. Its "raw capacity" is the number of slots required to
    /// provide that capacity, accounting for maximum loading. The raw capacity
    /// is always zero or a power of two.
    #[inline]
    fn raw_capacity(&self, len: usize) -> usize {
        if len == 0 {
            0
        } else {
            // 1. Account for loading: `raw_capacity >= len * 1.1`.
            // 2. Ensure it is a power of two.
            // 3. Ensure it is at least the minimum size.
            let mut raw_cap = len * 11 / 10;
            assert!(raw_cap >= len, "raw_cap overflow");
            raw_cap = raw_cap
                .checked_next_power_of_two()
                .expect("raw_capacity overflow");
            raw_cap = max(MIN_NONZERO_RAW_CAPACITY, raw_cap);
            raw_cap
        }
    }
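
For comparison, here's a rough reconstruction of hashbrown's sizing (7/8 maximum load, buckets rounded up to a power of two). This is a sketch rather than the actual hashbrown source, but it reproduces the four "after" capacities logged above:

    // Sketch of hashbrown's bucket sizing: at least len * 8/7 buckets,
    // rounded up to the next power of two (small tables special-cased).
    fn buckets_for(len: usize) -> usize {
        if len < 8 {
            return if len < 4 { 4 } else { 8 };
        }
        (len * 8 / 7).next_power_of_two()
    }

    // Usable capacity is 7/8 of the buckets (buckets - 1 for tiny tables).
    fn usable_capacity(buckets: usize) -> usize {
        if buckets < 8 {
            buckets - 1
        } else {
            buckets / 8 * 7
        }
    }

    fn main() {
        // len = 431 -> 512 buckets -> cap 448 (first map above)
        assert_eq!(usable_capacity(buckets_for(431)), 448);
        // len = 451 -> 515 slots needed -> 1024 buckets -> cap 896 (the big jump)
        assert_eq!(usable_capacity(buckets_for(451)), 896);
        // len = 57 -> 128 buckets -> cap 112
        assert_eq!(usable_capacity(buckets_for(57)), 112);
        // len = 46 -> 64 buckets -> cap 56
        assert_eq!(usable_capacity(buckets_for(46)), 56);
        // Hashglobe's len * 1.1 bound only needs ~496 slots for 451 entries,
        // which still fits in 512 buckets, hence the old cap of ~466.
        println!("all capacities match the logged values");
    }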

So this is sort-of expected, and I guess we can either:

  • shrink-to-fit the hash maps after stylist rebuild.
  • somehow try to alter hashbrown's growth policy.
Flags: needinfo?(emilio)

Hashbrown sometimes grows a lot, making us waste a lot of memory. Shrink
some of these maps after CascadeData rebuild / stylesheet collection
invalidation.
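
A minimal sketch of that approach with a plain std HashMap; the helper name and the "more than 25% wasted" threshold are illustrative, not the actual Servo change:

    use std::collections::HashMap;
    use std::hash::Hash;

    // Illustrative only: after a rebuild, release capacity the map no longer
    // needs. The threshold avoids churn on maps that are already well sized.
    fn shrink_if_wasteful<K: Eq + Hash, V>(map: &mut HashMap<K, V>) {
        if map.capacity() > map.len() + map.len() / 4 {
            map.shrink_to_fit();
        }
    }

    fn main() {
        let mut rules: HashMap<u32, Vec<u32>> = HashMap::new();
        for i in 0..1000 {
            rules.insert(i, vec![i]);
        }
        rules.retain(|&k, _| k < 100); // e.g. a rebuild dropped most entries
        let before = rules.capacity();
        shrink_if_wasteful(&mut rules);
        println!("capacity {} -> {}", before, rules.capacity());
    }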

Pushed by ealvarez@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/4c248995e513
Shrink maps if needed after stylist rebuilds. r=bholley
Blocks: 1747075
Status: NEW → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED
Target Milestone: --- → 97 Branch
Has Regression Range: --- → yes
