Closed Bug 1951925 Opened 1 year ago Closed 1 year ago

7.27 - 3.55% Base Content Heap Unclassified / Base Content Resident Unique Memory + 1 more (Windows) regression on Wed March 5 2025

Categories

(Core :: IPC, defect)

defect

Tracking

()

RESOLVED FIXED
138 Branch
Tracking Status
firefox-esr128 --- unaffected
firefox136 --- unaffected
firefox137 --- unaffected
firefox138 --- fixed

People

(Reporter: intermittent-bug-filer, Assigned: afranchuk)

References

(Regression)

Details

(Keywords: perf, perf-alert, regression)

Attachments

(1 file)

Perfherder has detected a awsy performance regression from push 7a43eb20fb7c112e190821dec3c58fb8727e7b25. As author of one of the patches included in that push, we need your help to address this regression.

Regressions:

Ratio Test Platform Options Absolute values (old vs new)
7% Base Content Heap Unclassified windows11-64-24h2-shippable fission 1,934,130.67 -> 2,074,806.00
5% Base Content Explicit windows11-64-24h2-shippable fission 8,173,794.00 -> 8,576,967.33
4% Base Content Resident Unique Memory windows11-64-24h2-shippable fission 11,453,781.33 -> 11,860,821.33

Details of the alert can be found in the alert summary, including links to graphs and comparisons for each of the affected tests. Please follow our guide to handling regression bugs and let us know your plans within 3 business days, or the patch(es) may be backed out in accordance with our regression policy.

If you need the profiling jobs you can trigger them yourself from treeherder job view or ask a sheriff to do that for you.

You can run all of these tests on try with ./mach try perf --alert 44199

The following documentation link provides more information about this command.

For more information on performance sheriffing please see our FAQ.

If you have any questions, please do not hesitate to reach out to fbilt@mozilla.com.

Flags: needinfo?(afranchuk)

Set release status flags based on info from the regressing bug 1942129

Duplicate of this bug: 1951923

heap-unclassified and layout/style-sheet-cache/unshared seem to account for the increased sizes. At least the latter may be due to some subtlety in the changed layout/style/GlobalStyleSheetCache, though superficially the code is identical to the old code. I am investigating the cause. Given that the absolute deltas aren't prohibitively large, I'd say hold off on backing out the patch while we get down to the bottom of this regression.

Flags: needinfo?(afranchuk)

This disarms a footgun in the FreezableMapping::Freeze API, which was
causing read-only mappings in the global stylesheet cache to fail. These
failures caused content processes to re-load the style sheets,
increasing memory usage.

Assignee: nobody → afranchuk
Status: NEW → ASSIGNED

Both heap-unclassified and layout/style-sheet-cache/{un,}shared seem to have returned to prior values with the given patch.

Pushed by afranchuk@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/dca810679260 Fix base content memory usage regressions r=nika

Backed out for causing build bustages @TestSharedMemory.cpp.

Flags: needinfo?(afranchuk)

Ugh, that was fixed but I didn't push it.

Flags: needinfo?(afranchuk)
Pushed by afranchuk@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/017c9f9793ec Fix base content memory usage regressions r=nika
Status: ASSIGNED → RESOLVED
Closed: 1 year ago
Resolution: --- → FIXED
Target Milestone: --- → 138 Branch

Looks like atleast some of the numbers are back to pre-regression

The bug here was subtle:

  1. The FreezableMapping::Freeze method returned a tuple of a MutableMapping and a ReadOnlyHandle.
  2. In GlobalStyleSheetCache.cpp, we were doing a destructured binding, binding only the ReadOnlyHandle. However, c++ bindings have different semantics than value assignment, and the returned tuple (including the MutableMapping) was only destructed at the end of the enclosing scope, even though the mapping wasn't bound to an identifier.
  3. In the same scope, we tried to map the ReadOnlyHandle to the same fixed address to which the MutableMapping was mapped. Since the MutableMapping was not destructed yet, this mapping failed.

This isn't an immediate error, but it prevents the mapping from being shared with child processes, so the child processes were all loading the stylesheets themselves rather than using the shared-memory mapping holding the loaded stylesheets from the parent process. The child processes loading the stylesheets is what caused the memory regression (and performance regressions of bug 1952917.

(In reply to Alex Franchuk [:afranchuk] from comment #12)

  1. In GlobalStyleSheetCache.cpp, we were doing a destructured binding, binding only the ReadOnlyHandle. However, c++ bindings have different semantics than value assignment, and the returned tuple (including the MutableMapping) was only destructed at the end of the enclosing scope, even though the mapping wasn't bound to an identifier.

This is subtly incorrect: the mapping was bound to an identifier. An example:

  auto [_, readOnlyHandle] = std::move(mMem).Freeze();

As of C++17, _ has no special semantics; it's a perfectly valid identifier. (This contrasts with, probably most relevantly, Rust.)

(In C++26 that may change somewhat; there is a proposal to allow use of _ as a special dummy identifier. Unfortunately, it still wouldn't change the semantics of the above code: it allows redeclaring _ without hassle, but any values bound to it are still live until end-of-scope — they're just occluded.)

(In reply to Ray Kraesig [:rkraesig] from comment #13)

(In reply to Alex Franchuk [:afranchuk] from comment #12)

  1. In GlobalStyleSheetCache.cpp, we were doing a destructured binding, binding only the ReadOnlyHandle. However, c++ bindings have different semantics than value assignment, and the returned tuple (including the MutableMapping) was only destructed at the end of the enclosing scope, even though the mapping wasn't bound to an identifier.

This is subtly incorrect: the mapping was bound to an identifier. An example:

  auto [_, readOnlyHandle] = std::move(mMem).Freeze();

As of C++17, _ has no special semantics; it's a perfectly valid identifier. (This contrasts with, probably most relevantly, Rust.)

(In C++26 that may change somewhat; there is a proposal to allow use of _ as a special dummy identifier. Unfortunately, it still wouldn't change the semantics of the above code: it allows redeclaring _ without hassle, but any values bound to it are still live until end-of-scope — they're just occluded.)

Ah of course :) Thanks for clarifying. But yeah, I figure if that proposal is accepted it'll always be a footgun :/

(In reply to Pulsebot from comment #9)

Pushed by afranchuk@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/017c9f9793ec
Fix base content memory usage regressions r=nika

Perfherder has detected a awsy performance change from push 017c9f9793ecaed3435ad4fd2674fb1d6a9e6358.

Improvements:

Ratio Test Platform Options Absolute values (old vs new)
7% Base Content Heap Unclassified windows11-64-24h2-shippable fission 2,076,175.33 -> 1,935,502.67
5% Base Content Explicit windows11-64-24h2-shippable fission 8,582,291.33 -> 8,181,934.00
3% Base Content Resident Unique Memory windows11-64-24h2-shippable fission 11,829,930.67 -> 11,485,354.67

Details of the alert can be found in the alert summary, including links to graphs and comparisons for each of the affected tests.

If you need the profiling jobs you can trigger them yourself from treeherder job view or ask a sheriff to do that for you.

You can run these tests on try with ./mach try perf --alert 44317

For more information on performance sheriffing please see our FAQ.

(In reply to Cristina Horotan [:chorotan] from comment #10)

https://hg.mozilla.org/mozilla-central/rev/017c9f9793ec

Perfherder has detected a talos performance change from push 017c9f9793ecaed3435ad4fd2674fb1d6a9e6358.

Improvements:

Ratio Test Platform Options Absolute values (old vs new)
4% cpstartup content-process-startup windows11-64-24h2-shippable e10s fission stylo webrender 44.12 -> 42.42

Details of the alert can be found in the alert summary, including links to graphs and comparisons for each of the affected tests.

If you need the profiling jobs you can trigger them yourself from treeherder job view or ask a sheriff to do that for you.

You can run these tests on try with ./mach try perf --alert 44342

For more information on performance sheriffing please see our FAQ.

Regressions: 1954740
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: