7.27 - 3.55% Base Content Heap Unclassified / Base Content Resident Unique Memory + 1 more (Windows) regression on Wed March 5 2025
Categories
(Core :: IPC, defect)
Tracking
()
| Tracking | Status | |
|---|---|---|
| firefox-esr128 | --- | unaffected |
| firefox136 | --- | unaffected |
| firefox137 | --- | unaffected |
| firefox138 | --- | fixed |
People
(Reporter: intermittent-bug-filer, Assigned: afranchuk)
References
(Regression)
Details
(Keywords: perf, perf-alert, regression)
Attachments
(1 file)
Perfherder has detected a awsy performance regression from push 7a43eb20fb7c112e190821dec3c58fb8727e7b25. As author of one of the patches included in that push, we need your help to address this regression.
Regressions:
| Ratio | Test | Platform | Options | Absolute values (old vs new) |
|---|---|---|---|---|
| 7% | Base Content Heap Unclassified | windows11-64-24h2-shippable | fission | 1,934,130.67 -> 2,074,806.00 |
| 5% | Base Content Explicit | windows11-64-24h2-shippable | fission | 8,173,794.00 -> 8,576,967.33 |
| 4% | Base Content Resident Unique Memory | windows11-64-24h2-shippable | fission | 11,453,781.33 -> 11,860,821.33 |
Details of the alert can be found in the alert summary, including links to graphs and comparisons for each of the affected tests. Please follow our guide to handling regression bugs and let us know your plans within 3 business days, or the patch(es) may be backed out in accordance with our regression policy.
If you need the profiling jobs you can trigger them yourself from treeherder job view or ask a sheriff to do that for you.
You can run all of these tests on try with ./mach try perf --alert 44199
The following documentation link provides more information about this command.
For more information on performance sheriffing please see our FAQ.
If you have any questions, please do not hesitate to reach out to fbilt@mozilla.com.
Comment 1•1 year ago
|
||
Set release status flags based on info from the regressing bug 1942129
| Assignee | ||
Comment 3•1 year ago
|
||
heap-unclassified and layout/style-sheet-cache/unshared seem to account for the increased sizes. At least the latter may be due to some subtlety in the changed layout/style/GlobalStyleSheetCache, though superficially the code is identical to the old code. I am investigating the cause. Given that the absolute deltas aren't prohibitively large, I'd say hold off on backing out the patch while we get down to the bottom of this regression.
| Assignee | ||
Comment 4•1 year ago
|
||
This disarms a footgun in the FreezableMapping::Freeze API, which was
causing read-only mappings in the global stylesheet cache to fail. These
failures caused content processes to re-load the style sheets,
increasing memory usage.
Updated•1 year ago
|
| Assignee | ||
Comment 5•1 year ago
|
||
Both heap-unclassified and layout/style-sheet-cache/{un,}shared seem to have returned to prior values with the given patch.
Backed out for causing build bustages @TestSharedMemory.cpp.
| Assignee | ||
Comment 8•1 year ago
|
||
Ugh, that was fixed but I didn't push it.
Comment 10•1 year ago
|
||
| bugherder | ||
Comment 11•1 year ago
|
||
Looks like atleast some of the numbers are back to pre-regression
| Assignee | ||
Comment 12•1 year ago
•
|
||
The bug here was subtle:
- The
FreezableMapping::Freezemethod returned a tuple of aMutableMappingand aReadOnlyHandle. - In
GlobalStyleSheetCache.cpp, we were doing a destructured binding, binding only theReadOnlyHandle. However, c++ bindings have different semantics than value assignment, and the returned tuple (including theMutableMapping) was only destructed at the end of the enclosing scope, even though the mapping wasn't bound to an identifier. - In the same scope, we tried to map the
ReadOnlyHandleto the same fixed address to which theMutableMappingwas mapped. Since theMutableMappingwas not destructed yet, this mapping failed.
This isn't an immediate error, but it prevents the mapping from being shared with child processes, so the child processes were all loading the stylesheets themselves rather than using the shared-memory mapping holding the loaded stylesheets from the parent process. The child processes loading the stylesheets is what caused the memory regression (and performance regressions of bug 1952917.
Comment 13•1 year ago
|
||
(In reply to Alex Franchuk [:afranchuk] from comment #12)
- In
GlobalStyleSheetCache.cpp, we were doing a destructured binding, binding only theReadOnlyHandle. However, c++ bindings have different semantics than value assignment, and the returned tuple (including theMutableMapping) was only destructed at the end of the enclosing scope, even though the mapping wasn't bound to an identifier.
This is subtly incorrect: the mapping was bound to an identifier. An example:
auto [_, readOnlyHandle] = std::move(mMem).Freeze();
As of C++17, _ has no special semantics; it's a perfectly valid identifier. (This contrasts with, probably most relevantly, Rust.)
(In C++26 that may change somewhat; there is a proposal to allow use of _ as a special dummy identifier. Unfortunately, it still wouldn't change the semantics of the above code: it allows redeclaring _ without hassle, but any values bound to it are still live until end-of-scope — they're just occluded.)
| Assignee | ||
Comment 14•1 year ago
|
||
(In reply to Ray Kraesig [:rkraesig] from comment #13)
(In reply to Alex Franchuk [:afranchuk] from comment #12)
- In
GlobalStyleSheetCache.cpp, we were doing a destructured binding, binding only theReadOnlyHandle. However, c++ bindings have different semantics than value assignment, and the returned tuple (including theMutableMapping) was only destructed at the end of the enclosing scope, even though the mapping wasn't bound to an identifier.This is subtly incorrect: the mapping was bound to an identifier. An example:
auto [_, readOnlyHandle] = std::move(mMem).Freeze();As of C++17,
_has no special semantics; it's a perfectly valid identifier. (This contrasts with, probably most relevantly, Rust.)(In C++26 that may change somewhat; there is a proposal to allow use of
_as a special dummy identifier. Unfortunately, it still wouldn't change the semantics of the above code: it allows redeclaring_without hassle, but any values bound to it are still live until end-of-scope — they're just occluded.)
Ah of course :) Thanks for clarifying. But yeah, I figure if that proposal is accepted it'll always be a footgun :/
Comment 15•1 year ago
|
||
(In reply to Pulsebot from comment #9)
Pushed by afranchuk@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/017c9f9793ec
Fix base content memory usage regressions r=nika
Perfherder has detected a awsy performance change from push 017c9f9793ecaed3435ad4fd2674fb1d6a9e6358.
Improvements:
| Ratio | Test | Platform | Options | Absolute values (old vs new) |
|---|---|---|---|---|
| 7% | Base Content Heap Unclassified | windows11-64-24h2-shippable | fission | 2,076,175.33 -> 1,935,502.67 |
| 5% | Base Content Explicit | windows11-64-24h2-shippable | fission | 8,582,291.33 -> 8,181,934.00 |
| 3% | Base Content Resident Unique Memory | windows11-64-24h2-shippable | fission | 11,829,930.67 -> 11,485,354.67 |
Details of the alert can be found in the alert summary, including links to graphs and comparisons for each of the affected tests.
If you need the profiling jobs you can trigger them yourself from treeherder job view or ask a sheriff to do that for you.
You can run these tests on try with ./mach try perf --alert 44317
For more information on performance sheriffing please see our FAQ.
Comment 16•1 year ago
|
||
(In reply to Cristina Horotan [:chorotan] from comment #10)
Perfherder has detected a talos performance change from push 017c9f9793ecaed3435ad4fd2674fb1d6a9e6358.
Improvements:
| Ratio | Test | Platform | Options | Absolute values (old vs new) |
|---|---|---|---|---|
| 4% | cpstartup content-process-startup | windows11-64-24h2-shippable | e10s fission stylo webrender | 44.12 -> 42.42 |
Details of the alert can be found in the alert summary, including links to graphs and comparisons for each of the affected tests.
If you need the profiling jobs you can trigger them yourself from treeherder job view or ask a sheriff to do that for you.
You can run these tests on try with ./mach try perf --alert 44342
For more information on performance sheriffing please see our FAQ.
Description
•