0.4% Base Content JS (Windows) regression on Fri September 30 2022
Categories: (Remote Protocol :: Agent, defect, P3)
Tracking: (firefox-esr102 unaffected, firefox105 unaffected, firefox106 unaffected, firefox107 wontfix)
People: (Reporter: beatrice.acasandrei, Unassigned)
References: (Regression)
Details: (Keywords: perf, perf-alert, regression)
Perfherder has detected an awsy performance regression from push 256e9641f5ae3a427a779321f387c33bfa808315. As the author of one of the patches included in that push, you are being asked to help address this regression.
Regressions:
Ratio | Test | Platform | Options | Absolute values (old vs new) |
---|---|---|---|---|
0.40% | Base Content JS | windows10-64-2004-shippable-qr | fission | 1,627,552.00 -> 1,634,134.67 |
Details of the alert can be found in the alert summary, including links to graphs and comparisons for each of the affected tests. Please follow our guide to handling regression bugs and let us know your plans within 3 business days, or the offending patch(es) may be backed out in accordance with our regression policy.
If you need the profiling jobs you can trigger them yourself from treeherder job view or ask a sheriff to do that for you.
For more information on performance sheriffing please see our FAQ.
Comment 1•6 months ago
Tooru, the regression is caused by converting the JSM modules under /remote to ESM modules. Are we aware of higher memory usage or leaks for the new modules?
Comment 2•6 months ago
I haven't heard of regressions with other conversions.
I'll look into the details.
:jonco, do you know of any module-specific memory consumption?
Comment 3•6 months ago
The way to investigate these kinds of regressions is to download the about:memory reports from before and after the regression, then use about:memory on your local computer to load and diff them.
Also, if these changes are in code that isn't normally loaded (it looks like some kind of testing automation) then I wouldn't worry about it and you could just mark this WONTFIX. Maybe if the regression was large it would be worth looking at, but it is quite small.
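For reference, a minimal sketch of how such a diff could be scripted outside the about:memory UI, assuming the saved reports are the gzipped JSON files (memory-report.json.gz) produced by about:memory's "Measure and save", with a top-level reports array of { process, path, amount } entries; the file names and script are hypothetical:

```js
// Hypothetical Node.js helper to diff two saved about:memory reports.
// Assumes the saved format: gzipped JSON with { reports: [{ process, path, amount, ... }] }.
const fs = require("node:fs");
const zlib = require("node:zlib");

function loadTotals(file) {
  const json = JSON.parse(zlib.gunzipSync(fs.readFileSync(file)).toString());
  const totals = new Map();
  for (const { path, amount } of json.reports) {
    // Sum each memory-reporter path across all processes.
    totals.set(path, (totals.get(path) ?? 0) + amount);
  }
  return totals;
}

function diffReports(beforeFile, afterFile) {
  const before = loadTotals(beforeFile);
  const after = loadTotals(afterFile);
  const paths = new Set([...before.keys(), ...after.keys()]);
  for (const path of [...paths].sort()) {
    const delta = (after.get(path) ?? 0) - (before.get(path) ?? 0);
    if (delta !== 0) {
      console.log(`${delta > 0 ? "+" : ""}${delta}\t${path}`);
    }
  }
}

// Usage: node diff-memory-reports.js before.json.gz after.json.gz
diffReports(process.argv[2], process.argv[3]);
```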
Comment 4•6 months ago
(In reply to Tooru Fujisawa [:arai] from comment #2)
> I haven't heard of regressions with other conversions.
They'd have to be modules that are loaded into a base content process with only about:blank for us to notice. None of the other memory tests are sensitive enough to pick up regressions of that magnitude.
That said, a lot of the time, regressions from this sort of refactoring are just from small changes that bump a hashtable into a different size class. They often don't actually amount to a real regression in practice, since those hashtables will usually quickly gain new entries during normal browser usage. I just don't like to assume that that's what happened without at least some investigation.
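As an illustration of that size-class effect (a toy model, not Gecko's actual allocator or hash table implementation), one extra entry can double the memory a table reserves:

```js
// Toy model: power-of-two size classes plus a 0.75 load factor show how one
// extra entry can bump a table's reserved allocation into the next class.
function sizeClass(bytes) {
  return 2 ** Math.ceil(Math.log2(bytes));
}

function tableAllocation(entries, entrySize = 16, maxLoadFactor = 0.75) {
  let capacity = 4;
  while (entries > capacity * maxLoadFactor) {
    capacity *= 2; // the table doubles when the load factor is exceeded
  }
  return sizeClass(capacity * entrySize);
}

console.log(tableAllocation(24)); // 512 bytes
console.log(tableAllocation(25)); // 1024 bytes: one extra entry crosses the boundary
```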
Comment 5•6 months ago
Thanks for the pointer!
Here's the diff between before and after, excluding items without change, under "js-main-runtime" for web content processes (and subcategories for "runtime"), summed over 8 processes:
category | before (bytes) | after (bytes) | diff | note |
---|---|---|---|---|
realms/classes/objects/gc-heap | 1,214,672 | 1,235,024 | +20,352 | |
realms/classes/objects/malloc-heap/slots | 248,192 | 245,632 | -2,560 | |
realms/scripts/gc-heap | 211,200 | 206,720 | -4,480 | |
realms/scripts/malloc-heap/data | 212,736 | 219,392 | +6,656 | |
runtime/gc/store-buffer/vals | 5,760 | 6,144 | +384 | |
runtime/script-data | 733,376 | 758,336 | +24,960 | |
runtime/shared-immutable-strings-cache | 224,864 | 222,944 | -1,920 | |
zones/gc-heap-arena-admin | 33,248 | 66,296 | +33,048 | only 4 processes have this in "before" |
zones/property-maps/gc-heap/compact | 242,320 | 240,656 | -1,664 | |
zones/scopes/gc-heap | 106,208 | 115,168 | +8,960 | |
zones/scopes/malloc-heap | 134,928 | 147,344 | +12,416 | |
zones/shapes/gc-heap/shared | 101,632 | 100,864 | -768 | |
zones/strings/gc-heap/latin1 | 789,752 | 796,152 | +6,400 | |
zones/strings/gc-heap/two-byte | 14,208 | 14,016 | -192 | |
zones/strings/malloc-heap/latin1 | 110,272 | 108,352 | -1,920 | |
zones/strings/malloc-heap/two-byte | 125,568 | 124,544 | -1,024 | |
zones/sundries/gc-heap | 50,656 | 17,920 | -32,736 | |
zones/unused-gc-things | 804,280 | 795,840 | -8,440 | |
total | - | - | +57,472 | |
I'll continue investigating.
Comment 6•6 months ago
Set release status flags based on info from the regressing bug 1790473
Comment 7•5 months ago
(In reply to Tooru Fujisawa [:arai] from comment #2)
> :jonco, do you know of any module-specific memory consumption?
The memory consumption of ESMs may be greater than that of JSMs. There is more metadata that is kept around (e.g. the ModuleObject, and the import and export data), so there will be some overhead. I'd expect the main memory cost to be the JSScript, which would be the same, though.
I wondered if something that was previously loaded lazily was now being loaded eagerly, but I couldn't immediately spot any instance of that happening.
If the problem is the metadata overhead then we can try to represent this in a more compact form. There hasn't been any effort to optimise this so far.
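For context, a minimal sketch of the shape of the conversion being discussed (hypothetical module name; the real modules live under chrome://remote/content/):

```js
// Before — Example.jsm (classic JSM): exports are listed by name and looked up
// on the module global when another file imports it.
//   const EXPORTED_SYMBOLS = ["Example"];
//   const Example = { hello: () => "hi" };
//   // consumer:
//   const { Example } = ChromeUtils.import("chrome://remote/content/Example.jsm");

// After — Example.sys.mjs (ESM): exports are real module bindings, so the
// engine also keeps a ModuleObject plus import/export metadata per module,
// which is the overhead described above.
export const Example = { hello: () => "hi" };

// consumer:
//   const { Example } =
//     ChromeUtils.importESModule("chrome://remote/content/Example.sys.mjs");
```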
Comment 8•5 months ago
I've tried loading all converted JSMs/ESMs in xpcshell (except chrome://remote/content/marionette/*, which wasn't available) before and after the patch stack, but I don't see the regression in the total memory consumption under js-main-runtime/.
Actually, the memory consumption decreases throughout the patch stack.
category | before | patch 1 | patch 2 | patch 3 | patch 4 | patch 5 | patch 6 |
---|---|---|---|---|---|---|---|
js-main-runtime/gc-heap/chunk-admin | 32768 | 32768 | 32768 | 32768 | 32768 | 32768 | 32768 |
js-main-runtime/gc-heap/unused-arenas | 225280 | 223232 | 218112 | 217088 | 218112 | 225280 | 222208 |
js-main-runtime/gc-heap/unused-chunks | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
js-main-runtime/realms/classes/objects/gc-heap | 546480 | 583290 | 585609 | 587608 | 587337 | 600344 | 641530 |
js-main-runtime/realms/classes/objects/malloc-heap/elements/normal | 7936 | 7936 | 7936 | 7936 | 7936 | 7936 | 7936 |
js-main-runtime/realms/classes/objects/malloc-heap/global-data | 3584 | 3584 | 3584 | 3584 | 3584 | 3584 | 3584 |
js-main-runtime/realms/classes/objects/malloc-heap/misc | 11344 | 11344 | 11344 | 11344 | 11344 | 11344 | 11344 |
js-main-runtime/realms/classes/objects/malloc-heap/slots | 63680 | 59968 | 59584 | 59264 | 59264 | 58048 | 52416 |
js-main-runtime/realms/non-syntactic-lexical-scopes-table | 8288 | 0 | 0 | 0 | 0 | 0 | 0 |
js-main-runtime/realms/scripts/gc-heap | 115040 | 111840 | 111680 | 111520 | 111520 | 110480 | 107040 |
js-main-runtime/realms/scripts/malloc-heap/data | 217744 | 197264 | 195152 | 194432 | 194432 | 189104 | 108896 |
js-main-runtime/realms/sundries/malloc-heap | 10556 | 11342 | 12604 | 12776 | 12386 | 11906 | 7758 |
js-main-runtime/runtime | 2326706 | 2321098 | 2330684 | 2331312 | 2327332 | 2328308 | 2254692 |
js-main-runtime/zones/gc-heap-arena-admin | 23352 | 24008 | 24088 | 24112 | 24088 | 24248 | 25270 |
js-main-runtime/zones/getter-setters-gc-heap | 10064 | 10064 | 10064 | 10064 | 10064 | 10064 | 10064 |
js-main-runtime/zones/property-maps/gc-heap/compact | 107432 | 102440 | 101608 | 101296 | 101296 | 98800 | 92144 |
js-main-runtime/zones/property-maps/gc-heap/normal | 32096 | 32096 | 32096 | 32096 | 32096 | 31960 | 31824 |
js-main-runtime/zones/property-maps/malloc-heap/children | 8720 | 0 | 0 | 0 | 0 | 0 | 0 |
js-main-runtime/zones/property-maps/malloc-heap/tables | 34112 | 34112 | 34112 | 34112 | 34112 | 34112 | 34112 |
js-main-runtime/zones/propmap-tables | 16768 | 16768 | 16768 | 16768 | 16768 | 16768 | 16768 |
js-main-runtime/zones/scopes/gc-heap | 73344 | 73216 | 73216 | 73216 | 73216 | 73216 | 73120 |
js-main-runtime/zones/scopes/malloc-heap | 93344 | 93728 | 93760 | 93792 | 93776 | 93968 | 94432 |
js-main-runtime/zones/shape-tables | 41088 | 41088 | 41088 | 41088 | 41088 | 41088 | 41088 |
js-main-runtime/zones/shapes/gc-heap/base | 3768 | 3768 | 3768 | 3768 | 3768 | 3768 | 3768 |
js-main-runtime/zones/shapes/gc-heap/dict | 160 | 160 | 160 | 160 | 160 | 160 | 160 |
js-main-runtime/zones/shapes/gc-heap/shared | 45696 | 43648 | 43456 | 43328 | 43328 | 42400 | 39776 |
js-main-runtime/zones/strings/gc-heap/latin1 | 260992 | 256800 | 256456 | 256384 | 256504 | 255896 | 253312 |
js-main-runtime/zones/strings/gc-heap/two-byte | 7232 | 4568 | 4472 | 4376 | 4232 | 3848 | 2696 |
js-main-runtime/zones/strings/malloc-heap/latin1 | 185920 | 178224 | 177616 | 177472 | 177744 | 176224 | 170384 |
js-main-runtime/zones/strings/malloc-heap/two-byte | 29072 | 13376 | 12864 | 12352 | 11584 | 8880 | 2432 |
js-main-runtime/zones/sundries/gc-heap | 2828 | 2578 | 2778 | 2728 | 2728 | 2823 | 2648 |
js-main-runtime/zones/sundries/malloc-heap | 8896 | 15640 | 16280 | 16528 | 16032 | 16336 | 15768 |
js-main-runtime/zones/unique-id-map | 17152 | 8960 | 8960 | 8960 | 8960 | 8960 | 8960 |
js-main-runtime/zones/unused-gc-things | 250172 | 264996 | 269141 | 268960 | 268255 | 269801 | 296680 |
total | 4821614 | 4783904 | 4791808 | 4791192 | 4785814 | 4792422 | 4665578 |
There are 2 categories that notably increase:
js-main-runtime/realms/classes/objects/gc-heap
js-main-runtime/zones/unused-gc-things
and 2 categories that notably decrease, especially with the last patch:
js-main-runtime/realms/scripts/malloc-heap/data
js-main-runtime/runtime
and in the total, the decrease is bigger than the increase.
So, there are 2 possibilities:
- given that not all modules are loaded during the test, the increase may be bigger than the decrease for the set that is loaded
- as jonco says, the set of loaded modules becomes different with the change
I'll check which modules are actually loaded during the test and see if the regression happens with that set.
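A possible way to check that from a privileged context (Browser Console or xpcshell): Cu.loadedModules lists loaded JSM URLs, and Cu.loadedESModules is assumed here to be its ESM counterpart.

```js
// Hedged sketch: list which chrome://remote/ modules are currently loaded.
// Cu.loadedModules enumerates loaded JSM URLs; Cu.loadedESModules is assumed
// to be the ESM equivalent (fall back to an empty list if it is absent).
const loaded = [
  ...Cu.loadedModules,
  ...(Cu.loadedESModules ?? []),
].filter(uri => uri.startsWith("chrome://remote/"));
console.log(loaded.join("\n"));
```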
Comment 9•5 months ago
When I run the browser and open about:blank, only the following 2 modules are loaded under chrome://remote/:
chrome://remote/content/components/RemoteAgent.jsm
chrome://remote/content/shared/Sync.jsm
Then, when I run awsy (./mach awsy-test), it turns out that chrome://remote/content/marionette/* modules are used there, so the above test wasn't sufficient.
Modules observed so far:
chrome://remote/content/components/Marionette.jsm
chrome://remote/content/marionette/actors/MarionetteEventsChild.jsm
chrome://remote/content/marionette/event.js
chrome://remote/content/shared/Log.jsm
I'll check the memory consumption with the modules actually used in awsy.
Comment 10•5 months ago
I don't observe the regression locally on macOS on M1 over the regression range;
actually, the memory consumption decreases from 1,978,168 bytes to 1,966,056 bytes.
I'll check on other OSes to see whether it's OS- or architecture-specific.
Comment 11•5 months ago
I observed the regression on Windows 11 x64 (1,623,336 bytes -> 1,632,007 bytes, ~0.5%).
I'll look into the details.
Comment 12•5 months ago
The regression comes from the patch for marionette: https://hg.mozilla.org/integration/autoland/rev/144dbf9b2f1b9a76102f1e685e3f36ae1d3809b7
revision | Base Content JS (bytes) |
---|---|
7ec263b6 before | 1,623,336 |
a0878c65 remote/shared | 1,621,624 |
44f0aa21 remote/components | 1,621,624 |
24894e48 remote/server | 1,621,624 |
144dbf9b remote/marionette | 1,632,008 |
2a9cd41c remote/webdriver-bidi | 1,632,008 |
407ee0d1 remote/cdp | 1,632,008 |
Comment 13•5 months ago
That would make sense given that it's Marionette that is used for AWSY (BiDi and CDP aren't used here).
What about this change:
https://hg.mozilla.org/integration/autoland/rev/144dbf9b2f1b9a76102f1e685e3f36ae1d3809b7#l23.25
WebReference was imported lazily before, but due to lint restrictions it now needs to be imported eagerly, given that it comes from the same module as the element import. Could that have caused this increase?
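For reference, a sketch of the two import styles in question (the module path is inferred from the patch context above; the exact file name and symbol layout after the conversion are assumed):

```js
// Lazy: element.sys.mjs is only loaded the first time lazy.WebReference is touched.
//   const lazy = {};
//   ChromeUtils.defineESModuleGetters(lazy, {
//     WebReference: "chrome://remote/content/marionette/element.sys.mjs",
//   });

// Eager, as the patch now does, because `element` comes from the same file and
// the lint rule disallows importing one module both statically and lazily:
import {
  element,
  WebReference,
} from "chrome://remote/content/marionette/element.sys.mjs";
```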
Comment 14•5 months ago
:jdescottes, since you are the author of the regressor, bug 1790471, could you take a look? Also, could you set the severity field?
For more information, please visit auto_nag documentation.
Comment 16•5 months ago
OK, I think we're good. It looks like, if anything, ESMs use slightly less memory. I'm not sure why the Marionette case is using slightly more memory on Windows (I suspect it's the typical hashtable size class boundary issue), but I'm also not especially worried about it since Marionette typically isn't loaded in real world use. Ideally I'd like to stop loading the content modules in the AWSY tests too.
(In reply to Henrik Skupin (:whimboo) [⌚️UTC+2] (away 10/17 - 10/20) from comment #13)
> What about this change:
> https://hg.mozilla.org/integration/autoland/rev/144dbf9b2f1b9a76102f1e685e3f36ae1d3809b7#l23.25
> WebReference was imported lazily before, but due to lint restrictions it now needs to be imported eagerly, given that it comes from the same module as the element import. Could that have caused this increase?
It's possible, since it's now being defined on the top lexical scope rather than on one of the lazy objects. It could also be because we now define all of the lazy getters via the new ESM API, or because we do it in one step, although I'd be surprised if either of those would cause a 9KiB regression.
I think it is possible that we're going to wind up generating more/larger ICs for code that references variables on the top lexical scope than we did before, since I think we always wound up just doing hashtable lookups for anything on the top lexical/object scope in JSMs (at least that's what I always saw in profiles), but I can't imagine that we'd actually ever enter JIT mode for any of this code during AWSY. It should be mostly unused.
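For context, a sketch of the getter-definition change being referred to (module names are modeled on the files listed in comment 9; the .sys.mjs name after conversion is assumed):

```js
// JSM era: lazy getters were defined via XPCOMUtils, sometimes spread over
// several calls and objects.
//   const lazy = {};
//   XPCOMUtils.defineLazyModuleGetters(lazy, {
//     Log: "chrome://remote/content/shared/Log.jsm",
//   });

// ESM era: everything goes through the new ChromeUtils API in one step, and
// anything imported eagerly (like WebReference above) becomes a top-level
// lexical binding instead of a property on a lazy object.
const lazy = {};
ChromeUtils.defineESModuleGetters(lazy, {
  Log: "chrome://remote/content/shared/Log.sys.mjs",
});
```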