0.67 - 0.67% Base Content JS / Base Content JS + 2 more (Linux, OSX, Windows) regression on Thu July 28 2022
Categories
(Core :: JavaScript Engine, defect, P2)
Tracking
()
Tracking | Status | |
---|---|---|
firefox-esr91 | --- | unaffected |
firefox-esr102 | --- | unaffected |
firefox103 | --- | unaffected |
firefox104 | --- | unaffected |
firefox105 | --- | wontfix |
firefox106 | --- | wontfix |
firefox107 | --- | wontfix |
firefox108 | --- | fixed |
People
(Reporter: bacasandrei, Assigned: sfink)
References
(Regression)
Details
(Keywords: perf, perf-alert, regression)
Attachments
(1 file)
Perfherder has detected a awsy performance regression from push b014f84dacd08dabe51ea1cc1366b94633248493. As author of one of the patches included in that push, we need your help to address this regression.
Regressions:
Ratio | Test | Platform | Options | Absolute values (old vs new) |
---|---|---|---|---|
1% | Base Content JS | linux1804-64-shippable-qr | fission | 1,605,860.00 -> 1,616,620.00 |
1% | Base Content JS | macosx1015-64-shippable-qr | fission | 1,605,864.00 -> 1,616,616.00 |
1% | Base Content JS | macosx1015-64-shippable-qr | fission | 1,605,864.00 -> 1,616,616.00 |
1% | Base Content JS | windows10-64-2004-shippable-qr | fission | 1,607,712.00 -> 1,618,448.00 |
Details of the alert can be found in the alert summary, including links to graphs and comparisons for each of the affected tests. Please follow our guide to handling regression bugs and let us know your plans within 3 business days, or the offending patch(es) may be backed out in accordance with our regression policy.
If you need the profiling jobs you can trigger them yourself from treeherder job view or ask a sheriff to do that for you.
For more information on performance sheriffing please see our FAQ.
Comment 1•2 years ago
|
||
Set release status flags based on info from the regressing bug 1774733
Assignee | ||
Comment 2•2 years ago
|
||
I am landing some followup fixes for fuzz bugs. I would like to investigate these size regressions separately and decide what to do with them. I was not expecting any size changes, but it may be causing fragmentation by temporarily allocating a bunch of space that doesn't get used.
Updated•2 years ago
|
Updated•2 years ago
|
Comment 4•2 years ago
|
||
Too late to address for 105.
Assignee | ||
Comment 5•2 years ago
|
||
I have mostly tracked this down. It's mozjemalloc slop from realloc. Consider these matching logs.
Before the change:
0 pid=1399 len=278 (good 288) built with 1024 -> usable=288
0 pid=1399 len=20 (good 32) built with 256 -> usable=256
0 pid=1399 len=21 (good 32) built with 256 -> usable=256
0 pid=1399 len=20 (good 32) built with 256 -> usable=256
0 pid=1399 len=33 (good 48) built with 512 -> usable=80
0 pid=1399 len=21 (good 32) built with 256 -> usable=256
0 pid=1399 len=26 (good 32) built with 512 -> usable=512
After the change:
2 pid=1619 len=278 (good 288) built with 2048 -> usable=288
2 pid=1619 len=20 (good 32) built with 2048 -> usable=2048
2 pid=1619 len=21 (good 32) built with 2048 -> usable=2048
2 pid=1619 len=20 (good 32) built with 2048 -> usable=2048
2 pid=1619 len=33 (good 48) built with 4096 -> usable=80
2 pid=1619 len=21 (good 32) built with 2048 -> usable=2048
2 pid=1619 len=26 (good 32) built with 4096 -> usable=4096
What this means is that when building a string that ends up being only 20 bytes, previously it built it in a 256-byte allocation and then called realloc(20). mozjemalloc chose to return the original allocation, wasting 236 bytes. The new code builds that 20 byte string in a space of 2048 bytes and again mozjemalloc returns the original allocation, wasting 2028 bytes. (Note that these numbers aren't exactly right because my logging forgot to account for 2-byte chars. Some of these numbers should be doubled. I'm not sure which ones. But it doesn't really change the picture that much.)
I can get more accurate numbers, but pbone: do you know if there's a good way to get a better size fit short of making a fresh allocation and copying it over? Which might be a good thing to do when the slop is this high, if there's no better way.
Comment 6•2 years ago
|
||
I read the realloc code yesterday and it looks like it should be moving your allocation. but since it doesn't there's something I'm missing. If you've got a single code-path that is easy to change to malloc()
and free()
rather then realloc()
. Then I'd say a temporary fix is fine, and I can test jemalloc for this behaviour soon. If not I'll try to do it sooner.
At the very least this is an opportunity for me/us to understand jemalloc a little better.
Assignee | ||
Comment 7•2 years ago
|
||
Updated•2 years ago
|
Assignee | ||
Comment 8•2 years ago
|
||
So the problem was that previously, we would build eg a 21-char string, and the doubling behavior might give a 256-byte Vector to build it in. 21 chars is smaller than the inline space in the Vector object (which is set to 64 bytes, or either 64 latin1 chars or 32 two-byte chars). That was treated as a small string, not worth reallocing—but that decision was unrelated to the amount of space wasted, only to the size of the string.
Even slightly longer strings would never run into this case, because they would pass the sMaxInlineStorage test and then the capacity - length > length / 4
condition would pass and the realloc would happen.
I changed the logic to instead look at the number of wasted bytes, and if that is too high, to check what fraction of the allocation is wasted.
The related change is that instead of using capacity - length > length / 4
, it'll now do capacity - length > capacity / 4
. The former tests whether the capacity is more than 125% of the length. The latter tests whether we're wasting more than a quarter of the allocation. The difference isn't large, but it makes more sense to me to be talking about percentage of wasted space. (The specific difference is that the old behavior would realloc if you wasted more than 80% of the space; the new will realloc if you waste more than 75%.) This change isn't strictly necessary, it mostly just makes the code easier to read.
Assignee | ||
Comment 9•2 years ago
|
||
Oh right, results from pushing that change.
# push 0: base revision, without the aggressive string resizing
0 1595056.0
# push 1: aggressive string resizing, the one that had a 1% memory regression
1 1605856.0
# push 2: with the modifications described above
2 1593824.0
Numbers are the base content JS metric reported in perfherder. This change appears to resolve the regression and even improve slightly on the original.
Comment 10•2 years ago
|
||
Comment 11•2 years ago
|
||
bugherder |
Comment 12•2 years ago
|
||
Since nightly and release are affected, beta will likely be affected too.
For more information, please visit auto_nag documentation.
Comment 13•2 years ago
|
||
The patch landed in nightly and beta is affected.
:sfink, is this bug important enough to require an uplift?
- If yes, please nominate the patch for beta approval.
- If no, please set
status-firefox107
towontfix
.
For more information, please visit auto_nag documentation.
Assignee | ||
Comment 14•2 years ago
|
||
I don't think it's worth backporting. It's not that big of a regression (or improvement over the original), and I don't anticipate changing this code in a way that I'd want to backport for another reason.
Reporter | ||
Comment 15•2 years ago
|
||
(In reply to Pulsebot from comment #10)
Pushed by sfink@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/e73d6b8edd64
Realloc strings based on #bytes wasted, not #bytes used. r=jandem
== Change summary for alert #35721 (as of Fri, 21 Oct 2022 04:23:52 GMT) ==
Improvements:
Ratio | Test | Platform | Options | Absolute values (old vs new) |
---|---|---|---|---|
1% | Base Content JS | windows10-64-2004-shippable-qr | fission | 1,638,482.00 -> 1,626,448.00 |
0.43% | Base Content JS | windows10-64-2004-shippable-qr | fission | 1,633,503.11 -> 1,626,422.00 |
0.41% | Base Content JS | linux1804-64-shippable-qr | fission | 1,627,152.00 -> 1,620,440.00 |
0.41% | Base Content JS | macosx1015-64-shippable-qr | fission | 1,626,644.00 -> 1,620,040.00 |
For up to date results, see: https://treeherder.mozilla.org/perfherder/alerts?id=35721
Description
•