damp source-map.simple.DAMP + 7 more (Linux, OSX, Windows) regression on Tue March 19 2024
Categories
(Core :: JavaScript Engine, defect, P3)
Tracking
()
Tracking | Status | |
---|---|---|
firefox-esr115 | --- | unaffected |
firefox124 | --- | unaffected |
firefox125 | --- | unaffected |
firefox126 | --- | wontfix |
firefox127 | --- | wontfix |
firefox128 | --- | fixed |
People
(Reporter: nchevobbe, Assigned: jlink)
References
(Blocks 1 open bug, Regression)
Details
(Keywords: perf, perf-alert, regression)
There was some regression on our performance test, in particular the ones for Debugger pretty printing, and the ones on the sourcemap library.
TRY seems to indicate that Bug 1879143 caused this: https://treeherder.mozilla.org/perfherder/comparesubtest?originalProject=try&newProject=try&newRevision=47c7868678f4ab3b2aa607d08cee4a9e10999a8e&originalSignature=4943452&newSignature=4943452&framework=12&application=firefox&originalRevision=bab2bd508f36fcde9dda71390e045a40c45feb2d&page=1
Perfherder has detected a devtools performance regression from push 7c0de707e2a448dd3846c81f12ee9bae3ebca9d3. As author of one of the patches included in that push, we need your help to address this regression.
Regressions:
Ratio | Test | Platform | Options | Absolute values (old vs new) |
---|---|---|---|---|
29% | damp custom.pretty-print.jsdebugger.reload.DAMP | macosx1015-64-shippable-qr | e10s fission stylo webrender-sw | 7,659.93 -> 9,860.87 |
28% | damp custom.jsdebugger.pretty-print.reload-and-pause.DAMP | macosx1015-64-shippable-qr | e10s fission stylo webrender-sw | 7,703.58 -> 9,876.41 |
16% | damp source-map.constructor.DAMP | windows10-64-shippable-qr | e10s fission stylo webrender | 93.96 -> 108.69 |
14% | damp source-map.constructor.DAMP | windows10-64-shippable-qr | e10s fission stylo webrender-sw | 92.17 -> 105.53 |
13% | damp source-map.constructor.DAMP | macosx1015-64-shippable-qr | e10s fission stylo webrender-sw | 90.69 -> 102.76 |
13% | damp source-map.constructor.DAMP | linux1804-64-shippable-qr | e10s fission stylo webrender | 111.89 -> 126.39 |
11% | damp source-map.constructor.DAMP | macosx1015-64-shippable-qr | e10s fission stylo webrender | 92.17 -> 102.70 |
11% | damp source-map.constructor.DAMP | linux1804-64-shippable-qr | e10s fission stylo webrender-sw | 112.27 -> 124.48 |
4% | damp source-map.simple.DAMP | linux1804-64-shippable-qr | e10s fission stylo webrender | 224.63 -> 232.97 |
3% | damp source-map.simple.DAMP | linux1804-64-shippable-qr | e10s fission stylo webrender-sw | 221.26 -> 228.48 |
3% | damp source-map.constructor.DAMP | linux1804-64-shippable-qr | e10s fission stylo webrender | 111.84 -> 114.81 |
Improvements:
Ratio | Test | Platform | Options | Absolute values (old vs new) |
---|---|---|---|---|
5% | damp console.log-in-loop-content-process-nodelist | macosx1015-64-shippable-qr | e10s fission stylo webrender-sw | 192.22 -> 183.20 |
2% | damp console.content-process-bulklog | macosx1015-64-shippable-qr | e10s fission stylo webrender | 506.50 -> 494.82 |
Details of the alert can be found in the alert summary, including links to graphs and comparisons for each of the affected tests. Please follow our guide to handling regression bugs and let us know your plans within 3 business days, or the patch(es) may be backed out in accordance with our regression policy.
If you need the profiling jobs you can trigger them yourself from treeherder job view or ask a sheriff to do that for you.
You can run these tests on try with ./mach try perf --alert 41942
For more information on performance sheriffing please see our FAQ.
Comment 1•3 months ago
|
||
Set release status flags based on info from the regressing bug 1879143
Assignee | ||
Comment 2•3 months ago
|
||
Btw I am starting to take a look at this now. I am on Windows so I am going to see if I can reproduce the source-map.constructor differences locally.
It's really strange that the custom.pretty-print regression is so severe but only on Mac.
Assignee | ||
Comment 3•3 months ago
•
|
||
Some notes/observations on the cited pretty-print tests:
- The graphs indicate that they are tri-model.
- Prior to the time around when my commit landed, the fastest of the three "modes" was the most common and dominated.
- Sometime around when my commit landed (but possibly somewhat before it) the slowest of the three modes started becoming noticeably more common than before.
- Sometime very shortly after my change, the middle mode started becoming more common.
- Sometime after my change, maybe about a day later, the faster of the modes became very uncommon and now the test is dominated by the slowest mode.
It does seem like something real has changed wrt the pretty-print test (and it looks to me like there were two distinct changes). My questions:
- To what extent are we certain that any pretty-print test changes were caused by the set of commits associated with this alert?
- To what extent are we certain that my specific change is responsible for any of the pretty-print test changes?
Assignee | ||
Comment 4•3 months ago
•
|
||
I will be continuing to look at this but, from what I can tell, there is a regression from my commit on the pretty-print tests on macOS, but it is not nearly as significant as shown in the alert above.
I used ./mach perf --alert 41942 --rebuild 20 on both my commit, and the commit right before it, to run these tests again in CI. Because I don't know what I'm doing, that gave me two comparison reports - one comparing my commit against itself and another doing the same for the previous commit.
If you look between those two comparison reports, it looks like both of those tests went from ~8500ms to ~8800ms, which is about a 3.5% regression.
The graphs from all of those runs are still tri-modal and visually they look almost identical. I'm not sure that that difference is even really related to the change or just to the randomness of the modality.
Updated•3 months ago
|
Comment 5•3 months ago
|
||
Justin, next week is the final week of nightly before Fx126 hits beta.
Any updates since Comment 4, is this something you want to address in the Fx126 cycle?
Assignee | ||
Comment 6•3 months ago
|
||
Hi Donal,
Yup, this regression is actually what I have been focused on for the last week although I hadn't posted any updates here.
- I think that the regressions reported with pretty-print on macOS are real but are mostly not related to my change. It would nice to have someone else take a look at the data though and validate that conclusion though. Nicolas Chevobbe: Would you mind taking a look at my comments 3 and 4 and the evidence there and see if you reach the same conclusion or if you think that I'm missing something?
- The damp.source-map.constructor regressions are real and are caused by my change. I am able to reproduce this regression locally but am still working through why exactly this is. A cache is designed to speed up future access of data and, in the constructor test cases, we are mostly just creating stuff and then destroying it and not really doing anything interesting so it seems that there's possibly a bit more overhead with doing that with the new cache than with the old. It seems like the overhead should be similar to the previous version, however. There seems to be a lot of (mostly ineffective?) GC going on though so perhaps it's related to that.
At any rate, I am still looking to figure out what's going on here and have plenty to look at/experiment with. I don't think that the current state should be particularly problematic though as the new cache seems to be working correctly and is providing some improvements in the more intense use cases. I'll just have to figure out why there's a negative change here.
Reporter | ||
Comment 7•3 months ago
|
||
(In reply to Justin Link from comment #6)
- I think that the regressions reported with pretty-print on macOS are real but are mostly not related to my change. It would nice to have someone else take a look at the data though and validate that conclusion though. Nicolas Chevobbe: Would you mind taking a look at my comments 3 and 4 and the evidence there and see if you reach the same conclusion or if you think that I'm missing something?
I agree, the test alternates between different values, even before your patch. We should investigate on our side to figure out what's going on and make it more stable.
Assignee | ||
Updated•3 months ago
|
Comment 8•2 months ago
|
||
Set release status flags based on info from the regressing bug 1879143
Comment 9•2 months ago
|
||
Justin, Fx126 is now in Beta. Do you plan on addressing this during the Beta cycle in time for Fx126 release?
Assignee | ||
Comment 10•2 months ago
|
||
I've spent a good amount of time studying this situation and I'm about to be on PTO for a bit so this might be a good time to post a summary of my findings so far:
- The regression in source-map.constructor.DAMP is readily reproducible for me locally (on Windows) and the numbers are pleasantly consistent from run to test. For testing I edited the js source files locally to only run the source-map.constructor.DAMP unit test and increased the number of times that it runs.
- In that test, ~8,000 strings are atomized and then looked up an additional ~275,000 time. With the previous version of the "cache" there is, as expected, a 100% hit rate on those look-ups. With the new cache, about 150,000 atoms are found in the cache.
- When an atom is not found in the cache, the list of permanent atoms is checked. If not found in the list of permanent atoms, the master list is checked and the atom is created if not found in the master list.
- If we reach the master list, the atom is marked.
- Regular GC does not run during this the test so that is not a factor.
With associativity implemented see bug 1893690, particularly with 4-way associativity implemented, the hit rate gets pretty good in this test and a large portion of the performance regression in source-map.constructor.DAMP is un-regressed. SP3 also likes both 2-way and 4-way associative caches so this is likely a good change overall. (See this comment for perf comparisons.) The other DAMP tests have mixed opinions of the change, however. Some distinctly like it however some others do not like it. I would like to take a look at some of these other tests and find out why this is the case.
In regards to the specific question about, the "fix" is still being worked on and will not make it into 126.
Updated•2 months ago
|
Assignee | ||
Comment 11•1 month ago
|
||
With the landing of bug 1893690 I expect that most of the source-map.constructor and source-map.simple regressions will have gone away. In general, some of the DAMP tests go up a little bit with these changes and some go down a little bit.
Reporter | ||
Comment 12•1 month ago
|
||
Bug 1893690 didn't fully recovered source-map.constructor
, but that's good enough IMO.
Reporter | ||
Updated•1 month ago
|
Updated•1 month ago
|
Description
•