Closed Bug 1888676 Opened 3 months ago Closed 1 month ago

damp source-map.simple.DAMP + 7 more (Linux, OSX, Windows) regression on Tue March 19 2024

Categories

(Core :: JavaScript Engine, defect, P3)

defect

Tracking

()

RESOLVED FIXED
128 Branch
Tracking Status
firefox-esr115 --- unaffected
firefox124 --- unaffected
firefox125 --- unaffected
firefox126 --- wontfix
firefox127 --- wontfix
firefox128 --- fixed

People

(Reporter: nchevobbe, Assigned: jlink)

References

(Blocks 1 open bug, Regression)

Details

(Keywords: perf, perf-alert, regression)

There was some regression on our performance test, in particular the ones for Debugger pretty printing, and the ones on the sourcemap library.
TRY seems to indicate that Bug 1879143 caused this: https://treeherder.mozilla.org/perfherder/comparesubtest?originalProject=try&newProject=try&newRevision=47c7868678f4ab3b2aa607d08cee4a9e10999a8e&originalSignature=4943452&newSignature=4943452&framework=12&application=firefox&originalRevision=bab2bd508f36fcde9dda71390e045a40c45feb2d&page=1


Perfherder has detected a devtools performance regression from push 7c0de707e2a448dd3846c81f12ee9bae3ebca9d3. As author of one of the patches included in that push, we need your help to address this regression.

Regressions:

Ratio Test Platform Options Absolute values (old vs new)
29% damp custom.pretty-print.jsdebugger.reload.DAMP macosx1015-64-shippable-qr e10s fission stylo webrender-sw 7,659.93 -> 9,860.87
28% damp custom.jsdebugger.pretty-print.reload-and-pause.DAMP macosx1015-64-shippable-qr e10s fission stylo webrender-sw 7,703.58 -> 9,876.41
16% damp source-map.constructor.DAMP windows10-64-shippable-qr e10s fission stylo webrender 93.96 -> 108.69
14% damp source-map.constructor.DAMP windows10-64-shippable-qr e10s fission stylo webrender-sw 92.17 -> 105.53
13% damp source-map.constructor.DAMP macosx1015-64-shippable-qr e10s fission stylo webrender-sw 90.69 -> 102.76
13% damp source-map.constructor.DAMP linux1804-64-shippable-qr e10s fission stylo webrender 111.89 -> 126.39
11% damp source-map.constructor.DAMP macosx1015-64-shippable-qr e10s fission stylo webrender 92.17 -> 102.70
11% damp source-map.constructor.DAMP linux1804-64-shippable-qr e10s fission stylo webrender-sw 112.27 -> 124.48
4% damp source-map.simple.DAMP linux1804-64-shippable-qr e10s fission stylo webrender 224.63 -> 232.97
3% damp source-map.simple.DAMP linux1804-64-shippable-qr e10s fission stylo webrender-sw 221.26 -> 228.48
3% damp source-map.constructor.DAMP linux1804-64-shippable-qr e10s fission stylo webrender 111.84 -> 114.81

Improvements:

Ratio Test Platform Options Absolute values (old vs new)
5% damp console.log-in-loop-content-process-nodelist macosx1015-64-shippable-qr e10s fission stylo webrender-sw 192.22 -> 183.20
2% damp console.content-process-bulklog macosx1015-64-shippable-qr e10s fission stylo webrender 506.50 -> 494.82

Details of the alert can be found in the alert summary, including links to graphs and comparisons for each of the affected tests. Please follow our guide to handling regression bugs and let us know your plans within 3 business days, or the patch(es) may be backed out in accordance with our regression policy.

If you need the profiling jobs you can trigger them yourself from treeherder job view or ask a sheriff to do that for you.

You can run these tests on try with ./mach try perf --alert 41942

For more information on performance sheriffing please see our FAQ.

Flags: needinfo?(jlink)

Set release status flags based on info from the regressing bug 1879143

Btw I am starting to take a look at this now. I am on Windows so I am going to see if I can reproduce the source-map.constructor differences locally.

It's really strange that the custom.pretty-print regression is so severe but only on Mac.

Some notes/observations on the cited pretty-print tests:

  1. The graphs indicate that they are tri-model.
  2. Prior to the time around when my commit landed, the fastest of the three "modes" was the most common and dominated.
  3. Sometime around when my commit landed (but possibly somewhat before it) the slowest of the three modes started becoming noticeably more common than before.
  4. Sometime very shortly after my change, the middle mode started becoming more common.
  5. Sometime after my change, maybe about a day later, the faster of the modes became very uncommon and now the test is dominated by the slowest mode.

It does seem like something real has changed wrt the pretty-print test (and it looks to me like there were two distinct changes). My questions:

  1. To what extent are we certain that any pretty-print test changes were caused by the set of commits associated with this alert?
  2. To what extent are we certain that my specific change is responsible for any of the pretty-print test changes?
Flags: needinfo?(jlink)

I will be continuing to look at this but, from what I can tell, there is a regression from my commit on the pretty-print tests on macOS, but it is not nearly as significant as shown in the alert above.

I used ./mach perf --alert 41942 --rebuild 20 on both my commit, and the commit right before it, to run these tests again in CI. Because I don't know what I'm doing, that gave me two comparison reports - one comparing my commit against itself and another doing the same for the previous commit.

If you look between those two comparison reports, it looks like both of those tests went from ~8500ms to ~8800ms, which is about a 3.5% regression.

The graphs from all of those runs are still tri-modal and visually they look almost identical. I'm not sure that that difference is even really related to the change or just to the randomness of the modality.

Blocks: sm-js-perf
Severity: -- → S3
Priority: -- → P3

Justin, next week is the final week of nightly before Fx126 hits beta.
Any updates since Comment 4, is this something you want to address in the Fx126 cycle?

Flags: needinfo?(jlink)

Hi Donal,
Yup, this regression is actually what I have been focused on for the last week although I hadn't posted any updates here.

  1. I think that the regressions reported with pretty-print on macOS are real but are mostly not related to my change. It would nice to have someone else take a look at the data though and validate that conclusion though. Nicolas Chevobbe: Would you mind taking a look at my comments 3 and 4 and the evidence there and see if you reach the same conclusion or if you think that I'm missing something?
  2. The damp.source-map.constructor regressions are real and are caused by my change. I am able to reproduce this regression locally but am still working through why exactly this is. A cache is designed to speed up future access of data and, in the constructor test cases, we are mostly just creating stuff and then destroying it and not really doing anything interesting so it seems that there's possibly a bit more overhead with doing that with the new cache than with the old. It seems like the overhead should be similar to the previous version, however. There seems to be a lot of (mostly ineffective?) GC going on though so perhaps it's related to that.

At any rate, I am still looking to figure out what's going on here and have plenty to look at/experiment with. I don't think that the current state should be particularly problematic though as the new cache seems to be working correctly and is providing some improvements in the more intense use cases. I'll just have to figure out why there's a negative change here.

Flags: needinfo?(jlink) → needinfo?(nchevobbe)

(In reply to Justin Link from comment #6)

  1. I think that the regressions reported with pretty-print on macOS are real but are mostly not related to my change. It would nice to have someone else take a look at the data though and validate that conclusion though. Nicolas Chevobbe: Would you mind taking a look at my comments 3 and 4 and the evidence there and see if you reach the same conclusion or if you think that I'm missing something?

I agree, the test alternates between different values, even before your patch. We should investigate on our side to figure out what's going on and make it more stable.

Flags: needinfo?(nchevobbe)
Assignee: nobody → jlink

Set release status flags based on info from the regressing bug 1879143

Justin, Fx126 is now in Beta. Do you plan on addressing this during the Beta cycle in time for Fx126 release?

Flags: needinfo?(jlink)
Depends on: 1893690

I've spent a good amount of time studying this situation and I'm about to be on PTO for a bit so this might be a good time to post a summary of my findings so far:

  • The regression in source-map.constructor.DAMP is readily reproducible for me locally (on Windows) and the numbers are pleasantly consistent from run to test. For testing I edited the js source files locally to only run the source-map.constructor.DAMP unit test and increased the number of times that it runs.
  • In that test, ~8,000 strings are atomized and then looked up an additional ~275,000 time. With the previous version of the "cache" there is, as expected, a 100% hit rate on those look-ups. With the new cache, about 150,000 atoms are found in the cache.
  • When an atom is not found in the cache, the list of permanent atoms is checked. If not found in the list of permanent atoms, the master list is checked and the atom is created if not found in the master list.
  • If we reach the master list, the atom is marked.
  • Regular GC does not run during this the test so that is not a factor.

With associativity implemented see bug 1893690, particularly with 4-way associativity implemented, the hit rate gets pretty good in this test and a large portion of the performance regression in source-map.constructor.DAMP is un-regressed. SP3 also likes both 2-way and 4-way associative caches so this is likely a good change overall. (See this comment for perf comparisons.) The other DAMP tests have mixed opinions of the change, however. Some distinctly like it however some others do not like it. I would like to take a look at some of these other tests and find out why this is the case.

In regards to the specific question about, the "fix" is still being worked on and will not make it into 126.

Flags: needinfo?(jlink)

With the landing of bug 1893690 I expect that most of the source-map.constructor and source-map.simple regressions will have gone away. In general, some of the DAMP tests go up a little bit with these changes and some go down a little bit.

Bug 1893690 didn't fully recovered source-map.constructor, but that's good enough IMO.

Status: NEW → RESOLVED
Closed: 1 month ago
Resolution: --- → FIXED
Summary: 28.73 - 3.71% damp custom.pretty-print.jsdebugger.reload.DAMP / damp source-map.simple.DAMP + 7 more (Linux, OSX, Windows) regression on Tue March 19 2024 → damp source-map.simple.DAMP + 7 more (Linux, OSX, Windows) regression on Tue March 19 2024
Target Milestone: --- → 128 Branch
You need to log in before you can comment on or make changes to this bug.