Closed Bug 1796685 Opened 2 years ago Closed 2 years ago

7.14% tsvg_static (Windows) regression on Mon October 17 2022

Categories

(Firefox :: Firefox View, defect)

Tracking

RESOLVED INVALID
Tracking Status
firefox-esr102 --- unaffected
firefox106 --- unaffected
firefox107 --- wontfix
firefox108 --- wontfix

People

(Reporter: aglavic, Unassigned)

References

(Regression)

Details

(4 keywords)

Attachments

(3 files)

Perfherder has detected a talos performance regression from push 81721704c6ccdd0ec70eb5963a38e10b586169e9. Since you authored one of the patches included in that push, we need your help to address this regression.

Regressions:

Ratio Test Platform Options Absolute values (old vs new)
7% tsvg_static windows10-64-shippable-qr e10s fission stylo webrender-sw 49.54 -> 53.08

Details of the alert can be found in the alert summary, including links to graphs and comparisons for each of the affected tests. Please follow our guide to handling regression bugs and let us know your plans within 3 business days, or the offending patch(es) may be backed out in accordance with our regression policy.

If you need the profiling jobs, you can trigger them yourself from the Treeherder job view or ask a sheriff to do that for you.

For more information on performance sheriffing please see our FAQ.

Flags: needinfo?(tgiles)

Set release status flags based on info from the regressing bug 1794474

I think this test is weirdly bimodal. It's also not really obvious to me what the test measures. I read the documentation; it says:

> data: we load the 5 svg pages 25 times, resulting in 5 sets of 25 data points
>
> summarization: An svg-only number that measures SVG rendering

But that doesn't tell me anything useful. Is this a memory measure? CPU? Responsiveness? Energy use? Something else?
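(For what it's worth, here's a rough sketch of what "5 sets of 25 data points" reduced to one number could look like. This is NOT the actual Talos summarization code; the warm-up drop, per-page median, and cross-page mean are all assumptions for illustration only.)

```python
import statistics

def summarize(per_page_replicates, warmup=1):
    """Hypothetical reduction of '5 pages x 25 loads' into one score.

    per_page_replicates: list of 5 lists, each holding 25 timings (ms).
    Drops warm-up replicates per page, takes each page's median, then
    averages the per-page medians. Illustrative only, not Talos code.
    """
    page_medians = [
        statistics.median(replicates[warmup:])  # skip warm-up load(s)
        for replicates in per_page_replicates
    ]
    return statistics.mean(page_medians)

# Example: 5 pages, 25 identical replicates each, for demonstration.
pages = [[50.0] * 25 for _ in range(5)]
print(summarize(pages))  # -> 50.0
```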

I also don't think this test result should have changed as a result of changes entirely contained in the firefox view directory: https://hg.mozilla.org/mozilla-central/rev/81721704c6cc given those are unrelated to SVGs.

:dholbert, :jwatt, can you help?

Flags: needinfo?(jwatt)
Flags: needinfo?(dholbert)
Flags: needinfo?(tgiles)
Attached image image.png

> I think this test is weirdly bimodal

It's worse than that; there are also occasional much-more-extreme outliers, with values on the order of 178 or 228 (much higher than the "bad" 53.08 value that this bug was filed for). See the attached screenshot, taken from the graph icon-link in the alert, showing the past 6 months.

> It's also not really obvious to me what the test measures.

I forget precisely, but it's some measure of paint time.

(Historical note: this test used to be part of tsvgx; we split this part out into its own test in bug 1318530, for reasons related to "ASAP" mode and paint-suppression-removal, as discussed in bug 1317048 comment 15.)

> I also don't think this test result should have changed as a result of changes entirely contained in the firefox view directory: https://hg.mozilla.org/mozilla-central/rev/81721704c6cc given those are unrelated to SVGs.

Agreed. If it looks like this is just slightly-anomalous oscillation among the more-frequent bimodal values, then I think we can just ignore this and move on.

Flags: needinfo?(dholbert)

Here's the graph with red arrows pointing to the data-points for the two revisions that were flagged in the alert that prompted this bug.

The data does seem to be biased towards the lower bimodal value at the start of this range, vs. evenly distributed between low/high at the end. So it does superficially look like a regression overall, and Perfherder was arguably reasonable to flag it as an interesting possible regression (though I suspect it's noise, and I'm spamming retriggers to test that theory). But it's not at all obvious why Perfherder chose this particular single commit as the "guilty" one. That merits further investigation into Perfherder blame assignment, I think. I would've expected a broader "potentially-guilty pushlog" that included the earlier commits with the higher measurements shown on this graph, rather than the single-push attribution that we got in comment 0.

(For future reference, this is the link to the graph that I was looking at in comment 4:
https://treeherder.mozilla.org/perfherder/graphs?highlightAlerts=1&highlightChangelogData=1&highlightCommonAlerts=0&series=autoland,4086121,1,1&timerange=1209600&zoom=1666025942154,1666032538381,45.08356106708301,56.73211683320326

If you load it now, there will be more data-points than were shown in my comment 4 screenshot, due to some of my retriggers having completed. More data will be coming in as additional retriggers complete, too.)

Also, if we go back in time a couple of days, we have a Perfherder alert for an "improvement" that's nearly the reverse of comment 0, i.e. this metric oscillated in the opposite "good" direction shortly before it oscillated back in the "bad" direction as reported here.

That alert was https://treeherder.mozilla.org/perfherder/alerts?id=35710 and was filed for a change between two commits on Oct 15, going from a tsvg_static value of 52.7 to 49.5, nearly the exact same values as reported in comment 0.

Retriggers of this job on nearby commits seem to produce the same spread (with a low of ~49 and a high of ~54). It looks like we just got a run of "lucky" low measurements in a row here.

So: There's not actually any real regression here, AFAICT; retriggers have shown that this was just a false signal within some noisy data.
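(To make the false-signal mechanism concrete, here's a quick back-of-the-envelope simulation. The ~49.5/~53 modes are loosely based on this bug's graph; the 50/50 coin-flip model and the 6-run window are assumptions, not a description of how Perfherder's detector actually works.)

```python
import random

random.seed(0)

def run():
    """One simulated test run: land near one of two modes, plus a little noise."""
    mode = random.choice([49.5, 53.0])
    return mode + random.gauss(0, 0.3)

# Probability that 6 consecutive runs all land in the low mode, which a
# step-detector could read as an "improvement" (or, when the coin flips
# back, a later "regression"):
trials = 100_000
hits = sum(all(run() < 51 for _ in range(6)) for _ in range(trials))
print(hits / trials)  # roughly (1/2)**6 ≈ 0.016, i.e. not rare at scale
```

With hundreds of pushes per week, an event with ~1.6% per-window probability will show up regularly, producing exactly this kind of paired "improvement"/"regression" alert with no underlying code change.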

Status: NEW → RESOLVED
Closed: 2 years ago
Resolution: --- → INVALID
Flags: needinfo?(jwatt)
