4.55 - 5.15% tp5o responsiveness / tp5o_webext responsiveness (linux64) regression on push 367ca1e168537c05f091c766e11b2c8338621cba

VERIFIED FIXED in Firefox 67

Status

()

defect
VERIFIED FIXED
5 months ago
4 months ago

People

(Reporter: igoldan, Assigned: bzbarsky)

Tracking

(Blocks 1 bug, {perf, regression, talos-regression})

Trunk
mozilla67
x86_64
Linux
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(firefox-esr60 unaffected, firefox65 unaffected, firefox66 unaffected, firefox67 fixed)

Details

Attachments

(1 attachment)

Talos has detected a Firefox performance regression from push:

https://hg.mozilla.org/integration/autoland/pushloghtml?fromchange=1cfd69d05aa1cd508266c37c1b27ce061bec7bb3&tochange=367ca1e168537c05f091c766e11b2c8338621cba

As author of one of the patches included in that push, we need your help to address this regression.

Regressions:

5% tp5o responsiveness linux64 opt e10s stylo 1.25 -> 1.31
5% tp5o_webext responsiveness linux64 opt e10s stylo 1.59 -> 1.66

You can find links to graphs and comparison views for each of the above tests at: https://treeherder.mozilla.org/perf.html#/alerts?id=19419

On the page above you can see an alert for each affected platform as well as a link to a graph showing the history of scores for this test. There is also a link to a treeherder page showing the Talos jobs in a pushlog format.

To learn more about the regressing test(s), please see: https://wiki.mozilla.org/Performance_sheriffing/Talos/Tests

For information on reproducing and debugging the regression, either on try or locally, see: https://wiki.mozilla.org/Performance_sheriffing/Talos/Running

*** Please let us know your plans within 3 business days, or the offending patch(es) will be backed out! ***

Our wiki page outlines the common responses and expectations: https://wiki.mozilla.org/Performance_sheriffing/Talos/RegressionBugsHandling

Component: General → DOM
Product: Testing → Core
Flags: needinfo?(bzbarsky)

Er... I never got any bugmail from this bug. I'll look into it now.

OK, so this is the same test that we already know is fairly bogus from bug 1517758, right? :(

How do I see the actual values that are going into the final score? Especially so I can correlate them to the actual pageloads, since the profiles are per-pageload, not overall? I don't see anything at https://wiki.mozilla.org/Performance_sheriffing/Talos/Tests#Responsiveness or https://treeherder.mozilla.org/perf.html#/alerts?id=19419 that provides that information...

Assuming all the source code for this harness lives in mozilla-central, it doesn't seem to be using "document-element-inserted" or "DOMDocElementInserted", so it's not clear offhand why it would be affected here...

Flags: needinfo?(igoldan)

And especially why it would be affected on only one platform.

(In reply to Boris Zbarsky [:bzbarsky, bz on IRC] from comment #3)

How do I see the actual values that are going into the final score? Especially so I can correlate them to the actual pageloads, since the profiles are per-pageload, not overall? I don't see anything at https://wiki.mozilla.org/Performance_sheriffing/Talos/Tests#Responsiveness or https://treeherder.mozilla.org/perf.html#/alerts?id=19419 that provides that information...

There may be another way to view the replicates, but one way:

From the alert in Perfherder click the link to the graph for one of the tests. Then, click the highlighted datapoint in the graph, and click the job link in the tooltip. This will take you to the job in Treeherder. From here you can find a link to the perfherder_data.json from the Job Details panel.

I'm hoping there's a quicker way that :igoldan can share. If not, I think we need to introduce one! :)

Otherwise we get a performance regression on tp5o responsiveness and
tp5o_webext responsiveness when about:blank starts firing
document-element-inserted notifications.

Dave, thank you! I do think it would be good to have a better way of getting at this sort of data, yes. ;)

Flags: needinfo?(bzbarsky)
Pushed by bzbarsky@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/1033b5aecacc
Don't inject customElements.js into system-principal about:blank.  r=bgrins
Status: NEW → RESOLVED
Closed: 5 months ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla67
Assignee: nobody → bzbarsky

(In reply to Dave Hunt [:davehunt] [he/him] ⌚️UTC from comment #5)

I'm hoping there's a quicker way that :igoldan can share. If not, I think we need to introduce one! :)

tp5o responsivenss is a subtest. In general, any subtests should come along with its replicates; this one doesn't have them provided in the JSON response Perfherder gets, nor in the Compare subtests view. It's possible Talos collects them while running and just computes & outputs the final result. Given that it would have somewhere around 8000+ replicate values (according to the docs), it's possible this was something desired.

I will look further into this and sync with the test owners.

Flags: needinfo?(igoldan)

After a while, Perfherder detected a 3% sessionrestore regression also attributed to bug 1528146.
I looked over the fix from comment 10, but it seems it didn't resolve any of these perf issues.
The only improvements I found were caused by a separate bug 1351078.

Status: RESOLVED → REOPENED
Resolution: FIXED → ---

Can we please file a separate bug for that? It's going to need a complete different fix, and probably a different owner.

Status: REOPENED → RESOLVED
Closed: 5 months ago5 months ago
Resolution: --- → FIXED

Oh, and the fix I pushed definitely resolved the responsiveness issue in my try pushes. Are you saying it didn't resolve it on the non-try trees?

Flags: needinfo?(igoldan)

(In reply to Boris Zbarsky [:bzbarsky, bz on IRC] from comment #15)

Oh, and the fix I pushed definitely resolved the responsiveness issue in my try pushes. Are you saying it didn't resolve it on the non-try trees?

Yes, that's what I'm thinking. According to this graph which highlights bug 1528762's landing, the push didn't cause any perf win on responsiveness. Now, there are some possible explanations: either bug 1351078 is related to the fix from comment 10 and resolved it first or there's something fishy going on with the PGO build that's not catching the fix immediately or another regression landed near our fix and confused me.

Flags: needinfo?(igoldan)

I'm looking closer over that graph portion and will keep you updated.

the push didn't cause any perf win on responsiveness.

Just to be sure we're on the same page, the order of events here is:

  1. Bug 1528146 lands, causes a responsiveness regression.
  2. Bug 1528146 is backed out because I didn't get bugmail for this bug and hence didn't respond to it.
  3. Bug 1528146 re-lands in the same push as the fix for this bug.

So I would expect a responsiveness improvement when (2) happens and no changes in responsiveness when (3) happens, right?

(2) and (3) were about 21 hours apart on m-c, I believe...

There shouldn't have been a perf win when this landed, since the original patch was backed out. It should have simply been perf-neutral, since the fix landed in the same push as the re-landing of the patch that caused the regression. And it pretty clearly was perf-neutral given the number of retriggers.

(In reply to Kris Maglione [:kmag] from comment #19)

There shouldn't have been a perf win when this landed, since the original patch was backed out. It should have simply been perf-neutral, since the fix landed in the same push as the re-landing of the patch that caused the regression. And it pretty clearly was perf-neutral given the number of retriggers.

Yes, that's what actually happened. There wasn't any other improvement to be expected. I didn't took into account the backout...

Status: RESOLVED → VERIFIED
Blocks: 1530948
Component: DOM → DOM: Core & HTML
You need to log in before you can comment on or make changes to this bug.