2.18 - 2.37% sessionrestore / sessionrestore_no_auto_restore / ts_paint (windows8-64) regression on push c756f91d7c0901b59651e37288757033baac824c (Mon Jan 30 2017)

RESOLVED WONTFIX

Status

Testing
Talos

People

(Reporter: jmaher, Unassigned)

Tracking

(Blocks: 1 bug, {perf, regression, talos-regression})

53 Branch

Firefox Tracking Flags

(Not tracked)

Details

(Reporter)

Description

a year ago
Talos has detected a Firefox performance regression from push edf7c9a320c2f0a6be889628ba6f374265cdb261. As author of one of the patches included in that push, we need your help to address this regression.

Regressions:

  2%  sessionrestore windows8-64 pgo                     683.54 -> 699.75
  2%  ts_paint windows8-64 pgo                           798.83 -> 817.42
  2%  sessionrestore_no_auto_restore windows8-64 pgo     704.79 -> 720.67
  2%  sessionrestore windows8-64 pgo                     614.79 -> 628.17
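
The "2%" figures above are rounded. As a rough check, the exact percentages can be recomputed from the before/after values in the table. The sketch below does that, and the results should span the 2.18 - 2.37% range quoted in the bug summary. The function and variable names are illustrative only and are not part of Talos or Perfherder.

```python
# Before/after values copied from the regression table above.
# Names here are illustrative; this is not Talos or Perfherder code.
REGRESSIONS = [
    ("sessionrestore", 683.54, 699.75),
    ("ts_paint", 798.83, 817.42),
    ("sessionrestore_no_auto_restore", 704.79, 720.67),
    ("sessionrestore", 614.79, 628.17),
]


def percent_change(before, after):
    """Return the relative change from before to after, in percent."""
    return (after - before) / before * 100.0


for test_name, before, after in REGRESSIONS:
    print(f"{test_name}: {percent_change(before, after):.2f}%")
```

Higher values are worse for these startup tests, so a positive change here is a regression.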


You can find links to graphs and comparison views for each of the above tests at: https://treeherder.mozilla.org/perf.html#/alerts?id=4980

On the page above you can see an alert for each affected platform as well as a link to a graph showing the history of scores for this test. There is also a link to a treeherder page showing the Talos jobs in a pushlog format.

To learn more about the regressing test(s), please see: https://wiki.mozilla.org/Buildbot/Talos/Tests

For information on reproducing and debugging the regression, either on try or locally, see: https://wiki.mozilla.org/Buildbot/Talos/Running

*** Please let us know your plans within 3 business days, or the offending patch(es) will be backed out! ***

Our wiki page outlines the common responses and expectations: https://wiki.mozilla.org/Buildbot/Talos/RegressionBugsHandling
(Reporter)

Comment 1

a year ago
Three patches landed at the same time, so it is hard to tell which one caused the regression. They all landed a week ago, which implies the window for forcing a backout has already passed. Narrowing this down, here is the commit range:
https://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?fromchange=18c7c3e4963996a8be5960d9f5a8b172db0aa1a2&tochange=c756f91d7c0901b59651e37288757033baac824c

These are windows8-only failures, and we only get the pgo failures automatically; the non-pgo runs show smaller regressions, below our 2% threshold.

My first guess is that this is related to the d3d11 blacklist, although these are startup-only regressions, so I am a bit at a loss. Could anyone I am cc'ing here help identify why this might be happening?
Summary: 2.18 - 2.37% sessionrestore / sessionrestore_no_auto_restore / ts_paint (windows8-64) regression on push edf7c9a320c2f0a6be889628ba6f374265cdb261 (Mon Jan 30 2017) → 2.18 - 2.37% sessionrestore / sessionrestore_no_auto_restore / ts_paint (windows8-64) regression on push c756f91d7c0901b59651e37288757033baac824c (Mon Jan 30 2017)
(Reporter)

Comment 3

a year ago
The talos alert was from a merge on autoland (and talos alerts are only about 40% accurate in pinpointing the correct revision). I retriggered a few revisions and it came down to this specific push.
(In reply to Joel Maher ( :jmaher) from comment #3)
> the talos alert was from a merge on autoland (and talos alerts are only
> about 40% accurate in pinpointing the correct revision), I retriggered a few
> revisions and it came down to this specific push.

I guess it's possible that that Array.unshift change caused the issue by deoptimizing some very hot JavaScript.  It's definitely not bug 1329520.  Try individual patch talos pushes?
(Reporter)

Comment 6

a year ago
c756f91d7c09 seems to have a 0.7% influence on the test:
https://treeherder.mozilla.org/perf.html#/compare?originalProject=try&originalRevision=2e83f267ee62&newProject=try&newRevision=77ef3a0758a521155d668c05c0a7315f91f77b02&framework=1&filter=ts_paint&showOnlyImportant=0

That is the js code from bug 1323782. As I cannot see anything obvious here that contributes at least a 1% regression, I am running out of options.
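
For a sense of scale, here is a rough sketch of how much of the ts_paint change stays unattributed once the 0.7% from c756f91d7c09 is set aside. It uses the ts_paint pgo values from the description and takes the 0.7% from the try comparison at face value. It also assumes the contributions compose multiplicatively and that the try figure is comparable to the pgo alert, neither of which the bug establishes. The helper name is hypothetical.

```python
def unexplained_percent(before, after, known_percent):
    """Percent change left over after dividing out one known contribution.

    Assumes contributions compose multiplicatively, which is a
    simplification rather than something the bug establishes.
    """
    total_ratio = after / before
    known_ratio = 1.0 + known_percent / 100.0
    return (total_ratio / known_ratio - 1.0) * 100.0


# ts_paint windows8-64 pgo values from the description; 0.7% from the try push.
remaining = unexplained_percent(798.83, 817.42, 0.7)
print(f"unattributed ts_paint change: about {remaining:.1f}%")
```

Under these assumptions, most of the ts_paint regression is not explained by that one changeset.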
Bug 1334561 added some driver versions to the DXVA blocklist (for accelerating playing videos), so I think that one can't be related.
(Reporter)

Comment 8

a year ago
:marco, is this expected then, or is there work to do here?
Flags: needinfo?(mcastelluccio)
The patch in bug 1334561 is most probably unrelated to this regression, but I don't know about the other patches that landed in the other files.
Flags: needinfo?(mcastelluccio)
(In reply to Marco Castelluccio [:marco] from comment #9)
> The patch in bug 1334561 is most probably unrelated to this regression, but
> I don't know about the other patches that landed in the other files.

*that landed in the other bugs.
(Reporter)

Comment 11

a year ago
got it, thanks for clarifying!
(In reply to Nathan Froyd [:froydnj] from comment #4)
> I guess it's possible that that Array.unshift change caused the issue by
> deoptimizing some very hot JavaScript.  It's definitely not bug 1329520. 
> Try individual patch talos pushes?

The optimization failed only in one case [1], namely in Intl.js code (bug 1339621). I doubt fixing bug 1339621 will result in any performance improvements, but maybe I'm wrong. :-)

[1] I was only able to test sessionrestore_no_auto_restore and ts_paint locally, sessionrestore crashed with "Aborting on channel error."
Joel, can we move this to Testing: Talos, or do you have another component in mind? Thanks.
Flags: needinfo?(jmaher)
(Reporter)

Comment 14

a year ago
With no action here, marking as wontfix.
Status: NEW → RESOLVED
Last Resolved: a year ago
Component: Untriaged → Talos
Flags: needinfo?(jmaher)
Product: Firefox → Testing
Resolution: --- → WONTFIX