5.54% basic_compositor_video (osx-10-10) regression on push 21fd3ea62d8c (Fri Aug 26 2016)

Status: RESOLVED FIXED
Product: Core
Component: Audio/Video: Playback
Version: 50 Branch
Keywords: perf, regression, talos-regression
Reporter: jmaher
Assignee: Unassigned
Tracking flags: Not tracked

(Reporter)

Description

a year ago
Talos has detected a Firefox performance regression from push 21fd3ea62d8c. Since you authored one of the patches included in that push, we need your help to address this regression.

Regressions:

  6%  basic_compositor_video summary osx-10-10 opt     12.55 -> 13.25


You can find links to graphs and comparison views for each of the above tests at: https://treeherder.mozilla.org/perf.html#/alerts?id=2910

On the page above you can see an alert for each affected platform as well as a link to a graph showing the history of scores for this test. There is also a link to a treeherder page showing the Talos jobs in a pushlog format.

To learn more about the regressing test(s), please see: https://wiki.mozilla.org/Buildbot/Talos/Tests

For information on reproducing and debugging the regression, either on try or locally, see: https://wiki.mozilla.org/Buildbot/Talos/Running
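
For a quick local reproduction, something like the sketch below may work on an OS X 10.10 machine (to match the alerting platform) with a completed optimized build; the exact flag names here are an assumption from the talos harness of that era, and the Running wiki page above is the authoritative reference.

  # from the root of an optimized build of the affected tree;
  # --activeTests is assumed to be the talos option for picking one test
  ./mach build
  ./mach talos-test --activeTests basic_compositor_video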

*** Please let us know your plans within 3 business days, or the offending patch(es) will be backed out! ***

Our wiki page outlines the common responses and expectations: https://wiki.mozilla.org/Buildbot/Talos/RegressionBugsHandling
(Reporter)

Comment 1

a year ago
It appears that Chris is on PTO for a week (despite landing this right before going on PTO... ack!)

:jwwang, can you look into this and help us determine why we have this regression, and decide whether we need to accept it or can fix it?
Flags: needinfo?(jwwang)
Checking.

Btw, does the regression happen on Aurora only and not on Central? I wonder what makes the difference.
Flags: needinfo?(jwwang) → needinfo?(jmaher)
It looks like https://hg.mozilla.org/mozilla-central/rev/2cdde44ebcc7 fixes the regression.

Hi Jya,
Can you uplift this patch to Aurora?
Flags: needinfo?(jyavenard)
(In reply to Joel Maher ( :jmaher) from comment #1)
> It appears that Chris is on PTO for a week (despite landing this right
> before going on PTO... ack!)

I find this last comment totally unwarranted and inappropriate. 

I'll be requesting uplift for the fix.
Flags: needinfo?(jyavenard)
(Reporter)

Comment 5

a year ago
thanks for uplifting the fix for this.
Flags: needinfo?(jmaher)
Component: Untriaged → Audio/Video: Playback
Product: Firefox → Core
Status: NEW → RESOLVED
Last Resolved: a year ago
Resolution: --- → FIXED
(Reporter)

Comment 6

a year ago
This doesn't seem to fix the issue. I looked again at mozilla-central and I don't see any regression there:
https://treeherder.mozilla.org/perf.html#/graphs?timerange=2592000&series=%5Bmozilla-aurora,a75c7fb871569a3321ba0c9065fd64b3db159a18,1,1%5D&series=%5Bmozilla-central,a75c7fb871569a3321ba0c9065fd64b3db159a18,1,1%5D

It gets stranger: on autoland, where the offending patch originally landed, we see a regression, but on inbound around the same time we see an improvement:
https://hg.mozilla.org/integration/mozilla-inbound/rev/fee7ff6786ca

Since everything was merged around the same time, the regression and the improvement cancelled each other out, so neither on mozilla-central nor in our alert algorithm did we see anything that looked like a regression.

Can we at least look into why this specific patch is causing a performance regression:
https://hg.mozilla.org/releases/mozilla-aurora/pushloghtml?fromchange=458d5f0a55bbb2423091a2108cab633321a64d26&tochange=21fd3ea62d8ccb941ac7e14d85034c5799073635
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Is this a dupe of Bug 1295630?
Flags: needinfo?(jmaher)
(Reporter)

Comment 8

a year ago
This bug and that one happened around the same time. The difference is that bug 1295630 stopped data from being posted to Perfherder at all; in this case we continued to post data to Perfherder because we were not generating invalid values.

This specific issue is on Aurora, and when bug 1295630 was uplifted (Aug 30th) we didn't see any change in the posted values.
Flags: needinfo?(jmaher)
(Reporter)

Comment 9

a year ago
:jwwang, this bug has gone silent. I have identified the code causing the problem, but there is no explanation of why this is happening and no action being taken. Should I back out the original patch? Is there work being done to explain or investigate this?
Flags: needinfo?(jwwang)
What is there to fix? You don't see a regression in central yet you saw it in autoland, even though the code is identical. If there was a "regression", it would have shown up when autoland was merged into central.
Almost no one in the media team is using inbound anymore; we do all our reviews via ReviewBoard and as such land via autoland.

I have provided analysis on why benchmarking the compositor via the use of media decoding is flawed to start with: you're just benchmarking the system clock.
(Reporter)

Comment 11

a year ago
The regression is on Aurora, not on Central; probably something else on Central fixes or masks it, so we just don't see the regression there.

:jya, I understand you are concerned with the test itself, but as it stands we have a regression, and in its short life this test has proven able to find real changes in the product. That doesn't mean it is perfect - please work towards making the test correct, or vote to turn it off. If we are going to ignore regressions found by the test, I would rather save all the overhead of sheriffs managing results and dealing with the load on our infrastructure. Since I don't see action being taken to disable or modify the test, I will continue to operate under the assumption that this is a useful test, and trust that Firefox developers will follow the policy we have laid out: treat performance regressions with reasonable priority and do not ignore them.
Still trying to figure out the root cause.

Since I will be on Mid-Autumn Festival vacation soon, I will come back to this issue next week.

https://treeherder.mozilla.org/perf.html#/compare?originalProject=try&originalRevision=e6fbe70e21e8&newProject=try&newRevision=08bdd2e2beeb&framework=1&showOnlyImportant=0

This change shows a 5.62% improvement in basic_compositor_video opt e10s, but a 3.05% regression in basic_compositor_video opt. Still trying to figure out the difference.
Flags: needinfo?(jwwang)
https://treeherder.mozilla.org/perf.html#/graphs?timerange=2592000&series=%5Btry,a75c7fb871569a3321ba0c9065fd64b3db159a18,1,1%5D&highlightedRevisions=8ed8b675e712&highlightedRevisions=dd86017e776c&selected=%5Btry,a75c7fb871569a3321ba0c9065fd64b3db159a18,122925,26717683,1%5D

Looking at the graph, I suspect it is just a false alarm. The numbers after 8/29 are around 12.21 which is even better than 12.55 in comment 0.

The deviation is large for this test, which can go as high as 13.62 or as low as 10.44. A single test result doesn't justify calling this a regression from 21fd3ea62d8c.
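
To put that in rough numbers, using only the figures quoted in this bug: the observed spread of 10.44 to 13.62 is a band of about (13.62 - 10.44) / 2 ≈ 1.6 on either side of its midpoint, while the change that triggered the alert was 13.25 - 12.55 = 0.7, well inside that band, so a single data point at 13.25 is hard to distinguish from noise.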
Flags: needinfo?(jmaher)
(In reply to JW Wang [:jwwang] [:jw_wang] from comment #13)
> https://treeherder.mozilla.org/perf.html#/graphs?timerange=2592000&series=%5Btry,a75c7fb871569a3321ba0c9065fd64b3db159a18,1,1%5D&highlightedRevisions=8ed8b675e712&highlightedRevisions=dd86017e776c&selected=%5Btry,a75c7fb871569a3321ba0c9065fd64b3db159a18,122925,26717683,1%5D
>
> Looking at the graph, I suspect it is just a false alarm. The numbers after
> 8/29 are around 12.21 which is even better than 12.55 in comment 0.
> 
> The deviation is large for this test, which can go as high as 13.62 or as
> low as 10.44. A single test result doesn't justify calling this a regression
> from 21fd3ea62d8c.

Looking at your graph on try, it indeed seems like a false alarm, as the data becomes quite a bit noisier around Sep 1st (note though that the original regression on Aurora appears around Aug 26th, while your graph is stable around that date). Did you really intend to post a graph on try?

However, the graph Joel posted, which shows central and aurora, IMO clearly indicates that it's not a false alarm: https://treeherder.mozilla.org/perf.html#/graphs?timerange=5184000&series=%5Bmozilla-aurora,a75c7fb871569a3321ba0c9065fd64b3db159a18,1,1%5D&series=%5Bmozilla-central,a75c7fb871569a3321ba0c9065fd64b3db159a18,1,1%5D

It's true that on OS X this graph is noisier than on other platforms, yet the regression on aurora is quite clear IMO.
How do I verify a fix if the try results are not stable?
(In reply to JW Wang [:jwwang] [:jw_wang] from comment #15)
> How do I verify a fix if the try results are not stable?

Good question. As far as I know, doing aurora try pushes is not reliable; I filed bug 1233740 for that at the time, but it seems idle.

Maybe Joel could suggest how to test potential fixes on Aurora.
(Reporter)

Comment 17

a year ago
There is no reliable way that I know of to test Aurora changes on try, and there is a large need for this. You did what I would do: update to the aurora branch and push to try on top of that.

Outside of that, there are different configuration issues on try - ones which I don't understand.

Regarding doing the test runs, it is important to get multiple data points.  Pushing to try, I would do:
try: -b o -p macosx64 -u none -t g4 --rebuild-talos 5

I have retriggered g4 on your pushes to see if that helps get a clearer picture of the data.
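
For reference, the full sequence might look like the sketch below, assuming a local clone of mozilla-aurora and try push access; the patch filename is purely illustrative.

  # pull the current aurora tip into the local clone
  hg pull -u https://hg.mozilla.org/releases/mozilla-aurora
  # apply the candidate fix being evaluated (illustrative filename)
  hg import --no-commit candidate-fix.patch
  # commit it with the try syntax above as the tip commit message
  hg commit -m "try: -b o -p macosx64 -u none -t g4 --rebuild-talos 5"
  # push to try; --rebuild-talos schedules each talos job several times
  hg push -f ssh://hg.mozilla.org/try

Once the jobs show up on Treeherder, further retriggers of g4 add more data points to compare against the baseline.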
Flags: needinfo?(jmaher)
(Reporter)

Comment 18

a year ago
The retriggers didn't help much at all in showing a difference.
(In reply to Jean-Yves Avenard [:jya] from comment #10)
> I have provided analysis on why benchmarking the compositor via the use of
> media decoding is flawed to start with: you're just benchmarking the system
> clock.

Link please?
Flags: needinfo?(jyavenard)
It looks like bug 1295630 comment 29 and bug 1295630 comment 30 'fix' the regression (in fact it is just a workaround, which might explain why we don't have this regression on central).

It is interesting that both of them were requested for uplift, but only one of them actually got uplifted.
Flags: needinfo?(jmaher)
(Reporter)

Comment 22

a year ago
oh, and this is resolved, thanks for finding that :jwwang and :jya!
Status: REOPENED → RESOLVED
Last Resolved: a year ago
Flags: needinfo?(jmaher)
Resolution: --- → FIXED
(In reply to JW Wang [:jwwang] [:jw_wang] from comment #21)
> It is interesting that both of them were requested for uplift, but only one
> of them actually got uplifted.

Good catch. Thanks.

(In reply to Joel Maher ( :jmaher) from comment #22)
> oh, and this is resolved, thanks for finding that :jwwang and :jya!

Yay!