5.54% basic_compositor_video (osx-10-10) regression on push 21fd3ea62d8c (Fri Aug 26 2016)

Status: RESOLVED FIXED
Product: Core
Component: Audio/Video: Playback
Version: 50 Branch
Keywords: perf, regression, talos-regression
Reporter: jmaher
Assignee: Unassigned
Tracking flags: Not tracked

(Reporter)

Description

a year ago
Talos has detected a Firefox performance regression from push 21fd3ea62d8c. Since you authored one of the patches included in that push, we need your help to address this regression.

Regressions:

  6%  basic_compositor_video summary osx-10-10 opt     12.55 -> 13.25


You can find links to graphs and comparison views for each of the above tests at: https://treeherder.mozilla.org/perf.html#/alerts?id=2910

On the page above you can see an alert for each affected platform as well as a link to a graph showing the history of scores for this test. There is also a link to a treeherder page showing the Talos jobs in a pushlog format.

To learn more about the regressing test(s), please see: https://wiki.mozilla.org/Buildbot/Talos/Tests

For information on reproducing and debugging the regression, either on try or locally, see: https://wiki.mozilla.org/Buildbot/Talos/Running
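
For a quick local reproduction, something like the sketch below may work on an OS X 10.10 machine (to match the alerting platform) with a completed optimized build; the exact flag names here are an assumption from the talos harness of that era, and the Running wiki page above is the authoritative reference.

  # from the root of an optimized build of the affected tree;
  # --activeTests is assumed to be the talos option for picking one test
  ./mach build
  ./mach talos-test --activeTests basic_compositor_video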

*** Please let us know your plans within 3 business days, or the offending patch(es) will be backed out! ***

Our wiki page outlines the common responses and expectations: https://wiki.mozilla.org/Buildbot/Talos/RegressionBugsHandling
(Reporter)

Comment 1

a year ago
It appears that Chris is on PTO for a week (despite landing this right before going on PTO... ack!)

:jwwang, can you look into this and help us determine why we have this regression, and decide whether we need to accept it or can fix it?
Flags: needinfo?(jwwang)
Checking.

Btw, does the regression happen on Aurora only and not on Central? I wonder what makes the difference.
Flags: needinfo?(jwwang) → needinfo?(jmaher)
It looks like https://hg.mozilla.org/mozilla-central/rev/2cdde44ebcc7 fixes the regression.

Hi Jya,
Can you uplift this patch to Aurora?
Flags: needinfo?(jyavenard)
(In reply to Joel Maher ( :jmaher) from comment #1)
> It appears that Chris is on PTO for a week (despite landing this right
> before going on PTO... ack!)

I find this last comment totally unwarranted and inappropriate. 

I'll be requesting uplift for the fix.
Flags: needinfo?(jyavenard)
(Reporter)

Comment 5

a year ago
thanks for uplifting the fix for this.
Flags: needinfo?(jmaher)
Component: Untriaged → Audio/Video: Playback
Product: Firefox → Core
Status: NEW → RESOLVED
Last Resolved: a year ago
Resolution: --- → FIXED
(Reporter)

Comment 6

a year ago
This doesn't seem to fix the issue. I looked again at mozilla-central and I don't see any regression there:
https://treeherder.mozilla.org/perf.html#/graphs?timerange=2592000&series=%5Bmozilla-aurora,a75c7fb871569a3321ba0c9065fd64b3db159a18,1,1%5D&series=%5Bmozilla-central,a75c7fb871569a3321ba0c9065fd64b3db159a18,1,1%5D

It gets stranger: on autoland, where the offending patch originally landed, we see a regression, but on inbound around the same time we see an improvement:
https://hg.mozilla.org/integration/mozilla-inbound/rev/fee7ff6786ca

Since everything was merged around the same time, the regression and the improvement cancelled each other out, so neither on mozilla-central nor in our alert algorithm did we see anything that looked like a regression.

Can we at least look into why this specific patch is causing a performance regression:
https://hg.mozilla.org/releases/mozilla-aurora/pushloghtml?fromchange=458d5f0a55bbb2423091a2108cab633321a64d26&tochange=21fd3ea62d8ccb941ac7e14d85034c5799073635
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Is this a dupe of Bug 1295630?
Flags: needinfo?(jmaher)
(Reporter)

Comment 8

a year ago
This bug and that one happened around the same time. The difference is that bug 1295630 stopped data from being posted to Perfherder at all; in this case we continued to post data to Perfherder because we were not generating invalid values.

This specific issue is on Aurora, and when bug 1295630 was uplifted (Aug 30th) we didn't see any change in the posted values.
Flags: needinfo?(jmaher)
(Reporter)

Comment 9

a year ago
:jwwang, this bug has gone silent. I have identified the code causing the problem, but there is no explanation of why this is happening and no action being taken. Should I back out the original patch? Is there work being done to explain or investigate this?
Flags: needinfo?(jwwang)
What is there to fix? You don't see a regression in central yet you saw it in autoland, even though the code is identical. If there was a "regression", it would have shown up when autoland was merged into central.
Almost no one in the media team is using inbound anymore; we do all our reviews via ReviewBoard and as such land via autoland.

I have provided analysis on why benchmarking the compositor via the use of media decoding is flawed to start with: you're just benchmarking the system clock.
(Reporter)

Comment 11

a year ago
The regression is on Aurora, not on Central; probably something else on Central fixes or masks it, so we just don't see the regression there.

:jya, I understand you are concerned with the test itself, but as it stands we have a regression, and in its short life this test has proven able to find real changes in the product. That doesn't mean it is perfect - please work towards making the test correct, or vote to turn it off. If we are going to ignore regressions found by the test, I would rather save all the overhead of sheriffs managing results and dealing with the load on our infrastructure. Since I don't see action being taken to disable or modify the test, I will continue to operate under the assumption that this is a useful test, and trust that Firefox developers will follow the policy we have laid out: treat performance regressions with reasonable priority and do not ignore them.
Still trying to figure out the root cause.

Since I will be on Mid-Autumn Festival vacation soon, I will come back to this issue next week.

https://treeherder.mozilla.org/perf.html#/compare?originalProject=try&originalRevision=e6fbe70e21e8&newProject=try&newRevision=08bdd2e2beeb&framework=1&showOnlyImportant=0

This change shows a 5.62% improvement in basic_compositor_video opt e10s, but a 3.05% regression in basic_compositor_video opt. Still trying to figure out the difference.
Flags: needinfo?(jwwang)
https://treeherder.mozilla.org/perf.html#/graphs?timerange=2592000&series=%5Btry,a75c7fb871569a3321ba0c9065fd64b3db159a18,1,1%5D&highlightedRevisions=8ed8b675e712&highlightedRevisions=dd86017e776c&selected=%5Btry,a75c7fb871569a3321ba0c9065fd64b3db159a18,122925,26717683,1%5D

Looking at the graph, I suspect it is just a false alarm. The numbers after 8/29 are around 12.21 which is even better than 12.55 in comment 0.

The deviation is large for this test, which can go as high as 13.62 or as low as 10.44. A single test result doesn't justify calling this a regression from 21fd3ea62d8c.
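
To put that in rough numbers, using only the figures quoted in this bug: the observed spread of 10.44 to 13.62 is a band of about (13.62 - 10.44) / 2 ≈ 1.6 on either side of its midpoint, while the change that triggered the alert was 13.25 - 12.55 = 0.7, well inside that band, so a single data point at 13.25 is hard to distinguish from noise.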
Flags: needinfo?(jmaher)
(In reply to JW Wang [:jwwang] [:jw_wang] from comment #13)
> https://treeherder.mozilla.org/perf.html#/graphs?timerange=2592000&series=%5Btry,a75c7fb871569a3321ba0c9065fd64b3db159a18,1,1%5D&highlightedRevisions=8ed8b675e712&highlightedRevisions=dd86017e776c&selected=%5Btry,a75c7fb871569a3321ba0c9065fd64b3db159a18,122925,26717683,1%5D
>
> Looking at the graph, I suspect it is just a false alarm. The numbers after
> 8/29 are around 12.21 which is even better than 12.55 in comment 0.
> 
> The deviation is large for this test, which can go as high as 13.62 or as
> low as 10.44. A single test result doesn't justify calling this a regression
> from 21fd3ea62d8c.

Looking at your graph on try, it indeed seems like a false alarm, as the data becomes quite a bit noisier around Sep 1st (note though that the original regression on Aurora appears around Aug 26th, while your graph is stable around that date). Did you really intend to post a graph on try?

However, the graph Joel posted, which shows central and aurora, IMO clearly indicates that it's not a false alarm: https://treeherder.mozilla.org/perf.html#/graphs?timerange=5184000&series=%5Bmozilla-aurora,a75c7fb871569a3321ba0c9065fd64b3db159a18,1,1%5D&series=%5Bmozilla-central,a75c7fb871569a3321ba0c9065fd64b3db159a18,1,1%5D

It's true that on OS X this graph is noisier than on other platforms, yet the regression on aurora is quite clear IMO.
How do I verify a fix if the try results are not stable?
(In reply to JW Wang [:jwwang] [:jw_wang] from comment #15)
> How do I verify a fix if the try results are not stable?

Good question. As far as I know, doing aurora try pushes is not reliable; I filed bug 1233740 for that at the time, but it seems idle.

Maybe Joel could suggest how to test potential fixes on Aurora.
(Reporter)

Comment 17

a year ago
There is no reliable way that I know of to test Aurora changes on try, and there is a large need for this. You did what I would do: update to the aurora branch and push to try on top of that.

Outside of that, there are different configuration issues on try - ones which I don't understand.

Regarding doing the test runs, it is important to get multiple data points.  Pushing to try, I would do:
try: -b o -p macosx64 -u none -t g4 --rebuild-talos 5

I have retriggered g4 on your pushes to see if that helps get a clearer picture of the data.
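
For reference, the full sequence might look like the sketch below, assuming a local clone of mozilla-aurora and try push access; the patch filename is purely illustrative.

  # pull the current aurora tip into the local clone
  hg pull -u https://hg.mozilla.org/releases/mozilla-aurora
  # apply the candidate fix being evaluated (illustrative filename)
  hg import --no-commit candidate-fix.patch
  # commit it with the try syntax above as the tip commit message
  hg commit -m "try: -b o -p macosx64 -u none -t g4 --rebuild-talos 5"
  # push to try; --rebuild-talos schedules each talos job several times
  hg push -f ssh://hg.mozilla.org/try

Once the jobs show up on Treeherder, further retriggers of g4 add more data points to compare against the baseline.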
Flags: needinfo?(jmaher)
(Reporter)

Comment 18

a year ago
The retriggers didn't help much at all in showing a difference.
(In reply to Jean-Yves Avenard [:jya] from comment #10)
> I have provided analysis on why benchmarking the compositor via the use of
> media decoding is flawed to start with: you're just benchmarking the system
> clock.

Link please?
Flags: needinfo?(jyavenard)
It looks like bug 1295630 comment 29 and bug 1295630 comment 30 'fix' the regression (in fact it is just a workaround, which might explain why we don't have this regression on central).

It is interesting that both of them were requested for uplift, but only one of them actually got uplifted.
Flags: needinfo?(jmaher)
(Reporter)

Comment 22

a year ago
oh, and this is resolved, thanks for finding that :jwwang and :jya!
Status: REOPENED → RESOLVED
Last Resolved: a year ago
Flags: needinfo?(jmaher)
Resolution: --- → FIXED
(In reply to JW Wang [:jwwang] [:jw_wang] from comment #21)
> It is interesting that both of them were requested for uplift, but only one
> of them actually got uplifted.

Good catch. Thanks.

(In reply to Joel Maher ( :jmaher) from comment #22)
> oh, and this is resolved, thanks for finding that :jwwang and :jya!

Yay!