Closed Bug 1295927 Opened 6 years ago Closed 6 years ago

46.24 - 46.36% basic_compositor_video (windows8-64) regression on push 2719a392633d (Mon Aug 15 2016)

Categories

(Core :: Graphics, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla51
Tracking Status
firefox48 --- unaffected
firefox49 --- unaffected
firefox50 --- unaffected
firefox51 --- fixed

People

(Reporter: jmaher, Assigned: sotaro, NeedInfo)

References

Details

(Keywords: perf, regression, talos-regression)

Attachments

(1 file, 1 obsolete file)

Talos has detected a Firefox performance regression from push 2719a392633d. As author of one of the patches included in that push, we need your help to address this regression.

Summary of tests that regressed:

  basic_compositor_video summary windows8-64 opt: 4.33 -> 6.33 (46.24% worse)
  basic_compositor_video summary windows8-64 opt e10s: 4.38 -> 6.4 (46.36% worse)

Summary of tests that improved:

  basic_compositor_video summary windowsxp opt: 4.23 -> 4.08 (3.52% better)
  basic_compositor_video summary windows7-32 opt: 4.46 -> 4.23 (5.26% better)
  basic_compositor_video summary windows7-32 opt e10s: 4.48 -> 4.24 (5.48% better)
  basic_compositor_video summary windowsxp opt e10s: 4.26 -> 4.1 (3.59% better)
  basic_compositor_video summary windows7-32 pgo: 4.4 -> 4.16 (5.45% better)
  basic_compositor_video summary windows7-32 pgo e10s: 4.41 -> 4.17 (5.44% better)
  basic_compositor_video summary windowsxp pgo: 4.22 -> 4.06 (3.65% better)
  basic_compositor_video summary windowsxp pgo e10s: 4.23 -> 4.08 (3.61% better)


You can find links to graphs and comparison views for each of the above tests at: https://treeherder.mozilla.org/perf.html#/alerts?id=2493

On the page above you can see an alert for each affected platform as well as a link to a graph showing the history of scores for this test. There is also a link to a treeherder page showing the Talos jobs in a pushlog format.

To learn more about the regressing test(s), please see: https://wiki.mozilla.org/Buildbot/Talos/Tests

For information on reproducing and debugging the regression, either on try or locally, see: https://wiki.mozilla.org/Buildbot/Talos/Running

*** Please let us know your plans within 3 business days, or the offending patch(es) will be backed out! ***

Our wiki page outlines the common responses and expectations: https://wiki.mozilla.org/Buildbot/Talos/RegressionBugsHandling
:sotaro, I had to push to try to find the exact revision as the windows builds were broken when your patch landed, but it was clear on my try pushes to fill in data for the missing builds:
https://treeherder.mozilla.org/perf.html#/graphs?series=%5Btry,4d1b8939a1868a8c86c3e45c0bd251f1004ef63c,1,1%5D&zoom=1471383089964.8997,1471383458000,4.210665426856439,6.440675359676319

What is odd is we see wins on winXP and win7, but a huge loss on win8.  I see in the original bug you did look for perf data and saw the win7 improvement:
https://bugzilla.mozilla.org/show_bug.cgi?id=1254010#c13

Can you look into this for the windows 8 failures?  I assume there is something simple to fix here.
Flags: needinfo?(sotaro.ikeda.g)
I take a look.
Assignee: nobody → sotaro.ikeda.g
Flags: needinfo?(sotaro.ikeda.g)
When libyuv was updated by Bug 1284803. Win8 basic_compositor_video was improved a lot like Bug 1284803 Comment 44. There seems to be a relationship to this bug.
Joel do you know what cpu the windows talos tests run on? Is it different between windows 8 and windows 7? Also, are the windows 7 machines running windows 7 sp1?
Flags: needinfo?(jmaher)
From the log, cpu seems different.

XP and windows 7 had the following in log.
 >  PROCESSOR_ARCHITECTURE=x86
 >  PROCESSOR_IDENTIFIER=x86 Family 6 Model 30 Stepping 5, GenuineIntel

windows 8 had the following in log.
 >  PROCESSOR_ARCHITECTURE=x86
 >  PROCESSOR_ARCHITEW6432=AMD64
 >  PROCESSOR_IDENTIFIER=Intel64 Family 6 Model 30 Stepping 5, GenuineIntel
I locally run basic_compositor_video on the 2 Win PCs, but I did not see the performance problem.

- Desktop PC(Win10): Intel Core i7-6700K CPU @ 4.00GHz
  + Intel64 Family 6 Model 94 Stepping 3, GenuineIntel
- Laptop PC(Win10): Intel Core i7-4900MQ CPU @ 2.80GHz
  + Intel64 Family 6 Model 60 Stepping 5, GenuineIntel
the win8/win7/winXP/Linux64 are the same hardware (we often re-install them to load balance).  Here is a list of our specs:
https://wiki.mozilla.org/Buildbot/Talos/Misc#Hardware_Profile_of_machines_used_in_automation


win8 is a 64 bit OS/build, whereas Win7 and WinXP are 32 bit OS/build.
Flags: needinfo?(jmaher)
Does our Windows 7 have SP1?
Flags: needinfo?(jmaher)
Ah, it seems more likely the problem is related to 64 bit vs 32 bit builds.
Flags: needinfo?(jmaher)
I succeeded to reproduce a small regression with 64 bit build on Win 10 and desktop pc.
The followings are not defined on 64bit build.
- HAS_INTERPOLATEROW_AVX2
- HAS_INTERPOLATEROW_SSSE3
(In reply to Sotaro Ikeda [:sotaro] from comment #10)
> I succeeded to reproduce a small regression with 64 bit build on Win 10 and
> desktop pc.

The code in libyuv/source/scale_win.cc is not used on x86-64. That's likely the problem.
(In reply to Jeff Muizelaar [:jrmuizel] from comment #12)
> (In reply to Sotaro Ikeda [:sotaro] from comment #10)
> > I succeeded to reproduce a small regression with 64 bit build on Win 10 and
> > desktop pc.
> 
> The code in libyuv/source/scale_win.cc is not used on x86-64. That's likely
> the problem.

The MSVC 64 bit does not support the inline assembly used in that file. We can probably disable the regressing change on 64 bit until there's proper support for this functions on 64 bit. The right way to fix that is likely to port the inline assembly to straight assembly using yasm. Ideally that change would be done by upstream or at least accepted there.
Frank, would you accept a port of the scale_win.cc to a .asm built using nasm/yasm like libvpx uses? This would allow sharing the code between scale_gcc.cc and scale_win.cc.
Flags: needinfo?(fbarchard)
tracking-e10s: --- → ?
(In reply to Sotaro Ikeda [:sotaro] from comment #3)
> When libyuv was updated by Bug 1284803. Win8 basic_compositor_video was
> improved a lot like Bug 1284803 Comment 44. There seems to be a relationship
> to this bug.

Now, I understand it. The new libyuv enables HAS_I422TOARGBROW_SSSE3 definition on win 64bit, since I422ToARGBRow_SSSE3() does not use inline assembler. Then the performance was improved.
  https://dxr.mozilla.org/mozilla-central/source/media/libyuv/source/row_win.cc#87
(In reply to Jeff Muizelaar [:jrmuizel] from comment #13)
> > 
> > The code in libyuv/source/scale_win.cc is not used on x86-64. That's likely
> > the problem.
> 
> The MSVC 64 bit does not support the inline assembly used in that file. We
> can probably disable the regressing change on 64 bit until there's proper
> support for this functions on 64 bit. The right way to fix that is likely to
> port the inline assembly to straight assembly using yasm. Ideally that
> change would be done by upstream or at least accepted there.

Thanks for the advice, I am going to disable the regressing change on 64 bit in this bug.
Attachment #8782738 - Attachment is obsolete: true
Attachment #8782740 - Flags: review?(jmuizelaar)
tracking-e10s: ? → ---
Component: Untriaged → Audio/Video: Playback
Product: Firefox → Core
Component: Audio/Video: Playback → Graphics
Version: 50 Branch → Trunk
Attachment #8782740 - Flags: review?(jmuizelaar) → review+
Pushed by sikeda@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/d929d9aa7829
Disable libyuv SIMD scaling on 64bit win r=jrmuizel
https://hg.mozilla.org/mozilla-central/rev/d929d9aa7829
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla51
(In reply to Jeff Muizelaar [:jrmuizel] from comment #14)
> Frank, would you accept a port of the scale_win.cc to a .asm built using
> nasm/yasm like libvpx uses? This would allow sharing the code between
> scale_gcc.cc and scale_win.cc.

yes, I think yasm would be a good long term solution for win64.
2 other options are:
a. intrinsics, which is what I420ToARGB uses.
b. clangcl for win64 builds scale_gcc.cc instead.
You need to log in before you can comment on or make changes to this bug.