Bug 1424281 (VS15.6)

Investigate upgrading the Visual Studio compiler to VS2017 15.6

RESOLVED FIXED in Firefox 61

Status

enhancement
RESOLVED FIXED
2 years ago
Last year

People

(Reporter: RyanVM, Assigned: RyanVM)

Tracking

(Blocks 3 bugs)

Trunk
mozilla61
Unspecified
Windows
Dependency tree / graph

Firefox Tracking Flags

(firefox58 wontfix, firefox59 wontfix, firefox60 wontfix, firefox61 fixed)

Details

Attachments

(3 attachments, 8 obsolete attachments)

4.60 KB, patch
RyanVM
: review+
Details | Diff | Splinter Review
12.45 KB, patch
RyanVM
: review+
Details | Diff | Splinter Review
12.97 KB, patch
RyanVM
: review+
Details | Diff | Splinter Review
Visual Studio 2017 15.5 was recently released promising new compiler optimizations that might be interesting for us.
https://www.visualstudio.com/en-us/news/releasenotes/vs2017-relnotes#15.5

I just pushed a patch to inbound that cleans up the remaining build issues so that this is now testable on Try. Talos does show some interesting results including a ~3.5% Speedometer improvement on Win7 PGO:
https://treeherder.mozilla.org/perf.html#/compare?originalProject=mozilla-central&originalRevision=457b0fe91e0d&newProject=try&newRevision=99d1cff1ea49c7aa4d8dd19bbfa798e5fa3d3f8c&framework=1&filter=win

No obvious test failures occurred that looked related to the compiler change.

Unfortunately, this change isn't without problems at this point:
* clang-cl builds are broken due to "error: cannot mangle this built-in __float128 type yet" errors in pdfium code. Thankfully, that appears to be a known LLVM issue that was fixed from revision 318309 onward (we're using 317840 at the moment). However, as I found in bug 1423307, updating clang-cl can be a mess of random bustage in the various jobs that rely on it, so I have no idea how easily we'll be able to make that switch.
* More significantly, Win64 PGO builds are consistently hitting internal compiler errors during linking: https://treeherder.mozilla.org/logviewer.html#?job_id=150703652&repo=try&lineNumber=37761. I don't see an easy path forward there unless we either get lucky with Microsoft happening to fix it on their own or if someone manages to come up with a reproducible way to report it upstream.

At this point, this bug may very-well be WONTFIX (and it won't hurt my feelings if it's marked as such), though some of the perf numbers are enticing. I at least wanted to document what has been done up to this point in case anybody is interested in exploring it more from here. And for the record, yesterday's 15.5.1 update doesn't update the C++ compiler, so I don't expect any miracles there.
OS: Unspecified → Windows
Are we able to backport just r318309 to fix the __float128 issue?

Have we reported the PGO failure to Microsoft?  Probably not super helpful without a small reproducible testcase.
Flags: needinfo?(ryanvm)
(In reply to Nathan Froyd [:froydnj] (doing reviews while in Austin) from comment #1)
> Are we able to backport just r318309 to fix the __float128 issue?
Probably. Feel free to file a bug and NI me and I'll see how it goes on Try :)

> Have we reported the PGO failure to Microsoft?  Probably not super helpful
> without a small reproducible testcase.
I'm not sure who has that ability these days, but I would assume a useful bug report would be gated on a minimal testcase which I certainly won't be attempting to provide.
Flags: needinfo?(ryanvm)
I gave version 15.5.2 a spin as well since MS claimed to have fixed some optimizer-related crashes in it. Unfortunately, PGO builds still crash. Uploaded the revised patch for perpetuity's sake.
Attachment #8935795 - Attachment is obsolete: true
Attachment #8937563 - Attachment description: fix compiler errors with VS2017 15.5.2 → in-tree config changes for building with VS2017 15.5.2
Depends on: 1425984
Depends on: 1423649
Blocks: 1429807
Blocks: 1429812
Here's an updated patch for version 15.5.3. The PGO crashes are still there. The patch below is enough to get around the Win64 ones, but I haven't made any effort to do the same for the Win32 ones.
https://hg.mozilla.org/try/rev/29153f5bc793e3fc0d31e57ac978934adf42887a

Win32 PGO crash log:
https://treeherder.mozilla.org/logviewer.html#?job_id=155668813&repo=try

dmajor says it might be worth taking a look at whether backporting bug 1412889 helps.
Attachment #8937563 - Attachment is obsolete: true
> dmajor says it might be worth taking a look at whether backporting bug 1412889 helps.

I just tried and it didn't help. :-(

De-optimizing `fdct16_avx2` fixed it, but I feel bad about slowing down a SIMD function! Plus being in third-party code it's harder to land changes.
Depends on: 1430696
Blocks: 1433171
Updated for the 15.5.6 release
Attachment #8941920 - Attachment is obsolete: true
Blocks: 1438876
Product: Core → Firefox Build System
15.6 was released today. Working on packaging it up now.
Summary: Investigate upgrading the Visual Studio compiler to VS2017 15.5 → Investigate upgrading the Visual Studio compiler to VS2017 15.6
Depends on: 1443367
Updated for 15.6.0. Be advised that builds are currently busted from at least bug 1443367 (and possibly others lurking behind that).
Attachment #8946635 - Attachment is obsolete: true
With the fix for bug 1443367 included, Try pushes are green except for the same PGO issues we saw with 15.5 :(
https://treeherder.mozilla.org/#/jobs?repo=try&revision=a22b199ebd3da04ea4db84cc35b4ba712e0d04d2
dmajor ran a Try push with the PGO hacks in place we were using for 15.5 and I was able to get Talos numbers. Looks mostly neutral with some modest wins on some suites with 32-bit seeing more wins.

https://treeherder.mozilla.org/perf.html#/compare?originalProject=mozilla-central&originalRevision=a007dd56b994&newProject=try&newRevision=540a09377b25a11e97cde3c6d3d72c4a51ec121d&framework=1&filter=win%20pgo&showOnlyImportant=1
Alias: VS15.6
There's an upstream issue already on file for what looks like our Win32 PGO issue:
https://developercommunity.visualstudio.com/content/problem/169932/internal-compiler-errors-on-vs-155-64-bit-involvin.html

I went ahead and filed a new issue for the Win64 PGO link failures:
https://developercommunity.visualstudio.com/content/problem/211796/internal-compiler-error-building-win64-firefox-w-p.html

I see at least one other upstream issue that looks similar, but we'll see what they say.
Duplicate of this bug: 1444273
Blocks: 1444271
Depends on: 1425626
*sigh* I tried...
Assignee: nobody → ryanvm
This switches our builds to the new version, but doesn't make it mandatory.
Attachment #8956297 - Attachment is obsolete: true
Attachment #8957752 - Flags: review?(nfroyd)
We've been building with the 15063 SDK for awhile, I think we should just buck up and call that the minimum one we support as well.
Attachment #8957753 - Flags: review?(nfroyd)
Posted patch workarounds for PGO issues (obsolete) — Splinter Review
Work around the issues previously noted in this bug causing PGO issues.

The libaom one has me a bit worried since we're directly patching an upstream library and it's not clear to me what's going to stop that from getting clobbered on the next upstream update. Ralph, any thoughts? There's an issue on file with Microsoft that appears to fit the symptoms we were seeing, but who knows when they'll ship a compiler fix.

And to be clear, the plan for all of these patches is to land after the 61 version bump. And after bug 1429807 gets sorted out.
Attachment #8957754 - Flags: review?(nfroyd)
Attachment #8957754 - Flags: feedback?(giles)
Comment on attachment 8957753 [details] [diff] [review]
make version 15.6 and SDK 15063 the minimum supported versions

If we bump the SDK requirement, 15063 will effectively be the only version allowed for clang-cl builds. Consider making the "older" phrasing more precise here: https://searchfox.org/mozilla-central/rev/44fa24847e4e73ce8932de4c8cf47559302e4ba2/build/moz.configure/windows.configure#243
Comment on attachment 8957752 [details] [diff] [review]
in-tree config changes for building with VS2017 15.6.0

Review of attachment 8957752 [details] [diff] [review]:
-----------------------------------------------------------------

::: build/docs/toolchains.rst
@@ +46,5 @@
>  3. Under ``Windows and Web Development`` uncheck everything except
>     ``Universal Windows App Development Tools`` and the items under it
>     (should be ``Tools (1.3.1)...`` and the ``Windows 10 SDK``).
>  
>  Once Visual Studio 2015 Community has been installed, from a checkout

This should probably be Visual Studio 2017 (and potentially elsewhere in the file), yes?  (Either here or in the patch where you make it mandatory.)

::: build/moz.configure/toolchain.configure
@@ +507,5 @@
>                                                cxx14_version):
>              append_flag('-std=gnu++14')
>  
> +    # We force clang-cl to emulate Visual C++ 2017 version 15.6.0
> +    if info.type == 'clang-cl' and info.version != '19.13.26128':

While you're here, do you feel like pulling this constant out:

msvc_version = '19.13.26128':
if ... info.version != msvc_version:
  flags.append('-fm-compatibility-version=%s' % msvc_version)

?

::: build/windows_toolchain.py
@@ +85,5 @@
>              },
>          ],
>      },
>      {
> +        'srcdir': '%(vs_path)s/VC/Redist/MSVC/14.13.26020',

While we're here, why do the two changed lines differ?  One is for 32-bit and one is for 64-bit?

::: python/mozbuild/mozbuild/test/configure/test_toolchain_configure.py
@@ +255,5 @@
>      '*.cpp': {
>          '__STDC_VERSION__': False,
>          '__cplusplus': '201103L',
>      },
> +    '-fms-compatibility-version=19.13.26128': VS('19.13.26128')[None],

Similarly to the toolchain.configure bit, maybe we should pull this version out into EMULATED_MSVC_VERSION or something and use it everywhere in this file?  Followup is OK for this one, I think.
Attachment #8957752 - Flags: review?(nfroyd) → review+
Comment on attachment 8957753 [details] [diff] [review]
make version 15.6 and SDK 15063 the minimum supported versions

Review of attachment 8957753 [details] [diff] [review]:
-----------------------------------------------------------------

r=me with the modifications below.

::: build/moz.configure/windows.configure
@@ +235,5 @@
>                                ' version can be installed using the Visual'
>                                ' Studio installer.'
>                                % (version, minimum_ucrt_version))
>  
>      broken_ucrt_version = Version('10.0.16299.0')

dmajor's point about modifying the message associated with this check is well taken.  Please do that.

::: python/mozbuild/mozbuild/test/configure/test_toolchain_configure.py
@@ +235,5 @@
>  VS_2015 = VS('19.00.23026')
>  VS_2015u1 = VS('19.00.23506')
>  VS_2015u2 = VS('19.00.23918')
>  VS_2015u3 = VS('19.00.24213')
> +VS_2017u6 = VS('19.13.26128')

I think for completeness, we really want at least one VS_2017u$N check with N < 6.  Can you add one of those as well?
Attachment #8957753 - Flags: review?(nfroyd) → review+
Comment on attachment 8957754 [details] [diff] [review]
workarounds for PGO issues

Review of attachment 8957754 [details] [diff] [review]:
-----------------------------------------------------------------

I'm about 50/50 on the comments; I know people can check blame, but having comments makes it more likely that somebody might investigate whether a newer compiler has actually fixed the PGO issues.

::: layout/generic/nsContainerFrame.cpp
@@ +1051,5 @@
>   *    don't want to automatically sync the frame and view
>   * NS_FRAME_NO_SIZE_VIEW - don't size the frame's view
>   */
> +#if defined(_MSC_VER) && !defined(__clang__) && defined(_M_AMD64)
> +#pragma optimize("g", off)

Maybe add an XXX comment here, pointing at this bug?

@@ +1100,5 @@
>  
>    aKidFrame->DidReflow(aPresContext, aReflowInput);
>  }
> +#if defined(_MSC_VER) && !defined(__clang__) && defined(_M_AMD64)
> +#pragma optimize("", on)

Likewise for this one.
Attachment #8957754 - Flags: review?(nfroyd) → review+
Comment on attachment 8957754 [details] [diff] [review]
workarounds for PGO issues

Review of attachment 8957754 [details] [diff] [review]:
-----------------------------------------------------------------

Fine with me. I'll try to upstream so we don't have to carry this.

Comment with at least the offending Visual Studio version would be good.
Attachment #8957754 - Flags: feedback?(giles) → feedback+
For an upstream maybe we should do a test on the value of _MSC_VER or _MSC_FULL_VER.
Submitted upstream at https://aomedia-review.googlesource.com/c/aom/+/50962 Amendments welcome, but I figured there was less point in testing _MSC_VER until we know what newer release doesn't show the problem.
(In reply to Nathan Froyd [:froydnj] from comment #21)
> Maybe add an XXX comment here, pointing at this bug?

Done for nsContainerFrame.cpp. I went with rillian's upstream version for hybrid_fwd_txfm_avx2.c to avoid unnecessary comment churn down the road.

(In reply to David Major [:dmajor] from comment #23)
> For an upstream maybe we should do a test on the value of _MSC_VER or
> _MSC_FULL_VER.

_MSC_VER >= 1911 should work to start.
Attachment #8957754 - Attachment is obsolete: true
Attachment #8958270 - Flags: review+
(In reply to Nathan Froyd [:froydnj] from comment #19)
> This should probably be Visual Studio 2017 (and potentially elsewhere in the
> file), yes?  (Either here or in the patch where you make it mandatory.)

Done.

> While you're here, do you feel like pulling this constant out:
> 
> msvc_version = '19.13.26128':
> if ... info.version != msvc_version:
>   flags.append('-fm-compatibility-version=%s' % msvc_version)
> 
> ?

Done.

> While we're here, why do the two changed lines differ?  One is for 32-bit
> and one is for 64-bit?

One is the toolchain version and one is the compiler version. As to why they differ, yay Microsoft I guess.

> Similarly to the toolchain.configure bit, maybe we should pull this version
> out into EMULATED_MSVC_VERSION or something and use it everywhere in this
> file?  Followup is OK for this one, I think.

This test has some other laughably redundant stuff in it that could probably stand some cleanups. A follow-up bug sounds good to me (or maybe just repurpose bug 1324041 into a more generic cleanup bug).
Attachment #8957752 - Attachment is obsolete: true
Attachment #8958271 - Flags: review+
(In reply to Nathan Froyd [:froydnj] from comment #20)
> dmajor's point about modifying the message associated with this check is
> well taken.  Please do that.

Done.

> I think for completeness, we really want at least one VS_2017u$N check with
> N < 6.  Can you add one of those as well?

I added 2017u4 as well. We could probably stand to clean up some of those older 2013/2015 ones too, but follow-up fodder same as the last comment :)
Attachment #8957753 - Attachment is obsolete: true
Attachment #8958272 - Flags: review+
Pushed by ryanvm@gmail.com:
https://hg.mozilla.org/integration/autoland/rev/1187ae9020be
De-optimize some functions to work around crashes during compilation. r=froydnj
https://hg.mozilla.org/integration/autoland/rev/87b446c458d3
Use Visual Studio 2017 15.6.0 for Windows builds. r=froydnj
https://hg.mozilla.org/integration/autoland/rev/a7ab282e1d4a
Require Visual Studio 2017 15.6.0 and Win SDK 10.0.15063.0 to build on Windows. r=froydnj
Comment on attachment 8958272 [details] [diff] [review]
make version 15.6 and SDK 15063 the minimum supported versions

Review of attachment 8958272 [details] [diff] [review]:
-----------------------------------------------------------------

::: python/mozbuild/mozbuild/test/configure/test_toolchain_configure.py
@@ +234,5 @@
>  VS_2013u3 = VS('18.00.30723')
>  VS_2015 = VS('19.00.23026')
>  VS_2015u1 = VS('19.00.23506')
>  VS_2015u2 = VS('19.00.23918')
>  VS_2015u3 = VS('19.00.24213')

We can probably remove a few of the 2015 ones.
Depends on: 1445105
Blocks: 1086964
No longer blocks: 1443706
Blocks: 1448436
Blocks: 1448472
Blocks: 1453317
No longer depends on: 1430696
Duplicate of this bug: 1430696
Blocks: 1445105
No longer depends on: 1445105
You need to log in before you can comment on or make changes to this bug.