Closed Bug 1084162 Opened 5 years ago Closed 5 years ago

PGO all the things on Windows

Categories

(Firefox Build System :: General, defect)

36 Branch
x86_64
Windows 7
defect
Not set

Tracking

(Not tracked)

RESOLVED FIXED
mozilla36

People

(Reporter: dmajor, Assigned: dmajor)

References

Details

Attachments

(4 files, 2 obsolete files)

Currently Windows PGO is enabled per-directory with MSVC_ENABLE_PGO. With VS2013 we don't need to worry about overloading the linker and should be able to PGO everything.
Please do check how much build time this adds, though.
I think this need to be conditional if express version is supported.
(In reply to zhoubcfan from comment #2)
> I think this need to be conditional if express version is supported.

Please forget this, pgo is not available in express version.
This bug is not about local builds. Requiring PGO, a two-step process, for all local builds would be silly.
Removing the MSVC_ENABLE_PGO test gives me an error dialog when it tries to capture the profile:

---------------------------
shlibsign.exe - Application Error
---------------------------
The application was unable to start correctly (0xc000007b). Click OK to close the application. 
---------------------------
OK   
---------------------------

Smells a lot like bug 919735 comment 2.
The shlibsign errors were caused by shlibsign.exe's use of nss3.dll, which was itself getting PGO'd as well as using mozglue.dll which was also getting PGO'd.
This took my PGO build from 58 minutes to 75 minutes. Ouch - but we're still better than VS2010.

I didn't yet remove uses of MSVC_ENABLE_PGO. Want to make sure this sticks first.
Attachment #8508393 - Flags: review?(mh+mozilla)
Comment on attachment 8508393 [details] [diff] [review]
PGO all the things (except nss and mozglue)

Review of attachment 8508393 [details] [diff] [review]:
-----------------------------------------------------------------

::: config/external/nss/moz.build
@@ +6,5 @@
>  
>  DIRS += ['crmf']
>  
> +if CONFIG['_MSC_VER']:
> +    NO_PGO = True

Please add comments as to why this is necessary. Also, please file a bug on the why it fails, I don't see why we'd actually be /running/ nss tools during the instrumentation phase of the build.
Attachment #8508393 - Flags: review?(mh+mozilla) → review+
(In reply to Mike Hommey [:glandium] from comment #8)
> Please add comments as to why this is necessary. Also, please file a bug on
> the why it fails, I don't see why we'd actually be /running/ nss tools
> during the instrumentation phase of the build.

I suspect this is just fallout from running 'make stage-package' so we can run the profiling scenario from the staged bits. We could probably just skip the NSS signing during that package staging. Alternately we could maybe use a workaround like bug 919735 if that's what's broken.
> Alternately we could maybe use a workaround like bug 919735 if that's what's broken.

I was typing up a comment like "but I feel dirty about that fix and hesitate to duplicate it" until I realized that the path hack is already in packager.py. We just need to move it a few lines to a more general location. I don't mind that so much :)
Attachment #8510195 - Flags: review?(ted)
Comment on attachment 8510195 [details] [diff] [review]
Part 1: Move the path hack so that shlibsign uses it

Review of attachment 8510195 [details] [diff] [review]:
-----------------------------------------------------------------

Well that's not so bad then, is it?
Attachment #8510195 - Flags: review?(ted) → review+
Attachment #8510195 - Attachment description: Move the path hack so that shlibsign uses it → Part 1: Move the path hack so that shlibsign uses it
With Part 1, we no longer need to opt-out any files. Stripping this patch down to just the config.mk piece.
Attachment #8508393 - Attachment is obsolete: true
Attachment #8510529 - Flags: review+
Fixed an indentation error caught by 64-bit try.
Attachment #8510195 - Attachment is obsolete: true
Attachment #8511718 - Flags: review+
On 32-bit I failed xpcshell at test_crash_purevirtual.js because the compiler completely removed nsTestCrasher.cpp's PureVirtualCall()!

To get around this I marked testcrasher.dll as INTERNAL_TOOLS (which it is) since that prevents PGO.
Attachment #8511746 - Flags: review?(ted)
On Win64 I still failed to build (even with the indentation fix): https://tbpl.mozilla.org/?tree=Try&rev=7e9c4d2ca46c

c:\builds\moz2_slave\try-w64-0000000000000000000000\build\dom\mobileconnection\mobileconnection.cpp(727) : fatal error C1001: An internal error has occurred in the compiler.
mozmake.exe[6]: *** [xul.dll] Error 1
mozmake.exe[6]: *** Deleting file 'xul.dll'
mozmake.exe[5]: *** [toolkit/library/target] Error 2
mozmake.exe[4]: *** [compile] Error 2
mozmake.exe[3]: *** [default] Error 2
mozmake.exe[2]: *** [realbuild] Error 2
mozmake.exe[1]: *** [profiledbuild] Error 2
mozmake.exe: *** [build] Error 2 

I haven't yet investigated, but this sounds bad. This is the kind of error that we used to see in 32-bit builds before Update 3. I wouldn't be surprised if the Win64 PGO were less mature and not yet ready for the full weight of libxul.
Comment on attachment 8511746 [details] [diff] [review]
Part 0: Mark testcrasher.dll as INTERNAL_TOOLS to prevent PGO

Review of attachment 8511746 [details] [diff] [review]:
-----------------------------------------------------------------

Hah!
Attachment #8511746 - Flags: review?(ted) → review+
(In reply to David Major [:dmajor] (UTC+13) from comment #16)
> I haven't yet investigated, but this sounds bad. This is the kind of error
> that we used to see in 32-bit builds before Update 3. I wouldn't be
> surprised if the Win64 PGO were less mature and not yet ready for the full
> weight of libxul.

In general we've worked around bugs in the PGO compiler for the entirety of the time we've been using PGO, so this doesn't surprise me. Our standard operating procedure here is to disable PGO for the source file that's causing the issue and try to file a ticket with Microsoft.
Assignee: nobody → dmajor
> In general we've worked around bugs in the PGO compiler for the entirety of
> the time we've been using PGO, so this doesn't surprise me. Our standard
> operating procedure here is to disable PGO for the source file that's
> causing the issue and try to file a ticket with Microsoft.

For the 32-bit failures before Update 3, it was a matter of volume, not specific files: if you disabled a single file, you'd just crash on the next one. I suspect the Win64 crash is in the same boat, but I haven't tested this theory yet. I'll look more when I get back.
Well I'm quite happy to be wrong about this. MobileConnection.cpp was indeed a one-off failure. We can just disable PGO for that file on Win64, and I'll send a LINK_REPRO to Microsoft.

https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=f7591a7e04e4
Attachment #8515857 - Flags: review?(mh+mozilla)
Comment on attachment 8515857 [details] [diff] [review]
Disable PGO for MobileConnection.cpp

Review of attachment 8515857 [details] [diff] [review]:
-----------------------------------------------------------------

::: dom/mobileconnection/moz.build
@@ +50,5 @@
> +    'MobileConnection.cpp', # Non-unified due to no_pgo
> +]
> +
> +if CONFIG['_MSC_VER'] and CONFIG['CPU_ARCH'] == 'x86_64':
> +    SOURCES['MobileConnection.cpp'].no_pgo = True # VS2013 crashes, bug 1084162

maybe make that a version-specific thing?
Attachment #8515857 - Flags: review?(mh+mozilla) → review+
(In reply to David Major [:dmajor] (UTC+13) from comment #23)
> Didn't bother with a version check since PGO implicitly needs >= 2013
> nowadays.

The point was that PGO would have been enabled for 2014.
Talos email shows some modest wins (2-5%) from this.

I filed https://connect.microsoft.com/VisualStudio/feedback/details/1020301 about the crash in MobileConnection.cpp.
we have 2 slight regressions on windows in dromaeo DOM on win7 and win8.  I see this on Aurora while investigating the issues seen after the uplift.
Shouldn't we also remove all these |MSVC_ENABLE_PGO = True| lines from moz.build files?

I just spent a lot of time investigating some PGO performance issues, then I realized the MSVC_ENABLE_PGO moz.build flag has no effect :( (Or maybe I'm missing something subtle...)
Flags: needinfo?(dmajor)
Thanks for the reminder, Jan. I'll file a follow-up bug. Sorry for the trouble.

(In reply to David Major [:dmajor] from comment #7)
> I didn't yet remove uses of MSVC_ENABLE_PGO. Want to make sure this sticks first.
I'm going to say nearly-six-months counts as stuck!
Flags: needinfo?(dmajor)
Thanks! Sorry, I didn't see comment 7 :)
Blocks: 1157835
Depends on: 1290642
Product: Core → Firefox Build System
You need to log in before you can comment on or make changes to this bug.