Closed Bug 925605 Opened 8 years ago Closed 8 years ago

Re-allow to build with GNU make on windows

Categories

(Firefox Build System :: General, defect)

x86_64
Windows XP
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED
mozilla27

People

(Reporter: glandium, Assigned: glandium)

References

Details

Attachments

(1 file, 9 obsolete files)

We made the build system reject non-pymake builds on windows. We should undo this. I've got a build semi-working with GNU make 4.0 with windows-style paths (NOT msys paths, thankfully) with a couple changes to security/build.
OS: Linux → Windows XP
I believe the main reasons we decided to reject gmake were the -jN hang (which presumably is fixed), and the path issue. If those aren't issues then it seems fine. Perhaps we want to reject gmake < 4.0 or something like that, however?
Well, even plain gnu make 4.0 is not suitable. For two reasons:
  https://savannah.gnu.org/bugs/?40227 (most notably the second issue)
  https://savannah.gnu.org/bugs/?40241

And I haven't got a full build yet, but i'm more than halfway through, and I'm hopeful it will work.
Attached patch WIP (obsolete) — Splinter Review
I have a tree that fails to build with pymake that fails similarly with gnu make. I'm currently running a new build with a fresh tree that's not supposed to fail, with this preliminary patch (it's not in a landable shape), using a MSVC-built gnu make 4.0 + two patches + a config adjustment (binary on http://people.mozilla.org/~mhommey/make.exe). I'm hopeful it will finish properly, but my Windows VM is slow, so it's going to take some time.
Attached patch WIP (obsolete) — Splinter Review
With fixups i had forgotten to fold.
Attachment #815821 - Attachment is obsolete: true
Attached patch WIP (obsolete) — Splinter Review
This gets me to the end of a build with this mozconfig:
ac_add_options --disable-webgl
ac_add_options --disable-gamepad
ac_add_options --without-intl-api

The former 2 because i don't have the directx sdk installed, and the latter because there are still msys path issues with the icu build to fix.

It however fails at the very last command (make binaries-deps) because of the too long command line.
Attachment #815831 - Attachment is obsolete: true
Depends on: 925766
Depends on: 926007
Since stock GNU make doesn't work properly (yet) and we need a patched version, I figured the easiest way to ensure a working version is used is to rename the executable. I named it mozmake, but it could be anything else.

I can build in my Windows VM with this patch and http://people.mozilla.org/~mhommey/mozmake.exe , provided i build with --without-intl-api (there are still issues with the icu build). This is good enough imho to review and land. ICU can be handled in a followup (or maybe by a new build of mozmake.exe)
Attachment #816871 - Flags: review?(gps)
Attachment #815868 - Attachment is obsolete: true
Ah, I should mention this: it actually fails at the end of the build when MOZ_PSEUDO_DERECURSE has no-skip because the command line for the binary-deps target is too long.
Got icu to build with a small additional change.
Attachment #816897 - Flags: review?(gps)
Attachment #816871 - Attachment is obsolete: true
Attachment #816871 - Flags: review?(gps)
Sigh... broken shell quoting on windows...

And with a mach change to use mozmake when it's available.
Attachment #816907 - Flags: review?(gps)
Attachment #816897 - Attachment is obsolete: true
Attachment #816897 - Flags: review?(gps)
Sorry for the noise ; this version still allows to force pymake even when mozmake is there.
Attachment #816908 - Flags: review?(gps)
Attachment #816907 - Attachment is obsolete: true
Attachment #816907 - Flags: review?(gps)
And a stupid fixup for mach.
Attachment #816919 - Flags: review?(gps)
Attachment #816908 - Attachment is obsolete: true
Attachment #816908 - Flags: review?(gps)
So, in then end, running from a non msys command also makes $(MAKE) contain .exe.
Attachment #816922 - Flags: review?(gps)
Attachment #816919 - Attachment is obsolete: true
Attachment #816919 - Flags: review?(gps)
Depends on: 926900
Some timing I got on try with a normal build (non PGO):
- opt, with pymake: 52:32 spent in compilation ; 57:37 in make check
- opt, with mozmake: 46:38 spent in compilation ; 25:43 in make check
- debug, with pymake: 56:26 spent in compilation ; 1:30:07 in make check
- debug, with mozmake: 50:05 spent in compilation ; 36:24 in make check
Depends on: 869406
I'd rather not use a core # multiplier because I don't believe it will
result in much gains (we're only a few percent away from 100% during
compile) and will likely just increase context switching and system
load.

If we had anonymous reporting of build system resource usage back to
Mozilla, would could experiment with values and choose a scientifically
proven value :)
Attachment #817176 - Flags: review?(mh+mozilla)
Assignee: mh+mozilla → gps
Comment on attachment 817176 [details] [diff] [review]
Default to building with #cores + 2

Wrong bug.
Attachment #817176 - Flags: review?(mh+mozilla)
Attachment #817176 - Attachment is obsolete: true
Assignee: gps → mh+mozilla
I wanted to test this before reviewing. When I did, I ran into an error.

Calling into link_deps to produce binaries-deps.mk produces a "Bad file number" error. I /think/ this has to do with excessively long commands (or at least an argument after the first few thousand bytes). On my machine, the auto-generated temp file holding the recipe for execution by sh.exe contains a line/command that was ~44,000 characters long!

I know there is history behind very long commands and Windows. We may even have a hack deployed on machines to accept very long commands. It sounds like we may need to have link_deps read its arguments from a file?
(In reply to Gregory Szorc [:gps] from comment #16)
> I know there is history behind very long commands and Windows. We may even
> have a hack deployed on machines to accept very long commands. It sounds
> like we may need to have link_deps read its arguments from a file?

See comment 7. Bug 926733 makes that work.
Depends on: 926733
(In reply to Mike Hommey [:glandium] from comment #17)
> (In reply to Gregory Szorc [:gps] from comment #16)
> > I know there is history behind very long commands and Windows. We may even
> > have a hack deployed on machines to accept very long commands. It sounds
> > like we may need to have link_deps read its arguments from a file?
> 
> See comment 7. Bug 926733 makes that work.

That is a very hacky workaround and a time bomb if our tree ever gets too large. Hopefully we'll have autogenerated rules for producing object and library files before it detonates. I think we will.
Comment on attachment 816922 [details] [diff] [review]
Allow to build with a special build of GNU make on windows

Review of attachment 816922 [details] [diff] [review]:
-----------------------------------------------------------------

\o/

::: security/build/Makefile.in
@@ +160,4 @@
>  endif
>  # Hack to force NSS build system to use "normal" object directories
> +DEFAULT_GMAKE_FLAGS += ABS_topsrcdir='$(ABS_topsrcdir)'
> +DEFAULT_GMAKE_FLAGS += BUILD='$(MOZ_BUILD_ROOT)/security/$$(subst $$(ABS_topsrcdir)/security/,,$$(CURDIR))'

Must we export ABS_topsrcdir? I don't see what's wrong expanding it right here.
Attachment #816922 - Flags: review?(gps) → review+
(In reply to Gregory Szorc [:gps] from comment #19)
> ::: security/build/Makefile.in
> @@ +160,4 @@
> >  endif
> >  # Hack to force NSS build system to use "normal" object directories
> > +DEFAULT_GMAKE_FLAGS += ABS_topsrcdir='$(ABS_topsrcdir)'
> > +DEFAULT_GMAKE_FLAGS += BUILD='$(MOZ_BUILD_ROOT)/security/$$(subst $$(ABS_topsrcdir)/security/,,$$(CURDIR))'
> 
> Must we export ABS_topsrcdir? I don't see what's wrong expanding it right
> here.

MSYS path mangling fun. If ABS_topsrcdir is expanded directly in BUILD, MSYS fucks up both the paths (MOZ_BUILD_ROOT, too), while it doesn't when doing it this way.
(In reply to Gregory Szorc [:gps] from comment #18)
> (In reply to Mike Hommey [:glandium] from comment #17)
> > (In reply to Gregory Szorc [:gps] from comment #16)
> > > I know there is history behind very long commands and Windows. We may even
> > > have a hack deployed on machines to accept very long commands. It sounds
> > > like we may need to have link_deps read its arguments from a file?
> > 
> > See comment 7. Bug 926733 makes that work.
> 
> That is a very hacky workaround and a time bomb if our tree ever gets too
> large. Hopefully we'll have autogenerated rules for producing object and
> library files before it detonates. I think we will.

Something else to look forward to: when all variables related to objects, libraries and programs are in moz.build, we can skip directories that don't declare those variables even if they have a Makefile.in.
Depends on: 927260
The previous patch made nsinstall.py be used by all platforms for nss. And it looks like nsinstall.py is actually broken with symlinks. Anyways, the intent was not to use nsinstall.py, and the reason this happened is because NSINSTALL and NSINSTALL_PY are defined in config.mk, and I was testing them before including rules.mk (which includes config.mk). So I moved everything NSINSTALL related after the rules.mk include (I also tries including config.mk earlier, but that breaks the build for other reasons)
https://hg.mozilla.org/integration/mozilla-inbound/rev/afae5911a1e0
Depends on: 927213
https://hg.mozilla.org/mozilla-central/rev/afae5911a1e0
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla27
Did anyone ensure that this works with releng's release locale build automation? I don't want to find ourselves in that pickle again. I'm assuming it will since it returns to how things were, but I'm not certain. Need-info'ing Aki since he had the unenviable job of writing the fixes for much of that pymake for make locales fiasco with the 24 release. Aki, feel free to hot-potato this to someone else if I have pegged you incorrectly.
Flags: needinfo?(aki)
I don't think anyone has specifically tested that yet.

The pymake setting for releases is a flag that can be disabled easily; whether it will work is another question.

I think we'll be ok if:

* RelEng continues to do a staging beta release in the final days of aurora
* we turn on gnu make for CI and release builds before requiring it.

If, as the bug summary suggests, we're allowing for gnu make without requiring it, and the staging release during the respective aurora cycle blows up, we have the option of reverting to pymake before the beta 1.

So I would suggest we verify that real live betas work with gnu make before disabling the ability to use pymake in m-c... a couple train migrations down the road.
Flags: needinfo?(aki)
Thanks for looking into it, Aki and Glandium. I appreciate it.
Product: Core → Firefox Build System
You need to log in before you can comment on or make changes to this bug.