As a security precaution, we have turned on the setting "Require API key authentication for API requests" for everyone. If this has broken something, please contact bugzilla-admin@mozilla.org
Last Comment Bug 705479 - Parallel builds (-jN) on Windows should fail early if gmake is being used
: Parallel builds (-jN) on Windows should fail early if gmake is being used
Status: RESOLVED FIXED
fixed-in-bs
:
Product: Firefox
Classification: Client Software
Component: Build Config (show other bugs)
: unspecified
: All Windows 7
: -- enhancement (vote)
: Firefox 11
Assigned To: Jon Buckley
:
: Gregory Szorc [:gps]
Mentors:
Depends on:
Blocks:
  Show dependency treegraph
 
Reported: 2011-11-26 13:34 PST by Jon Buckley
Modified: 2011-11-30 06:56 PST (History)
5 users (show)
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---


Attachments
Throw an error when parallel building (-jN) on Windows with gmake (1.16 KB, patch)
2011-11-27 12:55 PST, Jon Buckley
no flags Details | Diff | Splinter Review
Move checks to browser/build.mk (1.29 KB, patch)
2011-11-28 14:25 PST, Jon Buckley
khuey: review-
bugspam.Callek: feedback-
Details | Diff | Splinter Review
Allow multi-CPU builds to work with -j1 (1.19 KB, patch)
2011-11-29 08:13 PST, Jon Buckley
khuey: review+
Details | Diff | Splinter Review

Description User image Jon Buckley 2011-11-26 13:34:16 PST
I ran into the documented problem where doing a parallel build on Windows with gmake will randomly deadlock during the compile process. I got directed to use Pymake instead, which works great for me, but it'd be helpful for new developers if the compile process would stop with an error message if gmake is being used to do a parallel build.
Comment 1 User image Kyle Huey [:khuey] (Exited; not receiving bugmail, email if necessary) 2011-11-26 17:04:00 PST
Are Thunderbird or Seamonkey still using VMs for build infra?  IIRC we did parallel make on gmake in VMs without deadlocking because the VMs are so damn slow.
Comment 2 User image Justin Wood (:Callek) 2011-11-27 04:15:07 PST
(In reply to Kyle Huey [:khuey] (khuey@mozilla.com) from comment #1)
> Are Thunderbird or Seamonkey still using VMs for build infra?  IIRC we did
> parallel make on gmake in VMs without deadlocking because the VMs are so
> damn slow.

SeaMonkey is, but we probably will be ending up moving away from it one of these years. And yes, SeaMonkey is still using -j3 on our VM's so I would appreciate at the least a way to override this error (or no error to be introduced). We do hit the deadlock once and a while, but it is rare.
Comment 3 User image Jon Buckley 2011-11-27 12:55:53 PST
Created attachment 577158 [details] [diff] [review]
Throw an error when parallel building (-jN) on Windows with gmake

Here's a first crack at a fix for this. It doesn't handle other Mozilla applications, but it does work for building Firefox on Windows and OS X. Is putting this logic in client.mk the right approach to fixing this, or is there a better place for it?
Comment 4 User image Justin Wood (:Callek) 2011-11-27 15:18:00 PST
(In reply to Jon Buckley [:jbuck] from comment #3)
> Created attachment 577158 [details] [diff] [review] [diff] [details] [review]
> Throw an error when parallel building (-jN) on Windows with gmake
> 
> Here's a first crack at a fix for this. It doesn't handle other Mozilla
> applications, but it does work for building Firefox on Windows and OS X. Is
> putting this logic in client.mk the right approach to fixing this, or is
> there a better place for it?

Strictly my opinion, IFF we do this, we should do it in the toplevel Makefile.in not client.mk

client.mk itself is not executed in parallel, and people don't need (and sometimes don't) start with client.mk for builds, but client.mk does always dive into top-level Makefile. Could also [instead] be done in config/rules.mk.

As I said above though, I'm against doing this without an "ignore"/"off" switch.
Comment 5 User image Kyle Huey [:khuey] (Exited; not receiving bugmail, email if necessary) 2011-11-28 04:42:48 PST
Comment on attachment 577158 [details] [diff] [review]
Throw an error when parallel building (-jN) on Windows with gmake

Lets put this in browser/build.mk instead.
Comment 6 User image Ted Mielczarek [:ted.mielczarek] 2011-11-28 04:51:11 PST
I think we could make this relatively foolproof by erroring iff:
1) You're using gmake
2) On Windows
3) And you have >1 core. (Since this is Windows-only, you can probably check the NUMBER_OF_PROCESSORS env var.)
Comment 7 User image Jon Buckley 2011-11-28 14:25:51 PST
Created attachment 577379 [details] [diff] [review]
Move checks to browser/build.mk

This patch moves the parallel build checks to browser/build.mk, and changes the check to NUMBER_OF_PROCESSORS instead.
Comment 8 User image Justin Wood (:Callek) 2011-11-28 14:33:29 PST
Comment on attachment 577379 [details] [diff] [review]
Move checks to browser/build.mk

This breaks compiling Firefox with gnuMake on windows when youhave multiple CPUs even when you don't -jN.  Which for example, would break current firefox build machines too.
Comment 9 User image Justin Wood (:Callek) 2011-11-28 14:36:11 PST
(In reply to Justin Wood (:Callek) from comment #8)
> Comment on attachment 577379 [details] [diff] [review] [diff] [details] [review]
>  Which for example, would break current firefox build machines too.

Err assumptions presented as fact -- fail.

I *think* Current firefox build machines actually have multiple processors assigned, but I did not verify that assumption. but my f- stands since it also breaks the "don't break for the easy/common case" for many users.

Since the deadlocks do not happen with -j1 or no -j at all.
Comment 10 User image Kyle Huey [:khuey] (Exited; not receiving bugmail, email if necessary) 2011-11-28 14:37:54 PST
Comment on attachment 577379 [details] [diff] [review]
Move checks to browser/build.mk

Review of attachment 577379 [details] [diff] [review]:
-----------------------------------------------------------------

We shouldn't break -j1 on multiprocessor machines, even if it doesn't affect the build bots.
Comment 11 User image Jon Buckley 2011-11-29 08:13:19 PST
Created attachment 577623 [details] [diff] [review]
Allow multi-CPU builds to work with -j1

Now checks for the presence of -j in MAKEFLAGS. MAKEFLAGS doesn't get expanded on the first pass through the file, so this patch uses a workaround found on the make mailing list to search MAKEFLAGS: http://lists.gnu.org/archive/html/help-make/2009-05/msg00023.html
Comment 12 User image Jon Buckley 2011-11-29 08:30:39 PST
Just for some further clarification, MAKEFLAGS at that point in the file shows "w --jobserver-fds=3,4 -j", if you pass any -jN option via MOZ_MAKE_FLAGS, eg -j8, -j12. If you do not pass -jN in your MOZ_MAKE_FLAGS, then MAKEFLAGS will be "w".

If we need to test for specific -jN values (if someone does MOZ_MAKE_FLAGS=-j1), the only way I can think of right now is exporting the MOZ_MAKE_FLAGS variable in client.mk, so that browser/build.mk can see exactly what flag was set. Looking through mxr, this *might* be a concern with some of the build configs: http://mxr.mozilla.org/mozilla-central/search?string=MOZ_MAKE_FLAGS
Comment 13 User image Jon Buckley 2011-11-29 08:53:42 PST
Turns out I am being needlessly paranoid about my first patch :)

If you pass MOZ_MAKE_FLAGS="", MAKEFLAGS="w"
If you pass MOZ_MAKE_FLAGS="-j1", MAKEFLAGS="w"
If you pass MOZ_MAKE_FLAGS="-jN", where N is > 1, MAKEFLAGS="w --jobserver-fds=3,4 -j"
Comment 14 User image Kyle Huey [:khuey] (Exited; not receiving bugmail, email if necessary) 2011-11-29 11:24:45 PST
Comment on attachment 577623 [details] [diff] [review]
Allow multi-CPU builds to work with -j1

Review of attachment 577623 [details] [diff] [review]:
-----------------------------------------------------------------

Woo.  Looks good to me.
Comment 15 User image Kyle Huey [:khuey] (Exited; not receiving bugmail, email if necessary) 2011-11-29 13:35:29 PST
https://hg.mozilla.org/projects/build-system/rev/0e5dfc784411
Comment 16 User image Kyle Huey [:khuey] (Exited; not receiving bugmail, email if necessary) 2011-11-30 06:56:44 PST
https://hg.mozilla.org/mozilla-central/rev/0e5dfc784411

Note You need to log in before you can comment on or make changes to this bug.