Closed Bug 815219 Opened 7 years ago Closed 7 years ago

Default to building with all available cores

Categories

(Release Engineering :: General, defect)


Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: gps, Assigned: gps)

Details

(Keywords: dev-doc-complete)

Attachments

(2 files)

I pushed a specialized build to try which measures system resource usage when building. On the EC2 instance it hit, it never peaked above 50% CPU usage. It appears that the Linux mozconfigs are all running -j4. Since we peaked at 50% CPU, I'm guessing these machines have 8 cores and we should probably increase to -j12.

I'm not sure if it's safe to make this change globally or if we should conditionally increase it on just the EC2 builders.

https://tbpl.mozilla.org/php/getParsedLog.php?id=17338665&tree=Try&full=1 contains the raw data.
catlee told me what type of EC2 instance we were using but I forget now:
http://aws.amazon.com/ec2/instance-types/

I wonder if we shouldn't make the build (whether via mozconfig or other trickery) able to determine the number of cores and set -j to an appropriate value.
I wholeheartedly agree that -j should be chosen automatically. Early implementations of mach featured this. Unfortunately, it got lost when I transitioned to building through client.mk.

There are a number of solutions to this. Unfortunately, I think they are all somewhat dirty.

The one-liner you are looking for to obtain CPU count is:

  python -c 'import multiprocessing; print(multiprocessing.cpu_count() + 1)'

We can bikeshed about how many extra processes to add. I don't think any more than 2 extra is beneficial. My measurements show that even 1 extra doesn't do much, if anything.
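To make the idea concrete, here is a minimal sketch of what an automatic -j helper could look like, building on the one-liner above. The function name `choose_job_count` is hypothetical, not from the actual patch:

```python
import multiprocessing

def choose_job_count(extra=1):
    """Pick a make -j value: number of cores plus a small headroom.

    Per the measurements discussed here, more than 1-2 extra jobs buys
    little, so `extra` defaults to 1 (matching the one-liner above).
    """
    try:
        cores = multiprocessing.cpu_count()
    except NotImplementedError:
        # cpu_count() can raise on unusual platforms; fall back to serial.
        cores = 1
    return max(1, cores + extra)

print(choose_job_count())
```

The try/except matters because `multiprocessing.cpu_count()` is documented to raise `NotImplementedError` on platforms where the core count can't be determined.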
-j<#of cores> is the right first pass here. We can tweak it later if there's a more correct value.
What could go wrong?
Attachment #685759 - Flags: review?(ted)
Pretend that last patch doesn't contain the "+ 1"
Comment on attachment 685759 [details] [diff] [review]
Add -jN to MOZ_MAKE_FLAGS automatically, v1

Review of attachment 685759 [details] [diff] [review]:
-----------------------------------------------------------------

Sure, why not?
Attachment #685759 - Flags: review?(ted) → review+
Since we set -j automatically now and the default value is optimal, we should be able to remove its definition from the in-tree mozconfigs. This patch does that.

The only place where MOZ_MAKE_FLAGS is still referenced is Windows. If we're not using pymake, we explicitly use -j1. We should probably just have the driver bail if GNU make is used on Windows. But, I'm pretty sure we can't do that yet since not all Windows tree configs have been swung over to pymake.

Try at https://tbpl.mozilla.org/?tree=Try&rev=d4e98a1704ac
Assignee: nobody → gps
Status: NEW → ASSIGNED
Attachment #685769 - Flags: review?(ted)
Summary: Increase make -j on EC2 builders → Default to building with all available cores
Speak now or forever hold your peace.
gogo
Comment on attachment 685769 [details] [diff] [review]
Part 2: Remove -jN from in-tree mozconfigs, v1

Review of attachment 685769 [details] [diff] [review]:
-----------------------------------------------------------------

We should make sure this doesn't regress build speed on any of our current platforms.
Attachment #685769 - Flags: review?(ted) → review+
Yay!   If anyone objects to passing -j#ofcores automatically, please report them to the authorities.  Thank you!
coop was looking at updating the Build Faster dashboards. Did we ever get those back up and running?
Did we up our python version requirement enough for this?  If so, lets do it!
(In reply to Kyle Huey [:khuey] (khuey@mozilla.com) from comment #13)
> Did we up our python version requirement enough for this?  If so, lets do it!

multiprocessing was added in Python 2.6, which is our current minimum required Python version.
I object to -j#ofcores.

-j#ofcores*1.5 has my vote. :)
From 2 try builds I performed today. Before -> After times for just the compile buildbot step.

Linux64 Opt:     19:00 (try-linux64-ec2-618) -> 16:21 (bld-centos6-hp-027)
OS X 10.7 Opt:   21:38 (bld-lion-r5-021)     -> 21:05 (bld-lion-r5-009)
Win Opt:         39:50 (w64-ix-slave34)      -> 39:30 (w64-ix-slave28)
Android 2.2 Opt: 32:19 (try-linux64-ec2-317) -> 20:20 (bld-centos6-hp-032)
B2G ARM Opt:     23:31 (try-linux64-ec2-617) -> 17:38 (bld-centos6-hp-041)
B2G Panda Opt:   24:02 (try-linux64-ec2-325) -> 25:15 (bld-centos6-hp-024)
B2G Unagi Opt:   33:33 (try-linux64-ec2-609) -> 24:45 (bld-centos6-hp-040)

As comparisons, most of these numbers are worthless because A) the machines are different and B) ccache state significantly impacts build times.

The numbers do show that there are no significant regressions in performance (I would expect a 2-4x slowdown if the patches didn't work, for example).

Regardless of what this does for buildbot times, this makes individual development much nicer. If you are on a multi-core machine, you just run |./mach build| and you use all the cores without any extra configuration.
https://hg.mozilla.org/integration/mozilla-inbound/rev/ba730945bc6d
https://hg.mozilla.org/integration/mozilla-inbound/rev/7f5e2a9addff

(In reply to Mike Hommey [:glandium] from comment #15)
> I object to -j#ofcores.
> 
> -j#ofcores*1.5 has my vote. :)

Let's wait to discuss this until after data proves we regressed build times. I suspect we won't be having a discussion :)
We will need to scrub MDN of most references to -jN. I think about the only legitimate reference to it should be telling people how to slow builds down in case they are using too many system resources. I would trust our build system to make optimal decisions about the proper -j value. If this means building a lookup table or something or some kind of algorithm supported by data, we should do that in a follow-up bug. i.e. users should not need to take any action to ensure optimal build times.
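If the follow-up bug ever goes the lookup-table route, it could be as simple as the sketch below. The table values here are illustrative placeholders for the shape of the idea, not measured data:

```python
import multiprocessing

# Hypothetical tuning table: (max core count, -j value). These numbers
# are placeholders for illustration, not the result of measurements.
_J_TABLE = [(2, 3), (4, 5), (8, 9), (16, 16)]

def optimal_jobs(cores=None):
    """Map a core count to a -j value via the tuning table, falling
    back to cores + 1 for machines bigger than anything we tuned."""
    if cores is None:
        cores = multiprocessing.cpu_count()
    for limit, jobs in _J_TABLE:
        if cores <= limit:
            return jobs
    return cores + 1

print(optimal_jobs(4))
```

The point is that whatever the mapping is, it lives in one place in the build system, so users never have to set -j themselves to get optimal build times.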
Keywords: dev-doc-needed
(In reply to comment #17)
> https://hg.mozilla.org/integration/mozilla-inbound/rev/ba730945bc6d
> https://hg.mozilla.org/integration/mozilla-inbound/rev/7f5e2a9addff
> 
> (In reply to Mike Hommey [:glandium] from comment #15)
> > I object to -j#ofcores.
> > 
> > -j#ofcores*1.5 has my vote. :)
> 
> Let's wait to discuss this until after data proves we regressed build times. I
> suspect we won't be having a discussion :)

Because we don't have the data?  :P
Should we backport this to aurora/beta as well?
(In reply to Ehsan Akhgari [:ehsan] from comment #20)
> Should we backport this to aurora/beta as well?

If we build those trees enough to warrant backport, sure. This change should be pretty harmless and a pretty easy candidate for backport.

Although, we should probably wait a day or two. I expect this patch to bring at least one house of cards down (my bet is on l10n nightlies).
(In reply to comment #21)
> (In reply to Ehsan Akhgari [:ehsan] from comment #20)
> > Should we backport this to aurora/beta as well?
> 
> If we build those trees enough to warrant backport, sure. This change should be
> pretty harmless and a pretty easy candidate for backport.

We're doing a lot of landings (I mean, more than usual) on aurora and beta because of b2g.

> Although, we should probably wait a day or two. I expect this patch to bring at
> least one house of cards down (my bet is on l10n nightlies).

Agreed.
I suppose ms2ger cc'ed me on this bug (thanks!) to confirm that it works on OpenBSD... and yes, it does :)

$ python2.7 -c 'import multiprocessing; print(multiprocessing.cpu_count())'
2

tip configures fine, and -j2 is indeed passed to gmake if I don't set it in .mozconfig.

 | | |-+= 02736 landry gmake -f client.mk
 | | | \-+- 12662 landry gmake -f /src/mozilla-central/client.mk realbuild
 | | |   \-+- 01171 landry gmake -j2 -C /usr/obj/m-c
multiprocessing itself should be safe on the BSDs. It's when you get into the true multiprocessing foo that uses locks that you run into trouble.
Product: mozilla.org → Release Engineering
Bah, I'm used to make -s -j3 -f client.mk, but now that Standard8 has ported this to comm-central I have to remember to switch to make -s -f client.mk MOZ_MAKE_FLAGS=-j3, otherwise client.mk ignores me.
(this is Windows where I have aliased make to pymake; I'm told that gmake ignores -j3 because client.mk contains .NOTPARALLEL)
(and the reason I add -j3 to the command line is because that way it affects all of my makes, not just the ones that go through client.mk)
I should try not to think about this after midnight, because pymake might be saving me anyway.
So, for gmake, the rule is that a submake with an explicit $(MAKE) -j setting does not trigger the normal parent/child job sharing. The parent makefile treats the submakefile as its own job, parallelising it with anything else in the same make, and then the submakefile creates its own parallel jobs. This means that gmake -j3 -f client.mk uses the override setting in client.mk to build.

For pymake, the way that builtins aren't multiprocessed really confused me, but I eventually figured out a workaround by using an external shell script. This shows that pymake ignores -j settings with conflicting values; only -j1 is correctly honoured, so that pymake -j3 -f client.mk ignores the override.