Prioritize js/src in make directory traversal list

RESOLVED FIXED in Firefox 55

Status

defect
RESOLVED FIXED
3 years ago
Last year

People

(Reporter: gps, Assigned: glandium)

Tracking

(Blocks 1 bug)

unspecified
mozilla55
Dependency tree / graph

Firefox Tracking Flags

(firefox55 fixed)

Details

Attachments

(2 attachments)

Reporter

Description

3 years ago
js/src typically builds towards the end of the build. Historically, this was due to ICU and its -j1 badness being a long pole, holding it up from starting.

Now that ted implemented a concurrent build of ICU, js/src is no longer blocked from building for as long.

However, what make appears to do is build targets in the order they are listed in Makefiles as long as their dependencies are met. The order of directories in root.mk and root-deps.mk currently has js/src in the later portion of the list. While traversing into js/src may be unblocked earlier in the build now that ICU doesn't suck, make still waits a long time before building js/src. This can be problematic because js/src contain some of our longest compilation units and the late start makes js/src the long pole in the build.

On a 32 core machine, moving js/src entries to the beginning of the dependencies lists in root.mk and root-deps.mk made a build ~30s faster. That's a win worth adding hacks over.
Reporter

Comment 1

3 years ago
I might be wrong about make's behavior...
Reporter

Comment 2

3 years ago
I'm not actively working on this. https://hg.mozilla.org/users/gszorc_mozilla.com/firefox/rev/8b5f702c63d0 is my work in progress. When I tried it last, it didn't seem to always have an impact.

I once tracked down a post on GNU make's mailing list where they claimed that items were evaluated in dependency order. But it may only work if the dependency is unblocked. We may have to look at make's source code to see what is actually going on.

Bug 1276278 is also relevant to this bug, as Interpreter.cpp is the long pole in C++ compilation and that happens to be in js/src :/
Assignee: gps → nobody
Status: ASSIGNED → NEW
See Also: → 1276278
Can I convince you or anyone else to give this another shot? Actually I don't even care about a complete prioritization solution; I just really don't want Interpreter.cpp to spend 3 minutes compiling all by itself while my machine does nothing else.
Flags: needinfo?(gps)
Reporter

Comment 4

2 years ago
I agree this is high priority for developers.

I'm still dealing with backlog, however. I'll try to find someone else to look into this.
Blocks: fastci
Flags: needinfo?(gps)
I took a look at this today - it seems we're being bitten by the way make schedules dependencies when running in parallel (-jX). In serial mode (-j1), make appears to do a depth-first search, which can be seen with a test Makefile:

#######
targets := $(shell seq 1 10)
all: $(targets)

1: 2

$(targets):
        @echo "Build: $@"
        @sleep 1

.PHONY: $(targets) all
#######

Here we have ten targets (1-10), and make tries to build them in order since that's the order they're listed in after the 'all' target:

Build: 2
Build: 1
Build: 3
Build: 4
Build: 5
Build: 6
Build: 7
Build: 8
Build: 9
Build: 10

Since "1" depends on "2", it can't be built first, but it is built immediately after "2" finishes.

But if we run the same thing with -j2, target "1" is built last, even though its lone dependency is fulfilled after the first job finishes:

Build: 2
Build: 3
Build: 4
Build: 5
Build: 6
Build: 7
Build: 8
Build: 9
Build: 10
Build: 1

So all targets that can be built without dependencies will be scheduled before make tries to revisit those that were skipped the first time around because their dependencies were unmet (ie: a breadth first search). Note this happens on much larger chains as well, for example bumping the number up to 100,000 with -j2 (and no sleep, obviously):

...
Build: 99997
Build: 99998
Build: 99999
Build: 100000
Build: 1

Looking at our own build, the recurse_compile target ends up building 579 other targets when fully expanded (I'm testing locally on Linux), and we are only interested in js/src/target here. The js/src/target has a depth of 3 via the chains:

 config/external/nspr/target -> config/external/nspr/{libc,ds}/target -> config/external/nspr/pr/target

The bad news is that 554 of the 578 other targets are at depth 0-2, meaning roughly 96% of the build will be scheduled before make schedules js/src/target. Moving targets around within root-deps or the like won't help because it won't change the depth of js/src, which ultimately determines its scheduling in parallel mode.

I can try to poke inside make to see if there's anything we can use to try to force a depth-first-search in parallel mode, but I suspect it's more to do with how the parallel job execution is handled rather than an explicit algorithm change. I did also try a newer make version (4.2), but that made no difference.

Is there anything we can do to reduce the depth of js/src/target? For example, split some pieces (ie: the Interpreter and/or other long poles) of js/src into another directory that doesn't depend on other libraries? I'm not familiar with the code structure, so I've no idea if that's feasible.
Assignee

Comment 6

2 years ago
Thanks Mike for the analysis. I happen to have written a patch a while ago, for some other purpose, that, as a side effect, should reduce the depth of js/src/target. Let me revive it.
Assignee: nobody → mh+mozilla
Assignee

Updated

2 years ago
Depends on: 1353259
Comment hidden (mozreview-request)
Comment hidden (mozreview-request)
Assignee

Comment 9

2 years ago
The first patch is not strictly necessary but allows to measure the effect of the second:

Before:
  $ make -C obj-x86_64-pc-linux-gnu/ compile RECURSE_TRACE_ONLY=1 --no-print-directory -j | grep -n '\(js/src/target\|toolkit/library/target\)'
  569:js/src/target
  583:toolkit/library/target

After:
  $ make -C obj-x86_64-pc-linux-gnu/ compile RECURSE_TRACE_ONLY=1 --no-print-directory -j | grep -n '\(js/src/target\|toolkit/library/target\)'
  19:js/src/target
  583:toolkit/library/target

I think this is good enough.

Comment 10

2 years ago
mozreview-review
Comment on attachment 8854328 [details]
Bug 1262241 - Add a build system mode that prints out how directories are recursed.

https://reviewboard.mozilla.org/r/126266/#review129064
Attachment #8854328 - Flags: review?(mshal) → review+

Comment 11

2 years ago
mozreview-review
Comment on attachment 8854329 [details]
Bug 1262241 - Move the definition of the js library to a subdirectory.

https://reviewboard.mozilla.org/r/126268/#review129066

Whoa, nice! I'm glad you already had patches ready for this to save me from trying to understand GNU Make's process handling :). This looks good as far as I can tell, and I can confirm that js/src starts building much earlier, so Interpreter.cpp should begin in the first ~5% of the build rather than the last ~5%. I haven't confirmed it directly on Windows, but I don't see why it should be different.
Attachment #8854329 - Flags: review?(mshal) → review+

Comment 12

2 years ago
Pushed by mh@glandium.org:
https://hg.mozilla.org/integration/autoland/rev/a0201cb01a56
Add a build system mode that prints out how directories are recursed. r=mshal
https://hg.mozilla.org/integration/autoland/rev/2159959522f4
Move the definition of the js library to a subdirectory. r=mshal

Comment 13

2 years ago
bugherder
https://hg.mozilla.org/mozilla-central/rev/a0201cb01a56
https://hg.mozilla.org/mozilla-central/rev/2159959522f4
Status: NEW → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla55
Thanks very much mshal for working on this!
(I should have said so much earlier but this bugmail got lost among workweeks)

Updated

Last year
Product: Core → Firefox Build System
You need to log in before you can comment on or make changes to this bug.