Closed Bug 1308982 Opened 9 years ago Closed 9 years ago

Read gyp files in parallel

Tracking

(firefox53 fixed)

Status:

RESOLVED FIXED

Milestone:

mozilla53

Tracking Flags:

Tracking

Status

firefox53

---

fixed

People

(Reporter: chmanchester, Assigned: chmanchester)

References

Details

Attachments

(3 files)

Bug 1308982 - Normalize gyp_reader.py indentation to 4 spaces. 9 years ago Chris Manchester (limited bugmail, email directly) 58 bytes, text/x-review-board-request	gps : review+	Details
Bug 1308982 - Stop adding gyp directories to the DIRS make variable. 9 years ago Chris Manchester (limited bugmail, email directly) 58 bytes, text/x-review-board-request	gps : review+	Details
Bug 1308982 - Read gyp files in a separate process. 9 years ago Chris Manchester (limited bugmail, email directly) 58 bytes, text/x-review-board-request	gps : review+	Details

Chris Manchester (limited bugmail, email directly)

Assignee

Description

•

9 years ago

Passing parallel=True to the gyp reader shaves some seconds off of build-backend. Bug 1295937 makes some changes in this area, so I'll re-measure once that lands.

(not currently active) Ted Mielczarek

Comment 1

•

9 years ago

It's definitely noticeable when using gyp to build NSS standalone. The version of gyp I installed from the Ubuntu packages defaulted to --no-parallel, and using the git version (which defaults to --parallel) is much faster.

Chris Manchester (limited bugmail, email directly)

Assignee

Comment 2

•

9 years ago

I'm going to morph this bug slightly, to not only pass parallel=True to the gyp reader, but do this in its own process, so reading gyp files as a whole can execute in parallel to the main reader/emitter pipeline. I have some initial patches that look promising.

Chris Manchester (limited bugmail, email directly)

Assignee

Updated

•

9 years ago

Depends on: 1316140

Chris Manchester (limited bugmail, email directly)

Assignee

Updated

•

9 years ago

Depends on: 1295937

Comment hidden (mozreview-request)

Gregory Szorc [:gps]

Comment 6

•

9 years ago

mozreview-review

Comment on attachment 8815863 [details] Bug 1308982 - Normalize gyp_reader.py indentation to 4 spaces. https://reviewboard.mozilla.org/r/94484/#review96936

Attachment #8815863 - Flags: review?(gps) → review+

Gregory Szorc [:gps]

Comment 7

•

9 years ago

mozreview-review

Comment on attachment 8815864 [details] Bug 1308982 - Stop adding gyp directories to the DIRS make variable. https://reviewboard.mozilla.org/r/94486/#review96938

Attachment #8815864 - Flags: review?(gps) → review+

Gregory Szorc [:gps]

Comment 8

•

9 years ago

mozreview-review

Comment on attachment 8815865 [details] Bug 1308982 - Read gyp files in a separate process. https://reviewboard.mozilla.org/r/94488/#review96940 This reduces wall time from ~10.6s to ~7.0s on my machine! The approach looks solid. Just a few code issues. ::: python/mozbuild/mozbuild/frontend/gyp_reader.py:383 (Diff revision 1) > + includes = [encode(mozpath.join(script_dir, 'common.gypi'))] > + finder = FileFinder(chrome_src, find_executables=False) > + includes.extend(encode(mozpath.join(chrome_src, name)) > + for name, _ in finder.find('*/supplement.gypi')) As a further optimization, could this be moved into `load_gyp()`? FileFinder and I/O feels like the kind of thing that should be moved to a worker process. ::: python/mozbuild/mozbuild/frontend/gyp_reader.py:396 (Diff revision 1) > + if self._results: > + for res in self._results: > + yield res > + else: > + flat_list, targets, data, t = self._gyp_loader_future.result() > + self.execution_time += t > + for res in process_gyp_result((flat_list, targets, data), self._gyp_dir_attrs, > + self._path, self._config, self._output, > + self._non_unified_sources, self._action_overrides): > + self._results.append(res) > + yield res This pattern does funky things when it is called multiple times and the first call hasn't consumed the generator. This is because `self._results` will have a value upon consuming 1 item from the generator. However, a subsequent caller could call `results` and that call would only yield 1 value because the first generator hadn't been exhausted yet. i.e. you could get incomplete results. ::: python/mozbuild/mozbuild/frontend/reader.py:885 (Diff revision 1) > + try: > + from multiprocessing import cpu_count > + num_workers = cpu_count() > + except ImportError: > + pass Why is this needed? IIRC the only issue importing multiprocessing stem from using the pipe features of multiprocessing: cpu_count() should work just fine. We do call `multiprocessing.cpu_count()` without guards in `python/mozbuild/mozbuild/base.py`, so there shouldn't be a problem. ::: python/mozbuild/mozbuild/frontend/reader.py:905 (Diff revision 1) > + return ExecutionSummary( > + 'Read {file_count:d} gyp files in parallel in ' > + '{execution_time:.2f}s', > + file_count=self._gyp_file_count, > + execution_time=self._gyp_execution_time) When I ran this series locally, I was surprised that the sum of all times was greater than the wall time! It's not a blocking issue, but I think it would be more useful if we clearly denoted the wall time boundaries in end-to-end execution, not sum of wall times from N invocations. What I think is interesting is: * total read time (moz.build + gyp) * whether GYP processing finished *after* moz.build reading and therefore slowed things down Perhaps it could report total read time, moz.build read time, gyp read time (all in wall times of course). You could still report total wall time in GYP execution. But that's not too interesting unless GYP is a long pole. ::: python/mozbuild/mozbuild/frontend/reader.py:925 (Diff revision 1) > path = mozpath.join(self.config.topsrcdir, 'moz.build') > - return self.read_mozbuild(path, self.config) > + for r in self.read_mozbuild(path, self.config): > + yield r > + for g in self._gyp_processors: > + for gyp_context in g.results: > + self._gyp_file_count += len(gyp_context.all_paths) Running locally, this reports 1905 gyp files! Surely we don't have that many files and there are redundant entries. Do you think we should distinguish between unique files and total file reads? This is a minor and non-blocking issue. I ask because glandium has made noise about the file counts being inaccurate. If you're in this code and can change it easily, you might as well do the right thing.

Attachment #8815865 - Flags: review?(gps) → review-

Chris Manchester (limited bugmail, email directly)

Assignee

Comment 9

•

9 years ago

mozreview-review-reply

Comment on attachment 8815865 [details] Bug 1308982 - Read gyp files in a separate process. https://reviewboard.mozilla.org/r/94488/#review96940 > As a further optimization, could this be moved into `load_gyp()`? FileFinder and I/O feels like the kind of thing that should be moved to a worker process. From a quick timing locally, the time spent in this finder is negligible. > This pattern does funky things when it is called multiple times and the first call hasn't consumed the generator. > > This is because `self._results` will have a value upon consuming 1 item from the generator. However, a subsequent caller could call `results` and that call would only yield 1 value because the first generator hadn't been exhausted yet. i.e. you could get incomplete results. Moved the assignment to `self._results` to after processing is complete. > When I ran this series locally, I was surprised that the sum of all times was greater than the wall time! > > It's not a blocking issue, but I think it would be more useful if we clearly denoted the wall time boundaries in end-to-end execution, not sum of wall times from N invocations. What I think is interesting is: > > * total read time (moz.build + gyp) > * whether GYP processing finished *after* moz.build reading and therefore slowed things down > > Perhaps it could report total read time, moz.build read time, gyp read time (all in wall times of course). You could still report total wall time in GYP execution. But that's not too interesting unless GYP is a long pole. In local testing I'm finding the gyp workers all finish by the time the reader starts to yield gyp_contexts, so reading gyp doesn't actually add much to end to end time anymore. I'll update this to report gyp time as the time we spend blocking in our call to `result` for each processor (which will be 0.00s in this case). > Running locally, this reports 1905 gyp files! > > Surely we don't have that many files and there are redundant entries. Do you think we should distinguish between unique files and total file reads? > > This is a minor and non-blocking issue. I ask because glandium has made noise about the file counts being inaccurate. If you're in this code and can change it easily, you might as well do the right thing. Looking at the list, there are a startling number of duplicates -- we have 120 unique entries. It might make sense to report it if we were spending time working through each of 1905 files, but it looks like gyp is doing some caching for these, so the unique number is probably more meaningful.

Comment hidden (mozreview-request)

Gregory Szorc [:gps]

Comment 13

•

9 years ago

mozreview-review

Comment on attachment 8815865 [details] Bug 1308982 - Read gyp files in a separate process. https://reviewboard.mozilla.org/r/94488/#review97558

Attachment #8815865 - Flags: review?(gps) → review+

Pulsebot

Comment 14

•

9 years ago

Pushed by gszorc@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/9e66e4d7e87d Normalize gyp_reader.py indentation to 4 spaces. r=gps https://hg.mozilla.org/integration/autoland/rev/bbb8642886b8 Stop adding gyp directories to the DIRS make variable. r=gps https://hg.mozilla.org/integration/autoland/rev/09e996bdc9e4 Read gyp files in a separate process. r=gps

Pulsebot

Comment 15

•

9 years ago

Backout by gszorc@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/02bd7db82457 Backed out changesets 09e996bdc9e4, bbb8642886b8, and 9e66e4d7e87d for landing prematurely

Pulsebot

Comment 16

•

9 years ago

Pushed by cmanchester@mozilla.com: https://hg.mozilla.org/integration/mozilla-inbound/rev/834737859127 Normalize gyp_reader.py indentation to 4 spaces. r=gps https://hg.mozilla.org/integration/mozilla-inbound/rev/4b56c03c582d Stop adding gyp directories to the DIRS make variable. r=gps https://hg.mozilla.org/integration/mozilla-inbound/rev/763fc7845e8c Read gyp files in a separate process. r=gps

Wes Kocher (:KWierso) (Not reading bugmail; email directly if needed)

Comment 17

•

9 years ago

bugherder

https://hg.mozilla.org/mozilla-central/rev/834737859127 https://hg.mozilla.org/mozilla-central/rev/4b56c03c582d https://hg.mozilla.org/mozilla-central/rev/763fc7845e8c

Status: NEW → RESOLVED

Closed: 9 years ago

status-firefox53: --- → fixed

Resolution: --- → FIXED

Target Milestone: --- → mozilla53

Mike Hommey [:glandium]

Updated

•

8 years ago

Depends on: 1337391

BMO Automation

Updated

•

7 years ago

Product: Core → Firefox Build System

You need to log in before you can comment on or make changes to this bug.