Bootstrap fails with 32-core / 64-thread processor
Categories
(Firefox Build System :: Bootstrap Configuration, defect)
Tracking
(firefox76 fixed, firefox77 fixed)
People
(Reporter: kip, Assigned: glandium)
Details
Attachments
(3 files)
Ran mach bootstrap on a fresh mozilla-unified clone and fresh mozillabuild install.
Windows 10 running on a Threadripper 2990wx (32-core 64-thread).
This failed with an error:
ValueError: max_workers must be <= 61
Attached is longer tty output and the callstack.
Reporter
Comment 1•4 years ago
This appears to be Windows-specific, as mach bootstrap succeeds on a Xeon 7210 (64-core / 256-thread) system running Linux.
Reporter
Comment 2•4 years ago
Looked a bit deeper...
It appears that the problem is rooted in the 61-worker limit of the Python 3 concurrent.futures library on Windows:
https://docs.python.org/3.8/library/concurrent.futures.html
This is implemented using Win32 WaitForMultipleObjects calls, which cannot wait on more than 63 objects in a single call.
Now that there are consumer-oriented, single-socket processors with up to 64 cores / 128 threads, this function call is obsolete for the purpose of synchronizing workers instantiated per CPU. This is better done with IOCPs (https://docs.microsoft.com/en-us/windows/win32/fileio/i-o-completion-ports). It seems that even the latest version of concurrent.futures has this 61-worker limitation.
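For anyone wanting to check the limit described above on their own machine, here is a minimal sketch. The helper name `try_pool_size` is hypothetical and not part of mozbuild. It only constructs the executor and does not submit any work, so on platforms without the limit it should simply return None.

```python
from concurrent.futures import ProcessPoolExecutor


def try_pool_size(requested_workers):
    """Return None if the pool can be constructed, else the error message.

    Hypothetical helper for illustration only.
    """
    try:
        executor = ProcessPoolExecutor(max_workers=requested_workers)
    except ValueError as exc:
        # On Windows, an explicit max_workers above the documented limit
        # is rejected when the executor is constructed.
        return str(exc)
    # No work was submitted, so shutting down is cheap.
    executor.shutdown()
    return None


if __name__ == "__main__":
    print(try_pool_size(64))
```

On a Windows setup like the one in the report, this would be expected to print a message along the lines of the ValueError quoted in the description. On Linux it should print None.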
Reporter
Comment 3•4 years ago
I was able to complete a bootstrap by hard-coding max_workers = 61 in python/mozbuild/mozbuild/frontend/reader.py:
# max_workers = cpu_count()
max_workers = 61
Comment 4•4 years ago
Easiest fix is probably just clamping the max_workers on Windows only.
Comment 5•4 years ago
ProcessPoolExecutor will naturally default to the number of CPUs on the machine and will also handle edge cases on Windows.
I'd love to hear about what kind of build times you get with that machine. :)
Pushed by nfroyd@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/f6e96e32b5f0 remove max_workers argument for ProcessPoolExecutor; r=dmajor
Comment 8•4 years ago
bugherder
Assignee
Comment 9•4 years ago
So, fun fact, this actually didn't fix it entirely... because somehow the capping doesn't work properly in Python.
See https://bugs.python.org/issue26903#msg365886
Assignee
Comment 10•4 years ago
Comment 11•4 years ago
Pushed by mh@glandium.org: https://hg.mozilla.org/integration/autoland/rev/be8356bf09d7 Cap ProcessPoolExecutor's max_workers to 60 on Windows. r=firefox-build-system-reviewers,rstewart
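As a rough illustration of what a cap like the one in the commit message above could look like, here is a minimal sketch. It is not the actual patch: the constant `WINDOWS_MAX_WORKERS` and the helpers `capped_max_workers` and `make_executor` are hypothetical names, and the `cpu_count` and `platform` parameters exist only so the logic can be exercised without a 64-thread Windows machine.

```python
import os
import sys
from concurrent.futures import ProcessPoolExecutor

# Hypothetical constant; 60 is the value named in the commit message.
WINDOWS_MAX_WORKERS = 60


def capped_max_workers(cpu_count=None, platform=None):
    """Return a worker count intended to be safe for ProcessPoolExecutor."""
    if cpu_count is None:
        # os.cpu_count() may return None, so fall back to a single worker.
        cpu_count = os.cpu_count() or 1
    if platform is None:
        platform = sys.platform
    if platform == "win32":
        # Clamp on Windows only; other platforms keep the full CPU count.
        return min(cpu_count, WINDOWS_MAX_WORKERS)
    return cpu_count


def make_executor():
    """Build an executor with an explicit, platform-aware worker count."""
    return ProcessPoolExecutor(max_workers=capped_max_workers())
```

Passing the count explicitly, rather than relying on the executor's default, keeps the behavior independent of how a given Python version applies its own cap.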
Comment 12•4 years ago
bugherder