mach build hangs near the end of the build in make
Categories
(Firefox Build System :: General, defect, P2)
Tracking
(firefox136 fixed)
| | Tracking | Status |
|---|---|---|
| firefox136 | --- | fixed |
People
(Reporter: Alex_Gaynor, Assigned: glandium)
References
(Blocks 1 open bug)
Details
Comment 1•7 years ago
| Reporter | ||
Comment 2•7 years ago
| Reporter | ||
Comment 3•7 years ago
./mach build also hangs at the end of the build on my machine (Ubuntu 20.04) until I cancel it with Ctrl+C:
0:09.58 Finished dev [unoptimized + debuginfo] target(s) in 6.45s
0:09.76 toolkit/library/build/libxul.so
0:24.86 ./dependentlibs.list.stub
0:30.10 Packaging specialpowers@mozilla.org.xpi...
0:30.19 Packaging quitter@mozilla.org.xpi...
0:30.28 Packaging mozscreenshots@mozilla.org.xpi...
^CProcess Process-1:ort compile misc libs tools
The next build succeeds.
Comment 5•2 years ago
I am encountering the same issue. I think it might be related to sccache: running sccache --stop-server while the build is hanging makes it continue as usual.
Comment 6•2 years ago
The hang seems to occur when the sccache server is not already running when the build starts. Running sccache --stop-server ; ./mach build reproduces it every time for me.
Comment 8•1 year ago
Could somebody please work on improving this? I'm hitting it multiple times a day, and it wastes a minute or so of my time whenever it happens. From the many dupes, I can't be the only one. Thanks.
Comment 11•1 year ago
(In reply to Andrew McCreight [:mccr8] from comment #8)
> Could somebody please work on improving this? I'm hitting it multiple times a day, and it wastes a minute or so of my time whenever it happens. From the many dupes, I can't be the only one. Thanks.
Same here. If there is anything I can do to help track this down please let me know. It's probably happening on around a third of my builds at this point.
Comment 12•1 year ago
Glandium said he will dig into this.
Comment 13•1 year ago
This might be a red herring, but here is the stack I got when I pressed Ctrl+C:
Traceback (most recent call last):
  File "/usr/lib/python3.11/multiprocessing/process.py", line 314, in _bootstrap
    self.run()
  File "/usr/lib/python3.11/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/home/julien/travail/git/mozilla-central/testing/mozbase/mozsystemmonitor/mozsystemmonitor/resourcemonitor.py", line 137, in _collect
    while not _poll(pipe, poll_interval=sleep_interval):
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/julien/travail/git/mozilla-central/testing/mozbase/mozsystemmonitor/mozsystemmonitor/resourcemonitor.py", line 104, in _poll
    return pipe.poll(poll_interval)
           ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/multiprocessing/connection.py", line 256, in poll
    return self._poll(timeout)
           ^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/multiprocessing/connection.py", line 423, in _poll
    r = wait([self], timeout)
        ^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/multiprocessing/connection.py", line 930, in wait
    ready = selector.select(timeout)
            ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/selectors.py", line 415, in select
    fd_event_list = self._selector.poll(timeout)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
KeyboardInterrupt
Comment 14•1 year ago
I'm running into this multiple times a day. At this point it's really slowing me down. Is there a known workaround? Would you consider bumping this in severity / priority?
Comment 15•1 year ago
I'll bump the priority and see if Serge can take a look since Glandium hasn't been able to.
I will be on leave for ~2 months starting next week, otherwise I would do it.
| Assignee | ||
Comment 16•1 year ago
So... this only happens on Mac. Not Linux. Not Windows.
A workaround: setting SCCACHE_IDLE_TIMEOUT=5 in the environment, or running sccache --start-server before the build.
But I'm also not sure this has anything to do with the original bug anymore.
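For reference, the two workarounds mentioned above could be scripted roughly as follows. This is only a sketch: the WORKAROUND and RUN_BUILD variables are hypothetical names made up for this example (they are not mach or sccache options), and it assumes sccache is on PATH and that the script runs from the top of the source tree.

```shell
#!/bin/sh
# WORKAROUND and RUN_BUILD are hypothetical names used only in this sketch.
WORKAROUND="${WORKAROUND:-idle-timeout}"

case "$WORKAROUND" in
  idle-timeout)
    # Let an sccache server spawned during the build exit after 5 idle seconds.
    export SCCACHE_IDLE_TIMEOUT=5
    ;;
  prestart)
    # Start the server before the build, so the build does not have to spawn it.
    if command -v sccache >/dev/null 2>&1; then
      # Ignore a failure here, e.g. if a server is already running.
      sccache --start-server || true
    fi
    ;;
esac

# Only run the build when explicitly requested.
if [ "${RUN_BUILD:-0}" = "1" ]; then
  ./mach build
fi
```

Running it with RUN_BUILD=1 applies the idle-timeout workaround by default; WORKAROUND=prestart selects the other one.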
| Assignee | ||
Comment 17•1 year ago
Comment 13 also seems to be a different issue.
| Assignee | ||
Comment 18•1 year ago
This is caused by Apple's GNU make patch https://raw.githubusercontent.com/apple-oss-distributions/gnumake/refs/tags/gnumake-135/patches/PR-5071266.patch but a similar situation can also arise in the following cases, on both Linux and Mac:
- with a verbatim make 4.4.1
- with a verbatim make 3.81 when adding a + before the sccache command
This happens because sccache, invoked from make, inherits make's jobserver file descriptors. It's not supposed to, because the commands in the Makefile don't have a + prefix, but on Mac that patch makes it happen anyway. It also happens with newer versions of make because they give access to the jobserver through a named FIFO, which is always available whether or not the command has a +.
(So comment 13 might, in fact, be the same bug; comment 4 looks like it might be a different issue, but comment 0 might be the same.)
In the case of newer make versions, not initializing the jobserver from the environment would be a workaround. sccache would have its own jobserver, and while it would not respect the limits set by the original make jobserver, it would still serve its purpose. The theoretically ideal solution would be for the jobserver pipes to be passed to the sccache server by the sccache client, but that would require Unix sockets, and sccache currently uses TCP sockets.
Unfortunately, in the case of older make versions, the simple fact of daemonizing triggers the problem. That is, the fact that the pipes are still around in the daemonized process blocks make until that process closes.
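The general effect can be seen outside of make with a plain shell pipeline. This is an illustration of pipe inheritance only, under the assumption that it resembles what happens here; it is not what make or sccache actually execute. A background process that inherits the write end of a pipe keeps the reader waiting for end-of-file until that process exits, while one whose output is redirected elsewhere does not:

```shell
#!/bin/sh
# Case 1: the background sleep inherits the pipe's write end as its stdout,
# so cat only sees end-of-file once sleep exits.
start=$(date +%s)
( sleep 2 & ) | cat
inherited=$(( $(date +%s) - start ))

# Case 2: the background sleep has stdout and stderr redirected away from
# the pipe, so cat sees end-of-file as soon as the subshell exits.
start=$(date +%s)
( sleep 2 >/dev/null 2>&1 & ) | cat
detached=$(( $(date +%s) - start ))

echo "inherited: ${inherited}s, detached: ${detached}s"
```

The first pipeline should take about 2 seconds and the second should return almost immediately, even though both start the same background process.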
Comment 19•1 year ago
This also happens on Linux. My reported bug 1841067 was for Linux. (It shows the same stack trace as comment 13, IIRC.) If it is a different issue, it might be good to reopen it.
| Assignee | ||
Comment 20•1 year ago
Can y'all test these builds of sccache:
- Linux: https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/XgOwaOFTR1COSG_RQ3pCEw/runs/0/artifacts/public/build/sccache.tar.zst
- Intel mac: https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/Pk8K_ZPqSB69NxjPdDu0hw/runs/0/artifacts/public/build/sccache.tar.zst
- Apple silicon mac: https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/a4a2GB1eSjO58C1wDhyflw/runs/0/artifacts/public/build/sccache.tar.zst
Downloading and unpacking those might not be totally convenient, so what you can do is the following:
cd ~/.mozbuild
/path/to/source_dir/mach artifact toolchain --from-task <TASK>:public/build/sccache.tar.zst
Where <TASK> is:
- XgOwaOFTR1COSG_RQ3pCEw for Linux
- Pk8K_ZPqSB69NxjPdDu0hw for Intel mac
- a4a2GB1eSjO58C1wDhyflw for Apple silicon mac
(beware that updating your tree may make the build pull a new sccache, so please try to avoid pulling when testing)
A reliable way I found to reproduce:
./mach configure
find obj* -name \*.o -delete
~/.mozbuild/sccache/sccache --stop-server
./mach build
Comment 21•1 year ago
(In reply to Mike Hommey [:glandium] from comment #20)
> Can y'all test these builds of sccache:
This fixes the issue for me!
Comment 22•1 year ago
We upgraded sccache with a fix here:
https://bugzilla.mozilla.org/show_bug.cgi?id=1940229
Mike, can we close this bug? Thanks.