Closed Bug 1957417 Opened 10 months ago Closed 9 months ago

resourcemonitor.py still hanging the build on linux

Categories

(Firefox Build System :: General, defect)

Tracking

(Not tracked)

RESOLVED DUPLICATE of bug 1948051

People

(Reporter: julienw, Unassigned)

My build very frequently hangs forever.
When I stop it with Ctrl-C and run it again, it works fine.

When I interrupt the build, the trace looks like this:

Traceback (most recent call last):
  File "/usr/lib/python3.11/multiprocessing/process.py", line 314, in _bootstrap
    self.run()
  File "/usr/lib/python3.11/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/home/julien/travail/git/mozilla-central/testing/mozbase/mozsystemmonitor/mozsystemmonitor/resourcemonitor.py", line 137, in _collect
    while not _poll(pipe, poll_interval=sleep_interval):
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/julien/travail/git/mozilla-central/testing/mozbase/mozsystemmonitor/mozsystemmonitor/resourcemonitor.py", line 104, in _poll
    return pipe.poll(poll_interval)
           ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/multiprocessing/connection.py", line 256, in poll
    return self._poll(timeout)
           ^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/multiprocessing/connection.py", line 423, in _poll
    r = wait([self], timeout)
        ^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/multiprocessing/connection.py", line 930, in wait
    ready = selector.select(timeout)
            ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/selectors.py", line 415, in select
    fd_event_list = self._selector.poll(timeout)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
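
For context, the trace above comes from a `multiprocessing` child (it starts in `_bootstrap`) that is blocked in `pipe.poll()`, which is where a polling loop idles between iterations. Below is a minimal sketch of that loop shape, assuming a collector that keeps sampling until the parent sends a message. `collect` and its wake-up counter are hypothetical stand-ins, not the actual resourcemonitor.py code.

```python
import multiprocessing
import threading


def collect(pipe, sleep_interval=0.05):
    """Poll `pipe` until a message arrives, counting idle wake-ups."""
    samples = 0
    # Connection.poll(timeout) returns False on timeout and True once data is readable.
    while not pipe.poll(sleep_interval):
        samples += 1  # hypothetical: a real collector would record metrics here
    pipe.recv()  # consume the stop message
    return samples


if __name__ == "__main__":
    parent_end, child_end = multiprocessing.Pipe()
    # Send the stop message after a short delay, as a parent process would.
    threading.Timer(0.3, parent_end.send, args=("stop",)).start()
    print("idle wake-ups before stop:", collect(child_end))
```

Interrupting a loop like this with Ctrl-C will typically produce a trace ending in `poll`, so the trace by itself does not say what the parent side is waiting on.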

Seems similar to bug 1499382, can you confirm you're using sccache 0.9.1 or greater?

Flags: needinfo?(felash)

Glandium and I were chatting about this, and we want to add a keyboard interrupt hook here that tries to dump subprocess information. Whenever the build hangs, this is the symptom: we're waiting for something, but there's no information about what that something is, so it's impossible to determine the root cause (and we often can't reproduce it ourselves).

This should facilitate better troubleshooting, since when people encounter it, they can break out and (hopefully) have some useful details to share with us in a bug.
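
A rough sketch of what such a hook could look like, assuming the build keeps a list of the `subprocess.Popen` objects it has started. `describe_children` and `run_and_report` are hypothetical names used for illustration only; this is not the patch from bug 1948051.

```python
import subprocess
import sys


def describe_children(children):
    """Return one human-readable line per tracked subprocess."""
    lines = []
    for proc in children:
        status = proc.poll()  # None means the process is still running
        state = "running" if status is None else f"exited with {status}"
        lines.append(f"pid={proc.pid} state={state} args={proc.args!r}")
    return lines


def run_and_report(children, wait_timeout=None):
    """Wait for all children; on Ctrl-C, report what was still pending."""
    try:
        for proc in children:
            proc.wait(timeout=wait_timeout)
    except KeyboardInterrupt:
        # Dump the state of every child before propagating the interrupt.
        for line in describe_children(children):
            print(line, file=sys.stderr)
        raise


if __name__ == "__main__":
    sleeper = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(1)"])
    run_and_report([sleeper])
    print("\n".join(describe_children([sleeper])))
```

With something along these lines, pressing Ctrl-C during a hang would list which child processes were still running, which is the kind of detail that could be pasted into a bug report.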

(In reply to Alex Hochheiden [:ahochheiden] from comment #1)

> Seems similar to bug 1499382, can you confirm you're using sccache 0.9.1 or greater?

Yeah, it looks like this is indeed 0.9.1!

Flags: needinfo?(felash)

Not sure if this is still relevant, but maybe you can also try out glandium's suggestion from Bug 1948051 comment 2 and see whether you can answer the question?

Could this be closed as a duplicate of Bug 1948051?

Thanks Manuel,
The problem is that it doesn't reproduce reliably, so it's hard to try the patch :/
I believe it's the same bug, so it could indeed be closed as a duplicate.

> The problem is that it doesn't reproduce reliably, so it's hard to try the patch :/

Same problem for me :/

Status: NEW → RESOLVED
Closed: 9 months ago
Duplicate of bug: 1948051
Resolution: --- → DUPLICATE