Gecko Decision task times out when pushing to try from release or beta
Categories
(Developer Infrastructure :: Try, defect)
Tracking
(Not tracked)
People
(Reporter: jstutte, Unassigned)
References
(Blocks 1 open bug)
Details
On a given patch set, the decision task times out reliably.
Comment 1•4 years ago
The HTTP service is trying to pull a lot of changesets that are not part of the push, and times out while pulling:
Feb 23 14:29:06 bugbug app/worker.3: 2021-02-23 14:29:06,037:INFO:rq.worker:default: bugbug_http.models.schedule_tests('try', 'c79f27f66498130b4133c0052fa3770227670f16') (c6a25f1974764f318b35e0225a984cb8)
Feb 23 14:29:12 bugbug app/worker.3: 2021-02-23 14:29:11,743:INFO:root:Processing schedule_tests:try_c79f27f66498130b4133c0052fa3770227670f16...
Feb 23 14:29:12 bugbug app/worker.3: 2021-02-23 14:29:11,743:INFO:root:Pulling commits from the remote repository...
Feb 23 14:29:35 bugbug app/worker.3: pulling from https://hg.mozilla.org/try/
Feb 23 14:29:35 bugbug app/worker.3: using https://hg.mozilla.org/try/
Feb 23 14:29:35 bugbug app/worker.3: sending capabilities command
Feb 23 14:29:35 bugbug app/worker.3: using ca certificates from certifi
Feb 23 14:29:35 bugbug app/worker.3: using /version-control-tools/third_party/python/certifi/certifi/cacert.pem for CA file
Feb 23 14:29:35 bugbug app/worker.3: preparing listkeys for "bookmarks"
Feb 23 14:29:35 bugbug app/worker.3: sending batch command
Feb 23 14:29:35 bugbug app/worker.3: sending 91 bytes
Feb 23 14:29:35 bugbug app/worker.3: received listkey for "bookmarks": 3080 bytes
Feb 23 14:29:35 bugbug app/worker.3: query 1; heads
Feb 23 14:29:35 bugbug app/worker.3: sending batch command
Feb 23 14:29:35 bugbug app/worker.3: sending 1011 bytes
Feb 23 14:29:35 bugbug app/worker.3: searching for changes
Feb 23 14:29:35 bugbug app/worker.3: taking initial sample
Feb 23 14:29:35 bugbug app/worker.3: query 2; still undecided: 38, sample size is: 38
Feb 23 14:29:35 bugbug app/worker.3: sending known command
Feb 23 14:29:35 bugbug app/worker.3: sending 1563 bytes
Feb 23 14:29:35 bugbug app/worker.3: 2 total queries in 5.1258s
Feb 23 14:29:35 bugbug app/worker.3: sending getbundle command
Feb 23 14:29:35 bugbug app/worker.3: sending 1367 bytes
Feb 23 14:29:35 bugbug app/worker.3: bundle2-input-bundle: with-transaction
Feb 23 14:29:35 bugbug app/worker.3: bundle2-input-part: "changegroup" (params: 1 mandatory 1 advisory) supported
Feb 23 14:29:35 bugbug app/worker.3: adding changesets
Feb 23 14:29:35 bugbug app/worker.3: add changeset 976dd158ef7f
Feb 23 14:29:35 bugbug app/worker.3: add changeset 18b27fa2be84
Feb 23 14:29:35 bugbug app/worker.3: add changeset 26a17424c310
Feb 23 14:29:35 bugbug app/worker.3: add changeset 08cd11c22095
Feb 23 14:29:35 bugbug app/worker.3: add changeset 7e26ca8db92b
Feb 23 14:29:35 bugbug app/worker.3: add changeset 1a53b79ea529
Feb 23 14:29:35 bugbug app/worker.3: add changeset 822bc5cbc8f4
Feb 23 14:29:35 bugbug app/worker.3: add changeset 50de8c1763e2
Feb 23 14:29:35 bugbug app/worker.3: add changeset 17666746e8cc
Feb 23 14:29:35 bugbug app/worker.3: add changeset 18416a172146
Feb 23 14:29:35 bugbug app/worker.3: add changeset 19d48b5f0ca1
...
Feb 23 14:59:53 bugbug app/worker.1: add changeset c79f27f66498
Feb 23 14:59:53 bugbug app/worker.1: adding manifests
Feb 23 14:59:53 bugbug app/worker.1: bundle2-input-bundle: 1 parts total
Feb 23 14:59:53 bugbug app/worker.1: transaction abort!
Feb 23 14:59:53 bugbug app/worker.1: rollback completed
Feb 23 14:59:53 bugbug app/worker.1: (sent 5 HTTP requests and 4725 bytes; received 156164853 bytes in responses)
Feb 23 14:59:53 bugbug app/worker.1: killed!
Feb 23 14:59:53 bugbug app/worker.1: 2021-02-23 14:59:53,288:ERROR:rq.worker:Traceback (most recent call last):
Feb 23 14:59:53 bugbug app/worker.1: File "/usr/local/lib/python3.8/site-packages/rq/worker.py", line 975, in perform_job
Feb 23 14:59:53 bugbug app/worker.1: rv = job.perform()
Feb 23 14:59:53 bugbug app/worker.1: File "/usr/local/lib/python3.8/site-packages/rq/job.py", line 696, in perform
Feb 23 14:59:53 bugbug app/worker.1: self._result = self._execute()
Feb 23 14:59:53 bugbug app/worker.1: File "/usr/local/lib/python3.8/site-packages/rq/job.py", line 719, in _execute
Feb 23 14:59:53 bugbug app/worker.1: return self.func(*self.args, **self.kwargs)
Feb 23 14:59:53 bugbug app/worker.1: File "/usr/local/lib/python3.8/site-packages/bugbug_http/models.py", line 128, in schedule_tests
Feb 23 14:59:53 bugbug app/worker.1: repository.pull(REPO_DIR, branch, rev)
Feb 23 14:59:53 bugbug app/worker.1: File "/usr/local/lib/python3.8/site-packages/bugbug/repository.py", line 1349, in pull
Feb 23 14:59:53 bugbug app/worker.1: trigger_pull()
Feb 23 14:59:53 bugbug app/worker.1: File "/usr/local/lib/python3.8/site-packages/tenacity/__init__.py", line 333, in wrapped_f
Feb 23 14:59:53 bugbug app/worker.1: return self(f, *args, **kw)
Feb 23 14:59:53 bugbug app/worker.1: File "/usr/local/lib/python3.8/site-packages/tenacity/__init__.py", line 423, in __call__
Feb 23 14:59:53 bugbug app/worker.1: do = self.iter(retry_state=retry_state)
Feb 23 14:59:53 bugbug app/worker.1: File "/usr/local/lib/python3.8/site-packages/tenacity/__init__.py", line 372, in iter
Feb 23 14:59:53 bugbug app/worker.1: raise retry_exc.reraise()
Feb 23 14:59:53 bugbug app/worker.1: File "/usr/local/lib/python3.8/site-packages/tenacity/__init__.py", line 189, in reraise
Feb 23 14:59:53 bugbug app/worker.1: raise self.last_attempt.result()
Feb 23 14:59:53 bugbug app/worker.1: File "/usr/local/lib/python3.8/concurrent/futures/_base.py", line 432, in result
Feb 23 14:59:53 bugbug app/worker.1: return self.__get_result()
Feb 23 14:59:53 bugbug app/worker.1: File "/usr/local/lib/python3.8/concurrent/futures/_base.py", line 388, in __get_result
Feb 23 14:59:53 bugbug app/worker.1: raise self._exception
Feb 23 14:59:53 bugbug app/worker.1: File "/usr/local/lib/python3.8/site-packages/tenacity/__init__.py", line 426, in __call__
Feb 23 14:59:53 bugbug app/worker.1: result = fn(*args, **kwargs)
Feb 23 14:59:53 bugbug app/worker.1: File "/usr/local/lib/python3.8/site-packages/bugbug/repository.py", line 1338, in trigger_pull
Feb 23 14:59:53 bugbug app/worker.1: p.wait(timeout=180)
Feb 23 14:59:53 bugbug app/worker.1: File "/usr/local/lib/python3.8/site-packages/sentry_sdk/integrations/stdlib.py", line 208, in sentry_patched_popen_wait
Feb 23 14:59:53 bugbug app/worker.1: return old_popen_wait(self, *a, **kw)
Feb 23 14:59:53 bugbug app/worker.1: File "/usr/local/lib/python3.8/subprocess.py", line 1079, in wait
Feb 23 14:59:53 bugbug app/worker.1: return self._wait(timeout=timeout)
Feb 23 14:59:53 bugbug app/worker.1: File "/usr/local/lib/python3.8/subprocess.py", line 1796, in _wait
Feb 23 14:59:53 bugbug app/worker.1: raise TimeoutExpired(self.args, timeout)
Feb 23 14:59:53 bugbug app/worker.1: subprocess.TimeoutExpired: Command '['hg', 'pull', b'-rc79f27f66498130b4133c0052fa3770227670f16', b'--debug', b'--', b'https://hg.mozilla.org/try/']' timed out after 180 seconds
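The traceback ends in a subprocess wait with a 180-second timeout. As a minimal sketch of that pattern (this is not bugbug's actual trigger_pull; the helper name run_with_timeout and the sleeping child process are assumptions made for illustration), the following shows a child process being killed once the wait expires:

```python
import subprocess
import sys


def run_with_timeout(cmd, timeout):
    """Run cmd, killing it if it does not finish within timeout seconds.

    Hypothetical helper for illustration; not bugbug's real code.
    """
    p = subprocess.Popen(cmd)
    try:
        return p.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        # Kill the child so it does not keep running after we give up,
        # then reap it and re-raise for the caller (or a retry wrapper).
        p.kill()
        p.wait()
        raise


if __name__ == "__main__":
    # Stand-in for a slow `hg pull`: a child that sleeps past the timeout.
    slow = [sys.executable, "-c", "import time; time.sleep(30)"]
    try:
        run_with_timeout(slow, timeout=0.5)
    except subprocess.TimeoutExpired as exc:
        print(f"timed out after {exc.timeout} seconds")
```

Retrying the same pull with the same timeout would presumably hit the same limit each time, since the amount of data to transfer does not shrink between attempts.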
Comment 2•4 years ago
I think the issue is that the patch was based on a commit from "release", so the service was trying to pull everything from "release" (the service's local clone is an "autoland" clone).
A possible fix would be to use a "unified" clone in the service. There will always be a mismatch problem when running "mach try auto" on a "release" commit, since the tests that the service knows about might not be the same as the ones available on release, but at least it will not fail with a timeout.
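To illustrate why the base of the push matters, here is a toy model (not Mercurial's real discovery protocol; the changeset names and the helper functions are hypothetical) that computes which changesets a clone is missing for a given revision. With only "autoland" history locally, a push based on "release" drags in the whole "release" line; with both lines present, as in a "unified" clone, only the push itself is missing:

```python
def ancestors(parents, rev):
    """Return rev and all of its ancestors in a parent map."""
    seen = set()
    stack = [rev]
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        stack.extend(parents.get(node, ()))
    return seen


def changesets_to_pull(parents, local_heads, rev):
    """Changesets a clone holding local_heads must fetch to obtain rev."""
    local = set()
    for head in local_heads:
        local |= ancestors(parents, head)
    return ancestors(parents, rev) - local


# Toy history (all names hypothetical): a shared base, separate
# "autoland" and "release" lines, and a try push on top of release.
parents = {
    "base": [],
    "autoland1": ["base"],
    "autoland2": ["autoland1"],
    "release1": ["base"],
    "release2": ["release1"],
    "release3": ["release2"],
    "try_push": ["release3"],
}

autoland_only = changesets_to_pull(parents, ["autoland2"], "try_push")
unified = changesets_to_pull(parents, ["autoland2", "release3"], "try_push")

print(sorted(autoland_only))  # ['release1', 'release2', 'release3', 'try_push']
print(sorted(unified))        # ['try_push']
```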
Comment 3•4 years ago
Comment 4•2 years ago
Is there a better component to track bugbug issues such as this? It's not a task configuration bug.
Comment 5•2 years ago
We can use Developer Infrastructure::Try.