Closed Bug 1484261 Opened 6 years ago Closed 2 months ago

Intermittent [taskcluster:error] Task timeout after 1800 seconds. Force killing container. for Gecko Decision Task

Categories

(Developer Services :: Mercurial: hg.mozilla.org, enhancement)

Type: enhancement
Priority: Not set
Severity: normal

Tracking

(Not tracked)

RESOLVED INCOMPLETE

People

(Reporter: noemi_erli, Unassigned)

Details

(Keywords: intermittent-failure)

These have the following in their logs:

[vcs 2018-08-17T01:09:26.396Z] executing ['hg', 'robustcheckout', '--sharebase', '/builds/worker/checkouts/hg-store', '--purge', '--upstream', 'https://hg.mozilla.org/mozilla-unified', '--sparseprofile', 'build/sparse-profiles/taskgraph', '--revision', '2d5b4c59078e5fb0919ffeabb955b9f78c37174c', 'https://hg.mozilla.org/integration/autoland', '/builds/worker/checkouts/gecko']
[vcs 2018-08-17T01:09:26.445Z] (using Mercurial 4.5.2)
[vcs 2018-08-17T01:09:26.446Z] ensuring https://hg.mozilla.org/integration/autoland@2d5b4c59078e5fb0919ffeabb955b9f78c37174c is available at /builds/worker/checkouts/gecko
[vcs 2018-08-17T01:09:27.130Z] (cloning from upstream repo https://hg.mozilla.org/mozilla-unified)
Summary: Gecko Decision Task failing on multiple pushes → Intermittent [taskcluster:error] Task timeout after 1800 seconds. Force killing container. for Gecko Decision Task

This seems to be fairly easy to reproduce if you mass-retrigger test jobs (like 50 or more at one time):

https://treeherder.mozilla.org/#/jobs?repo=try&revision=b9fd94d578e6242e72e04b0597751a3994c6287a

Looks like roughly a 10% failure rate. Is this just "decision tasks overwhelm hg.m.o"?

Although I'm sure hgmo being overwhelmed causes plenty of issues in CI, lots of the failures from comment 10 seem unrelated to vcs operations. Take this task for example, where vcs operations finish in 15 seconds (line 32) and the task still fails.
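
For anyone triaging these logs, one rough way to check whether vcs operations account for a meaningful share of the 1800-second timeout is to diff the first and last `[vcs ...]` timestamps in a task log. The sketch below assumes the timestamp format seen in the log excerpt in this bug; the helper name `vcs_elapsed_seconds` is hypothetical and not part of Taskcluster or robustcheckout.

```python
import re
from datetime import datetime

# Matches the "[vcs 2018-08-17T01:09:26.396Z]" prefix seen in the task logs.
VCS_STAMP = re.compile(r"\[vcs (\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d+)Z\]")


def vcs_elapsed_seconds(log_text):
    """Return seconds between the earliest and latest [vcs ...] timestamps.

    Hypothetical helper for illustration only. Returns None when the log
    has fewer than two vcs timestamps, since no interval can be computed.
    """
    stamps = [
        datetime.strptime(raw, "%Y-%m-%dT%H:%M:%S.%f")
        for raw in VCS_STAMP.findall(log_text)
    ]
    if len(stamps) < 2:
        return None
    return (max(stamps) - min(stamps)).total_seconds()


# Timestamps taken from the log excerpt quoted earlier in this bug.
sample_log = (
    "[vcs 2018-08-17T01:09:26.396Z] executing ['hg', 'robustcheckout']\n"
    "[vcs 2018-08-17T01:09:26.445Z] (using Mercurial 4.5.2)\n"
    "[vcs 2018-08-17T01:09:27.130Z] (cloning from upstream repo)\n"
)
elapsed = vcs_elapsed_seconds(sample_log)
```

If the result is small next to the 1800-second limit, as in the task mentioned above, the timeout is probably spent somewhere other than the checkout.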

Status: NEW → RESOLVED
Closed: 2 months ago
Resolution: --- → INCOMPLETE