Closed Bug 1653090 Opened 5 years ago Closed 5 years ago

Decision task on kaios-try project takes up to 30 minutes, times out

Categories

(Firefox Build System :: Task Configuration, task)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: erahm, Assigned: erahm)

Attachments

(1 file)

We're seeing decision task times in the 25-30 minute range on the kaios-try repo.

It seems to consistently stall out for 20+ minutes when generating tasks for test test-verify-gpu on platform windows7-32/opt:

[task 2020-07-15T17:10:43.172Z] Generating tasks for test browsertime-tp6 on platform windows7-32/opt
[task 2020-07-15T17:10:43.304Z] Generating tasks for test telemetry-tests-client on platform windows7-32/opt
*[task 2020-07-15T17:10:43.305Z] Generating tasks for test test-verify-gpu on platform windows7-32/opt*
[task 2020-07-15T17:31:16.109Z] Generating tasks for test cppunit on platform windows7-32/opt
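
For anyone wanting to locate a stall like this in a longer decision task log, here is a minimal sketch that finds the largest gap between consecutive timestamped lines. The log excerpt is copied from above (with the emphasis asterisks removed); the helper names `parse_log` and `longest_gap` are made up for illustration and are not part of any Taskcluster tooling.

```python
import re
from datetime import datetime

# Log excerpt from this report; the emphasis asterisks were removed.
LOG = """\
[task 2020-07-15T17:10:43.172Z] Generating tasks for test browsertime-tp6 on platform windows7-32/opt
[task 2020-07-15T17:10:43.304Z] Generating tasks for test telemetry-tests-client on platform windows7-32/opt
[task 2020-07-15T17:10:43.305Z] Generating tasks for test test-verify-gpu on platform windows7-32/opt
[task 2020-07-15T17:31:16.109Z] Generating tasks for test cppunit on platform windows7-32/opt
"""

# Matches "[task <UTC timestamp with milliseconds>Z] <message>".
LINE_RE = re.compile(
    r"\[task (\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d{3})Z\] (.*)"
)


def parse_log(text):
    """Return a list of (timestamp, message) for every matching log line."""
    entries = []
    for line in text.splitlines():
        match = LINE_RE.search(line)
        if match:
            stamp = datetime.strptime(match.group(1), "%Y-%m-%dT%H:%M:%S.%f")
            entries.append((stamp, match.group(2)))
    return entries


def longest_gap(entries):
    """Return (gap, message) for the line that preceded the longest pause."""
    gaps = [
        (later[0] - earlier[0], earlier[1])
        for earlier, later in zip(entries, entries[1:])
    ]
    return max(gaps, key=lambda item: item[0])


gap, message = longest_gap(parse_log(LOG))
print(f"Longest pause: {gap} after '{message}'")
```

On the excerpt above this points at the test-verify-gpu line, with a pause of roughly 20.5 minutes before the next message.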

STR:

hg clone https://hg.mozilla.org/projects/kaios
hg up d849cb2278ec
hg import https://hg.mozilla.org/projects/kaios-try/rev/5122739a44d9fddfa5335c36e31ece21c315f244
hg import https://hg.mozilla.org/projects/kaios-try/rev/be99c2303ba09f86c0fbd5d80d61769e3e5413fe
hg try fuzzy --full -q "'b2g opt"

Switching the GECKO_BASE_REPOSITORY in .taskcluster.yml to https://hg.mozilla.org/projects/kaios seems to have solved this issue for me.

Interestingly, in this push, which has that change, I had a 26 minute decision task. A few pushes later we're down to 6-9 minutes.

I think the issue with the first push is that the worker had a cache created when GECKO_BASE_REPOSITORY was still set to m-u. It looks like hg robustcheckout doesn't update the repository's default if the clone already exists. We could change this to pass the base repository explicitly, and also update robustcheckout.
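
To illustrate the kind of check that would catch a stale cache, here is a rough sketch. It assumes "the default" means the `paths.default` entry in the cached clone's `.hg/hgrc`, and that m-u is https://hg.mozilla.org/mozilla-unified; both are assumptions on my part. The helpers `default_path` and `base_matches` are hypothetical and are not robustcheckout code, and `configparser` only handles simple hgrc files (no `%include`).

```python
import configparser
import tempfile
from pathlib import Path

# Assumed URLs: m-u expanded to mozilla-unified; kaios taken from this bug.
STALE_BASE = "https://hg.mozilla.org/mozilla-unified"
WANTED_BASE = "https://hg.mozilla.org/projects/kaios"


def default_path(checkout):
    """Return paths.default from <checkout>/.hg/hgrc, or None if unset."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read(Path(checkout) / ".hg" / "hgrc")
    return parser.get("paths", "default", fallback=None)


def base_matches(checkout, expected_base):
    """True if the clone's default path equals the expected base repo."""
    current = default_path(checkout)
    if current is None:
        return False
    return current.rstrip("/") == expected_base.rstrip("/")


# Demo: fake a cached clone whose default still points at the old base.
with tempfile.TemporaryDirectory() as checkout:
    hg_dir = Path(checkout) / ".hg"
    hg_dir.mkdir()
    (hg_dir / "hgrc").write_text("[paths]\ndefault = " + STALE_BASE + "\n")
    matches_wanted = base_matches(checkout, WANTED_BASE)
    matches_stale = base_matches(checkout, STALE_BASE)

print("default matches wanted base:", matches_wanted)
print("default matches old base:", matches_stale)
```

If a check like this reported a mismatch, the cached clone could be repointed or discarded before the checkout; which of those is appropriate is left open here.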

Looking at this task (JSoTjKqYSDCXd-LBLJ702g) which is fast, it ran on a worker that hadn't run any other jobs.

Looking at this task (OQxM6o_QSOGXjPN5dueEDQ) which is slow, it ran on a worker that had run a task with m-u as the base repository.


So, there are a couple of things we can do to make this more resilient, but the GECKO_BASE_REPOSITORY change does fix it, at least in the fresh-cache case.

Assignee: nobody → erahm
Blocks: 1628832
Status: NEW → ASSIGNED

Switches the GECKO_BASE_REPOSITORY for taskcluster to point to the kaios project branch.

Landed upstream.

Status: ASSIGNED → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
See Also: → 1653325
See Also: → 1653332

I filed Bug 1653325 and Bug 1653332 about making things more robust to changes in GECKO_BASE_REPOSITORY.
