Closed Bug 1840830 Opened 2 years ago Closed 1 year ago

Speed up task replacement by batching Taskcluster requests

Categories

(Firefox Build System :: Task Configuration, enhancement)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: marco, Assigned: alphare33)

References

(Blocks 1 open bug)

Details

Attachments

(4 files, 1 obsolete file)

Looking at the logs of a decision task, I see around 12 seconds spent making requests to Taskcluster one after another:

[task 2023-06-28T10:20:41.440Z] Removed 27 tasks by build (skip-unless-backstop-and-skip-unless-push-interval-10.0-and-skip-unless-schedules-or-bugbug-reduced-fallback), 75 tasks by dependents optimized, 2 tasks by if-dependencies pruning, 42 tasks by skip-unless-backstop (skip-unless-backstop), 158 tasks by skip-unless-changed (skip-unless-changed), 43 tasks by skip-unless-expanded (skip-unless-backstop-and-skip-unless-push-interval-10.0), 1 tasks by skip-unless-schedules (skip-unless-schedules), 1494 tasks by test (skip-unless-backstop-and-not-skip-unless-backstop-and-skip-unless-push-interval-10.0-and-skip-unless-schedules-or-bugbug-reduced-manifests-fallback-last-10-pushes-or-platform-disperse-or-skip-unless-backstop-and-skip-unless-push-interval-10.0-and-skip-unless-schedules-or-bugbug-reduced-manifests-fallback-low-or-platform-disperse), 71 tasks by test-inclusive (skip-unless-schedules) during optimization.
[task 2023-06-28T10:20:41.468Z] https://firefox-ci-tc.services.mozilla.com:443 "GET /api/index/v1/task/gecko.cache.level-3.docker-images.v2.image_builder.hash.d8b9b5abf9d3c6070aa5ee80394e05402a7b210327a72d827a6b8d0d37039b3a HTTP/1.1" 200 256
[task 2023-06-28T10:20:41.484Z] https://firefox-ci-tc.services.mozilla.com:443 "GET /api/queue/v1/task/Y8XcObT4QbaK4I6-8611tQ/status HTTP/1.1" 200 841
[task 2023-06-28T10:20:41.499Z] https://firefox-ci-tc.services.mozilla.com:443 "GET /api/index/v1/task/gecko.cache.level-3.docker-images.v2.index-task.hash.d6cce5462527c824ef3b2aeaf9017c8740156e9009d116d3b8428ef47ca9e210 HTTP/1.1" 200 253
[task 2023-06-28T10:20:41.512Z] https://firefox-ci-tc.services.mozilla.com:443 "GET /api/queue/v1/task/C-1bQ11ITUCJ58TfgLSskA/status HTTP/1.1" 200 841
...
[task 2023-06-28T10:20:53.589Z] Replaced 296 tasks by index-search (index-search) during optimization.

The code is at https://searchfox.org/mozilla-central/rev/c0adc2160976e2c118e2e5709d08aac071fddce9/third_party/python/taskcluster_taskgraph/taskgraph/optimize/base.py#254 and https://searchfox.org/mozilla-central/rev/c0adc2160976e2c118e2e5709d08aac071fddce9/third_party/python/taskcluster_taskgraph/taskgraph/optimize/strategies.py#11. Can we parallelize the requests?
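To make the question concrete, here is a minimal sketch of what issuing the per-task lookups in parallel could look like. It is not the taskgraph implementation: `find_task_id` and `get_status` are hypothetical callables standing in for the index and queue requests visible in the log above, and the worker count is an arbitrary assumption.

```python
from concurrent.futures import ThreadPoolExecutor


def lookup_all(index_paths, find_task_id, get_status, max_workers=16):
    """Resolve each index path to (task_id, status) using a thread pool.

    find_task_id(path) -> task id or None   (hypothetical index lookup)
    get_status(task_id) -> status           (hypothetical queue lookup)
    """

    def lookup(path):
        # The two requests for one path stay sequential, because the
        # status request needs the task id returned by the index request.
        task_id = find_task_id(path)
        if task_id is None:
            return path, None, None
        return path, task_id, get_status(task_id)

    # Different paths are independent, so they can be looked up concurrently.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return {
            path: (task_id, status)
            for path, task_id, status in pool.map(lookup, index_paths)
        }
```

This still sends two requests per task, so it hides latency rather than removing round trips.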

:alphare33 created a new API endpoint to batch queries[1]. Yesterday's try push seems to show a significant improvement[2].

[1] https://github.com/taskcluster/taskcluster/pull/6915
[2] https://treeherder.mozilla.org/jobs?repo=try&revision=00a6d8f46d1e6f5ed7fb2b855fe1a94ba5ea26d2

Assignee: nobody → alphare33
Summary: Speed up task replacement by performing Taskcluster requests in parallel → Speed up task replacement by batching Taskcluster requests

Batching queries to get from task index to status reduces the number of
(sometimes trans-continental) queries from 2*900+ to ~2. This reduces the
time spent replacing tasks by 20% to 75%, depending on the use case.

The 20% improvement in wall time was observed when running
mach taskgraph morphed on a CI worker, while the 75% improvement
was observed on a developer machine in France running mach taskgraph full.

More information in https://github.com/taskcluster/taskcluster-rfcs/pull/189.
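The batching idea described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the landed patch: `find_tasks_batch` and `get_statuses_batch` are hypothetical callables standing in for the batched index and queue endpoints (their real names and payloads are in the linked PR and RFC), and the batch size of 1000 is an assumed limit.

```python
def chunked(items, size):
    """Yield consecutive slices of `items` with at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def lookup_batched(index_paths, find_tasks_batch, get_statuses_batch,
                   batch_size=1000):
    """Resolve index paths to (task_id, status) with a few large requests.

    find_tasks_batch(paths) -> {path: task_id} for the paths that exist
    get_statuses_batch(task_ids) -> {task_id: status}
    Both callables are hypothetical stand-ins for the batched endpoints.
    """
    # Step 1: one request per chunk of index paths instead of one per path.
    path_to_task = {}
    for chunk in chunked(list(index_paths), batch_size):
        path_to_task.update(find_tasks_batch(chunk))

    # Step 2: one request per chunk of task ids instead of one per task.
    statuses = {}
    for chunk in chunked(list(path_to_task.values()), batch_size):
        statuses.update(get_statuses_batch(chunk))

    return {
        path: (task_id, statuses.get(task_id))
        for path, task_id in path_to_task.items()
    }
```

With roughly 900 index paths and the assumed batch size, this makes one index request and one queue request, which matches the "2*900+ to ~2" figure.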

Attachment #9399185 - Attachment is obsolete: true

Batched queries landed in central as part of bug 1896126, which landed 4 hours ago. Closing this bug. I'll also attach the other patches (the ones outside of mozilla-central) to this bug.

Status: NEW → RESOLVED
Closed: 1 year ago
Depends on: 1896126
Resolution: --- → FIXED