Closed
Bug 1278406
Opened 8 years ago
Closed 8 years ago
Gecko decision tasks take 8 minutes making HTTP requests to queue
Categories
(Firefox Build System :: Task Configuration, task)
Firefox Build System
Task Configuration
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: gps, Assigned: gps)
References
(Blocks 1 open bug)
Details
(Whiteboard: [optimization])
Attachments
(1 file)
From https://public-artifacts.taskcluster.net/D7BjgysjRnuNVG3rLhsGZw/0/public/logs/live_backing.log: [34m 0:31.94(B[m Starting new HTTP connection (1): taskcluster [34m 0:33.16(B[m "PUT /queue/v1/task/eYYlnt-bRWOfh0qsEhK7ZA HTTP/1.1" 200 524 [34m 0:33.16(B[m Creating task with taskId fIt_6K9qT9u7RkH4U6lKnQ for TaskLabel==AIED_qR-RPqZ3C47ttkTiQ ... [34m 8:38.19(B[m Creating task with taskId drTF3b2bQYyNv-onxXtJQg for TaskLabel==eynGHW5vQqWir0PPmrem-w [34m 8:39.18(B[m "PUT /queue/v1/task/drTF3b2bQYyNv-onxXtJQg HTTP/1.1" 200 378 [taskcluster 2016-06-06 21:40:04.807Z] === Task Finished === 8 minutes making HTTP requests is a rather long time. We can do better. dustin: can you brain dump what you said in IRC? I didn't really follow.
Flags: needinfo?(dustin)
Comment 1•8 years ago
|
||
This would be better if the requests were made concurrently (in a limited fashion -- 1000's of concurrent requests will cause timeouts, I bet). You mentioned requests has built-in support for such a thing.
Component: Queue → Task Configuration
Flags: needinfo?(dustin)
Assignee | ||
Comment 2•8 years ago
|
||
Concurrent requests will only fix part of the problem. Why do we need to make dozens of HTTP requests at all? Wouldn't it be more efficient for the queue to have a "bulk submit" API? That feels like fewer moving parts and increases the probability of an all-or-nothing end state. I like atomic operations.
Assignee | ||
Comment 3•8 years ago
|
||
But if we must do concurrent requests, https://github.com/ross/requests-futures would likely work. The requests library is thread safe. But I think you need to provide your own thread/process pool. Probably better off using something pre-built.
Comment 4•8 years ago
|
||
The queue is making HTTP requests to Azure on the backend anyway, so a bulk API wouldn't save anything. The task-graph-scheduler, which we had been using up until a week or so ago, made parallel requests: https://github.com/taskcluster/task-graph-scheduler/blob/e4874e8170d55d0e506cb63dd2eeb9f744342574/scheduler/helpers.js#L180 return Promise.all(input.tasks.map(function(taskNode) { return options.queue.defineTask( taskNode.taskId, taskNode.task with no limit on the concurrency (but using the TC client library, which will retry 500's, thus handling timeouts; the decision tasks are using the Go version of that library). I mentioned in irc, this timing used to be invisible to the user: the docker-worker would call scheduler.extendTaskGraph *after* the task had completed, so the time taken making that call was not counted in the time for the decision task. So yes, whatever is easiest to use in-tree (that is, requiring the least vendoring) would be best here.
Assignee | ||
Comment 5•8 years ago
|
||
https://github.com/ross/requests-futures/blob/master/requests_futures/sessions.py is a very minimal wrapper around a basic futures executor. We already have futures vendored. This should be trivial to make concurrent...
Assignee | ||
Updated•8 years ago
|
Assignee: nobody → gps
Status: NEW → ASSIGNED
Assignee | ||
Comment 6•8 years ago
|
||
Currently, Gecko decision tasks spend ~8 minutes making HTTP requests to the queue API. Let's throw a thread pool at it so tasks are submitted faster. I'd prefer the queue offer a batch submit API. But it appears its operators are not willing to offer such an API at this time. Review commit: https://reviewboard.mozilla.org/r/58086/diff/#index_header See other reviews: https://reviewboard.mozilla.org/r/58086/
Attachment #8760510 -
Flags: review?(dustin)
Assignee | ||
Comment 7•8 years ago
|
||
Comment on attachment 8760510 [details] Bug 1278406 - Use a thread pool for submitting tasks to queue; Review request updated; see interdiff: https://reviewboard.mozilla.org/r/58086/diff/1-2/
Assignee | ||
Comment 8•8 years ago
|
||
https://reviewboard.mozilla.org/r/58086/#review54960 https://queue.taskcluster.net/v1/task/AozSXU0ATji_xubqlO72Bg/runs/0/artifacts/public%2Flogs%2Flive_backing.log says queue API calls are back under 60s again. Appears to be a roughly 8x speedup. We can't fire HTTP requests as fast as the server will accept them because you can't create the task on the server until the dependent task is created. A bulk submit API would cut down on the round trip latency. But I suppose things are fast enough with this patch... for now.
Comment 9•8 years ago
|
||
Comment on attachment 8760510 [details] Bug 1278406 - Use a thread pool for submitting tasks to queue; https://reviewboard.mozilla.org/r/58086/#review54984 The tone of last sentence of the commit message feels out of place, particularly since I've indicated several times why that wouldn't actually be an advantage, but that will be lost in the sands of time soon enough. And I've certainly been guilty of passive-aggressive comments from time to time. ::: taskcluster/taskgraph/create.py:42 (Diff revision 2) > + task_id = label_to_taskid[label] > > - _create_task(session, label_to_taskid[label], label, task_def) > + # We can't submit a task until its dependencies have been submitted. > + deps_fs = [fs[dep] for dep in deps_by_name.values()] > + for f in futures.as_completed(deps_fs): > + f.result() This may still end up blocking more than necessary, since the post-order traversal doesn't totally order the tasks. But is probably adequate for the purpose. A better model might be to chain the futures into a DAG patterned after the graph. If you'd like to give that a shot, I'd be happy to r? it.
Attachment #8760510 -
Flags: review?(dustin) → review+
Assignee | ||
Comment 10•8 years ago
|
||
https://reviewboard.mozilla.org/r/58086/#review54984 Sorry - I didn't mean to sound passive aggressive in the commit message. Reading it again, it definitely comes across that way. I'll change the language. FTR, I definitely understand where you are coming from.
Assignee | ||
Comment 11•8 years ago
|
||
https://reviewboard.mozilla.org/r/58086/#review54984 > This may still end up blocking more than necessary, since the post-order traversal doesn't totally order the tasks. But is probably adequate for the purpose. A better model might be to chain the futures into a DAG patterned after the graph. If you'd like to give that a shot, I'd be happy to r? it. This feels like scope bloat. Let's add an inline comment to call out the potential optimization and call it a day?
Assignee | ||
Comment 12•8 years ago
|
||
Comment on attachment 8760510 [details] Bug 1278406 - Use a thread pool for submitting tasks to queue; Review request updated; see interdiff: https://reviewboard.mozilla.org/r/58086/diff/2-3/
Attachment #8760510 -
Flags: review?(gps)
Assignee | ||
Comment 14•8 years ago
|
||
Comment on attachment 8760510 [details] Bug 1278406 - Use a thread pool for submitting tasks to queue; Review request updated; see interdiff: https://reviewboard.mozilla.org/r/58086/diff/3-4/
Comment 16•8 years ago
|
||
(In reply to Gregory Szorc [:gps] from comment #11) > This feels like scope bloat. Let's add an inline comment to call out the > potential optimization and call it a day? I agree, and I like the comment.
Assignee | ||
Comment 17•8 years ago
|
||
Comment on attachment 8760510 [details] Bug 1278406 - Use a thread pool for submitting tasks to queue; Review request updated; see interdiff: https://reviewboard.mozilla.org/r/58086/diff/4-5/
Assignee | ||
Comment 18•8 years ago
|
||
Comment on attachment 8760510 [details] Bug 1278406 - Use a thread pool for submitting tasks to queue; Review request updated; see interdiff: https://reviewboard.mozilla.org/r/58086/diff/5-6/
Comment 19•8 years ago
|
||
Pushed by gszorc@mozilla.com: https://hg.mozilla.org/integration/mozilla-inbound/rev/aba39b9c043c Use a thread pool for submitting tasks to queue; r=dustin
Assignee | ||
Updated•8 years ago
|
Blocks: thunder-try
Updated•8 years ago
|
Whiteboard: [optimization]
Comment 20•8 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/aba39b9c043c
Status: ASSIGNED → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Assignee | ||
Updated•8 years ago
|
Attachment #8760510 -
Flags: review?(gps)
Updated•6 years ago
|
Product: TaskCluster → Firefox Build System
You need to log in
before you can comment on or make changes to this bug.
Description
•