Closed Bug 1278406 Opened 9 years ago Closed 9 years ago

Gecko decision tasks take 8 minutes making HTTP requests to queue

Categories

(Firefox Build System :: Task Configuration, task)

Product:

Component:

Type:

task

Priority:

Not set

Severity:

normal

Tracking

(Not tracked)

Status:

RESOLVED FIXED

People

(Reporter: gps, Assigned: gps)

References

(Blocks 1 open bug)

Details

(Whiteboard: [optimization])

Attachments

(1 file)

Bug 1278406 - Use a thread pool for submitting tasks to queue; 9 years ago Gregory Szorc [:gps] 58 bytes, text/x-review-board-request	dustin : review+	Details

Gregory Szorc [:gps]

Assignee

Description

•

9 years ago

From https://public-artifacts.taskcluster.net/D7BjgysjRnuNVG3rLhsGZw/0/public/logs/live_backing.log: [34m 0:31.94(B[m Starting new HTTP connection (1): taskcluster [34m 0:33.16(B[m "PUT /queue/v1/task/eYYlnt-bRWOfh0qsEhK7ZA HTTP/1.1" 200 524 [34m 0:33.16(B[m Creating task with taskId fIt_6K9qT9u7RkH4U6lKnQ for TaskLabel==AIED_qR-RPqZ3C47ttkTiQ ... [34m 8:38.19(B[m Creating task with taskId drTF3b2bQYyNv-onxXtJQg for TaskLabel==eynGHW5vQqWir0PPmrem-w [34m 8:39.18(B[m "PUT /queue/v1/task/drTF3b2bQYyNv-onxXtJQg HTTP/1.1" 200 378 [taskcluster 2016-06-06 21:40:04.807Z] === Task Finished === 8 minutes making HTTP requests is a rather long time. We can do better. dustin: can you brain dump what you said in IRC? I didn't really follow.

Flags: needinfo?(dustin)

Dustin J. Mitchell [:dustin] (he/him)

Comment 1

•

9 years ago

This would be better if the requests were made concurrently (in a limited fashion -- 1000's of concurrent requests will cause timeouts, I bet). You mentioned requests has built-in support for such a thing.

Component: Queue → Task Configuration

Flags: needinfo?(dustin)

Gregory Szorc [:gps]

Assignee

Comment 2

•

9 years ago

Concurrent requests will only fix part of the problem. Why do we need to make dozens of HTTP requests at all? Wouldn't it be more efficient for the queue to have a "bulk submit" API? That feels like fewer moving parts and increases the probability of an all-or-nothing end state. I like atomic operations.

Gregory Szorc [:gps]

Assignee

Comment 3

•

9 years ago

But if we must do concurrent requests, https://github.com/ross/requests-futures would likely work. The requests library is thread safe. But I think you need to provide your own thread/process pool. Probably better off using something pre-built.

Dustin J. Mitchell [:dustin] (he/him)

Comment 4

•

9 years ago

The queue is making HTTP requests to Azure on the backend anyway, so a bulk API wouldn't save anything. The task-graph-scheduler, which we had been using up until a week or so ago, made parallel requests: https://github.com/taskcluster/task-graph-scheduler/blob/e4874e8170d55d0e506cb63dd2eeb9f744342574/scheduler/helpers.js#L180 return Promise.all(input.tasks.map(function(taskNode) { return options.queue.defineTask( taskNode.taskId, taskNode.task with no limit on the concurrency (but using the TC client library, which will retry 500's, thus handling timeouts; the decision tasks are using the Go version of that library). I mentioned in irc, this timing used to be invisible to the user: the docker-worker would call scheduler.extendTaskGraph *after* the task had completed, so the time taken making that call was not counted in the time for the decision task. So yes, whatever is easiest to use in-tree (that is, requiring the least vendoring) would be best here.

Gregory Szorc [:gps]

Assignee

Comment 5

•

9 years ago

https://github.com/ross/requests-futures/blob/master/requests_futures/sessions.py is a very minimal wrapper around a basic futures executor. We already have futures vendored. This should be trivial to make concurrent...

Gregory Szorc [:gps]

Assignee

Updated

•

9 years ago

Assignee: nobody → gps

Status: NEW → ASSIGNED

Gregory Szorc [:gps]

Assignee

Comment 6

•

9 years ago

Attached file Bug 1278406 - Use a thread pool for submitting tasks to queue; — Details

Currently, Gecko decision tasks spend ~8 minutes making HTTP requests to the queue API. Let's throw a thread pool at it so tasks are submitted faster. I'd prefer the queue offer a batch submit API. But it appears its operators are not willing to offer such an API at this time. Review commit: https://reviewboard.mozilla.org/r/58086/diff/#index_header See other reviews: https://reviewboard.mozilla.org/r/58086/

Attachment #8760510 - Flags: review?(dustin)

Gregory Szorc [:gps]

Assignee

Comment 7

•

9 years ago

Comment on attachment 8760510 [details] Bug 1278406 - Use a thread pool for submitting tasks to queue; Review request updated; see interdiff: https://reviewboard.mozilla.org/r/58086/diff/1-2/

Gregory Szorc [:gps]

Assignee

Comment 8

•

9 years ago

https://reviewboard.mozilla.org/r/58086/#review54960 https://queue.taskcluster.net/v1/task/AozSXU0ATji_xubqlO72Bg/runs/0/artifacts/public%2Flogs%2Flive_backing.log says queue API calls are back under 60s again. Appears to be a roughly 8x speedup. We can't fire HTTP requests as fast as the server will accept them because you can't create the task on the server until the dependent task is created. A bulk submit API would cut down on the round trip latency. But I suppose things are fast enough with this patch... for now.

Dustin J. Mitchell [:dustin] (he/him)

Comment 9

•

9 years ago

Comment on attachment 8760510 [details] Bug 1278406 - Use a thread pool for submitting tasks to queue; https://reviewboard.mozilla.org/r/58086/#review54984 The tone of last sentence of the commit message feels out of place, particularly since I've indicated several times why that wouldn't actually be an advantage, but that will be lost in the sands of time soon enough. And I've certainly been guilty of passive-aggressive comments from time to time. ::: taskcluster/taskgraph/create.py:42 (Diff revision 2) > + task_id = label_to_taskid[label] > > - _create_task(session, label_to_taskid[label], label, task_def) > + # We can't submit a task until its dependencies have been submitted. > + deps_fs = [fs[dep] for dep in deps_by_name.values()] > + for f in futures.as_completed(deps_fs): > + f.result() This may still end up blocking more than necessary, since the post-order traversal doesn't totally order the tasks. But is probably adequate for the purpose. A better model might be to chain the futures into a DAG patterned after the graph. If you'd like to give that a shot, I'd be happy to r? it.

Attachment #8760510 - Flags: review?(dustin) → review+

Gregory Szorc [:gps]

Assignee

Comment 10

•

9 years ago

https://reviewboard.mozilla.org/r/58086/#review54984 Sorry - I didn't mean to sound passive aggressive in the commit message. Reading it again, it definitely comes across that way. I'll change the language. FTR, I definitely understand where you are coming from.

Gregory Szorc [:gps]

Assignee

Comment 11

•

9 years ago

https://reviewboard.mozilla.org/r/58086/#review54984 > This may still end up blocking more than necessary, since the post-order traversal doesn't totally order the tasks. But is probably adequate for the purpose. A better model might be to chain the futures into a DAG patterned after the graph. If you'd like to give that a shot, I'd be happy to r? it. This feels like scope bloat. Let's add an inline comment to call out the potential optimization and call it a day?

Gregory Szorc [:gps]

Assignee

Comment 12

•

9 years ago

Comment on attachment 8760510 [details] Bug 1278406 - Use a thread pool for submitting tasks to queue; Review request updated; see interdiff: https://reviewboard.mozilla.org/r/58086/diff/2-3/

Attachment #8760510 - Flags: review?(gps)

Dustin J. Mitchell [:dustin] (he/him)

Comment 13

•

9 years ago

https://reviewboard.mozilla.org/r/58084/#review55136

Gregory Szorc [:gps]

Assignee

Comment 14

•

9 years ago

Comment on attachment 8760510 [details] Bug 1278406 - Use a thread pool for submitting tasks to queue; Review request updated; see interdiff: https://reviewboard.mozilla.org/r/58086/diff/3-4/

Dustin J. Mitchell [:dustin] (he/him)

Comment 15

•

9 years ago

https://reviewboard.mozilla.org/r/58086/#review55144

Dustin J. Mitchell [:dustin] (he/him)

Comment 16

•

9 years ago

(In reply to Gregory Szorc [:gps] from comment #11) > This feels like scope bloat. Let's add an inline comment to call out the > potential optimization and call it a day? I agree, and I like the comment.

Gregory Szorc [:gps]

Assignee

Comment 17

•

9 years ago

Comment on attachment 8760510 [details] Bug 1278406 - Use a thread pool for submitting tasks to queue; Review request updated; see interdiff: https://reviewboard.mozilla.org/r/58086/diff/4-5/

Gregory Szorc [:gps]

Assignee

Comment 18

•

9 years ago

Comment on attachment 8760510 [details] Bug 1278406 - Use a thread pool for submitting tasks to queue; Review request updated; see interdiff: https://reviewboard.mozilla.org/r/58086/diff/5-6/

Comment 19

•

9 years ago

Pushed by gszorc@mozilla.com: https://hg.mozilla.org/integration/mozilla-inbound/rev/aba39b9c043c Use a thread pool for submitting tasks to queue; r=dustin

Gregory Szorc [:gps]

Assignee

Updated

•

9 years ago

Blocks: thunder-try

Armen [:armenzg]

Updated

•

9 years ago

Whiteboard: [optimization]

Carsten Book [:Tomcat]

Comment 20

•

9 years ago

bugherder

https://hg.mozilla.org/mozilla-central/rev/aba39b9c043c

Status: ASSIGNED → RESOLVED

Closed: 9 years ago

Resolution: --- → FIXED

Gregory Szorc [:gps]

Assignee

Updated

•

9 years ago

Attachment #8760510 - Flags: review?(gps)

Updated

•

7 years ago

Product: TaskCluster → Firefox Build System

You need to log in before you can comment on or make changes to this bug.