Closed Bug 1077202 Opened 10 years ago Closed 10 years ago

treeherder: Use much higher prefetch

Categories

(Taskcluster :: General, defect)

x86
macOS
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: jlal, Unassigned)

Attachments

(1 file)

In the code there is a comment about not racing during prefetch. This logic makes sense, but even with our small load we are already seeing a slowdown (this is maybe 1/100th of our target traffic), so we need to increase our throughput (which means more prefetch).
Blocks: 1076681
We have two options:
A) accept race conditions,
B) queue messages with the same taskId

With (B) we're limited to the amount of load a single Heroku node can handle, as we'll never be able to scale to multiple nodes.
With (A) we'll occasionally report the wrong state. This will most likely only happen when task-defined, task-pending and task-running messages arrive very close to each other.
In the vast majority of cases the task-completed message will arrive later. Also, if I recall correctly, a treeherder task can't change state once it has been reported completed.
Note: you'll need to ask treeherder guys about that.
Anyways, my point is that it might be preferable to accept the occasional race conditions.
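
For comparison, here is a minimal sketch of what option (B) could look like inside a single process: messages sharing a taskId are handled strictly in arrival order, while different taskIds proceed concurrently. Everything here is hypothetical (the PerTaskSerializer class, the message shape, the handler); it is not the taskcluster-treeherder code, and it also shows the limitation mentioned above, since the locks live in one process and give no ordering across nodes.

```python
import asyncio
from collections import defaultdict


class PerTaskSerializer:
    """Run handlers for the same taskId one at a time, in arrival order."""

    def __init__(self, handler):
        self._handler = handler  # hypothetical async callable(message)
        self._locks = defaultdict(asyncio.Lock)  # one lock per taskId

    async def dispatch(self, message):
        # Messages for different taskIds never wait on each other.
        async with self._locks[message["taskId"]]:
            await self._handler(message)


async def demo():
    handled = []

    async def handler(message):
        # Earlier messages take longer, so without the lock they would finish last.
        await asyncio.sleep(message["delay"])
        handled.append((message["taskId"], message["event"]))

    serializer = PerTaskSerializer(handler)
    messages = [
        {"taskId": "t1", "event": "task-defined", "delay": 0.03},
        {"taskId": "t1", "event": "task-pending", "delay": 0.02},
        {"taskId": "t1", "event": "task-running", "delay": 0.0},
        {"taskId": "t2", "event": "task-defined", "delay": 0.0},
    ]
    await asyncio.gather(*(serializer.dispatch(m) for m in messages))
    return handled


handled = asyncio.run(demo())
```

The sketch never discards locks, so a long-running service would need to drop entries for finished tasks.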

Additionally, we can cache jobs for project.postJobs() and only run project.postJobs() every 500 ms; that'll probably make the TH guys happier :)
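
A rough sketch of that batching idea, assuming a post_jobs callable that stands in for project.postJobs() and accepts a list of jobs. JobBatcher, the injectable clock and the job dictionaries are made up for illustration only.

```python
import time


class JobBatcher:
    """Collect jobs and hand them to post_jobs at most once per interval."""

    def __init__(self, post_jobs, interval=0.5, clock=time.monotonic):
        self._post_jobs = post_jobs  # stand-in for project.postJobs()
        self._interval = interval    # 500 ms, as suggested in the thread
        self._clock = clock
        self._pending = []
        self._last_flush = clock()

    def add(self, job):
        self._pending.append(job)
        self.maybe_flush()

    def maybe_flush(self):
        # Flush only when the interval has elapsed and there is something to send.
        if self._pending and self._clock() - self._last_flush >= self._interval:
            self.flush()

    def flush(self):
        jobs, self._pending = self._pending, []
        self._last_flush = self._clock()
        if jobs:
            self._post_jobs(jobs)


# Usage with a fake clock and a list standing in for the treeherder client.
now = [0.0]
posted = []
batcher = JobBatcher(posted.append, interval=0.5, clock=lambda: now[0])
batcher.add({"taskId": "t1", "state": "pending"})
batcher.add({"taskId": "t2", "state": "running"})
now[0] = 0.6
batcher.add({"taskId": "t1", "state": "running"})
```

A real service would also need a timer that calls maybe_flush(), otherwise the last few jobs sit in the buffer until another message arrives.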
We might be fine with (B). Part of our slowness, I suspect, is that treeherder staging is fairly slow compared to production (we are having growing pains this week), but whatever approach we take, we must be as reliable as possible in how we report results (accepting race conditions makes me nervous unless we prove that approach never results in state reversals or errors).
(B) is more complicated to implement... And no, the slowness here is probably from loading the task definition, from which we get task.extra.treeherder.

Note: I hope someday we can store the small tasks in Azure Table Storage and only use blob storage for large tasks. That'll be much faster...
I think we're best off accepting race conditions (A).
When I wrote taskcluster-treeherder I ended up inserting each run as a job in treeherder because I couldn't transition from "completed" -> "pending".
It might be possible to transition from "running" -> "pending", or "exception" -> "running". We'll have to ask the treeherder team.
But I'm pretty sure we can't transition from "completed"/"success" to anything else.
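
One way to reason about (A) is a guard that refuses to move a run's state backwards when messages arrive out of order. The sketch below is only an illustration: the state names and their ordering in STATE_RANK are assumptions, the actual transition rules would have to be confirmed with the treeherder team, and the guard would apply per run, since each run is inserted as its own job.

```python
# Assumed ordering; the real rules would need confirming with the treeherder team.
STATE_RANK = {"pending": 0, "running": 1, "completed": 2}


def next_state(current, incoming):
    """Return the state to report, ignoring updates that would move backwards."""
    if current is None:
        return incoming
    if STATE_RANK[incoming] < STATE_RANK[current]:
        return current  # stale message arrived late; keep the newer state
    return incoming


def replay(events):
    """Apply a sequence of incoming states for one run and return the result."""
    state = None
    for event in events:
        state = next_state(state, event)
    return state


# A late "running" message does not undo "completed".
final_state = replay(["pending", "completed", "running"])
```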

Note: I'm not sure we actually need to provide jobSymbol and other things from task.extra.treeherder more than once. If not, maybe we can just leave the properties unspecified when handling events other than task-defined. Then we wouldn't have to load the task definition, which would speed things up a lot.
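
A small sketch of that idea, assuming treeherder would accept updates that omit those properties, which is exactly the open question. build_job, the message fields and the loader are hypothetical names; only task.extra.treeherder comes from the discussion.

```python
def build_job(message, load_task_definition):
    """Build a job update, loading the task definition only for task-defined."""
    job = {"taskId": message["taskId"], "state": message["state"]}
    if message["event"] == "task-defined":
        # Only this event pays for the (slow) task definition lookup.
        task = load_task_definition(message["taskId"])
        job["treeherder"] = task["extra"]["treeherder"]
    return job


# Usage with a fake loader that records how often it is called.
loads = []


def fake_loader(task_id):
    loads.append(task_id)
    # Placeholder content; the real task.extra.treeherder fields are not shown here.
    return {"extra": {"treeherder": {"symbol": "B"}}}


defined = build_job(
    {"taskId": "t1", "event": "task-defined", "state": "pending"}, fake_loader
)
running = build_job(
    {"taskId": "t1", "event": "task-running", "state": "running"}, fake_loader
)
```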
This just adds configuration for the prefetch option... I already deployed this for testing (better to do this now over the weekend; I will turn it back to 1 if shit is really messed up).
Attachment #8500159 - Flags: review?(jopsen)
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Component: TaskCluster → General
Product: Testing → Taskcluster
Target Milestone: --- → mozilla41
Version: unspecified → Trunk
Resetting Version and Target Milestone that accidentally got changed...
Target Milestone: mozilla41 → ---
Version: Trunk → unspecified