[aws provider] Worker registration takes long
Categories
(Taskcluster :: Services, defect)
Tracking
(Not tracked)
People
(Reporter: owlish, Assigned: owlish)
Worker registration takes a long time, which leads to the estimator not working correctly.
TODO:
- Investigate how the registration time can be improved
- Investigate other ways to alleviate (ensure idempotency of the requests etc.)
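One possible reading of the idempotency item, sketched under assumptions: if a registration request for the same worker is repeated (for example after a timeout), it returns the same result instead of creating a second registration. `WorkerRegistry`, `register`, and the token format are hypothetical names for illustration and are not taken from worker-manager.

```python
import secrets


class WorkerRegistry:
    """Hypothetical in-memory registry; not worker-manager's actual storage."""

    def __init__(self):
        # Maps worker_id -> token issued on first registration.
        self._registered = {}

    def register(self, worker_id):
        """Register worker_id. Repeated calls return the token issued the first time."""
        if worker_id not in self._registered:
            self._registered[worker_id] = secrets.token_hex(16)
        return self._registered[worker_id]
```

With this shape, a retried request is harmless: the second call for the same `worker_id` changes nothing and returns the original token.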
Comment 1 (Assignee) • 5 years ago
Tests with 1 worker: the longest interval is between the worker record creation time and the /register/worker request appearing in the logs. So the delay seems to happen outside of worker manager. Also, the time that instances spend in the starting state is on the order of seconds, while the overall registration time is on the order of minutes.
The worker-runner and worker logs need to be examined next.
There's also the matter of looping over an array of instances when creating worker records in the DB, which has linear complexity in the number of instances. Given that registering a single instance already takes minutes, this doesn't seem to be the bottleneck. It could still be addressed, though: saving the whole array of instances into a single record, for example, would make it a constant number of writes regardless of how many instances there are.
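A minimal sketch of the two approaches, assuming a made-up schema in an in-memory SQLite database (the table and function names are hypothetical, not worker-manager's). Each function returns the number of write statements it issued: N for the per-instance loop, 1 for the single-record variant. Note that the amount of data written still grows with the number of instances; only the number of writes becomes constant.

```python
import json
import sqlite3


def make_db():
    """Create an in-memory DB with two hypothetical tables."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE workers (pool_id TEXT, instance_id TEXT)")
    conn.execute("CREATE TABLE worker_batches (pool_id TEXT, instance_ids TEXT)")
    return conn


def create_records_per_instance(conn, pool_id, instance_ids):
    """One INSERT per instance: the number of writes grows linearly."""
    for instance_id in instance_ids:
        conn.execute(
            "INSERT INTO workers (pool_id, instance_id) VALUES (?, ?)",
            (pool_id, instance_id),
        )
    conn.commit()
    return len(instance_ids)


def create_single_batch_record(conn, pool_id, instance_ids):
    """One INSERT holding the whole array as JSON: a single write."""
    conn.execute(
        "INSERT INTO worker_batches (pool_id, instance_ids) VALUES (?, ?)",
        (pool_id, json.dumps(instance_ids)),
    )
    conn.commit()
    return 1
```

The trade-off is that a single record makes per-worker lookups and updates harder, so this only helps if records are mostly handled as a group.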
Comment 2 (Assignee) • 5 years ago
Upon examining the worker logs, it appears that it takes around 1s or less from worker-runner start to registering the worker. So the bottleneck appears to reside between instance start and worker-runner start - that's the part that takes the longest.
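The comparison above amounts to lining up timestamps from the different logs and finding the largest gap between consecutive events. A small sketch of that calculation, assuming the timestamps have already been extracted by hand; the event names are illustrative labels, not log fields.

```python
from datetime import datetime, timedelta


def phase_durations(events):
    """Return (phase_name, duration) pairs for consecutive events, ordered by time.

    `events` maps an event label to the datetime at which it was observed.
    """
    ordered = sorted(events.items(), key=lambda item: item[1])
    return [
        (f"{a_name} -> {b_name}", b_time - a_time)
        for (a_name, a_time), (b_name, b_time) in zip(ordered, ordered[1:])
    ]


def longest_phase(events):
    """Return the (phase_name, duration) pair with the largest duration."""
    return max(phase_durations(events), key=lambda phase: phase[1])
```

Feeding it the record creation time, the instance start time, the worker-runner start time, and the time of the /register/worker request would name the slowest phase directly.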
Comment 3 (Assignee) • 5 years ago
It seems like the bottleneck might be common to the aws provider and the gcp provider: if it's on our side, the gcp provider might have the same problem. If it comes from the cloud, it wouldn't necessarily manifest itself in other clouds. I am curious about the results of google provider load testing, if any. bstack?
Comment 4 (Assignee) • 5 years ago
After a meeting with bstack and dustin, it turns out that the delay is outside of our systems and our control. It also turns out that it's actually not that long.