Make bitbar device pool manager multi-threaded
Categories
(Testing Graveyard :: Autophone, enhancement, P1)
Tracking
(Not tracked)
People
(Reporter: bc, Assigned: aerickson)
We currently iterate over the Bitbar projects, checking for pending tasks on Taskcluster and starting tests on an as-needed basis. I recently deployed a hot-patch that queues up tests equal to twice the number of devices in a device group, to mitigate the problem of short-lived superseded jobs. See bug 1563307.
Each Bitbar test takes 10-13 seconds to start, not counting the other auxiliary tasks involved. As the number of devices in each group grows, the time needed to start tests for every device grows as well; with the pre-population of tests, it is now over 15 minutes for the perf p2 and g5 projects.
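To make the arithmetic concrete, here is a rough back-of-the-envelope sketch in Python. The function names and the device count of 40 are hypothetical and chosen only for illustration; the 10-13 second start time and the factor of two come from the description above.

```python
def prefill_target(device_count, multiplier=2):
    """Number of tests to keep queued for a device group (twice the devices)."""
    return device_count * multiplier


def sequential_start_minutes(test_count, seconds_per_test):
    """Minutes needed to start test_count tests one after another."""
    return test_count * seconds_per_test / 60.0


if __name__ == "__main__":
    devices = 40  # hypothetical group size, not taken from the bug
    tests = prefill_target(devices)
    for seconds in (10, 13):
        minutes = sequential_start_minutes(tests, seconds)
        print(f"{tests} tests at {seconds}s each: {minutes:.1f} minutes")
```

Under these assumptions a single group takes roughly 13 to 17 minutes to fill, and every other project waits behind it while the manager works sequentially.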
We should convert the test run manager into a multi-threaded script that runs each project on its own thread, so that no project has to wait for the others to be processed.
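A minimal sketch of the thread-per-project idea, using only Python's standard `threading` module. This is not the actual device pool code: `run_projects`, `start_pending_tests`, and `fake_start_test` are hypothetical names, and the real manager would call the Bitbar and Taskcluster APIs where the stub sleeps.

```python
import threading
import time


def start_pending_tests(project_name, pending_count, start_test, stop_event):
    """Start tests for one project until done or asked to stop (hypothetical loop)."""
    started = 0
    while started < pending_count and not stop_event.is_set():
        start_test(project_name)
        started += 1
    return started


def run_projects(pending_by_project, start_test):
    """Run one worker thread per project and return tests started per project."""
    stop_event = threading.Event()
    results = {}
    results_lock = threading.Lock()

    def worker(name, count):
        started = start_pending_tests(name, count, start_test, stop_event)
        with results_lock:  # guard the shared dict across threads
            results[name] = started

    threads = [
        threading.Thread(target=worker, args=(name, count), name=name)
        for name, count in pending_by_project.items()
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results


def fake_start_test(project_name):
    """Stand-in for the slow call that starts a test; assumption for the demo."""
    time.sleep(0.01)


if __name__ == "__main__":
    # Pending counts are made up for the demo.
    pending = {"unit-p2": 3, "perf-g5": 5, "perf-p2": 8}
    print(run_projects(pending, fake_start_test))
```

With this layout the total wall-clock time is bounded by the slowest project rather than the sum of all projects, since the per-test start calls are dominated by waiting on the network.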
Side note: I am using Autophone's old component for android-hw @ bitbar since it is already available, and why not let Autophone live on, even if in name only. ;-)
Comment 1 • 6 years ago (Reporter)
We should hit this first as it will do the most to alleviate the problem with idle devices.
Comment 2 • 6 years ago (Reporter)
Andrew and I have deployed a work-in-progress changeset that converts the test run manager to run each project on a separate thread. So far it is working well; however, it has surfaced an issue with Bitbar that causes our connection to the Bitbar API to fail after an hour or so of operation. The queue is now over 6100 and I do not know when we will be able to get it under control. Hopefully we will have a resolution for the Bitbar issue tomorrow, but the ETA is unknown at this time.
Comment 3 • 6 years ago (Reporter)
The Bitbar issue was related to an API call to device problems. When I stubbed that out, the system no longer caused back-end problems at Bitbar. We are currently running well, though the backlog of pending superseded jobs is taking a while to work through.
unit-p2 4 hour backlog on production
perf-g5 3 hour backlog on production
perf-p2 21 hour backlog on production
Updated • 6 years ago (Assignee)
Comment 4 • 6 years ago (Assignee)
Work is being done here: https://github.com/bclary/mozilla-bitbar-devicepool/pull/29
Comment 5 • 6 years ago (Assignee)
The PR has landed. devicepool0 is running the code.
Updated•4 years ago