bitbar Android performance machines stopped taking jobs
Categories: Infrastructure & Operations :: RelOps: Hardware, defect
Tracking: Not tracked
People: Reporter: aryx, Assigned: aerickson
Details
None of the machines in these pools are running:
gecko-t-bitbar-gw-perf-p2 https://tools.taskcluster.net/provisioners/proj-autophone/worker-types/gecko-t-bitbar-gw-perf-p2
gecko-t-bitbar-gw-perf-g5 https://tools.taskcluster.net/provisioners/proj-autophone/worker-types/gecko-t-bitbar-gw-perf-g5
Comment 1•5 years ago
Keep in mind that :aerickson is the main point of contact.
Our automated bot detected that 72 devices went offline in the last 1.5 hours, so this is recent, and Bitbar has already been notified (although they do not come online for another 4 hours).
Comment 2•5 years ago
I've rebooted devicepool0 and things appear to be coming back up and starting jobs.
Comment 3•5 years ago
Timeline:
2019/08/16 07:12 Requests to Taskcluster (TC) start being aborted or failing with connection errors. The resulting exceptions are not caught, so the affected worker threads die and stop starting new jobs.
2019/08/16 14:23 BC reboots devicepool. Issues are resolved.
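To illustrate why a single connection error was enough to stall a pool: in Python, an exception that escapes a thread's target function ends that thread, while the rest of the process keeps running. The sketch below is a minimal stand-alone demonstration in Python 3, not the devicepool code (which ran on Python 2.7). `failing_fetch`, `record_thread_exception`, and the simplified `handle_queue` are hypothetical stand-ins.

```python
import threading

seen = []  # exception types that escaped from worker threads


def record_thread_exception(args):
    # Hypothetical hook for this demo: record the exception type
    # instead of printing the default traceback.
    seen.append(args.exc_type)


threading.excepthook = record_thread_exception  # honoured on Python 3.8+


def failing_fetch():
    # Stand-in for the Taskcluster queue request that failed in the incident.
    raise ConnectionError("Connection aborted.")


def handle_queue(fetch_pending):
    # Poll forever. Nothing here catches errors, so the first failure
    # propagates out of the loop and the thread exits.
    while True:
        fetch_pending()


worker = threading.Thread(
    target=handle_queue, args=(failing_fetch,), name="demo-worker"
)
worker.start()
worker.join(timeout=5)
print("worker alive:", worker.is_alive())
```

After the first failed fetch the worker is no longer alive, yet the process itself has not crashed. This matches the symptom here: the service stayed up while its per-pool threads had silently stopped.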
first exception:
Aug 16 07:12:14 bitbar-devicepool-0 bash[15911]: Exception in thread mozilla-gw-batttest-g5:
Aug 16 07:12:14 bitbar-devicepool-0 bash[15911]: Traceback (most recent call last):
Aug 16 07:12:14 bitbar-devicepool-0 bash[15911]: File "/usr/lib/python2.7/threading.py", line 801, in __bootstrap_inner
Aug 16 07:12:14 bitbar-devicepool-0 bash[15911]: self.run()
Aug 16 07:12:14 bitbar-devicepool-0 bash[15911]: File "/usr/lib/python2.7/threading.py", line 754, in run
Aug 16 07:12:14 bitbar-devicepool-0 bash[15911]: self.__target(*self.__args, **self.__kwargs)
Aug 16 07:12:14 bitbar-devicepool-0 bash[15911]: File "/home/bitbar/mozilla-bitbar-devicepool/mozilla_bitbar_devicepool/test_run_manager.py", line 101, in handle_queue
Aug 16 07:12:14 bitbar-devicepool-0 bash[15911]: pending_tasks = get_taskcluster_pending_tasks(taskcluster_provisioner_id, worker_type)
Aug 16 07:12:14 bitbar-devicepool-0 bash[15911]: File "/home/bitbar/mozilla-bitbar-devicepool/mozilla_bitbar_devicepool/taskcluster.py", line 10, in get_taskcluster_pending_tasks
Aug 16 07:12:14 bitbar-devicepool-0 bash[15911]: r = requests.get(taskcluster_queue_url)
Aug 16 07:12:14 bitbar-devicepool-0 bash[15911]: File "/home/bitbar/mozilla-bitbar-devicepool/venv/local/lib/python2.7/site-packages/requests/api.py", line 75, in get
Aug 16 07:12:14 bitbar-devicepool-0 bash[15911]: return request('get', url, params=params, **kwargs)
Aug 16 07:12:14 bitbar-devicepool-0 bash[15911]: File "/home/bitbar/mozilla-bitbar-devicepool/venv/local/lib/python2.7/site-packages/requests/api.py", line 60, in request
Aug 16 07:12:14 bitbar-devicepool-0 bash[15911]: return session.request(method=method, url=url, **kwargs)
Aug 16 07:12:14 bitbar-devicepool-0 bash[15911]: File "/home/bitbar/mozilla-bitbar-devicepool/venv/local/lib/python2.7/site-packages/requests/sessions.py", line 533, in request
Aug 16 07:12:14 bitbar-devicepool-0 bash[15911]: resp = self.send(prep, **send_kwargs)
Aug 16 07:12:14 bitbar-devicepool-0 bash[15911]: File "/home/bitbar/mozilla-bitbar-devicepool/venv/local/lib/python2.7/site-packages/requests/sessions.py", line 646, in send
Aug 16 07:12:14 bitbar-devicepool-0 bash[15911]: r = adapter.send(request, **kwargs)
Aug 16 07:12:14 bitbar-devicepool-0 bash[15911]: File "/home/bitbar/mozilla-bitbar-devicepool/venv/local/lib/python2.7/site-packages/requests/adapters.py", line 498, in send
Aug 16 07:12:14 bitbar-devicepool-0 bash[15911]: raise ConnectionError(err, request=request)
Aug 16 07:12:14 bitbar-devicepool-0 bash[15911]: ConnectionError: ('Connection aborted.', error(0, 'Error'))
Exception near the end of the incident:
Aug 16 14:21:46 bitbar-devicepool-0 bash[15911]: Exception in thread mozilla-gw-perftest-g5:
Aug 16 14:21:46 bitbar-devicepool-0 bash[15911]: Traceback (most recent call last):
Aug 16 14:21:46 bitbar-devicepool-0 bash[15911]: File "/usr/lib/python2.7/threading.py", line 801, in __bootstrap_inner
Aug 16 14:21:46 bitbar-devicepool-0 bash[15911]: self.run()
Aug 16 14:21:46 bitbar-devicepool-0 bash[15911]: File "/usr/lib/python2.7/threading.py", line 754, in run
Aug 16 14:21:46 bitbar-devicepool-0 bash[15911]: self.__target(*self.__args, **self.__kwargs)
Aug 16 14:21:46 bitbar-devicepool-0 bash[15911]: File "/home/bitbar/mozilla-bitbar-devicepool/mozilla_bitbar_devicepool/test_run_manager.py", line 101, in handle_queue
Aug 16 14:21:46 bitbar-devicepool-0 bash[15911]: pending_tasks = get_taskcluster_pending_tasks(taskcluster_provisioner_id, worker_type)
Aug 16 14:21:46 bitbar-devicepool-0 bash[15911]: File "/home/bitbar/mozilla-bitbar-devicepool/mozilla_bitbar_devicepool/taskcluster.py", line 10, in get_taskcluster_pending_tasks
Aug 16 14:21:46 bitbar-devicepool-0 bash[15911]: r = requests.get(taskcluster_queue_url)
Aug 16 14:21:46 bitbar-devicepool-0 bash[15911]: File "/home/bitbar/mozilla-bitbar-devicepool/venv/local/lib/python2.7/site-packages/requests/api.py", line 75, in get
Aug 16 14:21:46 bitbar-devicepool-0 bash[15911]: return request('get', url, params=params, **kwargs)
Aug 16 14:21:46 bitbar-devicepool-0 bash[15911]: File "/home/bitbar/mozilla-bitbar-devicepool/venv/local/lib/python2.7/site-packages/requests/api.py", line 60, in request
Aug 16 14:21:46 bitbar-devicepool-0 bash[15911]: return session.request(method=method, url=url, **kwargs)
Aug 16 14:21:46 bitbar-devicepool-0 bash[15911]: File "/home/bitbar/mozilla-bitbar-devicepool/venv/local/lib/python2.7/site-packages/requests/sessions.py", line 533, in request
Aug 16 14:21:46 bitbar-devicepool-0 bash[15911]: resp = self.send(prep, **send_kwargs)
Aug 16 14:21:46 bitbar-devicepool-0 bash[15911]: File "/home/bitbar/mozilla-bitbar-devicepool/venv/local/lib/python2.7/site-packages/requests/sessions.py", line 646, in send
Aug 16 14:21:46 bitbar-devicepool-0 bash[15911]: r = adapter.send(request, **kwargs)
Aug 16 14:21:46 bitbar-devicepool-0 bash[15911]: File "/home/bitbar/mozilla-bitbar-devicepool/venv/local/lib/python2.7/site-packages/requests/adapters.py", line 516, in send
Aug 16 14:21:46 bitbar-devicepool-0 bash[15911]: raise ConnectionError(e, request=request)
Aug 16 14:21:46 bitbar-devicepool-0 bash[15911]: ConnectionError: HTTPSConnectionPool(host='queue.taskcluster.net', port=443): Max retries exceeded with url: /v1/pending/proj-autophone/geck
Comment 4•5 years ago
I've rolled out a fix to devicepool0 that handles these exceptions.
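The deployed change is not shown in this bug, so the following is only a sketch of one common way to handle such exceptions: catch the error inside the polling loop, log it, back off, and try again so the thread stays alive. `poll_forever` and all of its parameters are hypothetical names. With `requests`, one would probably catch `requests.exceptions.RequestException` rather than a bare `Exception`, and pass a `timeout=` to `requests.get`.

```python
import logging
import time


def poll_forever(fetch_pending, handle, interval=60, max_backoff=600,
                 sleep=time.sleep, should_stop=lambda: False):
    """Poll for pending tasks without letting a failed fetch kill the loop.

    fetch_pending: callable returning the pending-task count (may raise).
    handle:        callable invoked with each successful result.
    sleep / should_stop are injectable so the loop can be tested quickly.
    """
    delay = interval
    while not should_stop():
        try:
            pending = fetch_pending()
        except Exception:
            # Log with traceback, wait, then retry with a doubled delay
            # (capped at max_backoff) instead of letting the thread die.
            logging.exception("fetching pending tasks failed; retrying in %ss", delay)
            sleep(delay)
            delay = min(delay * 2, max_backoff)
            continue
        delay = interval  # a successful fetch resets the backoff
        handle(pending)
        sleep(interval)
```

Each worker thread would use a loop of this shape as its target. The backoff keeps a prolonged outage, such as the seven hours seen here, from turning into a tight retry loop against the queue.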
Comment 5•5 years ago
Closing as the incident is over and I've deployed a fix.