Azure-related tests timing out
Categories: Taskcluster :: Services, defect
Tracking: Not tracked
People: Reporter: dustin, Assigned: bstack
1) services/queue/test/reruntask_test.js (real)
     "before all" hook: withEntity for Artifact for "create, claim, complete and rerun (is idempotent)":
     Error: Timeout of 30000ms exceeded. For async tests and hooks, ensure "done()" is called; if returning a Promise, ensure it resolves. (/taskcluster/services/queue/test/reruntask_test.js)

1) services/queue/test/claimtask_test.js (real)
     "before each" hook: withEntity for TaskGroupActiveSet for "can claimTask":
     Error: Timeout of 30000ms exceeded. For async tests and hooks, ensure "done()" is called; if returning a Promise, ensure it resolves. (/taskcluster/services/queue/test/claimtask_test.js)

1) services/queue/test/workerinfo_test.js (real)
     queue.getWorker returns 20 most recent taskIds:
     Error: Timeout of 30000ms exceeded. For async tests and hooks, ensure "done()" is called; if returning a Promise, ensure it resolves. (/taskcluster/services/queue/test/workerinfo_test.js)
Comment 1 • Reporter • 6 years ago
1) services/queue/test/claimresolver_test.js (real)
     "before each" hook: withEntity for TaskRequirement for "createTask , claimWork, claim expires, retried":
     Error: Timeout of 30000ms exceeded. For async tests and hooks, ensure "done()" is called; if returning a Promise, ensure it resolves. (/taskcluster/services/queue/test/claimresolver_test.js)
Comment 2 • Reporter • 6 years ago
I managed to repro this locally, running mocha with -b so that it bails out at the first failure.
azure:table Request: DELETE /WorkerT20190906TS7g3pqWq(PartitionKey='prov1-extended-extended-extended~gecko-b-2-linux-extended-extended',RowKey='my-worker-group-extended-extended~new'), retry: 0 +55ms
nock.common options.host: jungle.table.core.windows.net +55ms
nock.common options.hostname in the end: "jungle.table.core.windows.net" +0ms
nock.common options.host in the end: "jungle.table.core.windows.net:443" +0ms
nock.intercept interceptors for "jungle.table.core.windows.net:443" +54ms
nock.intercept filtering interceptors for basepath https://jungle.table.core.windows.net:443 +0ms
nock.common options.host: jungle.table.core.windows.net:443 +0ms
nock.common options.hostname in the end: "jungle.table.core.windows.net" +0ms
nock.common options.host in the end: "jungle.table.core.windows.net:443" +0ms
nock.intercept Net connect enabled for jungle.table.core.windows.net:443 +0ms
azure:table Request: DELETE /WorkerT20190906TS7g3pqWq(PartitionKey='prov1-extended-extended-extended~gecko-b-2-linux-extended-extended',RowKey='my-worker-group-extended-extended~newq'), retry: 0 +0ms
nock.common options.host: jungle.table.core.windows.net +0ms
nock.common options.hostname in the end: "jungle.table.core.windows.net" +0ms
nock.common options.host in the end: "jungle.table.core.windows.net:443" +0ms
nock.intercept interceptors for "jungle.table.core.windows.net:443" +0ms
nock.intercept filtering interceptors for basepath https://jungle.table.core.windows.net:443 +0ms
nock.common options.host: jungle.table.core.windows.net:443 +0ms
nock.common options.hostname in the end: "jungle.table.core.windows.net" +0ms
nock.common options.host in the end: "jungle.table.core.windows.net:443" +0ms
nock.intercept Net connect enabled for jungle.table.core.windows.net:443 +0ms
azure:table Request: DELETE /WorkerT20190906TS7g3pqWq(PartitionKey='prov1-extended-extended-extended~gecko-b-2-linux-extended-extended',RowKey='my-worker-group-extended-extended~old'), retry: 0 +0ms
nock.common options.host: jungle.table.core.windows.net +0ms
nock.common options.hostname in the end: "jungle.table.core.windows.net" +0ms
nock.common options.host in the end: "jungle.table.core.windows.net:443" +0ms
nock.intercept interceptors for "jungle.table.core.windows.net:443" +1ms
nock.intercept filtering interceptors for basepath https://jungle.table.core.windows.net:443 +0ms
nock.common options.host: jungle.table.core.windows.net:443 +1ms
nock.common options.hostname in the end: "jungle.table.core.windows.net" +0ms
nock.common options.host in the end: "jungle.table.core.windows.net:443" +0ms
nock.intercept Net connect enabled for jungle.table.core.windows.net:443 +0ms
azure:table Request: DELETE /WorkerT20190906TS7g3pqWq(PartitionKey='prov1-extended-extended-extended~gecko-b-2-linux-extended-extended',RowKey='my-worker-group-extended-extended~q'), retry: 0 +1ms
nock.common options.host: jungle.table.core.windows.net +0ms
nock.common options.hostname in the end: "jungle.table.core.windows.net" +0ms
nock.common options.host in the end: "jungle.table.core.windows.net:443" +0ms
nock.intercept interceptors for "jungle.table.core.windows.net:443" +0ms
nock.intercept filtering interceptors for basepath https://jungle.table.core.windows.net:443 +1ms
nock.common options.host: jungle.table.core.windows.net:443 +1ms
nock.common options.hostname in the end: "jungle.table.core.windows.net" +0ms
nock.common options.host in the end: "jungle.table.core.windows.net:443" +0ms
nock.intercept Net connect enabled for jungle.table.core.windows.net:443 +0ms
taskcluster.queue.table.worker INFO: monitor.measure:
taskcluster.queue.table.worker key: deleteEntity.success
taskcluster.queue.table.worker v: 1
taskcluster.queue.table.worker val: 31.806672 +0ms
taskcluster.queue.table.worker INFO: monitor.count:
taskcluster.queue.table.worker key: deleteEntity.success
taskcluster.queue.table.worker v: 1
taskcluster.queue.table.worker val: 1 +0ms
taskcluster.queue.table.worker INFO: monitor.measure:
taskcluster.queue.table.worker key: deleteEntity.success
taskcluster.queue.table.worker v: 1
taskcluster.queue.table.worker val: 32.791403 +0ms
taskcluster.queue.table.worker INFO: monitor.count:
taskcluster.queue.table.worker key: deleteEntity.success
taskcluster.queue.table.worker v: 1
taskcluster.queue.table.worker val: 1 +0ms
taskcluster.queue.table.worker INFO: monitor.measure:
taskcluster.queue.table.worker key: deleteEntity.success
taskcluster.queue.table.worker v: 1
taskcluster.queue.table.worker val: 310.624817 +0ms
taskcluster.queue.table.worker INFO: monitor.count:
taskcluster.queue.table.worker key: deleteEntity.success
taskcluster.queue.table.worker v: 1
taskcluster.queue.table.worker val: 1 +0ms
1) "before each" hook: withEntity for Worker for "queue.listWorkers returns actions with the right context"
nock.common options.hostname in the end: undefined +30s
nock.common options.host in the end: "localhost:443" +1ms
azure:table Request: GET /ArtifactsT20190906TS7g3pqWq(), retry: 0 +30s
nock.common options.host: jungle.table.core.windows.net +0ms
nock.common options.hostname in the end: "jungle.table.core.windows.net" +0ms
This is in the withEntities tear-down, where it's blasting through entity types, querying them, and deleting. A few interesting things:
- The +30s refers to the time since the last nock.common options.hostname line, which is just before all the monitor.count/monitor.measure stuff. So 30s went by with no logging.
- The monitor.measure values are in ms, so the longest one logged was 310ms. But there are only three of them.
I don't see any further debug logging from taskcluster.queue.table.worker for a while, until
azure:table Request: GET /WorkerT20190906TS7g3pqWq(), retry: 0 +33ms
nock.common options.host: jungle.table.core.windows.net +32ms
nock.common options.hostname in the end: "jungle.table.core.windows.net" +0ms
nock.common options.host in the end: "jungle.table.core.windows.net:443" +0ms
nock.intercept interceptors for "jungle.table.core.windows.net:443" +32ms
nock.intercept filtering interceptors for basepath https://jungle.table.core.windows.net:443 +0ms
nock.common options.host: jungle.table.core.windows.net:443 +0ms
nock.common options.hostname in the end: "jungle.table.core.windows.net" +0ms
nock.common options.host in the end: "jungle.table.core.windows.net:443" +0ms
nock.intercept Net connect enabled for jungle.table.core.windows.net:443 +0ms
taskcluster.queue.table.worker INFO: monitor.measure:
taskcluster.queue.table.worker key: queryEntities.success
taskcluster.queue.table.worker v: 1
taskcluster.queue.table.worker val: 34.370085 +0ms
taskcluster.queue.table.worker INFO: monitor.count:
taskcluster.queue.table.worker key: queryEntities.success
taskcluster.queue.table.worker v: 1
taskcluster.queue.table.worker val: 1 +0ms
azure:table Request: DELETE /WorkerT20190906TS7g3pqWq(PartitionKey='prov1-extended-extended-extended~gecko-b-2-linux-extended-extended',RowKey='my-worker-group-extended-extended~old'), retry: 0 +35ms
nock.common options.host: jungle.table.core.windows.net +35ms
nock.common options.hostname in the end: "jungle.table.core.windows.net" +0ms
nock.common options.host in the end: "jungle.table.core.windows.net:443" +0ms
nock.intercept interceptors for "jungle.table.core.windows.net:443" +35ms
nock.intercept filtering interceptors for basepath https://jungle.table.core.windows.net:443 +0ms
nock.common options.host: jungle.table.core.windows.net:443 +0ms
nock.common options.hostname in the end: "jungle.table.core.windows.net" +0ms
nock.common options.host in the end: "jungle.table.core.windows.net:443" +0ms
nock.intercept Net connect enabled for jungle.table.core.windows.net:443 +0ms
taskcluster.queue.table.worker INFO: monitor.measure:
taskcluster.queue.table.worker key: deleteEntity.success
taskcluster.queue.table.worker v: 1
taskcluster.queue.table.worker val: 32.961968 +0ms
taskcluster.queue.table.worker INFO: monitor.count:
taskcluster.queue.table.worker key: deleteEntity.success
taskcluster.queue.table.worker v: 1
taskcluster.queue.table.worker val: 1 +0ms
That this scan-and-delete happens twice makes sense -- we clean up rows after each test, and after the whole suite. But note here, the same row (RowKey='my-worker-group-extended-extended~old') turns up again and is successfully deleted in 30ms or so.
So it looks like requests to Azure are just disappearing into the ether and timing out. I suspect this isn't a change in our code, but a change in Azure's behavior. It's really common, though :(
Comment 3 • Reporter • 6 years ago
That doesn't explain the distribution of the errors, though -- why so heavily queue?
Comment 4 • Reporter • 6 years ago
https://github.com/taskcluster/azure-entities/blob/master/src/entity.js#L77
var AZURE_TABLE_TIMEOUT = 7 * 1000;
...
const client = new azure.Table(_.defaults({
timeout: AZURE_TABLE_TIMEOUT,
agent: options.agent,
}, options.credentials));
so we do have a timeout that is passed to fast-azure-storage, of far less than 30s.
Comment 5 • Reporter • 6 years ago
And fast-azure-storage calls req.setTimeout with the appropriate value (7500ms; it adds a half-second so that the server can abort the connection first). Something's not adding up here. I'll see if I can repro with some console.log calls in fast-azure-storage.
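(For reference, here's a minimal sketch -- my own illustration, not the fast-azure-storage source -- of how a per-request timeout is normally wired up in Node: setTimeout by itself only schedules a 'timeout' event, so the handler has to abort explicitly.)

const https = require('https');

// Hypothetical request, just to show the timeout wiring.
const req = https.request({host: 'example.com', path: '/'}, (res) => {
  res.resume();
  res.on('end', () => console.log('completed with status', res.statusCode));
});

// Emits 'timeout' after 7500ms of socket inactivity; without the explicit
// abort() in the handler the request would keep waiting.
req.setTimeout(7500, () => req.abort());

// abort() surfaces here as an 'error' ("socket hang up").
req.on('error', (err) => console.log('request failed:', err.message));
req.end();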
Comment 6 • Reporter • 6 years ago
azure:authorization Refreshing shared-access-signature +29ms
taskcluster-client Calling: azureTableSAS, retry: 0 +29ms
nock.common options.host: auth.taskcluster.net +30ms
nock.common options.hostname in the end: "auth.taskcluster.net" +0ms
nock.common options.host in the end: "auth.taskcluster.net:443" +0ms
nock.intercept interceptors for "auth.taskcluster.net:443" +30ms
nock.intercept filtering interceptors for basepath https://auth.taskcluster.net:443 +0ms
nock.common options.host: auth.taskcluster.net:443 +0ms
nock.common options.hostname in the end: "auth.taskcluster.net" +0ms
nock.common options.host in the end: "auth.taskcluster.net:443" +0ms
nock.intercept Net connect enabled for auth.taskcluster.net:443 +0ms
superagent GET https://auth.taskcluster.net/v1/azure/jungle/table/TaskGroupsT20190906TbNxtuZRk/read-write +31ms
1) "before each" hook: withEntity for TaskGroup for "queue.listProvisioners returns an empty list"
taskcluster-client Request error calling azureTableSAS (retrying), err: Error: Aborted, JSON: Error: Aborted +30s
nock.common options.hostname in the end: undefined +30s
That's a different error from last time, but certainly something that we could adjust.
Comment 7 • Reporter • 6 years ago
azure:queue Request: DELETE /q89743-g3okl6atj5fuope2kwappjoz-3fhc2wyxltjsjxehpb7v7cl3-1/messages/dcd0c4bb-8125-42ed-a24a-6025daa159cb, retry: 0 +2ms
7500
nock.common options.host: jungle.queue.core.windows.net +2ms
nock.common options.hostname in the end: "jungle.queue.core.windows.net" +0ms
nock.common options.host in the end: "jungle.queue.core.windows.net:443" +0ms
nock.intercept interceptors for "jungle.queue.core.windows.net:443" +3ms
nock.intercept filtering interceptors for basepath https://jungle.queue.core.windows.net:443 +0ms
nock.common options.host: jungle.queue.core.windows.net:443 +1ms
nock.common options.hostname in the end: "jungle.queue.core.windows.net" +0ms
nock.common options.host in the end: "jungle.queue.core.windows.net:443" +0ms
nock.intercept Net connect enabled for jungle.queue.core.windows.net:443 +0ms
1) queue.getWorker returns 20 most recent taskIds
nock.common options.hostname in the end: undefined +22s
nock.common options.host in the end: "localhost:443" +1ms
Comment 2 also showed "options.hostname in the end: undefined". What's up with that :(
Comment 8 • Reporter • 6 years ago
nock/lib/common.js
/**
 * Normalizes the request options so that it always has `host` property.
 *
 * @param {Object} options - a parsed options object of the request
 */
function normalizeRequestOptions(options) {
  options.proto = options.proto || 'http'
  options.port = options.port || (options.proto === 'http' ? 80 : 443)
  if (options.host) {
    debug('options.host:', options.host)
    if (!options.hostname) {
      if (options.host.split(':').length === 2) {
        options.hostname = options.host.split(':')[0]
      } else {
        options.hostname = options.host
      }
    }
  }
  debug('options.hostname in the end: %j', options.hostname)
  options.host = `${options.hostname || 'localhost'}:${options.port}`
  debug('options.host in the end: %j', options.host)

  /// lowercase host names
  ;['hostname', 'host'].forEach(function(attr) {
    if (options[attr]) {
      options[attr] = options[attr].toLowerCase()
    }
  })

  return options
}
That's some weird almost-semicolon-free JS. But it suggests that somewhere this function is getting called with no host or hostname.
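To make that concrete: feeding the quoted function an options object with neither host nor hostname (and treating debug as console.log) reproduces exactly the "localhost:443" lines from the failing runs.

// Usage example against the function quoted above (assumes debug = console.log):
normalizeRequestOptions({proto: 'https'})
// debug output:
//   options.hostname in the end: undefined
//   options.host in the end: "localhost:443"
// returned object: { proto: 'https', port: 443, host: 'localhost:443' }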
Comment 9 • Reporter • 6 years ago
Hm, but that seems to happen in success cases, too. Probably a false lead.
Comment 10 • Reporter • 6 years ago
I tried annotating nock to log all http requests, hoping I could figure out which request was not timing out after 7s. But I can't repro the problem anymore.
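A wrapper along these lines (a sketch of the idea, not the exact patch) would produce the kind of "req N" / "evt N" lines that show up in the later comments: wrap https.request so every outgoing request gets a sequence number and logs its lifecycle events.

const https = require('https');

let seq = 0;
const origRequest = https.request;
https.request = function (options, ...rest) {
  // Assumes the object form of options; a URL-string call would need extra handling.
  const id = ++seq;
  console.log('req', id, options.method || 'GET',
    options.host || options.hostname, options.port, options.path);
  const req = origRequest.call(https, options, ...rest);
  ['socket', 'response', 'timeout', 'abort', 'error', 'close'].forEach((evt) => {
    req.on(evt, () => console.log('evt', id, evt));
  });
  return req;
};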
Comment 11 • Reporter • 6 years ago
It takes a while, but all I can see is
Error: Errors logged to monitor during test run:
{
"code": "OperationTimedOut",
"incidentId": "c0d47e16-9615-4d7e-bf45-26b99ee2f036",
"message": "Operation could not be completed within the specified time.\nRequestId:6de4942d-5003-00fd-69f3-6559e7000000\nTime:2019-09-08T03:15:07.2579762Z",
"method": "createTask",
"module": "fast-azure-storage",
"name": "OperationTimedOutError",
"note": "Failed to ensure azure queue in queueservice.js",
"params": {
"taskId": "_2keykyOTLemcae66rwOYA"
},
"payload": {
"created": "2019-09-08T03:14:59.531Z",
"deadline": "2019-09-08T03:44:59.531Z",
"dependencies": [],
"expires": "2020-09-08T03:44:59.531Z",
"extra": {},
"metadata": {
"description": "Task created during unit tests",
"name": "Unit testing task",
"owner": "haali@mozilla.com",
"source": "https://github.com/taskcluster/taskcluster-queue"
},
"payload": {},
"priority": "lowest",
"provisionerId": "no-provisioner",
"requires": "all-completed",
"retries": 5,
"routes": [],
"schedulerId": "-",
"scopes": [],
"tags": {},
"taskGroupId": "_2keykyOTLemcae66rwOYA",
"workerType": "gecko-b-1-android"
},
"retries": 0,
"stack": "OperationTimedOutError: Operation could not be completed within the specified time.\nRequestId:6de4942d-5003-00fd-69f3-6559e7000000\nTime:2019-09-08T03:15:07.2579762Z\n at /home/dustin/p/taskcluster/node_modules/fast-azure-storage/lib/queue.js:320:27\n at process._tickCallback (internal/process/next_tick.js:68:7)",
"statusCode": 500,
"v": 1
}
which is the "normal" (handled) Azure timeout. It takes about 4h to generate that error, which matches the rate at which we see it in CI.
Comment 12 • Reporter • 6 years ago
OK, got something:
2019-09-08T13:50:24.516Z taskcluster-client Calling: claimWork, retry: 0
req 1114 POST localhost undefined 60401 /api/queue/v1/claim-work/no-provisioner/gecko-b-1-android
...
1) queue.getWorker returns 20 most recent taskIds
2019-09-08T13:50:39.674Z nock.common options.hostname in the end: undefined
2019-09-08T13:50:39.674Z nock.common options.host in the end: "localhost:443"
2019-09-08T13:50:39.674Z azure:table Request: GET /ArtifactsT20190908TEhVSExR8(), retry: 0
...
2019-09-08T13:50:40.661Z nock.intercept Net connect enabled for jungle.table.core.windows.net:443
evt 1378 response
2019-09-08T13:50:40.696Z taskcluster.queue.table.worker INFO: monitor.measure:
key: deleteEntity.success
v: 1
val: 34.355328
2019-09-08T13:50:40.696Z taskcluster.queue.table.worker INFO: monitor.count:
key: deleteEntity.success
v: 1
val: 1
evt 1114 abort
2019-09-08T13:50:54.526Z taskcluster-client Request error calling claimWork (retrying), err: Error: Aborted, JSON: Error: Aborted
2019-09-08T13:50:54.527Z mocha:runner finished running
42 passing (1m)
1 failing
1) services/queue/test/workerinfo_test.js (real)
queue.getWorker returns 20 most recent taskIds:
Error: Timeout of 30000ms exceeded. For async tests and hooks, ensure "done()" is called; if returning a Promise, ensure it resolves. (/home/dustin/p/taskcluster/services/queue/test/workerinfo_test.js)
The 1114 here is a single call to https.request. There are lots of other requests going on at the same time (up through 1378 as seen above). That particular request is aborted 30s after it is made. The test itself times out before that (at 13:50:39.674) and begins cleaning up, but mocha hangs around until req 1114 is done.
I can't quite figure this out from https://nodejs.org/docs/latest-v10.x/api/http.html. Per that page, a request is only aborted when req.abort is called, and nothing in the tc-client code calls that method. The docs even call out:
Note that setting the timeout option or using the setTimeout() function will not abort the request or do anything besides add a 'timeout' event.
so this is definitely not a timeout.
The request is definitely delivered to the local express server:
Sun, 08 Sep 2019 13:50:24 GMT express:router dispatching POST /api/queue/v1/claim-work/no-provisioner/gecko-b-1-android
...
evt 1115 response
2019-09-08T13:50:24.520Z taskcluster-client Success calling: authenticateHawk, (0 retries)
...
2019-09-08T13:50:24.520Z azure:table Request: GET /WorkerT20190908TEhVSExR8(PartitionKey='no-provisioner~gecko-b-1-android',RowKey='my-worker-group-extended-extended~my-worker-extended-extended'), retry: 0
req 1116 GET jungle.table.core.windows.net undefined undefined /WorkerT20190908TEhVSExR8(PartitionKey='no-provisioner~gecko-b-1-android',RowKey='my-worker-group-extended-extended~my-worker-extended-extended')?timeout=7&sv=2014-02-14&tn=WorkerT20190908TEhVSExR8&se=2019-09-08T14%3A14%3A43Z&sp=raud&sig=AWlEPCztY6O0zEcFQN
aBPDzRu5QpsZKWYT%2BoJuim7BM%3D&st=2019-09-08T13%3A34%3A43Z
...
evt 1116 response
2019-09-08T13:50:24.548Z azure:table Error code: ResourceNotFound (404) for GET /WorkerT20190908TEhVSExR8(PartitionKey='no-provisioner~gecko-b-1-android',RowKey='my-worker-group-extended-extended~my-worker-extended-extended') on retry: 0
...
2019-09-08T13:50:24.549Z azure:table Request: GET /ProvisionerT20190908TEhVSExR8(PartitionKey='no-provisioner',RowKey='provisioner'), retry: 0
req 1117 GET jungle.table.core.windows.net undefined undefined /ProvisionerT20190908TEhVSExR8(PartitionKey='no-provisioner',RowKey='provisioner')?timeout=7&sv=2014-02-14&tn=ProvisionerT20190908TEhVSExR8&se=2019-09-08T14%3A14%3A42Z&sp=raud&sig=4ie9WI4GzKJJu3Td9yRfiZdWMvyVLUj4y3JqDzWjCt4%3D&st=2019-09-08T13%3A34%3A42Z
...
2019-09-08T13:50:24.549Z azure:table Request: GET /WorkerTypeT20190908TEhVSExR8(PartitionKey='no-provisioner',RowKey='gecko-b-1-android'), retry: 0
req 1118 GET jungle.table.core.windows.net undefined undefined /WorkerTypeT20190908TEhVSExR8(PartitionKey='no-provisioner',RowKey='gecko-b-1-android')?timeout=7&sv=2014-02-14&tn=WorkerTypeT20190908TEhVSExR8&se=2019-09-08T14%3A14%3A43Z&sp=raud&sig=5D9iBVd22%2FzpoCBkWyoAzOKlhJb4I3CCbQgR7MahEPE%3D&st=2019-09-08T13%3A34%3A43Z
...
2019-09-08T13:50:24.549Z azure:table Request: GET /WorkerT20190908TEhVSExR8(PartitionKey='no-provisioner~gecko-b-1-android',RowKey='my-worker-group-extended-extended~my-worker-extended-extended'), retry: 0
req 1119 GET jungle.table.core.windows.net undefined undefined /WorkerT20190908TEhVSExR8(PartitionKey='no-provisioner~gecko-b-1-android',RowKey='my-worker-group-extended-extended~my-worker-extended-extended')?timeout=7&sv=2014-02-14&tn=WorkerT20190908TEhVSExR8&se=2019-09-08T14%3A14%3A43Z&sp=raud&sig=AWlEPCztY6O0zEcFQNaBPDzRu5QpsZKWYT%2BoJuim7BM%3D&st=2019-09-08T13%3A34%3A43Z
...
2019-09-08T13:50:24.550Z azure:queue Request: GET /q78939-g3okl6atj5fuope2kwappjoz-3fhc2wyxltjsjxehpb7v7cl3-7/messages, retry: 0
req 1120 GET jungle.queue.core.windows.net undefined undefined /q78939-g3okl6atj5fuope2kwappjoz-3fhc2wyxltjsjxehpb7v7cl3-7/messages?numofmessages=30&visibilitytimeout=300&timeout=7
...
evt 1117 response
2019-09-08T13:50:24.579Z base:entity Return modify trivially, as no changes was made by modifier
...
evt 1119 response
2019-09-08T13:50:24.580Z azure:table Error code: ResourceNotFound (404) for GET /WorkerT20190908TEhVSExR8(PartitionKey='no-provisioner~gecko-b-1-android',RowKey='my-worker-group-extended-extended~my-worker-extended-extended') on retry: 0
...
2019-09-08T13:50:24.580Z azure:table Request: POST /WorkerT20190908TEhVSExR8, retry: 0
req 1121 POST jungle.table.core.windows.net undefined undefined /WorkerT20190908TEhVSExR8?timeout=7&sv=2014-02-14&tn=WorkerT20190908TEhVSExR8&se=2019-09-08T14%3A14%3A43Z&sp=raud&sig=AWlEPCztY6O0zEcFQNaBPDzRu5QpsZKWYT%2BoJuim7BM%3D&st=2019-09-08T13%3A34%3A43Z
...
evt 1120 response
...
2019-09-08T13:50:24.582Z azure:queue Request: GET /q78939-g3okl6atj5fuope2kwappjoz-3fhc2wyxltjsjxehpb7v7cl3-6/messages, retry: 0
req 1122 GET jungle.queue.core.windows.net undefined undefined /q78939-g3okl6atj5fuope2kwappjoz-3fhc2wyxltjsjxehpb7v7cl3-6/messages?numofmessages=30&visibilitytimeout=300&timeout=7
...
evt 1118 response
2019-09-08T13:50:24.583Z azure:table Error code: ResourceNotFound (404) for GET /WorkerTypeT20190908TEhVSExR8(PartitionKey='no-provisioner',RowKey='gecko-b-1-android') on retry: 0
...
2019-09-08T13:50:24.583Z azure:table Request: POST /WorkerTypeT20190908TEhVSExR8, retry: 0
req 1123 POST jungle.table.core.windows.net undefined undefined /WorkerTypeT20190908TEhVSExR8?timeout=7&sv=2014-02-14&tn=WorkerTypeT20190908TEhVSExR8&se=2019-09-08T14%3A14%3A43Z&sp=raud&sig=5D9iBVd22%2FzpoCBkWyoAzOKlhJb4I3CCbQgR7MahEPE%3D&st=2019-09-08T13%3A34%3A43Z
...
evt 1121 response
...
evt 1122 response
...
2019-09-08T13:50:24.612Z azure:queue Request: GET /q78939-g3okl6atj5fuope2kwappjoz-3fhc2wyxltjsjxehpb7v7cl3-5/messages, retry: 0
req 1124 GET jungle.queue.core.windows.net undefined undefined /q78939-g3okl6atj5fuope2kwappjoz-3fhc2wyxltjsjxehpb7v7cl3-5/messages?numofmessages=30&visibilitytimeout=300&timeout=7
...
evt 1123 response
...
evt 1124 response
...
2019-09-08T13:50:24.642Z azure:queue Request: GET /q78939-g3okl6atj5fuope2kwappjoz-3fhc2wyxltjsjxehpb7v7cl3-4/messages, retry: 0
req 1125 GET jungle.queue.core.windows.net undefined undefined /q78939-g3okl6atj5fuope2kwappjoz-3fhc2wyxltjsjxehpb7v7cl3-4/messages?numofmessages=30&visibilitytimeout=300&timeout=7
...
evt 1125 response
...
2019-09-08T13:50:24.671Z azure:queue Request: GET /q78939-g3okl6atj5fuope2kwappjoz-3fhc2wyxltjsjxehpb7v7cl3-3/messages, retry: 0
req 1126 GET jungle.queue.core.windows.net undefined undefined /q78939-g3okl6atj5fuope2kwappjoz-3fhc2wyxltjsjxehpb7v7cl3-3/messages?numofmessages=30&visibilitytimeout=300&timeout=7
...
evt 1126 response
...
2019-09-08T13:50:24.700Z azure:queue Request: GET /q78939-g3okl6atj5fuope2kwappjoz-3fhc2wyxltjsjxehpb7v7cl3-2/messages, retry: 0
req 1127 GET jungle.queue.core.windows.net undefined undefined /q78939-g3okl6atj5fuope2kwappjoz-3fhc2wyxltjsjxehpb7v7cl3-2/messages?numofmessages=30&visibilitytimeout=300&timeout=7
...
evt 1127 response
...
2019-09-08T13:50:24.731Z azure:queue Request: GET /q78939-g3okl6atj5fuope2kwappjoz-3fhc2wyxltjsjxehpb7v7cl3-1/messages, retry: 0
req 1128 GET jungle.queue.core.windows.net undefined undefined /q78939-g3okl6atj5fuope2kwappjoz-3fhc2wyxltjsjxehpb7v7cl3-1/messages?numofmessages=30&visibilitytimeout=300&timeout=7
...
evt 1128 response
...
2019-09-08T13:50:24.818Z azure:table Request: GET /TasksT20190908TEhVSExR8(PartitionKey='MLAiTJgIQECQOHNRCo5vKg',RowKey='task'), retry: 0
req 1129 GET jungle.table.core.windows.net undefined undefined /TasksT20190908TEhVSExR8(PartitionKey='MLAiTJgIQECQOHNRCo5vKg',RowKey='task')?timeout=7&sv=2014-02-14&tn=TasksT20190908TEhVSExR8&se=2019-09-08T14%3A14%3A40Z&sp=raud&sig=xP5d598W2u26%2FHE8KsBIEWFGGB8fyoa1noN7tfkfpLc%3D&st=2019-09-08T13%3A34%3A40Z
... (reqs 1129-1158, each with a different taskId, for a total of 30 taskIds)
2019-09-08T13:50:24.838Z azure:table Request: GET /TasksT20190908TEhVSExR8(PartitionKey='O0HqF9oLSGycelkuxM5cvg',RowKey='task'), retry: 0
req 1158 GET jungle.table.core.windows.net undefined undefined /TasksT20190908TEhVSExR8(PartitionKey='O0HqF9oLSGycelkuxM5cvg',RowKey='task')?timeout=7&sv=2014-02-14&tn=TasksT20190908TEhVSExR8&se=2019-09-08T14%3A14%3A40Z&sp=raud&sig=xP5d598W2u26%2FHE8KsBIEWFGGB8fyoa1noN7tfkfpLc%3D&st=2019-09-08T13%3A34%3A40Z
...
evt 1129 response
evt 1132 response
evt 1131 response
evt 1130 response
...
2019-09-08T13:50:24.892Z azure:queue Request: POST /q78939-claim/messages, retry: 0
req 1159 POST jungle.queue.core.windows.net undefined undefined /q78939-claim/messages?visibilitytimeout=1&messagettl=604800&timeout=7
... (a whole bunch of nonsense re-queueing, updating Task entities, etc.)
2019-09-08T13:50:25.253Z azure:queue Request: DELETE /q78939-g3okl6atj5fuope2kwappjoz-3fhc2wyxltjsjxehpb7v7cl3-1/messages/81ceb841-ba56-4a65-9018-6a12e207ebbc, retry: 0
req 1245 DELETE jungle.queue.core.windows.net undefined undefined /q78939-g3okl6atj5fuope2kwappjoz-3fhc2wyxltjsjxehpb7v7cl3-1/messages/81ceb841-ba56-4a65-9018-6a12e207ebbc?popreceipt=AgAAAAMAAAAAAAAAog8xEE1m1QE%3D&timeout=7
A bit of scripting, however, shows that
2019-09-08T13:50:24.830Z azure:table Request: GET /TasksT20190908TEhVSExR8(PartitionKey='6_26Ni5OQfeFKiSq8XU6zQ',RowKey='task'), retry: 0
req 1147 GET jungle.table.core.windows.net undefined undefined /TasksT20190908TEhVSExR8(PartitionKey='6_26Ni5OQfeFKiSq8XU6zQ',RowKey='task')?timeout=7&sv=2014-02-14&tn=TasksT20190908TEhVSExR8&se=2019-09-08T14%3A14%3A40Z&sp=raud&sig=xP5d598W2u26%2FHE8KsBIEWFGGB8fyoa1noN7tfkfpLc%3D&st=2019-09-08T13%3A34%3A40Z
never completes. In fact, none of the events abort, aborted, response, timeout ever occur on that request, even after the test fails -- so it's not even in the node event loop, or mocha wouldn't have exited. It occurs to me it might have emitted error, which I didn't listen for. But if that had occurred, then workclaimer.js should have logged "claimTask from hint failed" which I don't see. I do see this logged if I throw a test exception in the claimTask method.
claims = await Promise.all(hints.map(async (hint) => {
try {
// Try to claim task from hint
let result = await this._monitor.timer('claimTask', this.claimTask(
hint.taskId, hint.runId, workerGroup, workerId, null, hint.hintId,
));
....
} catch (err) {
// Report error, don't block
this._monitor.reportError(err, {
comment: 'claimTask from hint failed',
});
...
}
return 'error-claiming';
}));
...
async claimTask(taskId, runId, workerGroup, workerId, task = null, hintId = null) {
// Load task, if not given
if (!task) {
task = await this._Task.load({taskId}, true);
if (!task) {
return 'task-not-found';
}
}
...
}
Also, the claimWork API method implementation sets a done property in the claim method (via the aborted Promise) after 20 seconds. So if the outer loop in that method:
while (claims.length === 0 && !done) {
was still looping, then we would see the API method return after 20 seconds. It really does look like this._Task.load(..) just never returned.
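Schematically, that loop behaves like the following sketch (hypothetical names, not the actual queue source): the 20-second abort only helps if the loop body actually returns control.

// Rough shape of the claim loop: `aborted` is a promise that resolves ~20s
// into the request; once it flips `done`, the loop exits even if no claims
// ever arrived, and the API method returns.
async function claimLoop(aborted, pollForClaims) {
  let done = false;
  aborted.then(() => { done = true; });
  let claims = [];
  while (claims.length === 0 && !done) {
    claims = await pollForClaims();  // hangs forever if a poll never resolves
  }
  return claims;
}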
I'll try again, this time logging more events on the request.
Comment 13 • Reporter • 6 years ago
2019-09-08T22:04:20.556Z taskcluster-client Calling: claimWork, retry: 0
req 1112 POST localhost undefined 60401 /api/queue/v1/claim-work/no-provisioner/gecko-b-1-android
...
evt 1112 abort
2019-09-08T22:04:50.563Z taskcluster-client Request error calling claimWork (retrying), err: Error: Aborted, JSON: Error: Aborted
evt 1112 error
2019-09-08T22:04:50.566Z mocha:runner finished running
Request 1146 never finishes:
2019-09-08T22:04:20.870Z azure:table Request: GET /TasksT20190908TQPuX6t6T(PartitionKey='WnmWmHdUQUCsA1hrX7NO3w',RowKey='task'), retry: 0
req 1146 GET jungle.table.core.windows.net undefined undefined /TasksT20190908TQPuX6t6T(PartitionKey='WnmWmHdUQUCsA1hrX7NO3w',RowKey='task')?timeout=7&sv=2014-02-14&tn=TasksT20190908TQPuX6t6T&se=2019-09-08T22%3A28%3A46Z&sp=raud&sig=SpisK%2B1GwFRjHEyLO73OwF91jo0bQx%2B0QqmHA%2F8a9KA%3D&st=2019-09-08T21%3A48%3A46Z
So, exactly the same behavior as the last comment, and confirming that none of the events ['error', 'timeout', 'abort', 'aborted', 'response', 'timeout'] occurred on that ClientRequest object and that it was not associated with an open event-loop timer or socket. This looks like a bug in the node http implementation.
Comment 14 • Reporter • 6 years ago
Maybe we're missing an event. https://stackoverflow.com/a/18087021 suggests monkey-patching emit. So let's do that.
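Concretely, the patch is along these lines (a sketch of the approach; logAllEvents is a name I'm making up here):

// Override emit on a ClientRequest so every event it fires -- documented or
// not, e.g. 'prefinish' -- gets logged with a timestamp.
const https = require('https');

function logAllEvents(req, id) {
  const origEmit = req.emit;
  req.emit = function (event, ...args) {
    console.log('evt', id, event, new Date().toISOString());
    return origEmit.apply(this, [event, ...args]);
  };
  return req;
}

const req = logAllEvents(https.request({host: 'example.com', path: '/'}), 1);
req.on('response', (res) => res.resume());
req.on('error', () => {});
req.end();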
Comment 15 • Reporter • 6 years ago
Different error here, but really simple:
req 3 PUT jungle.blob.core.windows.net undefined 443 /?comp=properties&restype=service
2019-09-09T13:15:28.843Z nock.common options.host: jungle.blob.core.windows.net
2019-09-09T13:15:28.844Z nock.common options.hostname in the end: "jungle.blob.core.windows.net"
2019-09-09T13:15:28.844Z nock.common options.host in the end: "jungle.blob.core.windows.net:443"
2019-09-09T13:15:28.844Z nock.intercept interceptors for "jungle.blob.core.windows.net:443"
2019-09-09T13:15:28.844Z nock.intercept filtering interceptors for basepath https://jungle.blob.core.windows.net:443
2019-09-09T13:15:28.844Z nock.common options.host: jungle.blob.core.windows.net:443
2019-09-09T13:15:28.844Z nock.common options.hostname in the end: "jungle.blob.core.windows.net"
2019-09-09T13:15:28.844Z nock.common options.host in the end: "jungle.blob.core.windows.net:443"
2019-09-09T13:15:28.844Z nock.intercept Net connect enabled for jungle.blob.core.windows.net:443
evt 3 socket
evt 3 prefinish
evt 3 finish
1) "before all" hook: withEntity for Artifact for "queue.listProvisioners returns an empty list"
2019-09-09T13:15:58.615Z nock.common options.hostname in the end: undefined
2019-09-09T13:15:58.615Z nock.common options.host in the end: "localhost:443"
2019-09-09T13:15:58.616Z mocha:runner finished running
prefinish and finish are not documented. I'll try adding timestamps to the events.
Comment 16 • Reporter • 6 years ago
2019-09-09T16:33:57.603Z azure:queue Request: POST /q03554-claim/messages, retry: 0
req 1180 POST jungle.queue.core.windows.net undefined undefined /q03554-claim/messages?visibilitytimeout=1&messagettl=604800&timeout=7
2019-09-09T16:33:57.603Z nock.common options.host: jungle.queue.core.windows.net
2019-09-09T16:33:57.603Z nock.common options.hostname in the end: "jungle.queue.core.windows.net"
2019-09-09T16:33:57.603Z nock.common options.host in the end: "jungle.queue.core.windows.net:443"
2019-09-09T16:33:57.603Z nock.intercept interceptors for "jungle.queue.core.windows.net:443"
2019-09-09T16:33:57.603Z nock.intercept filtering interceptors for basepath https://jungle.queue.core.windows.net:443
2019-09-09T16:33:57.603Z nock.common options.host: jungle.queue.core.windows.net:443
2019-09-09T16:33:57.603Z nock.common options.hostname in the end: "jungle.queue.core.windows.net"
2019-09-09T16:33:57.603Z nock.common options.host in the end: "jungle.queue.core.windows.net:443"
2019-09-09T16:33:57.603Z nock.intercept Net connect enabled for jungle.queue.core.windows.net:443
evt 1180 socket 2019-09-09T16:33:57.604Z
evt 1180 prefinish 2019-09-09T16:33:57.604Z
...
evt 1114 abort 2019-09-09T16:34:27.140Z
2019-09-09T16:34:27.140Z taskcluster-client Request error calling claimWork (retrying), err: Error: Aborted, JSON: Error: Aborted
evt 1114 error 2019-09-09T16:34:27.141Z
evt 1114 close 2019-09-09T16:34:27.141Z
2019-09-09T16:34:27.142Z mocha:runner finished running
Comment 17 • Reporter • 6 years ago
https://github.com/nodejs/node/blob/db706da235d80e5dadaab64328bfad9cb313be39/lib/_http_outgoing.js
if (this.finished) {
// This is a queue to the server or client to bring in the next this.
this._finish();
} else if (ret && this[kNeedDrain]) {
this[kNeedDrain] = false;
this.emit('drain');
}
where _finish() is
OutgoingMessage.prototype._finish = function _finish() {
assert(this.socket);
this.emit('prefinish');
};
so that's where the prefinish is coming from (I think ClientRequest is a subclass of OutgoingMessage). "This is a queue to the server or client to bring in the next this." I suspect that is supposed to be "This is a cue to the server or client to bring in the next message" (when multiple messages/requests are queued on a single HTTP connection).
So I think prefinish means "I've sent all my data to the socket". There's a timeout set on that request (7500ms), so it should have called socket.setTimeout and, if nothing else, emitted a timeout event 7500ms later. But it doesn't.
Comment 18 • Reporter • 6 years ago
I suspect I'm quite a way down the wrong trail here. Something changed last week to cause this. It wasn't a new node version.
- Azure?
- Task environment? Did github-worker change?
- We started running a lot more TC CI tasks due to renovate
I'm unassigning myself. If you'd like to give this a try, please do. Alternatively, we can disable some of the worst-offending tests (such as the claim-30-tasks test) to get back to mostly green checkmarks in GitHub until we solve this.
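(For the record, disabling a test in mocha is just a matter of adding .skip; this is illustrative only, assuming the tdd interface, and the test title below is a stand-in, not the real one.)

suite('claimwork', function() {
  test.skip('claim 30 tasks', async function() {
    // body omitted; .skip marks the test pending so CI stays green
  });
});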
Comment 19 • Reporter • 6 years ago
Not just queue!
https://taskcluster-artifacts.net/Xc-2Z4HiRjO23VK-0VrGCA/0/public/logs/live_backing.log
1) services/worker-manager/test/worker_scanner_test.js (real)
     "before each" hook: withEntity for Worker for "single worker":
     Error: Timeout of 30000ms exceeded. For async tests and hooks, ensure "done()" is called; if returning a Promise, ensure it resolves. (/taskcluster/services/worker-manager/test/worker_scanner_test.js)
Updated • Assignee • 6 years ago
Updated • Reporter • 6 years ago
Comment 20 • Reporter • 6 years ago
nock appears to be innocent -- even at v10, the failures continue:
1) services/queue/test/taskgroup_test.js (real)
     "before each" hook: withEntity for TaskDependency for "Create two tasks with same taskGroupId":
     Error: Timeout of 30000ms exceeded. For async tests and hooks, ensure "done()" is called; if returning a Promise, ensure it resolves. (/taskcluster/services/queue/test/taskgroup_test.js)

2) services/queue/test/taskgroup_test.js (real)
     "after all" hook: withEntity for TaskDependency for "task-group membership expiration (doesn't drop table)":
     Error: Timeout of 30000ms exceeded. For async tests and hooks, ensure "done()" is called; if returning a Promise, ensure it resolves. (/taskcluster/services/queue/test/taskgroup_test.js)

3) services/queue/test/workerinfo_test.js (mock)
     "before all" hook for "queue.listProvisioners returns an empty list":
     TypeError: helper.queueService.client._reset is not a function
at Context.<anonymous> (test/helper.js:118:34)
That last one is pretty interesting!
116 helper.load.cfg('azure.fake', true);
117 helper.queueService = await helper.load('queueService');
118 helper.queueService.client._reset();
Where the QueueService constructor uses a new FakeClient (which has a _reset method) if options.credentials.fake is set. Presumably queueService was already initialized with a real client at that time. How could that possibly be intermittent?
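Schematically, the setup looks something like this sketch (made-up class bodies, not the real QueueService source):

// Only the fake client has _reset(), so helper.js line 118 throws a TypeError
// if the real client got loaded.
class FakeClient {
  constructor() { this.queues = new Map(); }
  _reset() { this.queues.clear(); }
}

class RealClient {}  // stand-in for the fast-azure-storage Queue client

class QueueService {
  constructor(options) {
    this.client = options.credentials.fake ? new FakeClient() : new RealClient();
  }
}

// What test/helper.js expects to happen with cfg('azure.fake', true):
const svc = new QueueService({credentials: {fake: true}});
svc.client._reset();  // TypeError here means a real client was loaded instead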
Comment 21 • Reporter • 6 years ago
1) services/queue/test/claimwork_test.js (real)
     claimWork, reclaimTask, reportCompleted:
     Error: Timeout of 30000ms exceeded. For async tests and hooks, ensure "done()" is called; if returning a Promise, ensure it resolves. (/taskcluster/services/queue/test/claimwork_test.js)
Comment 22 • Reporter • 6 years ago
services/queue/test/artifact_test.js (real)
  ✓ Post Public S3 artifact (15298ms)
  1) "after each" hook for "Post Public S3 artifact"

1 passing (20s)
1 failing

1) "after each" hook for "Post Public S3 artifact":
   Error: Errors logged to monitor during test run:
{"code":"OperationTimedOut","message":"Operation could not be completed within the specified time.\nRequestId:2e7df674-f003-0015-0b64-69a41c000000\nTime:2019-09-12T12:22:30.6466616Z","module":"fast-azure-storage","name":"OperationTimedOutError","payload":"<?xml version=\"1.0\" encoding=\"utf-8\"?><Error><Code>OperationTimedOut</Code><Message>Operation could not be completed within the specified time.\nRequestId:2e7df674-f003-0015-0b64-69a41c000000\nTime:2019-09-12T12:22:30.6466616Z</Message></Error>","provisionerId":"no-provisioner","queue":"q37121-g3okl6atj5fuope2kwappjoz-ewlmcjzxabjkgg5zgktac73r-5","retries":0,"stack":"OperationTimedOutError: Operation could not be completed within the specified time.\nRequestId:2e7df674-f003-0015-0b64-69a41c000000\nTime:2019-09-12T12:22:30.6466616Z\n at /taskcluster/node_modules/fast-azure-storage/lib/queue.js:320:27\n at process._tickCallback (internal/process/next_tick.js:68:7)","statusCode":500,"v":1,"workerType":"test-worker"}
{"code":"OperationTimedOut","message":"Operation could not be completed within the specified time.\nRequestId:2e7df674-f003-0015-0b64-69a41c000000\nTime:2019-09-12T12:22:30.6466616Z","module":"fast-azure-storage","name":"OperationTimedOutError","note":"Failed to ensure azure queue in queueservice.js","payload":"<?xml version=\"1.0\" encoding=\"utf-8\"?><Error><Code>OperationTimedOut</Code><Message>Operation could not be completed within the specified time.\nRequestId:2e7df674-f003-0015-0b64-69a41c000000\nTime:2019-09-12T12:22:30.6466616Z</Message></Error>","provisionerId":"no-provisioner","queue":"q37121-g3okl6atj5fuope2kwappjoz-ewlmcjzxabjkgg5zgktac73r-5","retries":0,"stack":"OperationTimedOutError: Operation could not be completed within the specified time.\nRequestId:2e7df674-f003-0015-0b64-69a41c000000\nTime:2019-09-12T12:22:30.6466616Z\n at /taskcluster/node_modules/fast-azure-storage/lib/queue.js:320:27\n at process._tickCallback (internal/process/next_tick.js:68:7)","statusCode":500,"v":1,"workerType":"test-worker"}
{"code":"OperationTimedOut","incidentId":"4f67db50-06ba-47d6-b158-751cfc538dbd","message":"Operation could not be completed within the specified time.\nRequestId:2e7df674-f003-0015-0b64-69a41c000000\nTime:2019-09-12T12:22:30.6466616Z","method":"createTask","module":"fast-azure-storage","name":"OperationTimedOutError","note":"Failed to ensure azure queue in queueservice.js","params":{"taskId":"0tVM1HF7T36Z9-kClsxEsQ"},"payload":{"created":"2019-09-12T12:22:16.925Z","deadline":"2019-09-15T12:22:16.926Z","dependencies":[],"expires":"2020-09-15T12:22:16.926Z","extra":{},"metadata":{"description":"Task created during unit tests","name":"Unit testing task","owner":"jonsafj@mozilla.com","source":"https://github.com/taskcluster/taskcluster-queue"},"payload":{},"priority":"lowest","provisionerId":"no-provisioner","requires":"all-completed","retries":5,"routes":[],"schedulerId":"my-scheduler","scopes":[],"tags":{"purpose":"taskcluster-testing"},"taskGroupId":"dSlITZ4yQgmvxxAi4A8fHQ","workerType":"test-worker"},"provisionerId":"no-provisioner","queue":"q37121-g3okl6atj5fuope2kwappjoz-ewlmcjzxabjkgg5zgktac73r-5","retries":0,"stack":"OperationTimedOutError: Operation could not be completed within the specified time.\nRequestId:2e7df674-f003-0015-0b64-69a41c000000\nTime:2019-09-12T12:22:30.6466616Z\n at /taskcluster/node_modules/fast-azure-storage/lib/queue.js:320:27\n at process._tickCallback (internal/process/next_tick.js:68:7)","statusCode":500,"v":1,"workerType":"test-worker"}
{"code":"OperationTimedOut","message":"Operation could not be completed within the specified time.\nRequestId:082870ff-2003-00f2-1b64-69b411000000\nTime:2019-09-12T12:22:30.6644189Z","module":"fast-azure-storage","name":"OperationTimedOutError","payload":"<?xml version=\"1.0\" encoding=\"utf-8\"?><Error><Code>OperationTimedOut</Code><Message>Operation could not be completed within the specified time.\nRequestId:082870ff-2003-00f2-1b64-69b411000000\nTime:2019-09-12T12:22:30.6644189Z</Message></Error>","provisionerId":"no-provisioner","queue":"q37121-g3okl6atj5fuope2kwappjoz-ewlmcjzxabjkgg5zgktac73r-4","retries":0,"stack":"OperationTimedOutError: Operation could not be completed within the specified time.\nRequestId:082870ff-2003-00f2-1b64-69b411000000\nTime:2019-09-12T12:22:30.6644189Z\n at /taskcluster/node_modules/fast-azure-storage/lib/queue.js:320:27\n at process._tickCallback (internal/process/next_tick.js:68:7)","statusCode":500,"v":1,"workerType":"test-worker"}[0m[90m
at Context.<anonymous> (/taskcluster/libraries/testing/src/with-monitor.js:22:13)
at process.topLevelDomainCallback (domain.js:126:23)
I don't know why mocha stopped running tests after that failure -- we're not using --bail. Mysteries upon mysteries.
Comment 23 • Reporter • 6 years ago
1) services/queue/test/canceltask_test.js (real)
     defineTask, cancelTask (idempotent):
     Error: Timeout of 30000ms exceeded. For async tests and hooks, ensure "done()" is called; if returning a Promise, ensure it resolves. (/taskcluster/services/queue/test/canceltask_test.js)

2) "after each" hook for "createTask, cancelTask (idempotent)":
     Error: Errors logged to monitor during test run:
{"code":"OperationTimedOut","incidentId":"17d65757-e93d-45c5-9908-83470453bf75","message":"Operation could not be completed within the specified time.\nRequestId:845276c6-5003-00d4-3c64-692fa5000000\nTime:2019-09-12T12:21:41.4148877Z","method":"cancelTask","module":"fast-azure-storage","name":"OperationTimedOutError","params":{"taskId":"-UIWvp4ATm256TezMDa8Cg"},"payload":{},"retries":1,"stack":"OperationTimedOutError: Operation could not be completed within the specified time.\nRequestId:845276c6-5003-00d4-3c64-692fa5000000\nTime:2019-09-12T12:21:41.4148877Z\n at /taskcluster/node_modules/fast-azure-storage/lib/queue.js:320:27\n at process._tickCallback (internal/process/next_tick.js:68:7)","statusCode":500,"v":1}
{"code":"OperationTimedOut","message":"Operation could not be completed within the specified time.\nRequestId:2e55c6a2-3003-00a9-1e64-69b36d000000\nTime:2019-09-12T12:21:49.3750895Z","module":"fast-azure-storage","name":"OperationTimedOutError","payload":"<?xml version=\"1.0\" encoding=\"utf-8\"?><Error><Code>OperationTimedOut</Code><Message>Operation could not be completed within the specified time.\nRequestId:2e55c6a2-3003-00a9-1e64-69b36d000000\nTime:2019-09-12T12:21:49.3750895Z</Message></Error>","provisionerId":"no-provisioner-extended-extended","queue":"q66910-vofimivt2uifu3wkmxzjrhfm-dljkj3oal2nprldd4cf2aqrs-2","retries":0,"stack":"OperationTimedOutError: Operation could not be completed within the specified time.\nRequestId:2e55c6a2-3003-00a9-1e64-69b36d000000\nTime:2019-09-12T12:21:49.3750895Z\n at /taskcluster/node_modules/fast-azure-storage/lib/queue.js:320:27\n at process._tickCallback (internal/process/next_tick.js:68:7)","statusCode":500,"v":1,"workerType":"test-worker-extended-extended"}
{"code":"OperationTimedOut","message":"Operation could not be completed within the specified time.\nRequestId:2e55c6a2-3003-00a9-1e64-69b36d000000\nTime:2019-09-12T12:21:49.3750895Z","module":"fast-azure-storage","name":"OperationTimedOutError","note":"Failed to ensure azure queue in queueservice.js","payload":"<?xml version=\"1.0\" encoding=\"utf-8\"?><Error><Code>OperationTimedOut</Code><Message>Operation could not be completed within the specified time.\nRequestId:2e55c6a2-3003-00a9-1e64-69b36d000000\nTime:2019-09-12T12:21:49.3750895Z</Message></Error>","provisionerId":"no-provisioner-extended-extended","queue":"q66910-vofimivt2uifu3wkmxzjrhfm-dljkj3oal2nprldd4cf2aqrs-2","retries":0,"stack":"OperationTimedOutError: Operation could not be completed within the specified time.\nRequestId:2e55c6a2-3003-00a9-1e64-69b36d000000\nTime:2019-09-12T12:21:49.3750895Z\n at /taskcluster/node_modules/fast-azure-storage/lib/queue.js:320:27\n at process._tickCallback (internal/process/next_tick.js:68:7)","statusCode":500,"v":1,"workerType":"test-worker-extended-extended"}
{"code":"OperationTimedOut","incidentId":"e4d67ff0-c658-48ca-a3c7-a03d9b96164d","message":"Operation could not be completed within the specified time.\nRequestId:2e55c6a2-3003-00a9-1e64-69b36d000000\nTime:2019-09-12T12:21:49.3750895Z","method":"createTask","module":"fast-azure-storage","name":"OperationTimedOutError","note":"Failed to ensure azure queue in queueservice.js","params":{"taskId":"pVQC0-XjSyKP5kxApWenEA"},"payload":{"created":"2019-09-12T12:19:27.594Z","deadline":"2019-09-15T12:19:27.594Z","dependencies":[],"expires":"2020-09-15T12:19:27.594Z","extra":{},"metadata":{"description":"Task created during unit tests","name":"Unit testing task","owner":"jonsafj@mozilla.com","source":"https://github.com/taskcluster/taskcluster-queue"},"payload":{},"priority":"lowest","provisionerId":"no-provisioner-extended-extended","requires":"all-completed","retries":5,"routes":[],"schedulerId":"my-scheduler-extended-extended","scopes":[],"tags":{"purpose":"taskcluster-testing"},"taskGroupId":"dSlITZ4yQgmvxxAi4A8fHQ","workerType":"test-worker-extended-extended"},"provisionerId":"no-provisioner-extended-extended","queue":"q66910-vofimivt2uifu3wkmxzjrhfm-dljkj3oal2nprldd4cf2aqrs-2","retries":0,"stack":"OperationTimedOutError: Operation could not be completed within the specified time.\nRequestId:2e55c6a2-3003-00a9-1e64-69b36d000000\nTime:2019-09-12T12:21:49.3750895Z\n at /taskcluster/node_modules/fast-azure-storage/lib/queue.js:320:27\n at process._tickCallback (internal/process/next_tick.js:68:7)","statusCode":500,"v":1,"workerType":"test-worker-extended-extended"}
{"code":"OperationTimedOut","message":"Operation could not be completed within the specified time.\nRequestId:84e7d91d-4003-00e9-4a64-699a83000000\nTime:2019-09-12T12:21:49.4011341Z","module":"fast-azure-storage","name":"OperationTimedOutError","payload":"<?xml version=\"1.0\" encoding=\"utf-8\"?><Error><Code>OperationTimedOut</Code><Message>Operation could not be completed within the specified time.\nRequestId:84e7d91d-4003-00e9-4a64-699a83000000\nTime:2019-09-12T12:21:49.4011341Z</Message></Error>","provisionerId":"no-provisioner-extended-extended","queue":"q66910-vofimivt2uifu3wkmxzjrhfm-dljkj3oal2nprldd4cf2aqrs-6","retries":0,"stack":"OperationTimedOutError: Operation could not be completed within the specified time.\nRequestId:84e7d91d-4003-00e9-4a64-699a83000000\nTime:2019-09-12T12:21:49.4011341Z\n at /taskcluster/node_modules/fast-azure-storage/lib/queue.js:320:27\n at process._tickCallback (internal/process/next_tick.js:68:7)","statusCode":500,"v":1,"workerType":"test-worker-extended-extended"}
{"code":"OperationTimedOut","message":"Operation could not be completed within the specified time.\nRequestId:e6d2e46b-2003-007a-1c64-690cc8000000\nTime:2019-09-12T12:21:57.5851169Z","module":"fast-azure-storage","name":"OperationTimedOutError","payload":"<?xml version=\"1.0\" encoding=\"utf-8\"?><Error><Code>OperationTimedOut</Code><Message>Operation could not be completed within the specified time.\nRequestId:e6d2e46b-2003-007a-1c64-690cc8000000\nTime:2019-09-12T12:21:57.5851169Z</Message></Error>","provisionerId":"no-provisioner-extended-extended","queue":"q66910-vofimivt2uifu3wkmxzjrhfm-dljkj3oal2nprldd4cf2aqrs-2","retries":0,"stack":"OperationTimedOutError: Operation could not be completed within the specified time.\nRequestId:e6d2e46b-2003-007a-1c64-690cc8000000\nTime:2019-09-12T12:21:57.5851169Z\n at /taskcluster/node_modules/fast-azure-storage/lib/queue.js:320:27\n at process._tickCallback (internal/process/next_tick.js:68:7)","statusCode":500,"v":1,"workerType":"test-worker-extended-extended"}
{"code":"OperationTimedOut","message":"Operation could not be completed within the specified time.\nRequestId:e6d2e46b-2003-007a-1c64-690cc8000000\nTime:2019-09-12T12:21:57.5851169Z","module":"fast-azure-storage","name":"OperationTimedOutError","note":"Failed to ensure azure queue in queueservice.js","payload":"<?xml version=\"1.0\" encoding=\"utf-8\"?><Error><Code>OperationTimedOut</Code><Message>Operation could not be completed within the specified time.\nRequestId:e6d2e46b-2003-007a-1c64-690cc8000000\nTime:2019-09-12T12:21:57.5851169Z</Message></Error>","provisionerId":"no-provisioner-extended-extended","queue":"q66910-vofimivt2uifu3wkmxzjrhfm-dljkj3oal2nprldd4cf2aqrs-2","retries":0,"stack":"OperationTimedOutError: Operation could not be completed within the specified time.\nRequestId:e6d2e46b-2003-007a-1c64-690cc8000000\nTime:2019-09-12T12:21:57.5851169Z\n at /taskcluster/node_modules/fast-azure-storage/lib/queue.js:320:27\n at process._tickCallback (internal/process/next_tick.js:68:7)","statusCode":500,"v":1,"workerType":"test-worker-extended-extended"}
{"code":"OperationTimedOut","incidentId":"90284588-acb9-4a11-92fa-a9cad8bcf46b","message":"Operation could not be completed within the specified time.\nRequestId:e6d2e46b-2003-007a-1c64-690cc8000000\nTime:2019-09-12T12:21:57.5851169Z","method":"createTask","module":"fast-azure-storage","name":"OperationTimedOutError","note":"Failed to ensure azure queue in queueservice.js","params":{"taskId":"pVQC0-XjSyKP5kxApWenEA"},"payload":{"created":"2019-09-12T12:19:27.594Z","deadline":"2019-09-15T12:19:27.594Z","dependencies":[],"expires":"2020-09-15T12:19:27.594Z","extra":{},"metadata":{"description":"Task created during unit tests","name":"Unit testing task","owner":"jonsafj@mozilla.com","source":"https://github.com/taskcluster/taskcluster-queue"},"payload":{},"priority":"lowest","provisionerId":"no-provisioner-extended-extended","requires":"all-completed","retries":5,"routes":[],"schedulerId":"my-scheduler-extended-extended","scopes":[],"tags":{"purpose":"taskcluster-testing"},"taskGroupId":"dSlITZ4yQgmvxxAi4A8fHQ","workerType":"test-worker-extended-extended"},"provisionerId":"no-provisioner-extended-extended","queue":"q66910-vofimivt2uifu3wkmxzjrhfm-dljkj3oal2nprldd4cf2aqrs-2","retries":0,"stack":"OperationTimedOutError: Operation could not be completed within the specified time.\nRequestId:e6d2e46b-2003-007a-1c64-690cc8000000\nTime:2019-09-12T12:21:57.5851169Z\n at /taskcluster/node_modules/fast-azure-storage/lib/queue.js:320:27\n at process._tickCallback (internal/process/next_tick.js:68:7)","statusCode":500,"v":1,"workerType":"test-worker-extended-extended"}[0m[90m
at Context.<anonymous> (/taskcluster/libraries/testing/src/with-monitor.js:22:13)
at process.topLevelDomainCallback (domain.js:126:23)
Comment 24 • Reporter • 6 years ago
Similar to comment 22
services/queue/test/artifact_test.js (real)
  ✓ Post Public S3 artifact (36924ms)
  1) "after each" hook for "Post Public S3 artifact"

1 passing (42s)
1 failing

1) "after each" hook for "Post Public S3 artifact":
   Error: Errors logged to monitor during test run:
{"code":"OperationTimedOut","message":"Operation could not be completed within the specified time.\nRequestId:87a463c2-b003-00b3-1764-699c02000000\nTime:2019-09-12T12:23:14.1890006Z","module":"fast-azure-storage","name":"OperationTimedOutError","payload":"<?xml version=\"1.0\" encoding=\"utf-8\"?><Error><Code>OperationTimedOut</Code><Message>Operation could not be completed within the specified time.\nRequestId:87a463c2-b003-00b3-1764-699c02000000\nTime:2019-09-12T12:23:14.1890006Z</Message></Error>","provisionerId":"no-provisioner","queue":"q67094-g3okl6atj5fuope2kwappjoz-ewlmcjzxabjkgg5zgktac73r-7","retries":0,"stack":"OperationTimedOutError: Operation could not be completed within the specified time.\nRequestId:87a463c2-b003-00b3-1764-699c02000000\nTime:2019-09-12T12:23:14.1890006Z\n at /taskcluster/node_modules/fast-azure-storage/lib/queue.js:320:27\n at process._tickCallback (internal/process/next_tick.js:68:7)","statusCode":500,"v":1,"workerType":"test-worker"}
{"code":"OperationTimedOut","message":"Operation could not be completed within the specified time.\nRequestId:87a463c2-b003-00b3-1764-699c02000000\nTime:2019-09-12T12:23:14.1890006Z","module":"fast-azure-storage","name":"OperationTimedOutError","note":"Failed to ensure azure queue in queueservice.js","payload":"<?xml version=\"1.0\" encoding=\"utf-8\"?><Error><Code>OperationTimedOut</Code><Message>Operation could not be completed within the specified time.\nRequestId:87a463c2-b003-00b3-1764-699c02000000\nTime:2019-09-12T12:23:14.1890006Z</Message></Error>","provisionerId":"no-provisioner","queue":"q67094-g3okl6atj5fuope2kwappjoz-ewlmcjzxabjkgg5zgktac73r-7","retries":0,"stack":"OperationTimedOutError: Operation could not be completed within the specified time.\nRequestId:87a463c2-b003-00b3-1764-699c02000000\nTime:2019-09-12T12:23:14.1890006Z\n at /taskcluster/node_modules/fast-azure-storage/lib/queue.js:320:27\n at process._tickCallback (internal/process/next_tick.js:68:7)","statusCode":500,"v":1,"workerType":"test-worker"}
{"code":"OperationTimedOut","incidentId":"2214646c-9e7f-4bab-b543-4b3523bf3626","message":"Operation could not be completed within the specified time.\nRequestId:87a463c2-b003-00b3-1764-699c02000000\nTime:2019-09-12T12:23:14.1890006Z","method":"createTask","module":"fast-azure-storage","name":"OperationTimedOutError","note":"Failed to ensure azure queue in queueservice.js","params":{"taskId":"HVmvOwF3TyqDq0G2w2dKIw"},"payload":{"created":"2019-09-12T12:22:46.900Z","deadline":"2019-09-15T12:22:46.900Z","dependencies":[],"expires":"2020-09-15T12:22:46.900Z","extra":{},"metadata":{"description":"Task created during unit tests","name":"Unit testing task","owner":"jonsafj@mozilla.com","source":"https://github.com/taskcluster/taskcluster-queue"},"payload":{},"priority":"lowest","provisionerId":"no-provisioner","requires":"all-completed","retries":5,"routes":[],"schedulerId":"my-scheduler","scopes":[],"tags":{"purpose":"taskcluster-testing"},"taskGroupId":"dSlITZ4yQgmvxxAi4A8fHQ","workerType":"test-worker"},"provisionerId":"no-provisioner","queue":"q67094-g3okl6atj5fuope2kwappjoz-ewlmcjzxabjkgg5zgktac73r-7","retries":0,"stack":"OperationTimedOutError: Operation could not be completed within the specified time.\nRequestId:87a463c2-b003-00b3-1764-699c02000000\nTime:2019-09-12T12:23:14.1890006Z\n at /taskcluster/node_modules/fast-azure-storage/lib/queue.js:320:27\n at process._tickCallback (internal/process/next_tick.js:68:7)","statusCode":500,"v":1,"workerType":"test-worker"}
{"code":"OperationTimedOut","message":"Operation could not be completed within the specified time.\nRequestId:79b389a8-5003-0135-5664-698e85000000\nTime:2019-09-12T12:23:14.7239282Z","module":"fast-azure-storage","name":"OperationTimedOutError","payload":"<?xml version=\"1.0\" encoding=\"utf-8\"?><Error><Code>OperationTimedOut</Code><Message>Operation could not be completed within the specified time.\nRequestId:79b389a8-5003-0135-5664-698e85000000\nTime:2019-09-12T12:23:14.7239282Z</Message></Error>","provisionerId":"no-provisioner","queue":"q67094-g3okl6atj5fuope2kwappjoz-ewlmcjzxabjkgg5zgktac73r-1","retries":0,"stack":"OperationTimedOutError: Operation could not be completed within the specified time.\nRequestId:79b389a8-5003-0135-5664-698e85000000\nTime:2019-09-12T12:23:14.7239282Z\n at /taskcluster/node_modules/fast-azure-storage/lib/queue.js:320:27\n at process._tickCallback (internal/process/next_tick.js:68:7)","statusCode":500,"v":1,"workerType":"test-worker"}
{"code":"OperationTimedOut","message":"Operation could not be completed within the specified time.\nRequestId:234ad7d6-1003-003d-2b64-69d3a3000000\nTime:2019-09-12T12:23:14.7690700Z","module":"fast-azure-storage","name":"OperationTimedOutError","payload":"<?xml version=\"1.0\" encoding=\"utf-8\"?><Error><Code>OperationTimedOut</Code><Message>Operation could not be completed within the specified time.\nRequestId:234ad7d6-1003-003d-2b64-69d3a3000000\nTime:2019-09-12T12:23:14.7690700Z</Message></Error>","provisionerId":"no-provisioner","queue":"q67094-g3okl6atj5fuope2kwappjoz-ewlmcjzxabjkgg5zgktac73r-6","retries":0,"stack":"OperationTimedOutError: Operation could not be completed within the specified time.\nRequestId:234ad7d6-1003-003d-2b64-69d3a3000000\nTime:2019-09-12T12:23:14.7690700Z\n at /taskcluster/node_modules/fast-azure-storage/lib/queue.js:320:27\n at process._tickCallback (internal/process/next_tick.js:68:7)","statusCode":500,"v":1,"workerType":"test-worker"}[0m[90m
at Context.<anonymous> (/taskcluster/libraries/testing/src/with-monitor.js:22:13)
at process.topLevelDomainCallback (domain.js:126:23)
| Reporter | ||
Comment 25•6 years ago
|
||
WAT
https://taskcluster-artifacts.net/Zd_0OJQVTBGDgGufQWZhTg/1/public/logs/live_backing.log
1) services/queue/test/queueservice_test.js (real)
deleteUnusedWorkerQueues (can delete queues):
QueueBeingDeletedError: The specified queue is being deleted.
RequestId:e31aa8b4-7003-004b-3a66-69571f000000
Time:2019-09-12T12:34:38.6675581Z
at /taskcluster/node_modules/fast-azure-storage/lib/queue.js:320:27
at process._tickCallback (internal/process/next_tick.js:68:7)
| Reporter | ||
Comment 26•6 years ago
|
||
1) services/queue/test/taskgroup_test.js (real)
"before all" hook: withEntity for Artifact for "Create two tasks with same taskGroupId":
Error: Timeout of 30000ms exceeded. For async tests and hooks, ensure "done()" is called; if returning a Promise, ensure it resolves. (/taskcluster/services/queue/test/taskgroup_test.js)
| Reporter | ||
Comment 27•6 years ago
•
|
||
I tried bisecting, but there didn't seem to be much of a pattern. I'm running the test suite 30 times against each of the merge commits from the last few days (see the sketch after this list for roughly how):
- 679ab1223 - 2019-09-12 11:14:08 - 0% success
- 2061ff541 - 2019-09-12 08:34:46 - 40%
- 6d0e8d031 - 2019-09-12 08:28:05 - 36%
- 1942d582c - 2019-09-12 08:21:30 - 16%
- 366d745ab - 2019-09-12 08:16:36 - 20%
- aa2d84d57 - 2019-09-12 07:21:20 - 43%
- 10079796d - 2019-09-12 07:15:07 - 10%
- 679ab1223 - 2019-09-12 11:14:08 - 0%
- ee4ca085a - 2019-09-11 16:42:38 - 6%
- de60ad675 - 2019-09-11 11:44:53 - 0%
- e8cd2be62 - 2019-09-11 14:36:50 - 10%
- d8b14fc93 - 2019-09-11 09:40:43 - 20%
- e8f93fa19 - 2019-09-11 12:15:30 - 23%
- 295d1b0e0 - 2019-09-11 11:47:26 - 33%
- b8b0bd3df - 2019-09-11 11:20:14 - 6%
- dcd863102 - 2019-09-11 11:01:02 - 20%
- 354f0f131 - 2019-09-11 10:20:30 - 26%
- 334e1cae7 - 2019-09-11 09:58:54 - 3%
- 8a635531e - 2019-09-11 09:51:45 - 33%
- ebe546c37 - 2019-09-11 08:57:08 - 13%
- 34c7df13d - 2019-09-11 08:56:00 - 13%
- 2f83dde16 - 2019-09-10 15:53:38 - 16%
- 09cba97fa - 2019-09-10 15:38:46 - 43%
- 05d832120 - 2019-09-10 15:35:03 - 20%
- cdbcb1d40 - 2019-09-10 15:34:12 - 36%
- 8f05b7ba6 - 2019-09-10 15:33:09 - 30%
- 60a35ae3b - 2019-09-10 12:16:18 - 23%
- 29ed716d0 - 2019-09-10 14:28:35 - 20%
- 5d15a50d8 - 2019-09-10 14:28:16 - 13%
- 2d455a667 - 2019-09-10 13:49:34 - 13%
- 5229312cc - 2019-09-10 13:47:52 - 13%
- 7a93dbfcb - 2019-09-10 13:21:30 - 23%
- 581aa782e - 2019-09-10 13:20:24 - 30%
- c646022da - 2019-09-10 12:20:30 - 10%
- a40620a0a - 2019-09-10 12:20:12 - 26%
- 310ce70ca - 2019-09-10 11:32:15 - 16%
- e34c2de30 - 2019-09-10 11:30:31 - 26%
- e6143f728 - 2019-09-10 10:16:38 - 30%
- ce7fb86b4 - 2019-09-10 09:57:51 - 13%
- 2c2979421 - 2019-09-10 09:29:43 - 30%
- df1739639 - 2019-09-10 09:25:59 - 36%
- edcb3412c - 2019-09-09 16:08:13 - 43%
- fa701bb48 - 2019-09-09 14:50:24 - 26%
- 6058c82d9 - 2019-09-09 11:07:03 - 36%
- 33ec08712 - 2019-09-09 12:26:06 - 26%
- aa3957a6d - 2019-09-09 11:07:29 - 26%
- c0994bafc - 2019-09-09 10:08:30 - 36%
- 814c70e6b - 2019-09-09 09:22:58 - 20%
- ef8e43ab9 - 2019-09-06 14:04:48 - 16%
- b519aca7e - 2019-09-06 15:05:08 - 26%
- 953a1bbc5 - 2019-09-06 15:04:00 - 10%
- 162bd3d63 - 2019-09-06 12:44:28 - 3%
- b1d1039c7 - 2019-09-05 16:39:12 - 36%
- 88016ca3c - 2019-09-05 15:39:36 - 36%
- 6f05b0bad - 2019-09-05 15:26:13 - 23%
- d9bc71745 - 2019-09-05 14:03:22 - 50%
- 1a3ba2de1 - 2019-09-05 13:48:58 - 66%
- 1a975a185 - 2019-09-05 10:22:16 - 36%
- bbc182a7d - 2019-09-04 17:07:20 - 16%
- e68cfac9a - 2019-09-04 16:47:33 - 20%
- 6bb1977ab - 2019-09-04 16:41:09 - 23%
- c2ef83f3a - 2019-09-04 16:39:44 - 16%
- 346f07b73 - 2019-09-04 15:09:39 - 36%
- 5e880a71a - 2019-09-04 14:38:49 - 50%
- 15cc441b4 - 2019-09-04 14:38:25 - 16%
- 7382c2dfc - 2019-09-04 14:11:54 - 23%
- a20e5a56f - 2019-09-04 13:13:05 - 50%
- 8d487cd32 - 2019-09-04 12:52:38 - 36%
- 83bca3acd - 2019-09-03 14:58:29 - 23%
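For reference, here's a minimal sketch of the kind of loop behind those numbers. It is not the exact harness: the commit list is truncated, and `yarn test` stands in for whatever command runs the queue suite, with any non-zero exit (timeouts included) counted as a failure.
```
// Sketch: measure per-commit flakiness by running the suite RUNS times
// per merge commit and printing the fraction of green runs.
const {execSync} = require('child_process');

const commits = ['679ab1223', '2061ff541', '6d0e8d031' /* ... */];
const RUNS = 30;

for (const commit of commits) {
  execSync(`git checkout --quiet ${commit}`);
  let passes = 0;
  for (let i = 0; i < RUNS; i++) {
    try {
      execSync('yarn test', {stdio: 'ignore'});
      passes++;
    } catch (err) {
      // non-zero exit: count as a failed run and keep going
    }
  }
  console.log(`${commit}: ${Math.round((passes / RUNS) * 100)}% success`);
}
```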
| Reporter | ||
Comment 28•6 years ago
|
||
So, something has changed at Azure. This problem started in earnest around Sept 6, right about the same time that the auth tests started timing out because there were too many containers and I added a "test:cleanup" test to delete containers.
| Reporter | ||
Comment 29•6 years ago
|
||
Flipping through the various metrics in the Azure portal doesn't show anything that changed dramatically during that time. I see elevated numbers over that weekend, but that's because I was running a bunch of tests then. And I see a huge spike from today, but that's from comment 27 :)
| Reporter | ||
Comment 30•6 years ago
|
||
OK, let's see if a new (non-classic) Azure account works better.
If so, I don't know if that makes me happy or sad, as it means that someday our production Azure will fall apart, right?
| Reporter | ||
Comment 31•6 years ago
|
||
Well, https://tools.taskcluster.net/groups/YGK7A6KqTL6Yiv8OITeq2w/tasks/DXOR8HegTd28po0GZzXBgA/runs/0/logs/public%2Flogs%2Flive_backing.log was on bug 1580970's merge to master.
1) services/queue/test/createtask_test.js (real)
"before all" hook: withEntity for Artifact for "createTask":
Error: Timeout of 30000ms exceeded. For async tests and hooks, ensure "done()" is called; if returning a Promise, ensure it resolves. (/taskcluster/services/queue/test/createtask_test.js)
and that's still with a bunch of queue tests skipped.
So I think we can conclude that something changed at Azure that is not related to a particular storage account.
| Reporter | ||
Comment 32•6 years ago
|
||
It doesn't look like anything changed with github-worker at that time:
https://github.com/taskcluster/mozilla-history/commits/master/AWSWorkerTypes/github-worker
(that would be a possibility even though I've replicated this "locally"; "local" for me is an EC2 instance running Ubuntu)
| Reporter | ||
Comment 33•6 years ago
|
||
1) services/worker-manager/test/provider_google_test.js (real)
"before each" hook: withEntity for WorkerPoolError for "provisioning loop":
Error: Timeout of 30000ms exceeded. For async tests and hooks, ensure "done()" is called; if returning a Promise, ensure it resolves. (/taskcluster/services/worker-manager/test/provider_google_test.js)
| Reporter | ||
Comment 35•6 years ago
|
||
https://github.com/taskcluster/taskcluster/pull/1393 disables tests against Azure.
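For illustration, a change like that usually amounts to guarding the real-Azure suites on credentials so they skip instead of hitting the 30s mocha timeout. A minimal sketch, not the actual diff; the env-var name and suite are placeholders:
```
// Sketch only: skip the "(real)" variant of a suite when no Azure
// credentials are configured, rather than timing out in every hook.
suite('queueservice_test.js (real)', function() {
  suiteSetup(function() {
    if (!process.env.AZURE_ACCOUNT_KEY) {
      this.skip(); // marks the remaining tests in this suite as pending
    }
  });

  test('can delete queues', async function() {
    // ... talks to real Azure ...
  });
});
```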
| Reporter | ||
Comment 36•6 years ago
|
||
fast-azure-storage does have a custom https.Agent subclass which overrides private methods. That could explain some of the weird behavior of the Node http library seen above. It looks like it's built for an older version of Node.
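To make that concrete, here is a hypothetical sketch of the pattern (not fast-azure-storage's actual code): subclass https.Agent and override undocumented internals, which is fragile because those internals change between Node releases.
```
// Hypothetical illustration of the risky pattern, not the library's code.
// Agent.prototype.createSocket is an undocumented internal whose signature
// has changed across Node releases, so an override written for an older
// Node can silently misbehave on a newer one.
const https = require('https');

class CustomKeepAliveAgent extends https.Agent {
  constructor(options = {}) {
    super({keepAlive: true, ...options});
  }

  createSocket(req, options, cb) {
    // custom socket bookkeeping would go here...
    return super.createSocket(req, options, cb);
  }
}

module.exports = CustomKeepAliveAgent;
```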
I think the choices here are:
- Rewrite our entire Azure interface to use the newer Azure SDKs instead of fast-azure-storage
- Rewrite everything to use Postgres
I'm going to continue to hold my breath and hope nothing gets worse with Azure.
https://github.com/taskcluster/taskcluster/pull/1994 fixes #1393 to actually skip things.
Updated•5 years ago
|
| Reporter | ||
Comment 37•5 years ago
|
||
I would argue we FIXED this ;)