(In reply to Pete Moore [:pmoore][:pete] from comment #8)
> (In reply to Dustin J. Mitchell [:dustin] pronoun: he from comment #5)
>
> > What did you mean by "special"?
>
> Clearly this doesn't happen for all provisionerId/workerType/workerId combinations, otherwise we'd have a downtime. But since it happens consistently and reproducibly for the given provisionerId/workerType/workerId combination, that makes it special; it is behaving differently to the many workers that are happily running tasks at the moment in production without exhibiting this issue.

To recap:

We have three reproducible queries (`claimTask`, `pendingTasks`, `listWorkers`). The `claimTask` call that gets back zero tasks (immediately), despite:

1) the queue should hold the connection for 20s since the queue implements 20s long-polling
2) there are pending tasks for this worker type (according to the `pendingTasks` query)
3) the worker is not listed as quarantined nor as not-quarantined (according to the `listWorkers` query)
Bug 1519849 Comment 9 Edit History
Note: The actual edited comment in the bug view page will always show the original commenter’s name and original timestamp.
(In reply to Pete Moore [:pmoore][:pete] from comment #8)
> (In reply to Dustin J. Mitchell [:dustin] pronoun: he from comment #5)
>
> > What did you mean by "special"?
>
> Clearly this doesn't happen for all provisionerId/workerType/workerId combinations, otherwise we'd have a downtime. But since it happens consistently and reproducibly for the given provisionerId/workerType/workerId combination, that makes it special; it is behaving differently to the many workers that are happily running tasks at the moment in production without exhibiting this issue.

To recap:

We have three reproducible queries (`claimTask`, `pendingTasks`, `listWorkers`). The `claimTask` call gets back zero tasks (immediately), despite:

1) the queue should hold the connection for 20s since the queue implements 20s long-polling
2) there are pending tasks for this worker type (according to the `pendingTasks` query)
3) the worker is not listed as quarantined nor as not-quarantined (according to the `listWorkers` query)
(In reply to Pete Moore [:pmoore][:pete] from comment #8)
> (In reply to Dustin J. Mitchell [:dustin] pronoun: he from comment #5)
>
> > What did you mean by "special"?
>
> Clearly this doesn't happen for all provisionerId/workerType/workerId combinations, otherwise we'd have a downtime. But since it happens consistently and reproducibly for the given provisionerId/workerType/workerId combination, that makes it special; it is behaving differently to the many workers that are happily running tasks at the moment in production without exhibiting this issue.

To recap:

We have three reproducible queries (`claimTask`, `pendingTasks`, `listWorkers`). The `claimTask` call gets back zero tasks (immediately), despite:

1) the queue should hold the connection for 20s since the queue implements 20s long-polling
2) there are pending tasks for this worker type (according to the `pendingTasks` query)
3) the worker is not listed as quarantined nor as not-quarantined (according to the `listWorkers` query)

If the worker is not listed as quarantined, and not listed as not-quarantined, calling `claimTask` should return one of the pending tasks, and the worker should be listed in future `listWorkers` calls. This is not happening.
(In reply to Dustin J. Mitchell [:dustin] pronoun: he from comment #5)
> What did you mean by "special"?

Clearly this doesn't happen for all provisionerId/workerType/workerId combinations, otherwise we'd have a downtime. But since it happens consistently and reproducibly for the given provisionerId/workerType/workerId combination, that makes it special; it is behaving differently to the many workers that are happily running tasks at the moment in production without exhibiting this issue.

To recap:

We have three reproducible queries (`claimTask`, `pendingTasks`, `listWorkers`). The `claimTask` call gets back zero tasks (immediately), despite:

1) the queue should hold the connection for 20s since the queue implements 20s long-polling
2) there are pending tasks for this worker type (according to the `pendingTasks` query)
3) the worker is not listed as quarantined nor as not-quarantined (according to the `listWorkers` query)

If the worker is not listed as quarantined, and not listed as not-quarantined, calling `claimTask` should return one of the pending tasks, and the worker should be listed in future `listWorkers` calls. This is not happening.
(In reply to Dustin J. Mitchell [:dustin] pronoun: he from comment #5)
> What did you mean by "special"?

Clearly this doesn't happen for all provisionerId/workerType/workerId combinations, otherwise we'd have a downtime. But since it happens consistently and reproducibly for the given provisionerId/workerType/workerId combination, that makes it special; it is behaving differently to the many workers that are happily running tasks at the moment in production without exhibiting this issue.

To recap:

We have three reproducible queries (`claimTask`, `pendingTasks`, `listWorkers`). The `claimTask` call gets back zero tasks (immediately), despite:

1) the queue should hold the connection for 20s when returning no tasks, since the queue implements 20s long-polling
2) there are pending tasks for this worker type (according to the `pendingTasks` query)
3) the worker is not listed as quarantined nor as not-quarantined (according to the `listWorkers` query)

If the worker is not listed as quarantined, and not listed as not-quarantined, calling `claimTask` should return one of the pending tasks, and the worker should be listed in future `listWorkers` calls. This is not happening.
To recap:

We have three reproducible queries (`claimTask`, `pendingTasks`, `listWorkers`). The `claimTask` call gets back zero tasks (immediately), despite:

1) the queue should hold the connection for 20s when returning no tasks, since the queue implements 20s long-polling
2) there are pending tasks for this worker type (according to the `pendingTasks` query)
3) the worker is not listed as quarantined nor as not-quarantined (according to the `listWorkers` query)

If the worker is not listed as quarantined, and not listed as not-quarantined, calling `claimTask` should return one of the pending tasks, and the worker should be listed in future `listWorkers` calls. This is not happening.
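
The three expectations in the recap can be written down as a small check. This is only a sketch: it makes no calls to the queue, and the `Observation` type, its field names, and the one-second tolerance are assumptions made for illustration. The 20s long-poll figure is the only value taken from the comment, and the numbers in the usage example are invented.

```python
from dataclasses import dataclass
from typing import List, Optional

LONG_POLL_SECONDS = 20.0  # long-poll duration stated in the comment


@dataclass
class Observation:
    """One round of the three queries, reduced to plain values (hypothetical shape)."""
    claimed_task_count: int        # number of tasks the claim call returned
    claim_elapsed_seconds: float   # wall-clock duration of the claim call
    pending_task_count: int        # count reported by the pendingTasks query
    worker_listed: bool            # worker appears in listWorkers after the claim call
    worker_quarantined: Optional[bool] = None  # None when the worker is not listed


def unexpected_findings(obs: Observation, tolerance_seconds: float = 1.0) -> List[str]:
    """Return the expectations from the recap that this observation violates."""
    findings = []
    if obs.claimed_task_count == 0:
        # 1) an empty claim should only come back after the long-poll window
        if obs.claim_elapsed_seconds < LONG_POLL_SECONDS - tolerance_seconds:
            findings.append("empty claim returned before the long-poll window elapsed")
        # 2) pending tasks should be handed to a worker that is not quarantined
        if obs.pending_task_count > 0 and obs.worker_quarantined is not True:
            findings.append("empty claim although tasks are pending")
    # 3) a worker that has called claim should show up in later listWorkers calls
    if not obs.worker_listed:
        findings.append("worker not listed by listWorkers after claiming")
    return findings


# Illustrative values only; they are not measurements from this bug.
reported = Observation(claimed_task_count=0, claim_elapsed_seconds=0.2,
                       pending_task_count=5, worker_listed=False)
for finding in unexpected_findings(reported):
    print(finding)
```

With values shaped like the behaviour described here, all three expectations are flagged. A worker that receives a task and is listed afterwards produces no findings, and neither does a quarantined worker whose empty claim took the full long-poll window.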