Bug 1243862
Opened 9 years ago
Closed 9 years ago
Provisioner failed to complete provisioning iteration
Categories
(Taskcluster :: Services, defect)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: garndt, Unassigned)
Details
At 19:28 UTC the provisioner started a provisioning iteration that never completed. After 25 minutes I restarted the provisioner, but I failed to save the Heroku logs. When I skimmed the logs, I didn't see any crashes that jumped out; mostly it was just the provisioner reporting stats to Influx.
Dead Man's Snitch didn't report the provisioner as having missed its check-ins. The problem was noticed because a monitor on some Influx stats raised an alert when no iterations were logged for more than 10 minutes.
Also, around the same time we had some EC2 instances failing to pull files from S3. This is probably purely coincidental, but I thought I would mention it since there seemed to be a different AWS hiccup around the same time.
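The Influx-based alert mentioned above amounts to a staleness check: flag the provisioner when the last logged iteration is older than some threshold. A minimal sketch of that idea follows. The function name, the constant, and the timestamp arguments are hypothetical and are not taken from the provisioner's code or from the actual monitor configuration.

```javascript
// Hypothetical staleness check; names and threshold are illustrative only.
// 10 minutes mirrors the alert window mentioned in the report.
const MAX_SILENCE_MS = 10 * 60 * 1000;

// Returns true when more than maxSilenceMs has elapsed since the last
// logged iteration. Both timestamps are milliseconds since the epoch.
function isStalled(lastIterationMs, nowMs, maxSilenceMs = MAX_SILENCE_MS) {
  return nowMs - lastIterationMs > maxSilenceMs;
}
```

Under these assumptions, an iteration logged at 19:28 UTC with nothing after it would be reported as stalled by any check running later than 19:38 UTC.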
Updated•9 years ago
Flags: needinfo?(jhford)
Reporter
Comment 1•9 years ago
I'm not sure if it's an issue with me receiving emails or Dead Man's Snitch sending them, but I didn't receive the email that the provisioner failed to check in, but I did receive an email that it started reporting again 40 minutes after I restarted it.
Comment 2•9 years ago
(In reply to Greg Arndt [:garndt] from comment #1)
> I'm not sure if it's an issue with me receiving emails or dead man snitch
> sending them but I didn't receive the email that the provisioner failed to
> check in, but I did receive an email that it started reporting again 40
> minutes after I restarted it.
I reliably get them, and I can't see much difference between the two emails. I've sent you a copy of the raw email so you can see if something is hitting filters you have.
Flags: needinfo?(jhford)
Comment 3•9 years ago
I restarted the provisioner again this morning due to a ~4h outage. I'm afraid I wasn't able to capture logs this time either.
Reporter
Comment 4•9 years ago
I have typically gotten them, and I was receiving them last night during the downtime, so they're not being filtered. It must have been a hiccup yesterday that delayed the emails from them.
Comment 5•9 years ago
That is strange: they were arriving every 10 minutes and didn't all arrive at once. Is it possible there were two outages?
Comment 6•9 years ago
This issue was fixed by rewriting the aws-manager.js file, switching to Node 4, statically transpiling, and removing the multi-region-aws-sdk library.
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Assignee
Updated•6 years ago
Component: AWS-Provisioner → Services