Closed Bug 1311981 Opened 8 years ago Closed 8 years ago

Recover after Heroku/Github DNS issues on Friday October 21, 2016

Categories

(Taskcluster :: General, defect)

defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: jhford, Unassigned)

Attachments

(1 file)

It looks like DNS issues at Heroku are reducing the availability of our services, and GitHub is having DNS issues as well.

Trees are closed.

Heroku acknowledges that there are DNS issues and is working on a fix.
See Also: → 1311964
Depends on: 1311984
From Heroku:

We are seeing many services return to normal following remediation by our upstream provider. Our engineers will continue to monitor the issue.


I am still seeing tons of DNS resolution failures in the provisioner logs, for example:

2016-10-21T13:25:38.327241+00:00 app[web.1]: Fri, 21 Oct 2016 13:25:38 GMT base:api Error occurred handling: /list-worker-type-summaries, err: Error: getaddrinfo ENOTFOUND auth.taskcluster.net auth.taskcluster.net:443, as JSON: {"code":"ENOTFOUND","errno":"ENOTFOUND","syscall":"getaddrinfo","hostname":"auth.taskcluster.net","host":"auth.taskcluster.net","port":443,"incidentId":"2a364198-b8d1-4bff-a8e4-275ff9919ae1"}, incidentId: 362e8852-cb8c-4865-89e6-9da45d4467cb Error: getaddrinfo ENOTFOUND auth.taskcluster.net auth.taskcluster.net:443
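For anyone who wants to check whether a hostname such as auth.taskcluster.net is resolving again before restarting things, here is a minimal sketch in Python. It is not part of any Taskcluster tooling; the function names `resolves` and `wait_for_dns`, the retry count, and the delay are all assumptions made for illustration. It uses the standard `socket.getaddrinfo` call, which fails with `socket.gaierror` in the same situation where Node reports `getaddrinfo ENOTFOUND`.

```python
import socket
import time


def resolves(hostname, port=443, resolver=socket.getaddrinfo):
    """Return True if hostname resolves, False if the lookup fails."""
    try:
        resolver(hostname, port)
        return True
    except socket.gaierror:
        # Python's counterpart of the getaddrinfo ENOTFOUND error in the log
        return False


def wait_for_dns(hostname, attempts=5, delay=2.0,
                 resolver=socket.getaddrinfo, sleep=time.sleep):
    """Poll until hostname resolves.

    Returns the number of attempts used, or None if every attempt failed.
    attempts and delay are illustrative defaults, not tuned values.
    """
    for attempt in range(1, attempts + 1):
        if resolves(hostname, resolver=resolver):
            return attempt
        if attempt < attempts:
            sleep(delay)
    return None
```

Calling `wait_for_dns("auth.taskcluster.net")` would return the attempt number on which the lookup first succeeded, or `None` if DNS was still failing after all attempts. The `resolver` and `sleep` parameters exist only so the helper can be exercised without network access.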
Restarted auth and provisioner; the provisioner seems to be back online.
I've also restarted (assume taskcluster- prefix):

* mozilla-taskcluster
* scheduler-taskcluster-net
* events
* github
* httpbin
* index
* secrets

These are all the services I know of that were running when the restart was needed.
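If a restart like this is needed again, the list above could be scripted. The following is only a sketch, assuming the services are Heroku apps restarted with the Heroku CLI's `heroku ps:restart --app <name>` command and that the `taskcluster-` prefix applies uniformly to every listed name; the real app names may differ and were not verified. The helper names `restart_commands` and `restart_all` are hypothetical.

```python
import subprocess

# Service names as listed in the comment above; the prefix is applied
# literally, which is an assumption about the real Heroku app names.
SERVICES = [
    "mozilla-taskcluster",
    "scheduler-taskcluster-net",
    "events",
    "github",
    "httpbin",
    "index",
    "secrets",
]


def restart_commands(services, prefix="taskcluster-"):
    """Build one `heroku ps:restart` argv list per service."""
    return [["heroku", "ps:restart", "--app", prefix + name] for name in services]


def restart_all(services, prefix="taskcluster-", dry_run=True):
    """Print each restart command; run it only when dry_run is False."""
    for argv in restart_commands(services, prefix):
        print(" ".join(argv))
        if not dry_run:
            # check=True stops at the first restart that fails
            subprocess.run(argv, check=True)
```

`restart_all(SERVICES)` only prints the commands; passing `dry_run=False` would actually invoke the Heroku CLI, so the printed app names should be checked first.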
GitHub did post something to their status page:


October 21, 2016
15:14 CEST
A global event is affecting an upstream DNS provider. GitHub services may be intermittently available at this time.
See Also: 1311964
Trees have been reopened and I see tasks flowing in. I think we're OK.
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED