Since around 13:00 PDT we have been seeing intermittent test failures on inbound and try (and probably other trees) caused by 500 errors from ftp.mozilla.org. I am not sure whether this is related to bug 804119. Possibly related: I am also getting intermittent 500 errors when making requests to http://tbpl.mozilla.org, including requests for static files on that site.
Between the failing tests and TBPL failing to load much of the time, I decided to close the trees.
Is this a duplicate of bug 804658?
These problems seem to have stopped or at least reduced enough that I am re-opening the trees.
This seems to be fixed at the moment.
From an infrastructure perspective, there isn't any obvious relation between these two sites. ftp.mozilla.org is hosted on the FTP cluster in SCL3; its frontend is SCL3 Zeus and its data is stored on SCL3 NetApp. tbpl.mozilla.org is hosted on the genericrhel6 cluster in PHX1; its frontend is PHX1 Zeus. There was a temporary issue with genericrhel6 throwing unexpected 500 errors earlier today, which was resolved around 30 minutes ago. There was also a NetApp issue in SCL3. That has been resolved as well, and an upstream ticket has been opened with the vendor to diagnose the underlying issue. My impression is that both issues are now resolved, and no recurrence is expected.