Closed Bug 748939 Opened 11 years ago Closed 11 years ago

talos builds are all red due to 404 errors from hg.mozilla.org

Categories

(mozilla.org Graveyard :: Server Operations, task)

task
Not set
blocker

Tracking

(Not tracked)

RESOLVED DUPLICATE of bug 748940

People

(Reporter: dbaron, Assigned: bkero)

Details

(Whiteboard: [holding trees closed])

Right now:

https://tbpl.mozilla.org/ gives "Loading failed: error" as a result of failures to load things off hg.mozilla.org

https://hg.mozilla.org/integration/mozilla-inbound/pushloghtml fails to load (Firefox gives: "The connection was reset / The connection to the server was reset while the page was loading.")

...

well, now it seems to be responding again.

But, in the process, the downtime caused a large number of talos runs to fail (see all the red on https://tbpl.mozilla.org/?tree=Mozilla-Inbound).


This is the sort of thing that shouldn't go down without a downtime notice.
(And, to be clear, the couldn't-load was at 1:25pm, not during this morning's downtime.)
ok, an additional round of talos runs failed since I filed, so I'm saying this bug is still present and effectively holding the tree hostage.

In particular, a recent failure run is:
https://tbpl.mozilla.org/php/getParsedLog.php?id=11202506&tree=Mozilla-Inbound
which hit:

INFO: talos.json URL: http://hg.mozilla.org/integration/mozilla-inbound/raw-file/9d00516b0ad7/testing/talos/talos.json
ERROR: HTTP Error 404: Not Found

at timestamp 2012-04-25 13:48:27.044076
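
For anyone trying to tell the two symptoms in this bug apart (an HTTP 404 on talos.json versus a reset connection), here is a minimal sketch of a probe. It is not what the talos harness actually runs; the helper names `talos_json_url` and `probe` are made up for illustration, and the URL layout is simply copied from the log line above.

```python
import urllib.error
import urllib.request


def talos_json_url(repo, revision):
    # Hypothetical helper: rebuilds the raw-file URL seen in the failing log.
    return (
        "http://hg.mozilla.org/%s/raw-file/%s/testing/talos/talos.json"
        % (repo, revision)
    )


def probe(url, opener=urllib.request.urlopen, timeout=10):
    """Return the HTTP status code for url, or None if no response arrived.

    A 404 means the server answered but did not find the file; None means
    the connection itself failed (for example, it was reset).
    """
    try:
        with opener(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # HTTPError must be caught before URLError, its parent class.
        return err.code
    except (urllib.error.URLError, ConnectionError):
        return None


if __name__ == "__main__":
    url = talos_json_url("integration/mozilla-inbound", "9d00516b0ad7")
    print(url, probe(url))
```

A 404 from this probe would point at the talos.json path or revision rather than at the web heads being unresponsive, which is the distinction that matters for deciding whether these are one problem or two.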
Severity: normal → blocker
Summary: hg.mozilla.org was flaky / mostly down → hg.mozilla.org is flaky
Assignee: server-ops → mburns
Assignee: mburns → bkero
Whiteboard: [holding trees closed]
This is related to some behavior of either hgweb or mod_wsgi on the hg web nodes. A short-term fix is to restart the web heads, which will behave fine for an indeterminate period afterwards, but they will stop responding to requests after a while. This problem was also reported earlier today and should have been fixed around 13:57.

Are you still experiencing any sort of outage?
Though I'd note that it's possible the flakiness I observed is different from the problem that's happening to the talos slaves.
ok, I'm thinking it's 2 different problems, given that hg.m.o flakiness was only brief, and I can't find a single successful talos run since around 9:30am (this morning's downtime window).

Let's make this bug about the more serious of the problems.
Summary: hg.mozilla.org is flaky → talos builds are all red due to 404 errors from hg.mozilla.org
Also, it seems at least probable that this is fallout from:
http://groups.google.com/group/mozilla.dev.planning/msg/01823de33a10356b
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → DUPLICATE
Duplicate of bug: 748940
Product: mozilla.org → mozilla.org Graveyard