The try repo has once again grown more heads than it can handle, so we need to reset it. I'm looking to do this in the downtime on the morning of August 20th.
Sorry to make this critical, but try-server seems to be unusable until this is fixed.

 TERM=linux
 USER=cltbld
 _=/tools/buildbot/bin/buildbot
 closing stdin
 using PTY: False
requesting all changes
abort: HTTP Error 414: Request-URI Too Large
elapsedTime=0.617869
program finished with exit code 255
=== Output ended ===
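For anyone wondering why the head count would produce a 414 at all, here is a back-of-the-envelope sketch. It assumes the client puts every head's node ID (40 hex characters each) into the request query string, as the older Mercurial HTTP wire protocol did, and that the server rejects request lines over some fixed size. The limit, the separator width and the base length below are illustrative assumptions, not values measured on hg.m.o.

```python
# Rough estimate of how many heads fit in a request URI before the server
# answers 414. All constants except the node ID length are assumptions.

NODE_HEX_LEN = 40          # a Mercurial changeset ID is 40 hex characters
SEPARATOR_LEN = 1          # assumed: one separator character between IDs
ASSUMED_URI_LIMIT = 8190   # hypothetical server limit on the request line
ASSUMED_BASE_LEN = 60      # hypothetical length of path plus fixed parameters


def uri_length(num_heads: int) -> int:
    """Approximate URI length when num_heads node IDs go in the query string."""
    if num_heads <= 0:
        return ASSUMED_BASE_LEN
    ids = num_heads * NODE_HEX_LEN
    separators = (num_heads - 1) * SEPARATOR_LEN
    return ASSUMED_BASE_LEN + ids + separators


def max_heads(limit: int = ASSUMED_URI_LIMIT) -> int:
    """Largest head count whose estimated URI still fits under the limit."""
    n = 0
    while uri_length(n + 1) <= limit:
        n += 1
    return n


if __name__ == "__main__":
    print("estimated head budget:", max_heads())
```

Under these assumed numbers the budget is a couple of hundred heads, which a busy try repo could plausibly exceed. The real threshold depends on the actual server and proxy configuration.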
Severity: normal → critical
I'm downgrading this back to normal severity because the evidence is that the 414 error is another symptom of hg.m.o not working properly (bug 511258). There are only four of these errors, across all the try builds in the last day or so, counted against many more failed clones in the style of bug 511258, and lots of successful clones, so it's not a systematic problem. That's not to say that cleaning out all the heads isn't necessary, just that it's not the root cause.
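In case it helps with re-counting later, the tally above could be reproduced with something like the following sketch. The 414 signature and the exit-code line come from the log quoted in this bug; the premature-EOF pattern is an assumed wording for the bug 511258 style of failure, and the function names are hypothetical.

```python
import re
from collections import Counter
from typing import Iterable

# Failure signatures, checked in order. Only the 414 string is taken from the
# log in this bug; the premature-EOF wording is an assumption.
SIGNATURES = [
    ("uri_too_large", re.compile(r"abort: HTTP Error 414")),
    ("premature_eof", re.compile(r"premature EOF", re.IGNORECASE)),
]

# Final status line as it appears in the quoted buildbot step output.
EXIT_RE = re.compile(r"program finished with exit code (\d+)")


def classify(log_text: str) -> str:
    """Return a short label describing how one clone step ended."""
    for label, pattern in SIGNATURES:
        if pattern.search(log_text):
            return label
    match = EXIT_RE.search(log_text)
    if match and int(match.group(1)) == 0:
        return "ok"
    return "other"  # non-zero exit or no exit line, with no known signature


def tally(logs: Iterable[str]) -> Counter:
    """Count how many logs fall into each category."""
    return Counter(classify(text) for text in logs)
```

Comparing the `uri_too_large` count with the `premature_eof` and `ok` counts gives the same kind of picture as the manual count: whether the 414s are the dominant failure or a minor one.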
Severity: critical → normal
One theory about the underlying hg problem is that an intensive server process can crowd out other server processes for memory, so I think it might be worth stripping the extra heads here. I would imagine that the extra heads add significantly to the working set of the server process, so while we wait for more RAM to be installed in the hg hosts (someone is on the way to the colo, I believe!), could we strip these heads as well?
5 out of 7 try server columns failed for me last time I pushed, so this isn't just sporadic.
We went ahead and tried this, but it hasn't been successful. After triggering two try server runs (16 builds in total), there were 8 successful clones and 8 transaction aborts on premature EOFs. Given that's 50/50, I think we should start hammering on specific proxy+endpoint combinations until we find the culprit. Let's do that in bug 511258. On the Releng side I had to restart the master on sm-try-master, since a reconfig didn't convince the HgPoller to forget about the most recent revision from the old repo (à la bug 500246).
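A minimal sketch of how the per-combination hammering could be summarised, assuming each clone attempt is recorded with the proxy and endpoint it went through. The record type, field names and host names are hypothetical; nothing here reflects the actual hg.m.o topology.

```python
from collections import defaultdict
from typing import Iterable, List, NamedTuple, Tuple


class CloneResult(NamedTuple):
    proxy: str      # hypothetical proxy identifier
    endpoint: str   # hypothetical backend identifier
    ok: bool        # True if the clone completed


Combo = Tuple[str, str]
Stats = Tuple[int, int, float]  # (failures, attempts, failure rate)


def failure_rates(results: Iterable[CloneResult]) -> List[Tuple[Combo, Stats]]:
    """Group attempts by (proxy, endpoint) and sort by failure rate, worst first."""
    counts = defaultdict(lambda: [0, 0])  # combo -> [failures, attempts]
    for r in results:
        entry = counts[(r.proxy, r.endpoint)]
        entry[1] += 1
        if not r.ok:
            entry[0] += 1
    stats = {combo: (f, t, f / t) for combo, (f, t) in counts.items()}
    return sorted(stats.items(), key=lambda item: item[1][2], reverse=True)


if __name__ == "__main__":
    # Made-up sample data, purely to show the output shape.
    sample = [
        CloneResult("proxy-a", "backend-1", True),
        CloneResult("proxy-a", "backend-1", True),
        CloneResult("proxy-a", "backend-2", False),
        CloneResult("proxy-b", "backend-2", False),
        CloneResult("proxy-b", "backend-2", True),
    ]
    for (proxy, endpoint), (failed, total, rate) in failure_rates(sample):
        print(f"{proxy} -> {endpoint}: {failed}/{total} failed ({rate:.0%})")
```

If one combination sits near 100% while the others sit near 0%, that points at a specific host. An even spread like the 8-of-16 split above would suggest something shared by all of them.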
Assignee: bhearsum → thardcastle
Status: ASSIGNED → RESOLVED
Closed: 13 years ago
Component: Release Engineering → Server Operations
OS: Mac OS X → All
QA Contact: release → mrz
Resolution: --- → FIXED
Product: mozilla.org → mozilla.org Graveyard