Closed
Bug 778062
Opened 12 years ago
Closed 12 years ago
Try server appears to have been reset entirely
Categories
(Infrastructure & Operations :: Infrastructure: Other, task)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: mattwoodrow, Assigned: cshields)
Details
https://hg.mozilla.org/try/ is showing no changesets at all, and hg outgoing (locally) shows the entire m-c history.
This probably needs to be recloned from m-c.
Comment 1•12 years ago
cshields was doing some more experimenting a few hours ago, probably using a single node with upgraded hg and a tweaked setup again. Last time that was done, there were some rules on the Zeus load balancer to make sure all try requests went to that node. Perhaps not all of that was undone when he finished?
Bug 777521 was to reset try at the weekend.
Updated•12 years ago
Assignee: server-ops-infra → server-ops-devservices
Component: Server Operations: Infrastructure → Server Operations: Developer Services
QA Contact: jdow → shyam
Comment 2•12 years ago
Try is unusable due to this (and has been since at least the time of comment 0); bumping severity.
Severity: normal → blocker
OS: Mac OS X → All
Hardware: x86 → All
Updated•12 years ago
Assignee: server-ops-devservices → ashish
Updated•12 years ago
Assignee: ashish → cshields
Comment 3•12 years ago
Pushlog on /try is empty. Corey is resetting try to bring it back to a consistent state.
Comment 4•12 years ago
I've already restarted the buildbot scheduler, it should pick up new changesets as soon as Corey is finished and someone does a push. If not, ping catlee/bhearsum/rail.
Severity: blocker → normal
Component: Server Operations: Developer Services → Server Operations: Infrastructure
OS: All → Mac OS X
Hardware: All → x86
Comment 5•12 years ago
(In reply to Nick Thomas [:nthomas] from comment #4)
> I've already restarted the buildbot scheduler, it should pick up new
> changesets as soon as Corey is finished and someone does a push. If not, ping
> catlee/bhearsum/rail.
This unfortunately meant that builds were scheduled on 20 or so pushes cloned from the tip of m-c. Manually cancelled using buildapi; but is there a way we can avoid this?
Comment 6•12 years ago
Try is reset and back to a usable state (we still have the issue of the growing heads, which will cause problems shortly down the road).
What happened: while prepping for bug 777521 last night, it appears some of the reset steps were accidentally triggered by an admin but were thought to have been cancelled in time. This was one of the admins who has been working around the clock on the try issues, so I attribute this mistake to admin fatigue. These scripts are only used for /try (not a threat to any other repo), and we will add a confirmation stop to prevent this in the future.
While this bug came in about 4 hours ago, the repo has been broken for more like 6 hours.
I apologize for the inconvenience. Try should be fixed for now, but we still have bigger architectural issues to address in fixing bug 770811.
For the admins: for some reason the pushlog that was copied over did not match the source (even though nothing changed mid-flight) and had to be copied by hand. While this is no problem, I forgot to fix the g+w permission on the new pushlog, and with the new ordering of the hooks, the first commit was able to go through without inserting into the pushlog, which screwed things up. I reset the repo again after this.
mattwoodrow confirmed on IRC that it is working.
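The "confirmation stop" mentioned above could look something like the following. This is only a minimal sketch, not the actual try reset scripts: the function names `confirm_reset` and `run_reset` and the idea of passing the reset steps in as callables are assumptions made for illustration.

```python
import sys


def confirm_reset(repo_name, reader=input):
    """Return True only if the operator types the repo name back exactly."""
    prompt = f"About to RESET '{repo_name}'. Type the repo name to continue: "
    answer = reader(prompt)
    return answer.strip() == repo_name


def run_reset(repo_name, reset_steps, reader=input):
    """Run each reset step only after explicit confirmation.

    reset_steps is a list of zero-argument callables (hypothetical stand-ins
    for the real reset steps). Returns True if the steps were run.
    """
    if not confirm_reset(repo_name, reader):
        print("Confirmation failed; nothing was changed.", file=sys.stderr)
        return False
    for step in reset_steps:
        step()
    return True
```

Requiring the repo name to be typed back, rather than a plain y/n, makes it harder to trigger the reset by reflex when tired.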
Updated•12 years ago
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Comment 7•12 years ago
(In reply to Ed Morley [:edmorley] from comment #5)
> (In reply to Nick Thomas [:nthomas] from comment #4)
> > I've already restarted the buildbot scheduler, it should pick up new
> > changesets as soon as Corey is finished and someone does a push. If not, ping
> > catlee/bhearsum/rail.
>
> This unfortunately meant that builds were scheduled on 20 or so pushes cloned
> from the tip of m-c. Manually cancelled using buildapi; but is there a way
> we can avoid this?
I'm surprised about this, it would indicate the pushlog was empty/smaller prior to Nick's scheduler restart, and then had entries added to it later. Maybe Corey did a pull or something that caused the pushlog hook to fire? Hard to be sure.
Comment 8•12 years ago
(In reply to Ben Hearsum [:bhearsum] from comment #7)
> I'm surprised about this, it would indicate the pushlog was empty/smaller
> prior to Nick's scheduler restart, and then had entries added to it later.
> Maybe Corey did a pull or something that caused the pushlog hook to fire?
> Hard to be sure.
An empty pushlog is what started this problem.
In addition, for some reason our try reset scripts resulted in a corrupt pushlog (which would have shown up empty after the repo itself looked "good") until it was fixed by hand.
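A post-reset sanity check along these lines might catch a bad pushlog before the first push lands. This is a rough sketch under stated assumptions: the pushlog is assumed to be a single file on disk, its path is passed in by the caller (no real path is implied), and "zero bytes" is used as a crude proxy for "empty". It does not verify that the pushlog contents match the source.

```python
import os
import stat


def is_group_writable(path):
    """Return True if the file at path has the group-write (g+w) bit set."""
    mode = os.stat(path).st_mode
    return bool(mode & stat.S_IWGRP)


def check_pushlog(path):
    """Return a list of problems found with the pushlog file; empty means OK."""
    if not os.path.exists(path):
        return [f"{path} does not exist"]
    problems = []
    # Crude emptiness check (assumption): a zero-byte file cannot hold entries.
    if os.path.getsize(path) == 0:
        problems.append(f"{path} is empty")
    # The g+w permission is the one that was missed after the manual copy.
    if not is_group_writable(path):
        problems.append(f"{path} is not group-writable (missing g+w)")
    return problems
```

A reset script could refuse to reopen the repo for pushes while `check_pushlog` returns any problems.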
Updated•11 years ago
Component: Server Operations: Infrastructure → Infrastructure: Other
Product: mozilla.org → Infrastructure & Operations