Closed Bug 538739 Opened 15 years ago Closed 15 years ago

Revive all RelEng machines/VMs that died when colo failed

Categories

(Release Engineering :: General, defect, P1)

x86
All
defect

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: joduinn, Assigned: joduinn)

Details

At 23:13, IT called to warn that MPT colo started overheating approx 22:45. Colo is now "repaired", but we're still figuring out what systems powered down as a safety measure. This bug is to track what needs to be revived.
production-1.9-master
bm-win2000-01
fx-win32-1.9-slave04
staging-opsi
try-win32-slave08
moz2-win32-slave11
moz2-linux64-slave10
jhford-temp-linux64
moz2-win32-slave59
moz2-linux-slave11
try-win32-slave06
moz2-linux-slave45
moz2-linux-slave10
moz2-linux-slave41
moz2-win32-slave14
moz2-linux-slave46
mozillabuild-builder
moz2-linux-slave42
bm-l10n-db
moz2-win32-slave04
cerberus-vm
try-win32-slave26
try-win32-slave28
nb-l10n-dashboard01
nb-l10n-db
Axel, I've powered these VMs back up, but can't find anything about how they should be configured to get back into production. Can you have a look at these and revive as necessary?
bm-l10n-db
nb-l10n-dashboard01
nb-l10n-db
These are the remaining slaves that still need work.
fx-win32-1.9-slave04
moz2-win32-slave04
cerberus-vm
I'm not sure if I'm following the wrong docs, or if the doc needs work, but, regardless, I could not figure these three machines out tonight; I will look again in the morning.
nb-l10n might be the clones for AMS, not sure why they're even in the same colo. The db comes up all right, and I have repaired bm-l10n-dashboard01, which apparently didn't die as a VM, but as a buildbot.
(In reply to comment #3)
> These are the remaining slaves that still need work.
>
> fx-win32-1.9-slave04
> moz2-win32-slave04
These are both now fine, with green builds.
> cerberus-vm
Should this VM be revived, and if so, how? The support doc linked in inventory does not seem to apply to this VM: missing drives, mismatched directory names, etc. Looking around the VM, I found Thunderbird 1.8 l10n files, but the most recent were months old (from June 2009). Nothing obvious looks like it was enabled before the colo failure and now needs to be revived. For now, I'm leaving this VM off until we figure out what's needed here. If we need it running, a pointer to the right support doc would be a great help.
The nb-* stuff should only be in AMS and not anywhere else. I'll check.
(In reply to comment #5)
> (In reply to comment #3)
> > cerberus-vm
> Should this VM be revived, and if so, how?
cerberus-vm used to do nightly l10n for Fx2 and Tb2. Fx2 is RIP of course, and in bug 491077 the Tb builds got moved elsewhere. So I think this should be idle now. There may be other Fx2/Tb2 machines that didn't get cleaned up yet, e.g. karma.
All done here. Reopen if I missed anything. (Filed bug 539142 to mothball some obsolete VMs discovered during cleanup.)
Status: NEW → RESOLVED
Closed: 15 years ago
Resolution: --- → FIXED
Product: mozilla.org → Release Engineering