Closed
Bug 546490
Opened 14 years ago
Closed 14 years ago
some new linux ix machines hung at grub
Categories
(Release Engineering :: General, defect)
Tracking
(Not tracked)
RESOLVED
DUPLICATE
of bug 546424
People
(Reporter: bhearsum, Unassigned)
References
Details
Some of the Linux ix machines from bug 545134 are hung at a 'GRUB' screen. Power cycling doesn't help. Can someone look into this, please? mv-moz2-linux-ix-slave{01,08} are both in this state, and possibly others.
Comment 1•14 years ago
Dropping sev, no one's at the office right now to look. What changed? They were working when I handed them off and survived lots of reboots in the process.
Severity: critical → major
Comment 2•14 years ago
One possibility is that we changed the SATA mode from IDE to AHCI. The latter has about 50x better drive performance. But they survived plenty of reboots after that too.
Comment 3•14 years ago
Ben, What steps did you undertake before the reboot? Where did you get the kernel from?
Comment 4•14 years ago
I reverted slave01 back to IDE but same results. At this point it'll have to sit until the morning when someone can boot off a rescue image. Sounds like GRUB is missing its stage2 image or /boot is gone/broken. Worst case these will all need to be re-imaged and we don't have the master image onsite that I'm aware of.
Comment 5•14 years ago
(In reply to comment #3)
> Ben, What steps did you undertake before the reboot? Where did you get the
> kernel from?
Also, bhearsum, are there any recent changes to puppet manifests that might have horked these machines all automatically?
Comment 6•14 years ago
(In reply to comment #3)
> Ben, What steps did you undertake before the reboot? Where did you get the
> kernel from?
I don't know where the kernel came from; Catlee did that work.
> Also, bhearsum, are there any recent changes to puppet manifests that might
> have horked these machines all automatically?
If there's a Puppet change that broke things I'd be very surprised to find any of them working, but I'll certainly have a look.
Comment 7•14 years ago
mrz notes

catlee
slave01, changed SATA type
rebooted okay, builds faster
applied BIOS change across all Linux slaves
Noticed not seeing all 4GB RAM, install PAE kernel

slave01 - installed PAE
yum install kernel-pae-i686
rebooted
via console manually selected PAE kernel
no NIC driver
slave01 was booting by default into PAE kernel with network

slave8 & 25
SATA mode changed
PAE kernel installed, not default
Comment 9•14 years ago
I believe catlee has this under control, punting over.
Assignee: mrz → nobody
Component: Server Operations → Release Engineering
QA Contact: mrz → release
Comment 10•14 years ago
Running 'grub-install /dev/sda' seems to have fixed both slave01 and slave08. Still not sure what the initial cause was. There's a rescue .iso image on the desktop of admin.b.m.o. To repair a slave:
1. Boot the slave off the rescue image.
2. Select 'grubdisk' from the first menu, then "AUTOMAGIC BOOT".
3. Select the 2.6.18 kernel to boot from.
4. Run 'grub-install /dev/sda' as root.
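For anyone repeating the last step on other slaves, here is a minimal sketch of the reinstall wrapped in a shell function. It assumes the slave has already been booted into its installed 2.6.18 kernel via the rescue image, and that /dev/sda is the boot disk as on slave01 and slave08. The function name `reinstall_grub` and the `DRY_RUN` variable are hypothetical helpers for illustration, not part of any existing tooling; by default it only prints what it would run.

```shell
#!/bin/sh
# Sketch only: reinstall the GRUB boot loader onto a disk.
# Assumptions (not from the bug): the helper name reinstall_grub and the
# DRY_RUN switch are made up for illustration. DRY_RUN defaults to 1, so
# nothing is written to disk unless DRY_RUN=0 is set explicitly.
reinstall_grub() {
    # Target disk; defaults to /dev/sda, the disk used in this bug.
    disk="${1:-/dev/sda}"

    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "would run: grub-install $disk"
        return 0
    fi

    # grub-install writes to the disk's boot sector, so it needs root.
    if [ "$(id -u)" -ne 0 ]; then
        echo "reinstall_grub: must be run as root" >&2
        return 1
    fi

    grub-install "$disk"
}
```

Calling `reinstall_grub` with no arguments prints the command for /dev/sda; running `DRY_RUN=0 reinstall_grub /dev/sda` as root performs the actual reinstall.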
Status: NEW → RESOLVED
Closed: 14 years ago
Resolution: --- → DUPLICATE
Updated•11 years ago
Product: mozilla.org → Release Engineering