Closed Bug 437049 · Opened 17 years ago · Closed 17 years ago

bm-vmware05 is "not responding" according to the VI Client

Categories

(mozilla.org Graveyard :: Server Operations, task)

Platform: x86
OS: macOS
Type: task
Priority: Not set
Severity: critical

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: bhearsum, Assigned: mrz)

Details

This host has a few tier1 machines on it, filing as blocker.
Summary: bm-vmware05 is "not responding" according tot he VI Client → bm-vmware05 is "not responding" according to the VI Client
Just realized that even though the host is not responding, the tier1 machines are still reachable. Downgrading.
Severity: blocker → critical
Assignee: server-ops → mrz
Looks removed from VC too. This happened once before on another ESX host: the VC process on the ESX host was wedged and needed a restart. For posterity, this is done by:

service mgmt-vmware stop
service vmware-vpxa stop
service mgmt-vmware start
service vmware-vpxa start

I will attempt to do that now!
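For reference, the same thing as a one-shot service-console sequence with a sanity check at the end; a minimal sketch, assuming a classic ESX 3.x host (the restart action and the process-list check are my assumptions, not part of the original procedure):

  # restart the host management agent and the VirtualCenter agent (vpxa)
  service mgmt-vmware restart
  service vmware-vpxa restart

  # sanity check: both daemons should be back in the process list
  ps -ef | grep -E 'hostd|vpxa' | grep -v grep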
Already tried that; removed it from VI and can't re-add it.
bm-vmware04 refuses to see or re-discover netapp-b's LUNs. This mostly affects fx-linux64-slave1: it's running, but the host can't find its datastore (so very likely something is not working on the VM). I'm trying to move the existing VMs off bm-vmware05 so I can reboot it. It's slower than usual, probably because of netapp-b's missing LUNs. I can either keep going at this (slowly) or take the quicker route and reboot the whole ESX host, taking 5 VMs down with it. Currently going down the slow path...
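For the record, forcing the host to rediscover the LUNs from the service console would look roughly like this; a minimal sketch, assuming ESX 3.x tooling, with vmhba1 as a hypothetical adapter name:

  # rescan the storage adapter for new or changed LUNs (adapter name is a guess)
  esxcfg-rescan vmhba1

  # refresh VMFS volume state and confirm the datastore re-appeared
  vmkfstools -V
  ls /vmfs/volumes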
Currently shutting down 4 VMs (try1-win32-slave, prometheus-vm, tb-linux-tbox, try1-linux-slave) on bm-vmware04. Managed to move "egg" to bm-vmware02.
(In reply to comment #5)
> Currently shutting down 4 VMs (try1-win32-slave, prometheus-vm, tb-linux-tbox,
> try1-linux-slave) on bm-vmware04. Managed to move "egg" to bm-vmware02.

...errr...typo. That should be bm-vmware05, not bm-vmware04.
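Shutting the VMs down from the service console could also be scripted rather than clicked through the VI Client; a minimal sketch, assuming ESX 3.x's vmware-cmd and a hypothetical datastore path:

  # list the .vmx config paths of all VMs registered on this host
  vmware-cmd -l

  # ask the guest for a clean shutdown, falling back to a hard power-off
  # (the datastore path below is hypothetical)
  vmware-cmd /vmfs/volumes/netapp-b/try1-win32-slave/try1-win32-slave.vmx stop trysoft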
We started the VMs at 3pm PDT after mrz got the host rebooted and back into the VI. prometheus-vm had some disk corruption, and after fsck finished I had to restore some parts of perl from staging-prometheus-vm. Asking rpm to verify all the installed packages reported only config-file changes after that. I see that there were 4 attempts to move prometheus-vm to different storage and it's off now. So what's the story with this VM? Can I turn it back on and provide updates to our nightly testers? And what's the overall status of the host?
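The package verification mentioned above would be something like the following; a minimal sketch (rpm -Va is standard, but the grep filter is my assumption about how to separate config-file noise from real corruption):

  # verify every installed package against the rpm database
  # (checks size, md5sum, permissions, owner, mtime, ...)
  rpm -Va

  # in rpm -Va output, a 'c' before the path marks a config file; dropping
  # those lines leaves only changes that would indicate real corruption
  rpm -Va 2>/dev/null | grep -v ' c /'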
This was resolved yesterday and is related to the larger NetApp performance issues.
Status: NEW → RESOLVED
Closed: 17 years ago
Resolution: --- → FIXED
Product: mozilla.org → mozilla.org Graveyard