Closed Bug 1017479 Opened 10 years ago Closed 10 years ago

HP Health on node6.bagheera.metrics.scl3.mozilla.com is UNKNOWN: UNKNOWN - hanging hpasmdcli processes

Categories

(Infrastructure & Operations :: DCOps, task)

Other
Other
task
Not set
critical

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: nagiosapi, Assigned: Usul)

Details

(Whiteboard: [id=nagios1.private.scl3.mozilla.com:363046])

Automated alert report from nagios1.private.scl3.mozilla.com:

Hostname: node6.bagheera.metrics.scl3.mozilla.com
Service:  HP Health
State:    UNKNOWN
Output:   UNKNOWN - hanging hpasmdcli processes

Runbook:  http://m.allizom.org/HP+Health
Using Proliant Standard IPMI based System Health Monitor
Starting Proliant Standard IPMI based System Health Monitor (hpasmlited):
hpasmlited: Not able to initialize HP iLO Management Controller: Device or resource busy

This means we probably need to reboot the host. tmary, would that be OK?
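
Before rebooting, one way to confirm that the health CLI processes really are stuck is to list them with their elapsed run time. The following is a minimal sketch, not the actual Nagios check: it assumes a Linux host whose `ps` supports the `etimes` column, and the helper names (`parse_ps`, `find_hanging`, `list_processes`), the process name passed in, and the 300-second threshold are all hypothetical choices for illustration.

```python
import subprocess


def parse_ps(output):
    """Parse `ps -eo pid=,etimes=,comm=` output into (pid, seconds, command) tuples."""
    rows = []
    for line in output.splitlines():
        parts = line.split(None, 2)
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        pid, etimes, comm = parts
        rows.append((int(pid), int(etimes), comm.strip()))
    return rows


def find_hanging(rows, name, max_age):
    """Return (pid, seconds) for processes called `name` that are older than `max_age` seconds."""
    return [(pid, age) for pid, age, comm in rows if comm == name and age > max_age]


def list_processes():
    """Run ps on the local host (assumes a procps-style ps with the etimes column)."""
    result = subprocess.run(
        ["ps", "-eo", "pid=,etimes=,comm="],
        capture_output=True, text=True, check=True,
    )
    return parse_ps(result.stdout)


if __name__ == "__main__":
    # Assumption: the binary is named "hpasmcli"; the alert text spells it
    # "hpasmdcli", so check the real name on the host before relying on this.
    for pid, age in find_hanging(list_processes(), "hpasmcli", 300):
        print(f"pid {pid} has been running for {age}s")
```

If this prints long-lived entries, that would be consistent with the "Device or resource busy" error above, since something may still be holding the iLO device.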
Looks like the host (intentional ?) rebooted..

No longer able to SSH into this host (stuck just after "Last login:....")

--
Severity: normal → critical
Assignee: nobody → ludovic
(In reply to T [:tmary] Meyarivan from comment #3)
> Looks like the host (intentional ?) rebooted..
> 
> No longer able to SSH into this host (stuck just after "Last login:....")

I got to the console but missed the opportunity to log in (I had an old password file). Then I lost the screen on the virtual console. I tried to reset the box, but that failed.

Waiting 10 minutes before trying again.
After the reboot I was able to SSH into it again.
Status: NEW → RESOLVED
Closed: 10 years ago
Component: Server Operations: MOC → Server Operations: DCOps
QA Contact: bpannabecker → dmoore
Resolution: --- → FIXED
Product: mozilla.org → Infrastructure & Operations