sumocelery1.webapp.phx1.mozilla.com crashed

RESOLVED FIXED

Status

Product: Infrastructure & Operations
Component: WebOps: Other
Priority: P3
Severity: normal
Status: RESOLVED FIXED
Opened: 6 years ago
Updated: 5 years ago

People

(Reporter: ericz, Assigned: phrawzty)

Tracking

Details

(Whiteboard: [triaged 20120828])

(Reporter)

Description

6 years ago
sumocelery1.webapp.phx1.mozilla.com crashed and the iLO was not working.  The logs show nothing, but the box has an old kernel, BIOS and RAID controller firmware.  All three should be upgraded.
(Reporter)

Comment 1

6 years ago
I was told this is a webops application now.  Please provide feedback on when I can upgrade firmware, kernel and do a yum update on this box, with a reboot.  Also, let me know if there are any packages I shouldn't upgrade.
Assignee: eziegenhorn → server-ops-webops
Component: Server Operations → Server Operations: Web Operations
QA Contact: phong → cshields

Updated

6 years ago
Whiteboard: [pending triage]

Comment 2

6 years ago
@r1cky: when can we take this node down to work on it?
Whiteboard: [pending triage] → [triaged 20120824][waiting][webdev]
@jakem: How long will it take and how soon can it be? I would just need to notify the community about email being delayed and things like that.
An hour tops...
OK, just let me know when you want to do it and let's do it!
Aj will take care of this Monday at 10am PDT
Assignee: server-ops-webops → afernandez
Component: Server Operations: Web Operations → Server Operations
QA Contact: cshields → jdow
Whiteboard: [triaged 20120824][waiting][webdev] → [triaged 20120824]
I am starting to work on this, should be done by 11 am PDT.
Hardware: HP - BL460c G6
iLO 2: 2.01 -> 2.09
BIOS: 08/16/2010 -> 05/05/2011
RAID Controller (Storage Controller): 3.52 -> 5.70
kernel: 2.6.32-71.el6.x86_64 -> 2.6.32-279.2.1
Since no excluded packages were mentioned, 301 packages were updated.
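
For reference, had any packages needed to be held back, the update could have been run with exclusions. A minimal sketch, assuming yum's `--exclude=<pattern>` option is used once per package or glob; the helper name `build_yum_update` and the pattern in the usage note are hypothetical and were not part of the work done here:

```python
def build_yum_update(excludes=()):
    """Return the argv for a non-interactive 'yum update' with exclusions.

    Hypothetical helper: it only builds the command line, it does not run it.
    """
    cmd = ["yum", "-y", "update"]
    # One --exclude=<pattern> option per package name or glob to hold back.
    cmd.extend("--exclude=" + pattern for pattern in excludes)
    return cmd
```

For example, `build_yum_update(["somepackage*"])` would yield a command that updates everything except packages matching that (made-up) pattern.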

Ricky, please verify that all is well. Thank you.
Status: NEW → RESOLVED
Last Resolved: 6 years ago
Resolution: --- → FIXED
I just saw this error come in:

AMQPConnectionException: (320, u"CONNECTION_FORCED - broker forced connection closure with reason 'shutdown'", (0, 0), '')


Related?
Checked with :r1cky on IRC; he said he only received that error at 11:02 PDT (2:02pm EDT).
He has not received any more and is still verifying that all is well.
Just got another one at 11:25 PDT (2:25pm EDT).

AMQPConnectionException: (320, u"CONNECTION_FORCED - broker forced connection closure with reason 'shutdown'", (0, 0), '')

Updated

6 years ago
Assignee: afernandez → server-ops-webops
Status: RESOLVED → REOPENED
Component: Server Operations → Server Operations: Web Operations
QA Contact: jdow → cshields
Resolution: FIXED → ---
To clarify, so far celery seems to be working well. We have just had about 3 of those AMQPConnectionException errors.
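
Reply code 320 is AMQP's "connection-forced" code, which fits a broker shutdown during the maintenance window, so a client can reasonably treat it as transient. A minimal sketch of reconnect-and-retry, assuming nothing about how celery or its AMQP library actually handles this: the `AMQPConnectionException` class below is a hypothetical stand-in for the library's exception, and `run_with_reconnect`, `task` and `connect` are illustrative names.

```python
import time

CONNECTION_FORCED = 320  # AMQP reply code "connection-forced"


class AMQPConnectionException(Exception):
    """Hypothetical stand-in for the client library's exception."""

    def __init__(self, code, text):
        super().__init__(code, text)
        self.code = code
        self.text = text


def run_with_reconnect(task, connect, retries=3, delay=0.0, sleep=time.sleep):
    """Call task(connect()), reconnecting when the broker forces a close."""
    for attempt in range(retries + 1):
        try:
            return task(connect())
        except AMQPConnectionException as exc:
            # Only a forced close is treated as transient. Any other code is
            # re-raised, as is a forced close once the retries are used up.
            if exc.code != CONNECTION_FORCED or attempt == retries:
                raise
            sleep(delay)
```

Each retry opens a fresh connection through `connect()` rather than reusing the one the broker closed.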

Updated

6 years ago
Priority: -- → P3
Whiteboard: [triaged 20120824] → [triaged 20120828]
(Assignee)

Comment 13

6 years ago
Hello everybody,

This bug has been languishing in the webops queue for a couple of months now, and I'd like to either :
a) establish a concrete set of actionable items required to close this bug, or
b) close the bug and ever forward, forward to the future !

Feedback welcome.  Thanks !
Flags: needinfo?
I haven't seen any more errors after the day the work was done. If there isn't anything left to do, this is fixed from my end.
Flags: needinfo?
(Assignee)

Comment 15

6 years ago
Excellent - thanks for the rapid feedback.
Assignee: server-ops-webops → dmaher
Status: REOPENED → RESOLVED
Last Resolved: 6 years ago
Resolution: --- → FIXED
Component: Server Operations: Web Operations → WebOps: Other
Product: mozilla.org → Infrastructure & Operations