Bug 1012281
Opened 11 years ago
Closed 11 years ago
puppet foreman plugin OOM'ing httpd in scl3, corp.phx1
Categories
(Infrastructure & Operations :: Infrastructure: Puppet, task)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: nagiosapi, Assigned: Atoll)
References
()
Details
(Whiteboard: [id=nagios1.private.scl3.mozilla.com:358911])
Automated alert report from nagios1.private.scl3.mozilla.com:
Hostname: puppetmaster2.private.scl3.mozilla.com
Service: Puppetmaster backend httpd
State: CRITICAL
Output: CRITICAL - Socket timeout after 10 seconds
Runbook: http://m.allizom.org/Puppetmaster+backend+httpd
Updated•11 years ago
Assignee: nobody → infra
Component: Server Operations: MOC → Infrastructure: Puppet
Product: mozilla.org → Infrastructure & Operations
QA Contact: jdow
Comment 1•11 years ago
The alert cleared on its own.
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
:jakem added a firewall rule that permits the puppet servers to contact the third Zeus. It's unclear why this weekend's upgrade work exposed the missing firewall rule, but we also discovered that the foreman reporting plugin handles network timeouts *very badly*, gradually OOM'ing httpd.
Reopening until we reenable foreman tomorrow.
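For illustration only, here is a minimal sketch of the failure mode, assuming the trouble is report uploads that block for a long time when the destination is unreachable, so requests pile up inside httpd. This is not the foreman plugin's code; `post_report`, the URL, and the timeout value are hypothetical. It shows one way to bound each upload so a blocked network path fails fast instead of holding a worker:

```python
import urllib.request


def post_report(url: str, body: bytes, timeout_s: float = 5.0) -> bool:
    """Send one report; return False rather than block if the peer is slow or unreachable.

    Hypothetical helper for illustration, not part of puppet or foreman.
    """
    request = urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    try:
        # timeout_s bounds the connect and each blocking socket operation,
        # so a dropped or firewalled connection cannot stall the caller forever.
        with urllib.request.urlopen(request, timeout=timeout_s) as response:
            return 200 <= response.status < 300
    except OSError:
        # URLError, HTTPError, timeouts, and refused or reset connections
        # are all OSError subclasses; treat them as a failed upload.
        return False
```

A caller would log and drop (or queue) the report when this returns False, instead of retrying in-process without limit.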
Assignee: infra → rsoderberg
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Summary: Puppetmaster backend httpd on puppetmaster2.private.scl3.mozilla.com is CRITICAL: CRITICAL - Socket timeout after 10 seconds → puppet foreman plugin OOM'ing httpd in scl3, corp.phx1
:jakem resolved the sticky routing issues on the Zeus, so in theory we should be good to go.
Comment 4•11 years ago
Is this still an issue?
Nope.
Status: REOPENED → RESOLVED
Closed: 11 years ago → 11 years ago
Resolution: --- → FIXED