Closed
Bug 1280498
Opened 8 years ago
Closed 8 years ago
generic3.webapp.phx1 Using 512 out of 512 Clients
Categories
(Infrastructure & Operations :: IT-Managed Tools, task)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: mdevney, Unassigned)
Details
(Whiteboard: [kanban:https://webops.kanbanize.com/ctrl_board/2/3137])
< nagios-phx1> Thu 16:13:16 UTC [1017] generic3.webapp.phx1.mozilla.com:httpd max clients is WARNING: Using 512 out of 512 Clients (http://m.mozilla.org/httpd+max+clients)

From http://generic3.webapp.phx1.mozilla.com/server-status yup, confirming that it's using all 512 clients, and all 8 threads of each. The vast majority of these listed are for pastebin.mozilla.org, in state Logging. Example:

    Srv PID  Acc          M CPU   SS    Req Conn   Child  Slot     Client        VHost                Request
    0-8 5972 24/558/79354 L 12.16 22194 0   1425.2 147.96 20421.02 172.6.192.161 pastebin.mozilla.org GET /?dl=8877534 HTTP/1.1

Despite all these threads trying to log iostat doesn't show much disk access happening:

    avg-cpu:  %user  %nice  %system  %iowait  %steal  %idle
              27.35   0.00     5.41     0.40    0.00  66.83

    Device:    tps  Blk_read/s  Blk_wrtn/s  Blk_read  Blk_wrtn
    sdb      15.60      558.40       36.80      2792       184
    sda       0.00        0.00        0.00         0         0
    dm-0     18.00      558.40       36.80      2792       184
    dm-1      0.00        0.00        0.00         0         0
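The per-state count read off the status page above can also be taken from mod_status's machine-readable view (/server-status?auto), which includes a Scoreboard: line with one character per slot (L is Logging, W is Sending Reply, _ is Waiting for Connection, . is an open slot). A minimal Python sketch of that tally follows; the function name tally_scoreboard and the sample text are hypothetical and not taken from generic3.

```python
from collections import Counter

# Subset of the mod_status scoreboard key; unknown characters are kept as-is.
STATE_NAMES = {
    "_": "Waiting for Connection",
    "W": "Sending Reply",
    "L": "Logging",
    "K": "Keepalive (read)",
    "R": "Reading Request",
    ".": "Open slot with no current process",
}


def tally_scoreboard(status_text):
    """Count worker states from the text of /server-status?auto."""
    for line in status_text.splitlines():
        if line.startswith("Scoreboard:"):
            board = line.split(":", 1)[1].strip()
            return Counter(STATE_NAMES.get(ch, ch) for ch in board)
    return Counter()  # no Scoreboard line found


# Hypothetical sample, for illustration only (not output from this host).
sample = "BusyWorkers: 6\nIdleWorkers: 1\nScoreboard: LLLLWL_..\n"
counts = tally_scoreboard(sample)
print(counts.most_common())
```

A scoreboard dominated by L, as described in this bug, would show up as Logging being the largest bucket.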
Comment 1 • 8 years ago (Reporter)
Error reported on zlb: Node 10.8.81.93:80: Monitor failed. A Monitor that was assigned to this node failed. First failed 17 seconds ago. (This error was reported by zlb1.internal.private.phx1.mozilla.com)
Comment 2 • 8 years ago (Reporter)
System RAM is full, so we can't just increase MaxClients; that would make it fall over. No hung or missing NFS mounts. Not seeing historical data on zlb as the runbook suggests, but given where today falls in the work week it's very likely there is unusually high load right now.
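The reasoning about MaxClients comes down to simple arithmetic: the worker count times per-worker memory has to fit in whatever RAM is left after everything else. A rough Python sketch under that assumption follows; the function name max_clients_that_fit and all figures are hypothetical and are not measurements from generic3. Treating every worker as costing a fixed amount is a simplification, since shared pages make per-process RES overstate the real cost.

```python
def max_clients_that_fit(total_ram_mb, reserved_mb, per_worker_mb):
    """Largest worker count whose combined memory fits in the RAM left over."""
    if per_worker_mb <= 0:
        raise ValueError("per_worker_mb must be positive")
    usable_mb = max(total_ram_mb - reserved_mb, 0)
    return int(usable_mb // per_worker_mb)


# Hypothetical figures for illustration only.
limit = max_clients_that_fit(total_ram_mb=4096, reserved_mb=512, per_worker_mb=20)
print(limit)
```

If the configured MaxClients is already at or above a limit computed this way, raising it only moves the failure from "out of clients" to "out of memory".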
Comment 3 • 8 years ago (Reporter)
bunch of this stuff in httpd's error.log

    /usr/bin/diff3: standard output: Broken pipe
    /usr/bin/diff3: write failed
    /usr/bin/diff3: standard output: Broken pipe
    /usr/bin/diff3: write failed
Comment 4 • 8 years ago (Reporter)
highest ram users

      PID USER    PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
    16155 apache  20   0  880m  90m 3716 S  0.0  0.8  12:03.33 httpd
    16165 apache  20   0  869m  87m 3660 S  0.0  0.7  44:16.49 httpd
    16153 apache  20   0  880m  85m 3684 S  0.0  0.7  12:18.08 httpd
    16154 apache  20   0  874m  82m 3684 S  0.0  0.7  12:17.30 httpd
    16164 apache  20   0  805m  81m 3656 S  0.0  0.7  46:43.14 httpd
    16160 apache  20   0  805m  80m 3664 S  0.0  0.7  45:25.08 httpd

    [root@generic3.webapp.phx1 httpd]# ps -ef | grep httpd | wc -l
    585
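To put a number on the rows above, the RES column can be summed and averaged. The Python sketch below does that for the six httpd rows quoted in this comment; the helper names parse_size_mb and resident_mb are hypothetical, and bare numbers are assumed to be KiB (top's usual default unit). These are only the six largest processes, so the average is an upper-end figure rather than representative of all the httpd processes counted by ps.

```python
def parse_size_mb(field):
    """Convert a top size field such as '90m' or '3716' to MiB."""
    units = {"m": 1.0, "g": 1024.0}
    suffix = field[-1].lower()
    if suffix in units:
        return float(field[:-1]) * units[suffix]
    return float(field) / 1024.0  # assumption: a bare number is KiB


def resident_mb(top_rows):
    """RES in MiB per row (columns: PID USER PR NI VIRT RES SHR ...)."""
    return [parse_size_mb(row.split()[5]) for row in top_rows]


# The six rows quoted in the comment above.
rows = [
    "16155 apache 20 0 880m 90m 3716 S 0.0 0.8 12:03.33 httpd",
    "16165 apache 20 0 869m 87m 3660 S 0.0 0.7 44:16.49 httpd",
    "16153 apache 20 0 880m 85m 3684 S 0.0 0.7 12:18.08 httpd",
    "16154 apache 20 0 874m 82m 3684 S 0.0 0.7 12:17.30 httpd",
    "16164 apache 20 0 805m 81m 3656 S 0.0 0.7 46:43.14 httpd",
    "16160 apache 20 0 805m 80m 3664 S 0.0 0.7 45:25.08 httpd",
]
res = resident_mb(rows)
print(f"total {sum(res):.0f} MiB, mean {sum(res) / len(res):.1f} MiB")
```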
Comment 5 • 8 years ago
jedi restarted httpd and alerts cleared.
Comment 6 • 8 years ago
Memory use had been climbing steadily since this morning until it ran out, so as long as it doesn't start rising again we should be OK. Thus far it looks stable and in line with the other nodes, so hopefully this was a one-off event. https://graphite-scl3.mozilla.org/dashboard/#generic-prod-webheads
Comment 7 • 8 years ago
All clear since then. Calling this one.
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED