Closed Bug 1280498 Opened 8 years ago Closed 8 years ago

generic3.webapp.phx1 Using 512 out of 512 Clients

Categories

(Infrastructure & Operations :: IT-Managed Tools, task)

Type: task
Priority: Not set
Severity: normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: mdevney, Unassigned)

Details

(Whiteboard: [kanban:https://webops.kanbanize.com/ctrl_board/2/3137])

< nagios-phx1> Thu 16:13:16 UTC [1017] generic3.webapp.phx1.mozilla.com:httpd max clients is WARNING: Using 512 out of 512 Clients (http://m.mozilla.org/httpd+max+clients)


From http://generic3.webapp.phx1.mozilla.com/server-status: yup, confirmed that it's using all 512 client slots, and all 8 threads of each child are busy.

The vast majority of the slots listed are for pastebin.mozilla.org, in state 'L' (Logging).
Example:
Srv	PID	Acc	M	CPU 	SS	Req	Conn	Child	Slot	Client	VHost	Request
0-8	5972	24/558/79354	L 	12.16	22194	0	1425.2	147.96	20421.02 	172.6.192.161	pastebin.mozilla.org	GET /?dl=8877534 HTTP/1.1
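A quick way to tally that from mod_status (a sketch; the grep simply counts scoreboard rows mentioning the vhost, and ?auto gives the machine-readable worker counts, assuming a slot frees up to serve the request at all):

curl -s 'http://generic3.webapp.phx1.mozilla.com/server-status' | grep -c 'pastebin.mozilla.org'
curl -s 'http://generic3.webapp.phx1.mozilla.com/server-status?auto' | grep -E 'BusyWorkers|IdleWorkers'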


Despite all these threads trying to log, iostat doesn't show much disk activity:
avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          27.35    0.00    5.41    0.40    0.00   66.83

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sdb              15.60       558.40        36.80       2792        184
sda               0.00         0.00         0.00          0          0
dm-0             18.00       558.40        36.80       2792        184
dm-1              0.00         0.00         0.00          0          0
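For the record, to watch utilization and wait times over a few intervals rather than a single sample, something like this works, assuming sysstat's extended stats are available on this host:

iostat -x 5 3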
Error reported on zlb:

Node 10.8.81.93:80: Monitor failed. A Monitor that was assigned to this node failed. First failed 17 seconds ago.  (This error was reported by zlb1.internal.private.phx1.mozilla.com)
System RAM is full, so we can't just increase MaxClients; that would make it fall over.
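For context, those limits live in httpd.conf's MPM section; an illustrative worker-MPM block consistent with the 512-slot / 8-threads-per-child picture above (the host's actual MPM and values aren't shown in this ticket) would look like:

<IfModule worker.c>
    # illustrative values only, not copied from this host
    ServerLimit          64
    ThreadsPerChild       8
    MaxClients          512
</IfModule>

Raising MaxClients also means raising ServerLimit, i.e. more ~85 MB children, which is exactly what we don't have the RAM for.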
No hung or missing NFS mounts.
Not seeing historical data on the zlb as the runbook suggests, but given where we are in the work week it's very likely there's unusually high load right now.
A bunch of this stuff in httpd's error log:

/usr/bin/diff3: standard output: Broken pipe
/usr/bin/diff3: write failed
/usr/bin/diff3: standard output: Broken pipe
/usr/bin/diff3: write failed
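To gauge how often that is being hit, a quick count (assuming the stock RHEL log path /var/log/httpd/error_log, which isn't confirmed in this ticket):

grep -c diff3 /var/log/httpd/error_log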
Highest RAM users:

 PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND                               
16155 apache    20   0  880m  90m 3716 S  0.0  0.8  12:03.33 httpd                                 
16165 apache    20   0  869m  87m 3660 S  0.0  0.7  44:16.49 httpd                                 
16153 apache    20   0  880m  85m 3684 S  0.0  0.7  12:18.08 httpd                                 
16154 apache    20   0  874m  82m 3684 S  0.0  0.7  12:17.30 httpd                                 
16164 apache    20   0  805m  81m 3656 S  0.0  0.7  46:43.14 httpd                                 
16160 apache    20   0  805m  80m 3664 S  0.0  0.7  45:25.08 httpd                   
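As a rough cross-check of total httpd memory from here (a sketch; summing RSS double-counts pages shared between children, so treat it as an upper bound):

ps -C httpd -o rss= | awk '{sum+=$1} END {printf "%.0f MB\n", sum/1024}'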

[root@generic3.webapp.phx1 httpd]# ps -ef | grep httpd | wc -l
585
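(That count includes the grep itself; for a cleaner process count, plus a thread count given the threaded MPM, something like:)

pgrep -c httpd
ps -eLf | grep '[h]ttpd' | wc -l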
jedi restarted httpd and the alerts cleared.
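The exact command isn't recorded here; on a RHEL-era webhead that would typically be something like:

service httpd restart

(or apachectl graceful to recycle children more gently, though a full restart is the surer way to release the memory).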
Memory use had been climbing steadily since this morning until it ran out, so as long as it doesn't start rising again we should be OK. So far it looks stable and in line with the other nodes, so hopefully this was a one-off event.  https://graphite-scl3.mozilla.org/dashboard/#generic-prod-webheads
All clear since then. Calling this one fixed.
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED