Closed
Bug 1124130
Opened 9 years ago
Closed 8 years ago
High load on git1.dmz.scl3.mozilla.com
Categories
(Developer Services :: Git, defect)
Tracking
(Not tracked)
RESOLVED
WONTFIX
People
(Reporter: rwatson, Assigned: hwine)
References
Details
Attachments
(2 files)
Seeing lots of high load alerts on git this morning:
nagios-scl3 Wed 02:49:06 PST [5650] git1.dmz.scl3.mozilla.com:Load is CRITICAL: CRITICAL - load average: 132.56, 162.98, 177.21
Comment 1•9 years ago (Assignee)
/me got paged by failures to push to git from the vcs-sync system as load spiked to 300. Looks to have started around 0930 UTC.
Comment 2•9 years ago (Assignee)
During load, seeing quite a few requests from an older OS X git client via "tail -f access_log" -- looks like that just started "recently":
[root@git1.dmz.scl3 httpd]# egrep -c "\(Apple Git-33\)\"$" access_log*
access_log:1641
access_log-20141228:0
access_log-20150104:0
access_log-20150111:0
access_log-20150118:0
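To see whether one client really dominates the traffic, the per-file count can be complemented by a breakdown of all user agents. A minimal sketch, assuming the logs are in Apache combined format with the user agent as the last quoted field (which the `\"$` anchor in the egrep pattern suggests); `top_user_agents` is a hypothetical helper name, not something installed on the box:

```shell
# Print the most frequent user agents as "<count><TAB><agent>", highest first.
# $1: maximum number of agents to print; remaining arguments: log files.
top_user_agents() {
    limit="$1"
    shift
    # Splitting on double quotes leaves the user agent in the next-to-last
    # field, because each line ends with the closing quote of the agent.
    awk -F'"' '
        NF >= 3 { count[$(NF-1)]++ }
        END { for (agent in count) printf "%d\t%s\n", count[agent], agent }
    ' "$@" | sort -rn | head -n "$limit"
}
```

Called as `top_user_agents 5 access_log*`, this would list the five busiest clients across the current and rotated logs.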
Comment 3•9 years ago (Assignee)
/me notes the box is configured with only 2GB swap; may want to try increasing it for peak loads like this.
Also, "khugepaged" makes an appearance in top -- issues with it reported on the web seem to match what we're seeing: https://bugzilla.redhat.com/show_bug.cgi?id=879801
Trying https://bugzilla.redhat.com/show_bug.cgi?id=879801#c17
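A quick way to confirm the swap figure before resizing is to read `SwapTotal` from `/proc/meminfo`. A small sketch; `swap_total_kb` is a hypothetical helper, and the optional file argument is there only so it can be pointed at a saved copy of the file:

```shell
# Print the configured swap size in kB, as reported by the kernel.
# $1 (optional): path to a meminfo-style file; defaults to /proc/meminfo.
swap_total_kb() {
    awk '$1 == "SwapTotal:" { print $2 }' "${1:-/proc/meminfo}"
}
```

Running `swap_total_kb` with no argument reads the live value; `swapon -s` gives the same information broken down per swap device.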
Comment 4•9 years ago (Assignee)
Applied:
[root@git1.dmz.scl3 httpd]# cat /sys/kernel/mm/redhat_transparent_hugepage/defrag
[always] madvise never
[root@git1.dmz.scl3 httpd]# echo never > !$
echo never > /sys/kernel/mm/redhat_transparent_hugepage/defrag
[root@git1.dmz.scl3 httpd]# !cat
cat /sys/kernel/mm/redhat_transparent_hugepage/defrag
always madvise [never]
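The kernel marks the active value with square brackets, so the current setting can be checked from a script rather than by eye. A minimal sketch; `active_thp_setting` is a hypothetical helper name. Note that a value written to sysfs with `echo` applies to the running kernel only and is lost on reboot, so it would need to be reapplied at boot if the change is kept.

```shell
# Print the active (bracketed) value from a transparent hugepage sysfs file.
# $1: path to the file, e.g. the RHEL 6 path used in this bug.
active_thp_setting() {
    # Keep only the text between "[" and "]" on the matching line.
    sed -n 's/.*\[\([^]]*\)].*/\1/p' "$1"
}

# Example call:
# active_thp_setting /sys/kernel/mm/redhat_transparent_hugepage/defrag
```

A monitoring check could compare the printed value against `never` and warn if the setting has reverted.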
Comment 5•9 years ago (Assignee)
Okay, I'm happy with that result :)
Comment 6•9 years ago (Assignee)
Hmm, less sure comment 4 has anything to do with it. The end of the last event had a similar drop-off, and it doesn't appear we took any action. See bug 1087640, attachment 8510038.
ni :bkero & :gps to render an opinion on leaving the change in comment 4 applied, which was based on https://bugzilla.redhat.com/show_bug.cgi?id=879801#c17
Assignee: nobody → hwine
Status: NEW → ASSIGNED
Flags: needinfo?(gps)
Flags: needinfo?(bkero)
OS: Mac OS X → All
Hardware: x86 → All
See Also: → 1087640
Comment 7•9 years ago
I don't have an opinion on the kernel change because I'm not familiar with the subject matter.

I reckon this is Git doing repacks somewhere.

Do we have a cron job doing periodic repacks? This would help prevent random repacks on client-initiated server-side operations and would put us in more control of server behavior.
Flags: needinfo?(gps)
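The scheduled repack suggested here could look roughly like the following. This is a sketch under assumptions, not what was deployed: the `repack_all` name, the `*.git` directory layout, the script path, and the schedule in the comment are all hypothetical. `git repack -a -d` consolidates everything into a single pack and removes the packs that become redundant.

```shell
# Repack every bare repository found directly under a root directory.
# $1: root directory containing <name>.git directories (assumed layout).
repack_all() {
    root="$1"
    for repo in "$root"/*.git; do
        # Skip the unexpanded pattern and anything that is not a directory.
        [ -d "$repo" ] || continue
        git --git-dir="$repo" repack -a -d -q ||
            echo "repack failed: $repo" >&2
    done
}

# Hypothetical crontab entry, weekly at 03:00 on Sunday:
# 0 3 * * 0 /usr/local/bin/repack-all.sh /path/to/repos
```

To keep pushes from triggering `git gc --auto` on the server in between scheduled runs, `receive.autogc` could additionally be set to `false` in each repository.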
Comment 8•9 years ago (Assignee)
(In reply to Gregory Szorc [:gps] from comment #7)
> I reckon this is Git doing repacks somewhere.
>
> Do we have a CRON job doing periodic repacks? This would help prevent random
> repacks on client-initiated server-side operations and would put us in more
> control of server behavior.

No - opened bug 1124754 for this work
See Also: → 1124754
Comment 9•9 years ago
I too don't know enough about the effects of hugepage defragging on system performance on loaded systems to advise on whether to keep it on. Likely if it is still in this state now it doesn't make much difference in performance.
Flags: needinfo?(bkero)
Comment 10•9 years ago
Socket timeout errors:
8:42 AM <@nagios-scl3> Tue 08:42:48 PDT [5194] git1.dmz.scl3.mozilla.com:http - gitweb Port 80 is CRITICAL: CRITICAL - Socket timeout after 60 seconds (http://m.mozilla.org/http+-+gitweb+Port+80)
&
Host: git-zlb.vips.scl3.mozilla.com
Service: HTTP - Port 80
Service State: CRITICAL
Comment 12•8 years ago (Assignee)
no longer meaningful in light of bug 1277297
Status: ASSIGNED → RESOLVED
Closed: 8 years ago
Resolution: --- → WONTFIX