Closed
Bug 1004674
Opened 11 years ago
Closed 11 years ago
Fix rsync://releases-rsync.mozilla.org/
Categories
(Infrastructure & Operations Graveyard :: WebOps: Product Delivery, task)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: gozer, Assigned: gozer)
Attachments
(1 file: patch, 482 bytes)
Looks like the rsync service is broken, as in:
$> rsync rsync://releases-rsync.mozilla.org/
rsync: did not see server greeting
rsync error: error starting client-server protocol (code 5) at main.c(1635) [Receiver=3.1.0]
Looking at the rsync boxes, xinetd is rate limiting connections from the ZLBs (per_source_limit):
[root@rsync1.dmz.scl3 ~]# tail -f /var/log/messages
May 1 13:16:48 rsync1 xinetd[3146]: START: rsync pid=29416 from=::ffff:10.22.74.212
May 1 13:16:48 rsync1 xinetd[3146]: START: rsync pid=29417 from=::ffff:10.22.74.210
May 1 13:16:49 rsync1 xinetd[3146]: EXIT: rsync status=12 pid=29406 duration=3(sec)
May 1 13:16:49 rsync1 xinetd[3146]: EXIT: rsync status=12 pid=29407 duration=2(sec)
May 1 13:16:49 rsync1 xinetd[3146]: FAIL: rsync per_source_limit from=::ffff:10.22.74.208
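To see which sources xinetd is rejecting, the `FAIL: ... per_source_limit` lines can be tallied per address. The following is a minimal sketch, assuming the log format matches the excerpt above; the function name and the regular expression are illustrative and not part of any tool used on these hosts.

```python
import re
from collections import Counter

# Matches xinetd rejection lines of the form seen in /var/log/messages above:
#   ... xinetd[3146]: FAIL: rsync per_source_limit from=::ffff:10.22.74.208
FAIL_RE = re.compile(r"xinetd\[\d+\]: FAIL: (\S+) per_source_limit from=(\S+)")


def count_per_source_failures(lines):
    """Count per_source_limit rejections, keyed by (service, source address)."""
    counts = Counter()
    for line in lines:
        match = FAIL_RE.search(line)
        if match:
            counts[(match.group(1), match.group(2))] += 1
    return counts


# Sample input copied from the log excerpt in this bug.
SAMPLE = [
    "May 1 13:16:48 rsync1 xinetd[3146]: START: rsync pid=29416 from=::ffff:10.22.74.212",
    "May 1 13:16:49 rsync1 xinetd[3146]: EXIT: rsync status=12 pid=29406 duration=3(sec)",
    "May 1 13:16:49 rsync1 xinetd[3146]: FAIL: rsync per_source_limit from=::ffff:10.22.74.208",
]

if __name__ == "__main__":
    for (service, source), n in count_per_source_failures(SAMPLE).items():
        print(f"{service} {source} rejected {n} time(s)")
```

If the rejected sources all turn out to be load balancer addresses, as in the excerpt, the limit is being hit by the ZLBs themselves rather than by individual end clients.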
Updated•11 years ago (Assignee)
Assignee: server-ops-webops → gozer
Status: NEW → ASSIGNED
Comment 1•11 years ago (Assignee)
Also problematic was that Zeus was running a health check every 5 seconds, causing lots of rsync process churn.
Switching to a calmer health check seemed to do the trick and quiesced the rsync boxes.
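For context on why each health check costs a process: under xinetd, every new connection to the rsync port spawns an rsync daemon process, which opens by sending a line starting with `@RSYNCD:` (the "server greeting" the client error above complains about). A probe that only waits for that greeting might look like the sketch below. This is an illustration, not the Zeus monitor that was actually configured; the function names and the timeout are assumptions, and the bug does not state what interval the calmer check uses.

```python
import socket


def looks_like_rsync_greeting(data: bytes) -> bool:
    """An rsync daemon opens the conversation with '@RSYNCD: <version>'."""
    return data.startswith(b"@RSYNCD: ")


def check_rsync_daemon(host: str, port: int = 873, timeout: float = 5.0) -> bool:
    """Connect once, read the first line, and report whether it is a greeting.

    Each call opens one connection, so under xinetd each call also spawns
    one short-lived rsync process on the server.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.settimeout(timeout)
            buf = b""
            # Read until the greeting line is complete or a small cap is hit.
            while b"\n" not in buf and len(buf) < 64:
                chunk = sock.recv(64)
                if not chunk:
                    break
                buf += chunk
            return looks_like_rsync_greeting(buf)
    except OSError:
        # Refused, timed out, or unreachable: treat as unhealthy.
        return False
```

Running a probe like this every 5 seconds from several load balancer nodes adds a steady stream of connections from a handful of source addresses, which is consistent with the churn and the per-source rejections seen above.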
Status: ASSIGNED → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Comment 2•11 years ago
Not sure about the structure of the rsyncs, but are multiple daemons supposed to be running?
This is creating a "high" load on both nodes:
rsync1.dmz.scl3.mozilla.com:Load is CRITICAL: CRITICAL - load average:26.66, 26.55, 25.80
Fri 09:50:56 PDT [5106] rsync2.dmz.scl3.mozilla.com:Load is CRITICAL: CRITICAL - load average: 28.55, 25.31, 20.37
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Comment 3•11 years ago
As per IRC, both alerts have been downtimed for 1 day (24 hours).
Comment 4•11 years ago (Assignee)
Rsyncd processes are managed by xinetd, and since Zeus isn't checking aggressively anymore, I need to look at whether this is normal usage or caused by some problem.
Need to do a bit more digging.
Comment 5•11 years ago (Assignee)
Looking at these, it doesn't look like a bug, just many clients doing a fairly slow rsync of our content.
I am assuming that, since we've been broken for a while, a few folks out there are finally playing catch-up.
For now, I'll keep this bug open and keep watching.
I would hope this is just a symptom of many mirrors out there having to sync up lots of content. I'll have to wait and see.
Comment 6•11 years ago
Sat 11:06:53 PDT [5844] rsync2.dmz.scl3.mozilla.com:Load is CRITICAL: CRITICAL - load average: 28.47, 27.63, 25.24 (http://m.mozilla.org/Load)
Sat 11:08:54 PDT [5845] rsync1.dmz.scl3.mozilla.com:Load is CRITICAL: CRITICAL - load average: 28.14, 28.10, 27.89
Comment 7•11 years ago (Assignee)
The service is working now, and we can expect up to 30 rsync clients per box, so I just raised the Nagios load check's limits.
In most cases, these will be relatively idle processes blocked on the client's ability to pull down data.
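To illustrate the effect of raising the limits, here is a rough sketch of how a load check compares the 1, 5 and 15 minute averages against warning and critical thresholds. It mimics the general shape of such a check and is not the Nagios plugin itself; the threshold values are hypothetical, since the bug does not say what the old or new limits were.

```python
def classify_load(load_avgs, warn, crit):
    """Return 'CRITICAL', 'WARNING' or 'OK' for a (1m, 5m, 15m) load triple.

    Assumption: the check trips as soon as any one of the three averages
    exceeds its corresponding threshold.
    """
    if any(load > limit for load, limit in zip(load_avgs, crit)):
        return "CRITICAL"
    if any(load > limit for load, limit in zip(load_avgs, warn)):
        return "WARNING"
    return "OK"


# Load averages reported for rsync1 in comment 6.
observed = (28.14, 28.10, 27.89)

# Hypothetical thresholds, chosen only to show the before/after behaviour.
old_warn, old_crit = (15.0, 15.0, 15.0), (25.0, 25.0, 25.0)
new_warn, new_crit = (35.0, 35.0, 35.0), (40.0, 40.0, 40.0)

if __name__ == "__main__":
    print(classify_load(observed, old_warn, old_crit))
    print(classify_load(observed, new_warn, new_crit))
```

With around 30 mostly idle rsync processes per box expected in normal operation, thresholds sized for a quieter host would keep alerting on load that is not actually a problem.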
Status: REOPENED → RESOLVED
Closed: 11 years ago → 11 years ago
Resolution: --- → FIXED
Updated•9 years ago
Product: Infrastructure & Operations → Infrastructure & Operations Graveyard