unable to build dev env: rsync: connection unexpectedly closed

RESOLVED WORKSFORME

Status

--
critical
RESOLVED WORKSFORME
a year ago
a year ago

People

(Reporter: glob, Unassigned)

Description

a year ago
Likely fallout from bug 1349407. I should have tested it with a full rebuild.

hgweb> rsync: connection unexpectedly closed (0 bytes received so far) [Receiver]
hgweb> rsync error: error in rsync protocol data stream (code 12) at io.c(605) [Receiver=3.0.9]
hgrb> rsync: connection unexpectedly closed (0 bytes received so far) [Receiver]
hgrb> rsync error: error in rsync protocol data stream (code 12) at io.c(605) [Receiver=3.0.9]
rbweb> rsync: connection unexpectedly closed (0 bytes received so far) [receiver]
rbweb> rsync error: error in rsync protocol data stream (code 12) at io.c(600) [receiver=3.0.6]
ERROR hgweb> rsync: connection unexpectedly closed (0 bytes received so far) [Receiver]
ERROR hgweb> rsync error: error in rsync protocol data stream (code 12) at io.c(605) [Receiver=3.0.9]
ERROR rbweb> rsync: connection unexpectedly closed (0 bytes received so far) [receiver]
ERROR rbweb> rsync error: error in rsync protocol data stream (code 12) at io.c(600) [receiver=3.0.6]
ERROR hgrb> rsync: connection unexpectedly closed (0 bytes received so far) [Receiver]
ERROR hgrb> rsync error: error in rsync protocol data stream (code 12) at io.c(605) [Receiver=3.0.9]
(Reporter)

Comment 2

a year ago
docker-machine version 0.10.0, Docker version 17.03.1-ce, boot2docker 17.03.0-ce on osx 10.12.3
I haven't attempted to reproduce this yet. However, I noticed today that my `rsync` processes between TCP endpoints are now just as slow as the volume/mount-based rsyncs that bug 1349407 was supposed to fix by switching to TCP endpoints. I have no clue WTF is going on. Docker is my bane.

I guess that means we can back out the switch from volume-based rsyncs since it didn't work out in the end.
As part of hacking on the bmoweb image yesterday, I think I found the source of the slowness: fchownat() is super slow on overlay2. If you do an internet search for "fchownat overlay2", many of the results mention Docker (probably because Docker is one of the major users of overlay2). I suspect the underlying issue is that "touching" a file's metadata triggers COW at the filesystem level (a copy-up of the file into the upper layer), or something along those lines.

We could potentially speed up rsync by telling it to avoid syncing certain metadata, such as uid, gid, and mtime values. I think processes that access the v-c-t rsync are running as root, so the uid/gid shouldn't matter.
Replacing `rsync -a` with `rsync -rlp` sped up `d0cker build-hgmo` from 165s to 39s. Good grief.
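
For reference, a rough sketch of what that flag change drops. Per the rsync man page, `-a` is shorthand for `-rlptgoD`, so `-rlp` leaves out `-t` (mtime), `-g` (group), `-o` (owner), and `-D` (devices/specials). The helper name and the `src/` and `dest/` paths below are made up for illustration; this is not the actual code in version-control-tools.

```shell
#!/bin/sh
# -a (archive mode) expands to these single-letter flags.
ARCHIVE_FLAGS="rlptgoD"
# Flags kept after the change: recursion, symlinks, permission bits.
KEPT_FLAGS="rlp"

# Hypothetical helper: print the archive flags that are no longer passed.
dropped_flags() {
    printf '%s' "$ARCHIVE_FLAGS" | tr -d "$KEPT_FLAGS"
}

# Illustrative invocations (paths are placeholders, not from the bug):
#   rsync -a   src/ dest/   # also syncs mtime, group, owner, devices/specials
#   rsync -rlp src/ dest/   # skips the owner/group/mtime metadata updates
echo "dropped: $(dropped_flags)"
```

Skipping `-o` and `-g` is presumably what avoids the slow owner/group updates on overlay2; note that dropping `-t` also means rsync no longer preserves modification times on the destination.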
(Reporter)

Comment 7

a year ago
from bug 1349407:

Pushed by gszorc@mozilla.com:
https://hg.mozilla.org/hgcustom/version-control-tools/rev/63f19cfacc5c
docker: stop using `rsync -a` because owner/group updates are slow on overlayfs

I had no issues with a full dev env rebuild with this change applied.
I'm going to call this WFM then.
Status: NEW → RESOLVED
Last Resolved: a year ago
Resolution: --- → WORKSFORME