Closed Bug 924389 Opened 11 years ago Closed 7 years ago

reportor has tons of rsync errors in cronmail

Categories

Product/Component: Release Engineering :: General
Type: defect
Platform: x86_64 Linux
Priority: Not set
Severity: normal

Tracking

(Not tracked)

RESOLVED WORKSFORME

People

(Reporter: bhearsum, Unassigned)


Date: Tue,  8 Oct 2013 07:00:27 -0700 (PDT)
From: Cron Daemon <root@relengwebadm.private.scl3.mozilla.com>
To: release@mozilla.com
Subject: Cron <root@relengwebadm> rsync -r --links --delete syncbld@cruncher.build.mozilla.org:/var/www/html/builds/ /mnt/netapp/relengweb/builddata/reports/

directory has vanished: "/var/www/html/builds/reportor/2013-10-05:13"
IO error encountered -- skipping file deletion
rsync warning: some files vanished before they could be transferred (code 24) at main.c(1518) [generator=3.0.9]


Looks like this particular one started on September 28th.
I would guess that a cron job to delete older reportor output (or an equivalent in the code doing the generation) was added then. catlee?
Flags: needinfo?(catlee)
There's a cronjob on cruncher that also cleans up old reportor output. I suspect this gets run at the same time the rsync is running, and so stuff gets deleted after rsync has figured out which files to transfer.
Flags: needinfo?(catlee)
(In reply to Chris AtLee [:catlee] from comment #3)
> there's a cronjob on cruncher that also cleans up old reportor output. I
> suspect this gets run at the same time the rsync is running, and so stuff
> gets deleted after rsync has figured out which files to transfer

Dustin, can we adjust the timing on one of these jobs to reduce the possibility of hitting this? I don't think these machines are in PuppetAgain, but I'm happy to write the patch if it's in a repo I have access to.
Flags: needinfo?(dustin)
Cruncher's managed by infra puppet.  I don't remember the name "reportor", and it seems to be in catlee's and buildduty's crontabs, so I don't think I have any involvement here.
Flags: needinfo?(dustin)
(In reply to Dustin J. Mitchell [:dustin] (I read my bugmail; don't needinfo me) from comment #5)
> Cruncher's managed by infra puppet.  I don't remember the name "reportor",
> and it seems to be in catlee's and buildduty's crontabs, so I don't think I
> have any involvement here.

This is relengweb's cron, actually. It's a bit confusing because the cronjob in question is running an rsync against cruncher.

> From: Cron Daemon <root@relengwebadm.private.scl3.mozilla.com>
Oh, I thought you were suggesting moving the cleanup cronjob.  How would you like the timing set up?  It's currently */5.
(In reply to Dustin J. Mitchell [:dustin] (I read my bugmail; don't needinfo me) from comment #7)
> Oh, I thought you were suggesting moving the cleanup cronjob.  How would you
> like the timing set up?  It's currently */5.

If the rsync and delete jobs are both currently running every 5 minutes, can we offset them so that one runs at minutes 2, 7, 12, etc. and the other at minutes 5, 10, 15, etc.?
That could be tricky with cron syntax -- but why is a daily report getting cleaned up every 5m?
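For reference, a sketch of how that kind of offset could be written, assuming a Vixie-style cron (cronie and similar) that accepts a range with a step in the minute field. The job paths are placeholders, not the real entries on either host.

```crontab
# Hypothetical crontab entries; the commands are placeholders.
# Runs at minutes 0, 5, 10, ..., 55
*/5 * * * *     /path/to/sync-job
# Runs at minutes 2, 7, 12, ..., 57 (range 2-59 with a step of 5)
2-59/5 * * * *  /path/to/cleanup-job
```

An offset only narrows the window: if the sync takes longer than the gap between the two jobs, they can still overlap.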
(In reply to Dustin J. Mitchell [:dustin] (I read my bugmail; don't needinfo me) from comment #9)
> That could be tricky with cron syntax -- but why is a daily report getting
> cleaned up every 5m?

Oh, looks like I was confused. The cleanup _does_ happen on cruncher...and different cleanups happen either @daily or @hourly by the looks of it. Seems I can just adjust those cronjobs to address this.
I switched the @daily cronjobs to "2 0 * * *" and the @hourly ones to "2 * * * *", so they now start at two minutes past the hour instead of on the hour. Hopefully that helps reduce the frequency of this.
Frequency is definitely reduced, not quite gone yet though...
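One possible way to quiet the remaining mail, sketched here as an assumption and not something anyone in this bug has agreed to: wrap the rsync so that exit code 24 counts as success and the "vanished" lines are filtered from its output. The rsync source tree ships a similar helper, support/rsync-no-vanished. The function name `quiet_vanished` is made up for this sketch.

```shell
#!/bin/sh
# Run a command, drop the "vanished" noise from its output, and treat
# exit code 24 (rsync: some files vanished before transfer) as success.
# Any other output or non-zero exit still reaches cron untouched.
quiet_vanished() {
    out=$(mktemp) || return 1
    "$@" >"$out" 2>&1
    status=$?
    # Filter the three message types seen in the cron mail in this bug.
    # Caveat: hiding the "skipping file deletion" line also hides the fact
    # that --delete did nothing on that particular run.
    grep -v -E '^(file|directory) has vanished: |^IO error encountered -- skipping file deletion|^rsync warning: some files vanished' "$out"
    rm -f "$out"
    if [ "$status" -eq 24 ]; then
        status=0
    fi
    return "$status"
}

# Example cron usage (command taken from the mail subject above):
# quiet_vanished rsync -r --links --delete \
#   syncbld@cruncher.build.mozilla.org:/var/www/html/builds/ \
#   /mnt/netapp/relengweb/builddata/reports/
```

Since cron mails whenever a job prints anything, filtering the output matters as much as the exit code here.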
We could also roll the rsync and cleanup into a single cron job, if you'd like.  I'll be happy to install a shell script on relengwebadm.
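A rough sketch of what such a combined script might look like, under stated assumptions: the cleanup command is not shown anywhere in this bug, so it is passed in as a placeholder string (on relengwebadm it would presumably have to reach cruncher over ssh), and the function name `run_serialized` and the lock path are invented for illustration. It runs the cleanup first, then the sync, and skips the run entirely if a previous one is still going.

```shell
#!/bin/sh
# Hypothetical combined job: cleanup first, then sync, never concurrently.
# $1 = lock directory, $2 = cleanup command, $3 = sync command.
run_serialized() {
    lockdir=$1
    cleanup_cmd=$2
    sync_cmd=$3
    # mkdir is atomic: if the directory already exists, another run is
    # still in progress, so skip this one quietly.
    if ! mkdir "$lockdir" 2>/dev/null; then
        return 0
    fi
    status=0
    sh -c "$cleanup_cmd" || status=$?
    # Only sync if the cleanup succeeded.
    if [ "$status" -eq 0 ]; then
        sh -c "$sync_cmd" || status=$?
    fi
    rmdir "$lockdir"
    return "$status"
}

# Example (placeholder cleanup command, rsync taken from the mail subject):
# run_serialized /var/tmp/reportor-sync.lock \
#   "ssh syncbld@cruncher.build.mozilla.org /path/to/cleanup-script" \
#   "rsync -r --links --delete syncbld@cruncher.build.mozilla.org:/var/www/html/builds/ /mnt/netapp/relengweb/builddata/reports/"
```

If the script is killed partway through, the lock directory is left behind and would need to be removed by hand; a trap could handle that, but it is left out to keep the sketch short.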
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → WORKSFORME
Component: General Automation → General