Slaverebooter hung on Aug 25



5 years ago
11 months ago


(Reporter: Callek, Assigned: Callek)




(1 attachment)



5 years ago
So, this morning we got a slaverebooter alert:

[07:54:41]	nagios-releng	Mon 04:55:06 PDT [4905] Age - /builds/slaverebooter/slaverebooter.log is CRITICAL: FILE_AGE CRITICAL: /builds/slaverebooter/slaverebooter.log is 22034 seconds old and 27621448 bytes (
[08:54:41]	nagios-releng	Mon 05:55:06 PDT [4908] Age - /builds/slaverebooter/slaverebooter.log is CRITICAL: FILE_AGE CRITICAL: /builds/slaverebooter/slaverebooter.log is 25634 seconds old and 27621448 bytes (
[09:24:42]	nagios-releng	Mon 06:25:07 PDT [4909] Age - /builds/slaverebooter/slaverebooter.log is OK: FILE_AGE OK: /builds/slaverebooter/slaverebooter.log is 436 seconds old and 27621583 bytes (

The alert cleared because simone killed the slaverebooter process on bm74.
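For anyone unfamiliar with the alert above: it is a file-age check on slaverebooter.log, so it fires when the log stops being written to. A rough sketch of that kind of check is below. This is not the actual nagios plugin; the function name `file_age_status` and both thresholds are hypothetical, since the real thresholds aren't shown in the alert output.

```python
import os
import time


def file_age_status(path, warn_seconds, crit_seconds, now=None):
    """Classify a file by how long ago it was last modified.

    warn_seconds and crit_seconds are hypothetical thresholds; the
    values used by the real check are not known from this bug.
    Returns (state, age_in_seconds, size_in_bytes).
    """
    now = time.time() if now is None else now
    st = os.stat(path)
    age = int(now - st.st_mtime)
    if age >= crit_seconds:
        state = "CRITICAL"
    elif age >= warn_seconds:
        state = "WARNING"
    else:
        state = "OK"
    return state, age, st.st_size
```

A hung process that never writes to its log looks exactly like the first two alerts: the age keeps growing while the byte count stays the same.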

When I grep slaverebooter.log for Aug 25, the first match is the very last line of the attached log.

The preceding log lines are everything from the last run on Aug 24.

Note the *one* case of "2014-08-24 22:46:52,042 - ERROR - b-linux64-hp-0004 - Caught exception while processing", which is a slaverebooter error. Also telling is that there's no traceback there.
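If this happens again, something like the following could pull out ERROR lines that have no traceback right after them. It's only a sketch: the line layout is assumed from the single example quoted above, the helper name `errors_without_traceback` is made up, and it assumes a traceback, when present, starts on the very next line with Python's usual "Traceback (most recent call last):" header.

```python
import re

# Assumed layout, inferred from the one line quoted in this bug:
# "YYYY-MM-DD HH:MM:SS,mmm - LEVEL - slave - message"
LINE_RE = re.compile(
    r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3} - (?P<level>[A-Z]+) - .*$"
)

TRACEBACK_HEADER = "Traceback (most recent call last):"


def errors_without_traceback(lines):
    """Return ERROR log lines that are not immediately followed by a traceback."""
    lines = [line.rstrip("\n") for line in lines]
    found = []
    for i, line in enumerate(lines):
        match = LINE_RE.match(line)
        if not match or match.group("level") != "ERROR":
            continue
        following = lines[i + 1] if i + 1 < len(lines) else ""
        if not following.startswith(TRACEBACK_HEADER):
            found.append(line)
    return found
```

Usage would be along the lines of `errors_without_traceback(open("/builds/slaverebooter/slaverebooter.log"))`.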

I'm going to leave this open and assigned to me for now, and will only bump the priority if it happens again.

Comment 1

5 years ago
There have been at least 1 or 2 other instances of this since this bug was filed; however, there hasn't been one in at least the last 2 weeks (I was buildduty last week, so I was *hoping* to catch it then).

I'm going to resolve/wfm for now. pete/coop: if it happens again, I'd suggest not trying to fix it and just letting me know, so I can triage "live". We can probably still use this bug as a bouncing point for that.
Last Resolved: 5 years ago
Resolution: --- → WORKSFORME


11 months ago
Product: Release Engineering → Infrastructure & Operations