here is my vague bug summary while we diagnose what is going on
could be noteworthy. could be red herring:

14:01:25 <relengbot> [sns alert] Tue 14:01:06 PST buildbot-master03.bb.releng.use1.mozilla.com maybe_reconfig.sh: ERROR - Reconfig lockfile is older than 120 minutes.
14:01:45 <Callek> well *thats* interesting
14:11:00 <jlund> bm03 last tried starting a reconfig at master/twistd.log.2:2016-01-25 21:01:14-0800 [-] loading configuration from /builds/buildbot/tests1-linux32/master/master.cfg
14:11:17 <jlund> seems it got confused.
14:12:01 <jlund> ah, seta
14:12:07 <jlund> generic exception: Traceback (most recent call last):
14:12:07 <jlund> 2016-01-25 21:01:16-0800 [-] File "/builds/buildbot/tests1-linux32/master/config_seta.py", line 47, in get_seta_platforms
14:12:13 <jlund> not sure if that's still expected
14:13:18 <jlund> I'm going to remove the lock and manually reconfig now
14:13:28 <jlund> once checkconfig passes
14:15:10 <jlund> reconfig in progress

bm03 should be in better shape now.
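For anyone unfamiliar with the alert above, here is a rough sketch of the kind of check that could produce it: compare the lockfile's mtime against a 120 minute threshold. This is not the actual maybe_reconfig.sh logic (that is a shell script and isn't shown in this bug); the function names and the lockfile path in the usage note are hypothetical.

```python
import os
import time

# Threshold quoted in the alert text; everything else here is an assumption.
MAX_LOCK_AGE_MINUTES = 120


def lock_age_minutes(lock_path, now=None):
    """Return the lockfile's age in minutes, or None if it does not exist."""
    try:
        mtime = os.path.getmtime(lock_path)
    except FileNotFoundError:
        return None
    now = time.time() if now is None else now
    return (now - mtime) / 60.0


def is_stale(lock_path, max_age_minutes=MAX_LOCK_AGE_MINUTES, now=None):
    """True only when the lockfile exists and is older than the threshold."""
    age = lock_age_minutes(lock_path, now=now)
    return age is not None and age > max_age_minutes
```

Something like `is_stale("/path/to/reconfig.lock")` returning True would correspond to the ERROR above: a reconfig took the lock and never released it, which matches the exception in config_seta.py during the 21:01 reconfig attempt.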
bm03 finished reconfig cleanly:

master/twistd.log:2016-01-26 14:18:51-0800 [-] configuration update started
master/twistd.log:2016-01-26 14:20:24-0800 [-] configuration update complete
and it's taking jobs again \o/ (after a graceful restart):
http://buildbot-master03.bb.releng.use1.mozilla.com:8201/one_line_per_build

I think all jobs that were 'claimed' by bm03 while this master was out of sync with the rest of the masters either never ran or finished without buildapi/status ever learning the result, e.g.
https://secure.pub.build.mozilla.org/buildapi/self-serve/mozilla-inbound/build/96636434

At this point, the evidence suggests this was a single-master issue. To actually get results for those 'ghost' jobs, we need to either do some db hacking or maybe a re-trigger will do. Either way, we seem to be scheduling new jobs, so I think we are okay moving forward (opening trees if closed).
We added retries to work around this in bug 1247286.
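The actual change lives in bug 1247286 and isn't reproduced here. As a generic illustration only, a retry wrapper around a flaky config-time call might look roughly like this; `call_with_retries` and its parameters are hypothetical names, not the code that landed.

```python
import time


def call_with_retries(func, attempts=3, delay_seconds=1.0, sleep=time.sleep):
    """Call func() up to `attempts` times, re-raising the last error if all fail.

    `sleep` is injectable so the wait can be skipped in tests.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except Exception as exc:  # broad on purpose: this is only a sketch
            last_error = exc
            if attempt < attempts:
                sleep(delay_seconds)
    raise last_error
```

The point of a wrapper like this is that one transient failure while loading the config no longer aborts the reconfig and leaves the lock behind.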
Status: NEW → RESOLVED
Last Resolved: 2 years ago
Resolution: --- → WORKSFORME