Closed Bug 557268 Opened 10 years ago Closed 8 years ago
release dependent schedulers sometimes don't fire
We've had a few cases recently where the source/build dep scheduler didn't fire, for no discernible reason. The only commonality I've noticed is that it has always happened when there were multiple reconfigs close together (for a build2 in every case, IIRC).
There are two possible work items here:
* Figure out why the dep schedulers broke, and fix that issue
or
* Switch to Triggerable, giving careful thought to whether it's a bad thing that there's *no* easy way to prevent subsequent builds from firing, as there is with Dependent.
How about a triggerable with a config flag of whether to trigger them? That would require a reconfig, but so does a lot of other release automation recovery.
Rather than use a flag, we could use a property which defaults to True. That way, we can override without a reconfig.
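A minimal sketch of the property idea, written as plain Python with no Buildbot imports. The property name `trigger_dependents` and the helper `should_trigger` are hypothetical, chosen only for illustration; how the result would be wired into a Triggerable or trigger step is left open.

```python
# Hypothetical helper: decide whether downstream builders should fire,
# based on a build property that defaults to True when it is not set.
FALSE_STRINGS = ("false", "0", "no", "off", "")


def should_trigger(properties, name="trigger_dependents", default=True):
    """Return True unless the property was explicitly set to a false value.

    `properties` is any dict-like mapping of build property names to values.
    Values typed into a force-build form usually arrive as strings, so
    common string spellings of "false" are handled too.
    """
    value = properties.get(name, default)
    if isinstance(value, str):
        return value.strip().lower() not in FALSE_STRINGS
    return bool(value)


# Normal release run: property not set, so dependents fire.
normal_run = should_trigger({})
# Forced build with the property overridden: dependents are skipped.
forced_run = should_trigger({"trigger_dependents": "false"})
```

Because the default is True, a normal release run needs no extra configuration, and only a forced build that sets the property to a false value skips the downstream builders.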
9 years ago
No longer blocks: 478420
I hit this issue for the 3.6.14 release. I filed http://trac.buildbot.net/ticket/1777 to keep track of it.

#######################

We triggered a release on Friday, and between then and today 2 reconfigurations happened. On Monday the "updates" builder got triggered by an ftpPoller and it was supposed to trigger the "update_verify" builders. The problem is that a reconfigure happened before that, and it made the Dependent scheduler forget who to trigger.

We could switch to trigger steps, but that would prevent us from doing a "force build" without also triggering the dependent jobs.

The release was triggered at 14:32 on Friday. The updates builder was triggered at 13:01 on Monday. 2 reconfigures happened in between.

Could this be the place where the factory is considered to have changed and the memory loss happens?

  2011-01-21 21:56:03-0800 [-] updating builder release-mozilla-1.9.2-updates: factory changed
  nextSlave changed from <function _nextFastReservedSlave at 0x1953e64c> to <function _nextFastReservedSlave at 0x1d747b1c>
  2011-01-21 21:56:03-0800 [-] consumeTheSoulOfYourPredecessor: <Builder release-mozilla-1.9.2-updates at 500572364> feeding upon <Builder release-mozilla-1.9.2-updates at 240793324>

Reconfigs on masters listed chronologically (first two happened before "updates" job got triggered):

  twistd.log.228:2011-01-21 21:56:05-0800 [-] configuration update started
  twistd.log.228:2011-01-21 21:56:37-0800 [-] configuration update complete
  twistd.log.65:2011-01-24 12:48:34-0800 [-] configuration update started
  twistd.log.65:2011-01-24 12:48:55-0800 [-] configuration update complete
  twistd.log.52:2011-01-24 16:29:40-0800 [-] configuration update started
  twistd.log.52:2011-01-24 16:31:24-0800 [-] configuration update complete

Updates scheduler - http://hg.mozilla.org/build/buildbotcustom/file/tip/process/release.py#l313
ftpPoller - http://hg.mozilla.org/build/buildbotcustom/file/tip/process/release.py#l212
Update_verify builder - http://hg.mozilla.org/build/buildbotcustom/file/tip/process/release.py#l327
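The check described above (did a reconfig land between the release sendchange and the "updates" job?) can be sketched as a small log scan. This is only an illustration: the function name `reconfigs_between` is made up, the regular expression is based solely on the grep output quoted in this comment, and the -0800 offset and dates for the Friday and Monday timestamps are assumptions taken from those log lines.

```python
import re
from datetime import datetime, timedelta, timezone

# Matches lines such as:
#   twistd.log.228:2011-01-21 21:56:05-0800 [-] configuration update started
RECONFIG_RE = re.compile(
    r"(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})([+-]\d{4}) \[-\] "
    r"configuration update started"
)


def reconfigs_between(lines, start, end):
    """Return start times of reconfigs that began within [start, end]."""
    hits = []
    for line in lines:
        match = RECONFIG_RE.search(line)
        if match is None:
            continue
        stamp = datetime.strptime(
            match.group(1) + match.group(2), "%Y-%m-%d %H:%M:%S%z"
        )
        if start <= stamp <= end:
            hits.append(stamp)
    return sorted(hits)


# Sample input copied from the grep output in this comment.
log_lines = [
    "twistd.log.228:2011-01-21 21:56:05-0800 [-] configuration update started",
    "twistd.log.228:2011-01-21 21:56:37-0800 [-] configuration update complete",
    "twistd.log.65:2011-01-24 12:48:34-0800 [-] configuration update started",
    "twistd.log.65:2011-01-24 12:48:55-0800 [-] configuration update complete",
    "twistd.log.52:2011-01-24 16:29:40-0800 [-] configuration update started",
    "twistd.log.52:2011-01-24 16:31:24-0800 [-] configuration update complete",
]

pst = timezone(timedelta(hours=-8))  # assumed offset, matching the logs
release_triggered = datetime(2011, 1, 21, 14, 32, tzinfo=pst)  # Friday
updates_triggered = datetime(2011, 1, 24, 13, 1, tzinfo=pst)   # Monday

hits = reconfigs_between(log_lines, release_triggered, updates_triggered)
```

With the sample lines, `hits` holds the two reconfigs that started at 21:56:05 on the 21st and 12:48:34 on the 24th; the 16:29:40 one falls after the updates job.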
8 years ago
No longer blocks: 627271
8 years ago
I suggest we WONTFIX this:
* We are going to use AggregatingScheduler instead of Dependent for some builders.
* The other Dependent schedulers (tag and build) fire within a very short period after the release sendchange, so it will be safe to reconfig ~30-60 minutes after we start a release.
If you don't want to WONTFIX, I can grab the bug.
Yeah, let's WONTFIX.
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → WONTFIX
Product: mozilla.org → Release Engineering