Closed
Bug 557268
Opened 15 years ago
Closed 13 years ago
release dependent schedulers sometimes don't fire
Categories
(Release Engineering :: General, defect, P3)
Tracking
(Not tracked)
RESOLVED
WONTFIX
People
(Reporter: bhearsum, Assigned: bhearsum)
References
Details
(Whiteboard: [automation][buildmasters])
We've had a few cases recently where the source/build dep scheduler didn't fire, for no discernible reason. The only commonality I've noticed is that it has always happened when there were multiple reconfigs close together -- for a build2 in every case IIRC.
Comment 1•15 years ago (Assignee)
There are two possible work items here:
* Figure out why the dep schedulers broke, and fix that issue
or
* Switch to Triggerable, giving careful thought to whether or not it's a bad thing that there's *no* easy way to prevent subsequent builds from firing like there is with Dependent.
Updated•15 years ago
Whiteboard: [automation][buildmasters]
Comment 2•15 years ago
How about a triggerable with a config flag of whether to trigger them?
That would require a reconfig, but so does a lot of other release automation recovery.
Comment 3•15 years ago (Assignee)
Rather than use a flag, we could use a property which defaults to True. That way, we can override without a reconfig.
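A minimal sketch of that idea, written as plain Python rather than against Buildbot's API. The property name `trigger_dependents` and the helper `should_trigger` are hypothetical; the only thing taken from the comment above is that the property defaults to True and can be overridden per build without a reconfig.

```python
def should_trigger(properties, name="trigger_dependents", default=True):
    """Decide whether downstream builds should fire for this build.

    `properties` is a plain dict standing in for the build's properties.
    The property name is an assumption made for illustration.
    """
    value = properties.get(name, default)
    # Properties set through a force-build form often arrive as strings.
    if isinstance(value, str):
        return value.strip().lower() not in ("false", "0", "no")
    return bool(value)


# Default: no property set, so downstream builds fire.
normal_build = should_trigger({})
# Override on a single forced build, with no reconfig needed.
forced_build = should_trigger({"trigger_dependents": "false"})
```

With this shape, the normal release path needs no extra configuration, and only a forced build that wants to suppress downstream jobs has to set the property.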
Comment 4•14 years ago
I hit this issue for the 3.6.14 release.
I filed http://trac.buildbot.net/ticket/1777 to keep track of it.
#######################
We triggered a release on Friday and in between then and today 2 reconfigurations happened.
On Monday the "updates" builder [1] got triggered by an ftpPoller [2] and it was supposed to trigger the "update_verify" builders [3].
The problem is that a reconfig happened before that, and it made the Dependent scheduler forget which builders to trigger.
We could switch to trigger steps, but that would prevent us from doing a "force build" without also triggering the dependent jobs.
The release was triggered at 14:32 on Friday.
The updates builder was triggered at 13:01 on Monday.
2 reconfigures happened in between.
Could this be the place where the factory is considered changed and the memory loss happens?
2011-01-21 21:56:03-0800 [-] updating builder release-mozilla-1.9.2-updates: factory changed
nextSlave changed from <function _nextFastReservedSlave at 0x1953e64c> to <function _nextFastReservedSlave at 0x1d747b1c>
2011-01-21 21:56:03-0800 [-] consumeTheSoulOfYourPredecessor: <Builder release-mozilla-1.9.2-updates at 500572364> feeding upon <Builder release-mozilla-1.9.2-updates at 240793324>
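To make the hypothesis above concrete, here is a toy model of how in-memory subscriptions could be lost when an object is rebuilt on reconfig. This is not Buildbot's code: `ToyUpstream` and `reconfig` are hypothetical, and whether the real Dependent scheduler loses state this way is exactly the open question in this comment.

```python
class ToyUpstream:
    """Toy stand-in for an upstream scheduler; NOT Buildbot's implementation."""

    def __init__(self, name):
        self.name = name
        self.subscribers = []  # held in memory only, never persisted

    def subscribe(self, callback):
        self.subscribers.append(callback)

    def build_finished(self):
        # Notify every subscriber and collect what each would trigger.
        return [callback(self.name) for callback in self.subscribers]


def reconfig(old):
    # Assumption for illustration: a reconfig builds a fresh object from the
    # config and does not carry the old object's subscriptions across.
    return ToyUpstream(old.name)


upstream = ToyUpstream("updates")
upstream.subscribe(lambda name: "update_verify")
fired_before = upstream.build_finished()

upstream = reconfig(upstream)
fired_after = upstream.build_finished()  # nobody is subscribed any more
```

In this toy, the downstream job fires before the reconfig and silently does not fire afterwards, which matches the symptom described here without saying anything definite about the real cause.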
Reconfigs on masters listed chronologically (first two happened before "updates" job got triggered):
twistd.log.228:2011-01-21 21:56:05-0800 [-] configuration update started
twistd.log.228:2011-01-21 21:56:37-0800 [-] configuration update complete
twistd.log.65:2011-01-24 12:48:34-0800 [-] configuration update started
twistd.log.65:2011-01-24 12:48:55-0800 [-] configuration update complete
twistd.log.52:2011-01-24 16:29:40-0800 [-] configuration update started
twistd.log.52:2011-01-24 16:31:24-0800 [-] configuration update complete
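The claim that only the first two reconfigs fall between the release trigger and the "updates" job can be checked mechanically. The sketch below parses the log lines quoted above; it assumes the 14:32 Friday and 13:01 Monday times are 2011-01-21 and 2011-01-24 in the same -0800 zone as the logs, which the comment does not state explicitly.

```python
import re
from datetime import datetime

LOG_LINES = [
    "twistd.log.228:2011-01-21 21:56:05-0800 [-] configuration update started",
    "twistd.log.228:2011-01-21 21:56:37-0800 [-] configuration update complete",
    "twistd.log.65:2011-01-24 12:48:34-0800 [-] configuration update started",
    "twistd.log.65:2011-01-24 12:48:55-0800 [-] configuration update complete",
    "twistd.log.52:2011-01-24 16:29:40-0800 [-] configuration update started",
    "twistd.log.52:2011-01-24 16:31:24-0800 [-] configuration update complete",
]

STAMP_FORMAT = "%Y-%m-%d %H:%M:%S%z"
STARTED = re.compile(
    r"(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}[+-]\d{4}) \[-\] configuration update started"
)


def reconfig_starts(lines):
    """Return the start time of every reconfig found in the given log lines."""
    starts = []
    for line in lines:
        match = STARTED.search(line)
        if match:
            starts.append(datetime.strptime(match.group(1), STAMP_FORMAT))
    return starts


# Assumed timestamps (date and zone inferred, seconds set to zero).
release_triggered = datetime.strptime("2011-01-21 14:32:00-0800", STAMP_FORMAT)
updates_triggered = datetime.strptime("2011-01-24 13:01:00-0800", STAMP_FORMAT)

in_window = [
    start
    for start in reconfig_starts(LOG_LINES)
    if release_triggered < start < updates_triggered
]
```

Under those assumptions, `in_window` holds the reconfigs that started at 21:56:05 on Friday and 12:48:34 on Monday; the 16:29:40 one falls after the "updates" job.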
[1] Updates scheduler - http://hg.mozilla.org/build/buildbotcustom/file/tip/process/release.py#l313
[2] ftpPoller - http://hg.mozilla.org/build/buildbotcustom/file/tip/process/release.py#l212
[3] Update_verify builder - http://hg.mozilla.org/build/buildbotcustom/file/tip/process/release.py#l327
Updated•14 years ago
Blocks: hg-automation
Comment 5•13 years ago
I would suggest we WONTFIX this:
* We are going to use AggregatingScheduler instead of Dependent for some builders.
* The other Dependent schedulers (tag and build) fire within a very short time after the release sendchange, so it should be safe to reconfig ~30-60 minutes after we start a release.
If you don't want to wontfix, I can grab the bug.
Updated•13 years ago (Assignee)
Assignee: nobody → bhearsum
Comment 6•13 years ago (Assignee)
Yeah, let's WONTFIX.
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → WONTFIX
Updated•12 years ago
Product: mozilla.org → Release Engineering