Closed Bug 629648 Opened 13 years ago Closed 9 years ago

TriggerBouncerCheck should survive after reconfig

Categories

(Release Engineering :: Release Automation: Other, defect, P5)

x86
Linux
defect

Tracking

(Not tracked)

RESOLVED WONTFIX

People

(Reporter: rail, Unassigned)

References

Details

(Whiteboard: [releases])

Attachments

(1 file)

The poller itself continues working but can't trigger dependent builders:

        Traceback (most recent call last):
          File "/tools/buildbot-0.8.2/lib/python2.6/site-packages/Twisted-10.1.0-py2.6-linux-i686.egg/twisted/internet/defer.py", line 441, in _runCallbacks
            self.result = callback(self.result, *args, **kw)
          File "/tools/buildbot-0.8.2/lib/python2.6/site-packages/Twisted-10.1.0-py2.6-linux-i686.egg/twisted/internet/defer.py", line 664, in _cbDeferred
            self.callback(self.resultList)
          File "/tools/buildbot-0.8.2/lib/python2.6/site-packages/Twisted-10.1.0-py2.6-linux-i686.egg/twisted/internet/defer.py", line 318, in callback
            self._startRunCallbacks(result)
          File "/tools/buildbot-0.8.2/lib/python2.6/site-packages/Twisted-10.1.0-py2.6-linux-i686.egg/twisted/internet/defer.py", line 424, in _startRunCallbacks
            self._runCallbacks()
        --- <exception caught here> ---
          File "/tools/buildbot-0.8.2/lib/python2.6/site-packages/Twisted-10.1.0-py2.6-linux-i686.egg/twisted/internet/defer.py", line 441, in _runCallbacks
            self.result = callback(self.result, *args, **kw)
          File "/builds/buildbot/rail/buildbotcustom/scheduler.py", line 326, in checkUptake
            Triggerable.trigger(self, self.ss, self.set_props)
          File "/tools/buildbot-0.8.2/lib/python2.6/site-packages/buildbot-0.8.2_hg_a63f22816750_production_0.8-py2.6.egg/buildbot/schedulers/triggerable.py", line 66, in trigger
            d = self.parent.db.runInteraction(self._trigger, ss, props)
        exceptions.AttributeError: 'NoneType' object has no attribute 'db'
Priority: -- → P4
Dustin, do you have any idea how to fix this, or in which direction I should dig?
This means that the poller that's running is the *old* poller, from before the reconfig.  When a reconfig occurs, all schedulers that have changed get shut down, and all the new schedulers are started.  When they're shut down, their .parent gets set to None.

So your scheduler should delay the completion of stopService (by returning a Deferred) until the current poll operation is complete, lest it trigger a downstream build *after* the Scheduler has been stopped.
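The failure mode described above can be reproduced in a few lines of plain Python. This is only a sketch: `FakeMaster` and `FakeScheduler` are hypothetical stand-ins, not Buildbot classes, and `stop()` simply assumes the detach-on-stop behaviour described in this comment.

```python
class FakeMaster(object):
    """Hypothetical stand-in for the master that owns the database handle."""

    def __init__(self):
        self.db = object()


class FakeScheduler(object):
    """Hypothetical stand-in for a scheduler attached to a master via .parent."""

    def __init__(self, master):
        self.parent = master

    def stop(self):
        # Assumption: mirrors the "parent gets set to None" behaviour on shutdown.
        self.parent = None

    def trigger(self):
        # Same access pattern as the failing line in the traceback.
        return self.parent.db


old = FakeScheduler(FakeMaster())
old.trigger()        # fine while the scheduler is still attached
old.stop()           # reconfig shuts the old scheduler down
try:
    old.trigger()    # late callback from a poll that was still in flight
    error = None
except AttributeError as exc:
    error = str(exc)
```

After the stop, `error` holds the same kind of `AttributeError` message as the traceback, which is why the old poller must finish (or be prevented from triggering) before it is detached.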
Attached patch: buildbotcustom
Yeah, probably there is no easy way to retrigger this triggerable poller after reconfig/restart with the same values set.

* Not tested in staging, just want to make sure that I understand the issue properly.
* Added stopService method
* Added a standalone builder, w/o a scheduler, for triggering mirror uptake monitoring if someone has to reconfigure the master _after_ push to mirrors and _before_ the pollers finish their work.
Attachment #513093 - Flags: feedback?(dustin)
What about saving state in the DB, and then deciding to trigger on startup or not depending on the saved state?
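A rough sketch of that idea, assuming a small key/value table; the table name `poller_state`, the helper names, and the use of an in-memory `sqlite3` database are all hypothetical and are not taken from Buildbot or buildbotcustom.

```python
import sqlite3

# Hypothetical store; a real master would use its own database connection.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE poller_state (name TEXT PRIMARY KEY, pending INTEGER NOT NULL)"
)


def mark_pending(name, pending):
    # Record whether this poller still owes a downstream trigger.
    conn.execute(
        "INSERT OR REPLACE INTO poller_state (name, pending) VALUES (?, ?)",
        (name, int(pending)),
    )
    conn.commit()


def should_resume(name):
    # On startup: resume polling only if a trigger was still outstanding.
    row = conn.execute(
        "SELECT pending FROM poller_state WHERE name = ?", (name,)
    ).fetchone()
    return bool(row and row[0])


mark_pending("uptake-poller", True)     # polling started, trigger not yet fired
resume_after_reconfig = should_resume("uptake-poller")
mark_pending("uptake-poller", False)    # trigger fired, nothing left to resume
resume_after_trigger = should_resume("uptake-poller")
```

The new scheduler would call something like `should_resume` when it starts and restart the poll loop only when a trigger is still pending.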
Comment on attachment 513093 (buildbotcustom)

This may help, as it looks like previously the LoopingCall would continue forever, even after the scheduler was stopped.

However, this doesn't address the following ordering:
 1. poll() method invoked, get_release_uptake starts
 2. scheduler stopped
 3. uptake is OK and Triggerable.trigger invoked

Since the poll() method is not atomic, you need to be sure it isn't interrupted.  Generally the best way to do this is to use a DeferredLock for the polling operation, and to try to acquire that DeferredLock during stopService and shut down the loop with the lock held.
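The locking pattern can be sketched as follows. To keep the example runnable with only the standard library, it uses `asyncio.Lock` as a stand-in for Twisted's `DeferredLock`; the actual scheduler would use Deferreds. `UptakePoller`, `check`, and `trigger` are hypothetical names used for illustration.

```python
import asyncio


class UptakePoller:
    """Hypothetical poller; asyncio.Lock plays the role of DeferredLock."""

    def __init__(self, check, trigger):
        self.check = check        # coroutine function, True when uptake is OK
        self.trigger = trigger    # callable fired once uptake is OK
        self.lock = asyncio.Lock()
        self.running = True

    async def poll(self):
        # Hold the lock for the whole check-then-trigger sequence.
        async with self.lock:
            if not self.running:
                return
            if await self.check():
                self.trigger()

    async def stop(self):
        # Waits for any in-flight poll, then shuts down with the lock held.
        async with self.lock:
            self.running = False


async def demo():
    events = []

    async def slow_check():
        await asyncio.sleep(0.01)   # stands in for the slow uptake query
        return True

    poller = UptakePoller(slow_check, lambda: events.append("trigger"))
    in_flight = asyncio.ensure_future(poller.poll())
    await asyncio.sleep(0)          # let the poll start and take the lock
    await poller.stop()             # blocks until that poll has finished
    events.append("stopped")
    await in_flight
    await poller.poll()             # a late poll after stop does nothing
    return events


events = asyncio.run(demo())
```

Because `stop()` cannot take the lock until the in-flight poll releases it, the trigger is always recorded before the stop completes, and any poll that starts afterwards returns without triggering.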
Attachment #513093 - Flags: feedback?(dustin) → feedback-
Priority: P4 → P5
Component: Release Engineering → Release Engineering: Automation (Release Automation)
QA Contact: release → bhearsum
Back to the pool...
Assignee: rail → nobody
Product: mozilla.org → Release Engineering
Depends on: 1082823
Not going to fix this because we're moving to Taskcluster.
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → WONTFIX