Today, an hg push to mozilla-inbound resulted in an entry appearing on tbpl and treeherder, but with no builders triggered:

Treeherder: https://secure.pub.build.mozilla.org/buildapi/self-serve/mozilla-inbound/rev/ecd4f1368b7a
Tbpl: https://tbpl.mozilla.org/?tree=Mozilla-Inbound&rev=ecd4f1368b7a
hg log: https://hg.mozilla.org/integration/mozilla-inbound/rev/ecd4f1368b7a
hg push log: https://hg.mozilla.org/integration/mozilla-inbound/pushloghtml

No results returned when running:

select * from changes where revision='ecd4f1368b7a'

against the production scheduler database.

No entries generated in: https://secure.pub.build.mozilla.org/builddata/buildjson/builds-4hr.js.gz

Push log states a push time of: 06/10/2014 04:55:59 PDT

According to: https://wiki.mozilla.org/index.php?title=ReleaseEngineering/Maintenance&oldid=1022446
a reconfig of all masters completed at: 06/10/2014 05:15

The suspected cause currently is that this reconfig interfered with the hg poller running on the buildbot scheduler master. I am currently investigating the buildbot scheduler logs.
First reference in scheduler log:

2014-10-06 04:56:12-0700 [HTTPPageGetter,client] last changeset ecd4f1368b7ac0fecac8dbd231da732fec39baa3 on https://hg.mozilla.org/integration/mozilla-inbound

Last reference:

2014-10-06 05:34:13-0700 [HTTPPageGetter,client] Stopping factory <HTTPClientFactory: https://hg.mozilla.org/integration/mozilla-inbound/json-pushes?full=1&fromchange=ecd4f1368b7ac0fecac8dbd231da732fec39baa3>

Now scanning the lines between for clues...
Bingo! This looks like a reconfig error:

2014-10-06 04:56:12-0700 [HTTPPageGetter,client] Stopping factory <HTTPClientFactory: https://hg.mozilla.org/releases/mozilla-aurora/json-pushes?full=1>
2014-10-06 04:56:12-0700 [-] Unhandled Error
    Traceback (most recent call last):
      File "/builds/buildbot/build_scheduler/lib/python2.7/site-packages/twisted/internet/base.py", line 1174, in mainLoop
        self.runUntilCurrent()
      File "/builds/buildbot/build_scheduler/lib/python2.7/site-packages/twisted/internet/base.py", line 796, in runUntilCurrent
        call.func(*call.args, **call.kw)
      File "/builds/buildbot/build_scheduler/lib/python2.7/site-packages/buildbot-0.8.2_hg_3ce9eb030a5f_production_0.8-py2.7.egg/buildbot/util/eventual.py", line 31, in _turn
        cb(*args, **kwargs)
      File "/builds/buildbot/build_scheduler/lib/python2.7/site-packages/buildbot-0.8.2_hg_3ce9eb030a5f_production_0.8-py2.7.egg/buildbot/util/loop.py", line 167, in _loop_next
        d = defer.maybeDeferred(p)
    --- <exception caught here> ---
      File "/builds/buildbot/build_scheduler/lib/python2.7/site-packages/twisted/internet/defer.py", line 125, in maybeDeferred
        result = f(*args, **kw)
      File "/builds/buildbot/build_scheduler/lib/python2.7/site-packages/buildbot-0.8.2_hg_3ce9eb030a5f_production_0.8-py2.7.egg/buildbot/schedulers/timed.py", line 228, in run
        db = self.parent.db
    exceptions.AttributeError: 'NoneType' object has no attribute 'db'
As you see above, the suspected reconfig error happened during the same second as the changeset was first found:

2014-10-06 04:56:12-0700 [HTTPPageGetter,client] last changeset ecd4f1368b7ac0fecac8dbd231da732fec39baa3 on https://hg.mozilla.org/integration/mozilla-inbound
...
2014-10-06 04:56:12-0700 [-] Unhandled Error
    Traceback (most recent call last):
...
...
We should find the root cause of the error in comment 2 and fix it...
That's a spurious message, we see that all the time. The root cause of missing pushes is that during the reconfig the old poller gets deleted and a new poller gets created. If a push happens between those events, then the new poller will assume the new push is the "latest" one, and will only find pushes that happen after that. Does this explain the events that happened? When did the reconfig happen on this master? When did the new poller get created?
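To make the race described above concrete, here is a minimal toy model in Python. It is not the buildbot poller code: `FakeRepo`, `ToyPoller` and `simulate_reconfig_race` are hypothetical names invented for illustration, and the changeset ids are made up. The only behaviour it assumes is the one stated above, namely that a freshly created poller treats whatever push is newest at its first poll as the baseline and only reports pushes that land after it.

```python
class FakeRepo:
    """Hypothetical stand-in for the hg pushlog: an ordered list of changeset ids."""

    def __init__(self):
        self.pushes = []

    def push(self, changeset):
        self.pushes.append(changeset)

    def pushes_since(self, changeset=None):
        # With no starting point, return everything (like a query without fromchange).
        if changeset is None:
            return list(self.pushes)
        idx = self.pushes.index(changeset)
        return self.pushes[idx + 1:]


class ToyPoller:
    """Toy poller (assumption, not buildbot's implementation)."""

    def __init__(self, repo):
        self.repo = repo
        self.last_changeset = None
        self.initialised = False

    def poll(self):
        if not self.initialised:
            # First poll: remember the newest push as the baseline, report nothing.
            existing = self.repo.pushes_since(None)
            if existing:
                self.last_changeset = existing[-1]
            self.initialised = True
            return []
        # Later polls: report only pushes after the remembered baseline.
        new = self.repo.pushes_since(self.last_changeset)
        if new:
            self.last_changeset = new[-1]
        return new


def simulate_reconfig_race():
    repo = FakeRepo()
    repo.push("aaa")

    old_poller = ToyPoller(repo)
    old_poller.poll()                 # baseline is "aaa"
    repo.push("bbb")
    triggered = old_poller.poll()     # old poller sees "bbb"

    old_poller = None                 # reconfig: old poller is deleted
    repo.push("ccc")                  # push lands before the new poller exists
    new_poller = ToyPoller(repo)      # reconfig: new poller is created
    triggered += new_poller.poll()    # first poll takes "ccc" as baseline, reports nothing

    repo.push("ddd")
    triggered += new_poller.poll()    # only "ddd" is reported
    return triggered


if __name__ == "__main__":
    print(simulate_reconfig_race())
```

In this sketch "ccc" plays the role of the push that never gets builds: it is visible in the push log, but neither poller ever reports it as a new change. Whether that matches this bug depends on the answers to the timing questions above.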
Component: Tools → General
Product: Release Engineering → Release Engineering
buildbot is dying!
Status: NEW → RESOLVED
Last Resolved: 11 months ago
Resolution: --- → WONTFIX