Closed
Bug 1352020
Opened 8 years ago
Closed 8 years ago
Tier-1 windows tests jobs + windows pgo builds missing on autoland
Categories
(Infrastructure & Operations Graveyard :: CIDuty, task)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: cbook, Unassigned)
References
Details
Attachments
(1 file)
It seems Autoland now runs completely on Taskcluster builds.
This means
-> the last Windows 7 PGO build with tests ran on https://treeherder.mozilla.org/#/jobs?repo=autoland&revision=53c4db985ef3fef079568f6063d8e8db954ecaef&filter-searchStr=windows+pgo
since then, Windows (7) PGO builds have been missing:
-> https://treeherder.mozilla.org/#/jobs?repo=autoland&bugfiler&noautoclassify&fromchange=53c4db985ef3fef079568f6063d8e8db954ecaef&filter-searchStr=windows%20pgo
-> also, since no buildbot jobs seem to run on autoland (for example, for Windows), the tier-1 tests for Windows do not run, and the only builds are tier-2 Taskcluster builds.
Not sure if this is expected, but since mozilla-inbound still runs tier-1 buildbot tests and PGO builds, I wonder :)
Reporter | ||
Comment 2•8 years ago
Autoland is now closed because of the missing tier-1 tests.
Reporter | ||
Updated•8 years ago
Severity: critical → blocker
Comment 3•8 years ago
I checked the build scheduler logs on bm81 and noticed polling errors, the first one at 2017-03-29 11:37:21 PDT:
2017-03-29 11:37:21-0700 [HTTPPageGetter,client] <HgPoller for https://hg.mozilla.org/integration/autoland>: polling failed, result.
That means we are not able to read the latest revision for the autoland repo and will get stuck on some older revisions:
2017-03-29 22:30:03-0700 [-] lastGoodRev: Skipping 60d7a0496a3673450ddbc37ec387525148c32604 since we've already built it
2017-03-30 00:30:03-0700 [-] lastGoodRev: Skipping 5ea57d0f42d23fd2522159b213644c6af565310a since we've already built it
2017-03-30 00:30:04-0700 [-] lastGoodRev: Skipping 53c4db985ef3fef079568f6063d8e8db954ecaef since we've already built it
2017-03-30 01:30:01-0700 [-] lastGoodRev: Skipping 60d7a0496a3673450ddbc37ec387525148c32604 since we've already built it
2017-03-30 03:30:04-0700 [-] lastGoodRev: Skipping 53c4db985ef3fef079568f6063d8e8db954ecaef since we've already built it
2017-03-30 04:02:01-0700 [-] lastGoodRev: Skipping 5ea57d0f42d23fd2522159b213644c6af565310a since we've already built it
2017-03-30 04:30:02-0700 [-] lastGoodRev: Skipping 60d7a0496a3673450ddbc37ec387525148c32604 since we've already built it
Autoland is the only repo affected. We have not had any buildbot jobs on autoland since revision 53c4db985ef3fef079568f6063d8e8db954ecaef.
Comment 4•8 years ago
Error traceback:
2017-03-29 22:05:21-0700 [HTTPPageGetter,client] <HgPoller for https://hg.mozilla.org/integration/autoland>: polling failed, result
2017-03-29 22:05:21-0700 [HTTPPageGetter,client] Traceback (most recent call last):
2017-03-29 22:05:21-0700 [HTTPPageGetter,client] File "/builds/buildbot/build_scheduler/lib/python2.7/site-packages/twisted/internet/defer.py", line 441, in _runCallbacks
2017-03-29 22:05:21-0700 [HTTPPageGetter,client] self.result = callback(self.result, *args, **kw)
2017-03-29 22:05:21-0700 [HTTPPageGetter,client] File "/builds/buildbot/build_scheduler/lib/python2.7/site-packages/buildbotcustom/changes/hgpoller.py", line 154, in succeeded
2017-03-29 22:05:21-0700 [HTTPPageGetter,client] self.d.callback(result)
2017-03-29 22:05:21-0700 [HTTPPageGetter,client] File "/builds/buildbot/build_scheduler/lib/python2.7/site-packages/twisted/internet/defer.py", line 318, in callback
2017-03-29 22:05:21-0700 [HTTPPageGetter,client] self._startRunCallbacks(result)
2017-03-29 22:05:21-0700 [HTTPPageGetter,client] File "/builds/buildbot/build_scheduler/lib/python2.7/site-packages/twisted/internet/defer.py", line 424, in _startRunCallbacks
2017-03-29 22:05:21-0700 [HTTPPageGetter,client] self._runCallbacks()
2017-03-29 22:05:21-0700 [HTTPPageGetter,client] --- <exception caught here> ---
2017-03-29 22:05:21-0700 [HTTPPageGetter,client] File "/builds/buildbot/build_scheduler/lib/python2.7/site-packages/twisted/internet/defer.py", line 441, in _runCallbacks
2017-03-29 22:05:21-0700 [HTTPPageGetter,client] self.result = callback(self.result, *args, **kw)
2017-03-29 22:05:21-0700 [HTTPPageGetter,client] File "/builds/buildbot/build_scheduler/lib/python2.7/site-packages/buildbotcustom/changes/hgpoller.py", line 438, in processData
2017-03-29 22:05:21-0700 [HTTPPageGetter,client] self.parent.addChange(c)
2017-03-29 22:05:21-0700 [HTTPPageGetter,client] File "/builds/buildbot/build_scheduler/lib/python2.7/site-packages/buildbot-0.8.2_hg_8b87b4974e3c_production_0.8-py2.7.egg/buildbot/changes/manager.py", line 114, in addChange
2017-03-29 22:05:21-0700 [HTTPPageGetter,client] self.parent.addChange(change)
2017-03-29 22:05:21-0700 [HTTPPageGetter,client] File "/builds/buildbot/build_scheduler/lib/python2.7/site-packages/buildbot-0.8.2_hg_8b87b4974e3c_production_0.8-py2.7.egg/buildbot/master.py", line 1178, in addChange
2017-03-29 22:05:21-0700 [HTTPPageGetter,client] self.db.addChangeToDatabase(change)
2017-03-29 22:05:21-0700 [HTTPPageGetter,client] File "/builds/buildbot/build_scheduler/lib/python2.7/site-packages/buildbot-0.8.2_hg_8b87b4974e3c_production_0.8-py2.7.egg/buildbot/db/connector.py", line 308, in addChangeToDatabase
2017-03-29 22:05:21-0700 [HTTPPageGetter,client] self.runInteractionNow(self._txn_addChangeToDatabase, change)
2017-03-29 22:05:21-0700 [HTTPPageGetter,client] File "/builds/buildbot/build_scheduler/lib/python2.7/site-packages/buildbot-0.8.2_hg_8b87b4974e3c_production_0.8-py2.7.egg/buildbot/db/connector.py", line 212, in runInteractionNow
2017-03-29 22:05:21-0700 [HTTPPageGetter,client] return self._runInteractionNow(interaction, *args, **kwargs)
2017-03-29 22:05:21-0700 [HTTPPageGetter,client] File "/builds/buildbot/build_scheduler/lib/python2.7/site-packages/buildbot-0.8.2_hg_8b87b4974e3c_production_0.8-py2.7.egg/buildbot/db/connector.py", line 237, in _runInteractionNow
2017-03-29 22:05:21-0700 [HTTPPageGetter,client] result = interaction(c, *args, **kwargs)
2017-03-29 22:05:21-0700 [HTTPPageGetter,client] File "/builds/buildbot/build_scheduler/lib/python2.7/site-packages/buildbot-0.8.2_hg_8b87b4974e3c_production_0.8-py2.7.egg/buildbot/db/connector.py", line 326, in _txn_addChangeToDatabase
2017-03-29 22:05:21-0700 [HTTPPageGetter,client] t.execute(q, values)
2017-03-29 22:05:21-0700 [HTTPPageGetter,client] File "/builds/buildbot/build_scheduler/lib/python2.7/site-packages/MySQLdb/cursors.py", line 174, in execute
2017-03-29 22:05:21-0700 [HTTPPageGetter,client] self.errorhandler(self, exc, value)
2017-03-29 22:05:21-0700 [HTTPPageGetter,client] File "/builds/buildbot/build_scheduler/lib/python2.7/site-packages/MySQLdb/connections.py", line 36, in defaulterrorhandler
2017-03-29 22:05:21-0700 [HTTPPageGetter,client] raise errorclass, errorvalue
2017-03-29 22:05:21-0700 [HTTPPageGetter,client] _mysql_exceptions.OperationalError: (1366, "Incorrect string value: '\\xF0\\x9F\\x8D\\xB7 (...' for column 'comments' at row 1")
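The four bytes quoted in the `OperationalError` can be decoded to see which character MySQL rejected. This is a minimal sketch in Python 3 (the scheduler itself runs Python 2.7), and the variable names are illustrative only. A 4-byte UTF-8 sequence is what MySQL's legacy 3-byte `utf8` character set cannot store, which is a likely explanation for error 1366 here, although the actual character set of the `comments` column is not shown in this bug.

```python
import unicodedata

# Bytes copied from the error message: '\xF0\x9F\x8D\xB7'
raw = b"\xF0\x9F\x8D\xB7"

# Decode the UTF-8 sequence into a single character.
ch = raw.decode("utf-8")

code_point = ord(ch)          # numeric Unicode code point
name = unicodedata.name(ch)   # official Unicode character name

# Code points above U+FFFF lie outside the Basic Multilingual Plane
# and take four bytes in UTF-8.
needs_four_bytes = code_point > 0xFFFF

print("U+%04X %s (4-byte UTF-8: %s)" % (code_point, name, needs_four_bytes))
```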
Comment 5•8 years ago
And this is the error that I see before the first HG Poller error:
2017-03-29 11:37:21-0700 [HTTPPageGetter,client] Stopping factory <HTTPClientFactory: https://hg.mozilla.org/projects/elm/json-pushes?version=2&full=1&startID=41>
2017-03-29 11:37:21-0700 [HTTPPageGetter,client] Stopping factory <HTTPClientFactory: https://hg.mozilla.org/releases/mozilla-aurora/json-pushes?version=2&full=1&startID=10973>
2017-03-29 11:37:21-0700 [HTTPPageGetter,client] adding change, who servo-vcs-sync@mozilla.com, 7 files, rev=566096db33a20917d74a0af803d6f2c746387c4a, branch=integration/autoland, repository=, comments servo: Merge #16180 - Make the WebSocket handshake ourselves
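The poller URLs in these log entries share one shape: the repository URL followed by `json-pushes` with `version`, `full` and `startID` query parameters. A small Python 3 sketch of building such a URL follows; the helper name `json_pushes_url` is hypothetical and is not taken from `hgpoller.py`.

```python
from urllib.parse import urlencode


def json_pushes_url(repo_url, start_id):
    """Build a json-pushes URL of the form seen in the scheduler log.

    Hypothetical helper for illustration; the real poller may build
    its URL differently.
    """
    query = urlencode({"version": 2, "full": 1, "startID": start_id})
    return "%s/json-pushes?%s" % (repo_url.rstrip("/"), query)


# Reproduces the elm URL from the log entry above.
print(json_pushes_url("https://hg.mozilla.org/projects/elm", 41))
```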
Reporter | ||
Comment 6•8 years ago
From IRC, it seems that the wine glass emoji (U+1F377, the \xF0\x9F\x8D\xB7 bytes in the traceback) in the commit message of https://hg.mozilla.org/integration/autoland/rev/566096db33a20917d74a0af803d6f2c746387c4a is causing the problem. Also filed bug 1352052 for better monitoring.
Reporter | ||
Updated•8 years ago
Flags: needinfo?(gps)
Comment 7•8 years ago
(In reply to Alin Selagea [:aselagea][:buildduty] from comment #5)
> And this is the error that I see before the first HG Poller error:
well, not error but log entries..
Comment hidden (mozreview-request) |
Comment 9•8 years ago
Catlee said he will r+ once he is able to (he can't right now).
https://hg.mozilla.org/build/buildbotcustom/rev/a8a9d62163a0c8859033fda56daf66576a7eb3d8
https://hg.mozilla.org/build/buildbotcustom/rev/6e26b053ab94b7aeb2295d3f80665d2afce0d6be
Alin told me he will make sure the scheduler reconfigs and then will use 'manhole' to reset it to the appropriate pushID, and lastly will comment here those steps, so someone like me can remember how to do it should we need to again before Buildbot is dead.
Flags: needinfo?(aselagea)
Comment 10•8 years ago
mozreview-review |
Comment on attachment 8852907 [details]
Bug 1352020 - Don't attempt to store unicode in the commit description.
https://reviewboard.mozilla.org/r/125046/#review127654
Attachment #8852907 -
Flags: review?(catlee) → review+
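The patch itself is not reproduced in this bug. As a rough sketch of the idea in its title (keeping non-ASCII characters out of the stored commit description), something like the following could be used; it is written in Python 3, the function name `ascii_only_comments` is hypothetical, and the real change in buildbotcustom may work differently.

```python
def ascii_only_comments(comments):
    """Return the commit description with non-ASCII characters replaced.

    Hypothetical illustration: each character that cannot be encoded
    as ASCII becomes '?', so the database only ever receives ASCII.
    """
    return comments.encode("ascii", "replace").decode("ascii")


# A description containing the wine glass emoji (U+1F377).
description = "servo: Merge #16180 \U0001F377 (example)"
print(ascii_only_comments(description))
```

Replacing rather than dropping the character keeps a visible marker where the description was altered.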
Reporter | ||
Comment 11•8 years ago
Sheriffs won't reopen autoland right now. The plan is to leave autoland closed until things look green, since we have a backlog of about 18 hours and 36 commits to autoland without tier-1 Windows tests and builds from buildbot.
Comment 12•8 years ago
Okay, I deployed the changes on the build scheduler and reset the push ID. Below you can find the steps to do that:
On bm81:
#connect to the loopback address using cltbld user and port 7301
ssh cltbld@127.0.0.1 -p 7301
#list the names in the current local scope
>>> dir()
['_', '__builtins__', 'master', 'p', 'pollers', 'status']
>>> dir(status)
['__doc__', '__implemented__', '__init__', '__module__', '__providedBy__', '__provides__', '_builder_observers', '_builder_subscribe', '_builder_unsubscribe', '_buildreq_observers', '_buildrequest_subscribe', '_buildrequest_unsubscribe', '_buildset_finished_waiters', '_buildset_success_waiters', '_buildset_waitUntilFinished', '_buildset_waitUntilSuccess', '_db_buildrequest_added', '_db_buildrequest_cancelled', '_db_builds_changed', '_db_buildset_added', '_db_buildset_changed', '_db_buildsets_changed', '_handle_buildrequest_event', 'announceNewBuilder', 'asDict', 'basedir', 'botmaster', 'builderAdded', 'builderRemoved', 'buildreqs_retired', 'cancelCleanShutdown', 'changeAdded', 'cleanShutdown', 'db', 'generateFinishedBuilds', 'getBuildSets', 'getBuildbotURL', 'getBuilder', 'getBuilderNames', 'getChange', 'getChangeSources', 'getProjectName', 'getProjectURL', 'getSchedulers', 'getSlave', 'getSlaveNames', 'getURLForThing', 'get_buildreq_for_id', 'logCompressionLimit', 'logCompressionMethod', 'logMaxSize', 'logMaxTailSize', 'setDB', 'shuttingDown', 'slaveConnected', 'slaveDisconnected', 'subscribe', 'unsubscribe', 'watchers']
# list the attributes for the 'master' object
>>> dir(master)
['__doc__', '__getstate__', '__implemented__', '__init__', '__iter__', '__module__', '__providedBy__', '__provides__', '_handleSIGHUP', '_txn_submitBuildSet', 'addChange', 'addService', 'allSchedulers', 'basedir', 'botmaster', 'buildCacheSize', 'buildHorizon', 'buildbotURL', 'changeCacheSize', 'change_svc', 'checker', 'configFileName', 'db', 'db_poll_interval', 'db_url', 'debug', 'debugPassword', 'disownServiceParent', 'dispatcher', 'eventHorizon', 'getServiceNamed', 'getStatus', 'loadConfig', 'loadConfig_Builders', 'loadConfig_Database', 'loadConfig_Slaves', 'loadConfig_Sources', 'loadConfig_status', 'loadDatabase', 'loadTheConfigFile', 'logHorizon', 'log_rotation', 'manhole', 'master_incarnation', 'master_name', 'name', 'namedServices', 'parent', 'privilegedStartService', 'projectName', 'projectURL', 'properties', 'readConfig', 'removeService', 'running', 'scheduler_manager', 'services', 'setName', 'setServiceParent', 'slaveFactory', 'slavePort', 'slavePortnum', 'startService', 'status', 'statusTargets', 'stopService', 'submitBuildSet', 'triggerSlaveManager']
# list the contents of 'change_svc' (which manages the active change sources and the set of changes received from those sources)
>>> list(master.change_svc)
[<buildbot.changes.pb.PBChangeSource instance at 0x1ef55a8>, <buildbotcustom.changes.hgpoller.HgPoller object at 0x3407d50>, <buildbotcustom.changes.hgpoller.HgPoller object at 0x28fce10>, <buildbotcustom.changes.hgpoller.HgPoller object at 0xdf50490>, <buildbotcustom.changes.hgpoller.HgPoller object at 0xdf50b50>, <buildbotcustom.changes.hgpoller.HgPoller object at 0x7fd23c19bb90>, <buildbotcustom.changes.hgpoller.HgPoller object at 0x7fd23cb47990>, <buildbotcustom.changes.hgpoller.HgPoller object at 0x7fd23c53dc50>, <buildbotcustom.changes.hgpoller.HgPoller object at 0x7fd23c503610>, <buildbotcustom.changes.hgpoller.HgPoller object at 0x4769d10>, <buildbotcustom.changes.hgpoller.HgPoller object at 0x3cd6290>, <buildbotcustom.changes.hgpoller.HgPoller object at 0xf850750>, <buildbotcustom.changes.hgpoller.HgPoller object at 0x3fbf850>, <buildbotcustom.changes.hgpoller.HgPoller object at 0x115d15d0>, <buildbotcustom.changes.hgpoller.HgPoller object at 0x5c494d0>, <buildbotcustom.changes.hgpoller.HgPoller object at 0x7fd235033390>, <buildbotcustom.changes.hgpoller.HgPoller object at 0x36d5210>, <buildbotcustom.changes.hgpoller.HgPoller object at 0x6b05890>, <buildbotcustom.changes.hgpoller.HgPoller object at 0x6b05f50>, <buildbotcustom.changes.hgpoller.HgPoller object at 0x7fd23429ea90>, <buildbotcustom.changes.hgpoller.HgPoller object at 0x7fd23c8ad150>, <buildbotcustom.changes.hgpoller.HgPoller object at 0x7fd23c1b0a10>, <buildbotcustom.changes.hgpoller.HgPoller object at 0x4d34090>, <buildbotcustom.changes.hgpoller.HgPoller object at 0x7fd234e925d0>, <buildbotcustom.changes.hgpoller.HgPoller object at 0x2f06350>, <buildbotcustom.changes.hgpoller.HgPoller object at 0x2f06590>]
# assign the result of the above executed statement to a variable
>>> pollers = _
>>> pollers[1]
<buildbotcustom.changes.hgpoller.HgPoller object at 0x3407d50>
>>> pollers[1].branch
'projects/graphics'
# find the poller for the 'integration/autoland' branch (pollers[0] is the PBChangeSource, so start at index 1)
>>> for p in pollers[1:]:
... if p.branch == 'integration/autoland':
... break
...
>>> p
<buildbotcustom.changes.hgpoller.HgPoller object at 0xdf50490>
>>> p.branch
'integration/autoland'
#display the ID of the last push
>>> p.lastPushID
39830
# finally, reset it to the value corresponding to the last known good push
>>> p.lastPushID = 39788
>>> p.lastPushID
39788
Flags: needinfo?(aselagea)
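The manhole steps above boil down to: find the change source whose `branch` is `integration/autoland`, note its `lastPushID`, and set it back to the last known good push. A self-contained Python 3 sketch of that logic follows. `FakePoller` and `reset_push_id` are hypothetical stand-ins written for illustration; only the `branch` and `lastPushID` attribute names come from the session above.

```python
class FakePoller(object):
    """Stand-in for an HgPoller; only the two attributes used above."""

    def __init__(self, branch, lastPushID):
        self.branch = branch
        self.lastPushID = lastPushID


def reset_push_id(change_sources, branch, push_id):
    """Set lastPushID on the poller for `branch`; return the old value.

    Sources without a `branch` attribute (such as the PBChangeSource
    at index 0 in the session above) are skipped.
    """
    for source in change_sources:
        if getattr(source, "branch", None) == branch:
            previous = source.lastPushID
            source.lastPushID = push_id
            return previous
    raise LookupError("no poller found for branch %r" % branch)


# Example with made-up pollers mirroring the values in the session.
sources = [
    object(),  # plays the role of the PBChangeSource
    FakePoller("projects/graphics", 100),
    FakePoller("integration/autoland", 39830),
]
old = reset_push_id(sources, "integration/autoland", 39788)
print(old, sources[2].lastPushID)
```

Returning the previous value makes it easy to record what the poller was reset from, as the session does by printing `p.lastPushID` before changing it.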
Comment 13•8 years ago
We're mostly caught up at this point. I've merged autoland around and reopened it.
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Updated•7 years ago
Product: Release Engineering → Infrastructure & Operations
Updated•5 years ago
Product: Infrastructure & Operations → Infrastructure & Operations Graveyard