Closed
Bug 1366029
Opened 7 years ago
Closed 7 years ago
add windows 10 machines to buildbot-configs so we can run new talos tests on there
Categories
(Infrastructure & Operations Graveyard :: CIDuty, task)
Infrastructure & Operations Graveyard
CIDuty
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: jmaher, Assigned: jmaher)
References
Details
Attachments
(4 files, 3 obsolete files)
10.05 KB,
patch
|
kmoir
:
review+
|
Details | Diff | Splinter Review |
1.47 KB,
patch
|
kmoir
:
review+
aselagea
:
checked-in+
|
Details | Diff | Splinter Review |
10.05 KB,
patch
|
jmaher
:
review+
|
Details | Diff | Splinter Review |
984 bytes,
patch
|
kmoir
:
checked-in+
|
Details | Diff | Splinter Review |
No description provided.
Comment 1•7 years ago
|
||
Do we know what the machine naming scheme will be?
Comment 2•7 years ago
|
||
t-w1064-ix-NNNN.wintest.releng.scl3.mozilla.com
Assignee | ||
Comment 3•7 years ago
|
||
I am familiar with list_builder_differences for verifying changes when scheduling new jobs, but I am not familiar with adding new machine names and platforms in buildbot-configs (or maybe even buildbotcustom). If there is prior art for doing this, I would be happy to look at that as a starting point.
Comment 4•7 years ago
|
||
We previously removed win10 support from buildbot in bug 1330999. I would use the patches from that bug as a starting point.
Assignee | ||
Comment 5•7 years ago
|
||
Assignee | ||
Comment 6•7 years ago
|
||
Assignee | ||
Comment 7•7 years ago
|
||
assuming my patch looks good, we can go ahead and schedule a time to replace win8 talos with win10; ideally this is something we can line up with reimaging machines.
Assignee | ||
Comment 8•7 years ago
|
||
:catlee, I would like to know if this is a patch worth pursuing. If you don't have time, maybe you can redirect to another buildbot hacker? Getting this ready to land would help us move forward in finishing the win10 project.
Comment 9•7 years ago
|
||
Comment on attachment 8869610 [details] [diff] [review]
add win10-ix as a platform- shift win8 talos tests to win10

Review of attachment 8869610 [details] [diff] [review]:
-----------------------------------------------------------------

::: mozilla-tests/config.py
@@ -152,5 @@
> 'config_file': 'talos/windows_config.py',
> }
>
> PLATFORMS['win64']['slave_platforms'] = ['win8_64']
> -PLATFORMS['win64']['talos_slave_platforms'] = ['win8_64']

Will we want to make a hard transition from win8 to win10 talos testing?
Attachment #8869610 -
Flags: feedback?(catlee) → feedback+
Assignee | ||
Comment 10•7 years ago
|
||
in addition to buildbot-configs, we need support for slavehealth/slavealloc/puppet.

I see a puppet patch from when win10 was removed:
https://bugzilla.mozilla.org/page.cgi?id=splinter.html&bug=1330999&attachment=8827909

there is also a cloudtools patch:
https://bugzilla.mozilla.org/page.cgi?id=splinter.html&bug=1330999&attachment=8827910

but I am not sure what slavehealth/slaveconfig is. Is that cloud-tools?
Flags: needinfo?(catlee)
Comment 11•7 years ago
|
||
buildduty can add the entries to slavealloc. I'm not sure about how machines get added to slavehealth. Alin, can you help Joel out?
Flags: needinfo?(catlee) → needinfo?(aselagea)
Assignee | ||
Comment 12•7 years ago
|
||
the plan here is to turn off win8 and turn on win10 at the same time. If there are problems with that plan, let me know and I can do this in 2 stages.
Attachment #8869610 -
Attachment is obsolete: true
Attachment #8871284 -
Flags: review?(kmoir)
Assignee | ||
Comment 13•7 years ago
|
||
support for windows 10 ix hardware inside of puppet.
Attachment #8871285 -
Flags: review?(kmoir)
Comment 14•7 years ago
|
||
(In reply to Chris AtLee [:catlee] from comment #11)
> buildduty can add the entries to slavealloc. I'm not sure about how machines
> get added to slavehealth. Alin, can you help Joel out?

Yeah, I can take care of those.
Flags: needinfo?(aselagea)
Comment 15•7 years ago
|
||
According to https://bugzilla.mozilla.org/show_bug.cgi?id=1367102#c4, we're going to enable 75 Win 10 machines at this point.
Added those to slavealloc.

mysql> select count(*) from slaves where name like 't-w1064-ix%';
+----------+
| count(*) |
+----------+
|       75 |
+----------+
1 row in set (0.00 sec)
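The same kind of count check can be tried against a throwaway database. This is a minimal sketch that uses Python's sqlite3 in place of the production MySQL slavealloc database. The one-column `slaves` table, the short-name entries, and the 001 to 075 numbering are all assumptions made for illustration, not the real schema or range.

```python
import sqlite3


def count_win10_slaves(conn):
    """Count rows whose name starts with the win10 ix prefix."""
    row = conn.execute(
        "SELECT COUNT(*) FROM slaves WHERE name LIKE 't-w1064-ix%'"
    ).fetchone()
    return row[0]


# Hypothetical, simplified stand-in for the slavealloc `slaves` table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE slaves (name TEXT PRIMARY KEY)")

# Assumed numbering: 75 machines, zero-padded to 3 digits.
win10_names = [("t-w1064-ix-%03d" % n,) for n in range(1, 76)]
conn.executemany("INSERT INTO slaves (name) VALUES (?)", win10_names)

# A machine from another pool, which the LIKE filter should ignore.
conn.execute("INSERT INTO slaves (name) VALUES ('t-w864-ix-001')")

print(count_win10_slaves(conn))
```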
Comment 16•7 years ago
|
||
Comment on attachment 8871285 [details] [diff] [review]
add windows 10 ix to puppet

Do we need to include $slave_trustlevel = 'try' here?
Comment 17•7 years ago
|
||
Comment on attachment 8871284 [details] [diff] [review]
add windows 10 ix to buildbot configs

I think this is fine except for

PLATFORMS['win64-devedition']['win10_64_devedition'] = {'name': 'Windows 10 64-bit DevEdition',
+ 'try_by_default': True}

'try_by_default': True should be False, since we only run these tests on beta.
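A small sketch of the requested change and one way to check it. The `PLATFORMS` dict here is a trimmed-down, hypothetical stand-in for the one in mozilla-tests/config.py. The `win10_64` entry and the `try_default_platforms` helper are assumptions added for contrast and are not taken from the patch.

```python
# Hypothetical, minimal stand-in for PLATFORMS in mozilla-tests/config.py.
PLATFORMS = {"win64": {}, "win64-devedition": {}}

# Assumed entry for regular win10 testing (key name is illustrative only).
PLATFORMS["win64"]["win10_64"] = {
    "name": "Windows 10 64-bit",
    "try_by_default": True,
}

# DevEdition entry with the value the review asked for.
PLATFORMS["win64-devedition"]["win10_64_devedition"] = {
    "name": "Windows 10 64-bit DevEdition",
    "try_by_default": False,  # these tests only run on beta
}


def try_default_platforms(platforms):
    """Return the slave platform keys that are enabled on try by default."""
    enabled = []
    for platform in platforms.values():
        for key, value in platform.items():
            # Only per-slave-platform dicts carry the try_by_default flag.
            if isinstance(value, dict) and value.get("try_by_default"):
                enabled.append(key)
    return sorted(enabled)


print(try_default_platforms(PLATFORMS))
```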
Attachment #8871284 -
Flags: review?(kmoir) → review+
Assignee | ||
Comment 18•7 years ago
|
||
thanks! I think with the two patches attached here, we will be all set. I assume the puppet patch can land sooner rather than later, then the buildbot-config patch when we start shutting off win8 machines.
Comment 19•7 years ago
|
||
For the slave_health part, I simply reverted Coop's patch which actually disabled win10: https://hg.mozilla.org/build/slave_health/rev/ed1e646be536
Comment 20•7 years ago
|
||
manifests/moco-nodes.pp should not have any node definitions for w10, since we are using GPO and AD.
Assignee | ||
Comment 21•7 years ago
|
||
removed the moco-nodes.pp changes.
Attachment #8871285 -
Attachment is obsolete: true
Attachment #8871285 -
Flags: review?(kmoir)
Attachment #8871312 -
Flags: review+
Assignee | ||
Comment 22•7 years ago
|
||
Comment on attachment 8871312 [details] [diff] [review]
add windows 10 ix to puppet

sorry, this was not r+ from :kmoir already; the question about $slave_trustlevel = 'try' seems to be resolved by removing the changes for moco-nodes.pp
Attachment #8871312 -
Flags: review+ → review?(kmoir)
Assignee | ||
Comment 23•7 years ago
|
||
updated patch to set try_by_default to False for win10 devedition. thanks for the review
Attachment #8869611 -
Attachment is obsolete: true
Attachment #8871313 -
Flags: review+
Comment 24•7 years ago
|
||
One note here that I also made in bug 1367102: the hostname pattern is t-w1064-ix-NNN.wintest.releng.scl3.mozilla.com (3 digits instead of 4).
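The 3-digit form can be written down as a regular expression for quick validation. This is a sketch only: the `HOST_RE` pattern and the `host_index` helper are illustrative names, not something taken from buildbot-configs or puppet.

```python
import re

# Hostname pattern with exactly 3 digits, per the note above.
HOST_RE = re.compile(
    r"^t-w1064-ix-(\d{3})\.wintest\.releng\.scl3\.mozilla\.com$"
)


def host_index(fqdn):
    """Return the machine number for a matching hostname, else None."""
    match = HOST_RE.match(fqdn)
    return int(match.group(1)) if match else None


print(host_index("t-w1064-ix-075.wintest.releng.scl3.mozilla.com"))
print(host_index("t-w1064-ix-0075.wintest.releng.scl3.mozilla.com"))
```

A 4-digit name such as the second one does not match, so the helper returns None for it.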
Updated•7 years ago
|
Attachment #8871312 -
Flags: review?(kmoir) → review+
Comment 25•7 years ago
|
||
Did a bit of research on what's needed in Treeherder so the new jobs show up and I think we have everything in place from our previous setup to run Win 10 tests.
https://github.com/mozilla/treeherder/blob/master/ui/js/values.js#L38
https://github.com/mozilla/treeherder/blob/master/treeherder/etl/buildbot.py#L279

A test is also added:
https://github.com/mozilla/treeherder/blob/master/tests/etl/test_buildbot.py#L1018
Comment 26•7 years ago
|
||
Comment on attachment 8871312 [details] [diff] [review]
add windows 10 ix to puppet

https://hg.mozilla.org/build/puppet/rev/66f603ea69af
https://hg.mozilla.org/build/puppet/rev/868762962e40
Attachment #8871312 -
Flags: checked-in+
Comment 27•7 years ago
|
||
We ran into several problems with this deploy from the releng side of things. There were also relops issues, but I'll address those in their bug.

There were two main problems:
1) New w10 machines could not connect to buildbot masters
2) Huge windows pending counts were triggered

New w10 machines could not connect to buildbot masters
1) The initial reconfig failed because the win10 devedition key was missing in puppet. There were also windows eol characters in the patch; I'm not sure if they caused an issue, but I removed them as well. I deployed this fix: https://hg.mozilla.org/build/puppet/rev/3f09b62b7c30
2) The puppet patch landed, but a new reconfig was not triggered because the reconfig script did not see a change to the version since the last run, which had failed (bug 1369164).
3) I triggered a reconfig and machines could connect.

Huge windows pending counts were triggered
When we enabled w10 as a platform there was a huge increase in pending counts for w7 and w10 jobs. We have seen this happen before when adding a new platform. I opened bug 1369157 to investigate the root cause. Alin fixed the db issues as well.

Alin, can you include the db queries/updates you used to fix the issue on this bug? I looked in the mysql console history, but you must have attached to the db from a different machine than I did.
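The missed reconfig described above (a failed run followed by no retry because the version looked unchanged) can be illustrated with a toy model. This is not the actual reconfig script, whose details are tracked in bug 1369164. The `ReconfigTracker` class and its `record_on_failure` switch are hypothetical names used only to show the failure mode.

```python
class ReconfigTracker:
    """Toy model of a 'reconfig only when the version changed' check."""

    def __init__(self, record_on_failure):
        # If True, the version is remembered even when the reconfig fails.
        self.record_on_failure = record_on_failure
        self.last_seen = None

    def run(self, version, reconfig):
        """Attempt a reconfig for `version`; return 'skipped', 'ok' or 'failed'."""
        if version == self.last_seen:
            return "skipped"
        ok = reconfig()
        if ok or self.record_on_failure:
            self.last_seen = version
        return "ok" if ok else "failed"


def simulate(record_on_failure):
    tracker = ReconfigTracker(record_on_failure)
    # First attempt fails (for example, a missing key on the master).
    first = tracker.run("configs-v2", lambda: False)
    # The fix lands elsewhere, so the tracked version stays the same.
    second = tracker.run("configs-v2", lambda: True)
    return first, second


print(simulate(record_on_failure=True))   # second run is skipped
print(simulate(record_on_failure=False))  # second run retries and succeeds
```

Recording the version only after a successful run means a failed reconfig is retried on the next pass, at the cost of retrying on every pass while the failure persists.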
Flags: needinfo?(aselagea)
Comment 28•7 years ago
|
||
noticed this alert because the range is not quite right

[sns alert] Jun 01 08:00:02 buildbot-master119.bb.releng.scl3.mozilla.com watch_twistd_log.py: Count: 372 | First instance: 2017-06-01 07:38:09-0700 | Most recent instance: 2017-06-01 08:00:00-0700 | Twistd exception: twisted.cred.error.UnauthorizedLogin - t-w1064-ix-075.wintest.releng.scl3.mozilla.com 10.26.42.97
Updated•7 years ago
|
Attachment #8873465 -
Flags: checked-in+
Comment 29•7 years ago
|
||
(In reply to Kim Moir [:kmoir] from comment #27)
> Alin, can you include the db queries/updates you used to fix the issue on
> this bug. I looked in the mysql console history but you must have attached
> to the db from a different machine than I did.

I first created a temporary table to store the IDs of all build requests that were submitted *after* May 31 07:00 PDT, but corresponding to changes that were done *before* May 31 07:00 PDT.

create temporary table ids
  select buildrequests.id
  from buildrequests, buildsets, sourcestamp_changes, changes
  where changes.changeid = sourcestamp_changes.changeid
    and sourcestamp_changes.sourcestampid = buildsets.sourcestampid
    and buildrequests.buildsetid = buildsets.id
    and buildrequests.complete = 0
    and buildrequests.claimed_at = 0
    and buildername like 'Windows%'
    and buildrequests.submitted_at > 1496214000
    and changes.when_timestamp < 1496214000;

I then simply marked those jobs as completed.

update buildrequests, ids2
  set complete=1, results=2, complete_at=1496223480
  where buildrequests.id=ids2.id
    and complete=0
    and claimed_at=0;
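The epoch literals used in these queries can be decoded to check which instant they refer to. This is a minimal sketch using only the standard library. The fixed UTC-7 offset standing in for PDT is an assumption made to avoid depending on a timezone database, and `decode` is an illustrative helper name.

```python
from datetime import datetime, timedelta, timezone

# Assumption: PDT modelled as a fixed UTC-7 offset.
PDT = timezone(timedelta(hours=-7), "PDT")


def decode(epoch):
    """Return the same instant as (UTC datetime, PDT datetime)."""
    utc = datetime.fromtimestamp(epoch, tz=timezone.utc)
    return utc, utc.astimezone(PDT)


for label, epoch in (("cutoff", 1496214000), ("complete_at", 1496223480)):
    utc, pdt = decode(epoch)
    print(label, utc.isoformat(), pdt.isoformat())
```

Decoded this way, 1496214000 is 2017-05-31 07:00:00 UTC, which is 00:00 at a UTC-7 offset. It is worth confirming which zone a cutoff is meant in before reusing queries like these.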
Flags: needinfo?(aselagea)
Assignee | ||
Updated•7 years ago
|
Status: ASSIGNED → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Updated•6 years ago
|
Component: Platform Support → Buildduty
Product: Release Engineering → Infrastructure & Operations
Updated•4 years ago
|
Product: Infrastructure & Operations → Infrastructure & Operations Graveyard