Re-purpose mw32-ix-slave##, linux-ix-slave##, linux64-ix-slave##, bld-linux64-ix-05[1-3], mw32-ix-ref and linux-ix-ref as b-2008-ix-#### (rev2) machines

RESOLVED FIXED

Status

Release Engineering
Buildduty
P2
normal
RESOLVED FIXED
4 years ago
4 years ago

People

(Reporter: armenzg, Assigned: armenzg)

Tracking

Firefox Tracking Flags

(Not tracked)

Details

(Whiteboard: summary-in-comment-35 - final list of machines in comment 19)

Attachments

(9 attachments, 1 obsolete attachment)

(Assignee)

Description

4 years ago
These machines were supposed to be running on preproduction; however, preproduction didn't even exist.

Let's give them some usage until we shut scl1 down.

Thank you very much!
Assignee: relops → arich
Depends on: 940513
Depends on: 942093
(Assignee)

Comment 1

4 years ago
It seems that this bug is now completed from the relops side IIUC.

Can I take these machines and put them in the pool?
Flags: needinfo?(arich)
Status: NEW → RESOLVED
Last Resolved: 4 years ago
Flags: needinfo?(arich)
Resolution: --- → FIXED
(Assignee)

Comment 2

4 years ago
I will re-use this bug for the releng side.

Thanks for your help!
Assignee: arich → armenzg
Status: RESOLVED → REOPENED
Component: RelOps → Buildduty
Product: Infrastructure & Operations → Release Engineering
QA Contact: arich → armenzg
Resolution: FIXED → ---
(Assignee)

Comment 3

4 years ago
Created attachment 8343987 [details] [diff] [review]
add more w64 machines
Attachment #8343987 - Flags: review?(jhopkins)
(Assignee)

Comment 4

4 years ago
We're so close to esr17 being done that we might as well use this bug for the remaining machines.

linux-ix-slave01.build.scl1.mozilla.com has address 10.12.48.195
linux-ix-slave02.build.scl1.mozilla.com has address 10.12.48.196
linux-ix-slave03.build.scl1.mozilla.com has address 10.12.48.197
linux-ix-slave06.build.scl1.mozilla.com has address 10.12.48.200
linux64-ix-slave03.build.scl1.mozilla.com has address 10.12.49.46
linux64-ix-slave04.build.scl1.mozilla.com has address 10.12.49.47
linux64-ix-slave05.build.scl1.mozilla.com has address 10.12.49.48
linux64-ix-slave06.build.scl1.mozilla.com has address 10.12.49.49

I would like them to turn into:
w64-ix-slave163
w64-ix-slave164
w64-ix-slave165
w64-ix-slave166
w64-ix-slave167
w64-ix-slave168
w64-ix-slave169
w64-ix-slave170
Summary: Re-purpose linux-ix-slave0{4,5} and linux64-ix-slave0{1,2} as w64-ix-slave[159-162] (rev2) → Re-purpose linux-ix-slave## and linux64-ix-slave## as w64-ix-slave### (rev2)
(Assignee)

Comment 5

4 years ago
Created attachment 8343994 [details] [diff] [review]
win64_machines.diff
Attachment #8343987 - Attachment is obsolete: true
Attachment #8343987 - Flags: review?(jhopkins)
Attachment #8343994 - Flags: review?(jhopkins)
Comment on attachment 8343994 [details] [diff] [review]
win64_machines.diff

So here's the full list:

[159, 160, 161, 162, 163, 164, 165, 166, 167, 168, 169, 170]
Attachment #8343994 - Flags: review?(jhopkins) → review+
(Assignee)

Comment 7

4 years ago
I've found more machines in bug 849022.
I will post another patch.
Depends on: 849022
(Assignee)

Updated

4 years ago
Summary: Re-purpose linux-ix-slave## and linux64-ix-slave## as w64-ix-slave### (rev2) → Re-purpose mw32-ix-slave##, linux-ix-slave## and linux64-ix-slave## as w64-ix-slave### (rev2) machines
(Assignee)

Updated

4 years ago
Attachment #8343994 - Attachment is obsolete: true
Depends on: 947947
Depends on: 947951
(Assignee)

Updated

4 years ago
Duplicate of this bug: 784721
(Assignee)

Comment 9

4 years ago
It seems that this will be our final list of machines: 
w64-ix-slave159.build.scl1.mozilla.com - bug 942093
w64-ix-slave160.build.scl1.mozilla.com - bug 942093
w64-ix-slave161.build.scl1.mozilla.com - bug 942093
w64-ix-slave162.build.scl1.mozilla.com - bug 942093
w64-ix-slave163.build.scl1.mozilla.com - bug 947947
w64-ix-slave164.build.scl1.mozilla.com - bug 947947
w64-ix-slave165.build.scl1.mozilla.com - bug 947947
w64-ix-slave166.build.scl1.mozilla.com - bug 947947
w64-ix-slave167.build.scl1.mozilla.com - bug 947947
w64-ix-slave168.build.scl1.mozilla.com - bug 947947
w64-ix-slave169.build.scl1.mozilla.com - bug 947947
w64-ix-slave170.build.scl1.mozilla.com - bug 947947
(Assignee)

Comment 10

4 years ago
I used such dumb where clauses; whatever.
It seems like mw32-ix-slave09 and mw32-ix-slave10 had been deleted already from slavealloc.

mysql> select name  from slaves where notes like '%863236%';
+--------------------+
| name               |
+--------------------+
| linux-ix-slave01   |
| linux-ix-slave02   |
| linux-ix-slave06   |
| linux64-ix-slave03 |
| linux64-ix-slave04 |
| linux64-ix-slave05 |
| linux64-ix-slave06 |
+--------------------+
7 rows in set (0.08 sec)

mysql> delete from slaves where notes like '%863236%';
Query OK, 7 rows affected (0.08 sec)

mysql> select name  from slaves where notes like '%933768%';
+--------------------+
| name               |
+--------------------+
| linux-ix-slave04   |
| linux-ix-slave05   |
| linux64-ix-slave01 |
| linux64-ix-slave02 |
+--------------------+
4 rows in set (0.01 sec)

mysql> delete from slaves where notes like '%933768%';
Query OK, 4 rows affected (0.09 sec)

mysql> select name  from slaves where name like 'linux-ix-slave03';
+------------------+
| name             |
+------------------+
| linux-ix-slave03 |
+------------------+
1 row in set (0.31 sec)

mysql> delete from slaves where name like 'linux-ix-slave03';
Query OK, 1 row affected (0.51 sec)

mysql> select name from slaves where name like 'mw32-ix-slave%';
+-----------------+
| name            |
+-----------------+
| mw32-ix-slave01 |
| mw32-ix-slave02 |
| mw32-ix-slave03 |
| mw32-ix-slave04 |
| mw32-ix-slave05 |
| mw32-ix-slave06 |
| mw32-ix-slave07 |
| mw32-ix-slave08 |
| mw32-ix-slave11 |
| mw32-ix-slave12 |
+-----------------+
10 rows in set (0.00 sec)

mysql> delete from slaves where name like 'mw32-ix-slave%';
Query OK, 10 rows affected (0.00 sec)
(Assignee)

Comment 11

4 years ago
Created attachment 8346703 [details]
[slavealloc] add new win64 machines
(Assignee)

Comment 12

4 years ago
Created attachment 8346704 [details]
[graphs] add new win64 machines
(Assignee)

Comment 13

4 years ago
Comment on attachment 8343994 [details] [diff] [review]
win64_machines.diff

https://hg.mozilla.org/build/buildbot-configs/rev/5573505906dc

Adding 158 as well.
Attachment #8343994 - Attachment is obsolete: false
Attachment #8343994 - Flags: checked-in+
Patch is in production
(Assignee)

Comment 15

4 years ago
I've put w64-ix-slave{159..170} into the try pool after deploying the try keys.

We have these machines left to be put into the pool once bug 947951 is completed in Q1.
mw32-ix-slave01.build.mtv1.mozilla.com
mw32-ix-slave02.build.mtv1.mozilla.com
mw32-ix-slave03.build.mtv1.mozilla.com
mw32-ix-slave04.build.mtv1.mozilla.com
mw32-ix-slave05.build.mtv1.mozilla.com
mw32-ix-slave06.build.mtv1.mozilla.com
mw32-ix-slave07.build.mtv1.mozilla.com
mw32-ix-slave08.build.mtv1.mozilla.com
mw32-ix-slave09.build.mtv1.mozilla.com
mw32-ix-slave10.build.mtv1.mozilla.com
mw32-ix-slave11.build.mtv1.mozilla.com
mw32-ix-slave12.build.mtv1.mozilla.com
win32-ix-ref.build.mtv1.mozilla.com
linux-ix-ref.build.mtv1.mozilla.com
Whiteboard: waiting on dep bug
(Assignee)

Comment 16

4 years ago
I have rebooted the machines I put into production because I was looking at C:\slave\buildbot.tac instead of C:\builds\moz2_slave\buildbot.tac. Well done me!

I have not yet found out how to see those machines in here:
https://secure.pub.build.mozilla.org/builddata/reports/slave_health/slavetype.html?class=try&type=w64-ix

Comment 17

(In reply to Armen Zambrano [:armenzg] (Release Engineering) (EDT/UTC-4) (gone Thu. 12/19/2013 until 1/2/2014) from comment #16)
> I have not yet found out how to see those machines in here:
> https://secure.pub.build.mozilla.org/builddata/reports/slave_health/
> slavetype.html?class=try&type=w64-ix

I can see them on that page now.
(Assignee)

Comment 18

4 years ago
Time to add these:
b-2008-ix-0001.winbuild.releng.scl3.mozilla.com
b-2008-ix-0002.winbuild.releng.scl3.mozilla.com
b-2008-ix-0003.winbuild.releng.scl3.mozilla.com
b-2008-ix-0004.winbuild.releng.scl3.mozilla.com
b-2008-ix-0005.winbuild.releng.scl3.mozilla.com
b-2008-ix-0006.winbuild.releng.scl3.mozilla.com
b-2008-ix-0007.winbuild.releng.scl3.mozilla.com
b-2008-ix-0008.winbuild.releng.scl3.mozilla.com
b-2008-ix-0009.winbuild.releng.scl3.mozilla.com (waiting for a disk replacement)
b-2008-ix-0010.winbuild.releng.scl3.mozilla.com
b-2008-ix-0011.winbuild.releng.scl3.mozilla.com
b-2008-ix-0012.winbuild.releng.scl3.mozilla.com
b-2008-ix-0013.winbuild.releng.scl3.mozilla.com
b-2008-ix-0014.winbuild.releng.scl3.mozilla.com
b-2008-ix-0015.winbuild.releng.scl3.mozilla.com
b-2008-ix-0016.winbuild.releng.scl3.mozilla.com
b-2008-ix-0017.winbuild.releng.scl3.mozilla.com
Priority: -- → P2
Summary: Re-purpose mw32-ix-slave##, linux-ix-slave## and linux64-ix-slave## as w64-ix-slave### (rev2) machines → Re-purpose mw32-ix-slave##, linux-ix-slave## and linux64-ix-slave## as b-2008-ix-#### (rev2) machines
Whiteboard: waiting on dep bug
(Assignee)

Comment 19

4 years ago
What a messy mess I have turned this bug into.

Summary:
[IN PRODUCTION] w64-ix-slave{159..170} (try pool in scl1) come from [1]:
* linux-ix-slave0[1-6] 
* linux64-ix-slave0[1-6]

[WIP] b-2008-ix-00[01-17] (build pool in scl3) come from [2]:
* mw32-ix-slave[01-12]
* bld-linux64-ix-05[1-3] [3][4]
* win32-ix-ref 
* linux-ix-ref 

[1] bug 947947
[2] bug 947951
[3] https://bugzilla.mozilla.org/show_bug.cgi?id=948997#c0
[4] https://bugzilla.mozilla.org/show_bug.cgi?id=948997#c7
Summary: Re-purpose mw32-ix-slave##, linux-ix-slave## and linux64-ix-slave## as b-2008-ix-#### (rev2) machines → Re-purpose mw32-ix-slave##, linux-ix-slave##, linux64-ix-slave##, bld-linux64-ix-05[1-3], mw32-ix-ref and linux-ix-ref as b-2008-ix-#### (rev2) machines
Whiteboard: summary-in-comment-19
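
The ranges in the summary above can be expanded into full hostnames with a small helper. This is only a sketch: `expandHosts` and its parameters are hypothetical names, not part of any releng tool; the prefixes, number ranges and domains are the ones used in this bug.

```javascript
// Hypothetical helper: expand a numeric range into zero-padded hostnames.
// prefix/first/last/width/domain are illustrative parameter names.
function expandHosts(prefix, first, last, width, domain) {
    var hosts = [];
    for (var n = first; n <= last; n++) {
        var num = String(n);
        // Left-pad with zeros up to the requested width (e.g. 9 -> "0009").
        while (num.length < width) {
            num = "0" + num;
        }
        hosts.push(prefix + num + (domain ? "." + domain : ""));
    }
    return hosts;
}

// w64-ix-slave{159..170}: 6 linux-ix + 6 linux64-ix machines = 12 hosts.
var tryPool = expandHosts("w64-ix-slave", 159, 170, 3, "build.scl1.mozilla.com");

// b-2008-ix-00[01-17]: 12 mw32-ix + 3 bld-linux64-ix + 2 ref machines = 17 hosts.
var buildPool = expandHosts("b-2008-ix-", 1, 17, 4, "winbuild.releng.scl3.mozilla.com");

console.log(tryPool.length, buildPool.length);
```

The counts line up with the source machines listed in the summary: 6 + 6 = 12 for the try pool and 12 + 3 + 1 + 1 = 17 for the build pool.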
(Assignee)

Comment 20

4 years ago
Created attachment 8361893 [details] [diff] [review]
[configs] add b-2008-ix machines
Attachment #8361893 - Flags: review?(jhopkins)
(Assignee)

Comment 21

4 years ago
Created attachment 8361931 [details]
[graphs] add new b-2008 machines
Attachment #8361931 - Flags: checked-in+
(Assignee)

Comment 22

4 years ago
Created attachment 8361932 [details]
[slavealloc] add new b-2008 machines
Attachment #8361932 - Flags: checked-in+
(Assignee)

Updated

4 years ago
Attachment #8346703 - Flags: checked-in+
(Assignee)

Updated

4 years ago
Attachment #8346704 - Flags: checked-in+
(Assignee)

Updated

4 years ago
Depends on: 948997
(Assignee)

Updated

4 years ago
Duplicate of this bug: 949120
Attachment #8361893 - Flags: review?(jhopkins) → review+
(Assignee)

Comment 24

4 years ago
Comment on attachment 8361893 [details] [diff] [review]
[configs] add b-2008-ix machines

https://hg.mozilla.org/build/buildbot-configs/rev/b9e2b7b2e117
Attachment #8361893 - Flags: checked-in+
(Assignee)

Comment 25

4 years ago
Created attachment 8362659 [details] [diff] [review]
w2008_slave_health.diff

coop, how can I test if this is enough for slave_health?
Attachment #8362659 - Flags: feedback?(coop)
Comment on attachment 8362659 [details] [diff] [review]
w2008_slave_health.diff

Review of attachment 8362659 [details] [diff] [review]:
-----------------------------------------------------------------

You will also need to add the new slavetype here:

https://hg.mozilla.org/users/coop_mozilla.com/slave_health/file/0be8a81fe645/js/slave_health.js#l22

...and here:

https://hg.mozilla.org/users/coop_mozilla.com/slave_health/file/0be8a81fe645/js/slave_health.js#l76
Attachment #8362659 - Flags: feedback?(coop) → feedback+
(Assignee)

Comment 27

4 years ago
Created attachment 8362986 [details] [diff] [review]
[slavehealth] add b-2008 machines

coop, I'm looking at the code and I need further clarification.
You mention that a new slavetype should be added to "getSlavetypeForPendingJobs", however, we have an interesting case in here where a "w64-ix" machine can take the same jobs that a "b-2008-ix" can.

How should we deal with this case?
I'm thinking of adding it until the day we get rid of "w64-ix" machines at which point the clause will be reached.
Meanwhile, IIUC, the pending list for "b-2008-ix" will be 0 as we would be calculating it for "w64-ix" but not for "b-2008-ix".

Would this work for you?

function getSlavetypeForPendingJob(slaveclass, pending) {
    var slavetype = "";
    if (slaveclass == "try") {
        if (pending.match(/(OS X|macosx64)/)) {
            slavetype = "bld-lion-r5";
        } else if (pending.match(/(WINNT|win32|win64)/)) {
            slavetype = "w64-ix";
Attachment #8362986 - Flags: review?(coop)
in production

Comment 29

Comment on attachment 8362986 [details] [diff] [review]
[slavehealth] add b-2008 machines

Review of attachment 8362986 [details] [diff] [review]:
-----------------------------------------------------------------

HTML does not allow the same id to be repeated, and duplicates can confuse many things (e.g. getElementById()). There are fallbacks specced out to make things *slightly* saner, but they won't work well.
Attachment #8362986 - Flags: feedback-
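
As a rough way to spot the repeated ids mentioned in this review, a page's markup can be scanned for `id` values that occur more than once. This is a minimal sketch, not part of slave_health: `findDuplicateIds` is a hypothetical name, the sample markup is made up for illustration, and the regex scan only handles simple double-quoted attributes.

```javascript
// Hypothetical helper: list id attribute values that appear more than once.
// Regex-based, so it is only suitable for simple, well-formed markup.
function findDuplicateIds(html) {
    var counts = Object.create(null); // no inherited keys such as "constructor"
    var re = /\sid="([^"]*)"/g;
    var m;
    while ((m = re.exec(html)) !== null) {
        counts[m[1]] = (counts[m[1]] || 0) + 1;
    }
    return Object.keys(counts).filter(function (id) {
        return counts[id] > 1;
    });
}

// Illustrative markup only: the same slavetype id used in two tables.
var sampleHtml =
    '<table><tr id="w64-ix"></tr><tr id="b-2008-ix"></tr></table>' +
    '<table><tr id="w64-ix"></tr></table>';

console.log(findDuplicateIds(sampleHtml)); // [ 'w64-ix' ]
```

With duplicates present, `document.getElementById()` returns only one of the matching elements, so code that updates a row by id can silently touch the wrong table.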
(Assignee)

Comment 30

4 years ago
I deployed it as-is without review to quiet the cronjob.
I can adjust it once I receive the review.
http://hg.mozilla.org/users/coop_mozilla.com/slave_health/rev/128bcd9c0e58

WRT comment 29: we have repeated IDs on the HTML page. I was just propagating the existing pattern without knowing that it needed to be fixed.

Comment 31

(In reply to Armen Zambrano [:armenzg] (Release Engineering) (EDT/UTC-4) from comment #27)
> coop, I'm looking at the code and I need further clarification.
> You mention that a new slavetype should be added to
> "getSlavetypeForPendingJobs", however, we have an interesting case in here
> where a "w64-ix" machine can take the same jobs that a "b-2008-ix" can.
> 
> How should we deal with this case?
> I'm thinking of adding it until the day we get rid of "w64-ix" machines at
> which point the clause will be reached.
> Meanwhile, IIUC, the pending list for "b-2008-ix" will be 0 as we would be
> calculating it for "w64-ix" but not for "b-2008-ix".

Where do we have the most capacity, i.e. which slaveclass is *likely* to run the job?

e.g. multiple machines can pick up linux jobs, but we display the pending count for spot only because that's where we shunt most of the work these days.
(Assignee)

Comment 32

4 years ago
(In reply to Chris Cooper [:coop] from comment #31)
> (In reply to Armen Zambrano [:armenzg] (Release Engineering) (EDT/UTC-4)
> from comment #27)
> > coop, I'm looking at the code and I need further clarification.
> > You mention that a new slavetype should be added to
> > "getSlavetypeForPendingJobs", however, we have an interesting case in here
> > where a "w64-ix" machine can take the same jobs that a "b-2008-ix" can.
> > 
> > How should we deal with this case?
> > I'm thinking of adding it until the day we get rid of "w64-ix" machines at
> > which point the clause will be reached.
> > Meanwhile, IIUC, the pending list for "b-2008-ix" will be 0 as we would be
> > calculating it for "w64-ix" but not for "b-2008-ix".
> 
> Where do we have the most capacity, i.e. which slaveclass is *likely* to run
> the job?
> 
It is more likely to run on w64-ix machines. We only have 12 b-2008-ix machines.

> e.g. multiple machines can pick up linux jobs, but we display the pending
> count for spot only because that's where we shunt most of the work these
> days.

Comment 33

Comment on attachment 8362986 [details] [diff] [review]
[slavehealth] add b-2008 machines

Review of attachment 8362986 [details] [diff] [review]:
-----------------------------------------------------------------

No blockers, but you should remove the code that won't match anything before landing.

::: index.html
@@ +96,5 @@
>        </tr>
>        <tr id="w64-ix">
>          <td class="slavetype"><a href="./slavetype.html?class=try&type=w64-ix">w64-ix</td><td class="pending">0</td><td class="status"></td>
>        </tr>
> +      <tr id="b-2008-ix">

Callek is right about the duplicate ids, but since the other slaveclasses have the same problem, I won't block review on it.

::: js/slave_health.js
@@ +25,5 @@
>  	if (pending.match(/(OS X|macosx64)/)) {
>  	    slavetype = "bld-lion-r5";
>  	} else if (pending.match(/(WINNT|win32|win64)/)) {
>  	    slavetype = "w64-ix";
> +	} else if (pending.match(/(WINNT|win32|win64)/)) {

This will never match anything.

@@ +35,5 @@
>  	if (pending.match(/(OS X|macosx64)/)) {
>  	    slavetype = "bld-lion-r5";
>  	} else if (pending.match(/(WINNT|win32|win64|fuzzer-win64)/)) {
>  	    slavetype = "w64-ix";
> +	} else if (pending.match(/(WINNT|win32|win64|fuzzer-win64)/)) {

This will never match anything.
Attachment #8362986 - Flags: review?(coop) → review+
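
To illustrate why the added branches never match, here is a reduced sketch of the try-class logic. It is not the real slave_health.js: the function names and sample job names are hypothetical, and the `"b-2008-ix"` assignment inside the duplicate branch is assumed, since the diff context above only shows the condition. The regular expressions and the other slavetype strings are copied from the patch.

```javascript
// Sketch only: reduced, hypothetical version of the try-class branch.

function pickWithDuplicateBranch(pending) {
    var slavetype = "";
    if (pending.match(/(OS X|macosx64)/)) {
        slavetype = "bld-lion-r5";
    } else if (pending.match(/(WINNT|win32|win64)/)) {
        slavetype = "w64-ix";
    } else if (pending.match(/(WINNT|win32|win64)/)) {
        // Never reached: any string matching this pattern was already
        // handled by the identical test in the branch above.
        slavetype = "b-2008-ix"; // assumed body, not shown in the diff
    }
    return slavetype;
}

function pickWithoutDuplicateBranch(pending) {
    var slavetype = "";
    if (pending.match(/(OS X|macosx64)/)) {
        slavetype = "bld-lion-r5";
    } else if (pending.match(/(WINNT|win32|win64)/)) {
        slavetype = "w64-ix";
    }
    return slavetype;
}

// Made-up job names; both versions agree on every one of them.
var samples = ["WINNT 5.2 try build", "OS X 10.7 try build", "Linux try build"];
samples.forEach(function (name) {
    console.log(name, "->", pickWithDuplicateBranch(name) || "(none)");
});
```

Because an `else if` chain stops at the first condition that holds, repeating a condition adds no behaviour. Dropping the duplicate keeps Windows pending jobs attributed to "w64-ix", which matches the observation in comment 32 that those machines are the more likely ones to run the job.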
(Assignee)

Comment 34

4 years ago
It seems that the b-2008-ix machines don't have the right keys.
(Assignee)

Comment 35

4 years ago
Fixed the non-matching lines:
http://hg.mozilla.org/users/coop_mozilla.com/slave_health/rev/673b516d52b5

Waiting for the b-2008-ix to have their keys deployed before putting them back into production.
Whiteboard: summary-in-comment-19 → summary-in-comment-35 - final list of machines in comment 19
(Assignee)

Comment 36

4 years ago
It seems that in bug 930595 we stopped deploying the keys to the Windows machines.
I will deploy them manually for now.
(Assignee)

Comment 37

4 years ago
I put 0001 into production.

Followed instructions: https://wiki.mozilla.org/ReleaseEngineering/How_To/Adjust_SSH_keys_on_a_slave#Production
(Assignee)

Comment 38

4 years ago
Created attachment 8363714 [details] [diff] [review]
[configs] remove mw32-ix, linux-ix and linux64-ix machines

It seems I had missed removing them from the configs.
Attachment #8363714 - Flags: review?(coop)

Updated

4 years ago
Attachment #8363714 - Flags: review?(coop) → review+
Comment on attachment 8363714 [details] [diff] [review]
[configs] remove mw32-ix, linux-ix and linux64-ix machines

Review of attachment 8363714 [details] [diff] [review]:
-----------------------------------------------------------------

Make sure staging_config.py doesn't have any mw32 slaves too, please.
(Assignee)

Comment 40

4 years ago
Comment on attachment 8363714 [details] [diff] [review]
[configs] remove mw32-ix, linux-ix and linux64-ix machines

Staging was updated as well.
https://hg.mozilla.org/build/buildbot-configs/rev/efff4f5862a4
Attachment #8363714 - Flags: checked-in+
in production.
Depends on: 963123
(Assignee)

Comment 42

4 years ago
Disabled b-2008-ix-0001 due to bug 963123
(Assignee)

Updated

4 years ago
Blocks: 963197
(Assignee)

Comment 43

4 years ago
I've added b-2008-ix-0001 after I fixed the basedir to start with lower case 'c' rather than 'C'.
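
The basedir fix described here amounts to lower-casing the drive letter. A minimal sketch, assuming the only difference is the leading drive letter; `lowerCaseDrive` is a hypothetical name and this is not how buildbot.tac itself is edited:

```javascript
// Hypothetical helper: lower-case a leading Windows drive letter, leaving
// the rest of the path untouched.
function lowerCaseDrive(path) {
    return path.replace(/^[A-Z]:/, function (drive) {
        return drive.toLowerCase();
    });
}

console.log(lowerCaseDrive("C:\\builds\\moz2_slave")); // c:\builds\moz2_slave
```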
(Assignee)

Comment 44

4 years ago
I see two leak builds that have failed:
https://tbpl.mozilla.org/php/getParsedLog.php?id=33723503&tree=Fx-Team&full=1 - 7 test failures on make check
https://tbpl.mozilla.org/php/getParsedLog.php?id=33715327&tree=Jamun&full=1 - 7 test failures on make check

It got starred to bug 830931 (intermittent orange).

I want to see more jobs run.

FAILURES:
TIMEOUTS:
    --ion-eager --ion-parallel-compile=off --ion-check-range-analysis --no-sse3 c:\builds\moz2_slave\fx-team-w32-d-0000000000000000\build\js\src\jit-test\tests\basic\bug710947.js
    c:\builds\moz2_slave\fx-team-w32-d-0000000000000000\build\js\src\jit-test\tests\basic\bug710947.js
    --ion-eager --ion-parallel-compile=off c:\builds\moz2_slave\fx-team-w32-d-0000000000000000\build\js\src\jit-test\tests\basic\bug710947.js
    --baseline-eager c:\builds\moz2_slave\fx-team-w32-d-0000000000000000\build\js\src\jit-test\tests\basic\bug710947.js
    --baseline-eager --no-ti --no-fpu c:\builds\moz2_slave\fx-team-w32-d-0000000000000000\build\js\src\jit-test\tests\basic\bug710947.js
    --no-baseline --no-ion --no-ti c:\builds\moz2_slave\fx-team-w32-d-0000000000000000\build\js\src\jit-test\tests\basic\bug710947.js
    --no-baseline --no-ion c:\builds\moz2_slave\fx-team-w32-d-0000000000000000\build\js\src\jit-test\tests\basic\bug710947.js
Result summary:
Passed: 27524
Failed: 7
(Assignee)

Comment 45

4 years ago
I rebooted the other 16 b-2008-ix machines into production.
I will check again on them.
(Assignee)

Comment 46

4 years ago
I fixed them and put them back into the pool.
0009 should be ready now as well.
Depends on: 966771
(Assignee)

Updated

4 years ago
Status: REOPENED → RESOLVED
Last Resolved: 4 years ago
Resolution: --- → FIXED