implement c3.xlarge slave class for Linux64 test spot instances

RESOLVED FIXED

Status

Release Engineering
Platform Support
RESOLVED FIXED
3 years ago
3 years ago

People

(Reporter: kmoir, Assigned: kmoir)

Tracking

(Blocks: 1 bug)

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(12 attachments, 11 obsolete attachments)

6.69 KB, patch
rail
: review+
kmoir
: checked-in+
Details | Diff | Splinter Review
9.44 KB, patch
rail
: review+
kmoir
: checked-in+
Details | Diff | Splinter Review
1.71 KB, patch
rail
: review+
kmoir
: checked-in+
Details | Diff | Splinter Review
20.19 KB, text/plain
rail
: review+
Details
7.06 KB, patch
rail
: review+
kmoir
: checked-in+
Details | Diff | Splinter Review
1.02 KB, patch
rail
: review+
kmoir
: checked-in+
Details | Diff | Splinter Review
1.11 KB, patch
rail
: review+
kmoir
: checked-in+
Details | Diff | Splinter Review
12.18 KB, patch
rail
: review+
Details | Diff | Splinter Review
2.03 KB, patch
rail
: review+
kmoir
: checked-in+
Details | Diff | Splinter Review
12.92 KB, patch
kmoir
: checked-in+
Details | Diff | Splinter Review
800 bytes, patch
rail
: review+
kmoir
: checked-in+
Details | Diff | Splinter Review
20.39 KB, text/plain
Details
(Assignee)

Description

3 years ago
We are having capacity issues on ix because Android 2.3 Armv6 jobs are now enabled on trunk.  gbrown tested a c3.xlarge instance, and the reftests et al. now run to completion on this instance type. They didn't before.  We should therefore enable a slave class in AWS for Linux64 test spot instances on this instance type so that we can migrate the 2.3 tests that are currently running on ix to AWS.
(Assignee)

Updated

3 years ago
Assignee: nobody → kmoir
(Assignee)

Comment 1

3 years ago
Created attachment 8450274 [details] [diff] [review]
bug1034055.patch

Initial patch for cloud tools to enable c3.xlarge as a new slave class

I'll add the patches for the other repos later.  As an FYI, I plan to deploy these changes on ash first to ensure they don't break things.

I'm not sure how to match this platform in slavealloc.py here so it gets the correct slaves
http://hg.mozilla.org/build/cloud-tools/file/bfd92f8744a5/cloudtools/slavealloc.py#l70
The same thing applies to the regular expression for the builder types in watch_pending.cfg, since the tests will still be split across the existing instance types and the new c3.xlarge slaves.

I'm not sure what the ami id should be in the configs for tst-emulator64.  Does this get generated automatically when puppet creates it?
Attachment #8450274 - Flags: feedback?(rail)
Comment on attachment 8450274 [details] [diff] [review]
bug1034055.patch

Review of attachment 8450274 [details] [diff] [review]:
-----------------------------------------------------------------

lgtm!

::: configs/instance2ami.json
@@ +21,5 @@
> +     "regions": ["us-east-1", "us-west-2"]},
> +    {"ami-config": "ubuntu-12.04-x86_64-desktop",
> +     "instance-config": "tst-emulator64",
> +     "ssh-key": "aws-releng",
> +     "ssh-user": "ubuntu",

This code is not used anymore, but won't hurt.

::: configs/tst-emulator64
@@ +5,5 @@
> +        "domain": "test.releng.use1.mozilla.com",
> +        "ami": "ami-e48e1e8d",
> +        "subnet_ids": ["subnet-ae35ccc4", "subnet-8f32cbe5", "subnet-ff3542d7",
> +                       "subnet-b8643190", "subnet-fb97bc8f", "subnet-844b7ec2",
> +                       "subnet-ed35cc87", "subnet-5cd0d828", "subnet-7ca5f03a"],

I hope we have enough IP addresses! :)

@@ +7,5 @@
> +        "subnet_ids": ["subnet-ae35ccc4", "subnet-8f32cbe5", "subnet-ff3542d7",
> +                       "subnet-b8643190", "subnet-fb97bc8f", "subnet-844b7ec2",
> +                       "subnet-ed35cc87", "subnet-5cd0d828", "subnet-7ca5f03a"],
> +        "security_group_ids": ["sg-f0f1239f"],
> +        "instance_type": "c3.xlarge",

FTR. instance_type from this file is used only for on-demand instances.

@@ +11,5 @@
> +        "instance_type": "c3.xlarge",
> +        "distro": "ubuntu",
> +        "ssh_key": "aws-releng",
> +        "use_public_ip": true,
> +        "instance_profile_name": "tst-emulator64",

TODO: This will require adding a new profile in Amazon IAM. Copy paste of tst-linux64 IAM role will work.

@@ +17,5 @@
> +            "/dev/sda1": {
> +                "size": 20,
> +                "volume_type": "gp2",
> +                "instance_dev": "/dev/xvda1"
> +            }

Since this instance type comes with ephemeral storage, it's worth copying the /dev/sdb and /dev/sdc entries from http://hg.mozilla.org/build/cloud-tools/file/bfd92f8744a5/configs/bld-linux64#l21 (they are harmless).

@@ +23,5 @@
> +        "tags": {
> +            "moz-type": "tst-emulator64"
> +        }
> +    },
> +    "us-west-2": {

the same comments as above for usw2

::: configs/tst-emulator64.cloud-init
@@ +1,1 @@
> +#cloud-config

You can just symlink this file to tst-linux64 since they are identical

::: configs/watch_pending.cfg
@@ +23,5 @@
>          "^Ubuntu Mulet VM 12.04 x64.*": "tst-linux64",
>          "^Ubuntu ASAN VM 12.04 x64.*": "tst-linux64",
>          "^b2g_(emulator|ubuntu64)_vm": "tst-linux64",
>          "^Android 2.3( Armv6)? Emulator(?:(?!plain-reftest|crashtest|jsreftest).)*$": "tst-linux64",
> +        "^Android 2.3( Armv6)? Emulator(?:(?!plain-reftest|crashtest|jsreftest).)*$": "tst-emulator64",

This implies that no new tests are going to be run on this instance type. Only moving existing ones from m1.medium to c3.xlarge. Correct?

Everything except this line can (actually should) be landed in advance, so we generate AMIs and probably test them manually.
Attachment #8450274 - Flags: feedback?(rail) → feedback+
(In reply to Rail Aliiev [:rail] from comment #2)
> > +        "instance_profile_name": "tst-emulator64",
> 
> TODO: This will require adding a new profile in Amazon IAM. Copy paste of
> tst-linux64 IAM role will work.

Done.
Added the "golden" DNS entries (they will be used to puppetize the AMIs):

invtool A create --ip 10.134.50.244 --fqdn tst-emulator64-ec2-golden.test.releng.use1.mozilla.com  --private  --description "Golden AMI"
invtool PTR create --ip 10.134.50.244 --target tst-emulator64-ec2-golden.test.releng.use1.mozilla.com  --private --description "Golden AMI"

invtool A create --ip 10.132.49.72 --fqdn tst-emulator64-ec2-golden.test.releng.usw2.mozilla.com  --private  --description "Golden AMI"
invtool PTR create --ip 10.132.49.72 --target tst-emulator64-ec2-golden.test.releng.usw2.mozilla.com  --private --description "Golden AMI"
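
The four commands above follow one pattern: an A record and a matching PTR record per region. A minimal sketch that generates them from (IP, FQDN) pairs is below. The helper name `invtool_commands` is hypothetical; the flags are only the ones already shown in this comment, and the script prints the commands rather than running them.

```python
# Golden hosts copied from the comment above: (IP, FQDN) per region.
GOLDEN_HOSTS = [
    ("10.134.50.244", "tst-emulator64-ec2-golden.test.releng.use1.mozilla.com"),
    ("10.132.49.72", "tst-emulator64-ec2-golden.test.releng.usw2.mozilla.com"),
]


def invtool_commands(ip, fqdn, description="Golden AMI"):
    """Return the A and PTR command lines for one host (hypothetical helper)."""
    a_record = 'invtool A create --ip %s --fqdn %s --private --description "%s"' % (
        ip, fqdn, description)
    ptr_record = 'invtool PTR create --ip %s --target %s --private --description "%s"' % (
        ip, fqdn, description)
    return [a_record, ptr_record]


if __name__ == "__main__":
    # Print only; review the output before pasting it into a shell.
    for ip, fqdn in GOLDEN_HOSTS:
        for command in invtool_commands(ip, fqdn):
            print(command)
```

Keeping the forward and reverse records in one helper makes it harder for the two to drift apart when an address changes.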
(In reply to Kim Moir [:kmoir] from comment #1)
> I'll add the patches for the other repos later.  As a fyi I plan to deploy
> these changes on ash first to ensure it doesn't break things.
> 
> I'm not sure how to match this platform in slavealloc.py here so it gets the
> correct slaves
> http://hg.mozilla.org/build/cloud-tools/file/bfd92f8744a5/cloudtools/
> slavealloc.py#l70

We may need to use something else to distinguish these... Maybe speed... The best approach would be adding a new column in the slavealloc DB table, I think.

> Same thing applies to the regular expression for the builder types in
> watch_pending.cfg since the tests will still be split across the existing
> instance types and the new c3.xlarge slaves

Are we going to leave existing ones on m1.medium and use c3.xlarge for plain-reftest|crashtest|jsreftest?

> 
> I'm not sure what the ami id should be in the configs for tst-emulator64. 

The current value is OK. This is a base AMI, no puppet applied. This AMI is used to bootstrap and puppetize instances.

> Does this get generated automatically when puppet creates it?

Once we puppetize the "golden" AMIs (daily) we tag and publish them. Then aws_watch_pending.py uses http://hg.mozilla.org/build/cloud-tools/file/bfd92f8744a5/cloudtools/aws/ami.py#l127 to figure out the latest usable AMI.
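
As a rough illustration of "figure out the latest usable AMI", the sketch below picks the newest image carrying a given moz-type tag. It is not the code in ami.py: the image records are plain dicts standing in for whatever the EC2 API returns, and the `id` and `created` keys and the AMI ids are hypothetical. Only the `moz-type` tag name comes from the configs in this bug.

```python
def latest_usable_ami(images, moz_type):
    """Return the newest image tagged with moz_type, or None if there is none.

    Each image is a dict with hypothetical keys: "id", "created" (an ISO 8601
    timestamp string) and "tags" (a dict of tag name to value).
    """
    candidates = [img for img in images if img["tags"].get("moz-type") == moz_type]
    if not candidates:
        return None
    # ISO 8601 timestamps in the same timezone sort correctly as strings.
    return max(candidates, key=lambda img: img["created"])


# Made-up sample data for illustration only.
SAMPLE_IMAGES = [
    {"id": "ami-00000001", "created": "2014-07-07T06:00:00Z",
     "tags": {"moz-type": "tst-emulator64"}},
    {"id": "ami-00000002", "created": "2014-07-08T06:00:00Z",
     "tags": {"moz-type": "tst-emulator64"}},
    {"id": "ami-00000003", "created": "2014-07-08T06:00:00Z",
     "tags": {"moz-type": "tst-linux64"}},
]

if __name__ == "__main__":
    print(latest_usable_ami(SAMPLE_IMAGES, "tst-emulator64"))
```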
(Assignee)

Comment 6

3 years ago
In regard to comment #5, yes, I plan to leave the existing tests that can run on m1.medium where they are, because it isn't worth using a more expensive instance type than we need.  Thanks for all the feedback, I'm working on new patches.
(Assignee)

Comment 7

3 years ago
Created attachment 8450372 [details] [diff] [review]
bug1034055-2.patch
Attachment #8450274 - Attachment is obsolete: true
Attachment #8450372 - Flags: feedback?(rail)
Comment on attachment 8450372 [details] [diff] [review]
bug1034055-2.patch

Review of attachment 8450372 [details] [diff] [review]:
-----------------------------------------------------------------

::: configs/tst-emulator64
@@ +17,5 @@
> +            "/dev/sda1": {
> +                "size": 20,
> +                "volume_type": "gp2",
> +                "instance_dev": "/dev/xvda1"
> +            }

^ JSON syntax error :) missing comma

@@ +53,5 @@
> +            "/dev/sda1": {
> +                "size": 20,
> +                "volume_type": "gp2",
> +                "instance_dev": "/dev/xvda1"
> +            }

the same here
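
A missing comma like this can be caught before review by parsing the config with a JSON parser. The sketch below is an assumption about how one might check it, not part of cloud-tools: the fragment is trimmed to the block quoted above, and the empty "/dev/sdb" entry is a placeholder for whatever follows it in the real file.

```python
import json

# Trimmed fragment: the closing brace of "/dev/sda1" has no trailing comma.
BROKEN = """{
    "/dev/sda1": {
        "size": 20,
        "volume_type": "gp2",
        "instance_dev": "/dev/xvda1"
    }
    "/dev/sdb": {}
}"""

# Same fragment with the comma restored.
FIXED = BROKEN.replace('}\n    "/dev/sdb"', '},\n    "/dev/sdb"')


def json_error(text):
    """Return the parser's error message, or None if the text is valid JSON."""
    try:
        json.loads(text)
    except ValueError as exc:  # json.JSONDecodeError is a subclass of ValueError
        return str(exc)
    return None


if __name__ == "__main__":
    print("broken:", json_error(BROKEN))
    print("fixed: ", json_error(FIXED))
```

Running `python -m json.tool` on the file from a shell gives the same kind of check without writing any code.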

::: configs/watch_pending.cfg
@@ +23,5 @@
>          "^Ubuntu Mulet VM 12.04 x64.*": "tst-linux64",
>          "^Ubuntu ASAN VM 12.04 x64.*": "tst-linux64",
>          "^b2g_(emulator|ubuntu64)_vm": "tst-linux64",
>          "^Android 2.3( Armv6)? Emulator(?:(?!plain-reftest|crashtest|jsreftest).)*$": "tst-linux64",
> +        "^Android 2.3( Armv6)? Emulator(?:(?!plain-reftest|crashtest|jsreftest).)*$": "tst-emulator64",

I think you need:

"^Android 2.3( Armv6)? Emulator.* (plain-reftest|crashtest|jsreftest).*": "tst-emulator64",

(?!blah) means "not blah".

I'd test the regexp against the actual builder names... I know, this is bad :)

Once we switch to allthethings.json, this horrible block will be gone!
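
To make the difference between the two patterns concrete, here is a small sketch that runs both against a few builder names. The names are illustrative and not taken from a real builder list, and checking the new pattern before the old one is an assumption made for this example, not a statement about how watch_pending evaluates its map.

```python
import re

# Existing pattern: matches Android 2.3 emulator builders whose name does NOT
# contain plain-reftest, crashtest or jsreftest anywhere after "Emulator".
OLD_PATTERN = r"^Android 2.3( Armv6)? Emulator(?:(?!plain-reftest|crashtest|jsreftest).)*$"
# Suggested pattern: matches only builders that DO name one of those suites.
NEW_PATTERN = r"^Android 2.3( Armv6)? Emulator.* (plain-reftest|crashtest|jsreftest).*"

# Illustrative builder names (assumed, for demonstration only).
BUILDERS = [
    "Android 2.3 Armv6 Emulator ash opt test plain-reftest-7",
    "Android 2.3 Emulator ash opt test crashtest",
    "Android 2.3 Emulator ash opt test mochitest-1",
]


def instance_type(builder):
    """Return the instance type whose pattern matches, or None."""
    if re.match(NEW_PATTERN, builder):
        return "tst-emulator64"
    if re.match(OLD_PATTERN, builder):
        return "tst-linux64"
    return None


if __name__ == "__main__":
    for name in BUILDERS:
        print("%-60s -> %s" % (name, instance_type(name)))
```

Because the negative lookahead is applied at every character after "Emulator", the old pattern and the new one never match the same builder name, so the order of the two checks does not change the result here.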
Attachment #8450372 - Flags: feedback?(rail) → feedback+
(Assignee)

Comment 9

3 years ago
Created attachment 8450384 [details] [diff] [review]
bug1034055puppet.patch
Attachment #8450384 - Flags: review?(rail)
Attachment #8450384 - Flags: review?(rail) → review+
(Assignee)

Updated

3 years ago
Attachment #8450384 - Flags: checked-in+
(Assignee)

Comment 10

3 years ago
Created attachment 8450470 [details] [diff] [review]
bug1034055-3.patch

I tested the regexp against builder names and it worked.  Like you said, I wouldn't land buildermap change in watch_pending.cfg until we tested the AMIs
Attachment #8450372 - Attachment is obsolete: true
Attachment #8450470 - Flags: review?(rail)
Comment on attachment 8450470 [details] [diff] [review]
bug1034055-3.patch

Review of attachment 8450470 [details] [diff] [review]:
-----------------------------------------------------------------

It looks fine to me as a not final patch for 2 reasons:

* we still need to distinguish tst-linux64 and tst-emulator64 in http://hg.mozilla.org/build/cloud-tools/file/bfd92f8744a5/cloudtools/slavealloc.py#l45

::: configs/watch_pending.cfg
@@ +23,5 @@
>          "^Ubuntu Mulet VM 12.04 x64.*": "tst-linux64",
>          "^Ubuntu ASAN VM 12.04 x64.*": "tst-linux64",
>          "^b2g_(emulator|ubuntu64)_vm": "tst-linux64",
>          "^Android 2.3( Armv6)? Emulator(?:(?!plain-reftest|crashtest|jsreftest).)*$": "tst-linux64",
> +        "^Android 2.3( Armv6)? Emulator.* (plain-reftest|crashtest|jsreftest).*": "tst-emulator64",

This would immediately start trying to start new instance types before they are ready to go (AMIs).

Can you put this hunk in a separate patch?
Attachment #8450470 - Flags: review?(rail) → review-
(Assignee)

Comment 12

3 years ago
Created attachment 8450507 [details] [diff] [review]
bug1034055-4.patch
Attachment #8450470 - Attachment is obsolete: true
Attachment #8450507 - Flags: review?(rail)
(Assignee)

Comment 13

3 years ago
Created attachment 8450530 [details] [diff] [review]
bug1034055bb.patch

buildbot patch, still need to test
(Assignee)

Comment 14

3 years ago
Created attachment 8450535 [details] [diff] [review]
bug1034055wp.patch

patch for watch pending after we test AMIs
Attachment #8450535 - Flags: review?(rail)
Comment on attachment 8450507 [details] [diff] [review]
bug1034055-4.patch

Review of attachment 8450507 [details] [diff] [review]:
-----------------------------------------------------------------

::: cloudtools/slavealloc.py
@@ +76,3 @@
>         slave.get("trustlevel") == "try":
>          return "tst-linux64"
> +    

Please kill the trailing space when you land it.

Kim, can you also file a bug to make this method more straightforward? Using speed here works as a temporary workaround, but it may hit us in the future at some point. Please assign it to me.
Attachment #8450507 - Flags: review?(rail) → review+
Attachment #8450535 - Flags: review?(rail) → review+
Kim and Rail, thanks for working so quickly through all this yesterday - fantastic job! I can't quite believe how much you got done yesterday - really great work. =)
(Assignee)

Updated

3 years ago
Attachment #8450507 - Flags: checked-in+
(Assignee)

Comment 17

3 years ago
regarding comment 15, I opened bug 1034674
(Assignee)

Comment 18

3 years ago
So I couldn't get an ami to generate and I'm stuck so I think I'll wait until more people are back on Monday to help me debug.  

I tried to invoke the puppet script by hand but it failed because it couldn't create a c3.xlarge image in us-east-1b.

As an aside, I thought adding -r us-west-2 to the script parameters in cron.pp for this instance type would help, but when I invoke it I see no error messages or log output, and no AMI is created.  So I'm not sure what's going on.
(Assignee)

Comment 19

3 years ago
I talked to rail this morning and the problem was the address assigned to tst-emulator64-ec2-golden.test.releng.use1.mozilla.com.  He said to modify the config to use these subnets:

  "subnet_ids": ["subnet-35a9835e", "subnet-8f21eeee", "subnet-8c20efed",
                        "subnet-173ff076", "subnet-0aa98361", "subnet-33a98358"],
        "security_group_ids": ["sg-f0f1239f"],

then run python free_ips.py -c /tmp/tst-emulator64 -n 1 and update the A and PTR records for ec2-golden.test.releng.use1.mozilla.com to reflect the new address. Once this is propagated, I should be able to rerun the AMI generation script successfully.
See https://bugzilla.mozilla.org/show_bug.cgi?id=1034034#c11

Maybe we can still run these tests on IX after all?
Flags: needinfo?(kmoir)
Flags: needinfo?(coop)
(In reply to Pete Moore [:pete][:pmoore] from comment #20)
> See https://bugzilla.mozilla.org/show_bug.cgi?id=1034034#c11
> 
> Maybe we can still run these tests on IX after all?

The builders are configured differently from the testers, so we can't just bulk out the existing pools by re-imaging the builders.

However, since we're effectively creating a new pool of AWS machines for mobile testing, there's no reason why we couldn't use a new hardware pool instead, provided the build machines meet the other prerequisites for use as test machines. I'm specifically worried about the minimum graphics requirements here because I don't think the builders have graphics cards and, because of their design/configuration, may not be able to accept new cards if required.

I'm still a bigger fan of pushing work into AWS if possible, but I'll let Kim decide the best course of action here. If we want to try an experiment, we can pull a windows ix builder to test.
Flags: needinfo?(coop)
(Assignee)

Comment 22

3 years ago
I'm going to proceed with testing on AWS right now, get my patches tested on my dev-master, and see if we can get that working on a branch, since we seem to be pretty close there and this would resolve some immediate capacity issues.

If we did decide to go with the builder ix machines we would have to get ateam to test them since they are a different hardware ref than our existing ix pool.
Flags: needinfo?(kmoir)
(Assignee)

Updated

3 years ago
Depends on: 1035270
(Assignee)

Comment 23

3 years ago
Created attachment 8452081 [details] [diff] [review]
bug1034055bb.patch wip

patches to enable on ash. The builder diff looks good.  I just can't get my AWS loaner to talk to my dev-master due to networking issues so I can invoke sendchanges.  Will have to talk to people tomorrow morning and see how I can get them to connect.
Attachment #8450535 - Attachment is obsolete: true
(Assignee)

Updated

3 years ago
Attachment #8450530 - Attachment is obsolete: true
(Assignee)

Comment 24

3 years ago
Talked to nthomas and got my dev-master ports fixed. Invoked sendchanges and running some tests overnight with this slave.
(Assignee)

Comment 25

3 years ago
The loaner script added the wrong instance type so the tests timed out last night.  I changed my loaner instance to c3.xlarge and now the tests are running green.  I'll attach patches to enable this configuration on ash.
(Assignee)

Updated

3 years ago
Depends on: 1035863
(Assignee)

Comment 26

3 years ago
Created attachment 8452349 [details] [diff] [review]
bug1034055bb-2.patch

patches to enable on ash
Attachment #8452081 - Attachment is obsolete: true
(Assignee)

Comment 27

3 years ago
Created attachment 8452352 [details] [diff] [review]
bug1034055puppet-2.patch

We need two names for the slave classes, just like last time: one for armv6, one for 2.3.  I can delete the ix slave classes that I created earlier once this is in production.
(Assignee)

Comment 28

3 years ago
It turns out the slave I used last night didn't have the correct AMI or instance type.  I had to recreate the loaner twice today because there were some problems in our configs.  I'm now working through issues where the tests don't complete. I don't think it's a problem with the instance, just the test setup; still investigating.

http://dev-master1.srv.releng.scl3.mozilla.com:8036/builders/Android%202.3%20Armv6%20Emulator%20ash%20opt%20test%20plain-reftest-7/builds/3/steps/run_script/logs/stdio
Duplicate of this bug: 1031083
Blocks: 994920
(Assignee)

Comment 30

3 years ago
So it seems ash is in a bad state and this was the source of all the errors I saw.  When I changed my dev-master to run these tests on m-c and invoked the associated sendchanges, the testing is proceeding.
(Assignee)

Updated

3 years ago
Blocks: 1031083
(Assignee)

Comment 31

3 years ago
Comment on attachment 8452349 [details] [diff] [review]
bug1034055bb-2.patch

I won't land this until the slavealloc entries, which I will attach shortly, are in place, but I'm asking for review since it's ready.
Attachment #8452349 - Flags: review?(rail)
(Assignee)

Comment 32

3 years ago
Comment on attachment 8452352 [details] [diff] [review]
bug1034055puppet-2.patch

ubuntu64_vm_large will still be used for Android 2.3 tests but I need a separate name for armv6 tests to avoid builder name contention
Attachment #8452352 - Flags: review?(rail)
(Assignee)

Updated

3 years ago
Attachment #8452349 - Flags: review?(rail)
(Assignee)

Comment 33

3 years ago
Created attachment 8453274 [details] [diff] [review]
bug1034055bb-3.patch
Attachment #8452349 - Attachment is obsolete: true
Attachment #8453274 - Flags: review?(rail)
(Assignee)

Comment 34

3 years ago
Created attachment 8453275 [details]
emulatorslave

List of slaves to add to slavealloc
Attachment #8453275 - Flags: review?(rail)
Attachment #8452352 - Flags: review?(rail) → review+
Attachment #8453275 - Flags: review?(rail) → review+
(Assignee)

Updated

3 years ago
Blocks: 1036609
(Assignee)

Comment 35

3 years ago
Created attachment 8453319 [details] [diff] [review]
bug10340550-4.patch
Attachment #8453274 - Attachment is obsolete: true
Attachment #8453274 - Flags: review?(rail)
Attachment #8453319 - Flags: review?(rail)
Comment on attachment 8453319 [details] [diff] [review]
bug10340550-4.patch

Review of attachment 8453319 [details] [diff] [review]:
-----------------------------------------------------------------

::: mozilla-tests/production_config.py
@@ +69,5 @@
>  for i in range(1,100) + range(301,400):
>      SLAVES['ubuntu64_vm']['tst-linux64-ec2-%03i' % i] = {}
>  
> +for i in range(1,100) + range(301,400):
> +    SLAVES['ubuntu64_vm_large']['tst-emulator64-spot-%03i' % i] = {}  

Can you kill the trailing spaces above.
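
For reference, the loop in the hunk above is Python 2, where `range()` returns lists that can be joined with `+`. A minimal Python 3 sketch of the same name generation is below; the helper name `emulator_slave_names` is hypothetical and `SLAVES` is reduced to the one key the hunk touches.

```python
from itertools import chain


def emulator_slave_names(prefix="tst-emulator64-spot"):
    """Return slave names for indexes 1..99 and 301..399, zero-padded to 3 digits."""
    # In Python 3, range objects cannot be concatenated with "+", so chain them.
    return ["%s-%03i" % (prefix, i) for i in chain(range(1, 100), range(301, 400))]


# Reduced stand-in for the SLAVES dict in production_config.py.
SLAVES = {"ubuntu64_vm_large": {name: {} for name in emulator_slave_names()}}

if __name__ == "__main__":
    names = emulator_slave_names()
    print(len(names), names[0], names[-1])
```

The two ranges give 99 names each, so this block defines 198 slave names in total, from tst-emulator64-spot-001 through tst-emulator64-spot-399 with a gap between 099 and 301.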
Attachment #8453319 - Flags: review?(rail) → review+
(Assignee)

Updated

3 years ago
Attachment #8450535 - Attachment is obsolete: false
(Assignee)

Comment 37

3 years ago
Comment on attachment 8453275 [details]
emulatorslave

imported to slavealloc
(Assignee)

Comment 38

3 years ago
Tests all ran green on m-c on my dev-master on this image. So I'll get ready to deploy on ash, then to other branches.
(Assignee)

Updated

3 years ago
Attachment #8452352 - Flags: checked-in+
(Assignee)

Comment 39

3 years ago
First I have to finish setting up two new masters in AWS to handle the additional slaves in bug 1035863
(Assignee)

Comment 40

3 years ago
Created attachment 8454517 [details] [diff] [review]
bug1034055wp-2.patch
Attachment #8450535 - Attachment is obsolete: true
Attachment #8454517 - Flags: review?(rail)
Attachment #8454517 - Flags: review?(rail) → review+
(Assignee)

Updated

3 years ago
Attachment #8453319 - Flags: checked-in+
(Assignee)

Updated

3 years ago
Attachment #8454517 - Flags: checked-in+
(Assignee)

Comment 41

3 years ago
In production and merged m-c to ash to see if the emulator images are spun up correctly for Android 2.3 Armv6 tests on this branch.
Once you deploy the changes to all branches, we should also adjust the slave_health regexes like in http://hg.mozilla.org/build/slave_health/rev/5a892e2e304f
Hi Rail, Kim,

I noticed we are getting emails like these, but maybe this is solved by comment 42?


On 14 Jul 2014, at 17:41, Cron Daemon <root@cruncher.srv.releng.scl3.mozilla.com> wrote:

Unknown slave_type for test: tst-emulator64-spot-059
Unknown slave_type for test: tst-emulator64-spot-058
Unknown slave_type for test: tst-emulator64-spot-325
Unknown slave_type for test: tst-emulator64-spot-324
Unknown slave_type for test: tst-emulator64-spot-323
Unknown slave_type for test: tst-emulator64-spot-322
Unknown slave_type for test: tst-emulator64-spot-321
Unknown slave_type for test: tst-emulator64-spot-320
Unknown slave_type for test: tst-emulator64-spot-051
Unknown slave_type for test: tst-emulator64-spot-050
Unknown slave_type for test: tst-emulator64-spot-053
Unknown slave_type for test: tst-emulator64-spot-052
.....
.....
(Assignee)

Comment 44

3 years ago
Created attachment 8455396 [details] [diff] [review]
bug1034055allbranches.patch

Patch to enable on relevant branches.  The builder diff just shows an ordering difference.

I'll attach a patch for the cloud-tools too to remove Ash from the regexp.  Also, I'll remove the slave classes from puppet after we roll this out because there will be jobs in progress when we reconfig.
Attachment #8455396 - Flags: review?(rail)
(Assignee)

Comment 45

3 years ago
Created attachment 8455406 [details] [diff] [review]
bug1034055wp-3.patch

watch all branches not just ash
Attachment #8455406 - Flags: review?(rail)
Comment on attachment 8455406 [details] [diff] [review]
bug1034055wp-3.patch

I think "^Android 2.3( Armv6)? Emulator.*" is what you want, the rest is redundant.
Attachment #8455406 - Flags: review?(rail) → review+
(Assignee)

Comment 47

3 years ago
Created attachment 8455434 [details] [diff] [review]
bug1034055-slavehealth.patch

adjust slave health
Attachment #8455434 - Flags: review?(rail)
Comment on attachment 8455434 [details] [diff] [review]
bug1034055-slavehealth.patch

Review of attachment 8455434 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/slave_health.js
@@ +91,5 @@
>  	} else if (pending.match(/(Ubuntu HW 12.04 x64|b2g_ics_armv7a_gecko_emulator_hw|b2g_emulator_hw)/)) {
>  	    slavetype = "talos-linux64-ix";
>  	} else if (pending.match(/Android (?:4.2 )?x86/)) {
>  	    slavetype = "talos-linux64-ix";
>  	} else if (pending.match(/Android 2.3( Armv6)? Emulator(?:(?!plain-reftest|crashtest|jsreftest).)*$/)) {

you also need to change the regexp to match the same regexp in watch_pending.cfg

@@ +92,5 @@
>  	    slavetype = "talos-linux64-ix";
>  	} else if (pending.match(/Android (?:4.2 )?x86/)) {
>  	    slavetype = "talos-linux64-ix";
>  	} else if (pending.match(/Android 2.3( Armv6)? Emulator(?:(?!plain-reftest|crashtest|jsreftest).)*$/)) {
> +	    slavetype = "tst-emulator64-spot";

Even though the line above looks ok, it's not enough. grepping slave_health gives me a lot of entries for tst-linux64, probably you need more changes to define this type. Coop may know more.
Attachment #8455434 - Flags: review?(rail) → review-
(In reply to Pete Moore [:pete][:pmoore] from comment #43)
> Hi Rail, Kim,
> 
> I noticed we are getting emails like these, but maybe this is solved by
> comment 42?
> 
> 
> On 14 Jul 2014, at 17:41, Cron Daemon
> <root@cruncher.srv.releng.scl3.mozilla.com> wrote:
> 
> Unknown slave_type for test: tst-emulator64-spot-059

root@cruncher?!! /me goes to get rid of this.
(In reply to Rail Aliiev [:rail] from comment #49)
> (In reply to Pete Moore [:pete][:pmoore] from comment #43)
> > Hi Rail, Kim,
> > 
> > I noticed we are getting emails like these, but maybe this is solved by
> > comment 42?
> > 
> > 
> > On 14 Jul 2014, at 17:41, Cron Daemon
> > <root@cruncher.srv.releng.scl3.mozilla.com> wrote:
> > 
> > Unknown slave_type for test: tst-emulator64-spot-059
> 
> root@cruncher?!! /me goes to get rid of this.


I think this is slave_health. phew...
(Assignee)

Comment 51

3 years ago
Created attachment 8455481 [details] [diff] [review]
bug1034055allbranches-2.patch
Attachment #8455396 - Attachment is obsolete: true
Attachment #8455396 - Flags: review?(rail)
Attachment #8455481 - Flags: review?(rail)
Comment on attachment 8455481 [details] [diff] [review]
bug1034055allbranches-2.patch

Review of attachment 8455481 [details] [diff] [review]:
-----------------------------------------------------------------

Can you also remove ubuntu64_hw_mobile from BuildSlaves.py.template when you land?

A separate patch to remove ubuntu64_hw_mobile from puppet is also appreciated.

::: mozilla-tests/mobile_config.py
@@ +1426,2 @@
>              'debug_unittest_suites': [],
>          },       

Can you also kill the trailing space above when you land this.

@@ +1599,5 @@
>          ANDROID_2_3_AWS_DICT['opt_unittest_suites'].append(suite)
>  
>  # enable android 2.3 tests to ride the trains bug 1004791
>  for name, branch in items_at_least(BRANCHES, 'gecko_version', 32):
>      # Loop removes it from any branch that gets beyond here

as a follow-up, can you file a bug to fix this loop using items_before()? It's much more convenient for merge duty patches.
Attachment #8455481 - Flags: review?(rail) → review+
(Assignee)

Comment 53

3 years ago
Created attachment 8455508 [details] [diff] [review]
bug1034055puppetremove.patch

remove old slave classes from puppet once buildbot config patches are landed and reconfiged
Attachment #8455508 - Flags: review?(rail)
(Assignee)

Updated

3 years ago
Blocks: 1038320
(Assignee)

Comment 54

3 years ago
Created attachment 8455516 [details] [diff] [review]
bug1034055allbranches-3.patch

better patch than comment 52 + bug filed for loop in bug 1038320
(Assignee)

Comment 55

3 years ago
Comment on attachment 8455434 [details] [diff] [review]
bug1034055-slavehealth.patch

coop is going to add the stuff to slave health so I won't worry about writing a patch for it.  Thanks coop!
Attachment #8455434 - Attachment is obsolete: true
Attachment #8455508 - Flags: review?(rail) → review+
(Assignee)

Comment 56

3 years ago
Tests are green on ash so I'll land my patches to enable on all branches and reconfig again first thing tomorrow morning.  Reconfigs are fast when people in California are still sleeping and the load is light on the masters :-)
(Assignee)

Updated

3 years ago
Attachment #8455516 - Flags: checked-in+
(Assignee)

Comment 57

3 years ago
Comment on attachment 8455406 [details] [diff] [review]
bug1034055wp-3.patch

like this as rail suggested
Android 2.3( Armv6)? Emulator.*
Attachment #8455406 - Flags: checked-in+
(Assignee)

Comment 58

3 years ago
In production
(Assignee)

Comment 59

3 years ago
We are seeing a few problems cloning hg on some spot instances, but this seems to be bug 1036176.

I can see in slave health that the entire pool of spot instances is up and running jobs.  However, there are still ~800 pending jobs so we'll have to wait and see if we need to expand the pool, especially given that there are b2g tests that want to run on this same instance type in bug 1031083.
(Assignee)

Comment 60

3 years ago
Created attachment 8456351 [details] [diff] [review]
bug1034055moarinstances.patch

more instances to reduce pending
Attachment #8456351 - Flags: review?(rail)
(Assignee)

Comment 61

3 years ago
Created attachment 8456352 [details]
emulatorlist2.txt

add new instances to slavealloc
Attachment #8456352 - Flags: review?(rail)
Attachment #8456351 - Flags: review?(rail) → review+
Comment on attachment 8456352 [details]
emulatorlist2.txt

conditional r+:
s/tst-linux64-spot/tst-emulator64-spot/

(it would fail inserting into the db)
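
The substitution above can be applied to a list of hostnames before import with a few lines of Python. This is only a sketch: the attachment's actual format is not shown in this bug, so the one-hostname-per-line input and the sample names are assumptions.

```python
def fix_hostnames(lines, old="tst-linux64-spot", new="tst-emulator64-spot"):
    """Replace the first occurrence of old with new on each line, like sed s/old/new/."""
    return [line.replace(old, new, 1) for line in lines]


if __name__ == "__main__":
    # Hypothetical input lines, one hostname each.
    sample = ["tst-linux64-spot-400", "tst-linux64-spot-401"]
    for hostname in fix_hostnames(sample):
        print(hostname)
```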
Attachment #8456352 - Flags: review?(rail) → review+
(Assignee)

Comment 63

3 years ago
Comment on attachment 8455508 [details] [diff] [review]
bug1034055puppetremove.patch

and merged
Attachment #8455508 - Flags: checked-in+
(Assignee)

Updated

3 years ago
Attachment #8456351 - Flags: checked-in+
(Assignee)

Comment 64

3 years ago
Created attachment 8456406 [details]
emulatorlist2.txt

actually the hostnames were wrong, this is what I added to the db
Attachment #8456352 - Attachment is obsolete: true
(Assignee)

Updated

3 years ago
Blocks: 1038941
(Assignee)

Updated

3 years ago
Blocks: 1039227
(Assignee)

Comment 65

3 years ago
Comment on attachment 8456406 [details]
emulatorlist2.txt

added to and enabled in slavealloc
(Assignee)

Comment 66

3 years ago
New slave pool in production
(Assignee)

Comment 67

3 years ago
New instances are up but the pending count is still high (~1200).  Will watch it over the next 24 hours and see how it keeps up with load.
(Assignee)

Comment 68

3 years ago
Pending count looks good.  Closing.
Status: NEW → RESOLVED
Last Resolved: 3 years ago
Resolution: --- → FIXED