Closed Bug 992969 Opened 6 years ago Closed 6 years ago

Move Android 2.3 reftests to ix slaves (ash only)

Categories

(Release Engineering :: General, defect)

x86_64
Linux
defect
Not set

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: gbrown, Assigned: kmoir)

References

Details

Attachments

(4 files, 7 obsolete files)

Android 2.3 reftests, js-reftests, and crashtests run very slowly on the aws ec2 instances we currently use -- too slowly to run them to completion for any reasonable number of jobs (see bug 980519 for real numbers and experiments on different aws instance types). My efforts to make them run faster don't seem to be succeeding. 

These tests run significantly faster on ix slaves. I'd like to demonstrate that on ash: Continue running mochitests, robocop, and xpcshell on aws, but move the reftest, js-reftest, and crashtest jobs to ix slaves.

I recently verified that reftests run well on an ix loaner: all jobs ran to completion with the current chunk scheme. Most tests passed, but I will need to make some final manifest changes to avoid a few failures.


Running multiple tests concurrently (in "sets", as we do for android x86) is an option, but I do not recommend it. Running 4 reftest jobs concurrently more than doubles run-times, and even running just 2 jobs concurrently produces a significant increase in run-time. (Also, :ryanvm recently expressed frustration with sets, and I agree that sets are harder to understand at a glance.)
Assignee: nobody → kmoir
Blocks: 967913
Attached patch bug992969.patch (obsolete) — Splinter Review
will attach builder diff
Attachment #8404122 - Flags: review?(bugspam.Callek)
Attached patch bug992969.diff (obsolete) — Splinter Review
note diff shows changes in ordering only.  On my dev master the correct slave types are assigned to the different test suites.
Comment on attachment 8404122 [details] [diff] [review]
bug992969.patch

Review of attachment 8404122 [details] [diff] [review]:
-----------------------------------------------------------------

what about:

ANDROID_2_3_MOZHARNESS_DICT = ANDROID_2_3_MOZHARNESS_ENABLED_DICT + ANDROID_2_3_MOZHARNESS_DISABLED_DICT
ANDROID_2_3_MOZHARNESS_UNITTEST_DICT = {
    'opt_unittest_suites': ANDROID_2_3_MOZHARNESS_DICT,
    'debug_unittest_suites': [],
}
now?

also if this ever goes to another tree these ENABLED/DISABLED names stink for this.
Attachment #8404122 - Flags: review?(bugspam.Callek) → review+
Attached patch bug992969-2.patch (obsolete) — Splinter Review
So I removed the ANDROID_2_3_MOZHARNESS_DICT part when I checked it in.  Armen requested the disabled and enabled distinctions in bug 989462
Attachment #8404692 - Flags: checked-in+
Hi Kim,

This has caused a jenkins failure (http://10.134.48.37:8080/job/buildbot-configs_tests/1913/changes#detail) - ok for me to roll this back?

Pete
Flags: needinfo?(kmoir)
Attachment #8404692 - Flags: checked-in+ → checked-in-
Flags: needinfo?(kmoir)
Attached patch bug992969-3.patch (obsolete) — Splinter Review
So this patch addresses the duplicate builder name on the scheduling masters that caused the breakage during the last reconfig. I had tested this on the test master, not a scheduling master this is why it didn't show up until later.  I don't really like this because it means there the test jobs look different that are spawned from the same build.  A simpler solution since there isn't much traffic on ash is to change all ash tests for Android 2.3 to run on ix machines for the duration of the test.  :gbrown what do you think?
Flags: needinfo?(gbrown)
That's fine with me. In fact, I am curious to see how mochitests, etc will run on ix.

Assuming that reftests-on-ix work out, I will want to start running reftests on ix on trunk trees, alongside the other tests running on aws. Is that going to be possible, in the longer term?
Flags: needinfo?(gbrown)
Attached patch bug992969-4.patch (obsolete) — Splinter Review
patch to run on android 2.3 tests on ix machines only for all tests on ash

re gbrown's question:

>>Assuming that reftests-on-ix work out, I will want to start running reftests >>on ix on trunk trees, alongside the other tests running on aws. Is that going >>to be possible, in the longer term?

I don't think so but we'll have to change the builder name to be unique on each platform
Attachment #8404852 - Flags: review?(bugspam.Callek)
Attachment #8404852 - Flags: review?(bugspam.Callek) → review+
Attachment #8404852 - Flags: checked-in+
Comment on attachment 8404852 [details] [diff] [review]
bug992969-4.patch

still scheduler failures, with non-unique builder names, investigating
Attachment #8404852 - Flags: checked-in+ → checked-in-
Attached patch bug992969-5.patch (obsolete) — Splinter Review
So I think the problem with the duplicate builder names is fixed with this patch.  I tested it on a scheduling master using the steps here 
https://wiki.mozilla.org/ReleaseEngineering:TestingTechniques#setup_one_master_and_output_the_steps_for_it

and it looked fine
Attachment #8406908 - Flags: review?(bugspam.Callek)
Comment on attachment 8406908 [details] [diff] [review]
bug992969-5.patch

Review of attachment 8406908 [details] [diff] [review]:
-----------------------------------------------------------------

::: mozilla-tests/mobile_config.py
@@ +100,5 @@
>  PLATFORMS['android']['vm_android_2_3'] = {
>      'name': "Android 2.3 Emulator",
>  }
> +PLATFORMS['android']['ubuntu64_hw'] = {
> +    'name': "Android 2.3 Emulator on ix",

nit if it passes without dupe names can you rename this back to "Android 2.3 Emulator" we'll get the "on ix" for free based on slave itself.

If it doesn't pass without dupe names then I think we still might have a problem (e.g. double mochitest-foo one on ix one not, on the same branch)
Attachment #8406908 - Flags: review?(bugspam.Callek) → review+
That didn't work either. Callek suggested 

Callek|Buildduty	: ahhhhh INFO - ValueError: Schedulers must have unique names, but 'tests-ash-ubuntu64_hw-opt-unittest' was a duplicate
	kmoir	huh
	Callek|Buildduty	kmoir: doing SLAVES['ubuntu64_hw_mobile'] = SLAVES['ubuntu64_hw'] and then using the former might solve your issue
Attached patch bug992969-6.patch (obsolete) — Splinter Review
Attachment #8404122 - Attachment is obsolete: true
Attachment #8404692 - Attachment is obsolete: true
Attachment #8404777 - Attachment is obsolete: true
Attachment #8404852 - Attachment is obsolete: true
Attachment #8406908 - Attachment is obsolete: true
Attached patch builder.diffSplinter Review
Attachment #8404125 - Attachment is obsolete: true
Attachment #8407648 - Flags: review?(bugspam.Callek)
Attachment #8407646 - Flags: review?(bugspam.Callek)
Comment on attachment 8407651 [details] [diff] [review]
builder.diff

not a fan of s/Emulator/Emulator on ix/

Also would be useful for a full dump-master diff (specifically of the scheduler masters)
Comment on attachment 8407648 [details] [diff] [review]
bug992969-6.patch

Review of attachment 8407648 [details] [diff] [review]:
-----------------------------------------------------------------

assuming the schedulers pass this time!

::: mozilla-tests/mobile_config.py
@@ +100,5 @@
>  PLATFORMS['android']['vm_android_2_3'] = {
>      'name': "Android 2.3 Emulator",
>  }
> +PLATFORMS['android']['ubuntu64_hw_mobile'] = {
> +    'name': "Android 2.3 Emulator on ix",

can we name this "Android 2.3 Emulator" instead?
Attachment #8407648 - Flags: review?(bugspam.Callek) → review+
Comment on attachment 8407646 [details] [diff] [review]
bug992969puppet.patch

Review of attachment 8407646 [details] [diff] [review]:
-----------------------------------------------------------------

we'll likely want a heads up to release@ about this change since dev masters will need manual update
Attachment #8407646 - Flags: review?(bugspam.Callek) → review+
Attached patch dump_master.diffSplinter Review
diff of dump_master.py output on test scheduling master
I'll remove the "on ix" part for the name when I land the bb change
Comment on attachment 8407690 [details] [diff] [review]
dump_master.diff

f+ with the above mentioned (in config patch) naming change please.
Attachment #8407690 - Flags: feedback+
The problem with the previous patch was that it listed ubuntu64_vm_mobile in the BuildSlaves.py.template instead of ubuntu64_hw_mobile.  WHen I fixed this and ran test-masters.sh it was all green :-)
Attachment #8407648 - Attachment is obsolete: true
Attachment #8407825 - Flags: review?(bugspam.Callek)
Attachment #8407825 - Flags: review?(bugspam.Callek) → review+
Comment on attachment 8407825 [details] [diff] [review]
bug992969-6b.patch

Landed this and the tests passed as green.  Thanks Callek for all your reviews.  I now know how to test for all the things....good for next time:-)
Attachment #8407825 - Flags: checked-in+
This was released to production on a reconfig last Thursday, and I've verified that it's working as expected on tbpl.  gbrown, let me know how your tests go and what the next steps will be.
I think I can close this given that bug 1004791 is now in production too.
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Component: General Automation → General
You need to log in before you can comment on or make changes to this bug.