Closed Bug 1031083 Opened 6 years ago Closed 6 years ago

buildbot changes to run selected b2g tests on c3.xlarge

Categories

(Release Engineering :: General, defect)


Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: jgriffin, Assigned: kmoir)

Attachments

(7 files, 1 obsolete file)

We have a few B2G test suites that fail on the current m1.medium nodes, either consistently or frequently, because they become CPU bound.

I've experimented with different node types in bug 1026800 and found that the tests seem to run acceptably (albeit slowly) on c3.large instances.

Can we create a new platform that would allow us to assign jobs selectively to this AWS node type?
Works for me!
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → DUPLICATE
Duplicate of bug: 1034055
Assignee: nobody → kmoir
Assignee: kmoir → nobody
This isn't a duplicate of bug 1034055.  Bug 1034055 will implement the new slave class.  But there are still changes required in the buildbot configs so that the b2g tests use the new slave class which this bug should address.
Status: RESOLVED → REOPENED
Depends on: 1034055
Resolution: DUPLICATE → ---
Summary: Add new buildbot platform for running tests on c3.large AWS instances → buildbot changes to run selected b2g tests on c3.xlarge
Bug 1034055 is fixed, though we may need to scale the pool further for this. Could we get a list of suites that should move to the more powerful instances?
There are only two jobs that need this node type, both only running on cedar:

Gip (aka gaia-ui-test) on B2G ICS Emulator Opt
mochitest-media on B2G ICS Emulator Opt

Once they're running on cedar on the new node type, we'll still likely have to do some work to green them up before we can roll them out everywhere.

So, in the short term, the impact on the slave pool should be minimal.
kmoir, is this something you could do relatively quickly?
Flags: needinfo?(kmoir)
It would take a day or so to test, write patches and get reviews, barring any unforeseen problems.  We need to create a new slave class for the instance type, so this means puppet and cloud tools changes in addition to buildbot-configs.  According to coop, my first priority is bug 1019724 right now, but I can ask him when he returns tomorrow if this is a higher priority.
Flags: needinfo?(kmoir)
I talked to coop.  He was hesitant to add more jobs given our current load (89K test jobs yesterday!!).  Since bug 1042835 will remove a significant number of jobs from the slave class, I'll fix that bug first and then take a look at this one. But bug 1019724 is my current top priority.
Depends on: 1042835
Assignee: nobody → kmoir
Attached patch bug1031083.patch
Attachment #8466333 - Flags: review?(nthomas)
new puppet slave class
Attachment #8466336 - Flags: review?(nthomas)
cloud tools watch pending
Attachment #8466338 - Flags: review?(nthomas)
Attached file bug1031083builder.diff (obsolete)
builder diff
Comment on attachment 8466333 [details] [diff] [review]
bug1031083.patch

Review of attachment 8466333 [details] [diff] [review]:
-----------------------------------------------------------------

lgtm. We can clean up later based on Jonathan's responses.

::: mozilla-tests/b2g_config.py
@@ +123,5 @@
>      'hg_bin': 'hg',
>      'reboot_command': ['/tools/buildbot/bin/python'] + MOZHARNESS_REBOOT_CMD,
>  }
>  
> +PLATFORMS['emulator']['slave_platforms'] = ['ubuntu64_vm-b2g-emulator', 'ubuntu64_vm-b2g-lg-emulator', 'ubuntu64_hw-b2g-emulator']

Looks like ubuntu64_hw-b2g-emulator can be deprecated, assuming we have no further plans to use it. jgriffin ?

@@ +1640,5 @@
>  BRANCHES['cedar']['branch_name'] = "Cedar"
>  BRANCHES['cedar']['repo_path'] = "projects/cedar"
>  BRANCHES['cedar']['mozharness_tag'] = "default"
>  BRANCHES['cedar']['platforms']['emulator']['ubuntu64_vm-b2g-emulator']['opt_unittest_suites'] = \
> +    MOCHITEST + CRASHTEST + XPCSHELL + MARIONETTE + JSREFTEST + GAIA_UI + CPPUNIT

jgriffin, are we wanting to run gaia ui-test side-by-side on slow and fast VMs ? If not, it looks like GAIA_UI could be removed here.

::: mozilla-tests/production_config.py
@@ +93,5 @@
>  SLAVES['ubuntu32_vm-b2gdt'] = SLAVES['ubuntu32_vm']
>  SLAVES['ubuntu64_vm-b2g'] = SLAVES['ubuntu64_vm']
>  SLAVES['ubuntu64_vm-b2gdt'] = SLAVES['ubuntu64_vm']
>  SLAVES['ubuntu64_vm-b2g-emulator'] = SLAVES['ubuntu64_vm']
> +SLAVES['ubuntu64_vm-b2g-lg-emulator'] = SLAVES['ubuntu64_vm_large'] 

Nit, trailing whitespace.
Attachment #8466333 - Flags: review?(nthomas) → review+
Attachment #8466336 - Flags: review?(nthomas) → review+
Comment on attachment 8466338 [details] [diff] [review]
bug1031083c-t.patch

Not sure what the trailing .* is for, given builder names of
  b2g_emulator_vm_large cedar opt test gaia-ui-test
  b2g_emulator_vm_large cedar opt test mochitest-media
but otherwise looks good.
Attachment #8466338 - Flags: review?(nthomas) → review+
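Editorial sketch on the trailing `.*` question: whether the suffix matters depends on how the pattern is applied, which this bug does not show. The snippet below assumes Python's `re` module and uses two hypothetical patterns (the actual regex from the cloud tools patch is not quoted here); only the builder names come from the review comment.

```python
import re

# Builder names quoted in the review comment.
builders = [
    "b2g_emulator_vm_large cedar opt test gaia-ui-test",
    "b2g_emulator_vm_large cedar opt test mochitest-media",
]

# Hypothetical patterns for illustration; not taken from the patch.
prefix_only = re.compile(r"b2g_emulator_vm_large")
with_wildcard = re.compile(r"b2g_emulator_vm_large.*")

# re.match anchors only at the start, so the trailing .* is redundant.
prefix_results = [
    (bool(prefix_only.match(b)), bool(with_wildcard.match(b))) for b in builders
]

# re.fullmatch must consume the whole string, so the trailing .* is required.
full_results = [
    (bool(prefix_only.fullmatch(b)), bool(with_wildcard.fullmatch(b)))
    for b in builders
]

print(prefix_results)
print(full_results)
```

If the matching code uses prefix semantics (`match` or `search`), the `.*` is harmless but unnecessary; with whole-string matching it is what lets the pattern cover the branch and suite parts of the name.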
(In reply to Nick Thomas [:nthomas] from comment #13)
> Comment on attachment 8466333 [details] [diff] [review]
> bug1031083.patch
> 
> Review of attachment 8466333 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> lgtm. We can clean up later based on Jonathan's responses.
> 
> ::: mozilla-tests/b2g_config.py
> @@ +123,5 @@
> >      'hg_bin': 'hg',
> >      'reboot_command': ['/tools/buildbot/bin/python'] + MOZHARNESS_REBOOT_CMD,
> >  }
> >  
> > +PLATFORMS['emulator']['slave_platforms'] = ['ubuntu64_vm-b2g-emulator', 'ubuntu64_vm-b2g-lg-emulator', 'ubuntu64_hw-b2g-emulator']
> 
> Looks like ubuntu64_hw-b2g-emulator can be deprecated, assuming we have no
> further plans to use it. jgriffin ?

That's correct, it can be nuked.

> 
> @@ +1640,5 @@
> >  BRANCHES['cedar']['branch_name'] = "Cedar"
> >  BRANCHES['cedar']['repo_path'] = "projects/cedar"
> >  BRANCHES['cedar']['mozharness_tag'] = "default"
> >  BRANCHES['cedar']['platforms']['emulator']['ubuntu64_vm-b2g-emulator']['opt_unittest_suites'] = \
> > +    MOCHITEST + CRASHTEST + XPCSHELL + MARIONETTE + JSREFTEST + GAIA_UI + CPPUNIT
> 
> jgriffin, are we wanting to run gaia ui-test side-by-side on slow and fast
> VMs ? If not, it looks like GAIA_UI could be removed here.

No, we only need the fast VM; we know it doesn't work at all on the slow one.

> 
> ::: mozilla-tests/production_config.py
> @@ +93,5 @@
> >  SLAVES['ubuntu32_vm-b2gdt'] = SLAVES['ubuntu32_vm']
> >  SLAVES['ubuntu64_vm-b2g'] = SLAVES['ubuntu64_vm']
> >  SLAVES['ubuntu64_vm-b2gdt'] = SLAVES['ubuntu64_vm']
> >  SLAVES['ubuntu64_vm-b2g-emulator'] = SLAVES['ubuntu64_vm']
> > +SLAVES['ubuntu64_vm-b2g-lg-emulator'] = SLAVES['ubuntu64_vm_large'] 
> 
> Nit, trailing whitespace.
patch to address whitespace, remove duplicate gaia-ui tests and deprecate ubuntu64_hw-b2g-emulator as per review comments
Attachment #8466339 - Attachment is obsolete: true
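Editorial sketch of what the review comments ask for, not the contents of the updated patch: `ubuntu64_hw-b2g-emulator` is dropped from the slave platforms and gaia-ui-test runs only on the large VM platform. The platform keys come from the reviewed diff; the suite lists are placeholder strings (the real buildbot-configs definitions are richer), `MOCHITEST_MEDIA` and `platforms_running` are hypothetical names, and placing mochitest-media only on the large platform is an assumption based on the job list earlier in this bug.

```python
SLOW = "ubuntu64_vm-b2g-emulator"
LARGE = "ubuntu64_vm-b2g-lg-emulator"

# Placeholder suite lists; assumed to be plain lists of suite names.
MOCHITEST = ["mochitest"]
CRASHTEST = ["crashtest"]
XPCSHELL = ["xpcshell"]
MARIONETTE = ["marionette"]
JSREFTEST = ["jsreftest"]
CPPUNIT = ["cppunit"]
GAIA_UI = ["gaia-ui-test"]
MOCHITEST_MEDIA = ["mochitest-media"]  # hypothetical constant name

# ubuntu64_hw-b2g-emulator removed, as agreed in the review.
PLATFORMS = {"emulator": {"slave_platforms": [SLOW, LARGE]}}

# Cedar emulator suites per slave platform; GAIA_UI no longer on the slow VMs.
cedar_emulator = {
    SLOW: {
        "opt_unittest_suites": MOCHITEST + CRASHTEST + XPCSHELL
        + MARIONETTE + JSREFTEST + CPPUNIT
    },
    LARGE: {"opt_unittest_suites": GAIA_UI + MOCHITEST_MEDIA},
}


def platforms_running(suite):
    """Return the slave platforms whose opt suites include `suite`."""
    return [
        p
        for p in PLATFORMS["emulator"]["slave_platforms"]
        if suite in cedar_emulator[p]["opt_unittest_suites"]
    ]


print(platforms_running("gaia-ui-test"))
```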
Comment on attachment 8466336 [details] [diff] [review]
bug1031083puppet.patch

and merged to production
Attachment #8466336 - Flags: checked-in+
fixed cloud tools patch
remove ubuntu64_hw-b2g-emulator from puppet after reconfig
Attachment #8467815 - Flags: review?(nthomas)
Comment on attachment 8467806 [details] [diff] [review]
bug1031083c-t-2.patch

except removed extra , at eol
Attachment #8467806 - Flags: checked-in+
In production
Attachment #8467815 - Flags: review?(nthomas) → review+
Attachment #8467815 - Flags: checked-in+
Verified on tbpl that they are running on the correct instance type.  However, the GIP tests failed with a harness failure (red) and the gaia ui tests are orange. :jgriffin could you investigate?
Yes, I will look.  Thanks for making the switch!  Greening up the tests will be a separate project, so I think we can close this as resolved.
Status: REOPENED → RESOLVED
Closed: 6 years ago → 6 years ago
Resolution: --- → FIXED
Component: General Automation → General