Closed Bug 1143901 Opened 9 years ago Closed 9 years ago

Add more capacity in AWS for tests (tst-linux64-spot)

Categories

(Release Engineering :: General, defect)

x86
All
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: nthomas, Assigned: massimo)

Details

Attachments

(3 files)

There are currently 1500 linux64 test jobs running and 1300+ pending (mostly on try). It's time to add more masters and testers to keep up with the growing push load.

See bug 1090139 for the last time we did this.
Summary: Add more capacity in AWS for tests → Add more capacity in AWS for tests (tst-linux64-spot)
Our master lag metrics look OK so far, so maybe we could add a few hundred slaves first before worrying about masters?
I also filed bug 1143681 yesterday to make sure we're using the capacity we already have.
Bug 1090568 was the last time we added more slaves.
Assignee: nobody → mgervasini
Attachment #8578759 - Flags: review?(rail)
Attached file slavealloc.csv
slavealloc changes
Attachment #8578812 - Flags: review?(rail)
Attachment #8578759 - Flags: review?(rail) → review+
Attachment #8578782 - Flags: review?(rail) → review+
Attachment #8578812 - Attachment mime type: text/csv → text/plain
Comment on attachment 8578812 [details]
slavealloc.csv

Make sure you add this only after the reconfig for the buildbot-configs patch has happened. Otherwise cloud-tools will start using these names and get unauthorized login errors from the masters.
Attachment #8578812 - Flags: review?(rail) → review+
Comment on attachment 8578782 [details] [diff] [review]
[buildbot-configs] Bug 1143901 - Increase the number of tst-linux64-spot instances; pep8 fixes.patch

Thanks, rail.

Landed: https://hg.mozilla.org/build/buildbot-configs/rev/6e758f0c91ca
Attachment #8578782 - Flags: checked-in+
Comment on attachment 8578812 [details]
slavealloc.csv

Imported into slavealloc.
Attachment #8578812 - Flags: checked-in+
The cloud-tools change landed several hours ago, but the number of running jobs hasn't risen above our previous limit of ~1500 (and there are pending jobs which should drive it higher). I would have expected the combined effect to be an extra 400 slaves enabled. Are there more steps to complete here?
I think we forgot to enable the slaves in slavealloc (passing --enable during the initial import would have done that).
Enabled in slavealloc:
* tst-linux64-spot-{1700-1719} us-west-2
* tst-linux64-spot-{1900-1919} us-east-1

If there are no issues, I am going to enable more instances tomorrow morning (European time).
Enabled in slavealloc:
* tst-linux64-spot-{1720-1739} us-west-2
* tst-linux64-spot-{1920-1939} us-east-1
Enabled in slavealloc:
* tst-linux64-spot-{1740-1769} us-west-2
* tst-linux64-spot-{1940-1969} us-east-1
Enabled in slavealloc:
* tst-linux64-spot-{1770-1799} us-west-2
* tst-linux64-spot-{1970-1999} us-east-1

Adding more in the next hour
All the new tst-linux64-spot-{1700-2099} instances are now enabled in slavealloc
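For anyone checking the numbers in this bug: the `{start-end}` shorthand used in the comments can be expanded with a few lines of Python. This is only a sketch. The `expand` helper is hypothetical and is not part of slavealloc or cloud-tools, and it assumes the ranges are inclusive. Under that assumption, the full range comes to the 400 slaves expected earlier, and the four staged batches account for 200 of them.

```python
import re


def expand(pattern):
    """Expand 'prefix{start-end}' into a list of names (inclusive range).

    Hypothetical helper for the shorthand used in this bug's comments;
    it is not a slavealloc or cloud-tools function.
    """
    match = re.fullmatch(r"(.*)\{(\d+)-(\d+)\}", pattern)
    if match is None:
        raise ValueError(f"not a range pattern: {pattern!r}")
    prefix = match.group(1)
    start, end = int(match.group(2)), int(match.group(3))
    if start > end:
        raise ValueError(f"empty range in {pattern!r}")
    return [f"{prefix}{n}" for n in range(start, end + 1)]


# The full set of new slaves named in the final comment.
all_new = expand("tst-linux64-spot-{1700-2099}")

# The staged batches listed in the earlier "Enabled in slavealloc" comments.
batches = [
    "tst-linux64-spot-{1700-1719}", "tst-linux64-spot-{1900-1919}",
    "tst-linux64-spot-{1720-1739}", "tst-linux64-spot-{1920-1939}",
    "tst-linux64-spot-{1740-1769}", "tst-linux64-spot-{1940-1969}",
    "tst-linux64-spot-{1770-1799}", "tst-linux64-spot-{1970-1999}",
]
staged = set()
for batch in batches:
    staged.update(expand(batch))

# Slaves covered by the final "all enabled" comment but not by a listed batch.
remaining = sorted(set(all_new) - staged)

print(len(all_new), len(staged), len(remaining))
```

The remaining names fall in 1800-1899 and 2000-2099, which were not itemised in a batch comment before everything was enabled.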
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Component: General Automation → General