Closed Bug 1036609 Opened 10 years ago Closed 10 years ago

More capacity on the testing side

Categories

(Infrastructure & Operations Graveyard :: CIDuty, task)

x86_64
Linux
task
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: armenzg, Unassigned)

Attachments

(2 files, 1 obsolete file)

I have been developing a lot on Ash and Try, and the lack of capacity can sometimes delay me quite a bit (1/2 day to a whole day - I have to find something else to do). The days when m-i is closed are when I can actually get results in a timely manner.

From the screenshot of slave health:
https://secure.pub.build.mozilla.org/builddata/reports/slave_health/index.html

I can see that talos-linux64-ix, t-w864-ix and t-w732-ix are our worst pools (even above tegras and Mac 10.8!).

From the recent testpool emails I can see that we could improve by adding more:
* t-ix-xp32
* t-ix-w732
* t-ix-w864
* talos-linux64-ix

We could also improve times by fixing bug 1036468.

We can't do much about mountainlion until we move to a new 10.9 test pool.
I'm surprised we're not doing well for snowleopard.

-------- Original Message --------
Subject: Wait: 85763/78.49% (testpool)
Date: Wed, 09 Jul 2014 06:01:10 -0700
From: nobody@mozilla.org
To: dev-tree-management@lists.mozilla.org
Newsgroups: mozilla.dev.tree-management

fedora: 12989
  0:    11805    90.88%
 15:      609     4.69%

linux-mock: 9710
  0:     6565    67.61%
 15:     1414    14.56%

mountainlion: 5142
  0:     3214    62.50%
 15:      471     9.16%

panda-android: 5853
  0:     5853   100.00%

snowleopard: 6482
  0:     4959    76.50%
 15:      199     3.07%

tegra: 3835
  0:     3772    98.36%
 15:       55     1.43%

ubuntu32_hw: 963
  0:      963   100.00%

ubuntu32_vm: 8450
  0:     8261    97.76%
 15:      179     2.12%

ubuntu64_hw: 993
  0:      513    51.66%
 15:      198    19.94%
 30:      102    10.27%
 45:       84     8.46%
 60:       30     3.02%

ubuntu64_vm: 10860
  0:     9596    88.36%
 15:      596     5.49%
 30:      298     2.74%
 45:      138     1.27%
 60:       62     0.57%
 75:       32     0.29%
 90:      114     1.05%
105:       24     0.22%

win7-ix: 6739
  0:     3660    54.31%
 15:     1128    16.74%
 30:      280     4.15%

win8: 7392
  0:     4018    54.36%
 15:     1131    15.30%
 30:      539     7.29%
 45:      102     1.38%

xp-ix: 6313
  0:     4127    65.37%
 15:      800    12.67%
 30:      236     3.74%
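
For anyone who wants to rank the pools from a report like the one above, here is a minimal sketch. It assumes each block is a "pool: total" header followed by "bucket: count percent" lines, and that the bucket labels are minutes waited in 15-minute bins (the email itself does not say so). The function names are hypothetical, and the sample is an abbreviated copy of three pools from the report.

```python
import re

# Abbreviated copy of three pools from the report above.
SAMPLE = """\
fedora: 12989
  0:    11805    90.88%
 15:      609     4.69%

ubuntu64_hw: 993
  0:      513    51.66%
 15:      198    19.94%
 30:      102    10.27%

win7-ix: 6739
  0:     3660    54.31%
 15:     1128    16.74%
 30:      280     4.15%
"""

# "pool-name: total" header line, e.g. "fedora: 12989".
POOL_RE = re.compile(r"^(\S+):\s+(\d+)\s*$")
# "bucket: count percent%" line, e.g. "  0:    11805    90.88%".
BUCKET_RE = re.compile(r"^\s*(\d+):\s+(\d+)\s+[\d.]+%\s*$")


def parse_wait_report(text):
    """Return {pool: {"total": int, "buckets": {bucket_start: count}}}."""
    pools = {}
    current = None
    for line in text.splitlines():
        bucket = BUCKET_RE.match(line)
        if bucket and current is not None:
            pools[current]["buckets"][int(bucket.group(1))] = int(bucket.group(2))
            continue
        header = POOL_RE.match(line)
        if header:
            current = header.group(1)
            pools[current] = {"total": int(header.group(2)), "buckets": {}}
    return pools


def no_wait_share(pool):
    """Percentage of a pool's jobs that fell in the 0 bucket."""
    return round(100.0 * pool["buckets"].get(0, 0) / pool["total"], 2)


def worst_pools(pools):
    """Pool names sorted from lowest to highest share of jobs in the 0 bucket."""
    return sorted(pools, key=lambda name: no_wait_share(pools[name]))


if __name__ == "__main__":
    report = parse_wait_report(SAMPLE)
    for name in worst_pools(report):
        print(name, no_wait_share(report[name]))
```

Sorting by the share of jobs in the 0 bucket is only one possible measure. A pool with a long tail of late buckets (ubuntu64_vm in the report above goes out to 105) may deserve a different weighting.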
Attached patch buildapi.diff (obsolete) — Splinter Review
Attachment #8453302 - Flags: review?(catlee)
Armen, I'm moving Android 2.3 jobs off of ix to a new c3.xlarge slave pool in AWS in bug 1034055. If you look at the pending count, many of the ix jobs are for Android 2.3.
Depends on: 1034055
(In reply to Armen Zambrano [:armenzg] (EDT/UTC-4) from comment #0)
> I have been developing a lot on Ash and Try and the lack of capacity can
> sometimes delay me quite a bit (1/2 day to a whole day - I have to find
> something else to do). The days when m-i is closed are when I can actually
> get results in a timely manner.

This hopefully isn't news to you. You were responsible for this up until a month ago. Welcome to life as a dev. :/

> I can see that talos-linux64-ix, t-w864-ix and t-w732-ix are our worst pools
> (even above tegras and Mac 10.8!).
> 
> From the recent testpool emails I can see that we could improve by adding
> more:
> * t-ix-xp32
> * t-ix-w732
> * t-ix-w864
> * talos-linux64-ix

Again, this was a bug you were driving until recently: bug 950226. I discussed this Monday with Amy and have put more details in that bug.
 
> We could also improve times by fixing bug 1036468.

This should be done as an offshoot of the slave pre-flight tasks that Ian is working on. I've added bug 1036468 as a dependency.
 
> We can't do much about mountainlion until we move to a new 10.9 test pool.
> I'm surprised we're not doing well for snowleopard.

I'm putting 11 new mtnlion slaves into production today: bug 1036509
Depends on: 1036509
Thank you Chris/Kim!
Attached patch buildapi.diff
Attachment #8453302 - Attachment is obsolete: true
Attachment #8453302 - Flags: review?(catlee)
Attachment #8453809 - Flags: review?(catlee)
The ix machine pool is not a problem anymore, and the mtnlion numbers with the new slaves are now better than Windows. Armen, do you still want to keep this bug open for the Windows test pool?
Attachment #8453809 - Flags: review?(catlee) → review?(kmoir)
(In reply to Kim Moir [:kmoir] from comment #6)
> The ix machine pool is not a problem anymore, and the mtnlion numbers with
> the new slaves are now better than Windows. Armen, do you still want to keep
> this bug open for the Windows test pool?

I have no preference. Whatever works for you. This is not as bad an issue for me.

For the record, if need be, a large portion of the talos-linux64 could be re-purposed for Windows.
The extra linux64 hw capacity was for Android x86, which does not seem like it will completely take off (at least from my external POV).
Attachment #8453809 - Flags: review?(kmoir) → review+
We'll be buying some more iX machines in bug 950226. Amy is working on numbers now.

We can't keep adding capacity indefinitely though, even in AWS. We need to start getting smarter about which tests we run and when. A good start would be bug 796087.
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Component: Platform Support → Buildduty
Product: Release Engineering → Infrastructure & Operations
Product: Infrastructure & Operations → Infrastructure & Operations Graveyard