More capacity on the testing side

RESOLVED FIXED

Status

Release Engineering
Platform Support
3 years ago
3 years ago

People

(Reporter: armenzg, Unassigned)

Tracking

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(2 attachments, 1 obsolete attachment)

(Reporter)

Description

3 years ago
Created attachment 8453300 [details]
Screenshot from 2014-07-09 16:21:03.png

I have been developing a lot on Ash and Try, and the lack of capacity can sometimes delay me quite a bit (1/2 day to a whole day - I have to find something else to do). The days when m-i is closed are the ones when I can actually get results in a timely manner.

From the screenshot of slave health:
https://secure.pub.build.mozilla.org/builddata/reports/slave_health/index.html

I can see that talos-linux64-ix, t-w864-ix and t-w732-ix are our worst pools (even above tegras and Mac 10.8!).

From the recent testpool emails I can see that we could improve by adding more:
* t-xp32-ix
* t-w732-ix
* t-w864-ix
* talos-linux64-ix

We could also improve times by fixing bug 1036468.

We can't do much about mountainlion until we move to a new 10.9 test pool.
I'm surprised we're not doing well for snowleopard.

-------- Original Message --------
Subject: Wait: 85763/78.49% (testpool)
Date: Wed, 09 Jul 2014 06:01:10 -0700
From: nobody@mozilla.org
To: dev-tree-management@lists.mozilla.org
Newsgroups: mozilla.dev.tree-management

fedora: 12989
  0:    11805    90.88%
 15:      609     4.69%

linux-mock: 9710
  0:     6565    67.61%
 15:     1414    14.56%

mountainlion: 5142
  0:     3214    62.50%
 15:      471     9.16%

panda-android: 5853
  0:     5853   100.00%

snowleopard: 6482
  0:     4959    76.50%
 15:      199     3.07%

tegra: 3835
  0:     3772    98.36%
 15:       55     1.43%

ubuntu32_hw: 963
  0:      963   100.00%

ubuntu32_vm: 8450
  0:     8261    97.76%
 15:      179     2.12%

ubuntu64_hw: 993
  0:      513    51.66%
 15:      198    19.94%
 30:      102    10.27%
 45:       84     8.46%
 60:       30     3.02%

ubuntu64_vm: 10860
  0:     9596    88.36%
 15:      596     5.49%
 30:      298     2.74%
 45:      138     1.27%
 60:       62     0.57%
 75:       32     0.29%
 90:      114     1.05%
105:       24     0.22%

win7-ix: 6739
  0:     3660    54.31%
 15:     1128    16.74%
 30:      280     4.15%

win8: 7392
  0:     4018    54.36%
 15:     1131    15.30%
 30:      539     7.29%
 45:      102     1.38%

xp-ix: 6313
  0:     4127    65.37%
 15:      800    12.67%
 30:      236     3.74%
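
For anyone who wants to compare pools from one of these testpool emails without eyeballing it, here is a minimal sketch. It assumes each unindented "name: N" line gives a pool and its total job count, and each row under it is "wait bucket in minutes: job count percent%". That layout is inferred from the email above, not from the report generator itself, and the names parse_wait_report and rank_by_immediate_start are made up for this example. The sample text is a subset of the rows quoted above.

```python
import re

# Assumed line shapes, inferred from the email above (not from the generator):
#   "<pool>: <total jobs>"
#   "<wait bucket in minutes>: <job count> <percent>%"
POOL_LINE = re.compile(r"^([A-Za-z][\w-]*):\s+(\d+)\s*$")
BUCKET_LINE = re.compile(r"^\s*(\d+):\s+(\d+)\s+[\d.]+%\s*$")


def parse_wait_report(text):
    """Return {pool: {"total": int, "buckets": {minutes: count}}}."""
    pools = {}
    current = None
    for line in text.splitlines():
        pool = POOL_LINE.match(line)
        if pool:
            current = pool.group(1)
            pools[current] = {"total": int(pool.group(2)), "buckets": {}}
            continue
        bucket = BUCKET_LINE.match(line)
        if bucket and current is not None:
            pools[current]["buckets"][int(bucket.group(1))] = int(bucket.group(2))
    return pools


def rank_by_immediate_start(pools):
    """Sort pools worst first by the share of jobs in the 0-minute bucket."""
    shares = {
        name: data["buckets"].get(0, 0) / data["total"]
        for name, data in pools.items()
        if data["total"] > 0
    }
    return sorted(shares.items(), key=lambda item: item[1])


# A subset of the rows from the email, copied as-is.
SAMPLE = """\
fedora: 12989
  0:    11805    90.88%
 15:      609     4.69%

ubuntu64_hw: 993
  0:      513    51.66%
 15:      198    19.94%

win7-ix: 6739
  0:     3660    54.31%
 15:     1128    16.74%

win8: 7392
  0:     4018    54.36%
 15:     1131    15.30%
"""

if __name__ == "__main__":
    for name, share in rank_by_immediate_start(parse_wait_report(SAMPLE)):
        print(f"{name:12s} {share:6.2%}")
```

The share is recomputed from the raw counts rather than read from the percent column, so the ranking does not depend on how the report rounds its percentages.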
(Reporter)

Comment 1

3 years ago
Created attachment 8453302 [details] [diff] [review]
buildapi.diff
Attachment #8453302 - Flags: review?(catlee)

Comment 2

3 years ago
Armen, I'm moving Android 2.3 jobs off of ix to a new c3.xlarge slave pool in AWS in bug 1034055. If you look at the pending count, many of the ix jobs are for Android 2.3.
Depends on: 1034055

Comment 3

3 years ago
(In reply to Armen Zambrano [:armenzg] (EDT/UTC-4) from comment #0)
> I have been developing a lot on Ash and Try, and the lack of capacity can
> sometimes delay me quite a bit (1/2 day to a whole day - I have to find
> something else to do). The days when m-i is closed are the ones when I can
> actually get results in a timely manner.

This hopefully isn't news to you. You were responsible for this up until a month ago. Welcome to life as a dev. :/

> I can see that talos-linux64-ix, t-w864-ix and t-w732-ix are our worst pools
> (even above tegras and Mac 10.8!).
> 
> From the recent testpool emails I can see that we could improve by adding
> more:
> * t-xp32-ix
> * t-w732-ix
> * t-w864-ix
> * talos-linux64-ix

Again, this was a bug you were driving until recently: bug 950226. I discussed this Monday with Amy and have put more details in that bug.
 
> We could also improve times by fixing bug 1036468.

This should be done as an offshoot of the slave pre-flight tasks that Ian is working on. I've added bug 1036468 as a dependency.
 
> We can't do much about mountainlion until we move to a new 10.9 test pool.
> I'm surprised we're not doing well for snowleopard.

I'm putting 11 new mtnlion slaves into production today: bug 1036509
Depends on: 1036509
(Reporter)

Comment 4

3 years ago
Thank you Chris/Kim!
(Reporter)

Comment 5

3 years ago
Created attachment 8453809 [details] [diff] [review]
buildapi.diff
Attachment #8453302 - Attachment is obsolete: true
Attachment #8453302 - Flags: review?(catlee)
Attachment #8453809 - Flags: review?(catlee)

Comment 6

3 years ago
The ix machine pool is not a problem anymore; the mtnlion numbers with the new slaves are now better than Windows. Armen, do you still want to keep this bug open for the Windows test pool?
(Reporter)

Updated

3 years ago
Attachment #8453809 - Flags: review?(catlee) → review?(kmoir)
(Reporter)

Comment 7

3 years ago
(In reply to Kim Moir [:kmoir] from comment #6)
> The ix machine pool is not a problem anymore; the mtnlion numbers with the
> new slaves are now better than Windows. Armen, do you still want to keep
> this bug open for the Windows test pool?

I have no preference. Whatever works for you. This is not as bad of an issue for me.

For the record, if need be, a large portion of the talos-linux64 pool could be repurposed for Windows.
The extra linux64 hw capacity was for Android x86, which does not seem like it will completely take off (at least from my external POV).

Updated

3 years ago
Attachment #8453809 - Flags: review?(kmoir) → review+
(Reporter)

Comment 8

3 years ago
Comment on attachment 8453809 [details] [diff] [review]
buildapi.diff

https://hg.mozilla.org/build/buildapi/rev/3fde162e8a20
Attachment #8453809 - Flags: checked-in+

Comment 9

3 years ago
We'll be buying some more iX machines in bug 950226. Amy is working on numbers now.

We can't keep adding capacity indefinitely though, even in AWS. We need to start getting smarter about which tests we run and when. A good start would be bug 796087.
Status: NEW → RESOLVED
Last Resolved: 3 years ago
Resolution: --- → FIXED