Closed Bug 915465 Opened 8 years ago Closed 8 years ago

Pushes to try should not trigger tegra test jobs by default

Categories

(Release Engineering :: General, defect)

Hardware: x86, OS: All, Type: defect, Priority: Not set, Severity: blocker

Tracking

(Not tracked)

RESOLVED WONTFIX

People

(Reporter: joduinn, Assigned: Callek)

Attachments

(1 file, 1 obsolete file)

Until now, pushes to try have defaulted to testing android builds on pandas (android4) and also on tegras (android2.2).

During this week's b2g workweek, our panda pool is keeping up with load just fine, but our smaller tegra pool is getting overrun, and 976 of the pending 1078 tegra jobs are from try. So we are changing the default as follows:

* pandas: no change, still tested by default
* tegras: no longer included by default; anyone who wants Android 2.2-specific testing on tegras on try will have to request it explicitly using the usual try syntax.


After this workweek, we can revisit whether this needs to be reverted. 



Note: this change is for the default on try only. There is no change to what any other (non-try) branches do for testing on tegras; those remain as-is.
So, after some IRC chat we *think* we can achieve the try_by_default solution for android 2.2 opt jobs, but we have never tested that before.

With that change we expect to have syntax like:

  try: -b o -p all -u all               (just gives Pandas)
  try: -b o -p all -u all[Tegra]        (same, but just gives tegras)
  try: -b o -p all -u all[Tegra,Panda]  (gives both)

Tonight I'll attach and deploy a simpler patch that, as a first step, just turns armv6 and noion off by default.

Data for the armv6/noion disable alone:

Before job count
Armv6:   M(10)+R(8)+1      =  19 jobs
2.2 Opt: M(10)+R(8)+1+T(8) =  27 jobs
NoIon:   R(3)              =  3 jobs
                           === 49 jobs

After job count
2.2 Opt: M(10)+R(8)+1+T(8) =  27 jobs

Total job reduction: 22 jobs
Try load compared to today: 55%
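
The totals above can be re-derived with a few lines of Python. This is only a check of the arithmetic quoted in the comment; the dictionary layout and variable names are illustrative and are not part of any buildbot config.

```python
# Job counts per try push, copied from the figures above.
# M, R and T are the group labels used in the comment; the bare "1" is the
# single extra job listed alongside them.
before = {
    "Armv6": 10 + 8 + 1,        # M(10)+R(8)+1
    "2.2 Opt": 10 + 8 + 1 + 8,  # M(10)+R(8)+1+T(8)
    "NoIon": 3,                 # R(3)
}

# After the armv6/noion disable, only the 2.2 opt jobs remain.
after = {"2.2 Opt": before["2.2 Opt"]}

total_before = sum(before.values())
total_after = sum(after.values())
reduction = total_before - total_after
load_pct = round(100 * total_after / total_before)

print(total_before, total_after, reduction, load_pct)  # 49 27 22 55
```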
Attached patch [configs] part 1 (obsolete)
This disables armv6, noion and not-currently-enabled-on-try android-debug.
Attachment #803470 - Flags: review?(nthomas)
Comment on attachment 803470 [details] [diff] [review]
[configs] part 1

r+ for now. We need sheriff buy-in if this is to stay in.
Attachment #803470 - Flags: review?(nthomas) → review+
(In reply to Nick Thomas [:nthomas] from comment #3)
> Comment on attachment 803470 [details] [diff] [review]
> [configs] part 1
> 
> r+ for now. We need sheriff buy-in if this is to stay in.

Yep of course. cc-ing.
Sounds good to me, but Ed and tomcat will be the first to have to deal with this.
Comment on attachment 803470 [details] [diff] [review]
[configs] part 1

Clearing review: after discussion with joduinn, we don't want to do *this* version, since we don't want to disable the actual builds, just the tests.
Attachment #803470 - Flags: review+
Attached patch [configs] take 2
This should do what comment 0 wanted.
Assignee: nobody → bugspam.Callek
Attachment #803470 - Attachment is obsolete: true
Attachment #803504 - Flags: review?(sphink)
Comment on attachment 803504 [details] [diff] [review]
[configs] take 2

Review of attachment 803504 [details] [diff] [review]:
-----------------------------------------------------------------

This will only do unittest jobs, not talos, but the try parser currently doesn't allow setting try_by_default=False for talos.

After this patch, you can force a push to use tegras only with

  try: -b do -p all -u all[Tegra]

or both with either

  try: -b do -p all -u all[Tegra,Panda]

or

  try: -b do -p all -u all[]
Attachment #803504 - Flags: review?(sphink) → review+
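
The selection rules described in this review can be sketched roughly as follows. This is a minimal illustration of the behaviour stated above (no brackets means defaults only, empty brackets mean everything, a bracketed list means exactly those platforms); it is not the real try parser, and `PLATFORMS`, `parse_restriction` and `select_platforms` are hypothetical names invented for the example.

```python
import re

# Hypothetical platform table: name -> whether it is tested on try by default.
PLATFORMS = {"Panda": True, "Tegra": False}


def parse_restriction(arg):
    """Split a -u argument such as 'all[Tegra,Panda]' into (suites, platforms).

    Returns None for platforms when no brackets are given, and an empty
    list for 'all[]'.
    """
    m = re.fullmatch(r"([^\[\]]+)(?:\[(.*)\])?", arg)
    if m is None:
        raise ValueError("unrecognised -u argument: %r" % arg)
    suites, inner = m.group(1), m.group(2)
    if inner is None:
        return suites, None
    return suites, [p.strip() for p in inner.split(",") if p.strip()]


def select_platforms(restriction, platforms=PLATFORMS):
    """Apply the rules from the review comment to the platform table."""
    if restriction is None:   # -u all: only try_by_default platforms
        return sorted(n for n, default in platforms.items() if default)
    if not restriction:       # -u all[]: no filtering at all
        return sorted(platforms)
    # -u all[Tegra]: exactly the named platforms
    return sorted(n for n in platforms if n in restriction)


for arg in ("all", "all[Tegra]", "all[Tegra,Panda]", "all[]"):
    print(arg, "->", select_platforms(parse_restriction(arg)[1]))
```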
(In reply to Justin Wood (:Callek) from comment #9)
> I did a series of pushes to test the patch:
> 
> https://tbpl.mozilla.org/?showall=1&tree=Try&rev=b5765c8eb769

All builds, only panda tests:
 try: -b do -p all -u all

> https://tbpl.mozilla.org/?showall=1&tree=Try&rev=7cd5ec2ebd77

All builds, tegra *opt* tests only
 try: -b do -p all -u all[Tegra]

> https://tbpl.mozilla.org/?showall=1&tree=Try&rev=96bcc623982e

All builds, tegra and panda opt tests only
 try: -b do -p all -u all[Tegra,Panda]

> https://tbpl.mozilla.org/?showall=1&tree=Try&rev=b2fa677125b2

All builds, just panda tests
 try: -b do -p android -u all[Panda]

> https://tbpl.mozilla.org/?showall=1&tree=Try&rev=4fe4815fbfd7

All builds, panda and tegra tests, just panda talos!?
 try: -b do -p all -u all[] -t all

> Awaiting the [relevant] builds to be done to see what tests get scheduled now

These results give me slight pause about allowing this patch to stay in the tree...

Steve, can you identify and tell me/us:

* How do we run tegra noion and armv6 jobs
* Why talos didn't run for tegra, and can we get it to run

Coop, needinfo-ing you so you have the info here, and know about the outstanding issues we're waiting on.

Recovery is just a backout of this patch and a reconfig (afaict)
Flags: needinfo?(sphink)
Flags: needinfo?(coop)
...so from a dump-master I see:

<buildbotcustom.scheduler.BuilderChooserScheduler> {'buildbotBranch': 'try',
 'builderNames': ['Android 4.0 Panda try talos remote-troboprovider',
                  'Android 4.0 Panda try talos remote-ts',
                  'Android 4.0 Panda try talos remote-trobopan',
                  'Android 4.0 Panda try talos remote-tsvg',
                  'Android 4.0 Panda try talos remote-tp4m_nochrome',
                  'Android 4.0 Panda try talos remote-trobocheck2',
                  'Android 2.2 Tegra try talos remote-troboprovider',
                  'Android 2.2 Tegra try talos remote-ts',
                  'Android 2.2 Tegra try talos remote-trobopan',
                  'Android 2.2 Tegra try talos remote-tsvg',
                  'Android 2.2 Tegra try talos remote-tp4m_nochrome',
                  'Android 2.2 Tegra try talos remote-trobocheck2'],
 'change_filter': <ChangeFilter on branch == try-android-talos>,
 'chooserFunc': <function tryChooser>,
 'fileIsImportant': None,
 'name': 'tests-try-android-talos',
 'prettyNames': {'android': ['Android 4.0 Panda',
                             'Android 4.0 Panda',
                             'Android 2.2 Tegra try-nondefault'],
                 'android-armv6': ['Android 2.2 Armv6 Tegra try-nondefault'],
                 'android-noion': ['Android 2.2 no-ionmonkey Tegra try-nondefault'],
                 'android-x86': ['Android 4.2 x86 Emulator']},
 'properties': {'scheduler': 'tests-try-android-talos'},
 'talosSuites': ['remote-tsspider',
                 'remote-troboprovider',
                 'remote-ts',
                 'remote-tsvgx',
                 'remote-tcanvasmark',
                 'remote-trobopan',
                 'remote-tsvg',
                 'remote-tp4m_nochrome',
                 'remote-trobocheck2'],
 'treeStableTimer': None,
 'unittestPrettyNames': None,
 'unittestSuites': None}


This explicitly shows try-nondefault for the Tegra talos entries. However, I also note that you said -t all[foo] doesn't actually work, so I suspect we have entirely broken Tegra talos on try for now.
(In reply to Justin Wood (:Callek) from comment #10)
> Steve, can you identify and tell me/us:
> 
> * How do we run tegra noion and armv6 jobs

Build jobs, I guess? I don't see any history of talos jobs for these, and tryChooser isn't passed any, so I'm assuming they don't exist.

But I just now finally got it through my head what you were asking, and I don't know the answer yet. I'll have to switch to a different test environment for this. I will not clear needinfo yet. (It's just barely possible that you can get these jobs if you give try: -p full instead of -p all. I will check.)

> * Why talos didn't run for tegra, and can we get it to run

Talos didn't run for tegra because the tegra talos builders were marked as try_nondefault, and I was foolishly passing in a hardcoded default filter to the test selection function (getTestBuilders). That meant that only try_by_default talos builders would ever get chosen, since there was no way to pass in a different filter.

Fixing this turned out to be pretty easy, see bug 915826. (Setting up a test environment, not so much.) Sadly, I have no workaround for the current situation; there's nothing you can put in a try push that will give you nondefault talos jobs.
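
The shape of the problem described above can be sketched like this. It is an illustration only, not the actual getTestBuilders code or the fix in bug 915826: the function names, the `builder_filter` parameter and the record layout are hypothetical, and only the two builder names are taken from the dump earlier in this bug.

```python
# Hypothetical builder records; the names come from the dump-master output.
BUILDERS = [
    {"name": "Android 4.0 Panda try talos remote-ts", "try_by_default": True},
    {"name": "Android 2.2 Tegra try talos remote-ts", "try_by_default": False},
]


def default_only(builder):
    """Filter that keeps only builders that run on try by default."""
    return builder["try_by_default"]


def get_test_builders_hardcoded(builders):
    # The filter is baked in, so a nondefault builder can never be
    # selected, whatever the try syntax asks for.
    return [b["name"] for b in builders if default_only(b)]


def get_test_builders_with_filter(builders, builder_filter=default_only):
    # Same default behaviour, but the caller can supply another filter,
    # for example one derived from a restriction such as all[Tegra].
    return [b["name"] for b in builders if builder_filter(b)]


def tegra_only(builder):
    """Example caller-supplied filter selecting Tegra builders."""
    return "Tegra" in builder["name"]


print(get_test_builders_hardcoded(BUILDERS))
print(get_test_builders_with_filter(BUILDERS, tegra_only))
```

With the hardcoded variant there is no input that reaches the Tegra builder, which matches the observation that nothing in a try push could request nondefault talos jobs.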
Depends on: 915826
I'm clearing the needinfo requests because of IRC pushback from the sheriffs, who were hearing from confused and annoyed devs who had not seen any announcement or indication that this would be the case.

The sheriffs prefer the option with this bug wontfixed (backed out) and taking the extra wait times for tegras. Of note, we also recovered a whole bunch of devices yesterday, so wait times won't likely be THAT bad anyway.

http://hg.mozilla.org/build/buildbot-configs/rev/d6e8484c1606
http://hg.mozilla.org/build/buildbot-configs/rev/97caef8e4fd6
Status: NEW → RESOLVED
Closed: 8 years ago
Flags: needinfo?(sphink)
Flags: needinfo?(coop)
Resolution: --- → WONTFIX
Component: Tools → General