Mozharness test configs should live in-tree

RESOLVED FIXED

Status

Release Engineering
Mozharness
RESOLVED FIXED
3 years ago
3 years ago

People

(Reporter: ahal, Assigned: ahal)

Tracking

Firefox Tracking Flags

(Not tracked)

Details

(Whiteboard: [use the patch in comment 30 if your try push is broken])

Attachments

(3 attachments, 6 obsolete attachments)

16.64 KB, patch
aki
: review+
Details | Diff | Splinter Review
29.62 KB, patch
ahal
: review+
Details | Diff | Splinter Review
1.23 KB, patch
aki
: review+
Details | Diff | Splinter Review
I think ideally both the scripts and configs would live in tree, but as a first step I would be happy if the options that get passed to the harness via command line live in tree.. i.e the 'mochitest_options' etc..

We keep getting bit by the fact that we can't add a new command line option to a harness since that option doesn't exist on older branches.
I assume the bits talking to those mozharness scripts also live out of tree?  (Sorry for the dumb questions, I am new here.)
(In reply to Nathan Froyd (:froydnj) from comment #1)
> I assume the bits talking to those mozharness scripts also live out of tree?
> (Sorry for the dumb questions, I am new here.)

Yep. Buildbot calls a mozharness script and passes a config in to it [1]. The script is what sets up the test slave environment and invokes the harness. The config is where the test harness commandline we use is defined. As of now, both script and config live in the mozharness repo.

I believe both should live in-tree (at least for testing). Getting the scripts in-tree would likely be a lot of work, but it shouldn't be too difficult to get most of the stuff defined in the config to live in-tree. We actually already have an in-tree mozharness config [2]. We'd need to have at least one config per platform (fennec, b2g, b2g-desktop, firefox) and teach all the test scripts how to use them. I think the best way to solve this is have the build copy the proper config into tests.zip.

[1] http://mxr.mozilla.org/build/source/buildbot-configs/mozilla-tests/b2g_config.py#585
[2] http://mxr.mozilla.org/mozilla-central/source/testing/config/mozharness_config.py
Component: Release Automation → General Automation
QA Contact: bhearsum → catlee
Assignee: nobody → ahalberstadt
Status: NEW → ASSIGNED
Created attachment 8392424 [details] [diff] [review]
(mozharness) Use in-tree configs for test options

This should do the trick on the mozharness side. I have an Ash run going here:
https://tbpl.mozilla.org/?tree=Ash&rev=f5bbca24993c

I'd be a bit surprised if I didn't mess anything up, so I'll hold off review until I get results.
Created attachment 8392428 [details] [diff] [review]
(m-c) Add platform specific mozharness config files

This is the gecko side that's needed. This would need to land on all branches (including ESR). But seeing as it's just adding a few files it's not a big problem.

Comment 5

3 years ago
Comment on attachment 8392424 [details] [diff] [review]
(mozharness) Use in-tree configs for test options

>diff --git a/scripts/desktop_unittest.py b/scripts/desktop_unittest.py
>--- a/scripts/desktop_unittest.py
>+++ b/scripts/desktop_unittest.py
>@@ -233,18 +233,17 @@ class DesktopUnittest(TestingMixin, Merc
>             # self.symbols_path when downloading/extracting.
>             if self.symbols_path:
>                 str_format_values['symbols_path'] = self.symbols_path
> 
>             # set pluginsPath
>             abs_app_plugins_dir = os.path.join(abs_app_dir, 'plugins')
>             str_format_values['test_plugin_path'] = abs_app_plugins_dir
> 
>-            name = '%s_options' % suite_category
>-            options = list(self.tree_config.get(name, c.get(name)))
>+            options = list(self.tree_config['%s_options' % suite_category])

Hm, do you want to leave this line alone, until the in-tree configs land on all trees?
Or are you going to land that patch on all branches that run desktop unittests, including esr/b2g* ?

Comment 6

3 years ago
(In reply to Andrew Halberstadt [:ahal] from comment #4)
> Created attachment 8392428 [details] [diff] [review]
> (m-c) Add platform specific mozharness config files
> 
> This is the gecko side that's needed. This would need to land on all
> branches (including ESR). But seeing as it's just adding a few files it's
> not a big problem.

Nm, answered :)
Yeah, I just figured it was worthwhile to land the gecko side everywhere to avoid dealing with the confusion of having the test options live in two places :)

Comment 8

3 years ago
(In reply to Andrew Halberstadt [:ahal] from comment #7)
> Yeah, I just figured it was worthwhile to land the gecko side everywhere to
> avoid dealing with the confusion of having the test options live in two
> places :)

This will break Try pushes of code that are pre-gecko-landing, or any sort of bisecting of pre-gecko-landing changesets via desktop_unittest.py.  I think we know this and are ok with it, just pointing it out explicitly.
If you'd prefer to phase out the old definitions, we could do that instead.. as long as we make it clear that they aren't being used so people don't get confused when their changes have no effect.
Hmm, all the builds on Ash are busted:
https://tbpl.mozilla.org/php/getParsedLog.php?id=36291129&tree=Ash&full=1

I don't think this is from my patches.. but at the same time I don't see similar bustage anywhere on m-c.. Not sure what is going on here.
Oh, I think this is bug 983957
Depends on: 983957
So we're hitting this on Ash I assume because of gbrown's prior push:
https://hg.mozilla.org/projects/ash/rev/20c9b342c0a4

Geoff, do you mind if we back that out until bug 983957 is fixed?
Flags: needinfo?(gbrown)
Oh darn. I agree my push is probably causing this problem.

I was trying to repro a crash with my push 20c9b342c0a4, have not hit it yet, and cannot re-trigger because of your push http://hg.mozilla.org/users/asasaki_mozilla.com/ash-mozharness/rev/ae0e4db14285 !  

I would prefer to get my re-triggers working (back out your change), see if I can repro my crash, and then backout 20c9b342c0a4, but if your need is urgent, go ahead and back me out.
Flags: needinfo?(gbrown) → needinfo?(ahalberstadt)
Sorry, I was not thinking about this right. I only need re-triggers, so there's no harm in backing out my ash change - I'll do that now.

When you can, backout your ash-mozharness push so that I can re-trigger, and let me know.
(In reply to Geoff Brown [:gbrown] from comment #13)
> Oh darn. I agree my push is probably causing this problem.

It wasn't your fault, just really unlucky coincidence that bug 983957 got merged in right after you disabled the profiler :). That bug just got an r+ so maybe you can re-land your patch soon.. actually I might just land the fix in there before it gets to m-c, that way we can both test out our changes at the same time. I also landed a patch which should fix the Android bustage I caused.

> I was trying to repro a crash with my push 20c9b342c0a4, have not hit it
> yet, and cannot re-trigger because of your push
> http://hg.mozilla.org/users/asasaki_mozilla.com/ash-mozharness/rev/
> ae0e4db14285 !

Sorry about that, I backed out my ash changes and re-triggered some of the tests you wanted. Let me know when you've re-triggered enough and I can re-land my changes. It isn't urgent, so no rush.
  
(In reply to Geoff Brown [:gbrown] from comment #14)
> Sorry, I was not thinking about this right. I only need re-triggers, so
> there's no harm in backing out my ash change - I'll do that now.
> 
> When you can, backout your ash-mozharness push so that I can re-trigger, and
> let me know.

Bug 983957 has a fix in it, so I re-landed your ash change along with the fix. That should allow us both to test our changes at the same time.
Flags: needinfo?(ahalberstadt)
(In reply to Andrew Halberstadt [:ahal] from comment #15)
> (In reply to Geoff Brown [:gbrown] from comment #13)
> Sorry about that, I backed out my ash changes and re-triggered some of the
> tests you wanted. Let me know when you've re-triggered enough and I can
> re-land my changes. It isn't urgent, so no rush.

Thanks. I am all done now - go ahead and re-land your changes.
   
> Bug 983957 has a fix in it, so I re-landed your ash change along with the
> fix. That should allow us both to test our changes at the same time.

Thanks again! But I don't want to keep this patch going forward. I'll back it out again on my next push, or you can if you push first.
This seems to be working on Ash! https://tbpl.mozilla.org/?tree=Ash&rev=a89ebef5dd15

I checked and all the oranges there seem to exist on previous pushes (before I landed these patches) too.
Created attachment 8393513 [details] [diff] [review]
(mozharness) Use in-tree configs for test options

Feel free to be extra harsh in your review :). Whatever we do here will probably be copied for any future "move stuff into tree" related work.
Attachment #8392424 - Attachment is obsolete: true
Attachment #8393513 - Flags: review?(aki)
Created attachment 8393514 [details] [diff] [review]
(m-c) Add platform specific mozharness config files
Attachment #8392428 - Attachment is obsolete: true
Attachment #8393514 - Flags: review?(aki)

Comment 20

3 years ago
Comment on attachment 8393514 [details] [diff] [review]
(m-c) Add platform specific mozharness config files

Stamping.

I'm wondering if someone will decide to change chunking in these files, without touching the appropriate buildbot portions, which will cause busted test chunks (if chunks are reduced) or missing tests (if chunks are increased).

Ideally we schedule from in-tree configs as well, but until that happens, a comment may help prevent that.

(Now that I mention this, changing # of chunks will now become a bigger process.  Either we need to change the # of chunks across all trees at the same time, or we have to have the # of chunks ride the trains.  gecko_version may help here.  http://hg.mozilla.org/build/buildbot-configs/file/f187ef094c33/mozilla-tests/config.py#l1987 )
Attachment #8393514 - Flags: review?(aki) → review+

Comment 21

3 years ago
Comment on attachment 8393513 [details] [diff] [review]
(mozharness) Use in-tree configs for test options

>diff --git a/mozharness/mozilla/testing/testbase.py b/mozharness/mozilla/testing/testbase.py
>--- a/mozharness/mozilla/testing/testbase.py
>+++ b/mozharness/mozilla/testing/testbase.py
>@@ -221,19 +221,19 @@ 2. running via buildbot and running the 
>         self.run_command(unzip_cmd, cwd=test_install_dir,
>                          halt_on_failure=True, success_codes=[0, 11])
> 
>     def _read_tree_config(self):
>         """Reads an in-tree config file"""
>         dirs = self.query_abs_dirs()
>         test_install_dir = dirs.get('abs_test_install_dir',
>                                     os.path.join(dirs['abs_work_dir'], 'tests'))
>-        tree_config_path = os.path.join(test_install_dir, 'config', 'mozharness_config.py')
> 
>-        if os.path.isfile(tree_config_path):
>+        if 'in_tree_config' in self.config:
>+            tree_config_path = os.path.join(test_install_dir, self.config['in_tree_config'])

I'm still concerned about the fact that we will no longer be able to run unit tests on revisions before the appropriate in-tree file lands.  Even if you land on all repos, there are still potential pushes-to-try from older revisions.  I'm not as worried about chemspill RELBRANCHes since we tend to run the tests off default and transplant to the relbranch.

Could you please add an os.path.exists for the tree_config_path, and fatal() with a human-parsable message if it doesn't exist?  (|"%s doesn't exist; see bug 981030" % tree_config_path| or something?)  I think that will help highlight when that is an issue.

>             self.tree_config.update(parse_config_file(tree_config_path))

If you think developers may edit the in-tree configs, we probably want to make this a little more resilient.
A try/except could help with malformed json/python, which would allow us to fatal() with a human-parsable message if parse_config_file() fails.
We don't have bug 699343 yet, but checking to make sure that self.tree_config looks somewhat sane after the update might also be a good thing.  That check might be script-specific, though, so might be more complex.

>             self.dump_config(file_path=os.path.join(dirs['abs_log_dir'], 'treeconfig.json'),
>                              config=self.tree_config)
>         self.tree_config.lock()

I'd like to see the os.path.exists and try/except with human-parsable fatal()s before proceeding with this.
If you can figure out how to do tree_config verification sanely (calling a script-specific helper method if it exists?) that's a nice to have.
(In reply to Aki Sasaki [:aki] from comment #20)
> Ideally we schedule from in-tree configs as well, but until that happens, a
> comment may help prevent that.
> 
> (Now that I mention this, changing # of chunks will now become a bigger
> process.  Either we need to change the # of chunks across all trees at the
> same time, or we have to have the # of chunks ride the trains. 
> gecko_version may help here. 
> http://hg.mozilla.org/build/buildbot-configs/file/f187ef094c33/mozilla-tests/
> config.py#l1987 )

Ah, my mistake.. I thought this *was* scheduling from in-tree. Yeah let's not do this then. I can either remove those chunks from my patch, or we can figure out how to define the --this-chunk and --total-chunks options directly in buildbot-configs, while leaving the rest in tree.
Attachment #8393513 - Flags: review?(aki)
This bug isn't really on my goals list or anything, it was just some low hanging fruit that's been tempting me for awhile. So I'm just going to do the easy thing and move Android's test_suite_definitions back into the mozharness config. Once in-tree scheduling lands (is there bug tracking that?) it'll be easy to move it back in-tree.

Also I like your idea of having robust error messages. I'll add those in as well.
Created attachment 8394181 [details] [diff] [review]
(mozharness) Use in-tree configs for test options

This adds some error handling and removes the "test_suite_definitions" from the android configs.

This will be tested by:
https://tbpl.mozilla.org/?tree=Ash&rev=64474ff93ee9
Attachment #8393513 - Attachment is obsolete: true
Attachment #8394181 - Flags: review?(aki)
Created attachment 8394187 [details] [diff] [review]
(m-c) Add platform specific mozharness config files

The m-c configs minus android "test_suite_definitions"
Attachment #8393514 - Attachment is obsolete: true
Attachment #8394187 - Flags: review?(aki)

Updated

3 years ago
Attachment #8394187 - Flags: review?(aki) → review+

Comment 26

3 years ago
Comment on attachment 8394181 [details] [diff] [review]
(mozharness) Use in-tree configs for test options

>diff --git a/mozharness/mozilla/testing/testbase.py b/mozharness/mozilla/testing/testbase.py
>--- a/mozharness/mozilla/testing/testbase.py
>+++ b/mozharness/mozilla/testing/testbase.py
>@@ -3,16 +3,17 @@
> # This Source Code Form is subject to the terms of the Mozilla Public
> # License, v. 2.0. If a copy of the MPL was not distributed with this file,
> # You can obtain one at http://mozilla.org/MPL/2.0/.
> # ***** END LICENSE BLOCK *****
> 
> import copy
> import os
> import platform
>+import traceback

You may not need this import; see below.

>+            try:
>+                self.tree_config.update(parse_config_file(tree_config_path))
>+            except:
>+                traceback.print_exc()

self.exception()?
You could call self.exception() and then self.fatal(), or self.exception(message="There was a problem...", level=FATAL)
Otherwise the traceback gets sent to stdout without getting logged.
r=me with this change.
Thanks for the patch!

>+                self.fatal("There was a problem parsing the in-tree configuration file '%s'!" %
>+                           os.path.join('gecko', 'testing', rel_tree_config_path))
Attachment #8394181 - Flags: review?(aki) → review+
https://hg.mozilla.org/integration/mozilla-inbound/rev/284661ffd813
Whiteboard: [leave-open]
Created attachment 8394391 [details] [diff] [review]
(mozharness) Use in-tree configs for test options

Addresses review comment, carry r+ forward.
Attachment #8394181 - Attachment is obsolete: true
Attachment #8394391 - Flags: review+
Created attachment 8394392 [details] [diff] [review]
(mozharness) Use in-tree configs for test options

Sigh.. forgot to qref the fix for a syntax error that got caught on Ash. Carry r+ forward.
Attachment #8394391 - Attachment is obsolete: true
Attachment #8394392 - Flags: review+
https://hg.mozilla.org/mozilla-central/rev/284661ffd813
https://hg.mozilla.org/releases/mozilla-aurora/rev/cb30d2a59e2e
https://hg.mozilla.org/releases/mozilla-beta/rev/5f617c8542f6
https://hg.mozilla.org/releases/mozilla-release/rev/553170d0550f
https://hg.mozilla.org/releases/mozilla-b2g18/rev/24ed63715d1b
https://hg.mozilla.org/releases/mozilla-b2g18_v1_1_0_hd/rev/6c6ee476132d
https://hg.mozilla.org/releases/mozilla-b2g26_v1_2/rev/6e63dae37f6c
https://hg.mozilla.org/releases/mozilla-b2g28_v1_3t/rev/ddb2936e3936
https://hg.mozilla.org/releases/mozilla-b2g28_v1_3/rev/739f23826b27
https://hg.mozilla.org/releases/mozilla-esr24/rev/088645b0ecac

Note, this patch shouldn't have any affect on the tests. I didn't use DONTBUILD though because we'll want something to re-trigger to be sure the mozharness patch didn't break anything.

Should be ok to land the mozharness patch now, but I'll wait until Monday to give this patch a chance to get merged around to all the integration and project branches.
Also in case someone comes here wondering why this got pushed to all trees.. essentially this patch is needed to prevent test jobs from breaking. See comment 2 for a brief explanation.
Needed a follow-up fix on mozilla-esr24 because stage-config in testsuite-targets.mk was outdated:
https://hg.mozilla.org/releases/mozilla-esr24/rev/28728b57ec93
https://hg.mozilla.org/build/mozharness/rev/7f85fd954b6e
Comment on attachment 8394392 [details] [diff] [review]
(mozharness) Use in-tree configs for test options

This is in production now.
Attachment #8394392 - Flags: checked-in+
Status: ASSIGNED → RESOLVED
Last Resolved: 3 years ago
Resolution: --- → FIXED
Whiteboard: [leave-open] → [use the patch in comment 30 if your try push is broken]
This is great to see. I had been thinking about this idea for few weeks. Awesome to see it live!
b2g18 seems broken?
https://tbpl.mozilla.org/php/getParsedLog.php?id=36747686&tree=Mozilla-B2g18
https://tbpl.mozilla.org/?tree=Mozilla-B2g18
Hmm, that file is definitely added. There must be something different with how the "config" directory is packaged into the tests.zip.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Thanks for noticing! This should fix it:
https://hg.mozilla.org/releases/mozilla-b2g18/rev/610edd7a4a2b

It was basically the same fix that was needed for esr24.
Status: REOPENED → RESOLVED
Last Resolved: 3 years ago3 years ago
Resolution: --- → FIXED
I forgot that toolkit-makefiles.sh used to exist:
https://hg.mozilla.org/releases/mozilla-b2g18/rev/39ca424b2003
Solving that seemed to expose another problem in b2g18. I think I need to set "use_puppet_packages" in the configs.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
https://hg.mozilla.org/releases/mozilla-b2g18/rev/170d87349060

This will fix emulator unittests but not Mn. I kind of backed myself into a corner here. The only way to properly fix this is to add a config for the marionette script too which again means landing a patch on every branch :(.. I should have just added an empty one to begin with while I was at it. I'll have to figure this out tomorrow.
Created attachment 8398456 [details] [diff] [review]
(mozharness) Hack to make Mn use puppetagain packages on b2g18

I'd rather do this gross hack than land a patch everywhere and break people's try pushes twice in the same week. At least we can remove this as soon as b2g18 is no longer supported..
Attachment #8398456 - Flags: review?(aki)

Comment 44

3 years ago
Comment on attachment 8398456 [details] [diff] [review]
(mozharness) Hack to make Mn use puppetagain packages on b2g18

Yeah, not pretty, but I'm ok with short lived hacks like this when they're preferable to the alternative.
Attachment #8398456 - Flags: review?(aki) → review+
https://hg.mozilla.org/build/mozharness/rev/da56d20558c5
Attachment #8398456 - Flags: checked-in+
in production.
Finally fixed.. :)
Status: REOPENED → RESOLVED
Last Resolved: 3 years ago3 years ago
Resolution: --- → FIXED
As appears to be the case all too often with this bug, I spoke too soon. Forgot to add mozilla-b2g18_v1_1_0_hd to the hack. Since it doesn't really make the hack any better or worse, and the change is trivial, landed with r=bustage:

https://hg.mozilla.org/build/mozharness/rev/e72022ec5011
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
in production.
Status: REOPENED → RESOLVED
Last Resolved: 3 years ago3 years ago
Resolution: --- → FIXED

Updated

3 years ago
Component: General Automation → Mozharness
Depends on: 1040383
Blocks: 1067535
You need to log in before you can comment on or make changes to this bug.