Often when I get a failure in a try push, I proceed to spend a couple of hours figuring out how to run the failing tests locally. It would be really useful to note in the information panel in the bottom left how to run the test suite (|./mach mochitest -blah -flag testdir|).
Yes, we really need to do this.
We should probably make this mandatory for Tier 1 jobs, though this arguably falls under the "Has sufficient documentation" requirement that applies to all visible-by-default jobs.
Bonus points for also linking to the wiki page documenting the job that we're also supposed to have for all visible jobs.
I'm sure this isn't too technically demanding, but what exactly is required? Just running the tests under mach, or under mozharness? Just the test that failed, just the chunk that failed, or everything? There are certainly cases in which making the less-like-automation choices here will make the bug disappear.
I think instructions for how to reproduce the job would be ideal. In the case of a failure, if we have something like run-by-dir, we could show commands to run just the directory of failing tests. Maybe a template system for each job, plus the ability to take the errors and feed them into the template?
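To make the template idea concrete, here is a minimal sketch. Everything here is hypothetical (the job names, the templates, and the helper itself are invented for illustration, not existing mozharness/Treeherder code): given a job type and a failing test path, fill a per-job template with the test's directory, matching what run-by-dir would execute.

```python
import os

# Hypothetical per-job command templates; real ones would live alongside
# the job definitions and be maintained per harness.
COMMAND_TEMPLATES = {
    "mochitest": "./mach mochitest {test_path}",
    "xpcshell": "./mach xpcshell-test {test_path}",
    "reftest": "./mach reftest {test_path}",
}

def repro_command(job_type, failing_test_path):
    """Return a copy/paste-able local command for a failing test, or None."""
    template = COMMAND_TEMPLATES.get(job_type)
    if template is None:
        return None
    # With run-by-dir, point at the whole directory rather than one file.
    test_path = os.path.dirname(failing_test_path) or failing_test_path
    return template.format(test_path=test_path)
```

The error summary from the log would supply `failing_test_path`; anything without a template simply shows nothing rather than a wrong command.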
I think the best way to surface this is via some artifact in the log (e.g. TinderboxPrint, or some more modern equivalent), which would (a) be surfaced in the Treeherder UI [TinderboxPrints already are], and (b) also be visible in the raw log and/or local console output for developers to read directly. (As opposed to trying to add some feature to Treeherder.)
That is a great idea, :emorley. Can we ensure that it doesn't get lost? Sometimes I have trouble finding things in the info pane at the bottom.
Longer term I'd love for us to move away from TinderboxPrints and use a more structured notation in the log (like that used for TalosResults). Treeherder already has the concept of different types of notations (e.g. links); we could expand that to a handful of categories (e.g. screenshots, result stats, instructions to repro) and then have Treeherder do different things with them in the UI. The main thing is that I don't want Treeherder itself to have to generate the command line arguments from the job types etc.; those should be provided to Treeherder by the harness/mozharness/... itself. It's also worth noting that things like TaskCluster can already submit their own arbitrary artifacts to Treeherder, which would be an alternative to having to output TinderboxPrints (/whatever replaces them) to the log itself.
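As a sketch of what such a structured notation could look like (the prefix, the category names, and the JSON shape are all invented for illustration; nothing in Treeherder currently parses this), each notation could be a single tagged JSON line that Treeherder dispatches on by type:

```python
import json

# Hypothetical notation categories Treeherder could render differently.
VALID_TYPES = {"screenshot", "result-stats", "repro-instructions", "link"}

def emit_notation(kind, payload):
    """Write one structured notation line to the log."""
    assert kind in VALID_TYPES
    print("STRUCTURED_NOTATION: " + json.dumps({"type": kind, "data": payload}))

def parse_notation(line):
    """Return the decoded notation, or None for ordinary log lines."""
    prefix = "STRUCTURED_NOTATION: "
    if not line.startswith(prefix):
        return None
    return json.loads(line[len(prefix):])
```

The key property is the one stated above: the harness produces the content (e.g. the repro command), and Treeherder only routes it by category.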
The fx-ui harnesses provide a "developer mode" command to copy/paste on the developer's machine. Showing that (or an equivalent mach command) would probably be a good start.
So, the as-close-to-automation-as-possible approach would be to finish bug 1207377, then get mozharness to dump out |mach mozharness <target> <args-passed-on-this-run>|. But I'm still not really clear whether trying to be that correct is better than giving people something easier to run (i.e. not involving mozharness), or something that runs a smaller subset of tests. FWIW, doing the kind of minimum-atomic-block-of-tests thing that Joel suggested seems like it would require a lot of harness/testsuite-specific code.
I think ideally we'd want two things:
1) For failing jobs, instructions on how to most simply run (the chunk/dir that contains) the failing tests.
2) For all jobs, instructions on how to reproduce the job locally with as much fidelity to CI as possible, probably using |mach mozharness|.
As it stands, all tests are typically run through 'mach', so we should print out mach commands. mozharness commands are more complicated, and if we want those, then mach should encapsulate them. Fine-grained stuff should be bonus points for an eager hacker; ideally mach will make it easy to reduce the scope.
A quick fix in the interim, however, would be just to expose the |mach test| command (probably via TinderboxPrint), which would give a close approximation to running the job locally. That's far better than making devs sift through logs trying to figure this out themselves.
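The interim fix could be as small as the following sketch. TinderboxPrint lines do get surfaced in the Treeherder details pane, but the helper name and the exact wording here are made up, and |mach test <path>| is only the approximation the comment above describes:

```python
def repro_hint(test_paths):
    """Build a TinderboxPrint line suggesting an approximate local command."""
    cmd = "./mach test " + " ".join(test_paths)
    return "TinderboxPrint: To run these tests locally: |%s|" % cmd

# The harness would emit this into its log when a test fails:
print(repro_hint(["dom/base/test/test_foo.html"]))
```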
While certainly do-able, I think this problem is much harder than people are giving it credit for. You can't just copy/paste the command line that the mozharness script uses and then s/runtests.py/mach/. There are all sorts of options (e.g. --appname/--xre-path/--utility-path) that automation needs to pass in, but that you don't want to pass in when running locally. There are also all sorts of "if build_obj" statements, which set different defaults depending on whether a local build is detected or not. And, of course, every test suite has its own little nuances. Finally, some mach commands still use a completely different argument parser than their associated test harness.

We'd need a set of rules, likely per harness, that can transform an automation test harness command line into a local test harness command line. This set of rules would need to be maintained regularly so it doesn't get out of date (since we don't currently catch regressions in mach commands). Alternatively, test harnesses could be set up to read config files in addition to command line arguments (a la configman, or similar). Automation could then pass in all "automation specific config options" via a config file; whatever is left would be the command line that you could pass verbatim to mach.

Finally, I think it's important to distinguish "get the command to run against a local build" from "get the command to run against a downloaded build". Is |mach test| for the former and |mach mozharness| for the latter? Which one are we talking about here?
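The "set of rules, likely per harness" could start as small as the following sketch (the option names are the ones mentioned above; the function and rule table are hypothetical): strip the automation-only options from the harness command line, leaving something closer to what the local mach command would accept.

```python
# Options automation passes to runtests.py that a local run should not.
# Names taken from the comment above; the real set would be per-harness.
AUTOMATION_ONLY = {"--appname", "--xre-path", "--utility-path"}

def to_local_args(harness_args):
    """Drop automation-only '--opt value' or '--opt=value' pairs."""
    local = []
    skip_next = False
    for arg in harness_args:
        if skip_next:
            skip_next = False
            continue
        name = arg.split("=", 1)[0]
        if name in AUTOMATION_ONLY:
            # The '--opt value' form also consumes the following argument.
            skip_next = "=" not in arg
            continue
        local.append(arg)
    return local
```

This deliberately ignores the harder cases called out above (build_obj defaults, mismatched argument parsers), which is exactly why such rules would need per-harness maintenance.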
I was assuming that you'd want to run against a local build and that, at least in the case of |mach <testsuite> <options>|, mozharness would need per-testsuite logic to decide on the set of options to print. At least the first part of that seems sensible to me, because developers generally want to do extra things like attach a debugger or add logging when trying to reproduce an issue, so I suspect that trying to reproduce on the actual bits built on CI is less common. The second part would indeed take some effort to maintain. But if it's really just |mach <testsuite>|, then having to pass in many options to get a working run that's broadly comparable to CI suggests a problem with our default configurations/mach targets. Clearly chunking options will have to be added, but is it always going to be half a dozen more options that I haven't thought of?
I thought we were wanting to move towards a model where automation used mach too? In that world, wouldn't this problem go away to a certain extent?
(In reply to Ed Morley [:emorley] from comment #16)
> I thought we were wanting to move towards a model where automation used mach
> too? In that world, wouldn't this problem go away to a certain extent?

No, no matter how we refactor things, there will always be the "automation" case and the "local" case. Even if they have the exact same entry point, they'll still have separate configuration and separate code paths. (As an aside, I'm quite opposed to any automation using mach.)
This is a problem we have been facing forever, and it is partly due to our automation testing model being different from the local testing model. Instead of testing the same way a developer would with their local build, we have a model based on packaged tests. We could bend over backwards trying to make the two as close as possible without changing the model; however, I believe bringing the two models back into alignment should be the first step. IMHO that would make it easier to minimize differences in the long term.
For now, the easiest and lowest-hanging fruit is to output the developer-mode mozharness command, which can be run directly from the gecko source tree. After that, move to the same entry points for automation and local runs.