Closed Bug 1612345 Opened 5 years ago Closed 5 years ago

Better document retrigger with logging, fix broken cases, and support more test suites

Tracking

(Not tracked)

Status:

RESOLVED FIXED

People

(Reporter: jmaher, Assigned: gbrown)

References

(Blocks 1 open bug, URL)

Details

(Whiteboard: dev-prod-2020)

Attachments

(10 files)

custom action - retrigger-mochitest 5 years ago Armen [:armenzg] 191.79 KB, image/png		Details
Bug 1612345 - Generalize the retrigger-mochitest action; r= 5 years ago Geoff Brown [:gbrown] 47 bytes, text/x-phabricator-request		Details \| Review
Bug 1612345 - Support --setpref in geckoview-junit; r= 5 years ago Geoff Brown [:gbrown] 47 bytes, text/x-phabricator-request		Details \| Review
Bug 1612345 - Fix retrigger support for android mochitest variable parameters; r= 5 years ago Geoff Brown [:gbrown] 47 bytes, text/x-phabricator-request		Details \| Review
Bug 1612345 - Add custom retrigger support for geckoview-junit; r= 5 years ago Geoff Brown [:gbrown] 47 bytes, text/x-phabricator-request		Details \| Review
Bug 1612345 - Convert gtest argument parser to argparse; r= 5 years ago Geoff Brown [:gbrown] 47 bytes, text/x-phabricator-request		Details \| Review
Bug 1612345 - Change defaults for custom retrigger action; r= 5 years ago Geoff Brown [:gbrown] 47 bytes, text/x-phabricator-request		Details \| Review
Bug 1612345 - Add custom retrigger support for gtest; r= 5 years ago Geoff Brown [:gbrown] 47 bytes, text/x-phabricator-request		Details \| Review
Bug 1612345 - Ensure that most custom retriggers repeat the original task by default; r= 5 years ago Geoff Brown [:gbrown] 47 bytes, text/x-phabricator-request		Details \| Review
Bug 1612345 - Restrict custom retrigger to docker-worker tasks; r= 5 years ago Geoff Brown [:gbrown] 47 bytes, text/x-phabricator-request		Details \| Review

Joel Maher ( :jmaher ) (UTC -8)

Reporter

Description

•

5 years ago

Currently we have the ability, via Treeherder and custom actions, to retrigger jobs with special flags (such as --gecko-profile). There are other useful flags, and I would like to find ways to simplify this for developers who are investigating failures.

This might be something to add to treeherder, or in-tree action tasks. Here is an example of an action task to retrigger a task with --gecko-profile:
https://searchfox.org/mozilla-central/source/taskcluster/taskgraph/actions/gecko_profile.py

I think moving forward here are some steps:

find flags to add to env/browser/etc. for getting more logging
add support to do it via custom tricks
make shortcuts in pushhealth :)

Joel Maher ( :jmaher ) (UTC -8)

Reporter

Comment 1

•

5 years ago

Alex, could you outline some logging flags you would like to see for your use case and indicate where to add the flags?

Flags: needinfo?(achronop)

Geoff Brown [:gbrown]

Assignee

Comment 2

•

5 years ago

I would like to see Android included, as there have been similar requests from the geckoview team, and that case is slightly more complicated, at least for environment variables (setting the host env does not affect the device or test app).

Joel Maher ( :jmaher ) (UTC -8)

Reporter

Comment 3

•

5 years ago

:gbrown, are there some use cases you could identify here (or cc/needinfo users)?

Alex Chronopoulos [:achronop]

Comment 4

•

5 years ago

We have several log flags that are all activated the same way so I will give an example with just one.

In a local run, all you need to do is to set the MOZ_LOG env. You can do it per run with something like:
MOZ_LOG=timestamps,MediaTrackGraph:4 ./mach run

On a try run, in order to capture logs, I modify specific files by appending the flag(s) that I would like to activate here and here. I am not sure if this is the best way to do it, but it is reliable.

In general, I would expect that the logic will allow the user to specify the desired flags.
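As a rough illustration of the local workflow described above, the following Python sketch assembles a MOZ_LOG value and the environment that a subprocess launch would use. The helper names (format_moz_log, env_with_moz_log) are hypothetical and not part of mach; only the MOZ_LOG variable and the timestamps,MediaTrackGraph:4 value come from this comment.

```python
import os


def format_moz_log(modules, timestamps=True):
    """Build a MOZ_LOG value such as 'timestamps,MediaTrackGraph:4'.

    `modules` maps a log module name to its numeric level.
    """
    parts = ["timestamps"] if timestamps else []
    parts.extend(f"{name}:{level}" for name, level in modules.items())
    return ",".join(parts)


def env_with_moz_log(modules, base_env=None):
    """Return a copy of the environment with MOZ_LOG set (hypothetical helper)."""
    env = dict(os.environ if base_env is None else base_env)
    env["MOZ_LOG"] = format_moz_log(modules)
    return env


env = env_with_moz_log({"MediaTrackGraph": 4}, base_env={})
print(env["MOZ_LOG"])  # timestamps,MediaTrackGraph:4
```

The resulting mapping could be passed as the env argument of subprocess.run when launching ./mach run, which is equivalent to prefixing the command with MOZ_LOG=... in a shell.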

Flags: needinfo?(achronop)

Armen [:armenzg]

Comment 5

•

5 years ago

(In reply to Alex Chronopoulos [:achronop] from comment #4)

We have several log flags that are all activated the same way so I will give an example with just one.

Which kind of jobs can take advantage of such flags? Builds? Tests? Perf tests?
I would like to determine the context.

In a local run, all you need to do is to set the MOZ_LOG env. You can do it per run with something like:
MOZ_LOG=timestamps,MediaTrackGraph:4 ./mach run

I've tried such command locally and it is an unknown command. Is this instead ./mach run-desktop?

On a try run, in order to capture logs, I modify specific files by appending the flag(s) that I would like to activate here and here. I am not sure if this is the best way to do it, but it is reliable.

Do you mean that you make code changes and then push to try?

Flags: needinfo?(achronop)

Alex Chronopoulos [:achronop]

Comment 6

•

5 years ago

(In reply to Armen [:armenzg] from comment #5)

Which kind of jobs can take advantage of such flags? Builds? Tests? Perf tests?
I would like to determine the context.

I am not familiar with context, I believe the answer here is Tests.

I've tried such command locally and it is an unknown command. Is this instead ./mach run-desktop?

You need to build firefox.

[17:36:34 firefox]$ ./mach run -h
usage: mach [global arguments] run [command arguments]

Run the compiled program, possibly under a debugger or DMD.
...

Do you mean that you make code changes and then push to try?

Yes, I change the files I pointed out and push on try.

Flags: needinfo?(achronop)

Armen [:armenzg]

Comment 7

•

5 years ago

Attached image custom action - retrigger-mochitest — Details

I believe we have the ability to rerun a task with customizations of the env variables, however, it is restricted to mochitests and reftests.

After evaluating the steps below, could you please answer the following?

Should this be better documented? and where?
Is the current feature sufficient to support your workflow?

I assume it might just be sufficient to reduce the restriction of just mochitests and reftests. The action is defined here.

STR:

Load a mochitest or reftest. For instance this
In the panel details look for the 3 dots icon ("Other job details")
Click "Custom actions..."
I modified the payload: changed MOZ_LOG, repeat of 1 and no runUntilFail [1]
You can see both jobs in here
The live log is here (until the job actually completes)

If you search for "MOZ_LOG" in the live log you will see that on line 760 the MOZ_LOG value is set.

Is this the kind of output you would expect to show up?

[task 2020-01-30T18:17:17.345Z] GECKO(1752) | [Child 1831: GraphRunner]: D/MediaTrackGraph Moving tracks between suspended and runningstate: mTracks: 0, mSuspendedTracks: 1
I don't see such output in the non-custom mochitest.

[1]

environment:
  MOZ_LOG: 'timestamps,MediaTrackGraph:4'
logLevel: debug
path: ''
preferences:
  mygeckopreferences.pref: myvalue2
repeat: 1
runUntilFail: false
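
For anyone scripting this, the payload in [1] could be assembled programmatically along the following lines. This is only a sketch: build_retrigger_input is a hypothetical helper, the keys and values are copied from the payload above, and the JSON output is meant for inspection rather than as the exact format the Treeherder dialog requires.

```python
import json


def build_retrigger_input(moz_log, prefs=None, path="", repeat=1, run_until_fail=False):
    """Assemble an input mapping with the same keys as the payload in [1]."""
    if repeat < 0:
        raise ValueError("repeat must not be negative")
    return {
        "environment": {"MOZ_LOG": moz_log},
        "logLevel": "debug",
        "path": path,
        "preferences": dict(prefs or {}),
        "repeat": repeat,
        "runUntilFail": run_until_fail,
    }


payload = build_retrigger_input(
    "timestamps,MediaTrackGraph:4",
    prefs={"mygeckopreferences.pref": "myvalue2"},
)
print(json.dumps(payload, indent=2))
```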

Alex Chronopoulos [:achronop]

Comment 8

•

5 years ago

That's awesome, thank you. I am pretty happy with this workflow. It's not easy to find if you don't know where to look, but now that I know how to do it, it's more than enough.

(In reply to Armen [:armenzg] from comment #7)

I believe we have the ability to rerun a task with customizations of the env variables, however, it is restricted to mochitests and reftests.

That's most of our tests: they make up 70% of our test collection. If we can enable gtest, which is the remaining 30%, it will cover everything.

After evaluating the steps below, could you please answer the following?

Should this be better documented? and where?

Please do. A separate page specific to logging (so that it shows up in a Google/wiki search), linked from the general wiki try page, would benefit a lot of people; that is the first place I would have looked for it. I am thinking of creating a wiki page for my teammates, so if you do not mind non-generic instructions we can go with that.

On top of that, it would be equally beneficial to mention in the general wiki try page a handy way to create a new try run with logs activated, without modifying files, if possible.

Is the current feature sufficient to support your workflow?

Absolutely, I've verified the logs in the custom retrigger and that is the outcome that I was looking for.

I assume it might just be sufficient to reduce the restriction of just mochitests and reftests. The action is defined here.

As I said, if you consider adding gtests it will be golden for my workflow.

Armen [:armenzg]

Comment 9

•

5 years ago

I will handle it next week. Happy to help :)

Assignee: nobody → armenzg

Status: NEW → ASSIGNED

Summary: expose and allow for "retrigger" with logging → Better document retrigger with logging & add gtest

Geoff Brown [:gbrown]

Assignee

Comment 10

•

5 years ago

(In reply to Joel Maher ( :jmaher ) (UTC-4) from comment #3)

:gbrown, are there some use cases you could identify here (or cc/needinfo users)?

My main goal is to achieve the same level of support for Android/geckoview as for desktop. All of the existing support in mochitest_retrigger_action (MOZ_LOG, --repeat, --runUntilFailure, prefs, etc.) is equally useful for geckoview. Some of those features probably already work on geckoview, but I'm sure anything involving environment variables will not.

Also, geckoview devs rely heavily on the geckoview-junit test suite, so it would be great if that suite was supported.

Armen [:armenzg]

Comment 11

•

5 years ago

I've added this documentation:
https://wiki.mozilla.org/ReleaseEngineering/TryServer#Re-running_tasks_with_custom_parameters_from_Treeherder

Please add changes as you see fit.

Armen [:armenzg]

Comment 12

•

5 years ago

My first patch aimed to include test-type tasks for gtest and geckoview-junit; however, those tasks don't have those values defined.

Here's the code that sets test-type as tags

@transforms.add
def set_test_type(config, tests):
    for test in tests:
        for test_type in ['mochitest', 'reftest', 'talos', 'raptor']:
            if test_type in test['suite'] and 'web-platform' not in test['suite']:
                test.setdefault('tags', {})['test-type'] = test_type
        yield test

This second patch aims to permit this action for all tasks defined as kind: test.

We can either adjust set_test_type(...) to add test-type to a larger set of suites, or make this action applicable to all tasks marked as kind: test.

The former option requires opting in each suite that will permit this kind of action, which means more trial and error.
The latter option will permit triggering this action on tasks that may not actually apply the requested changes, which could confuse developers who try it. Its advantage is that it does not require opting in specific suites.

Please specify the preferred approach. I prefer the latter.
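
To make the first option concrete, here is a standalone sketch of the transform with a configurable suite list. It drops the @transforms.add decorator and the config argument so that it runs outside taskgraph, and the extended list (adding gtest and geckoview-junit as suite names) is an assumption for illustration, not the patch itself.

```python
DEFAULT_TEST_TYPES = ("mochitest", "reftest", "talos", "raptor")

# Hypothetical opt-in list adding the two suites discussed in this comment.
EXTENDED_TEST_TYPES = DEFAULT_TEST_TYPES + ("gtest", "geckoview-junit")


def set_test_type(tests, test_types=DEFAULT_TEST_TYPES):
    """Tag each test with a test-type when its suite name contains a known type."""
    for test in tests:
        for test_type in test_types:
            if test_type in test["suite"] and "web-platform" not in test["suite"]:
                test.setdefault("tags", {})["test-type"] = test_type
        yield test


def tags_for(suites, test_types):
    """Run the transform over fresh task dicts and collect the resulting tags."""
    tasks = [{"suite": suite} for suite in suites]
    return [task.get("tags") for task in set_test_type(tasks, test_types)]


suites = ["gtest", "mochitest-browser-chrome"]
print(tags_for(suites, DEFAULT_TEST_TYPES))
print(tags_for(suites, EXTENDED_TEST_TYPES))
```

With the default list the gtest task gets no tags at all, which is why an action keyed on test-type never appears for it; with the extended list it is tagged like the others.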

The current state of affairs can be seen in this push. It seems that I need to fix one more thing.

For the curious, here's the documentation for the context section.

Armen [:armenzg]

Comment 13

•

5 years ago

achronop, gbrown, Could you please look around in this push and try some custom scheduling?

Flags: needinfo?(gbrown)

Flags: needinfo?(achronop)

Geoff Brown [:gbrown]

Assignee

Comment 14

•

5 years ago

I tried but none of my attempts actually ran any tests. I see --no-run-tests specified and then an attempt to run mach which results in "It looks like you are trying to run an unknown mach command:". :(

Flags: needinfo?(gbrown)

Alex Chronopoulos [:achronop]

Comment 15

•

5 years ago

I've just tried a new run with a different log flag. We don't have that many gtests for MediaTrackGraph. I'll let you know when it is finished.

Flags: needinfo?(achronop)

Armen [:armenzg]

Comment 16

•

5 years ago

achronop: Could you please check again? Thanks!

Flags: needinfo?(achronop)

Armen [:armenzg]

Comment 17

•

5 years ago

Nvm. I need to look into something first.

Flags: needinfo?(achronop)

Alex Chronopoulos [:achronop]

Comment 18

•

5 years ago

Here is the output when I run one gtest locally with logs on:

$ MOZ_LOG=MediaTrackGraph:4 ./mach gtest TestAudioCallbackDriver.*
Running GTest tests...
Note: Google Test filter = TestAudioCallbackDriver.*
[==========] Running 1 test from 1 test case.
[----------] Global test environment set-up.
[----------] 1 test from TestAudioCallbackDriver
[ RUN      ] TestAudioCallbackDriver.StartStop
[(null) 10493: Main Thread]: D/MediaTrackGraph 0x7f3025e9bbc0: AudioCallbackDriver ctor
[(null) 10493: Main Thread]: D/MediaTrackGraph 0x7f3025e9bbc0: AudioCallbackDriver 0x7f2ffd62e200 Falling back to SystemClockDriver.
[(null) 10493: Main Thread]: D/MediaTrackGraph Starting thread for a SystemClockDriver  0x7f2ffd25d6c0
[(null) 10493: Main Thread]: D/MediaTrackGraph Starting new audio driver off main thread, to ensure it runs after previous shutdown.
[(null) 10493: MediaTrackGrph]: D/MediaTrackGraph Starting a new system driver for graph 0x7f2ffd25d6c0
[(null) 10493: MediaTrackGrph]: W/MediaTrackGraph 0x7f2ffd25d6c0: Global underrun detected
[(null) 10493: MediaTrackGrph]: D/MediaTrackGraph 0x7f2ffd25d6c0: Time did not advance
[(null) 10493: CubebOperation #1]: D/MediaTrackGraph 0x7f3025e9bbc0: AsyncCubebOperation::INIT driver=0x7f2ffd62e200
[(null) 10493: CubebOperation #1]: D/MediaTrackGraph Effective latency in frames: 512
[(null) 10493: CubebOperation #1]: D/MediaTrackGraph AudioCallbackDriver State: STARTED
[(null) 10493: CubebOperation #1]: D/MediaTrackGraph 0x7f3025e9bbc0: AudioCallbackDriver started.
[(null) 10493: Main Thread]: D/MediaTrackGraph 0x7f3025e9bbc0: Releasing audio driver off main thread (GraphDriver::Shutdown).
[(null) 10493: CubebOperation #1]: D/MediaTrackGraph 0x7f3025e9bbc0: AsyncCubebOperation::SHUTDOWN driver=0x7f2ffd62e200
Couldn't convert chrome URL: chrome://branding/locale/brand.properties
[10493, Main Thread] WARNING: Could not get the program name for a cubeb stream.: 'NS_SUCCEEDED(rv)', file /home/achronop/repos/mozilla/firefox/dom/media/CubebUtils.cpp, line 381
[(null) 10493: CubebOperation #1]: D/MediaTrackGraph AudioCallbackDriver State: STOPPED
[       OK ] TestAudioCallbackDriver.StartStop (202 ms)
[----------] 1 test from TestAudioCallbackDriver (202 ms total)

[----------] Global test environment tear-down
[==========] 1 test from 1 test case ran. (203 ms total)
[  PASSED  ] 1 test.

Alex Chronopoulos [:achronop]

Comment 19

•

5 years ago

I tried using the custom re-trigger for mochitests but it did not work. The run is in [1]. The log file is very small and there is a failure message at the end.

[1] https://treeherder.mozilla.org/#/jobs?repo=try&author=achronop%40gmail.com&selectedJob=288524781

Joel Maher ( :jmaher ) (UTC -8)

Reporter

Updated

•

5 years ago

Whiteboard: dev-prod-2020

Joel Maher ( :jmaher ) (UTC -8)

Reporter

Comment 20

•

5 years ago

Armen is moving to another project. gbrown will work to figure out next steps on this for specific test harnesses, and Sarah (in a few weeks) can help out with Treeherder-related work items.

Status: ASSIGNED → NEW

Geoff Brown [:gbrown]

Assignee

Updated

•

5 years ago

Assignee: armenzg → gbrown

Priority: -- → P2

Geoff Brown [:gbrown]

Assignee

Comment 21

•

5 years ago

:armenzg - I intend to continue where you left off in comment 12; if you have any other work-in-progress, tips, thoughts, etc. please let me know.

Flags: needinfo?(armenzg)

Armen [:armenzg]

Comment 22

•

5 years ago

Would you mind meeting with me next week to go over it?
I am available every day except Wednesday.

My suggestion is to move to one action script per suite rather than trying to use one for many.
It would also be good to talk about how to make these kinds of changes easy to test locally without having to use the CI.

Flags: needinfo?(armenzg)

Geoff Brown [:gbrown]

Assignee

Updated

•

5 years ago

URL: https://wiki.mozilla.org/ReleaseEngin...

Geoff Brown [:gbrown]

Assignee

Updated

•

5 years ago

Summary: Better document retrigger with logging & add gtest → Better document retrigger with logging, fix broken cases, and support more test suites

Geoff Brown [:gbrown]

Assignee

Comment 23

•

5 years ago

Attached file Bug 1612345 - Generalize the retrigger-mochitest action; r= — Details

Simple update to strings and names for the custom retrigger action, in preparation
for the addition of more tasks.

Geoff Brown [:gbrown]

Assignee

Comment 24

•

5 years ago

Attached file Bug 1612345 - Support --setpref in geckoview-junit; r= — Details

Add a --setpref option to geckoview-junit with the same meaning and
help description as used in mochitest.
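
A minimal sketch of what such an option might look like with argparse is shown below. The dest name, help text, and the parse_prefs helper are assumptions for illustration and are not taken from the patch; only the --setpref flag name and the PREF=VALUE idea come from this bug.

```python
import argparse


def build_parser():
    """Parser with a repeatable --setpref option (illustrative only)."""
    parser = argparse.ArgumentParser(description="illustrative harness options")
    parser.add_argument(
        "--setpref",
        action="append",
        dest="extra_prefs",
        default=[],
        metavar="PREF=VALUE",
        help="Defines an extra user preference (may be given more than once).",
    )
    return parser


def parse_prefs(pairs):
    """Split PREF=VALUE strings into a dict, rejecting malformed entries."""
    prefs = {}
    for pair in pairs:
        name, sep, value = pair.partition("=")
        if not sep or not name:
            raise ValueError(f"expected PREF=VALUE, got {pair!r}")
        prefs[name] = value
    return prefs


args = build_parser().parse_args(["--setpref", "mygeckopreferences.pref=myvalue2"])
print(parse_prefs(args.extra_prefs))
```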

Geoff Brown [:gbrown]

Assignee

Comment 25

•

5 years ago

Attached file Bug 1612345 - Fix retrigger support for android mochitest variable parameters; r= — Details

The retrigger custom action is busted for Android tasks, failing with "KeyError: u'remote_webserver'",
because it assumes a mozharness configuration format that was changed long ago. This patch brings
things up to date.

Geoff Brown [:gbrown]

Assignee

Updated

•

5 years ago

Keywords: leave-open

Pulsebot

Comment 26

•

5 years ago

Pushed by gbrown@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/1e4b56bff5a1
Generalize the retrigger-mochitest action; r=bc
https://hg.mozilla.org/integration/autoland/rev/2caf817caafd
Support --setpref in geckoview-junit; r=bc
https://hg.mozilla.org/integration/autoland/rev/524ea269c239
Fix retrigger support for android mochitest variable parameters; r=bc

Natalia Csoregi [:nataliaCs]

Comment 27

•

5 years ago

bugherder

https://hg.mozilla.org/mozilla-central/rev/1e4b56bff5a1
https://hg.mozilla.org/mozilla-central/rev/2caf817caafd
https://hg.mozilla.org/mozilla-central/rev/524ea269c239

Geoff Brown [:gbrown]

Assignee

Comment 28

•

5 years ago

Still to do:

add geckoview-junit
add gtest
better defaults (or new UI?)
add other suites that might already be mostly supported
environment pass-through for android
chunk support (keep chunk arguments of original task)
allow override of harness arguments
crashreporting in new task (esp symbols-path)

Geoff Brown [:gbrown]

Assignee

Comment 29

•

5 years ago

Attached file Bug 1612345 - Add custom retrigger support for geckoview-junit; r= — Details

Pulsebot

Comment 30

•

5 years ago

Pushed by gbrown@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/99aa249db142 Add custom retrigger support for geckoview-junit; r=bc

Cristina Coroiu [:ccoroiu]

Comment 31

•

5 years ago

bugherder

https://hg.mozilla.org/mozilla-central/rev/99aa249db142

Geoff Brown [:gbrown]

Assignee

Comment 32

•

5 years ago

Attached file Bug 1612345 - Convert gtest argument parser to argparse; r= — Details

Convert the gtest option parser from optparse to argparse. mochitest, reftest,
and other suites use argparse. Using argparse will simplify the integration
of gtest with the custom retrigger action.
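
For readers unfamiliar with the two modules, the following sketch shows the general shape of such a conversion. The option names (--cwd, --symbols-path) and defaults are chosen for illustration and may not match the real gtest harness options.

```python
import argparse
import optparse


def optparse_parser():
    """Old style: options via add_option, positionals returned separately."""
    parser = optparse.OptionParser()
    parser.add_option("--cwd", dest="cwd", default=".", help="working directory")
    parser.add_option("--symbols-path", dest="symbols_path", default=None)
    return parser


def argparse_parser():
    """New style: positionals are declared explicitly alongside the options."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--cwd", default=".", help="working directory")
    parser.add_argument("--symbols-path", dest="symbols_path", default=None)
    parser.add_argument("args", nargs="*", help="positional arguments")
    return parser


argv = ["--cwd", "/tmp/gtest", "firefox"]
old_opts, old_args = optparse_parser().parse_args(argv)
new_opts = argparse_parser().parse_args(argv)
print(old_opts.cwd == new_opts.cwd, old_args == new_opts.args)
```

The main behavioral difference visible here is that optparse returns an (options, args) pair, while argparse returns a single namespace, so callers that unpack the tuple need to be updated.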

Pulsebot

Comment 33

•

5 years ago

Pushed by gbrown@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/53ad80b01022 Convert gtest argument parser to argparse; r=bc

Oana Pop-Rus

Comment 34

•

5 years ago

bugherder

https://hg.mozilla.org/mozilla-central/rev/53ad80b01022

Geoff Brown [:gbrown]

Assignee

Comment 35

•

5 years ago

Attached file Bug 1612345 - Change defaults for custom retrigger action; r= — Details

Update the default values to avoid common pitfalls, such as trying to repeat
a 30-minute task 30 times with extra logging!
The new defaults allow a simple re-run of most tasks with no changes.
While we are here, tweak the parameter descriptions.

Pulsebot

Comment 36

•

5 years ago

Pushed by gbrown@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/fdbb08f3e279 Change defaults for custom retrigger action; r=bc

Bogdan Tara[:bogdan_tara | bogdant]

Comment 37

•

5 years ago

bugherder

https://hg.mozilla.org/mozilla-central/rev/fdbb08f3e279

Geoff Brown [:gbrown]

Assignee

Comment 38

•

5 years ago

Attached file Bug 1612345 - Add custom retrigger support for gtest; r= — Details

Add test package mach support for gtest and hook into the custom retrigger
action. Some existing custom retrigger features, like setting gecko prefs,
are not (easily) applicable to gtest, which doesn't use mozprofile; for
this reason, use a separate action context with items suitable for gtest.

Geoff Brown [:gbrown]

Assignee

Updated

•

5 years ago

Pulsebot

Comment 39

•

5 years ago

Pushed by gbrown@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/9e60199e3597 Add custom retrigger support for gtest; r=bc

Cosmin Sabou [:CosminS]

Comment 40

•

5 years ago

bugherder

https://hg.mozilla.org/mozilla-central/rev/9e60199e3597

Geoff Brown [:gbrown]

Assignee

Comment 41

•

5 years ago

Attached file Bug 1612345 - Ensure that most custom retriggers repeat the original task by default; r= — Details

Various updates to the custom retrigger action so that, without any custom changes to
parameters, the retriggered task runs with the same parameters as the original task.
Several issues were found and corrected, notably:

parameters like --allow-software-gl-layers were ignored
MOZHARNESS_TEST_PATHS was ignored
using repeat=1 by default meant that each test ran twice
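
The last item is easy to trip over, so here is a small sketch of the arithmetic. It assumes that --repeat N means "run N additional times", which is consistent with the observation above that repeat=1 ran each test twice; the helper names are hypothetical.

```python
def total_runs(repeat):
    """Number of times each test executes for a given repeat value."""
    if repeat < 0:
        raise ValueError("repeat must not be negative")
    return repeat + 1


def repeat_args(repeat):
    """Only add a repeat flag when extra runs were requested (hypothetical helper)."""
    return [f"--repeat={repeat}"] if repeat > 0 else []


for value in (0, 1, 30):
    print(value, total_runs(value), repeat_args(value))
```

Under this assumption, a default of repeat=0 with no extra flag is what makes an unmodified retrigger behave like the original task.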

Geoff Brown [:gbrown]

Assignee

Comment 42

•

5 years ago

(In reply to Geoff Brown [:gbrown] from comment #28)

Still to do:

x add geckoview-junit
x add gtest
x better defaults (or new UI?)
add other suites that might already be mostly supported
environment pass-through for android
chunk support (keep chunk arguments of original task)
allow override of harness arguments
crashreporting in new task (esp symbols-path)
windows/mac/android-hw (generic-worker) support

Pulsebot

Comment 43

•

5 years ago

Pushed by gbrown@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/27f290b2a3e9 Ensure that most custom retriggers repeat the original task by default; r=bc

Noemi Erli[:noemi_erli]

Comment 44

•

5 years ago

bugherder

https://hg.mozilla.org/mozilla-central/rev/27f290b2a3e9

Geoff Brown [:gbrown]

Assignee

Comment 46

•

5 years ago

Attached file Bug 1612345 - Restrict custom retrigger to docker-worker tasks; r= — Details

The custom retrigger actions work well on linux and android-em, but fail
on windows, osx, and android-hw. At least part of the problem seems to be
the worker implementation, but I am not entirely clear on what goes wrong.
It looks like I won't have much more time for retrigger improvements in the
near future, so I'd prefer to "turn off" the actions on tasks known to fail.
I found helpful examples for the 'context' parameter in
https://searchfox.org/mozilla-central/source/taskcluster/docs/actions.rst
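
As I understand the 'context' parameter, it is a list of tag-sets, and an action is offered for a task when all tags in at least one tag-set match the task's tags. The sketch below models that matching rule in plain Python; the tag names and values used here (test-type, worker-implementation, docker-worker, generic-worker) are assumptions for illustration rather than a copy of the patch.

```python
def action_applies(context, task_tags):
    """True if any tag-set in `context` is fully contained in the task's tags."""
    return any(
        all(task_tags.get(key) == value for key, value in tag_set.items())
        for tag_set in context
    )


# Assumed tag names and values, for illustration only.
CONTEXT = [
    {"test-type": "mochitest", "worker-implementation": "docker-worker"},
    {"test-type": "reftest", "worker-implementation": "docker-worker"},
]

linux_task = {"test-type": "mochitest", "worker-implementation": "docker-worker"}
windows_task = {"test-type": "mochitest", "worker-implementation": "generic-worker"}

print(action_applies(CONTEXT, linux_task))
print(action_applies(CONTEXT, windows_task))
```

Adding the worker tag to every tag-set is what hides the action on tasks where it is known to fail, without touching the action's behavior elsewhere.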

Pulsebot

Comment 47

•

5 years ago

Pushed by gbrown@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/818a39ddca16 Restrict custom retrigger to docker-worker tasks; r=bc

Razvan Maries

Comment 48

•

5 years ago

bugherder

https://hg.mozilla.org/mozilla-central/rev/818a39ddca16

Geoff Brown [:gbrown]

Assignee

Comment 49

•

5 years ago

I'm declaring victory here. The main remaining element I was waiting for was environment pass-through for android, but there is a separate bug for that. Otherwise, I think the retrigger action is in good shape and reasonably documented now.

Status: NEW → RESOLVED

Closed: 5 years ago

Resolution: --- → FIXED

Geoff Brown [:gbrown]

Assignee

Updated

•

5 years ago

Keywords: leave-open

Mike Hommey [:glandium]

Updated

•

4 years ago

Blocks: 1690174
