1322433 - Make it easier to retrigger a job with failing test with extra logging and debugging options

Reporter

Description

•

8 years ago

In the Stockwell discussion, :jesup suggested that we should make it easier to (re)trigger a test job with specific options:

* Extra logging (higher mozlog levels)
* Run just one specific test (probably the failing test) with specific options

Right now, it's possible to do this by formulating a try push with the right set of options, but there's probably only a very small number of people who know how to do this. A one-click loaner is a possible alternative, but it still requires a bunch of babysitting/setup of the loaner.

Not exactly sure what the solution here is, but it should be more accessible than that. The ideal, if possible, would be to just have some kind of button / option in Treeherder to perform this type of action. i.e. "This test failed in a suspicious way, I want to click a button and have useful debugging information"

Hopefully that's an accurate summary of the discussion/request, let me know if I'm missing something.

Joel Maher ( :jmaher ) (UTC -8)

Comment 1

•

7 years ago

I really like this bug, there is another use case we discussed earlier this year in bug 1241535.

Even if we can only do this for taskcluster, that is fine- I see the ideal solution as a button in treeherder that pops up a small form prefilled out with the suggested debug flags for logging, and/or other mozharness options to make the harness behave properly.

As discussed in bug 1241535, we would need to indicate in the UI that this is not a typical job and to ensure that the data from the log is not accumulated in other tools (i.e. perfherder summaries, sheriff portals/autoclassification, mozreview summary).  While I am not a fan of tier-3 or hidden jobs, this might fit well in there.

Blocks: 1307197

Armen [:armenzg]

Comment 2

•

7 years ago

There are a few ways this can be implemented. Ignoring Buildbot for now.

Once the developer has selected the job they want to re-schedule with modified options we can then take them there in various ways.

1 - The developer has logged in to TH with TC credentials and we can schedule tasks for them hitting TC task creation APIs

2 - The developer is redirected to the task creator [1] and can modify the task there

3 - A pulse message is sent with some extra information (not the whole task) which we can pass to Mozharness + selected task ID
  * Pulse_actions handles the scheduling
  * A TC pulse listener handles the scheduling
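
As a rough illustration of option 1, scheduling a modified job mostly amounts to copying the original task definition, refreshing its timestamps, and adding the debugging options before submitting it. The sketch below only builds the modified definition and does not call any Taskcluster API. The field names (`created`, `deadline`, `expires`, `payload.env`, `tags`) are assumed to follow the usual shape of a task definition, while `build_debug_task`, the `debug-retrigger` tag, and the chosen lifetimes are hypothetical.

```python
import copy
from datetime import datetime, timedelta, timezone


def iso(ts):
    """Format a datetime as an ISO 8601 string with millisecond precision."""
    return ts.strftime("%Y-%m-%dT%H:%M:%S.%f")[:-3] + "Z"


def build_debug_task(original, extra_env, now=None):
    """Return a modified copy of `original` with extra debugging options.

    `original` is a task definition as a plain dict; it is not mutated.
    `extra_env` maps environment variable names to string values.
    """
    now = now or datetime.now(timezone.utc)
    task = copy.deepcopy(original)

    # Fresh timestamps: the copied ones have usually already passed.
    # The 4 hour / 14 day lifetimes are arbitrary values for this sketch.
    task["created"] = iso(now)
    task["deadline"] = iso(now + timedelta(hours=4))
    task["expires"] = iso(now + timedelta(days=14))

    # Pass the debugging options to the harness through the environment.
    payload = task.setdefault("payload", {})
    payload.setdefault("env", {}).update(extra_env)

    # Hypothetical tag so that people and tools can tell this copy apart
    # from a regular job.
    task.setdefault("tags", {})["debug-retrigger"] = "true"
    return task
```

Whichever component ends up submitting the result (Treeherder, the task creator, or a Pulse listener), the transformation itself stays the same, which is why the three options differ mainly in who holds the credentials.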

Adding dustin to check if I'm missing something.
I would like to see #1 happen. It should not be too difficult.

[1] https://tools.taskcluster.net/task-creator/

Joel Maher ( :jmaher ) (UTC -8)

Comment 3

•

7 years ago

I don't like #2 as it involves many steps and many chances to get something wrong, and it is not discoverable via treeherder, which makes it harder to share unless you have the exact link.  #1 sounds good, especially if there are task creation APIs which exist :)

Joel Maher ( :jmaher ) (UTC -8)

Comment 4

•

7 years ago

hmm, wait, if #1 is just an automated #2, then it is hard to discover without the exact taskcluster link- unless there is a better way to easily find this information in the future from looking at a push on treeherder, I would prefer another method.

Armen [:armenzg]

Comment 5

•

7 years ago

In all of the options I listed above I expect a TH UI to help the developer select certain options.
No discoverability issues besides finding the UI element on TH to get to the selection page.

We then decide whether to:
1 - Take that information and schedule the task directly via TC APIs
2 - Fill up the task-creator with the right information (the dev can still make further modifications before hitting 'create task')
3 - Enough info is sent over Pulse and some tool will fulfill the scheduling

Joel Maher ( :jmaher ) (UTC -8)

Comment 6

•

7 years ago

the issue I have is that by looking at a push on treeherder, how will I know how to find the log and artifacts of the created task?  Whenever I edit&create a task, the job is not seen on treeherder and I have not found a way to get the information when I lose the link.  I imagine the default method will be "run task with preset options", but it would be nice to have more options for advanced task editing- there is a lot of value there although almost all developers will find it overwhelming.

Armen [:armenzg]

Comment 7

•

7 years ago

With TH routes/scopes you will be able to see the job directly on TH and won't need to keep track of a link to the created task.

IIRC ATM developers don't get TH scopes.

Andrew Halberstadt [:ahal]

Comment 8

•

7 years ago

My initial thought is that we should use the task-creator for this. I agree that, as it is now, the task-creator is too confusing to figure out what to do, but we should be able to improve the UX there a bit and fix bugs, like adding an option to make the created tasks visible on treeherder (if they aren't already).

I suspect building a new system in treeherder will be a lot more work than iterating on the task-creator. My vote would be option #2 under the caveat that we spend time improving the task-creator.

Dustin J. Mitchell [:dustin] (he/him)

Comment 9

•

7 years ago

Bearing in mind that task-creator is not gecko-specific, that might be a bit tricky.  That said, task-creator is a few hundred lines of React, so it shouldn't be too hard to create a similar thing elsewhere, be that in releng web or treeherder.

William Lachance (:wlach)

Reporter

Comment 10

•

7 years ago

I think I agree with Dustin that the task creator isn't the right tool for this task, as it exposes too much of Taskcluster's internals, which aren't relevant here. Instead, I think a custom dialog inside Treeherder would be a better/easier user interface.

Tentative plan:

1. Have taskcluster jobs upload some kind of json file indicating extra configurable options that may be passed to them
2. Expose a GUI from Treeherder that reads from the above list, lets the user pick/enter the ones they want, and then retriggers the job using them.
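
To make the two steps above a little more concrete, here is a minimal sketch of what such a JSON file and the matching validation might look like. Nothing here reflects an agreed format: the manifest layout, the option names (`verbose_logging`, `test_path`, `repeat`), and the `resolve_options` helper are all hypothetical.

```python
import json

# Hypothetical manifest that a job could upload as an artifact. Each entry
# names one configurable option, its type, and its default value.
MANIFEST = json.loads("""
[
  {"name": "verbose_logging", "type": "boolean", "default": false},
  {"name": "test_path", "type": "string", "default": ""},
  {"name": "repeat", "type": "integer", "default": 1}
]
""")

# Map the manifest's type names onto Python types.
TYPES = {"boolean": bool, "string": str, "integer": int}


def resolve_options(manifest, chosen):
    """Merge the user's choices with manifest defaults, rejecting bad input."""
    known = {opt["name"]: opt for opt in manifest}
    unknown = set(chosen) - set(known)
    if unknown:
        raise ValueError("unknown options: %s" % ", ".join(sorted(unknown)))

    resolved = {}
    for name, opt in known.items():
        value = chosen.get(name, opt["default"])
        # Compare exact types: bool is a subclass of int in Python, so an
        # isinstance() check would wrongly accept True for an integer option.
        if type(value) is not TYPES[opt["type"]]:
            raise TypeError("%s must be of type %s" % (name, opt["type"]))
        resolved[name] = value
    return resolved
```

A dialog built on this would render one control per manifest entry and pass the collected values through something like `resolve_options` before retriggering, so a job only ever receives options it declared itself.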

Going to land a loose dependency on bug 1285007, the new menu there would be a great place to expose this option.

Assignee: nobody → wlachance

Depends on: 1285007

Dustin J. Mitchell [:dustin] (he/him)

Comment 11

•

7 years ago

This sounds like an action task, actually.  So far we don't have support for parameterizing action tasks, but that's probably relatively straightforward to add (if laborious, requiring specifying a list of fields and their types).

(not currently active) Ted Mielczarek

Comment 12

•

7 years ago

This is a great idea, and I just want to say that I agree that we should have some sort of specialized UI for it. What precisely that looks like, I dunno. The simplest thing I could imagine would be an entry in Treeherder's "Retrigger" button menu that says "Retrigger with extra logging", which would only cover about half of jesup's request. If we let tasks specify the knobs that could be fiddled then we could have that menu item pop up a little dialog to choose specific options, like:
```
[x] Enable verbose logging
[x] Enable WebRTC logs
...
[x] Run just tests in this path: [dom/media/webrtc   ]
```

William Lachance (:wlach)

Reporter

Comment 13

•

7 years ago

(In reply to Ted Mielczarek [:ted.mielczarek] from comment #12)
> This is a great idea, and I just want to say that I agree that we should
> have some sort of specialized UI for it. What precisely that looks like, I
> dunno. The simplest thing I could imagine would be an entry in Treeherder's
> "Retrigger" button menu that says "Retrigger with extra logging", which
> would only cover about half of jesup's request. If we let tasks specify the
> knobs that could be fiddled then we could have that menu item pop up a
> little dialog to choose specific options, like:
> ```
> [x] Enable verbose logging
> [x] Enable WebRTC logs
> ...
> [x] Run just tests in this path: [dom/media/webrtc   ]
> ```

Yes, I chatted with :dustin and a few others on #taskcluster about almost exactly this. I should have a more concrete proposal soon. :)

William Lachance (:wlach)

Reporter

Comment 14

•

7 years ago

Jonas just proposed a detailed design on retriggerable tasks which we could use as the basis for this: 

https://groups.google.com/d/msg/mozilla.tools/VeyYYCVuzak/ehaF_ScqBAAJ

I imagine we will file separate bugs for all the taskcluster and/or treeherder components that would go into this, but we can continue to use this bug to track the overall status of this feature.

William Lachance (:wlach)

Reporter

Updated

•

7 years ago

Depends on: 1332506

William Lachance (:wlach)

Reporter

Comment 15

•

7 years ago

Attached file bug submitting — Details

Ok, so update, this is coming along. I have an in-tree implementation of mochitest retriggering set up, and I'm testing it via a try push. I think I have just about everything wired up, though I'm hitting a snag when actually submitting the job to taskcluster, getting the above error in a notification (after a fresh login on a private browsing instance).

Brian, do you know what might be up here? I based my code to submit the modified task on your work. The code is here:

https://github.com/mozilla/treeherder/blob/custom-job-actions/ui/js/controllers/tcjobactions.js#L30

Flags: needinfo?(bstack)

William Lachance (:wlach)

Reporter

Comment 16

•

7 years ago

(In reply to William Lachance (:wlach) (use needinfo!) from comment #15)
> Created attachment 8838691 [details]
> bug submitting
> 
> Ok, so update, this is coming along. I have an in-tree implementation of
> mochitest retriggering set up, and I'm testing it via a try push. I think I
> have just about everything wired up, though I'm hitting a snag when actually
> submitting the job to taskcluster, getting the above error in a notification
> (after a fresh login on a private browsing instance).
> 
> Brian, do you know what might be up here? I based my code to submit the
> modified task on your work. The code is here:
> 
> https://github.com/mozilla/treeherder/blob/custom-job-actions/ui/js/
> controllers/tcjobactions.js#L30

Brian was kind enough to work through this with me on irc. We've filed bug 1340668.

Depends on: 1340668

Flags: needinfo?(bstack)

William Lachance (:wlach)

Reporter

Updated

•

7 years ago

Depends on: 1341727

William Lachance (:wlach)

Reporter

Comment 17

•

7 years ago

https://treeherder.mozilla.org/#/jobs?repo=try&revision=846fb683a94e

William Lachance (:wlach)

Reporter

Comment 18

•

7 years ago

https://treeherder.mozilla.org/#/jobs?repo=try&revision=a1399a2e26bd

Comment hidden (mozreview-request)

This will be used to restrict mochitest actions to mochitest jobs only.

Review commit: https://reviewboard.mozilla.org/r/115084/diff/#index_header
See other reviews: https://reviewboard.mozilla.org/r/115084/

William Lachance (:wlach)

Reporter

Comment 20

•

7 years ago

I'm filing a review request for some work to add tags to mochitest jobs. The rest of this work is close, but not quite there, and I didn't want to let the tag work bitrot anymore (it had already broken once).

Jonas Finnemann Jensen (:jonasfj)

Comment 21

•

7 years ago

mozreview-review

Comment on attachment 8840631 [details]
Bug 1322433 - Make it possible to add tags + add a mochitest tag to mochitest jobs

https://reviewboard.mozilla.org/r/115084/#review116600

Attachment #8840631 - Flags: review?(jopsen) → review+

Pulsebot

Comment 22

•

7 years ago

Pushed by wlachance@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/9a85f428c314
Make it possible to add tags + add a mochitest tag to mochitest jobs r=jonasfj

Carsten Book [:Tomcat]

Comment 23

•

7 years ago

bugherder

https://hg.mozilla.org/mozilla-central/rev/9a85f428c314

Status: NEW → RESOLVED

Closed: 7 years ago

status-firefox54: --- → fixed

Resolution: --- → FIXED

Target Milestone: --- → mozilla54

William Lachance (:wlach)

Reporter

Comment 24

•

7 years ago

Sorry, should have marked this bug as leave-open. We're not done here yet.

Status: RESOLVED → REOPENED

status-firefox54: fixed → ---

Resolution: FIXED → ---

Target Milestone: mozilla54 → ---

William Lachance (:wlach)

Reporter

Comment 25

•

7 years ago

Ok so making some more progress. Thanks to :ahal, I got the idea of running the mach command to execute the mochitest *after* a "setup only" mozharness run. This simplifies things considerably, though it also introduces the problem of how to figure out the required arguments to mach that we need to run things here, which includes (at least):

* Whether to enable e10s
* mochitest flavor

I could make these environment variables, tags, or something else entirely in the taskcluster configs. Thoughts?
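
For the environment-variable variant, a minimal sketch of deriving the mach invocation might look like the following. The variable names `MOCHITEST_FLAVOR` and `ENABLE_E10S` and the helper itself are hypothetical, and the `--flavor`, `--e10s` and `--disable-e10s` flags are assumed to match what `mach mochitest` accepts.

```python
def mach_mochitest_command(env, test_paths=()):
    """Build a mach command line from task environment variables.

    `env` is a mapping such as os.environ; `test_paths` are the tests or
    directories to run. Returns the command as a list of arguments.
    """
    # Hypothetical variable names; a real task config may use others.
    flavor = env.get("MOCHITEST_FLAVOR", "plain")
    e10s = env.get("ENABLE_E10S", "true").lower() in ("1", "true", "yes")

    cmd = ["./mach", "mochitest", "--flavor", flavor]
    cmd.append("--e10s" if e10s else "--disable-e10s")
    cmd.extend(test_paths)
    return cmd
```

Tags would work much the same way, with the lookup reading from the task definition instead of the process environment. Either way, the task config stays the single place that records how a given job was run.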

William Lachance (:wlach)

Reporter

Updated

•

7 years ago

Depends on: 1343327

William Lachance (:wlach)

Reporter

Comment 26

•

7 years ago

reviewboard seems to be getting angry with me for having multiple review requests belonging to this bug, so let's just split off the mochitest parts of this (which I wanted feedback on) to bug 1343327

William Lachance (:wlach)

Reporter

Updated

•

7 years ago

Depends on: 1347696

William Lachance (:wlach)

Reporter

Updated

•

7 years ago

Depends on: 1347698

William Lachance (:wlach)

Reporter

Updated

•

7 years ago

Depends on: 1347732

GitHub Autolander Bot

Comment 27

•

7 years ago

Attached file [treeherder] wlach:1322433 > mozilla:master (obsolete) — Details

William Lachance (:wlach)

Reporter

Comment 28

•

7 years ago

Comment on attachment 8847814 [details] [review]
[treeherder] wlach:1322433 > mozilla:master

mislabeled patch, sorry for the noise

Attachment #8847814 - Attachment is obsolete: true

William Lachance (:wlach)

Reporter

Updated

•

7 years ago

Depends on: 1348833

William Lachance (:wlach)

Reporter

Updated

•

7 years ago

Depends on: 1347654

William Lachance (:wlach)

Reporter

Updated

•

7 years ago

Assignee: wlachance → nobody

William Lachance (:wlach)

Reporter

Comment 29

•

7 years ago

Not actively working on this at the moment, current state described here:

https://wlach.github.io/blog/2017/04/easier-reproduction-of-intermittent-test-failures-in-automation/

Geoff Brown [:gbrown] (pto Apr 11-18)

Updated

•

5 years ago

Priority: -- → P3

BMO Automation

Updated

•

2 years ago

Severity: normal → S3

Attachments:

* bug submitting (719 bytes, text/plain), William Lachance (:wlach), 7 years ago
* Bug 1322433 - Make it possible to add tags + add a mochitest tag to mochitest jobs (59 bytes, text/x-review-board-request), William Lachance (:wlach), 7 years ago, jonasfj: review+
* [treeherder] wlach:1322433 > mozilla:master (47 bytes, text/x-github-pull-request), GitHub Autolander Bot, 7 years ago