Closed Bug 1672967 Opened 4 years ago Closed 4 years ago

retriggering backfill task (*-bk) creates non-backfill version of chunk containing different tests

Categories

(Tree Management :: Treeherder, defect)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: aryx, Assigned: ahal)

References

(Depends on 1 open bug)

Attachments

(5 files, 1 obsolete file)

Retriggering a backfill task (task label similar to *-bk) creates a non-backfill version of the chunk containing different tests. It should create a new *-bk task with the same test manifests in it.

Example:

  1. Open this Treeherder link.
  2. After the tasks have loaded (and you are logged in with an account that has level 3 commit access), press the 'R' key to request a retrigger.
  3. Wait until the new task is shown.

Actual result:
An M(mda2) task gets added that runs dom/media/webaudio/test/mochitest.ini

Expected result:
Another M-bk(mda2-dafa26b89ed-bk) containing dom/media/webaudio/test/mochitest.ini gets added

Assignee: nobody → ahal
Status: NEW → ASSIGNED

If we aren't able to assign the appropriate manifests to the backfill, the
backfill is useless. Let's just let the exception propagate and fail loudly,
rather than log an error and keep going.
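
A minimal sketch of the difference between the two behaviours, not the actual taskgraph code. The function names, the `lookup_manifests` callable and the `test_manifests` key are all hypothetical and only illustrate "log and continue" versus "let the exception propagate":

```python
import logging

logger = logging.getLogger(__name__)


def assign_manifests_quietly(task, lookup_manifests):
    """Log-and-continue: a failed lookup leaves the task without manifests."""
    try:
        task["test_manifests"] = lookup_manifests(task["label"])
    except Exception:
        logger.exception("could not assign manifests to %s", task["label"])
    return task


def assign_manifests_loudly(task, lookup_manifests):
    """Fail loudly: a failed lookup propagates and aborts task creation."""
    task["test_manifests"] = lookup_manifests(task["label"])
    return task


def broken_lookup(label):
    """Hypothetical lookup that always fails, standing in for a real error."""
    raise KeyError(label)
```

With `broken_lookup`, the quiet variant returns a task with no manifests attached (a useless backfill), while the loud variant raises `KeyError` before any task is created.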

According to the documentation, actions are supposed to have an "order" system,
where the first action that applies to a task via its context takes precedence.
But when running the match algorithm, we never break out of the loop. So we end
up applying the last action that matches a task's context.

As far as I can tell, this means we have been using the 'retrigger-disabled'
action for everything and things like retriggering Decision tasks don't work.
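
The lookup bug can be sketched roughly as follows. This is an illustration under stated assumptions, not the taskgraph implementation: the `order` values and `context` tag sets below are made up, and `matches`, `last_match` and `first_match` are hypothetical helper names.

```python
def matches(action, task_tags):
    """An action applies if any of its context tag sets is a subset of the task's tags."""
    return any(
        all(task_tags.get(key) == value for key, value in tagset.items())
        for tagset in action["context"]
    )


def last_match(actions, task_tags):
    """Buggy lookup: the loop never stops, so the last applicable action wins."""
    found = None
    for action in sorted(actions, key=lambda a: a["order"]):
        if matches(action, task_tags):
            found = action
    return found


def first_match(actions, task_tags):
    """Fixed lookup: return the first applicable action in order."""
    for action in sorted(actions, key=lambda a: a["order"]):
        if matches(action, task_tags):
            return action
    return None


# Illustrative registrations only; the empty tag set {} matches every task.
ACTIONS = [
    {"name": "retrigger-decision", "order": 10, "context": [{"kind": "decision-task"}]},
    {"name": "retrigger", "order": 20, "context": [{"retrigger": "true"}]},
    {"name": "retrigger-disabled", "order": 30, "context": [{}]},
]
```

Under these assumptions, a task tagged `{"kind": "decision-task"}` resolves to 'retrigger-disabled' with `last_match` and to 'retrigger-decision' with `first_match`.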

Depends on D97350

Backfill tasks might run with a slightly different definition than the task
with the same label in the current task graph. When we encounter them, ensure
we create a new task with an (almost) identical definition, rather than
grabbing it from the current push's full-task-graph.json.

There's already an action that has this behaviour (that is used for the
decision task). So rather than re-implement this behaviour in the regular
retrigger action, this patch ensures that the 'retrigger-decision' action
applies to backfills rather than the regular one.
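
To illustrate why the source of the definition matters, here is a small sketch with made-up data. The label, the second manifest path and both helper functions are hypothetical; only the idea of cloning the existing definition instead of looking the label up in the current graph comes from the description above.

```python
import copy

# Hypothetical data: the current push schedules this label with one manifest,
# while the backfill ran under the same label with a different one.
FULL_TASK_GRAPH = {
    "test-mochitest-media-2": {
        "label": "test-mochitest-media-2",
        "test_manifests": ["some/other/mochitest.ini"],
    }
}
BACKFILL_DEFINITION = {
    "label": "test-mochitest-media-2",
    "test_manifests": ["dom/media/webaudio/test/mochitest.ini"],
}


def retrigger_by_label(label, full_task_graph):
    """Regular retrigger: rebuild the task from the current push's graph."""
    return copy.deepcopy(full_task_graph[label])


def retrigger_by_definition(definition):
    """Backfill retrigger: clone the definition the original task ran with."""
    return copy.deepcopy(definition)
```

Looking the label up yields the current push's manifests, whereas cloning the definition preserves the manifests the backfill actually ran.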

Depends on D97351

Pushed by ahalberstadt@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/e50aaa2c8db5
[taskgraph.action] Don't proceed with creating task if 'test_manifest_modifier' failed, r=taskgraph-reviewers,jmaher
https://hg.mozilla.org/integration/autoland/rev/32f3ce3abd8f
[taskgraph.action] Fix bug in actions lookup, r=taskgraph-reviewers,jmaher
https://hg.mozilla.org/integration/autoland/rev/ca7036284a0f
[taskgraph.action] Ensure we copy the task definition when retriggering backfills, r=taskgraph-reviewers,jmaher

Looks like the above doesn't quite fix this bug (this was tricky to test on try, so not entirely unexpected), though it does enable an easy workaround. I attempted to retrigger a backfill on the autoland push where this landed:
https://treeherder.mozilla.org/jobs?repo=autoland&revision=ca7036284a0f339ed3a3c304d0c80414cd5180cd&searchStr=Linux%2C18.04%2Cx64%2Ctsan%2Copt%2CXpcshell%2Ctests%2Ctest-linux1804-64-tsan%2Fopt-xpcshell-e10s%2CX4

When I clicked the normal "retrigger" button, it created those X4 tasks (not what we want). But when I chose "Custom Actions => Retrigger", it duplicated the backfill task as expected.

Sheriffs: until this is fixed, you can retrigger backfill tasks properly via 'Custom Actions => Retrigger'.

Sarah, do you know which action the retrigger button is toggling and why it might be different from the retriggering in the "Custom Actions" menu?

Flags: needinfo?(sclements)
Keywords: leave-open
Attached file Retrigger-button.rar

I noticed that if the cursor is on the upper part of the button, it will say "Repeat the selected job" and if it is on the lower part it will say "Retrigger job".

After checking the elements I noticed that there seems to be an action for the <button> element which is a little higher than the <svg> element that contains the spin arrow and another action for the <svg>.

I attached some screenshots (Retrigger-button.rar).

Attached file Treeherder button (obsolete)

Pasting link to image directly.

Attachment #9188689 - Attachment is obsolete: true
Attached image retrigger-job.JPG

Try again

Attachment #9188682 - Attachment is obsolete: true

(In reply to Razvan Maries from comment #6)

> Created attachment 9188682 [details]
> Retrigger-button.rar
>
> I noticed that if the cursor is on the upper part of the button, it will say "Repeat the selected job" and if it is on the lower part it will say "Retrigger job".
>
> After checking the elements I noticed that there seems to be an action for the <button> element which is a little higher than the <svg> element that contains the spin arrow and another action for the <svg>.
>
> I attached some screenshots (Retrigger-button.rar).

That doesn't have anything to do with it, there's just two titles: one for the icon and one for the button but there's only one event handler/action called.

Comment on attachment 9188682 [details]
Retrigger-button.rar

Sorry, I guess the .rar has four images, so I shouldn't have obsoleted it.

Attachment #9188682 - Attachment is obsolete: false

(In reply to Sarah Clements [:sclements] from comment #9)

> That doesn't have anything to do with it, there's just two titles: one for the icon and one for the button but there's only one event handler/action called.

I confirmed retriggering the backfill by clicking either the top or the bottom of the button has the same behaviour.

(In reply to Andrew Halberstadt [:ahal] from comment #5)

> Sarah, do you know which action the retrigger button is toggling and why it might be different from the retriggering in the "Custom Actions" menu?

When I try retriggering via the action bar retrigger button, the 'retrigger-multiple' action object is used and we're not supplying the task id. It seems the solution would be to change this: https://github.com/mozilla/treeherder/blob/master/ui/models/job.js#L115 to match what custom actions is doing here: https://github.com/mozilla/treeherder/blob/master/ui/job-view/CustomJobActions.jsx#L148

retrigger button action object:

{
  "context": [],
  "description": "Create a clone of the task.",
  "extra": {
    "actionPerm": "generic"
  },
  "hookGroupId": "project-gecko",
  "hookId": "in-tree-action-3-generic/1bfd42f8b7",
  "hookPayload": {
    "decision": {
      "action": {
        "cb_name": "retrigger-multiple",
        "description": "Create a clone of the task.",
        "name": "retrigger-multiple",
        "symbol": "rt",
        "taskGroupId": "Lz2APtBVSJCBr_q7234cdA",
        "title": "Retrigger"
      },
      "push": {
        "owner": "mozilla-taskcluster-maintenance@mozilla.com",
        "pushlog_id": "131289",
        "revision": "98986391515e2dc0db41881316d9baa8ea228108"
      },
      "repository": {
        "level": "3",
        "project": "autoland",
        "url": "https://hg.mozilla.org/integration/autoland"
      }
    },
    "user": {
      "input": {
        "$eval": "input"
      },
      "taskGroupId": {
        "$eval": "taskGroupId"
      },
      "taskId": {
        "$eval": "taskId"
      }
    }
  },
  "kind": "hook",
  "name": "retrigger-multiple",
  "schema": {
    "properties": {
      "additionalProperties": false,
      "requests": {
        "items": {
          "additionalProperties": false,
          "tasks": {
            "description": "An array of task labels",
            "items": {
              "type": "string"
            },
            "type": "array"
          },
          "times": {
            "description": "How many times to run each task.",
            "maximum": 100,
            "minimum": 1,
            "title": "Times",
            "type": "integer"
          }
        },
        "type": "array"
      }
    },
    "type": "object"
  },
  "title": "Retrigger"
}

Compare with the action object used for Custom Actions => Retrigger:

{
  "context": [
    {
      "retrigger": "true"
    }
  ],
  "description": "Create a clone of the task.",
  "extra": {
    "actionPerm": "generic"
  },
  "hookGroupId": "project-gecko",
  "hookId": "in-tree-action-3-generic/1bfd42f8b7",
  "hookPayload": {
    "decision": {
      "action": {
        "cb_name": "retrigger",
        "description": "Create a clone of the task.",
        "name": "retrigger",
        "symbol": "rt",
        "taskGroupId": "Lz2APtBVSJCBr_q7234cdA",
        "title": "Retrigger"
      },
      "push": {
        "owner": "mozilla-taskcluster-maintenance@mozilla.com",
        "pushlog_id": "131289",
        "revision": "98986391515e2dc0db41881316d9baa8ea228108"
      },
      "repository": {
        "level": "3",
        "project": "autoland",
        "url": "https://hg.mozilla.org/integration/autoland"
      }
    },
    "user": {
      "input": {
        "$eval": "input"
      },
      "taskGroupId": {
        "$eval": "taskGroupId"
      },
      "taskId": {
        "$eval": "taskId"
      }
    }
  },
  "kind": "hook",
  "name": "retrigger",
  "schema": {
    "properties": {
      "downstream": {
        "default": false,
        "description": "If true, downstream tasks from this one will be cloned as well. The dependencies will be updated to work with the new task at the root.",
        "type": "boolean"
      },
      "times": {
        "default": 1,
        "description": "How many times to run each task.",
        "maximum": 100,
        "minimum": 1,
        "title": "Times",
        "type": "integer"
      }
    },
    "type": "object"
  },
  "title": "Retrigger"
}
Flags: needinfo?(sclements)
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Depends on: 1678329

That makes sense, thanks!

Yeah, since retrigger-multiple takes a list of labels rather than a task id, I don't think there is any way to use it with backfills in our case. As far as I know it shouldn't be possible to retrigger multiple tasks using that "retrigger" button anyway, so your proposal makes sense! I filed bug 1678329 to track it.
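
For illustration, the two inputs differ roughly like this. The field names follow the two action schemas pasted above; the helper functions and the example ids are hypothetical, and this is a Python sketch rather than the Treeherder UI code.

```python
def retrigger_multiple_input(labels, times=1):
    """Input for 'retrigger-multiple': tasks are identified by label only."""
    return {"requests": [{"tasks": list(labels), "times": times}]}


def retrigger_input(times=1, downstream=False):
    """Input for 'retrigger': the task itself comes from the taskId in the context."""
    return {"downstream": downstream, "times": times}


def hook_context(task_id, task_group_id, action_input):
    """Values substituted for the $eval fields in the hook payload's 'user' section."""
    return {"input": action_input, "taskGroupId": task_group_id, "taskId": task_id}


# Retrigger button: no task id, only a label, so the backfill's own
# definition cannot be located. The ids below are made up.
button = hook_context(None, "example-group", retrigger_multiple_input(["some-label"]))

# Custom Actions => Retrigger: the exact task to clone is named by id.
custom = hook_context("example-task-id", "example-group", retrigger_input())
```

Because only the second form carries a task id, only it can point at the specific backfill task whose definition should be cloned.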

I think I'm going to resolve this one for now after all, since it's fixed as far as mozilla-central is concerned. Sheriffs can use the Custom Actions version of "retrigger" until bug 1678329 is resolved, at which point the retrigger button will work as well.

Status: REOPENED → RESOLVED
Closed: 4 years ago → 4 years ago
Keywords: leave-open
Resolution: --- → FIXED
Component: Treeherder: Job Triggering & Cancellation → TreeHerder