docker image jobs no longer retry on failure

RESOLVED FIXED

Status

enhancement
P1
normal
RESOLVED FIXED
2 years ago
2 years ago

People

(Reporter: nthomas, Assigned: rail)

Tracking

Firefox Tracking Flags

(Not tracked)

Details

(Whiteboard: [releaseduty])

Attachments

(1 attachment)

55 bytes, text/x-github-pull-request
mtabara
: review+
(Reporter)

Description

2 years ago
Since we moved to the Queue API (bug 1259627), we no longer retry the flaky docker image generation (bug 1367491).

In announcing the move, rail said:
> There is one thing to watch out. The old API supports `reruns`, which helped us rerunning failed tasks automatically. I tried to work around the lack of this feature in the Queue API by using docker-worker's `onExitStatus`, but it may behave a bit differently.
Priority: -- → P1
Whiteboard: [releaseduty]
(Assignee)

Updated

2 years ago
Assignee: nobody → rail
(Assignee)

Comment 2

2 years ago
Posted file retry on 255
Attachment #8902131 - Flags: review?(mtabara)
(In reply to Rail Aliiev [:rail] ⌚️UTC+3 from comment #2)
> Created attachment 8902131 [details] [review]
> retry on 255

Replied in the PR with requested changes. I'll update the flags here once we've merged, to avoid re-setting the review flag.
Attachment #8902131 - Flags: review?(mtabara) → review+
(Assignee)

Comment 4

2 years ago
Comment on attachment 8902131 [details] [review]
retry on 255

Deployed
Attachment #8902131 - Flags: checked-in+
(Assignee)

Comment 5

2 years ago
56.0b8 is not helping: there were no failures! :)
(Reporter)

Comment 6

2 years ago
I strongly suspect https://hg.mozilla.org/mozilla-central/rev/84fd52d2832a#l4.14 is the reason for that, and that it fixes bug 1367491.
(Reporter)

Comment 7

2 years ago
That hasn't been uplifted to beta though, so maybe it's something else or a coincidence.
(Reporter)

Comment 8

2 years ago
We got retries for hg errors in tasks 0, 1, and 2 in https://tools.taskcluster.net/groups/PFi2U7q2SCWNvW-ud7TkWw/tasks/csCjgMfVQxqOyBV6aKAB3w/details. Then it failed in task 3 on a clamav error, where we get an exit status of -1.
(Assignee)

Comment 9

2 years ago
Added -1 to the list in https://github.com/mozilla-releng/releasetasks/pull/276 and deployed
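
For readers unfamiliar with the mechanism, here is a minimal sketch of how the retry list discussed in this bug might look in a docker-worker task payload. It assumes the `onExitStatus.retry` field takes a list of exit codes. The helper name `with_retry_on_exit_status`, the image, and the command are hypothetical and are not taken from the releasetasks change.

```python
import json

# Exit statuses mentioned in this bug: 255 (the first patch) and -1 (clamav failure).
RETRY_EXIT_STATUSES = [255, -1]


def with_retry_on_exit_status(payload, statuses=RETRY_EXIT_STATUSES):
    """Return a copy of a task payload with the given statuses in onExitStatus.retry.

    Assumption: the worker retries the task when the container exits with one of
    the listed codes. The input payload is not modified.
    """
    updated = dict(payload)
    on_exit = dict(updated.get("onExitStatus", {}))
    merged = list(on_exit.get("retry", []))
    for status in statuses:
        if status not in merged:
            merged.append(status)
    on_exit["retry"] = merged
    updated["onExitStatus"] = on_exit
    return updated


# Hypothetical payload: image and command are placeholders, not the real task.
example_payload = {
    "image": "example/image:latest",
    "command": ["/bin/bash", "-c", "./build-image.sh"],
    "maxRunTime": 3600,
}

if __name__ == "__main__":
    print(json.dumps(with_retry_on_exit_status(example_payload), indent=2))
```

Merging into any existing `retry` list, instead of overwriting it, keeps previously configured exit codes in place when a new one such as -1 is added.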
(Reporter)

Updated

2 years ago
See Also: → 1398964
(Assignee)

Comment 10

2 years ago
Closing this. Bug 1398964 is a nice-to-have.
Status: NEW → RESOLVED
Last Resolved: 2 years ago
Resolution: --- → FIXED